Risk scoring applies quantitative methods to agent oversight. Rather than waiting for a human to notice that an agent is doing something problematic, a risk model continuously evaluates observable features (which files are being modified, how fast tokens are being consumed, the error rate, whether the agent is staying within its defined scope) and produces a score that can trigger alerts or automatic quarantine.
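As a rough illustration of how a score might trigger alerts or quarantine, the sketch below maps a score in [0, 1] to an action. The `Action` and `Thresholds` names and the cut-off values 0.6 and 0.85 are assumptions made for this example, not values taken from any particular system.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    ALERT = "alert"
    QUARANTINE = "quarantine"


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical cut-offs; real values would be tuned against incident history.
    alert: float = 0.6
    quarantine: float = 0.85


def decide(score: float, thresholds: Thresholds = Thresholds()) -> Action:
    """Map a risk score in [0, 1] to the strongest action it warrants."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score >= thresholds.quarantine:
        return Action.QUARANTINE
    if score >= thresholds.alert:
        return Action.ALERT
    return Action.NONE
```

Keeping the thresholds in one small object makes it easier to tune alerting and quarantine separately, since a false quarantine usually costs more than a false alert.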
The features used in risk models typically combine static factors (file sensitivity based on path patterns), dynamic factors (rate of change versus the agent's historical baseline), and outcome factors (test pass rate, reviewer feedback). Logistic regression is a common modeling approach because the feature weights are interpretable: each feature contributes its weight times its value to the log-odds behind the score, so you can explain why a score is high in terms of specific feature contributions.
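The sketch below shows one way such a model could be scored and explained, using one feature of each kind. Everything specific in it is an assumption for illustration: the feature names, the path patterns, and the hand-picked weights and bias, which in practice would be fitted to labelled data rather than written by hand.

```python
import math
from fnmatch import fnmatch

# Hypothetical weights and bias, hand-picked for illustration only.
WEIGHTS = {
    "sensitive_file_fraction": 2.0,  # static: share of touched files on sensitive paths
    "change_rate_ratio": 0.8,        # dynamic: current change rate / historical baseline
    "test_failure_rate": 1.5,        # outcome: 1 - test pass rate
}
BIAS = -3.0

# Hypothetical path patterns that mark a file as sensitive.
SENSITIVE_PATTERNS = ["*.env", "secrets/*", "deploy/*"]


def sensitive_fraction(paths):
    """Fraction of modified paths matching any sensitive pattern."""
    if not paths:
        return 0.0
    hits = sum(any(fnmatch(p, pat) for pat in SENSITIVE_PATTERNS) for p in paths)
    return hits / len(paths)


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(features):
    """Return (risk probability, per-feature contribution to the log-odds)."""
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    logit = BIAS + sum(contributions.values())
    return sigmoid(logit), contributions
```

Calling `score` with a dictionary of the three feature values returns the probability together with the contributions; sorting the contributions by size gives a direct answer to the question of why a score is high.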
Risk scoring is not a substitute for review of agent output quality. It is an early warning system for behavioral anomalies that are detectable without reading the agent's code. A high risk score should trigger closer scrutiny; it does not by itself indicate the agent has done something wrong.