feat: cumulative risk-unit action budgets with shadow mode

## Context

Research (2026-03-24): VentureBeat article on testing autonomous agents proposes risk-unit action budgets alongside monetary cost. Deep dive revealed a clear gap: **risk assessment is per-action and stateless** -- an agent can execute 50 MEDIUM-risk actions in a row with no escalation.

## Current State

- `RiskClassifier` maps ~25 `ActionType` values to 4 `ApprovalRiskLevel` tiers (per-action, stateless)
- `BudgetEnforcer` tracks only monetary cost (USD) with pre-flight/in-flight/task-boundary checks
- Progressive trust gates tool access but does not accumulate risk
- No concept of cumulative risk tracking or risk budget exhaustion
- No shadow mode for pre-deployment calibration

## Scope

### Risk scoring model (`security/risk_scorer.py`)
- `RiskScore` frozen Pydantic model with 4 float dimensions (0.0-1.0): `reversibility` (inverse: more reversible means lower risk), `blast_radius`, `data_sensitivity`, `external_visibility`
- Weighted sum produces scalar `risk_units: float`
- `RiskScorer` protocol (pluggable: static map, context-aware, or LLM-assisted)
- `DefaultRiskScorer` extends `_DEFAULT_RISK_MAP` to carry `RiskScore` per action type
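
A minimal sketch of the scoring model above, under stated assumptions: it uses a frozen stdlib dataclass instead of Pydantic to stay dependency-free, the weights are placeholder values (this issue does not specify them), and `StaticRiskScorer` and the `score()` method name are hypothetical stand-ins for the planned `DefaultRiskScorer`.

```python
from dataclasses import dataclass
from typing import Protocol

# Hypothetical weights; the issue does not define them. They sum to 1.0
# so that risk_units stays within [0.0, 1.0].
WEIGHTS = {
    "reversibility": 0.4,
    "blast_radius": 0.3,
    "data_sensitivity": 0.2,
    "external_visibility": 0.1,
}


@dataclass(frozen=True)
class RiskScore:
    reversibility: float  # 1.0 = fully reversible; contributes inversely
    blast_radius: float
    data_sensitivity: float
    external_visibility: float

    def __post_init__(self) -> None:
        # Enforce the 0.0-1.0 range on every dimension.
        for name in WEIGHTS:
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0.0, 1.0], got {value}")

    @property
    def risk_units(self) -> float:
        # Weighted sum; reversibility is inverted so that an
        # irreversible action (0.0) adds the most risk.
        return (
            WEIGHTS["reversibility"] * (1.0 - self.reversibility)
            + WEIGHTS["blast_radius"] * self.blast_radius
            + WEIGHTS["data_sensitivity"] * self.data_sensitivity
            + WEIGHTS["external_visibility"] * self.external_visibility
        )


class RiskScorer(Protocol):
    """Pluggable scorer interface (method name is an assumption)."""

    def score(self, action_type: str) -> RiskScore: ...


class StaticRiskScorer:
    """Static-map scorer: looks up a fixed RiskScore per action type."""

    def __init__(self, risk_map: dict, default: RiskScore) -> None:
        self._risk_map = dict(risk_map)
        self._default = default

    def score(self, action_type: str) -> RiskScore:
        # Unknown action types fall back to the supplied default score.
        return self._risk_map.get(action_type, self._default)


# Illustrative action type names and values, not taken from the codebase.
scorer = StaticRiskScorer(
    {"file_read": RiskScore(1.0, 0.1, 0.2, 0.0)},
    default=RiskScore(0.5, 0.5, 0.5, 0.5),
)
units = scorer.score("file_read").risk_units
```

A context-aware or LLM-assisted scorer would satisfy the same protocol by computing the `RiskScore` at call time rather than looking it up.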

### Cumulative risk tracker (`budget/risk_tracker.py`)
- `RiskRecord` (agent_id, task_id, action_type, risk_units, timestamp)
- `RiskTracker` parallel to `CostTracker`: append-only store, `get_agent_risk()`, `get_task_risk()`, `get_total_risk()`
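
The tracker above could look roughly like the following sketch. The `record()` method name is an assumption, the store is a plain in-memory list, and the daily time windows needed by the budget limits are left out for brevity. The usage at the end replays the gap described in the Context section: 50 MEDIUM-risk actions in a row, each given an illustrative 0.25 risk units.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RiskRecord:
    agent_id: str
    task_id: str
    action_type: str
    risk_units: float
    timestamp: datetime


class RiskTracker:
    """Append-only store of risk records with cumulative queries."""

    def __init__(self) -> None:
        self._records: list = []

    def record(self, record: RiskRecord) -> None:
        # Records are only ever appended, never mutated or removed.
        self._records.append(record)

    def get_agent_risk(self, agent_id: str) -> float:
        return sum(
            (r.risk_units for r in self._records if r.agent_id == agent_id), 0.0
        )

    def get_task_risk(self, task_id: str) -> float:
        return sum(
            (r.risk_units for r in self._records if r.task_id == task_id), 0.0
        )

    def get_total_risk(self) -> float:
        return sum((r.risk_units for r in self._records), 0.0)


# 50 consecutive MEDIUM-risk actions; 0.25 units each is an illustrative value.
tracker = RiskTracker()
for _ in range(50):
    tracker.record(
        RiskRecord(
            agent_id="agent-1",
            task_id="task-1",
            action_type="code_write",  # hypothetical action type name
            risk_units=0.25,
            timestamp=datetime.now(timezone.utc),
        )
    )

accumulated = tracker.get_agent_risk("agent-1")
```

With per-action classification alone, none of these 50 actions would trigger escalation; the accumulated total is the value a risk budget can be checked against.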

### Risk budget enforcement
- Extend `BudgetConfig` with `risk_budget` section: `per_task_risk_limit`, `per_agent_daily_risk_limit`, `total_daily_risk_limit`, alert thresholds
- `BudgetEnforcer` gains `RiskTracker` dependency, parallel risk checks alongside monetary checks
- `RiskBudgetExhaustedError` (subclass of `BudgetExhaustedError`)
- Opt-in: `risk_budget.enabled: false` by default
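
As a rough illustration of the enforcement rules above, the sketch below performs a pre-flight risk check. The limit values are placeholders, `check_risk_budget` is a hypothetical helper rather than the actual `BudgetEnforcer` API, and `BudgetExhaustedError` is redefined locally only so the example is self-contained.

```python
from dataclasses import dataclass


class BudgetExhaustedError(Exception):
    """Local stand-in for the existing monetary budget error."""


class RiskBudgetExhaustedError(BudgetExhaustedError):
    """Raised when an action would exceed a risk budget limit."""


@dataclass(frozen=True)
class RiskBudgetConfig:
    enabled: bool = False  # opt-in, matching the proposed default
    per_task_risk_limit: float = 10.0  # placeholder value
    per_agent_daily_risk_limit: float = 50.0  # placeholder value
    total_daily_risk_limit: float = 200.0  # placeholder value


def check_risk_budget(
    config: RiskBudgetConfig,
    *,
    task_risk: float,
    agent_daily_risk: float,
    total_daily_risk: float,
    pending_units: float,
) -> None:
    """Raise if adding pending_units would exceed any configured limit."""
    if not config.enabled:
        return  # risk budgets are disabled: never block
    checks = (
        ("task", task_risk, config.per_task_risk_limit),
        ("agent/day", agent_daily_risk, config.per_agent_daily_risk_limit),
        ("total/day", total_daily_risk, config.total_daily_risk_limit),
    )
    for scope, used, limit in checks:
        if used + pending_units > limit:
            raise RiskBudgetExhaustedError(
                f"{scope} risk budget exceeded: "
                f"{used} + {pending_units} > {limit}"
            )


# Disabled by default: the same call is a no-op.
check_risk_budget(
    RiskBudgetConfig(),
    task_risk=9.9,
    agent_daily_risk=0.0,
    total_daily_risk=0.0,
    pending_units=0.5,
)
```

Because `RiskBudgetExhaustedError` subclasses `BudgetExhaustedError`, callers that already handle monetary exhaustion would also catch risk exhaustion without changes.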

### Shadow mode (`security/config.py`)
- `SecurityEnforcementMode` enum: `active` / `shadow` / `disabled`
- In shadow mode: `SecOpsService` logs verdicts and risk accumulation but never blocks
- Records which actions would have been escalated or blocked, so thresholds can be calibrated before switching to `active`
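
The three modes above can be sketched as a single decision function. `resolve_block` and the `shadow_log` list are hypothetical; in the real `SecOpsService` the shadow record would presumably go through the existing logging or audit path.

```python
from enum import Enum


class SecurityEnforcementMode(str, Enum):
    ACTIVE = "active"
    SHADOW = "shadow"
    DISABLED = "disabled"


def resolve_block(
    mode: SecurityEnforcementMode,
    would_block: bool,
    detail: str,
    shadow_log: list,
) -> bool:
    """Return True only when the action must actually be blocked."""
    if mode is SecurityEnforcementMode.DISABLED:
        return False  # no enforcement and no shadow record
    if mode is SecurityEnforcementMode.SHADOW:
        if would_block:
            # Record the would-be verdict for calibration, but let it pass.
            shadow_log.append(detail)
        return False
    return would_block  # ACTIVE: enforce the verdict as-is


shadow_log = []
blocked = resolve_block(
    SecurityEnforcementMode.SHADOW,
    would_block=True,
    detail="agent-1: per-task risk budget would be exceeded",
    shadow_log=shadow_log,
)
```

Running in `shadow` against real traffic yields a list of would-be escalations, which can be compared against the configured limits before enforcement is turned on.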

### Auto-downgrade integration
- `RISK_BUDGET_EXHAUSTED` added to `DowngradeReason`
- Progressive trust: agents consuming high risk units earn trust more slowly

## Files

**New:**
- `src/synthorg/budget/risk_tracker.py`
- `src/synthorg/budget/risk_config.py`
- `src/synthorg/security/risk_scorer.py`

**Modified:**
- `src/synthorg/budget/config.py`, `enforcer.py`, `errors.py`
- `src/synthorg/security/rules/risk_classifier.py`, `service.py`, `config.py`, `models.py`
- `src/synthorg/core/enums.py`
- `src/synthorg/engine/agent_engine.py`
- `docs/design/operations.md`

## Deliverables

- [ ] `RiskScore` model and `RiskScorer` protocol with default implementation
- [ ] `RiskTracker` with cumulative per-agent/task/global tracking
- [ ] Risk budget config and enforcement in `BudgetEnforcer`
- [ ] Shadow mode in `SecOpsService`
- [ ] `RISK_BUDGET_EXHAUSTED` downgrade reason
- [ ] Unit tests for risk scoring, tracking, enforcement, shadow mode
- [ ] Design spec update (`docs/design/operations.md`)

## Research

- Deep dive: `research/risk-unit-action-budgets.md` (project memory)
- Source: [Testing Autonomous Agents (VentureBeat)](https://venturebeat.com/orchestration/testing-autonomous-agents-or-how-i-learned-to-stop-worrying-and-embrace)
