P1.1 #371: use persisted RiskAssessment at execution boundaries (fail closed)#417
Conversation
Closes #371. Safety Invariant #7 requires every CandidateAction carry a RiskAssessment that the router actually consumes. Pre-fix, BOTH execution boundaries (events.ts auto-execute, approvals.ts approve-execute) discarded the decision-maker's per-dimension assessment and constructed a fresh one: - events.ts re-derived from explanation.riskTier (a flat enum) and broadcast that single tier across all six dimensions. A candidate the decision-maker assessed HIGH on financial impact was routed LOW. - approvals.ts hardcoded LOW on every dimension with the comment "user-approved = lower risk." A human click does not move the underlying risk dimensions — the adapter selection then picked a less-guarded adapter than the decision-maker intended. Fix is on the consumer side; no changes to decision-maker, repository, or execution-router invariants: - events.ts: read `outcome.riskAssessment` directly (already attached by decision-maker.ts:263). Fall back to `decisionRepositoryAdapter.getRiskAssessment(actionId)` if absent. If both null, FAIL CLOSED: escalate to manual approval rather than fabricate. Drops the now-unused DimensionAssessment / RiskTier / RiskDimension imports. - events.ts approval-create payload now stamps the original `outcome.selectedAction.id` into the stored candidate_action JSONB, so the approve handler can recover the linkage. - approvals.ts: preserve the original candidate id from the stored action (was generating a fresh UUID, losing the lookup linkage). Look up the persisted assessment via decisionRepositoryAdapter. If missing, HTTP 409 risk_assessment_missing — same fail-closed posture as events.ts. Legacy approval rows from before this PR don't carry an `id` on the stored candidate_action and will hit the 409. The recovery is to re-trigger the decision so a fresh assessment is persisted. Tests: - events-routes.test.ts: mock decisionRepositoryAdapter.getRiskAssessment to return a baseline LOW assessment so existing autoExecute-path tests proceed (mocks didn't include riskAssessment on the outcome). - feedback-loop.test.ts: same mock + add valid UUID id to the candidate_action fixture rows so the approve path's lookup succeeds. - All 713 tests still pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Ensures execution routing uses the decision-maker’s persisted per-dimension RiskAssessment at both execution boundaries (auto-execute and approve/execute), and fails closed when the assessment cannot be recovered—preventing adapter selection from being driven by synthetic/flattened risk tiers.
Changes:
- Auto-execute now uses
outcome.riskAssessment(fallback:decisionRepositoryAdapter.getRiskAssessment(actionId)) and escalates to manual approval if missing. - Approve/execute now preserves the stored candidate action
idand looks up the persistedRiskAssessment; missing assessment returns HTTP 409. - Updates tests/fixtures to include candidate action IDs and mocks
getRiskAssessmentfor existing suites.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| CHANGELOG.md | Documents the safety fix and fail-closed behavior for both boundaries. |
| apps/api/src/routes/events.ts | Uses persisted RiskAssessment for routing; escalates to approval if unavailable; stamps candidate id into approval payload. |
| apps/api/src/routes/approvals.ts | Preserves candidate id, loads persisted RiskAssessment, and fails closed with 409 when missing. |
| apps/api/src/tests/feedback-loop.test.ts | Mocks getRiskAssessment and updates approval fixtures to include candidate_action.id. |
| apps/api/src/tests/events-routes.test.ts | Mocks getRiskAssessment for auto-execute tests that don’t provide outcome.riskAssessment. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Look up the RiskAssessment the decision-maker actually | ||
| // computed for this candidate (#371). Pre-fix, this constructed | ||
| // a synthetic hardcoded-LOW assessment on every dimension under | ||
| // the assumption "user clicked approve = LOW risk." That is | ||
| // wrong: a user can approve a HIGH-tier financial-impact action, | ||
| // and the risk dimensions don't move because a human clicked | ||
| // once. The execution router's adapter-selection then picks a | ||
| // less-guarded adapter than the decision-maker intended. | ||
| // | ||
| // Fail closed: if the assessment is missing (legacy approval row | ||
| // pre-#371, or the candidate was synthesised with a non-UUID id), | ||
| // refuse execution rather than fabricating one. | ||
| const riskAssessment = originalCandidateId | ||
| ? await decisionRepositoryAdapter.getRiskAssessment(originalCandidateId) | ||
| : null; | ||
| if (!riskAssessment) { | ||
| log.warn('Approve execution blocked: no persisted risk assessment for candidate', { | ||
| decisionId: approval.decision_id, | ||
| candidateId: originalCandidateId, | ||
| approvalId: approval.id, | ||
| userId: body.userId, | ||
| }); | ||
| res.status(409).json({ | ||
| error: 'risk_assessment_missing', | ||
| message: 'This approval cannot be executed: the original risk assessment is not on file. Re-trigger the decision to regenerate it.', | ||
| approvalId: approval.id, | ||
| requestId, | ||
| }); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Addressed in e52bcf8. The 409 now fires at preflight (right after the dual-confirm gate, before approvalRepository.respond), so the approval row stays unmodified and the user can retry after re-triggering the decision.
| const riskAssessment: RiskAssessment | null = outcome.riskAssessment | ||
| ?? await decisionRepositoryAdapter.getRiskAssessment( | ||
| outcome.selectedAction.id, | ||
| ); | ||
|
|
||
| if (!riskAssessment) { | ||
| log.warn('Auto-execute blocked: no persisted risk assessment for selected action — escalating to approval (#371)', { |
There was a problem hiding this comment.
Took a lighter approach than adding the dedicated route test — the post-Copilot commit ensures outcome.riskAssessment is consumed when present (no null path triggered in the existing fixtures). A dedicated null-null escalation test is filed as a follow-up rather than blocking this PR.
| res.status(409).json({ | ||
| error: 'risk_assessment_missing', | ||
| message: 'This approval cannot be executed: the original risk assessment is not on file. Re-trigger the decision to regenerate it.', | ||
| approvalId: approval.id, | ||
| requestId, | ||
| }); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Addressed in e52bcf8. Same fix as the parent thread — 409 fires at preflight before state mutation, with a clear retry-after-re-trigger message.
| decisionRepositoryAdapter: { | ||
| getRiskAssessment: vi.fn().mockResolvedValue({ | ||
| actionId: 'test', | ||
| overallTier: 'low', | ||
| dimensions: { |
There was a problem hiding this comment.
Addressed in e52bcf8. Both events-routes.test.ts and feedback-loop.test.ts mocks now use mockImplementation((actionId) => ({ actionId, ... })) so the returned assessment echoes the requested id and the router's actionId-match invariant can't be silently bypassed in tests.
…n test mocks Copilot review on PR #417 caught two concerns: 1. **The 409 risk_assessment_missing branch fired AFTER side effects.** approvalRepository.respond() had already marked the row 'approved', feedback was recorded, the episode was written, the memory port recorded the episode — then the 409 left an "approved" row with no execution and no way to retry (status conflict on the next attempt). Fix: pre-check the persisted RiskAssessment immediately after the dual-confirm gate, before any state mutation. The lookup uses `existing.candidate_action.id` (which events.ts now stamps) and stores both `preflightCandidateId` + `preflightRiskAssessment` for the later execute block to reuse. The execute block's redundant lookup is removed; it asserts non-null on the preflight result because the early 409 already returned for the missing case. 2. **Test mocks returned a fixed actionId, masking the router's actionId-match invariant.** Fix: both events-routes.test.ts and feedback-loop.test.ts now use `mockImplementation(async (actionId) => ({ actionId, ... }))` so the returned assessment echoes the input, and the `action.id === riskAssessment.actionId` invariant in execution-router can't silently pass on a mismatched fixture. All 713 API tests still pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Closes #371. Both execution boundaries (auto-execute in
events.ts, approve-execute inapprovals.ts) were discarding the per-dimensionRiskAssessmentthe decision-maker actually computed and fabricating new ones — flatexplanation.riskTierbroadcast across all dimensions in events.ts, hardcodedLOWin approvals.ts. A HIGH-financial-impact candidate was routed as LOW; the adapter selection picked a less-guarded adapter than intended. Safety Invariant #7 was structurally bypassed.Changes
apps/api/src/routes/events.tsoutcome.riskAssessmentdirectly (already attached bydecision-maker.ts:263); falls back todecisionRepositoryAdapter.getRiskAssessment(actionId)if absent.DimensionAssessment,RiskTier,RiskDimensionimports.outcome.selectedAction.idinto the storedcandidate_actionJSONB so the approve handler can recover the lookup linkage (it previously stored everything except the id).apps/api/src/routes/approvals.tsdecisionRepositoryAdapter.getRiskAssessment.risk_assessment_missingwith an actionable message ("re-trigger the decision to regenerate it"). Legacy approval rows from before this PR don't carry an id and will hit the 409 — the recovery is a fresh decision evaluation.Tests
events-routes.test.ts+feedback-loop.test.ts: mockdecisionRepositoryAdapter.getRiskAssessmentto return a baseline LOW assessment (existing fixtures didn't carryriskAssessmenton the mocked outcome).feedback-loop.test.tscandidate_action fixtures get a valid UUIDidso the approve-path lookup succeeds.Verified
pnpm build --concurrency=1✓pnpm test✓ (713 pass, 24 skipped)pnpm lint✓Test plan
candidate_actionJSONB lacks anid(legacy row) → HTTP 409 withrisk_assessment_missing.outcome.riskAssessmentto null in dev → request escalates to a new pending approval instead of auto-executing.🤖 Generated with Claude Code