P1.1 #371: use persisted RiskAssessment at execution boundaries (fail closed) by jayzalowitz · Pull Request #417 · jayzalowitz/skytwin

jayzalowitz · 2026-05-25T22:58:14Z

Summary

Closes #371. Both execution boundaries (auto-execute in events.ts, approve-execute in approvals.ts) were discarding the per-dimension RiskAssessment the decision-maker actually computed and fabricating new ones — flat explanation.riskTier broadcast across all dimensions in events.ts, hardcoded LOW in approvals.ts. A HIGH-financial-impact candidate was routed as LOW; the adapter selection picked a less-guarded adapter than intended. Safety Invariant #7 was structurally bypassed.

Changes

`apps/api/src/routes/events.ts`

Auto-execute reads outcome.riskAssessment directly (already attached by decision-maker.ts:263); falls back to decisionRepositoryAdapter.getRiskAssessment(actionId) if absent.
If both null → FAIL CLOSED: convert to a manual-approval row instead of executing with a synthetic assessment.
Drops now-unused DimensionAssessment, RiskTier, RiskDimension imports.
Approval-create payload now stamps outcome.selectedAction.id into the stored candidate_action JSONB so the approve handler can recover the lookup linkage (it previously stored everything except the id).

`apps/api/src/routes/approvals.ts`

Preserve the original candidate id from the stored action (was generating a fresh UUID, losing the linkage).
Look up persisted assessment via decisionRepositoryAdapter.getRiskAssessment.
Missing assessment → HTTP 409 risk_assessment_missing with an actionable message ("re-trigger the decision to regenerate it"). Legacy approval rows from before this PR don't carry an id and will hit the 409 — the recovery is a fresh decision evaluation.

Tests

events-routes.test.ts + feedback-loop.test.ts: mock decisionRepositoryAdapter.getRiskAssessment to return a baseline LOW assessment (existing fixtures didn't carry riskAssessment on the mocked outcome).
feedback-loop.test.ts candidate_action fixtures get a valid UUID id so the approve-path lookup succeeds.
All 713 tests still pass.

Verified

pnpm build --concurrency=1 ✓
pnpm test ✓ (713 pass, 24 skipped)
pnpm lint ✓

Test plan

Trigger an auto-execute path → execution runs with the decision-maker's per-dimension assessment (verify in execution-router logs the adapter was chosen per the actual tier, not a synthetic LOW).
Approve a pending approval with the persisted assessment present → executes.
Approve an approval whose candidate_action JSONB lacks an id (legacy row) → HTTP 409 with risk_assessment_missing.
Force outcome.riskAssessment to null in dev → request escalates to a new pending approval instead of auto-executing.

🤖 Generated with Claude Code

Closes #371. Safety Invariant #7 requires every CandidateAction carry a RiskAssessment that the router actually consumes. Pre-fix, BOTH execution boundaries (events.ts auto-execute, approvals.ts approve-execute) discarded the decision-maker's per-dimension assessment and constructed a fresh one: - events.ts re-derived from explanation.riskTier (a flat enum) and broadcast that single tier across all six dimensions. A candidate the decision-maker assessed HIGH on financial impact was routed LOW. - approvals.ts hardcoded LOW on every dimension with the comment "user-approved = lower risk." A human click does not move the underlying risk dimensions — the adapter selection then picked a less-guarded adapter than the decision-maker intended. Fix is on the consumer side; no changes to decision-maker, repository, or execution-router invariants: - events.ts: read `outcome.riskAssessment` directly (already attached by decision-maker.ts:263). Fall back to `decisionRepositoryAdapter.getRiskAssessment(actionId)` if absent. If both null, FAIL CLOSED: escalate to manual approval rather than fabricate. Drops the now-unused DimensionAssessment / RiskTier / RiskDimension imports. - events.ts approval-create payload now stamps the original `outcome.selectedAction.id` into the stored candidate_action JSONB, so the approve handler can recover the linkage. - approvals.ts: preserve the original candidate id from the stored action (was generating a fresh UUID, losing the lookup linkage). Look up the persisted assessment via decisionRepositoryAdapter. If missing, HTTP 409 risk_assessment_missing — same fail-closed posture as events.ts. Legacy approval rows from before this PR don't carry an `id` on the stored candidate_action and will hit the 409. The recovery is to re-trigger the decision so a fresh assessment is persisted. Tests: - events-routes.test.ts: mock decisionRepositoryAdapter.getRiskAssessment to return a baseline LOW assessment so existing autoExecute-path tests proceed (mocks didn't include riskAssessment on the outcome). - feedback-loop.test.ts: same mock + add valid UUID id to the candidate_action fixture rows so the approve path's lookup succeeds. - All 713 tests still pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Ensures execution routing uses the decision-maker’s persisted per-dimension RiskAssessment at both execution boundaries (auto-execute and approve/execute), and fails closed when the assessment cannot be recovered—preventing adapter selection from being driven by synthetic/flattened risk tiers.

Changes:

Auto-execute now uses outcome.riskAssessment (fallback: decisionRepositoryAdapter.getRiskAssessment(actionId)) and escalates to manual approval if missing.
Approve/execute now preserves the stored candidate action id and looks up the persisted RiskAssessment; missing assessment returns HTTP 409.
Updates tests/fixtures to include candidate action IDs and mocks getRiskAssessment for existing suites.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
CHANGELOG.md	Documents the safety fix and fail-closed behavior for both boundaries.
apps/api/src/routes/events.ts	Uses persisted `RiskAssessment` for routing; escalates to approval if unavailable; stamps candidate `id` into approval payload.
apps/api/src/routes/approvals.ts	Preserves candidate `id`, loads persisted `RiskAssessment`, and fails closed with 409 when missing.
apps/api/src/tests/feedback-loop.test.ts	Mocks `getRiskAssessment` and updates approval fixtures to include `candidate_action.id`.
apps/api/src/tests/events-routes.test.ts	Mocks `getRiskAssessment` for auto-execute tests that don’t provide `outcome.riskAssessment`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

jayzalowitz · 2026-05-25T23:33:48Z

+        // Look up the RiskAssessment the decision-maker actually
+        // computed for this candidate (#371). Pre-fix, this constructed
+        // a synthetic hardcoded-LOW assessment on every dimension under
+        // the assumption "user clicked approve = LOW risk." That is
+        // wrong: a user can approve a HIGH-tier financial-impact action,
+        // and the risk dimensions don't move because a human clicked
+        // once. The execution router's adapter-selection then picks a
+        // less-guarded adapter than the decision-maker intended.
+        //
+        // Fail closed: if the assessment is missing (legacy approval row
+        // pre-#371, or the candidate was synthesised with a non-UUID id),
+        // refuse execution rather than fabricating one.
+        const riskAssessment = originalCandidateId
+          ? await decisionRepositoryAdapter.getRiskAssessment(originalCandidateId)
+          : null;
+        if (!riskAssessment) {
+          log.warn('Approve execution blocked: no persisted risk assessment for candidate', {
+            decisionId: approval.decision_id,
+            candidateId: originalCandidateId,
+            approvalId: approval.id,
+            userId: body.userId,
+          });
+          res.status(409).json({
+            error: 'risk_assessment_missing',
+            message: 'This approval cannot be executed: the original risk assessment is not on file. Re-trigger the decision to regenerate it.',
+            approvalId: approval.id,
+            requestId,
+          });
+          return;
+        }


Addressed in e52bcf8. The 409 now fires at preflight (right after the dual-confirm gate, before approvalRepository.respond), so the approval row stays unmodified and the user can retry after re-triggering the decision.

jayzalowitz · 2026-05-25T23:33:48Z

+        const riskAssessment: RiskAssessment | null = outcome.riskAssessment
+          ?? await decisionRepositoryAdapter.getRiskAssessment(
+            outcome.selectedAction.id,
+          );
+
+        if (!riskAssessment) {
+          log.warn('Auto-execute blocked: no persisted risk assessment for selected action — escalating to approval (#371)', {


Took a lighter approach than adding the dedicated route test — the post-Copilot commit ensures outcome.riskAssessment is consumed when present (no null path triggered in the existing fixtures). A dedicated null-null escalation test is filed as a follow-up rather than blocking this PR.

jayzalowitz · 2026-05-25T23:33:49Z

+          res.status(409).json({
+            error: 'risk_assessment_missing',
+            message: 'This approval cannot be executed: the original risk assessment is not on file. Re-trigger the decision to regenerate it.',
+            approvalId: approval.id,
+            requestId,
+          });
+          return;
+        }


Addressed in e52bcf8. Same fix as the parent thread — 409 fires at preflight before state mutation, with a clear retry-after-re-trigger message.

jayzalowitz · 2026-05-25T23:33:50Z

+  decisionRepositoryAdapter: {
+    getRiskAssessment: vi.fn().mockResolvedValue({
+      actionId: 'test',
+      overallTier: 'low',
+      dimensions: {


Addressed in e52bcf8. Both events-routes.test.ts and feedback-loop.test.ts mocks now use mockImplementation((actionId) => ({ actionId, ... })) so the returned assessment echoes the requested id and the router's actionId-match invariant can't be silently bypassed in tests.

…n test mocks Copilot review on PR #417 caught two concerns: 1. **The 409 risk_assessment_missing branch fired AFTER side effects.** approvalRepository.respond() had already marked the row 'approved', feedback was recorded, the episode was written, the memory port recorded the episode — then the 409 left an "approved" row with no execution and no way to retry (status conflict on the next attempt). Fix: pre-check the persisted RiskAssessment immediately after the dual-confirm gate, before any state mutation. The lookup uses `existing.candidate_action.id` (which events.ts now stamps) and stores both `preflightCandidateId` + `preflightRiskAssessment` for the later execute block to reuse. The execute block's redundant lookup is removed; it asserts non-null on the preflight result because the early 409 already returned for the missing case. 2. **Test mocks returned a fixed actionId, masking the router's actionId-match invariant.** Fix: both events-routes.test.ts and feedback-loop.test.ts now use `mockImplementation(async (actionId) => ({ actionId, ... }))` so the returned assessment echoes the input, and the `action.id === riskAssessment.actionId` invariant in execution-router can't silently pass on a mismatched fixture. All 713 API tests still pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings May 25, 2026 22:58

Copilot started reviewing on behalf of jayzalowitz May 25, 2026 22:58 View session

Copilot AI reviewed May 25, 2026

View reviewed changes

jayzalowitz merged commit 542339f into main May 25, 2026
9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

P1.1 #371: use persisted RiskAssessment at execution boundaries (fail closed)#417

P1.1 #371: use persisted RiskAssessment at execution boundaries (fail closed)#417
jayzalowitz merged 2 commits into
mainfrom
jayzalowitz/p1.1-risk-assessment

jayzalowitz commented May 25, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

jayzalowitz May 25, 2026

Uh oh!

jayzalowitz May 25, 2026

Uh oh!

jayzalowitz May 25, 2026

Uh oh!

jayzalowitz May 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jayzalowitz commented May 25, 2026

Summary

Changes

apps/api/src/routes/events.ts

apps/api/src/routes/approvals.ts

Tests

Verified

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

jayzalowitz May 25, 2026

Choose a reason for hiding this comment

Uh oh!

jayzalowitz May 25, 2026

Choose a reason for hiding this comment

Uh oh!

jayzalowitz May 25, 2026

Choose a reason for hiding this comment

Uh oh!

jayzalowitz May 25, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

`apps/api/src/routes/events.ts`

`apps/api/src/routes/approvals.ts`