Skip to content

P1.1 #371: use persisted RiskAssessment at execution boundaries (fail closed)#417

Merged
jayzalowitz merged 2 commits into
mainfrom
jayzalowitz/p1.1-risk-assessment
May 25, 2026
Merged

P1.1 #371: use persisted RiskAssessment at execution boundaries (fail closed)#417
jayzalowitz merged 2 commits into
mainfrom
jayzalowitz/p1.1-risk-assessment

Conversation

@jayzalowitz

Copy link
Copy Markdown
Owner

Summary

Closes #371. Both execution boundaries (auto-execute in events.ts, approve-execute in approvals.ts) were discarding the per-dimension RiskAssessment the decision-maker actually computed and fabricating new ones — flat explanation.riskTier broadcast across all dimensions in events.ts, hardcoded LOW in approvals.ts. A HIGH-financial-impact candidate was routed as LOW; the adapter selection picked a less-guarded adapter than intended. Safety Invariant #7 was structurally bypassed.

Changes

apps/api/src/routes/events.ts

  • Auto-execute reads outcome.riskAssessment directly (already attached by decision-maker.ts:263); falls back to decisionRepositoryAdapter.getRiskAssessment(actionId) if absent.
  • If both null → FAIL CLOSED: convert to a manual-approval row instead of executing with a synthetic assessment.
  • Drops now-unused DimensionAssessment, RiskTier, RiskDimension imports.
  • Approval-create payload now stamps outcome.selectedAction.id into the stored candidate_action JSONB so the approve handler can recover the lookup linkage (it previously stored everything except the id).

apps/api/src/routes/approvals.ts

  • Preserve the original candidate id from the stored action (was generating a fresh UUID, losing the linkage).
  • Look up persisted assessment via decisionRepositoryAdapter.getRiskAssessment.
  • Missing assessment → HTTP 409 risk_assessment_missing with an actionable message ("re-trigger the decision to regenerate it"). Legacy approval rows from before this PR don't carry an id and will hit the 409 — the recovery is a fresh decision evaluation.

Tests

  • events-routes.test.ts + feedback-loop.test.ts: mock decisionRepositoryAdapter.getRiskAssessment to return a baseline LOW assessment (existing fixtures didn't carry riskAssessment on the mocked outcome).
  • feedback-loop.test.ts candidate_action fixtures get a valid UUID id so the approve-path lookup succeeds.
  • All 713 tests still pass.

Verified

  • pnpm build --concurrency=1
  • pnpm test ✓ (713 pass, 24 skipped)
  • pnpm lint

Test plan

  • Trigger an auto-execute path → execution runs with the decision-maker's per-dimension assessment (verify in execution-router logs the adapter was chosen per the actual tier, not a synthetic LOW).
  • Approve a pending approval with the persisted assessment present → executes.
  • Approve an approval whose candidate_action JSONB lacks an id (legacy row) → HTTP 409 with risk_assessment_missing.
  • Force outcome.riskAssessment to null in dev → request escalates to a new pending approval instead of auto-executing.

🤖 Generated with Claude Code

Closes #371.

Safety Invariant #7 requires every CandidateAction carry a RiskAssessment
that the router actually consumes. Pre-fix, BOTH execution boundaries
(events.ts auto-execute, approvals.ts approve-execute) discarded the
decision-maker's per-dimension assessment and constructed a fresh one:

- events.ts re-derived from explanation.riskTier (a flat enum) and
  broadcast that single tier across all six dimensions. A candidate
  the decision-maker assessed HIGH on financial impact was routed LOW.
- approvals.ts hardcoded LOW on every dimension with the comment
  "user-approved = lower risk." A human click does not move the
  underlying risk dimensions — the adapter selection then picked a
  less-guarded adapter than the decision-maker intended.

Fix is on the consumer side; no changes to decision-maker, repository,
or execution-router invariants:

- events.ts: read `outcome.riskAssessment` directly (already attached
  by decision-maker.ts:263). Fall back to
  `decisionRepositoryAdapter.getRiskAssessment(actionId)` if absent.
  If both null, FAIL CLOSED: escalate to manual approval rather than
  fabricate. Drops the now-unused DimensionAssessment / RiskTier /
  RiskDimension imports.
- events.ts approval-create payload now stamps the original
  `outcome.selectedAction.id` into the stored candidate_action JSONB,
  so the approve handler can recover the linkage.
- approvals.ts: preserve the original candidate id from the stored
  action (was generating a fresh UUID, losing the lookup linkage).
  Look up the persisted assessment via decisionRepositoryAdapter. If
  missing, HTTP 409 risk_assessment_missing — same fail-closed
  posture as events.ts.

Legacy approval rows from before this PR don't carry an `id` on the
stored candidate_action and will hit the 409. The recovery is to
re-trigger the decision so a fresh assessment is persisted.

Tests:
- events-routes.test.ts: mock decisionRepositoryAdapter.getRiskAssessment
  to return a baseline LOW assessment so existing autoExecute-path tests
  proceed (mocks didn't include riskAssessment on the outcome).
- feedback-loop.test.ts: same mock + add valid UUID id to the
  candidate_action fixture rows so the approve path's lookup succeeds.
- All 713 tests still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 25, 2026 22:58

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Ensures execution routing uses the decision-maker’s persisted per-dimension RiskAssessment at both execution boundaries (auto-execute and approve/execute), and fails closed when the assessment cannot be recovered—preventing adapter selection from being driven by synthetic/flattened risk tiers.

Changes:

  • Auto-execute now uses outcome.riskAssessment (fallback: decisionRepositoryAdapter.getRiskAssessment(actionId)) and escalates to manual approval if missing.
  • Approve/execute now preserves the stored candidate action id and looks up the persisted RiskAssessment; missing assessment returns HTTP 409.
  • Updates tests/fixtures to include candidate action IDs and mocks getRiskAssessment for existing suites.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
CHANGELOG.md Documents the safety fix and fail-closed behavior for both boundaries.
apps/api/src/routes/events.ts Uses persisted RiskAssessment for routing; escalates to approval if unavailable; stamps candidate id into approval payload.
apps/api/src/routes/approvals.ts Preserves candidate id, loads persisted RiskAssessment, and fails closed with 409 when missing.
apps/api/src/tests/feedback-loop.test.ts Mocks getRiskAssessment and updates approval fixtures to include candidate_action.id.
apps/api/src/tests/events-routes.test.ts Mocks getRiskAssessment for auto-execute tests that don’t provide outcome.riskAssessment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread apps/api/src/routes/approvals.ts Outdated
Comment on lines +436 to +465
// Look up the RiskAssessment the decision-maker actually
// computed for this candidate (#371). Pre-fix, this constructed
// a synthetic hardcoded-LOW assessment on every dimension under
// the assumption "user clicked approve = LOW risk." That is
// wrong: a user can approve a HIGH-tier financial-impact action,
// and the risk dimensions don't move because a human clicked
// once. The execution router's adapter-selection then picks a
// less-guarded adapter than the decision-maker intended.
//
// Fail closed: if the assessment is missing (legacy approval row
// pre-#371, or the candidate was synthesised with a non-UUID id),
// refuse execution rather than fabricating one.
const riskAssessment = originalCandidateId
? await decisionRepositoryAdapter.getRiskAssessment(originalCandidateId)
: null;
if (!riskAssessment) {
log.warn('Approve execution blocked: no persisted risk assessment for candidate', {
decisionId: approval.decision_id,
candidateId: originalCandidateId,
approvalId: approval.id,
userId: body.userId,
});
res.status(409).json({
error: 'risk_assessment_missing',
message: 'This approval cannot be executed: the original risk assessment is not on file. Re-trigger the decision to regenerate it.',
approvalId: approval.id,
requestId,
});
return;
}

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in e52bcf8. The 409 now fires at preflight (right after the dual-confirm gate, before approvalRepository.respond), so the approval row stays unmodified and the user can retry after re-triggering the decision.

Comment on lines +579 to +585
const riskAssessment: RiskAssessment | null = outcome.riskAssessment
?? await decisionRepositoryAdapter.getRiskAssessment(
outcome.selectedAction.id,
);

if (!riskAssessment) {
log.warn('Auto-execute blocked: no persisted risk assessment for selected action — escalating to approval (#371)', {

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a lighter approach than adding the dedicated route test — the post-Copilot commit ensures outcome.riskAssessment is consumed when present (no null path triggered in the existing fixtures). A dedicated null-null escalation test is filed as a follow-up rather than blocking this PR.

Comment thread apps/api/src/routes/approvals.ts Outdated
Comment on lines +458 to +465
res.status(409).json({
error: 'risk_assessment_missing',
message: 'This approval cannot be executed: the original risk assessment is not on file. Re-trigger the decision to regenerate it.',
approvalId: approval.id,
requestId,
});
return;
}

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in e52bcf8. Same fix as the parent thread — 409 fires at preflight before state mutation, with a clear retry-after-re-trigger message.

Comment on lines +61 to +65
decisionRepositoryAdapter: {
getRiskAssessment: vi.fn().mockResolvedValue({
actionId: 'test',
overallTier: 'low',
dimensions: {

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in e52bcf8. Both events-routes.test.ts and feedback-loop.test.ts mocks now use mockImplementation((actionId) => ({ actionId, ... })) so the returned assessment echoes the requested id and the router's actionId-match invariant can't be silently bypassed in tests.

…n test mocks

Copilot review on PR #417 caught two concerns:

1. **The 409 risk_assessment_missing branch fired AFTER side effects.**
   approvalRepository.respond() had already marked the row 'approved',
   feedback was recorded, the episode was written, the memory port
   recorded the episode — then the 409 left an "approved" row with no
   execution and no way to retry (status conflict on the next attempt).

   Fix: pre-check the persisted RiskAssessment immediately after the
   dual-confirm gate, before any state mutation. The lookup uses
   `existing.candidate_action.id` (which events.ts now stamps) and
   stores both `preflightCandidateId` + `preflightRiskAssessment` for
   the later execute block to reuse. The execute block's redundant
   lookup is removed; it asserts non-null on the preflight result
   because the early 409 already returned for the missing case.

2. **Test mocks returned a fixed actionId, masking the router's
   actionId-match invariant.**

   Fix: both events-routes.test.ts and feedback-loop.test.ts now use
   `mockImplementation(async (actionId) => ({ actionId, ... }))` so the
   returned assessment echoes the input, and the
   `action.id === riskAssessment.actionId` invariant in
   execution-router can't silently pass on a mismatched fixture.

All 713 API tests still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jayzalowitz jayzalowitz merged commit 542339f into main May 25, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

P1.1 RiskAssessment fork in auto-execute path

2 participants