fix(decision-engine): record per-candidate policy verdicts; close partial-block leak#82
Merged
Merged
Conversation
…tial-block leak Closes 2/3 of #80. Items 2 (whatWouldIDo partial-block leak) and 3 (blocked-by-policy explanation persistence). E2E coverage (item 1) deferred to a follow-up. What changed: - DecisionOutcome gains an optional policyVerdicts: Record<actionId, 'allowed' | 'requires-approval' | 'denied'>. Populated by evaluate(), not persisted (DB layer ignores unknown fields). - evaluate() now evaluates every candidate against policy instead of breaking at the first allowed one. The first non-denied candidate still wins (no behavior change for selection) but lower-scored candidates get verdicts so consumers can filter accurately. Adds K-1 extra in-memory policy checks per decision in the worst case. - whatWouldIDo filters alternativeActions using verdicts. Blocked candidates no longer leak as suggestions. Conservative fallback: when verdicts are unavailable (legacy code path or stale outcome), alternatives default to empty. - ExplanationGenerator now sets escalationRationale for the no-action case (every candidate blocked). Previously the audit log dropped the policy-block reason. Safety Invariant #2. - formatForUser uses a context-aware label: "Why no action was taken" for blocked-by-policy records, "Why approval was needed" for requires-approval records. Tests: +6 (decision-engine: +3, explanations: +3). All existing tests still pass: decision-engine 82/82, explanations 33/33, full repo 38/38 turbo tasks. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes the “partial-block leak” by recording a per-candidate policy verdict during decision evaluation and using those verdicts to filter whatWouldIDo alternatives, and it also ensures blocked-by-policy explanations persist the escalation rationale for audit fidelity.
Changes:
- Extend
DecisionOutcomewith optionalpolicyVerdictsand add thePolicyVerdicttype. - Update
DecisionMaker.evaluate()to evaluate policy for every candidate and populatepolicyVerdicts; updatewhatWouldIDoto conservatively filter alternatives using these verdicts. - Persist
escalationRationalefor the no-action (all-blocked) path and adjust user formatting labels accordingly; add unit tests for both fixes.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| packages/shared-types/src/index.ts | Re-exports the new PolicyVerdict type. |
| packages/shared-types/src/decision.ts | Adds DecisionOutcome.policyVerdicts and defines the PolicyVerdict union. |
| packages/decision-engine/src/decision-maker.ts | Records per-candidate policy verdicts in evaluate() and filters whatWouldIDo alternatives to non-denied candidates (with conservative fallback). |
| packages/decision-engine/src/tests/decision-maker.test.ts | Adds coverage for policyVerdicts population and partial/legacy filtering behavior in whatWouldIDo. |
| packages/explanations/src/explanation-generator.ts | Ensures escalationRationale is set for no-action outcomes and uses a context-aware label in formatForUser. |
| packages/explanations/src/tests/explanation-generator.test.ts | Adds tests for escalation rationale persistence and the new label behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
2 tasks
jayzalowitz
added a commit
that referenced
this pull request
Apr 27, 2026
…h work (#87) Captures eight PRs (#77, #78, #79, #81, #82, #83, #84, #85, #86) under an [Unreleased] section. No code changes — pure documentation. Bumping VERSION + cutting a tag is left as a separate decision so the release moment stays explicit. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes 2/3 of #80 (items 2 and 3). E2E coverage (item 1) deferred to a follow-up PR.
Item 2 — Partial-block leak in `whatWouldIDo` (the bigger fix)
Item 3 — Blocked-by-policy explanation persistence
Test plan
Refs #80
🤖 Generated with Claude Code