Safety kernel follow-up: E2E coverage, partial-block leak, blocked-by-policy explanation persistence

## Context

Three items surfaced during the post-merge review of #75 (PR #78). The PR landed the router boundary guards, the \`whatWouldIDo\` no-leak fix for the all-blocked case, and the \`decision:blocked-by-policy\` SSE event — but two acceptance criteria from the original plan were deferred and one new finding emerged from reviewing the merged code.

This issue tracks all three so they don't get lost.

## Current State (verified 2026-04-26, post #78 merge)

### 1. E2E coverage for the safety kernel — deferred from #75

The original #75 plan listed two e2e tests as acceptance criteria #1–#5:

- E2E: ingest event → policy blocks all candidates → assert \`outcome.autoExecute === false\`, no row in \`execution_plans\`, no \`decision:executed\` SSE event
- E2E: ingest event with \`trustTier: 'observer'\` → assert approval row created, \`execution_plans\` count unchanged → approve via API → assert plan created within 2s

Neither shipped. The unit-level coverage is solid (router guards + decision-maker mock evaluator + events-routes handler test), but no test exercises the full DB → API → SSE path. A future refactor that bypasses the handler's \`outcome.autoExecute\` guard would not be caught by the current suite.

### 2. Partial-block leak in \`whatWouldIDo\`

\`packages/decision-engine/src/decision-maker.ts:240–246\` now handles the all-blocked case (returns empty \`alternativeActions\` when \`selectedAction === null\`).

But the partial case still leaks. \`evaluate()\` (lines 139–162) iterates scored candidates, calls \`policyEvaluator.evaluate\` for each, and \`break\`s on the first allowed one. Lower-scored candidates are never policy-evaluated. \`outcome.allCandidates\` therefore mixes:
- (a) the winner
- (b) higher-scored candidates that were policy-blocked
- (c) lower-scored candidates that were never evaluated

\`alternativeActions = allCandidates - selectedAction\` returns (b) + (c). Group (b) is a real leak: those candidates were explicitly denied but \`whatWouldIDo\` surfaces them as suggestions the user could take.

### 3. Blocked-by-policy ExplanationRecord persistence

#75 acceptance criterion #9 said the handler should "write an \`ExplanationRecord\` with non-empty \`escalationRationale\`" when policy blocks every candidate. PR #78 only emits the SSE event — the explanation record is not written for the no-action path.

\`apps/api/src/routes/events.ts:177–181\` already calls \`explanationGenerator.generate(...)\` before the branch logic, but for the no-action case the resulting \`ExplanationRecord\` has \`escalationRationale: undefined\` (because \`outcome.requiresApproval\` is false). The audit log loses the policy-block reason.

### 4. Unreachable \`!action\` guard

\`packages/execution-router/src/execution-router.ts:152–154\` checks \`if (!action)\` inside \`assertValidExecutionInputs\`. TypeScript guarantees \`CandidateAction\` is non-null, so this branch is unreachable in practice. No test exercises it.

## Proposed Change

### 1. E2E tests (largest item)
Add two tests to \`apps/api/src/__tests__/e2e-api.test.ts\` under a new \`Policy safety kernel\` describe block. They run only with \`E2E=true\` so they don't gate normal CI.

### 2. Per-candidate policy verdicts
Modify \`evaluate()\` in \`decision-maker.ts\` to record the policy verdict for every scored candidate (\`'allowed' | 'requires-approval' | 'denied'\`). Either widen \`DecisionOutcome\` to carry a \`Map<actionId, verdict>\` or thread the verdicts through internally and expose them only to \`whatWouldIDo\`.

\`whatWouldIDo\` filters \`alternativeActions\` to verdicts === \`'allowed'\` or \`'requires-approval'\`.

### 3. Explanation persistence for blocked-by-policy
Update \`buildEscalationRationale\` (or generate logic) so the no-action case sets \`escalationRationale\` to \`outcome.reasoning\`. Same audit fidelity as the requires-approval path.

### 4. Cover the \`!action\` guard
Add one test asserting \`assertValidExecutionInputs(null as unknown as CandidateAction, validAssessment)\` throws.

## Acceptance Criteria

1. \`E2E=true\` test for blocking policy: ingest event → response \`outcome.autoExecute === false\`, \`outcome.selectedAction === null\` → DB query for \`execution_plans\` by \`decision_id\` returns 0 rows → no \`decision:executed\` SSE captured during the test window
2. \`E2E=true\` test for approval gate: ingest event with \`trustTier: 'observer'\` → approval pending, \`execution_plans\` count unchanged → \`POST /api/approvals/:id/respond\` with action \`approve\` → \`execution_plans\` row created with matching \`decision_id\` within 2s
3. \`whatWouldIDo\` with one candidate allowed and two candidates blocked → \`response.alternativeActions\` does not include either blocked candidate
4. Same scenario, all candidates blocked → \`response.alternativeActions === []\` (regression test for current behavior)
5. \`apps/api/src/routes/events.ts\` no-action branch → persisted \`ExplanationRecord\` for that decision has non-empty \`escalationRationale\` containing \`outcome.reasoning\`
6. \`assertValidExecutionInputs(null as unknown as CandidateAction, validAssessment)\` throws \`InvariantViolationError\`
7. All existing tests pass; net new tests: ~6
8. PR passes \`/review\` before merge

## Effort Estimate

- E2E tests (#1, #2): ~1.5h (mostly fixture wiring + SSE tail helpers)
- Per-candidate verdicts (#3, #4): ~1.5h (touches DecisionOutcome type, evaluate(), whatWouldIDo, plus new tests)
- Explanation persistence (#5): ~30min
- \`!action\` guard test (#6): ~5min

**Total: ~3.5h Claude Code time**

## Files Reference

| File | Change |
|------|--------|
| \`apps/api/src/__tests__/e2e-api.test.ts\` | New \`Policy safety kernel\` describe (+2 e2e tests) |
| \`packages/decision-engine/src/decision-maker.ts\` | Track per-candidate verdicts in \`evaluate()\`, filter in \`whatWouldIDo\` |
| \`packages/decision-engine/src/__tests__/decision-maker.test.ts\` | Partial-block leak tests (+2) |
| \`packages/explanations/src/explanation-generator.ts\` | Set \`escalationRationale\` for no-action case |
| \`packages/execution-router/src/__tests__/execution-router.test.ts\` | Cover the \`!action\` branch (+1) |
| \`apps/api/src/routes/events.ts\` | (No change — explanation generator already called; behavior follows from #5) |

## Non-Goals

- Not unifying audit logging across SSE + DB — separate work
- Not adding \`InvariantViolationError\` to \`@skytwin/core\` — only relevant if other packages start enforcing the same invariants

## Related

- Closed by partial work in #78
- Companion to #76 (explanations test coverage) — adds the missing escalationRationale path


File	Change
`apps/api/src/tests/e2e-api.test.ts`	New `Policy safety kernel` describe (+2 e2e tests)
`packages/decision-engine/src/decision-maker.ts`	Track per-candidate verdicts in `evaluate()`, filter in `whatWouldIDo`
`packages/decision-engine/src/tests/decision-maker.test.ts`	Partial-block leak tests (+2)
`packages/explanations/src/explanation-generator.ts`	Set `escalationRationale` for no-action case
`packages/execution-router/src/tests/execution-router.test.ts`	Cover the `!action` branch (+1)
`apps/api/src/routes/events.ts`	(No change — explanation generator already called; behavior follows from #5)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Safety kernel follow-up: E2E coverage, partial-block leak, blocked-by-policy explanation persistence #80

Context

Current State (verified 2026-04-26, post #78 merge)

1. E2E coverage for the safety kernel — deferred from #75

2. Partial-block leak in `whatWouldIDo`

3. Blocked-by-policy ExplanationRecord persistence

4. Unreachable `!action` guard

Proposed Change

1. E2E tests (largest item)

2. Per-candidate policy verdicts

3. Explanation persistence for blocked-by-policy

4. Cover the `!action` guard

Acceptance Criteria

Effort Estimate

Files Reference

Non-Goals

Related

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Safety kernel follow-up: E2E coverage, partial-block leak, blocked-by-policy explanation persistence #80

Description

Context

Current State (verified 2026-04-26, post #78 merge)

1. E2E coverage for the safety kernel — deferred from #75

2. Partial-block leak in `whatWouldIDo`

3. Blocked-by-policy ExplanationRecord persistence

4. Unreachable `!action` guard

Proposed Change

1. E2E tests (largest item)

2. Per-candidate policy verdicts

3. Explanation persistence for blocked-by-policy

4. Cover the `!action` guard

Acceptance Criteria

Effort Estimate

Files Reference

Non-Goals

Related

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions