Safety kernel test: prove policy denial blocks auto-execution end-to-end #75

@jayzalowitz

Context

Safety Invariant #1 (CLAUDE.md) is the non-negotiable rule of the decision pipeline: "Never auto-execute without a policy check." Every path that produces an autoExecute: true outcome must have passed PolicyEvaluator.evaluate().

Today the invariant is enforced structurally: DecisionMaker.evaluate() runs every candidate through the policy evaluator, and apps/api/src/routes/events.ts branches on outcome.autoExecute before calling the execution router. A unit test at decision-maker.test.ts:186 covers the blocked path with a mock evaluator.

What's missing is an integration-level test that proves the invariant at the handler boundary: when the policy layer denies all candidates, no execution adapter is invoked, no execution plan is persisted, and no plan_completed/plan_failed event is emitted. Without this, a future refactor that accidentally calls executeWithRoutingStreaming outside the outcome.autoExecute branch would still pass both unit tests and type-checking.

This is the single highest-leverage safety test in the codebase. The whole trust model collapses if the policy check is ever bypassed.

Claude Code estimate: ~2-3h

Current State (verified 2026-04-23)

Policy enforcement path

packages/decision-engine/src/decision-maker.ts:139-162 loops scored candidates, calls policyEvaluator.evaluate(...), and only sets selectedAction when policyDecision.allowed === true. If every candidate is blocked, selectedAction stays null and outcome.autoExecute stays false.

apps/api/src/routes/events.ts:205-218 escalation branch: if outcome.requiresApproval, creates an approval and emits decision:pending-approval — does not call execution router.

apps/api/src/routes/events.ts:219-310 execution branch: guarded by outcome.autoExecute && outcome.selectedAction, calls executionRouter.executeWithRoutingStreaming(...), persists an execution_plans row, emits decision:step / decision:executed.
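The selection loop described above can be sketched roughly as follows. This is a simplified illustration, not the code in decision-maker.ts: the type shapes, the `selectAction` helper, the `score` field, and the synchronous `evaluate` signature are all assumptions, and the trust-tier and risk checks that also gate autoExecute are left out.

```typescript
// Hypothetical, simplified shapes; the real types live in the decision-engine package.
interface CandidateAction {
  id: string;
  actionType: string;
  score: number;
}

interface PolicyDecision {
  allowed: boolean;
  reason?: string;
}

interface PolicyEvaluator {
  evaluate(candidate: CandidateAction): PolicyDecision;
}

interface DecisionOutcome {
  selectedAction: CandidateAction | null;
  autoExecute: boolean;
  reasoning: string;
}

// Walk candidates from highest to lowest score and pick the first one the policy allows.
function selectAction(
  candidates: CandidateAction[],
  evaluator: PolicyEvaluator,
): DecisionOutcome {
  const ranked = [...candidates].sort((a, b) => b.score - a.score);
  const blockedReasons: string[] = [];

  for (const candidate of ranked) {
    const decision = evaluator.evaluate(candidate);
    if (decision.allowed === true) {
      // Simplification: an allowed candidate is treated as auto-executable here.
      return { selectedAction: candidate, autoExecute: true, reasoning: `Selected ${candidate.id}` };
    }
    blockedReasons.push(`${candidate.id}: ${decision.reason ?? 'blocked by policy'}`);
  }

  // Every candidate was denied: nothing is selected and nothing may auto-execute.
  return {
    selectedAction: null,
    autoExecute: false,
    reasoning: `All candidates blocked (${blockedReasons.join('; ')})`,
  };
}

// Usage: an evaluator that denies every email-send candidate.
const denyEmailSend: PolicyEvaluator = {
  evaluate: (c) =>
    c.actionType === 'email-send'
      ? { allowed: false, reason: 'email-send is blocked' }
      : { allowed: true },
};

const blockedOutcome = selectAction(
  [{ id: 'a1', actionType: 'email-send', score: 0.9 }],
  denyEmailSend,
);
```

With only blocked candidates, `blockedOutcome.selectedAction` stays null and `blockedOutcome.autoExecute` stays false, which is the state the handler's execution branch must refuse to act on.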

Unit coverage

packages/decision-engine/src/__tests__/decision-maker.test.ts:186-212 — "should deny action when policy blocks it" — uses a mock PolicyEvaluator that returns { allowed: false } and asserts outcome.autoExecute === false and outcome.selectedAction === null. Good, but the mock collapses the whole policy layer into one return value.

Integration coverage gaps

  • apps/api/src/__tests__/e2e-api.test.ts has approval-lifecycle and policy-CRUD tests but none that exercise the "all candidates blocked" path and assert execution router was never called.
  • No test covers the requiresApproval escalation branch (events.ts:205-218) end-to-end against a real database with a user-scoped policy.
  • No test covers the whatWouldIDo prediction path in decision-maker.ts:237: does the predict/query flow honor the same policy layer, or can it leak action recommendations that would actually be blocked at execution time?
  • packages/execution-router/src/__tests__/execution-router.test.ts does not have a "never executes without upstream policy pass" check — the router accepts any CandidateAction + RiskAssessment pair handed to it, so the invariant lives entirely in the caller.

Observability gaps

When a policy blocks, outcome.reasoning gets set (decision-maker.ts:160) but nothing is emitted to the audit log or SSE stream. A user has no way to see "SkyTwin declined to act because policy X blocked Y" unless they read the stored DecisionOutcome.

Proposed Change

1. Add e2e test: policy denial blocks execution

New test in apps/api/src/__tests__/e2e-api.test.ts (new describe('Policy safety kernel') block):

  • Create a user at trustTier: 'confident' (would normally auto-execute).
  • Create a custom action policy via POST /api/policies/:userId that blocks actionType: 'email-send' unconditionally (e.g. conditions: { block: true }).
  • Ingest an event that would generate an email-send candidate.
  • Assert: response outcome.autoExecute === false, outcome.selectedAction === null.
  • Assert: GET /api/audit/:userId shows zero execution events for this decision.
  • Assert: no row in execution_plans with decision_id matching the ingested decision.
  • Assert: the decision:executed SSE event was never emitted (tail SSE during the test window).
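The three negative assertions above could share one helper. The sketch below uses in-memory stand-ins, so it only shows the shape of the check: `Recorder`, `handleOutcome`, and `assertNothingExecuted` are hypothetical names, and the real test would drive the HTTP API, query the execution_plans table, and tail the SSE stream instead.

```typescript
// Hypothetical stand-ins for the real execution router, plan store, and SSE stream.
interface SelectedAction {
  id: string;
  actionType: string;
}

interface Outcome {
  decisionId: string;
  autoExecute: boolean;
  requiresApproval: boolean;
  selectedAction: SelectedAction | null;
}

interface Recorder {
  routerCalls: string[]; // action ids passed to the execution router
  planRows: string[];    // decision ids persisted as execution plans
  sseEvents: string[];   // event names emitted during the test window
}

function createRecorder(): Recorder {
  return { routerCalls: [], planRows: [], sseEvents: [] };
}

// Simplified handler: execution side effects happen only inside the autoExecute guard.
function handleOutcome(outcome: Outcome, rec: Recorder): void {
  if (outcome.requiresApproval) {
    rec.sseEvents.push('decision:pending-approval');
    return;
  }
  if (outcome.autoExecute && outcome.selectedAction) {
    rec.routerCalls.push(outcome.selectedAction.id);
    rec.planRows.push(outcome.decisionId);
    rec.sseEvents.push('decision:step', 'decision:executed');
  }
}

// Fails if the router ran, a plan was stored, or an execution event was emitted.
function assertNothingExecuted(rec: Recorder, decisionId: string): void {
  if (rec.routerCalls.length !== 0) throw new Error('execution router was invoked');
  if (rec.planRows.some((id) => id === decisionId)) throw new Error('execution plan was persisted');
  const leaked = rec.sseEvents.filter((e) => e === 'decision:step' || e === 'decision:executed');
  if (leaked.length !== 0) throw new Error(`unexpected SSE events: ${leaked.join(', ')}`);
}

// Usage: a fully blocked decision must leave every recorder empty.
const rec = createRecorder();
handleOutcome(
  { decisionId: 'd1', autoExecute: false, requiresApproval: false, selectedAction: null },
  rec,
);
assertNothingExecuted(rec, 'd1');
```

Checking all three side effects in one helper means a refactor that moves the router call outside the guard fails the test even if it still skips persistence or event emission.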

2. Add e2e test: requiresApproval escalation path

Same block, second test:

  • Create a user at trustTier: 'observer' (forces escalation).
  • Ingest event.
  • Assert: outcome.requiresApproval === true, outcome.autoExecute === false.
  • Assert: approval row created with status: 'pending'.
  • Assert: no execution plan created until approval is approved.
  • Approve via POST /api/approvals/:id/respond.
  • Assert: execution plan created after approval.
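The ordering this test pins down (pending approval first, plan only after an explicit approve) can be modeled as a small state machine. This is an in-memory sketch: `ApprovalGate` and its methods are hypothetical, and the `reject` action and the error on a second response are assumptions that the issue does not specify.

```typescript
type ApprovalStatus = 'pending' | 'approved' | 'rejected';

interface ApprovalRow {
  id: string;
  decisionId: string;
  status: ApprovalStatus;
}

class ApprovalGate {
  private readonly approvals = new Map<string, ApprovalRow>();
  // Decision ids that have an execution plan; stands in for the execution_plans table.
  readonly planDecisionIds: string[] = [];

  // Escalation branch: record a pending approval and create no plan.
  escalate(approvalId: string, decisionId: string): ApprovalRow {
    const row: ApprovalRow = { id: approvalId, decisionId, status: 'pending' };
    this.approvals.set(approvalId, row);
    return row;
  }

  // Respond branch: only an explicit approve on a pending row creates a plan.
  respond(approvalId: string, action: 'approve' | 'reject'): ApprovalRow {
    const row = this.approvals.get(approvalId);
    if (!row) throw new Error(`Unknown approval ${approvalId}`);
    if (row.status !== 'pending') {
      throw new Error(`Approval ${approvalId} is already ${row.status}`);
    }
    row.status = action === 'approve' ? 'approved' : 'rejected';
    if (row.status === 'approved') this.planDecisionIds.push(row.decisionId);
    return row;
  }
}

// Usage: mirrors the assertion order of the test steps above.
const gate = new ApprovalGate();
const statusAfterEscalation = gate.escalate('ap1', 'd1').status;
const plansBeforeApproval = gate.planDecisionIds.length;
gate.respond('ap1', 'approve');
const plansAfterApproval = gate.planDecisionIds.length;
```

In the real test the two plan counts would come from querying execution_plans by decision_id before and after the POST to /api/approvals/:id/respond.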

3. Lock the invariant at the router boundary

Add a runtime guard in packages/execution-router/src/execution-router.ts on the entry to executeWithRoutingStreaming:

  • If caller has not supplied a RiskAssessment with overallTier, throw InvariantViolationError.
  • Log a structured warning if called with an action whose confidence === SPECULATIVE and overallTier === HIGH/CRITICAL (not a block, just a tripwire).

Add unit test in packages/execution-router/src/__tests__/execution-router.test.ts:

  • Invoking executeWithRoutingStreaming(action, null as unknown as RiskAssessment, userId) throws.
  • Invoking with mismatched actionId between action and assessment throws.
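One possible shape for the guard and error class is sketched below. The `assertRoutable` helper, the string representation of confidence and tier values, and the injected `warn` callback are assumptions for illustration; the real guard would sit at the top of executeWithRoutingStreaming and use the project's own enums and logger.

```typescript
class InvariantViolationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'InvariantViolationError';
    // Keeps instanceof working when compiled to targets without native class support.
    Object.setPrototypeOf(this, InvariantViolationError.prototype);
  }
}

// Hypothetical minimal shapes; the real types carry more fields.
interface RoutedAction {
  id: string;
  confidence: string;
}

interface RiskAssessment {
  actionId: string;
  overallTier: string;
}

function assertRoutable(
  action: RoutedAction,
  assessment: RiskAssessment | null | undefined,
  warn: (message: string) => void = (m) => console.warn(m),
): void {
  // Hard failure: no assessment, or an assessment without a tier.
  if (assessment == null || !assessment.overallTier) {
    throw new InvariantViolationError(
      `Action ${action.id} reached the router without a RiskAssessment`,
    );
  }
  // Hard failure: the assessment belongs to a different action.
  if (assessment.actionId !== action.id) {
    throw new InvariantViolationError(
      `RiskAssessment for ${assessment.actionId} does not match action ${action.id}`,
    );
  }
  // Tripwire only: log the suspicious combination but let execution continue.
  const highRisk = assessment.overallTier === 'HIGH' || assessment.overallTier === 'CRITICAL';
  if (action.confidence === 'SPECULATIVE' && highRisk) {
    warn(`Speculative action ${action.id} routed at ${assessment.overallTier} risk`);
  }
}

// Usage: collect tripwire warnings instead of printing them.
const tripwireWarnings: string[] = [];
assertRoutable(
  { id: 'a1', confidence: 'SPECULATIVE' },
  { actionId: 'a1', overallTier: 'CRITICAL' },
  (m) => tripwireWarnings.push(m),
);
```

The guard cannot prove that a policy check ran upstream. It only rejects calls that skipped risk assessment entirely or paired the wrong assessment with an action, so the e2e tests remain the primary proof of the invariant.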

4. Cover the predict path

Add unit test in decision-maker.test.ts for whatWouldIDo:

  • Mock evaluator returns allowed: false for all candidates.
  • Assert whatWouldIDo response does not recommend a blocked action — either returns the top allowed candidate or an explicit "no recommendation" result.

5. Emit blocked-by-policy audit event

In events.ts, add a branch: if !outcome.selectedAction && !outcome.requiresApproval, emit decision:blocked-by-policy SSE event and write an ExplanationRecord with escalationRationale: outcome.reasoning. This makes the invariant observable.
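A minimal sketch of that branch follows, with arrays standing in for the SSE stream and the explanation store. `recordBlockedDecision`, `HandlerEffects`, and the fallback reason text are hypothetical; only the event name, the `{ decisionId, reason }` payload, and the escalationRationale field come from this issue.

```typescript
// Hypothetical simplified shapes for the handler's inputs and side effects.
interface HandlerOutcome {
  decisionId: string;
  autoExecute: boolean;
  requiresApproval: boolean;
  selectedAction: { id: string } | null;
  reasoning: string;
}

interface BlockedEvent {
  type: 'decision:blocked-by-policy';
  payload: { decisionId: string; reason: string };
}

interface ExplanationRecord {
  decisionId: string;
  escalationRationale: string;
}

interface HandlerEffects {
  sse: BlockedEvent[];
  explanations: ExplanationRecord[];
}

// Returns true when the outcome was recorded as blocked, false when another branch owns it.
function recordBlockedDecision(outcome: HandlerOutcome, effects: HandlerEffects): boolean {
  if (outcome.selectedAction || outcome.requiresApproval) return false;

  // Assumed fallback so escalationRationale is never empty.
  const reason = outcome.reasoning.trim() || 'All candidates were blocked by policy';

  effects.explanations.push({ decisionId: outcome.decisionId, escalationRationale: reason });
  effects.sse.push({
    type: 'decision:blocked-by-policy',
    payload: { decisionId: outcome.decisionId, reason },
  });
  return true;
}

// Usage: a fully blocked outcome produces one record and one event.
const effects: HandlerEffects = { sse: [], explanations: [] };
const wasBlocked = recordBlockedDecision(
  {
    decisionId: 'd1',
    autoExecute: false,
    requiresApproval: false,
    selectedAction: null,
    reasoning: 'policy no-email blocked email-send',
  },
  effects,
);
```

Note that `!outcome.selectedAction && !outcome.requiresApproval` may also hold when no candidates were generated at all, so the real handler might need an extra signal to distinguish "blocked by policy" from "nothing to do".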

Acceptance Criteria

  1. E2E test creates user with trustTier: 'confident' + blocking policy for email-send → ingests matching event → response outcome.autoExecute === false and outcome.selectedAction === null.
  2. Same test → execution_plans table queried by decision_id returns zero rows.
  3. Same test → no decision:step or decision:executed SSE event emitted in the test window.
  4. E2E test with trustTier: 'observer' + standard event → approval row created with status pending; execution_plans count unchanged before approval.
  5. Same test → POST /api/approvals/:id/respond with action approve → within 2s, execution_plans row created with matching decision_id.
  6. ExecutionRouter.executeWithRoutingStreaming throws InvariantViolationError when called with null or undefined RiskAssessment.
  7. ExecutionRouter.executeWithRoutingStreaming throws when action.id !== assessment.actionId.
  8. whatWouldIDo with all candidates blocked by a mock evaluator returns no recommended action (does not leak a blocked candidate).
  9. When all candidates are blocked by policy, handler writes an ExplanationRecord with non-empty escalationRationale and emits decision:blocked-by-policy SSE event with shape { decisionId, reason }.
  10. grep -rn "executeWithRoutingStreaming" apps/ packages/ — every call site is either inside an if (outcome.autoExecute) guard or behind an approval check. No orphan call sites.
  11. All existing tests pass. New test count: +6 unit, +2 e2e.
  12. PR passes /review before merge.

Testing Plan

| Layer | What | Count |
| --- | --- | --- |
| Unit (execution-router) | Null/undefined RiskAssessment throws | +1 |
| Unit (execution-router) | Mismatched actionId throws | +1 |
| Unit (decision-maker) | whatWouldIDo honors policy blocks | +2 |
| Unit (explanations) | Blocked-by-policy ExplanationRecord shape | +1 |
| Unit (events handler) | Emits decision:blocked-by-policy branch | +1 |
| E2E | Policy blocks → no execution, no plan, no SSE | +1 |
| E2E | Approval gate → plan created only after approve | +1 |

Effort Estimate

Total: ~2-3h Claude Code time

Files Reference

| File | Change |
| --- | --- |
| apps/api/src/__tests__/e2e-api.test.ts | Add Policy safety kernel describe block (+2 tests) |
| packages/execution-router/src/execution-router.ts | Add runtime invariant guard on entry |
| packages/execution-router/src/__tests__/execution-router.test.ts | Add guard tests (+2 tests) |
| packages/decision-engine/src/__tests__/decision-maker.test.ts | Add whatWouldIDo policy tests (+2 tests) |
| apps/api/src/routes/events.ts | Add decision:blocked-by-policy branch + audit emit |
| packages/explanations/src/__tests__/explanation-generator.test.ts | Add blocked-by-policy record test (see companion issue) |
| packages/core/src/errors.ts | Add InvariantViolationError class if not present |

Non-Goals

  • Not rewriting the policy engine or changing evaluation semantics.
  • Not adding UI affordances for blocked decisions (covered separately by dashboard work).
  • Not adding per-candidate reasoning traces — one reasoning string on DecisionOutcome is sufficient for this iteration.
