test(safety): lock policy-check-blocks-execution invariant by jayzalowitz · Pull Request #78 · jayzalowitz/skytwin

jayzalowitz · 2026-04-26T20:27:11Z

Summary

`ExecutionRouter` now throws `InvariantViolationError` when called without a `RiskAssessment` or with a mismatched `actionId` — pins Safety Invariants feat: bootstrap SkyTwin monorepo (v0.1.0.0) #1 and feat: real OpenClaw execution, 6 new domains, expanded onboarding #7 at the boundary so a future caller that skips the decision pipeline cannot silently auto-execute (+4 unit tests)
`DecisionMaker.whatWouldIDo` no longer leaks blocked candidates via `alternativeActions` when policy denies every candidate; returns empty alternatives and surfaces the blocking reason via `policyNotes` (+1 unit test pinning the no-leak contract)
`POST /api/events/ingest` emits a `decision:blocked-by-policy` SSE event when no action was selected and no approval was created, so users can see the policy result instead of silent ingestion (+1 unit test)

Production call sites (`apps/api/src/routes/events.ts:230`, `apps/api/src/routes/approvals.ts:264`) already build matching `RiskAssessment` objects — the new guards are inert for them and active against future orphan callers.

Test plan

`pnpm --filter @skytwin/execution-router test` — 66/66 (was 62)
`pnpm --filter @skytwin/decision-engine test` — 79/79 (was 78)
`pnpm --filter @skytwin/api test` — 142 passed | 22 skipped (was 141)
`pnpm test` — 38/38 turbo tasks
`pnpm lint` — 30/30 turbo tasks

Closes #75

🤖 Generated with Claude Code

Closes #75. Three additive guards on the safety kernel: 1. ExecutionRouter throws InvariantViolationError when called without a RiskAssessment or with a CandidateAction whose id does not match the assessment. Pins Safety Invariants #1 and #7 at the boundary, so a future caller that bypasses the decision pipeline cannot silently auto-execute. (+4 unit tests) 2. DecisionMaker.whatWouldIDo no longer leaks blocked candidates as alternativeActions when policy denies every candidate. Returns an empty alternatives array and surfaces the blocking reason via policyNotes so the prediction reflects what the user could actually take. (+1 unit test pinning the no-leak contract) 3. POST /api/events/ingest emits a decision:blocked-by-policy SSE event when no action was selected and no approval was created, so users see the policy result instead of silent ingestion. (+1 unit test) Production call sites (apps/api/src/routes/events.ts:230, apps/api/src/routes/approvals.ts:264) already build matching RiskAssessments — guards are inert for them, active against new orphan callers. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Locks several safety invariants at key boundaries by adding runtime guards in the execution router, preventing policy-blocked action leakage in the prediction API, and improving observability for “no action taken” outcomes via SSE.

Changes:

Add InvariantViolationError + runtime input validation to ExecutionRouter (including streaming path) with new unit tests.
Prevent DecisionMaker.whatWouldIDo() from returning blocked candidates via alternativeActions when nothing is policy-allowed; surface policyNotes instead.
Emit a new decision:blocked-by-policy SSE event from POST /api/events/ingest when no action is selected, with route-level test coverage.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
packages/execution-router/src/index.ts	Re-export `InvariantViolationError` for external consumers.
packages/execution-router/src/execution-router.ts	Add invariant guard + new error type; enforce action/assessment consistency before executing/streaming.
packages/execution-router/src/tests/execution-router.test.ts	Add unit tests asserting invariant guard behavior for sync + streaming entrypoints.
packages/decision-engine/src/decision-maker.ts	Prevent “alternativeActions” leakage when policy blocks all candidates; add `policyNotes` behavior.
packages/decision-engine/src/tests/decision-maker.test.ts	Add test pinning the “no-leak when all blocked” whatWouldIDo contract.
apps/api/src/routes/events.ts	Emit `decision:blocked-by-policy` SSE event when ingest results in no selected action and no approval/execution.
apps/api/src/tests/events-routes.test.ts	Add test asserting the new blocked-by-policy SSE emission and that execution is not triggered.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-26T20:31:47Z

+      // Surface "no action taken" outcomes (every candidate blocked, or none
+      // generated) so the user can see why nothing happened. Without this the
+      // event ingest is silent and the policy decision is invisible — Safety
+      // Invariant #1 (every auto-execute path went through a policy check) is
+      // structurally enforced upstream, but the *result* of that check needs
+      // to be observable.
+      if (!outcome.selectedAction && !approvalRequest && !executionResult) {
+        sseManager.emit(userId, 'decision:blocked-by-policy', {
+          decisionId: decision.id,
+          reason: outcome.reasoning,
+          domain: decision.domain,
+          situationType: decision.situationType,
+          urgency: decision.urgency,
+        });
+      }


The SSE event name decision:blocked-by-policy is emitted whenever selectedAction is null, which also includes the "no candidates generated" path (DecisionMaker returns selectedAction: null with reasoning like "No candidate actions could be generated..."). That means this event can represent non-policy cases and may mislead clients that treat it as a policy denial. Consider distinguishing these outcomes (e.g., emit a separate event for allCandidates.length === 0, or include a reasonCode/blockedByPolicy field in the payload, or tighten the condition to only use this event when there were candidates but none were allowed).

Copilot · 2026-04-26T20:31:47Z

+  it('emits decision:blocked-by-policy when no action was selected (Safety Invariant #1)', async () => {
+    mockEvaluate.mockResolvedValue({
+      autoExecute: false,
+      requiresApproval: false,
+      reasoning: 'All candidates blocked by policy "No travel auto-bookings".',
+      selectedAction: null,
+      allCandidates: [],
+    });
+
+    const res = await request(buildApp(), 'POST', '/api/events/ingest', {
+      userId: 'user-1',
+      source: 'test',
+      type: 'travel_decision',
+    });


This test describes a travel policy block ("No travel auto-bookings") and sends type: 'travel_decision', but mockInterpret in beforeEach still returns a calendar decision (domain: 'calendar', situationType: 'calendar_conflict'). The test passes because it doesn’t assert on domain/situationType, but the mismatch makes the scenario harder to reason about. Consider overriding mockInterpret in this test to return a travel decision, or adjust the test wording/request body to match the mocked decision.

) Two follow-ups from the post-merge review of #78: - Add the missing test for assertValidExecutionInputs(null, ...). The branch is unreachable via TypeScript, but defensive code on a safety boundary deserves a test so future refactors can't silently delete it. - Convert the seven shared-fixture describe blocks in explanation-generator.test.ts from module-level const to per-test beforeEach. Saved records no longer accumulate across it() calls in the same describe, so future "save called once" assertions won't inherit cross-contamination. Net new tests: +1. Existing tests still 30/30 explanations, now 67/67 execution-router. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

…h work (#87) Captures eight PRs (#77, #78, #79, #81, #82, #83, #84, #85, #86) under an [Unreleased] section. No code changes — pure documentation. Bumping VERSION + cutting a tag is left as a separate decision so the release moment stays explicit. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings April 26, 2026 20:27

Copilot started reviewing on behalf of jayzalowitz April 26, 2026 20:27 View session

Copilot AI reviewed Apr 26, 2026

View reviewed changes

jayzalowitz merged commit 8f03ee7 into main Apr 26, 2026
12 checks passed

jayzalowitz deleted the jayzalowitz/policy-blocks-execution branch April 26, 2026 20:38

This was referenced Apr 26, 2026

Safety kernel follow-up: E2E coverage, partial-block leak, blocked-by-policy explanation persistence #80

Closed

test: cover null-action guard and isolate explanation test fixtures #81

Merged

jayzalowitz mentioned this pull request Apr 27, 2026

docs: capture safety-kernel and dashboard-polish work in CHANGELOG #87

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(safety): lock policy-check-blocks-execution invariant#78

test(safety): lock policy-check-blocks-execution invariant#78
jayzalowitz merged 1 commit into
mainfrom
jayzalowitz/policy-blocks-execution

jayzalowitz commented Apr 26, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 26, 2026

Uh oh!

Copilot AI Apr 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jayzalowitz commented Apr 26, 2026

Summary

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants