P1.9 #379: global kill switch (operator env + per-user toggle + banner) #421
Closes #379. Three coordinated levers + a chrome banner give every install the panic button CLAUDE.md launch-criterion #8 requires.

Lever 1 (operator env var):
- SKYTWIN_AUTO_EXECUTE_DISABLED=true on the API/worker process.
- Read ONCE at PolicyEvaluator construction (override via ctor option for tests so the env-var read isn't a hidden dependency).
- New early-return at the top of `evaluate()` sits AHEAD of every other check (trust tier, injection guard, autonomy, quiet hours, policy rules) so no downstream allow path can bypass it.
- Uses { allowed: true, requiresApproval: true }, not a deny, so actions still land in the Approvals queue.

Lever 2 (per-user toggle):
- autonomy_settings.paused: true | false on the user row.
- New PUT /api/users/:userId/autonomy-pause endpoint sets/clears the flag, writes pausedAt + optional pausedReason on transition.
- AutonomySettings type gains paused, pausedAt, pausedReason fields in @skytwin/shared-types/user.ts.

Lever 3 (chrome banner):
- index.html ships a sticky #autonomy-banner above <main>.
- app.js updateAutonomyBanner() fetches GET /autonomy-state, renders operator + user lines independently, and shows a Resume button only for the user-pause line (operator pause needs an env-var change).
- Refreshed on every navigate() + every 30s; backed off when API known offline.
- New GET /api/users/:userId/autonomy-state returns combined state.

Settings page:
- New "Pause auto-execution" card with confirmation modal on both transitions and optional reason prompt.
- Hydrated from /autonomy-state on render; refreshes banner + on-page state together after a flip.
- Coexists with the existing "Pause everything (demote to observer)" button: different lever, clearer label.

Operator pause reason wins when both flags are set so the banner copy reflects who set the pause.

Tests:
- 5 new policy-engine tests cover the operator-paused, user-paused, both-paused (operator wins), neither-paused regression, and isGloballyPaused() reporting matrix.
- 177 policy-engine tests, 713 API tests pass.

Safety Invariant #1 preserved: the new check strengthens the single PolicyEvaluator.evaluate funnel rather than adding a parallel path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
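A minimal sketch of the Lever 1 "read once at construction, override via ctor option" pattern described in the commit message. Everything except the `SKYTWIN_AUTO_EXECUTE_DISABLED` variable name and the `isGloballyPaused()` accessor is hypothetical: the class name, the option names, and the choice to treat only the exact string `"true"` as "on" are assumptions, not the shipped `PolicyEvaluator`.

```typescript
// Hypothetical stand-in for the env object; a real caller would pass process.env.
type EnvSource = Record<string, string | undefined>;

interface PauseOptions {
  /** Explicit override (e.g. from tests). Wins over the env var when provided. */
  globallyPaused?: boolean;
  /** Environment to read from. Injected so the env-var read is not a hidden dependency. */
  env?: EnvSource;
}

class PauseSnapshot {
  private readonly globallyPaused: boolean;

  constructor(options: PauseOptions = {}) {
    const env = options.env ?? {};
    // Read ONCE here. Later changes to `env` are not observed by this instance.
    // Assumption: only the exact string "true" engages the switch.
    this.globallyPaused =
      options.globallyPaused ?? (env["SKYTWIN_AUTO_EXECUTE_DISABLED"] === "true");
  }

  isGloballyPaused(): boolean {
    return this.globallyPaused;
  }
}
```

Because the value is snapshotted, flipping the variable takes effect only on new instances, which matches the "set the env var and restart the API" step in the test plan.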
Pull request overview
Adds a multi-layer “pause auto-execution” kill switch (global operator env var + per-user toggle) and surfaces it in the product UI (settings + persistent banner), with policy-engine enforcement and supporting API endpoints.
Changes:
- Policy engine: introduces a global pause flag (read from SKYTWIN_AUTO_EXECUTE_DISABLED at construction) and a per-user pause flag (autonomySettings.paused) intended to force manual approval.
- API: adds user-scoped endpoints to set/clear the per-user pause and to report combined pause state for UI rendering.
- Web UI: adds a sticky chrome banner and a new Settings card to toggle/present pause state, plus changelog + tests.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| packages/shared-types/src/user.ts | Extends AutonomySettings with paused, pausedAt, pausedReason. |
| packages/policy-engine/src/policy-evaluator.ts | Adds global pause initialization + pause handling in evaluate() and isGloballyPaused(). |
| packages/policy-engine/src/tests/policy-evaluator.test.ts | Adds unit tests for operator/user pause behavior. |
| apps/api/src/routes/users.ts | Adds PUT /autonomy-pause and GET /autonomy-state endpoints. |
| apps/web/public/js/pages/settings.js | Adds Settings UI + toggle handler to pause/resume per-user auto-execution. |
| apps/web/public/js/app.js | Adds updateAutonomyBanner() polling + resume handler and wires it into navigation/interval refresh. |
| apps/web/public/index.html | Adds the global banner DOM container. |
| apps/web/public/css/styles.css | Adds styling for the sticky banner and layout offset class. |
| CHANGELOG.md | Documents the new kill-switch feature set and behavior. |

    // Kill-switch escalation (#379) — runs AHEAD of every other check.
    // Sits ahead of the trust-tier gate and the injection guard so no
    // downstream allow path can bypass it. Uses `allowed: true,
    // requiresApproval: true` rather than a deny — the action is still
    // valid; the user can review + approve it manually. Operator state
    // takes precedence in the reason string when both are set so the
    // banner copy reflects who set the pause.
    const userPaused = Boolean(autonomySettings?.paused);
    if (this.globallyPaused || userPaused) {
      const reason = this.globallyPaused
        ? 'Auto-execution disabled by operator (SKYTWIN_AUTO_EXECUTE_DISABLED). Actions require manual approval until the operator restores normal mode.'
        : 'Auto-execution paused by user. Resume from Settings to let your twin act on signals again.';
      return {
        allowed: true,
        requiresApproval: true,
        reason,
      };
    }

Addressed in 1838931. Capture killSwitchActive at top, apply at the END alongside the existing requiresApproval merge. Denies short-circuit first (new test: 'kill switch does NOT override a deny — domain blocklist still wins'). Injection guard's confirmationLevel flows through unchanged. Quiet-hours early-return also propagates killSwitchActive so a paused user can't bypass via quiet hours.
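The "capture at the top, apply at the end" ordering from this reply can be sketched as a pure function. This is an illustration under stated assumptions, not the code in 1838931: the `Verdict` and `KillSwitchInput` shapes, the function name, and the shortened reason strings are hypothetical, the downstream checks are collapsed into a single precomputed verdict, and the quiet-hours propagation is left out.

```typescript
// Hypothetical verdict shape; only allowed / requiresApproval / reason /
// confirmationLevel are named in the PR discussion.
interface Verdict {
  allowed: boolean;
  requiresApproval: boolean;
  reason: string;
  confirmationLevel?: string;
}

interface KillSwitchInput {
  globallyPaused: boolean;
  userPaused: boolean;
  /** Result of the other checks (blocklist, spend cap, policy rules, injection guard). */
  downstream: Verdict;
}

function applyKillSwitch(input: KillSwitchInput): Verdict {
  // Captured up front, mirroring killSwitchActive / killSwitchReason.
  const killSwitchActive = input.globallyPaused || input.userPaused;
  // Operator wins the reason string when both flags are set.
  const killSwitchReason = input.globallyPaused
    ? "Auto-execution disabled by operator."
    : "Auto-execution paused by user.";

  const verdict = input.downstream;
  // A deny is stricter than an approval requirement, so it is never relaxed.
  if (!verdict.allowed) return verdict;
  if (!killSwitchActive) return verdict;

  // Applied last: escalate an otherwise-allowed action to manual approval.
  // Spreading keeps confirmationLevel from the injection guard intact.
  // Assumption: the kill-switch reason replaces the downstream reason.
  return { ...verdict, requiresApproval: true, reason: killSwitchReason };
}
```

The property the fix restores is that pausing can only make a verdict stricter: an allowed action becomes approval-gated, and a denied action stays denied.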

    /**
     * Whether the operator-level kill switch is currently engaged. Used
     * by the `/api/users/:userId/autonomy-state` endpoint to surface the
     * operator-paused state to the dashboard banner.
     */
    isGloballyPaused(): boolean {
      return this.globallyPaused;
    }

Addressed in 1838931. Updated the docstring to honestly describe both paths — evaluator snapshots at construction, route reads live env var; they agree by construction. A future refactor can collapse to a single source.
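For the route side of the two paths mentioned here, a hedged sketch of how a handler might assemble the combined state from a live env read plus the stored user settings. The response field names (`globalPause`, `userPause`, `pausedAt`, `pausedReason`) come from the PR description; the function name, the input types, and the decision to omit `pausedAt`/`pausedReason` when the user is not paused are assumptions.

```typescript
// Hypothetical input types for illustration.
type LiveEnv = Record<string, string | undefined>;

interface StoredPause {
  paused?: boolean;
  pausedAt?: string;
  pausedReason?: string;
}

interface AutonomyStateResponse {
  globalPause: boolean;
  userPause: boolean;
  pausedAt?: string;
  pausedReason?: string;
}

function readAutonomyState(
  env: LiveEnv,
  settings: StoredPause | undefined,
): AutonomyStateResponse {
  const userPause = Boolean(settings?.paused);
  return {
    // Live read of the same variable the evaluator snapshots at construction.
    globalPause: env["SKYTWIN_AUTO_EXECUTE_DISABLED"] === "true",
    userPause,
    // Assumption: pause metadata is only surfaced while the user pause is active.
    ...(userPause && settings?.pausedAt ? { pausedAt: settings.pausedAt } : {}),
    ...(userPause && settings?.pausedReason ? { pausedReason: settings.pausedReason } : {}),
  };
}
```

The snapshot and the live read agree as long as the variable does not change after the evaluator is constructed, which holds when it is only set at process start.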

    .autonomy-banner {
      position: fixed;
      top: 0;
      left: 0;
      right: 0;
      z-index: 9999;
      background: var(--danger, #c0392b);
      color: white;
      padding: 0.6rem 1.25rem;
      display: flex;
      align-items: center;
      justify-content: space-between;
      gap: 1rem;
      font-size: 0.9rem;
      font-weight: 500;
      box-shadow: 0 2px 8px rgba(0, 0, 0, 0.25);
    }
    .autonomy-banner[hidden] { display: none; }
    .autonomy-banner-lines { display: flex; flex-direction: column; gap: 0.15rem; flex: 1; }
    .autonomy-banner-line[hidden] { display: none; }
    .autonomy-banner-resume {
      background: white;
      color: var(--danger, #c0392b);
      border: 0;
      padding: 0.35rem 0.9rem;
      font-weight: 600;
      border-radius: 4px;
      cursor: pointer;
    }
    .autonomy-banner-resume:hover { background: var(--bg-card, #f0f0f0); }
    .autonomy-banner-resume[hidden] { display: none; }
    /* When the banner is visible, push the sidebar + main content down so
     * nothing slides behind it. Body padding-top is the simplest fix that
     * doesn't require touching layout for every chrome element. The class
     * is toggled by app.js when the banner shows/hides. */
    body.has-autonomy-banner { padding-top: 2.5rem; }

Addressed in 1838931. Added .sidebar { top: 2.5rem; height: calc(100vh - 2.5rem); } under body.has-autonomy-banner so both fixed layers move together. Caveat documented inline: 2.5rem assumes a single-line banner; multi-line wrap on narrow screens needs a future CSS var driven from banner.offsetHeight.
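The deferred follow-up mentioned in the caveat (a CSS variable driven from `banner.offsetHeight`) could look roughly like the sketch below. It is not part of this PR. The variable name `--autonomy-banner-height`, the function name, and the two structural interfaces are hypothetical; they are narrowed so the function can run against plain objects as well as real DOM elements.

```typescript
// Minimal structural types; a real HTMLElement satisfies both.
interface MeasurableBanner {
  offsetHeight: number;
  hidden: boolean;
}

interface StyleRoot {
  style: { setProperty(name: string, value: string): void };
}

/**
 * Publishes the banner's rendered height as a CSS custom property so the
 * layout offset follows multi-line wrapping instead of assuming 2.5rem.
 * Returns the value written, which makes the function easy to test.
 */
function syncBannerOffset(banner: MeasurableBanner, root: StyleRoot): string {
  // A hidden banner takes no space, so the offset collapses to zero.
  const px = banner.hidden ? 0 : banner.offsetHeight;
  const value = `${px}px`;
  root.style.setProperty("--autonomy-banner-height", value); // hypothetical name
  return value;
}
```

In the browser this would be called with the `#autonomy-banner` element and `document.documentElement` after each banner update and on resize, and the stylesheet would then use something like `padding-top: var(--autonomy-banner-height, 2.5rem)` for the body and the matching `top` for the sidebar.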

    <!-- Kill-switch banner (#379). Always rendered first inside <main> so
         it sits above the page header, sticks to the top via CSS, and
         cannot be dismissed. Visibility controlled by app.js's
         updateAutonomyBanner() against /api/users/:userId/autonomy-state.
         Operator-set env var and per-user toggle each render their own
         line so the banner can communicate who set the pause. -->
    <div id="autonomy-banner" class="autonomy-banner" hidden role="status" aria-live="polite">

Addressed in 1838931. Comment now describes the actual structure (fixed sibling of sidebar + main) and the has-autonomy-banner body-class coupling.

    const msg = target
      ? 'Pause your twin? Every action will be routed to the approvals queue for you to review manually.'
      : 'Resume auto-execution? Your twin will start acting on signals again.';
    if (!window.confirm(msg)) return;
    let reason;
    if (target) {
      reason = window.prompt('Optional: why are you pausing? (saved with the audit log; leave blank to skip)', '') || undefined;
    }

Addressed in 1838931. Copy changed to 'stored on your user record' to match reality. A proper audit-log row was spec'd in the issue but deferred — landing the engine + banner first; the audit-log entry can be a follow-up against a trust_tier_audit-like table.
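Since the reason lives on the user record rather than in an audit table, the write path amounts to a merge into the stored autonomy settings. The sketch below shows one way that merge could behave, assuming the settings are a plain JSON object. The function name and the index signature are hypothetical, and clearing `pausedAt`/`pausedReason` on resume is an assumption (the PR only says they are written "on transition").

```typescript
// Hypothetical shape: the three pause fields from the PR plus arbitrary
// other settings (spend caps, domains, overrides, quiet hours).
interface StoredAutonomySettings {
  paused?: boolean;
  pausedAt?: string;
  pausedReason?: string;
  [key: string]: unknown;
}

function applyPauseToggle(
  current: StoredAutonomySettings,
  paused: boolean,
  reason: string | undefined,
  now: Date,
): StoredAutonomySettings {
  // No transition: return an unchanged copy so repeated calls are idempotent.
  if (Boolean(current.paused) === paused) return { ...current };

  // Spread first so unrelated settings are preserved by the merge.
  const next: StoredAutonomySettings = { ...current, paused };
  delete next.pausedAt;
  delete next.pausedReason;

  if (paused) {
    next.pausedAt = now.toISOString();
    if (reason) next.pausedReason = reason; // optional; blank input is skipped
  }
  return next;
}
```

Taking `now` as a parameter keeps the function deterministic in tests. A later audit-log follow-up could record the same transition without changing this merge.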
…S sidebar; copy

Copilot review on PR #421 surfaced five issues:

1. CRITICAL: kill switch was overriding deny verdicts. Pre-fix, the early-return at the top of evaluate() turned every action (including spend-cap-exceeded, domain-blocked, policy-denied) into {allowed: true, requiresApproval: true}. Denies are strictly stricter than approvals and must never be relaxed. Also dropped the injection-guard `confirmationLevel` for extreme-severity actions.

   Fix: capture killSwitchActive + killSwitchReason at the top, but APPLY them at the very end alongside the existing requiresApproval merge. Denies still short-circuit first; the injection guard's confirmationLevel flows through unchanged. The quiet-hours early-return also propagates killSwitchActive so a paused user can't bypass via quiet hours.

   New tests in policy-evaluator.test.ts:
   - "kill switch does NOT override a deny — domain blocklist still wins"

2. isGloballyPaused docstring claimed the autonomy-state endpoint used it, but the endpoint reads process.env directly. Updated the docstring to honestly describe both paths and note they agree by construction (snapshot vs live read of the same env var); a future refactor can collapse to one path.

3. CSS banner pushed body content down but NOT the sidebar (which is position: fixed; top: 0), so the banner covered the sidebar header. Added `.sidebar { top: 2.5rem; height: calc(100vh - 2.5rem); }` under `body.has-autonomy-banner` so both layers move together. Caveat documented inline: 2.5rem assumes a single-line banner; two-line wrap on narrow screens needs a future CSS var driven from banner.offsetHeight.

4. HTML comment said the banner is "inside <main>" but it's actually a fixed sibling of sidebar + main. Comment updated to describe the actual structure + the `has-autonomy-banner` body-class coupling.

5. Settings prompt copy claimed pausedReason is "saved with the audit log" but no audit table is written today, only the user's autonomy_settings JSONB. Copy changed to "stored on your user record" to match reality. (A proper audit-log row was spec'd in the issue but deferred: landing the engine + banner first; the audit-log row can be a follow-up against an existing trust_tier_audit-like table.)

178 policy-engine tests pass (+1). All other tests still green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(safety-model): sync with shipped Tier-2 fixes

  docs/safety-model.md ended at Layer 6 (Approval Routing) and didn't document the three new layers that landed during Tier-2 polish:

  Layer 7: Global pause + per-user pause (#379)
  - SKYTWIN_AUTO_EXECUTE_DISABLED operator env var
  - autonomy_settings.paused per-user toggle
  - Sits ahead of trust-tier gate + injection guard

  Layer 8: Right to erasure (#376)
  - DELETE /api/users/:userId?confirm=delete-my-data
  - userPurgeRepository.purgeUser in a single transaction
  - Cascade via migration 061 (#413) collapses 32 tables

  Layer 9: Access audit log (#393)
  - access_log table + accessLogRepository
  - decrypt_oauth_token rows from DbTokenStore
  - Fire-and-forget; never blocks legitimate decrypt

  Each new entry follows the existing layer template (what it is, what it gates, where the code lives, can/can't do, interaction with layers above/below).

  Trust-tier-progression section also added the time-in-tier floor (#373) as the third gate alongside consecutiveApprovals and minApprovalRatio: 24h / 72h / 168h before lifting the tier.

  Pointers to the shape-lock test (promotion-thresholds-shape.test.ts) and the cascade E2E test (cascade-cleanup.e2e.test.ts) so a future reader can verify the doc matches the engine.

  Documentation-only, no code change.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(safety-model): address Copilot review on PR #451 (4 findings)

  1. Grammar: "Both routes a candidate action…" was singular/plural mismatch. Fixed to "Both route a candidate action…".

  2. Layer 7 pause-ordering claim was inaccurate. I'd written that the pause check "sits ahead of the trust-tier gate and the injection guard", but per the post-Copilot review on #421 `PolicyEvaluator.evaluate()` captures pause state at the top and APPLIES it at the END so denies (domain blocklist, spend-limit, policy deny, injection-guard confirmationLevel) aren't overridden. Rewrote the paragraph to describe the actual semantic: pause escalates an otherwise-allowed action to manual approval; a denied action stays denied.

  3. Trust-tier "all three must clear" claim was over-broad. The time-in-tier floor (#373) is engine + threshold only: the production callers that build ApprovalStats (the progress endpoint, the promotion-eligibility job) don't yet populate `hoursInCurrentTier`. Added an explicit "Enforcement caveat" bullet so the doc no longer over-promises against the engine. Reflects the same scope note from the original P1.3 PR.

  4. CHANGELOG layout: the new "Changed (docs)" heading I added ended up grouping the prior #389 onboarding fix entry under it because I didn't re-emit the "Fixed (Epic A — onboarding, #389)" heading. Restored the heading so the #389 entry is back under its proper section.

  Documentation-only, no code touched.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Closes #379. Adds the panic button CLAUDE.md launch-criterion #8 requires. Three coordinated levers + a chrome banner. No redeploy needed to halt auto-execution.

Changes

Policy engine: single funnel strengthened
- PolicyEvaluator constructor reads SKYTWIN_AUTO_EXECUTE_DISABLED (override via ctor option for tests).
- autonomySettings.paused → { allowed: true, requiresApproval: true }. Actions still land in the Approvals queue; nothing auto-executes.
- isGloballyPaused() accessor for routes that need to surface the state.

Type: AutonomySettings
- paused?: boolean, pausedAt?: string, pausedReason?: string.

API
- PUT /api/users/:userId/autonomy-pause: body { paused: boolean, reason?: string }; merges into the JSONB autonomy_settings (preserves spend caps / domains / overrides / quiet hours).
- GET /api/users/:userId/autonomy-state: returns { globalPause, userPause, pausedAt, pausedReason }.

Chrome banner
- index.html adds a sticky #autonomy-banner above <main>.
- app.js updateAutonomyBanner() polls /autonomy-state on every navigate() + every 30s. Two independent lines (operator + user). Resume button only on the user line. Body gets .has-autonomy-banner class so chrome doesn't slide behind it.

Settings page
- Hydrated from /autonomy-state on render; refreshes banner + on-page state together.

Tests
- isGloballyPaused() reporting.

Verified
- pnpm build ✓
- pnpm test ✓ (zero failures)
- pnpm lint ✓

Test plan
- SKYTWIN_AUTO_EXECUTE_DISABLED=true and restart API → banner shows "Auto-execution paused by operator…" on every route.
- curl http://localhost:3100/api/users/<uuid>/autonomy-state returns combined state.

🤖 Generated with Claude Code
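The banner rules in the description (two independent lines, Resume only on the user line) reduce to a small mapping from the combined state to visibility flags. The sketch below is illustrative only: `computeBannerView` and the `BannerView` field names are hypothetical and do not describe the actual `updateAutonomyBanner()` in app.js; only `globalPause` and `userPause` come from the `/autonomy-state` response shape.

```typescript
// Subset of the /autonomy-state response that drives visibility.
interface PauseFlags {
  globalPause: boolean;
  userPause: boolean;
}

// Hypothetical view model for the banner DOM.
interface BannerView {
  visible: boolean;
  operatorLineVisible: boolean;
  userLineVisible: boolean;
  showResume: boolean;
}

function computeBannerView(flags: PauseFlags): BannerView {
  return {
    // The banner shows if either lever is engaged.
    visible: flags.globalPause || flags.userPause,
    // Each lever renders its own line, independently of the other.
    operatorLineVisible: flags.globalPause,
    userLineVisible: flags.userPause,
    // Resume only clears the user pause; the operator pause needs an env-var change.
    showResume: flags.userPause,
  };
}
```

When both flags are set, both lines show and Resume stays available, but clearing the user pause leaves the operator line (and the banner) in place.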