Skip to content

feat(#183 AC#4 partial): zero-trust mode policy + UI#222

Merged
jayzalowitz merged 1 commit into
mainfrom
feat/zero-trust-mode
May 8, 2026
Merged

feat(#183 AC#4 partial): zero-trust mode policy + UI#222
jayzalowitz merged 1 commit into
mainfrom
feat/zero-trust-mode

Conversation

@jayzalowitz

Copy link
Copy Markdown
Owner

Summary

Closes the policy + UI half of #183 AC#4. The container runtime hooks themselves (the actual --network=none spawn) live in the desktop app and are deferred to #180's environmental work.

What's in here

Backend policy logic

  • getEffectiveRiskModifier(server) returns 1 + (zero_trust_mode ? 1 : 0), stacking on the existing MCP_HOST_TRUST_PROFILE.riskModifier of 1
  • applyZeroTrustOverride() forces every action from a zero-trust server to require approval regardless of the user's trust tier
  • New helpers exported from @skytwin/policy-engine

Repository + migration

  • mcp-server-repository.setZeroTrustMode(id, enabled) toggles the existing mcp_servers.zero_trust_mode column from migration 027
  • Migration 034 extends capability_provenance_nodes.node_type CHECK to include 'zero_trust_change' so toggle events can be audited

API routes (sessionAuth + requireOwnership)

  • POST /api/capabilities/:id/zero-trust/enable
  • POST /api/capabilities/:id/zero-trust/disable

Both write a capability_provenance_nodes row with node_type = 'zero_trust_change' and payload { from, to }.

Web UX

New "Zero-trust mode" card on capability-detail.js with:

  • State badge (enabled / disabled)
  • Toggle button
  • Explanation text describing the +1 risk modifier and the future container isolation
  • Re-renders on toggle to reflect new state
  • Singleton-delegator (uses existing _capabilityDetailListenerWired guard)

Tests

  • 6 unit tests for getEffectiveRiskModifier + applyZeroTrustOverride
  • 4 unit tests for setZeroTrustMode repo
  • 8 API integration tests for /zero-trust/enable and /disable (UUID validation, ownership check, provenance write side-effect, 400/403/404 paths)
  • 18 new tests total
  • Full suites remain green: 144 policy-engine, 179 db, 395 api

Out of scope (deferred to environmental work)

Why force-approve on every zero-trust action

The use case is "I don't trust this third-party server" — a user who enables zero-trust is by definition requesting the highest-scrutiny mode. The +1 riskModifier stacks on the existing MCP_HOST_TRUST_PROFILE.riskModifier (1+1=2), so the effective signal to the decision pipeline is "treat this as higher risk AND require human in the loop." That's the right safety posture. High-volume automation servers (crons, pipelines) become unusable in zero-trust mode by design — zero-trust is opt-in for users who want preview-before-run semantics, not a default.

Test plan

  • pnpm --filter @skytwin/policy-engine test → 144 pass
  • pnpm --filter @skytwin/db test → 179 pass
  • pnpm --filter @skytwin/api test → 395 pass
  • All builds clean

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 8, 2026 17:37
@jayzalowitz jayzalowitz added enhancement New feature or request capability-loop Capability Acquisition Loop epic + children (OSS launch v1) labels May 8, 2026

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds the “zero-trust mode” toggle surface area (policy helpers, persistence/auditability, API endpoints, and web UI) for MCP capability servers, as part of #183 AC#4 (with container/network isolation deferred to #180).

Changes:

  • Introduces @skytwin/policy-engine zero-trust helpers (getEffectiveRiskModifier, applyZeroTrustOverride) plus unit tests.
  • Adds DB support for toggling mcp_servers.zero_trust_mode and auditing via new provenance node type + migration.
  • Adds API endpoints to enable/disable zero-trust mode and a new UI card on the capability detail page to toggle and display status.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
packages/policy-engine/src/zero-trust.ts Adds zero-trust helper functions for risk modifier + approval override.
packages/policy-engine/src/index.ts Re-exports the new zero-trust helpers/types.
packages/policy-engine/src/tests/zero-trust.test.ts Unit tests for the new policy-engine helpers.
packages/db/src/repositories/provenance-repository.ts Extends provenance node type union to include zero_trust_change.
packages/db/src/repositories/mcp-server-repository.ts Adds setZeroTrustMode repository method.
packages/db/src/migrations/034-zero-trust-provenance.sql Extends DB CHECK constraint to allow zero_trust_change.
packages/db/src/tests/zero-trust-toggle.test.ts Unit tests for setZeroTrustMode.
apps/api/src/routes/capabilities.ts Adds POST /:id/zero-trust/enable and /disable endpoints + provenance write.
apps/api/src/tests/zero-trust-routes.test.ts Integration-style route tests for the enable/disable endpoints.
apps/web/public/js/pages/capability-detail.js Adds zero-trust UI card + click handler to toggle and re-render.
CHANGELOG.md Documents the new zero-trust policy/UI/API/migration work.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1843 to +1859
const updatedServer = await mcpServerRepository.setZeroTrustMode(id, true);
if (!updatedServer) {
res.status(404).json({ error: 'Capability server not found after update' });
return;
}

// Hard rail: write provenance node for every zero-trust toggle.
// Payload records the transition only — no user content.
await provenanceRepository.writeNode({
userId,
nodeType: 'zero_trust_change',
refTable: 'mcp_servers',
refId: id,
serverId: id,
payload: { from: previousValue, to: true },
});

Comment on lines +1900 to +1910
// Hard rail: write provenance node for every zero-trust toggle.
// Payload records the transition only — no user content.
await provenanceRepository.writeNode({
userId,
nodeType: 'zero_trust_change',
refTable: 'mcp_servers',
refId: id,
serverId: id,
payload: { from: previousValue, to: false },
});

Comment on lines +290 to +294
<div style="font-size: 0.85rem; color: var(--text-muted); margin: 0.5rem 0 0.75rem;">
When enabled, this capability runs with elevated scrutiny. Adds +1 risk modifier
to every action proposal and forces explicit approval regardless of trust tier.
Container-level network isolation is enforced by the desktop app (when installed).
</div>
Comment on lines +206 to +212
* Toggle zero-trust mode for a single MCP server (#183 AC#4).
*
* When enabled, the policy engine applies an additional +1 riskModifier
* to all action proposals and forces every action to require approval
* regardless of trust tier.
*
* Container-level network isolation is enforced by the desktop app (#180).
Comment thread CHANGELOG.md
Comment on lines +9 to +15
### Added — Backend policy logic

- `getEffectiveRiskModifier(server)` returns `1 + (zero_trust_mode ? 1 : 0)`,
stacking on the existing `MCP_HOST_TRUST_PROFILE.riskModifier` of 1.
- `applyZeroTrustOverride()` forces every action from a zero-trust server
to require approval regardless of the user's trust tier.
- New helpers exported from `@skytwin/policy-engine`.
Closes the policy + UI half of #183 AC#4. The container runtime hooks
themselves (the actual --network=none spawn) live in the desktop app
and are deferred to #180's environmental work.

Backend policy logic:
- getEffectiveRiskModifier(server) returns 1 + (zero_trust_mode ? 1 : 0),
  stacking on the existing MCP_HOST_TRUST_PROFILE.riskModifier of 1.
- applyZeroTrustOverride() forces every action from a zero-trust server
  to require approval regardless of trust tier.
- Helpers exported from @skytwin/policy-engine.

Repository + migration:
- mcp-server-repository.setZeroTrustMode(id, enabled) toggles the
  existing mcp_servers.zero_trust_mode column.
- Migration 034-zero-trust-provenance.sql extends the
  capability_provenance_nodes.node_type CHECK to include
  'zero_trust_change' for toggle audits.

API routes (sessionAuth + requireOwnership):
- POST /api/capabilities/:id/zero-trust/enable
- POST /api/capabilities/:id/zero-trust/disable
- Both write a capability_provenance_nodes row with
  node_type = 'zero_trust_change' and payload { from, to }.

Web UX:
- New "Zero-trust mode" card on capability-detail page with state badge,
  toggle button, and explanation text. Re-renders on toggle to reflect
  new state. Singleton-delegator (existing _capabilityDetailListenerWired
  guard).

Tests: 18 new (6 policy-engine + 4 db + 8 api). All 144/179/395 suites
remain green.

Out of scope:
- Container runtime hooks (--network=none spawn in mcp-host) — needs
  Docker / Electron testing, lives in #180
- Per-server allowlist of OAuth-provider domains — needs network policy
  infra, lives in #180
- E2E test verifying the container has no internet — needs Docker

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jayzalowitz jayzalowitz force-pushed the feat/zero-trust-mode branch from 92ff9b2 to 546613f Compare May 8, 2026 17:46
@jayzalowitz jayzalowitz merged commit 366f40c into main May 8, 2026
8 checks passed
jayzalowitz added a commit that referenced this pull request May 9, 2026
…on (#233)

The runtime hook deferred from #222. Stdio MCP servers with
zero_trust_mode=true now spawn inside docker run --network=none with
hardened security flags. Real integration test passes locally on
Docker 29.4.1 — confirmed an outbound http.get from inside the
container fails with no internet.

packages/mcp-host/src/docker-spawn.ts:
- isDockerAvailable() — which docker / where docker check
- spawnInDockerNoNetworkAsync() — spawns with full security flags:
  --network=none --rm --init --memory=512m --cpus=1
  --user=uid:gid --read-only --cap-drop=ALL
  --security-opt=no-new-privileges
- buildDockerArgs() exported for unit tests

packages/mcp-host/src/docker-stdio-transport.ts:
- DockerStdioTransport implements the MCP SDK Transport interface
  over pre-spawned Docker stdin/stdout. Container exit wires to
  onclose for CircuitBreaker notification.

McpServerConfig gains optional zeroTrustMode flag. McpServerHandle
gains failedToIsolate flag — set when Docker is unavailable but the
user requested zero-trust. Backend marks the flag; UI surfacing of
"isolation requested but Docker not available" is a follow-up.

installServer flow: when stdio + zeroTrustMode + Docker available,
wraps spawn in Docker. When Docker unavailable, logs warning + sets
failedToIsolate + falls back to regular stdio (backward compat for
users without Docker).

Constraints:
- Stdio transport only. HTTP/SSE servers are remote — --network=none
  would block all access. Documented.
- User must pre-install MCP packages via `npm install -g`. Container
  mounts host's global node_modules (ro) so npx can find them.
- 512MB / 1 CPU resource cap by default.
- Never runs as root inside the container (--user=uid:gid).

Tests: 18 unit + 1 integration. The integration test spawns
node:22-alpine with --network=none, attempts http.get, confirms it
fails. Skipped automatically when Docker is unavailable. 66/69 mcp-host
tests pass (3 pre-existing skipped). Full turbo: 34/34 tasks clean.

Closes: #183 AC#4.

Out of scope (follow-up): per-server OAuth domain allowlist (sidecar
proxy or iptables rules), per-package selective mount instead of full
global node_modules mount (security gap noted by reviewer for v1.1
multi-tenant), pre-built containers per MCP server.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jayzalowitz added a commit that referenced this pull request May 9, 2026
… epics

Audited Copilot review comments across PRs #198, #206-#215, #218, #219,
Themed follow-ups will land separately.

Security
- Shell injection in @skytwin/memory-gbrain searchSemantic (#215):
  switch from execSync with shell to execFileSync (no shell), so query
  metacharacters cannot inject. Added regression test.
- redactPII / redactPayload skipped arrays — PII in array-of-object
  payloads leaked to provenance and API responses (#209, #210, #211).
  Both helpers now recurse arrays. Tests updated (one previously asserted
  the bug as expected behavior).
- /api/capabilities/suggestions spread the row, leaking raw
  evidence_sources alongside the redacted preview (#211). Switched to
  explicit field projection.
- Email-redaction regex [A-Z|a-z] matched literal | as TLD char (#211).
  Fixed to [A-Za-z].

Correctness
- Credential vault never engaged: only the worker creates DbTokenStore
  but never called setKeyCache, so at-rest encryption + lazy migration
  were dead weight (#212). Worker now owns a KeyCache and wires it;
  cross-process unlock IPC is a #212 follow-up.
- DXT routes broken under real auth: getUserId read req.user?.id but
  production middleware sets req.authenticatedUserId (#219). Switched
  field; tests updated to mirror production middleware.
- Twin MCP provenance only fired on success (#209). Wrapped each tool
  handler in try/finally so audit fires for both success and failure;
  provenance failures never mask the underlying tool result.
- Migration 027 used inline INDEX (...) WHERE inside CREATE TABLE,
  which CockroachDB does not accept (#198). Pulled the partial indexes
  out as standalone CREATE INDEX IF NOT EXISTS — idempotent.
- Onboarding hard-coded first_run_choice='about-me' at three call
  sites (#208). Track _wizardState.firstRunChoice on welcome-screen
  selection, read everywhere downstream.
- Briefing generator capped at LIMIT 500 silently dropped users
  past the cap (#206). Switched to a 500-row paged scan.
- Zero-trust mode helpers exist but aren't wired into the decision
  pipeline (#222). Updated CHANGELOG and capability-detail copy to be
  honest; #222 follow-up tracks the wiring.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
jayzalowitz added a commit that referenced this pull request May 9, 2026
… epics

Audited Copilot review comments across PRs #198, #206-#215, #218, #219,
Themed follow-ups will land separately.

Security
- Shell injection in @skytwin/memory-gbrain searchSemantic (#215):
  switch from execSync with shell to execFileSync (no shell), so query
  metacharacters cannot inject. Added regression test.
- redactPII / redactPayload skipped arrays — PII in array-of-object
  payloads leaked to provenance and API responses (#209, #210, #211).
  Both helpers now recurse arrays. Tests updated (one previously asserted
  the bug as expected behavior).
- /api/capabilities/suggestions spread the row, leaking raw
  evidence_sources alongside the redacted preview (#211). Switched to
  explicit field projection.
- Email-redaction regex [A-Z|a-z] matched literal | as TLD char (#211).
  Fixed to [A-Za-z].

Correctness
- Credential vault never engaged: only the worker creates DbTokenStore
  but never called setKeyCache, so at-rest encryption + lazy migration
  were dead weight (#212). Worker now owns a KeyCache and wires it;
  cross-process unlock IPC is a #212 follow-up.
- DXT routes broken under real auth: getUserId read req.user?.id but
  production middleware sets req.authenticatedUserId (#219). Switched
  field; tests updated to mirror production middleware.
- Twin MCP provenance only fired on success (#209). Wrapped each tool
  handler in try/finally so audit fires for both success and failure;
  provenance failures never mask the underlying tool result.
- Migration 027 used inline INDEX (...) WHERE inside CREATE TABLE,
  which CockroachDB does not accept (#198). Pulled the partial indexes
  out as standalone CREATE INDEX IF NOT EXISTS — idempotent.
- Onboarding hard-coded first_run_choice='about-me' at three call
  sites (#208). Track _wizardState.firstRunChoice on welcome-screen
  selection, read everywhere downstream.
- Briefing generator capped at LIMIT 500 silently dropped users
  past the cap (#206). Switched to a 500-row paged scan.
- Zero-trust mode helpers exist but aren't wired into the decision
  pipeline (#222). Updated CHANGELOG and capability-detail copy to be
  honest; #222 follow-up tracks the wiring.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
jayzalowitz added a commit that referenced this pull request May 9, 2026
… epics (#226)

* fix(copilot-sweep batch 1): security + correctness across this week's epics

Audited Copilot review comments across PRs #198, #206-#215, #218, #219,
Themed follow-ups will land separately.

Security
- Shell injection in @skytwin/memory-gbrain searchSemantic (#215):
  switch from execSync with shell to execFileSync (no shell), so query
  metacharacters cannot inject. Added regression test.
- redactPII / redactPayload skipped arrays — PII in array-of-object
  payloads leaked to provenance and API responses (#209, #210, #211).
  Both helpers now recurse arrays. Tests updated (one previously asserted
  the bug as expected behavior).
- /api/capabilities/suggestions spread the row, leaking raw
  evidence_sources alongside the redacted preview (#211). Switched to
  explicit field projection.
- Email-redaction regex [A-Z|a-z] matched literal | as TLD char (#211).
  Fixed to [A-Za-z].

Correctness
- Credential vault never engaged: only the worker creates DbTokenStore
  but never called setKeyCache, so at-rest encryption + lazy migration
  were dead weight (#212). Worker now owns a KeyCache and wires it;
  cross-process unlock IPC is a #212 follow-up.
- DXT routes broken under real auth: getUserId read req.user?.id but
  production middleware sets req.authenticatedUserId (#219). Switched
  field; tests updated to mirror production middleware.
- Twin MCP provenance only fired on success (#209). Wrapped each tool
  handler in try/finally so audit fires for both success and failure;
  provenance failures never mask the underlying tool result.
- Migration 027 used inline INDEX (...) WHERE inside CREATE TABLE,
  which CockroachDB does not accept (#198). Pulled the partial indexes
  out as standalone CREATE INDEX IF NOT EXISTS — idempotent.
- Onboarding hard-coded first_run_choice='about-me' at three call
  sites (#208). Track _wizardState.firstRunChoice on welcome-screen
  selection, read everywhere downstream.
- Briefing generator capped at LIMIT 500 silently dropped users
  past the cap (#206). Switched to a 500-row paged scan.
- Zero-trust mode helpers exist but aren't wired into the decision
  pipeline (#222). Updated CHANGELOG and capability-detail copy to be
  honest; #222 follow-up tracks the wiring.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(copilot-sweep batch 1, post-/review): address Copilot inline comments on #226

Per the new merge gate (CLAUDE.md), every Copilot inline comment is
addressed before merge. Five comments on #226, all valid:

- Migration 027 idempotency: if the inline `INDEX (...) WHERE ...` form
  ever ran successfully (CRDB version-dependent), it would have created
  an auto-named partial index. Added defensive
  `DROP INDEX IF EXISTS <table>@<auto-name>` for both
  mcp_servers_last_active_at_idx and app_suggestions_user_id_idx so a
  re-run doesn't leave duplicate partial indexes covering the same
  predicate.
- briefing-generator: switched OFFSET pagination to keyset pagination
  (`AND user_id > $last`). OFFSET pagination on DISTINCT scales
  quadratically with user count — the briefing job's runtime would
  grow unboundedly. Keyset stays linear and the (user_id, status) index
  serves the range scan directly.
- DXT route docstring: removed the "other route modules use the same
  order" claim — it was misleading (other routes still vary in
  precedence). Now just notes a shared-helper #226 follow-up if the
  ordering proves load-bearing.
- gbrain-port test: renamed `mockExecSync` → `mockExecFileSync` so the
  variable name matches the API under test (execFileSync).

Tests: all 421 api + 48 worker + 20 memory-gbrain tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(CHANGELOG): add post-/review fixes subsection for #226

Per CLAUDE.md "Review Discipline": post-/review fixes get their own
commits AND a CHANGELOG subsection so the audit trail of "what review
caught" is visible to future readers.

VERSION not bumped — this batch lands as part of the stacked sweep
(#226#232); a single VERSION bump will follow when the full chain
merges, to avoid CHANGELOG/VERSION conflicts on each cascade rebase.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

capability-loop Capability Acquisition Loop epic + children (OSS launch v1) enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants