fix(copilot-sweep batch 1): security + correctness across this week's epics#226
Merged
Conversation
There was a problem hiding this comment.
Pull request overview
This PR delivers “batch 1” fixes from a cross-epic audit, focusing on security-critical and clearly broken behaviors across the memory-gbrain CLI integration, PII redaction, capability suggestions API, worker credential-vault wiring, MCP provenance auditing, CockroachDB migration compatibility, onboarding telemetry, and briefing generation.
Changes:
- Hardened security boundaries: removed shell execution from
memory-gbrainsearch, fixed email-redaction regex, ensured PII redaction recurses into arrays, and prevented leaking rawevidence_sourcesfrom suggestions responses. - Fixed correctness gaps in production wiring: worker now wires a
KeyCacheintoDbTokenStore, DXT routes now readreq.authenticatedUserId, MCP provenance is written on both success and failure, onboarding records the correctfirst_run_choice, and briefing generation no longer silently caps at 500 users. - Improved migration/runtime honesty: adjusted migration 027 partial-index creation for CRDB compatibility and updated user-facing copy / CHANGELOG to reflect current zero-trust wiring status.
Reviewed changes
Copilot reviewed 16 out of 17 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| pnpm-lock.yaml | Adds workspace linkage for @skytwin/credential-vault dependency. |
| packages/memory-gbrain/src/gbrain-port.ts | Switches to execFileSync for shell-injection safety; returns empty capabilities when gbrain isn’t installed. |
| packages/memory-gbrain/src/tests/gbrain-port.test.ts | Updates mocks for execFileSync and adds regression coverage for argv-based invocation. |
| packages/db/src/migrations/027-capability-acquisition.sql | Moves partial indexes to standalone CREATE INDEX IF NOT EXISTS statements for CRDB compatibility. |
| CHANGELOG.md | Documents the batch-1 sweep and clarifies zero-trust enforcement status. |
| apps/worker/src/jobs/briefing-generator.ts | Replaces hard LIMIT 500 with paging over active users. |
| apps/worker/src/index.ts | Wires a worker-local KeyCache into DbTokenStore for vault decrypt/lazy migration readiness. |
| apps/worker/package.json | Adds @skytwin/credential-vault dependency. |
| apps/web/public/js/pages/onboarding.js | Persists and reuses welcome-screen first-run choice instead of hard-coding 'about-me'. |
| apps/web/public/js/pages/capability-detail.js | Updates copy to accurately reflect current zero-trust enforcement status. |
| apps/twin-mcp-server/src/server.ts | Ensures external-agent provenance writes fire via try/finally and don’t mask tool errors. |
| apps/twin-mcp-server/src/audit/provenance-writer.ts | Fixes recursive PII redaction to traverse arrays and nested structures. |
| apps/twin-mcp-server/src/tests/auth.test.ts | Adds regression coverage for array-of-object PII redaction. |
| apps/api/src/routes/dxt.ts | Fixes user identity extraction to prefer req.authenticatedUserId and fall back safely. |
| apps/api/src/routes/capabilities.ts | Fixes recursive redaction for arrays, tightens suggestions response projection, fixes email regex. |
| apps/api/src/tests/provenance-graph-routes.test.ts | Updates expectations and adds coverage for array redaction recursion. |
| apps/api/src/tests/dxt-routes.test.ts | Aligns tests with production auth shape (req.authenticatedUserId). |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ); | ||
| -- Partial index — pulled out of CREATE TABLE because some CRDB versions | ||
| -- choke on inline `INDEX (...) WHERE ...`. Standalone form (matches the | ||
| -- pattern in 011-sessions.sql). |
| INDEX (user_id) WHERE status = 'pending' | ||
| updated_at TIMESTAMPTZ NOT NULL DEFAULT now() | ||
| ); | ||
| -- Pulled out of CREATE TABLE for the same CRDB-compat reason as above. |
Comment on lines
+53
to
+66
| let offset = 0; | ||
| for (;;) { | ||
| const result = await query<{ user_id: string }>( | ||
| `SELECT DISTINCT user_id | ||
| FROM mcp_servers | ||
| WHERE status IN ('active', 'installed', 'authorized') | ||
| ORDER BY user_id | ||
| LIMIT $1 OFFSET $2`, | ||
| [BRIEFING_USER_PAGE_SIZE, offset], | ||
| ); | ||
| if (result.rows.length === 0) break; | ||
| for (const row of result.rows) allIds.push(row.user_id); | ||
| if (result.rows.length < BRIEFING_USER_PAGE_SIZE) break; | ||
| offset += BRIEFING_USER_PAGE_SIZE; |
Comment on lines
+26
to
+27
| * `?userId=` then to a legacy `req.user.id`. Other route modules use the same | ||
| * order — keep them in sync. |
Comment on lines
+12
to
18
| import { execFileSync } from 'node:child_process'; | ||
| import { isGbrainInstalled } from '../cli-detector.js'; | ||
| import { GbrainMemoryPort, NotImplementedError } from '../gbrain-port.js'; | ||
| import type { SemanticHit } from '@skytwin/memory-port'; | ||
|
|
||
| const mockExecSync = execSync as ReturnType<typeof vi.fn>; | ||
| const mockExecSync = execFileSync as ReturnType<typeof vi.fn>; | ||
| const mockIsInstalled = isGbrainInstalled as ReturnType<typeof vi.fn>; |
This was referenced May 8, 2026
jayzalowitz
added a commit
that referenced
this pull request
May 9, 2026
…ents on #226 Per the new merge gate (CLAUDE.md), every Copilot inline comment is addressed before merge. Five comments on #226, all valid: - Migration 027 idempotency: if the inline `INDEX (...) WHERE ...` form ever ran successfully (CRDB version-dependent), it would have created an auto-named partial index. Added defensive `DROP INDEX IF EXISTS <table>@<auto-name>` for both mcp_servers_last_active_at_idx and app_suggestions_user_id_idx so a re-run doesn't leave duplicate partial indexes covering the same predicate. - briefing-generator: switched OFFSET pagination to keyset pagination (`AND user_id > $last`). OFFSET pagination on DISTINCT scales quadratically with user count — the briefing job's runtime would grow unboundedly. Keyset stays linear and the (user_id, status) index serves the range scan directly. - DXT route docstring: removed the "other route modules use the same order" claim — it was misleading (other routes still vary in precedence). Now just notes a shared-helper #226 follow-up if the ordering proves load-bearing. - gbrain-port test: renamed `mockExecSync` → `mockExecFileSync` so the variable name matches the API under test (execFileSync). Tests: all 421 api + 48 worker + 20 memory-gbrain tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
3b28a89 to
4eda18d
Compare
jayzalowitz
added a commit
that referenced
this pull request
May 9, 2026
…ents on #226 Per the new merge gate (CLAUDE.md), every Copilot inline comment is addressed before merge. Five comments on #226, all valid: - Migration 027 idempotency: if the inline `INDEX (...) WHERE ...` form ever ran successfully (CRDB version-dependent), it would have created an auto-named partial index. Added defensive `DROP INDEX IF EXISTS <table>@<auto-name>` for both mcp_servers_last_active_at_idx and app_suggestions_user_id_idx so a re-run doesn't leave duplicate partial indexes covering the same predicate. - briefing-generator: switched OFFSET pagination to keyset pagination (`AND user_id > $last`). OFFSET pagination on DISTINCT scales quadratically with user count — the briefing job's runtime would grow unboundedly. Keyset stays linear and the (user_id, status) index serves the range scan directly. - DXT route docstring: removed the "other route modules use the same order" claim — it was misleading (other routes still vary in precedence). Now just notes a shared-helper #226 follow-up if the ordering proves load-bearing. - gbrain-port test: renamed `mockExecSync` → `mockExecFileSync` so the variable name matches the API under test (execFileSync). Tests: all 421 api + 48 worker + 20 memory-gbrain tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
jayzalowitz
added a commit
that referenced
this pull request
May 9, 2026
Per CLAUDE.md "Review Discipline": post-/review fixes get their own commits AND a CHANGELOG subsection so the audit trail of "what review caught" is visible to future readers. VERSION not bumped — this batch lands as part of the stacked sweep (#226 → #232); a single VERSION bump will follow when the full chain merges, to avoid CHANGELOG/VERSION conflicts on each cascade rebase. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… epics Audited Copilot review comments across PRs #198, #206-#215, #218, #219, Themed follow-ups will land separately. Security - Shell injection in @skytwin/memory-gbrain searchSemantic (#215): switch from execSync with shell to execFileSync (no shell), so query metacharacters cannot inject. Added regression test. - redactPII / redactPayload skipped arrays — PII in array-of-object payloads leaked to provenance and API responses (#209, #210, #211). Both helpers now recurse arrays. Tests updated (one previously asserted the bug as expected behavior). - /api/capabilities/suggestions spread the row, leaking raw evidence_sources alongside the redacted preview (#211). Switched to explicit field projection. - Email-redaction regex [A-Z|a-z] matched literal | as TLD char (#211). Fixed to [A-Za-z]. Correctness - Credential vault never engaged: only the worker creates DbTokenStore but never called setKeyCache, so at-rest encryption + lazy migration were dead weight (#212). Worker now owns a KeyCache and wires it; cross-process unlock IPC is a #212 follow-up. - DXT routes broken under real auth: getUserId read req.user?.id but production middleware sets req.authenticatedUserId (#219). Switched field; tests updated to mirror production middleware. - Twin MCP provenance only fired on success (#209). Wrapped each tool handler in try/finally so audit fires for both success and failure; provenance failures never mask the underlying tool result. - Migration 027 used inline INDEX (...) WHERE inside CREATE TABLE, which CockroachDB does not accept (#198). Pulled the partial indexes out as standalone CREATE INDEX IF NOT EXISTS — idempotent. - Onboarding hard-coded first_run_choice='about-me' at three call sites (#208). Track _wizardState.firstRunChoice on welcome-screen selection, read everywhere downstream. - Briefing generator capped at LIMIT 500 silently dropped users past the cap (#206). Switched to a 500-row paged scan. - Zero-trust mode helpers exist but aren't wired into the decision pipeline (#222). Updated CHANGELOG and capability-detail copy to be honest; #222 follow-up tracks the wiring. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ents on #226 Per the new merge gate (CLAUDE.md), every Copilot inline comment is addressed before merge. Five comments on #226, all valid: - Migration 027 idempotency: if the inline `INDEX (...) WHERE ...` form ever ran successfully (CRDB version-dependent), it would have created an auto-named partial index. Added defensive `DROP INDEX IF EXISTS <table>@<auto-name>` for both mcp_servers_last_active_at_idx and app_suggestions_user_id_idx so a re-run doesn't leave duplicate partial indexes covering the same predicate. - briefing-generator: switched OFFSET pagination to keyset pagination (`AND user_id > $last`). OFFSET pagination on DISTINCT scales quadratically with user count — the briefing job's runtime would grow unboundedly. Keyset stays linear and the (user_id, status) index serves the range scan directly. - DXT route docstring: removed the "other route modules use the same order" claim — it was misleading (other routes still vary in precedence). Now just notes a shared-helper #226 follow-up if the ordering proves load-bearing. - gbrain-port test: renamed `mockExecSync` → `mockExecFileSync` so the variable name matches the API under test (execFileSync). Tests: all 421 api + 48 worker + 20 memory-gbrain tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per CLAUDE.md "Review Discipline": post-/review fixes get their own commits AND a CHANGELOG subsection so the audit trail of "what review caught" is visible to future readers. VERSION not bumped — this batch lands as part of the stacked sweep (#226 → #232); a single VERSION bump will follow when the full chain merges, to avoid CHANGELOG/VERSION conflicts on each cascade rebase. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
86e8d04 to
fc40c27
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Audit of Copilot review comments across this week's 15 epic PRs (#198,
#206-#215, #218, #219, #221, #222) found ~70 unresolved REAL issues. This
PR ships batch 1: the security-critical and obviously-broken items.
Lower-severity items (frontend nits, observability rollup architecture,
adaptive-prompt input mismatches, vault rotation polish, daemon scripts)
are scoped into themed follow-up PRs.
Security
from `execSync` (shell) to `execFileSync` (no shell) so the user
query cannot inject (`$()`, backticks, `;`, etc.). Regression test
added.
array-of-object payloads (e.g. email recipients) leaked to provenance
and API responses. Both helpers now recurse arrays.
returning raw `evidence_sources` (unredacted source signals) along
with the redacted preview. Switched to explicit projection.
char — fixed.
Correctness
`DbTokenStore` instantiation (in worker) never called `setKeyCache`,
so lazy migration never fired. Worker now owns a `KeyCache` and wires
it. Cross-process unlock IPC is a feat(#183 follow-up): credential vault encryption #212 follow-up.
read `req.user?.id` but production middleware sets
`req.authenticatedUserId`. Field switched; tests updated to mirror
production middleware.
external-agent tool call must be audited. Wrapped each handler in
try/finally; provenance failures never mask the original tool error.
which CockroachDB does not accept. Pulled partial indexes out as
standalone `CREATE INDEX IF NOT EXISTS` — idempotent.
path. Track `_wizardState.firstRunChoice` on welcome-screen choice;
read downstream.
Switched to paged scan over `mcp_servers`.
Updated CHANGELOG and capability-detail copy to be honest about the
current scope; feat(#183 AC#4 partial): zero-trust mode policy + UI #222 follow-up tracks the wiring.
Follow-up PRs (batch 2+, themed sweeps)
always empty, registryId/display_name fallback, unreachable base64 try/catch,
`listForUser` loading blobs.
`updateAccessToken` stale-encrypted column, rotation rejecting null
refresh tokens, `/status` endpoint missing keyVersion/lastRotated, tx
error log omits the error.
keys vs snake_case prompt templates; tests use schema-invalid mocks.
buffered in API process — empty rollup in production. Plus
`writeBucket` overwriting latency percentiles.
systemd `StartLimit*` in wrong section, launchd path inconsistency,
Windows `fileURLToPath`.
`textContent`, double-badges after accept/reject, `high_autonomy` copy
branch, `_promotionSseWired` flag set before wire confirmed,
duplicate `[unreleased]` headings.
Test plan
(3 cases each), and `capabilities()` empty-when-not-installed
(cannot do from this workspace; will verify in deploy)
🤖 Generated with Claude Code