v0.28.6 feat: takes + think + unified model config + per-token MCP allow-list#563
Merged
…sions (v32)
Migration v31 adds the takes table (typed/weighted/attributed claims) and
synthesis_evidence (provenance for `gbrain think` outputs). Page-scoped via
page_id FK (slug isn't unique alone in v0.18+ multi-source). HNSW partial
index on embedding for active rows. ON DELETE CASCADE on synthesis_evidence
so deleting a source take cascades the provenance row.
Migration v32 adds access_tokens.permissions JSONB with safe-default
backfill (`{"takes_holders":["world"]}`). Default keeps non-world holders
hidden from MCP-bound tokens until the operator explicitly grants access
via the v0.28 auth permissions CLI.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…, resolve, synthesis_evidence
Extends BrainEngine with the takes domain object. Both engines implement the
same surface; PGLite uses manual `$N` placeholders, Postgres uses postgres-js
unnest() — same shape as addLinksBatch and addTimelineEntriesBatch.
Methods:
- addTakesBatch (upsert via ON CONFLICT (page_id, row_num) DO UPDATE)
- listTakes (filter by holder/kind/active/resolved, takesHoldersAllowList for
  MCP-bound calls, sortBy weight/since_date/created_at)
- searchTakes / searchTakesVector (pg_trgm + cosine; honor allow-list)
- countStaleTakes / listStaleTakes (mirror countStaleChunks pattern; embedding
  column intentionally omitted from listStale payload)
- updateTake (mutable fields only; throws TAKE_ROW_NOT_FOUND)
- supersedeTake (transactional: insert new at next row_num, mark old
  active=false, set superseded_by; throws TAKE_RESOLVED_IMMUTABLE on resolved
  bets)
- resolveTake (sets resolved_*; throws TAKE_ALREADY_RESOLVED on re-resolve;
  resolution is immutable per Codex P1 #13 fold)
- addSynthesisEvidence (provenance persist; ON CONFLICT DO NOTHING)
- getTakeEmbeddings (parallel to getEmbeddingsByChunkIds)
Types live in src/core/engine.ts adjacent to LinkBatchInput. Page-scoped via
page_id (slug not unique in v0.18+ multi-source). PageType gains 'synthesis'.
takeRowToTake mapper in utils.ts handles Date → ISO string normalization.
Tests: test/takes-engine.test.ts — 16 cases against PGLite covering
upsert/list/filter/search happy paths, takesHoldersAllowList isolation, the
four invariant errors (TAKE_ROW_NOT_FOUND, TAKES_WEIGHT_CLAMPED,
TAKE_RESOLVED_IMMUTABLE, TAKE_ALREADY_RESOLVED), supersede flow, resolve
metadata round-trip, and FK CASCADE on synthesis_evidence when a source take
is deleted. All pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
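The Date → ISO normalization mentioned for the utils.ts mapper can be sketched as follows. The row/take field names here are illustrative stand-ins, not the project's real types; only the normalization idea comes from the commit text.

```typescript
// Hypothetical row shape: PGLite tends to return timestamp columns as strings,
// postgres-js returns Date objects. The mapper normalizes both to ISO strings.
interface TakeRow {
  page_id: number;
  row_num: number;
  claim: string;
  created_at: Date | string;
}

interface Take {
  pageId: number;
  rowNum: number;
  claim: string;
  createdAt: string; // always ISO 8601 after mapping
}

function takeRowToTake(row: TakeRow): Take {
  return {
    pageId: row.page_id,
    rowNum: row.row_num,
    claim: row.claim,
    createdAt:
      row.created_at instanceof Date
        ? row.created_at.toISOString()
        : row.created_at,
  };
}
```

Normalizing at the mapper keeps both engine backends interchangeable to callers.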
…as resolution
Replaces every hardcoded `claude-*-X` and per-phase `dream.<phase>.model`
config key with a single resolver. Hierarchy:
1. CLI flag (--model)
2. New-key config (e.g. models.dream.synthesize)
3. Old-key config (deprecated dream.synthesize.model, dream.patterns.model)
— read with stderr deprecation warning, one-per-process
4. Global default (models.default)
5. Env var (GBRAIN_MODEL or caller-supplied)
6. Hardcoded fallback
Aliases (`opus`, `sonnet`, `haiku`, `gemini`, `gpt`) resolve at the end so
any tier can use a short name. User-defined `models.aliases.<name>` config
overrides built-ins. Cycle-safe (depth 2 break). Unknown alias passes
through unchanged so users can pass full provider IDs without registering.
When new-key + old-key are BOTH set (Codex P1 #11 fix), new-key wins and
stderr warns "deprecated config X ignored; Y is set and wins". When only
old-key is set, it's honored with a softer "rename to Y before v0.30"
warning. Both warnings emit once per (key, process) — a Set memo prevents
log spam in long-running daemons.
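The 6-tier ordering plus last-step alias resolution can be sketched like this. The function name, input shape, and alias targets are assumptions for illustration; the ordering, the depth-2 cycle break, and the unknown-alias pass-through follow the description above.

```typescript
// Illustrative resolver inputs; real config keys are named in the text above.
interface ResolveInput {
  cliFlag?: string;       // 1. --model
  newKey?: string;        // 2. e.g. models.dream.synthesize
  oldKey?: string;        // 3. deprecated per-phase key
  globalDefault?: string; // 4. models.default
  envVar?: string;        // 5. GBRAIN_MODEL or caller-supplied
  fallback: string;       // 6. hardcoded
}

// Placeholder built-in alias map, not the project's real IDs.
const BUILTIN_ALIASES: Record<string, string> = {
  opus: "claude-opus-latest",
};

function resolveModel(
  input: ResolveInput,
  userAliases: Record<string, string> = {},
): string {
  const raw =
    input.cliFlag ?? input.newKey ?? input.oldKey ??
    input.globalDefault ?? input.envVar ?? input.fallback;
  // Aliases resolve last so any tier can use a short name; user-defined
  // aliases shadow built-ins. Depth-2 break keeps a→b→a from looping.
  let name = raw;
  for (let depth = 0; depth < 2; depth++) {
    const next = userAliases[name] ?? BUILTIN_ALIASES[name];
    if (next === undefined) break; // unknown alias passes through unchanged
    name = next;
  }
  return name;
}
```

Unknown names fall straight through, so full provider IDs never need registering.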
Migrated call sites: synthesize.ts (model + verdictModel), patterns.ts
(model). subagent.ts and search/expansion.ts to be migrated later in v0.28
(staying compatible until then).
Tests: test/model-config.test.ts — 11 cases pinning the 6-tier ordering,
alias resolution + cycle break, deprecated-key warning emit-once, and
unknown-alias pass-through. All pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…P0 fix)
src/core/takes-fence.ts — pure functions for the fenced markdown surface:
- parseTakesFence(body) — extracts ParsedTake[] from
  `<!--- gbrain:takes:begin/end -->` blocks. Strict on canonical form,
  lenient on hand-edits with warnings (TAKES_FENCE_UNBALANCED,
  TAKES_TABLE_MALFORMED, TAKES_ROW_NUM_COLLISION). Strikethrough `~~claim~~`
  → active=false; date ranges `since → until` split into sinceDate/untilDate.
- renderTakesFence(takes) — round-trip safe with parseTakesFence.
- upsertTakeRow(body, row) — append-only per CEO-D6 + eng-D9. Creates a
  fresh `## Takes` section if no fence is present. row_num is monotonic
  (max + 1, never gap-filled — keeps cross-page refs and synthesis_evidence
  stable forever).
- supersedeRow(body, oldRow, replacement) — strikes through the old row's
  claim AND appends the new row at the end. Both rows are preserved in
  markdown for git-blame archaeology.
- stripTakesFence(body) — removes the fenced block entirely. Used by the
  chunker so takes content lives ONLY in the takes table.
Codex P0 #3 fix: src/core/chunkers/recursive.ts now calls stripTakesFence()
before computing chunk boundaries. Without this, page chunks would contain
the rendered takes table and the per-token MCP allow-list would be bypassed
at the index layer (a token bound to takes_holders=['world'] would see
garry's hunches via page hits). Doctor's takes_fence_chunk_leak check
(plan-side) asserts no chunk contains the begin marker.
Tests: 15 cases covering canonical parse, strikethrough, date range, fence
unbalanced detection, malformed-row skip + warning, row_num collision
detection, round-trip render, append-only upsert into an existing fence,
fresh-section creation, monotonic row_num under hand-edit gaps, supersede
flow, and stripTakesFence verifying takes content removed AND surrounding
prose preserved. Existing chunker tests still pass (15 + 15 = 30).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
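A minimal sketch of the strip operation, assuming the marker strings shown in the commit text; the real parser is stricter and emits the warnings listed above.

```typescript
// Marker strings follow the commit text; exact matching logic is assumed.
const BEGIN = "<!--- gbrain:takes:begin -->";
const END = "<!--- gbrain:takes:end -->";

function stripTakesFence(body: string): string {
  const begin = body.indexOf(BEGIN);
  if (begin === -1) return body; // no fence: nothing to strip
  const end = body.indexOf(END, begin);
  // Unbalanced fence: prefer leaving the body intact over over-stripping
  // (the real parser would surface TAKES_FENCE_UNBALANCED here).
  if (end === -1) return body;
  return body.slice(0, begin) + body.slice(end + END.length);
}
```

Running this before chunk-boundary computation is what keeps claim text out of content_chunks.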
…fy-write
src/core/page-lock.ts — per-page file lock at
~/.gbrain/page-locks/<sha256-of-slug>.lock so two concurrent
`gbrain takes add` calls or `takes seed --refresh` from autopilot can't race
on the same `<slug>.md` read-modify-write.
Eng-review fold: reuses the v0.17 cycle.lock pattern (mtime + PID liveness)
but per-slug. Differences from cycle.ts's lock:
- SHA-256 of slug for safe filenames (slashes, unicode, etc.)
- Same-pid + fresh mtime = LIVE (cycle.ts assumes one lock per process and
  reclaims same-pid; page-lock allows concurrent locks for DIFFERENT slugs
  in one process). mtime expiry still rescues post-crash leftovers.
- 5-min TTL (vs cycle's 30 min — page edits are short)
- `withPageLock(slug, fn)` convenience wrapper with default 30s timeout
API:
- acquirePageLock(slug, opts) → handle | null (poll-with-timeout)
- handle.refresh() / handle.release() (idempotent — only releases if pid
  matches)
- withPageLock(slug, fn, opts) — acquire + run + release-in-finally
Tests: 10 cases — fresh acquire, live holder returns null, stale-mtime
reclaim, dead-PID reclaim, refresh updates timestamp, foreign-pid release is
a no-op, withPageLock callback runs and releases on success/failure,
timeout-throws when held, SHA-256 filename safety for slashes/unicode. All
pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
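The SHA-256 filename scheme is simple enough to sketch; the directory layout and `.lock` suffix come from the commit text, the helper name is an assumption.

```typescript
import { createHash } from "node:crypto";
import { join } from "node:path";

// Hash the slug so slugs containing '/', unicode, etc. map to safe,
// fixed-length lock filenames under the lock directory.
function pageLockPath(lockDir: string, slug: string): string {
  const digest = createHash("sha256").update(slug, "utf8").digest("hex");
  return join(lockDir, `${digest}.lock`);
}
```

Two different slugs always get distinct lock files, and no slug can escape the lock directory via path characters.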
src/core/cycle/extract-takes.ts — new phase that materializes the takes
table from fenced markdown blocks. Two paths mirror src/commands/extract.ts:
- extractTakesFromFs: walk *.md under repoPath, parse fences, batch upsert
- extractTakesFromDb: iterate engine.getAllSlugs(), parse each page's
compiled_truth+timeline, batch upsert (mutation-immune snapshot iteration)
Single dispatcher extractTakes(opts) routes by source. Honors:
- slugs filter for incremental re-extract (pipes from sync→extract)
- dryRun: count would-be upserts, write nothing
- rebuild: DELETE FROM takes WHERE page_id = $1 before re-insert (clean
slate when markdown is canonical and DB has drifted)
Schema fix: since_date/until_date were DATE in the original v31 migration.
Spec uses partial dates ('2017-01', '2026-04-29 → 2026-06') that Postgres
DATE rejects. Changed to TEXT in both the Postgres and PGLite blocks so
parser-rendered ranges round-trip cleanly. Loses the ability to do
date-range arithmetic in SQL, but date math on opinion timelines is
out of scope for v0.28 anyway. utils.ts dateOrNull now annotated as
v0.28 TEXT-aware.
Migration v31 has not been deployed yet (this branch is the v0.28 release
candidate), so the type swap is free. No data migration needed.
Tests: test/extract-takes.test.ts — 5 cases against PGLite covering full
walk + fence-skip on no-fence pages, takes-table populated post-extract,
incremental slugs filter, dry-run no-write, rebuild=true clears + re-inserts
ad-hoc rows. test/takes-engine.test.ts (16), test/takes-fence.test.ts (15)
all still pass — 36/36 takes tests green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
src/commands/takes.ts — surfaces the engine methods + takes-fence library
through a single `gbrain takes <subcommand>` entrypoint:
- takes <slug> list — with filters + sort
- takes search "<query>" — pg_trgm keyword search across all takes
- takes add <slug> --claim ... — append (markdown + DB, atomic via lock)
- takes update <slug> --row N ... — mutable-fields update (markdown + DB)
- takes supersede <slug> --row N ... — strikethrough old + append new
- takes resolve <slug> --row N --outcome — record bet resolution (immutable)
Markdown is canonical. Every mutate command:
1. acquires the per-page file lock (withPageLock)
2. re-reads the .md file
3. applies the edit via takes-fence (upsertTakeRow / supersedeRow)
4. writes the .md file back
5. mirrors to the DB via the engine method
6. releases the lock (auto via finally)
Resolve currently writes only to the DB — surfacing resolved_* in the
markdown table is deferred to v0.29 (the takes-fence renderer's column set
is fixed at # | claim | kind | who | weight | since | source per spec).
Wired into src/cli.ts dispatch + the CLI_ONLY allowlist. Help text follows
the project convention (orphans/embed/extract pattern). --dir flag overrides
sync.repo_path config when working outside the configured brain.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
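The six-step mutate sequence can be sketched as a function over injected dependencies. Every name in the dependency bag is a stand-in for the real withPageLock / takes-fence / engine modules; only the ordering (lock, re-read, edit, write, mirror, release) is the point.

```typescript
interface TakeRowInput { rowNum: number; claim: string }

// Hypothetical dependency shapes for illustration.
interface MutateDeps {
  withPageLock: (slug: string, fn: () => void) => void;
  readPage: (slug: string) => string;
  writePage: (slug: string, body: string) => void;
  upsertTakeRow: (body: string, row: TakeRowInput) => string;
  mirrorToDb: (slug: string, row: TakeRowInput) => void;
}

function addTakeToPage(slug: string, row: TakeRowInput, deps: MutateDeps): void {
  deps.withPageLock(slug, () => {
    const body = deps.readPage(slug);           // re-read under the lock
    const next = deps.upsertTakeRow(body, row); // markdown is canonical
    deps.writePage(slug, next);
    deps.mirrorToDb(slug, row);                 // DB mirrors the file
  });
}
```

Re-reading inside the lock, rather than before acquiring it, is what makes concurrent `takes add` calls safe: the edit is always applied to the freshest file contents.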
…llow-list
OperationContext gains takesHoldersAllowList — a server-side filter for the
takes.holder field threaded from access_tokens.permissions through dispatch
into the engine SQL. Closes Codex P0 #3 at the dispatch layer (the chunker
strip already closed the page-content side in the previous commit).
src/core/operations.ts — three new ops:
- takes_list: lists takes with holder/kind/active/resolved filters; honors
  ctx.takesHoldersAllowList for MCP-bound calls
- takes_search: pg_trgm keyword search; honors the allow-list
- think: op surface registered (returns a not_implemented envelope until
  Lane D's pipeline lands). Remote callers cannot save/take per Codex P1 #7.
src/mcp/dispatch.ts — DispatchOpts.takesHoldersAllowList threads into
buildOperationContext.
src/mcp/http-transport.ts — validateToken now reads
access_tokens.permissions.takes_holders, defaults to ['world'] when the
column is absent or malformed (default-deny on private hunches).
auth.takesHoldersAllowList is passed to dispatchToolCall.
src/mcp/server.ts (stdio) — defaults to takesHoldersAllowList: ['world']
since stdio has no per-token auth. Operators wanting full visibility use
`gbrain call <op>` directly (sets remote=false).
src/commands/auth.ts — `gbrain auth create <name> --takes-holders w,g,b`
flag persists the per-token list; new `auth permissions <name>
set-takes-holders <list>` updates an existing token.
Tests: test/takes-mcp-allowlist.test.ts — 8 cases against PGLite proving the
threading: local CLI sees all holders, ['world'] returns only public,
['world','garry'] returns 2/3, no-overlap returns empty (no fallback),
search honors the allow-list, and remote save/take on think is rejected with
a not_implemented envelope.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
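The default-deny parse in validateToken might look roughly like this sketch. The `{"takes_holders": [...]}` column shape comes from the migration description; the guard logic and function name are assumptions.

```typescript
// Parse the permissions JSONB value into a holder allow-list.
// Absent or malformed input falls back to ['world'] so a broken or
// pre-migration token can never see private holders (default-deny).
function parseTakesHolders(permissions: unknown): string[] {
  if (typeof permissions === "object" && permissions !== null) {
    const raw = (permissions as Record<string, unknown>).takes_holders;
    if (Array.isArray(raw)) {
      // Keep only string entries; non-string junk is silently dropped.
      return raw.filter((h): h is string => typeof h === "string");
    }
  }
  return ["world"];
}
```

Failing closed here means a tampered or empty permissions row degrades to public-only visibility rather than full access.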
Closes the v0.28 ship-prep cycle. Bumps VERSION + package.json + bun.lock
to 0.28.0. v0_28_0 migration orchestrator runs three idempotent phases on
upgrade:
- Schema verify: asserts schema_version >= 32 (migrations v31 + v32 already
applied by the schema runner during gbrain upgrade); fails clean if not.
- Backfill takes: inline runs `extractTakes(engine, { source: 'db' })` so
any pre-existing fenced takes tables in markdown populate the takes
index. Idempotent; ON CONFLICT DO UPDATE keeps the table in sync.
- Re-chunk TODO: queues a pending-host-work entry asking the host agent
to re-import pages with takes content so the v0.28 chunker-strip rule
(Codex P0 #3 fix) applies retroactively. Pages imported under v0.28+
already have takes content stripped from chunks at index time; this
TODO catches up legacy pages.
skills/migrations/v0.28.0.md — agent-readable upgrade guide. Walks
through doctor verification, deprecated-key migration, MCP token
visibility configuration, and a "try the takes layer" smoke test.
CHANGELOG.md — v0.28.0 release-summary in the GStack voice (no AI
vocabulary, no em dashes, real numbers from git diff stat) + the
mandatory "To take advantage of v0.28.0" block + itemized changes by
subsystem (schema, engine, markdown surface, model config, MCP+auth,
CLI, tests, accepted risks).
Final test sweep: 65/65 v0.28 tests pass across 6 files. typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
src/core/think/sanitize.ts — prompt-injection defense for take claims: 14
jailbreak patterns (ignore-prior, role-jailbreak, close-take tag, DAN,
system-prompt overrides, eval-shell hooks) plus structural framing (takes
wrapped in <take id="..."> tags the model is told to treat as DATA).
Length-cap at 500 chars. Renders evidence blocks for the prompt.
src/core/think/prompt.ts — system prompt + structured-output schema. Hard
rules: cite every claim, mark hunches/low-weight explicitly, surface
conflicts (never silently pick), surface gaps. JSON schema with answer +
citations[] + gaps[]. Prompt adapts to anchor / time window / save flag.
src/core/think/cite-render.ts — structured citations + regex fallback
(Codex P1 #4 fold). normalizeStructuredCitations validates the model's
structured output; parseInlineCitations is the body-scan fallback when the
model omits the structured field. resolveCitations dispatches and records a
CITATIONS_REGEX_FALLBACK warning when the fallback is used.
src/core/think/gather.ts — 4-stream parallel retrieval:
1. hybridSearch (pages, existing primitive)
2. searchTakes (keyword, pg_trgm)
3. searchTakesVector (vector, when an embedQuestion fn is supplied)
4. traversePaths (graph, when --anchor is set)
RRF fusion (k=60). Each stream is wrapped in try/catch — partial gather
beats no synthesis. Honors takesHoldersAllowList for MCP-bound calls.
src/core/think/index.ts — runThink orchestrator + persistSynthesis:
INTENT (regex classify) → GATHER → render evidence blocks → resolveModel
('models.think' → 'models.default' → GBRAIN_MODEL → opus) → LLM call
(injectable client) → JSON parse with code-fence + fallback strip →
resolveCitations → ThinkResult. persistSynthesis writes a synthesis page +
synthesis_evidence rows (page_id resolved per slug; page-level citations
skip evidence). Degrades gracefully without ANTHROPIC_API_KEY. Round-loop
scaffolding is in place (only the rounds=1 path is exercised in v0.28).
src/commands/think.ts — `gbrain think "<question>"` CLI. Flag parsing
strips --anchor, --rounds, --save, --take, --model, --since, --until,
--json. Local CLI = remote=false, so save/take are honored. Human-readable
output by default; --json for agent consumption.
operations.ts — the `think` op now calls runThink (was a not_implemented
stub). Remote callers can't save/take per Codex P1 #7. Returns the full
ThinkResult plus saved_slug + evidence_inserted.
cli.ts — wired into dispatch + the CLI_ONLY allowlist.
Tests: test/think-pipeline.test.ts — 18 cases against PGLite covering
sanitize patterns, structural rendering, citation parsing (structured +
regex fallback + dedup + invalid-slug rejection), gather streams +
allow-list filter, full pipeline with a stub client, malformed-LLM fallback
path, no-API-key graceful degradation, and persistSynthesis writing the
page + evidence rows. All pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
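The RRF fusion step in gather.ts (k=60) is standard reciprocal rank fusion; a minimal sketch, with the stream/result shapes simplified to id lists:

```typescript
// Fuse several ranked result streams: each document scores 1/(k + rank + 1)
// per stream it appears in, summed across streams, then sorted descending.
// k=60 is the conventional damping constant the commit text names.
function rrfFuse(streams: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranked of streams) {
    ranked.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

A document that appears in two streams (even at modest ranks) outscores one that tops a single stream only at a deep position, which is exactly the behavior wanted when fusing keyword, vector, and graph retrieval.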
…old)
src/core/anthropic-pricing.ts — USD/1M-tokens map for the Claude 4.7 family
plus older aliases. estimateMaxCostUsd returns null on unpriced models so
the meter caller can warn-once and bypass the gate.
src/core/cycle/budget-meter.ts — cumulative cost ledger. Each submit
estimates max-cost from (model + estimatedInputTokens + maxOutputTokens),
accumulates per-cycle, and refuses the next submit when projected > cap.
Codex P1 #10 fold: non-Anthropic models (gemini, gpt) bypass with one stderr
warn per process and `unpriced=true` on the result. Budget=0 disables the
gate. Audit trail at ~/.gbrain/audit/dream-budget-YYYY-Www.jsonl.
src/core/cycle/auto-think.ts — auto_think dream phase. Reads
dream.auto_think.{enabled,questions,max_per_cycle,budget,cooldown_days,
auto_commit}. Iterates configured questions through runThink with the
BudgetMeter pre-checking each submit. The cooldown timestamp is written
ONLY on success (matches the v0.23 synthesize pattern — retries after
partial failures pick back up). When auto_commit=true, persists synthesis
pages via persistSynthesis. Default-disabled.
src/core/cycle/drift.ts — drift dream phase scaffold. Reads
dream.drift.{enabled,lookback_days,budget,auto_update}. Surfaces takes in
the soft band (weight 0.3-0.85, unresolved) that have recent timeline
evidence on the same page. v0.28 ships the orchestration; the LLM judge
that proposes weight adjustments lands in v0.29. modelId + meter are wired
now so the ledger captures gate state for callers that opt in.
Tests:
- test/budget-meter.test.ts (7 cases) — pricing-map coverage, allow path,
  cumulative-deny, budget=0 disabled, unpriced bypass + warn-once, ledger
  captures all events, ISO-week filename branch.
- test/auto-think-phase.test.ts (9 cases) — auto_think enable/skip, empty
  questions, success → cooldown ts written, cooldown blocks rerun, budget
  exhausted → partial, drift not_enabled, soft-band candidate detection,
  complete + dry-run paths.
All pass. Typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
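The budget meter's gate logic reduces to a small cumulative check; this sketch assumes the class/method names (the real module also writes the JSONL ledger and emits the warn-once for unpriced models):

```typescript
// Cumulative per-cycle cost gate. Budget 0 disables the gate entirely;
// a null estimate (unpriced model) bypasses it, per the Codex P1 #10 fold.
class BudgetMeter {
  private spentUsd = 0;
  constructor(private readonly capUsd: number) {}

  // Returns true when the submit may proceed, accumulating its estimate.
  trySubmit(estimatedCostUsd: number | null): boolean {
    if (this.capUsd === 0) return true;         // gate disabled
    if (estimatedCostUsd === null) return true; // unpriced model: bypass
    if (this.spentUsd + estimatedCostUsd > this.capUsd) return false;
    this.spentUsd += estimatedCostUsd;
    return true;
  }
}
```

Refusing *before* the call, on the projected max cost, means a cycle can never overshoot its cap by one expensive submit.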
test/e2e/takes-postgres.test.ts — full v0.28 takes pipeline against real
Postgres (gated on DATABASE_URL). 12 cases:
- addTakesBatch upsert via the unnest() bind path (Postgres-specific)
- listTakes filters: holder, kind, sort=weight, takesHoldersAllowList
- searchTakes pg_trgm + allow-list filter
- supersedeTake transactional path (BEGIN/COMMIT semantics)
- resolveTake immutability — second resolve throws TAKE_ALREADY_RESOLVED
- synthesis_evidence FK CASCADE on take delete
- countStaleTakes + listStaleTakes filter active+null
- extractTakesFromDb populates takes from fenced markdown
- MCP dispatch with takesHoldersAllowList=['world'] returns only world
- MCP dispatch local-CLI path returns all holders
- MCP dispatch takes_search honors the allow-list
- think op forces remote_persisted_blocked even for save+take
postgres-engine.ts: addTakesBatch boolean[] serialization fix. postgres-js
auto-detects the element type from JS arrays; for booleans it mis-detects
as scalar. Cast through text[] (`'true' | 'false'`) then SQL-cast to
boolean[] — the same pattern other batch methods rely on for type-stable
bind shapes.
test/e2e/helpers.ts: setupDB now (a) tolerates non-existent tables in
TRUNCATE (for fresh DBs where v31 hasn't yet created
takes/synthesis_evidence) and (b) calls engine.initSchema() to actually
run migrations.
test/takes-mcp-allowlist.test.ts: updated 2 think-op cases to match Lane
D's landed pipeline. They previously asserted not_implemented envelopes;
now they assert remote_persisted_blocked + NO_ANTHROPIC_API_KEY
graceful-degrade behavior.
Run: DATABASE_URL=postgres://localhost:5435/gbrain_test bun test
test/e2e/takes-postgres.test.ts
Result: 12/12 pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
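The boolean[] workaround's JS side is just a string mapping; the helper name here is hypothetical, and the SQL shape in the comment is an assumption of the general pattern described above.

```typescript
// Serialize booleans to 'true'/'false' strings so postgres-js binds a
// type-stable text[] instead of mis-detecting the array element type.
function boolArrayToTextBind(values: boolean[]): string[] {
  return values.map((v) => (v ? "true" : "false"));
}

// SQL side (assumed shape): bind as text[] and cast after unnesting, e.g.
//   unnest($1::text[])::boolean[]
// so the driver never has to guess the element type.
```

Keeping the cast on the SQL side mirrors what the commit says the other batch methods already do.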
…ePhase enum extension)
cycle.ts's PhaseResult is shaped {phase, status, summary, details} with a
narrow PhaseStatus enum ('ok'|'warn'|'fail'|'skipped') and CyclePhase enum
that doesn't yet include 'auto_think'/'drift'. The phases ship standalone
in v0.28 (cycle.ts dispatcher integration is v0.28.x); using PhaseResult
forced premature enum extension.
Introduces DreamPhaseResult exported from auto-think.ts:
{ name: 'auto_think'|'drift'; status: 'complete'|'partial'|'failed'|'skipped';
detail: string; totals?: Record<string,number>; duration_ms: number }
drift.ts re-exports the same type. When v0.28.x wires the dispatcher, the
adapter at the call site can map DreamPhaseResult → PhaseResult cleanly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
test/e2e/auth-permissions.test.ts — closes the v0.28 token-allow-list
verification loop against real Postgres. Exercises:
- Migration v32 default backfill: new tokens created without a permissions
column get {takes_holders: ["world"]} via the schema DEFAULT clause.
- Explicit ["world","garry"] → dispatch.takes_list filters to those
holders only; brain hunches stay hidden from this token.
- ["world"] default-deny token → takes_search hits filtered to public claims.
- {} permissions row (operator tampered) gracefully defaults to ["world"]
via the HTTP transport's validateToken parsing.
- revoked_at IS NOT NULL → token excluded from active token query.
Avoids the postgres-js JSONB double-encode trap (CLAUDE.md memory): pass
the object directly to executeRaw, no JSON.stringify, no ::jsonb cast.
All 5 pass against pgvector/pgvector:pg16 on port 5435. Combined v0.28
test sweep: 116/116 across 11 files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tion)
test/e2e/chunker-takes-strip.test.ts — verifies the chunker actually strips
fenced takes content end-to-end through the import pipeline. This is the
Codex P0 #3 fix's verification path: takes content lives ONLY in the takes
table for retrieval, never duplicated in content_chunks where the per-token
MCP allow-list cannot reach. 5 cases:
- chunkText (unit) output never contains TAKES_FENCE_BEGIN/END markers
- chunkText output never contains fenced claim text
- chunkText output retains non-fence prose (no over-stripping)
- importFromContent end-to-end: the imported page has chunks but none
  contain fenced content
- takes_fence_chunk_leak doctor invariant: zero rows globally where
  chunk_text matches `<!--- gbrain:takes:%`
Final v0.28 test sweep: 121 pass, 0 fail, 336 expect() calls, 12 files.
Coverage: schema migrations, engine methods (PGLite + Postgres), takes-fence
parser, page-lock, extract phase, takes CLI engine surface, model config
6-tier resolver, MCP+auth allow-list, think pipeline (gather + sanitize +
cite-render + synthesize), auto-think + drift + budget meter, JSONB
end-to-end, chunker strip integration. ~95% of the v0.28 surface area
covered.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Master shipped v0.25.0 with the eval-capture system (eval_candidates +
eval_capture_failures tables, GBRAIN_CONTRIBUTOR_MODE=1 capture path, gbrain
eval export/replay/prune CLI, +144 tests across 9 new files). Master's
migration claimed v31 first.
Conflict resolution:
- VERSION + package.json → 0.28.0 (mine; > master's 0.25.0)
- CHANGELOG.md → my v0.28.0 entry on top, master's v0.25.0 below
- src/core/migrate.ts → renumber my migrations from v31/v32 to v32/v33 to
  sit above master's v31 (eval_capture_tables). Runtime sort by version
  means source-order doesn't matter; the chain becomes ..., v30
  (dream_verdicts), v31 (eval_capture_tables, master), v32
  (takes_and_synthesis_evidence, mine), v33 (access_tokens_permissions,
  mine).
- skills/migrations/v0.28.0.md + src/commands/migrations/v0_28_0.ts:
  schema-version assertion bumped to >= 33; doc refs updated to v32/v33.
- All other files (engine.ts, types.ts, operations.ts, postgres-engine.ts,
  pglite-engine.ts, schema-embedded.ts, etc.) auto-merged cleanly — both
  branches added new types/methods/columns without textual collision.
Verification:
- bun run typecheck: clean
- v0.28 e2e suite: 121/121 pass against fresh Postgres
- v0.25 eval suite: 198/198 pass on the merged tree
- Combined: 319 tests, 0 regressions
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two CI failures from PR #563:
test/apply-migrations.test.ts (2 fails) — `buildPlan` tests assert exact
skippedFuture arrays at fixed installed-version stamps. Adding v0.28.0 to
the migration registry means it shows up in skippedFuture when the test
runs at installed=0.11.1 / installed=0.12.0. Fix: append '0.28.0' to both
hardcoded arrays.
test/http-transport.test.ts (8 fails) — the FakeEngine mock string-prefix
matches `SELECT id, name FROM access_tokens` to return a row. v0.28's
validateToken now selects `SELECT id, name, permissions FROM access_tokens`
to read the per-token takes_holders allow-list. The mock returned [] on the
new query → validateToken treated every token as invalid → 401. Fix: the
mock now matches both query shapes. validTokens rows get a default
`{takes_holders: ['world']}` permission injected when the caller didn't
supply one (mirrors the migration v33 column DEFAULT). Updated the
FakeEngineConfig type to allow tests to pass explicit permissions.
Verification:
- bun test test/apply-migrations.test.ts → 18/18 pass
- bun test test/http-transport.test.ts → 24/24 pass
- bun run typecheck → clean
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tions to v34/v35
Master shipped 5 commits since the last sync:
v0.26.3 — admin dashboard hardening (magic-link, per-client TTL, params/error_message)
v0.26.2 — oauth bun execSync env + BIGINT-as-string fix
v0.26.1 — oauth client_credentials bearer auth fix
v0.26.0 — MCP Keys OAuth 2.1 + HTTP server + admin dashboard
v0.25.1 — book-mirror flagship + 8 research skills + skillpack uninstall
Master claimed v32 (oauth_infrastructure) and v33 (admin_dashboard_columns_v0_26_3)
schema migrations. My v0.28 migrations were already at v32/v33 from the prior
v0.25 merge. Renumbering both forward to v34/v35:
v0.28 originally targeted v31/v32
master v0.25 claimed v31 (eval_capture_tables) → my migrations v32/v33
master v0.26 claimed v32/v33 (oauth + admin) → my migrations v34/v35
Conflict resolutions:
- VERSION + package.json → 0.28.0 (mine; > master's 0.26.3)
package.json kept master's new scripts (admin build, no-legacy-getconnection check)
- src/cli.ts → kept both branches' new CLI commands ('mounts' + 'book-mirror'
from master, 'takes' + 'think' from mine)
- src/commands/auth.ts → preserved both 'permissions' (mine) and
'register-client'/'revoke-client' (master OAuth) cases. Help text merged.
- src/core/operations.ts → kept both takesHoldersAllowList (mine) and brainId
(master mounts) on OperationContext.
- src/core/migrate.ts → renumbered + comment block updated.
- v0_28_0 orchestrator schema-version assertion bumped >= 35.
- skills/migrations/v0.28.0.md + CHANGELOG.md doc refs updated to v34/v35.
- All other files (engine.ts, types.ts, postgres-engine.ts, pglite-engine.ts,
utils.ts, schema-embedded.ts, etc.) auto-merged cleanly.
Verification:
- bun run typecheck → clean
- v0.28 test sweep (PGLite + Postgres) → 163/163 pass
- master eval + migrate tests → 203/203 pass
- Combined: 366 tests, 0 regressions
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hink)
test/oauth.test.ts enforces an invariant from master's v0.26 OAuth landing:
every Operation must have `scope: 'read' | 'write' | 'admin'`, and any op
flagged `mutating: true` must be 'write' or 'admin'. My v0.28 ops were
added before master shipped v0.26 + the new invariant; the merge surfaced
the gap.
Annotations:
- takes_list → read
- takes_search → read
- think → write (mutating: true; --save persists a synthesis page)
Verification:
- bun test test/oauth.test.ts → 42/42 pass
- bun run typecheck → clean
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…35/v36
Master shipped three releases while v0.28 was in flight:
- v0.26.4: parallel test runner (run-unit-parallel.sh)
- v0.26.5: destructive-operation guard (deleted_at, soft-delete) — claimed
  v34
- v0.26.6: PGLite ↔ Postgres schema-drift parity gate
Master's new v34 (destructive_guard_columns) collides with my v34
(takes_and_synthesis_evidence). Renumbered my migrations:
- takes_and_synthesis_evidence: v34 → v35
- access_tokens_permissions: v35 → v36
Updated assertions:
- src/commands/migrations/v0_28_0.ts schema-version gate: >= 35 → >= 36
- skills/migrations/v0.28.0.md: v34/v35 → v35/v36 references
- CHANGELOG.md v0.28.0 entry: schema migrations row + section header
Conflicts resolved:
- VERSION → 0.28.0 (kept ours)
- package.json → 0.28.0 + master's new test scripts (auto-merged structure)
- CHANGELOG.md → kept v0.28.0 entry on top, master's v0.26.4–v0.26.6 below
Verified post-merge:
- bun run typecheck: PASS
- 88 migration tests (migrate + bootstrap-coverage + apply-migrations)
- 99 v0.28 unit tests (takes-engine + fence + extract + page-lock +
  model-config + mcp-allowlist + think-pipeline + budget-meter + auto-think)
- 119 master-side tests (oauth + http-transport + destructive-guard +
  pages-soft-delete + schema-diff)
…al.test.ts
Master shipped v0.26.7 with a new test-isolation lint
(scripts/check-test-isolation.sh) that flags any test mutating process.env
outside withEnv(). Three v0.28 tests violated the new rule:
- test/model-config.test.ts → model-config.serial.test.ts
- test/takes-mcp-allowlist.test.ts → takes-mcp-allowlist.serial.test.ts
- test/think-pipeline.test.ts → think-pipeline.serial.test.ts
Renaming to .serial.test.ts is the lint's documented escape hatch for
genuinely env-coupled tests; they now run in the serial pass at
--max-concurrency=1 instead of the parallel fast loop.
Conflicts resolved:
- VERSION → 0.28.0 (kept ours)
- package.json → 0.28.0 + master's check-test-isolation script
- CHANGELOG.md → kept v0.28.0 entry on top, master's v0.26.7 below
Verified post-merge:
- bun run verify: PASS (privacy + jsonb + progress + test-isolation + wasm
  + admin-build + typecheck)
- 37 v0.28 serial tests pass (model-config 16, mcp-allowlist 8, think 13)
…tions to v37/v38
Master shipped three more releases while v0.28 was in flight:
- v0.26.8: auto-RLS event trigger (migration v35)
- v0.26.9: OAuth RFC 6749 hardening + close HTTP MCP shell-job RCE
- v0.27.0: pluggable embedding providers via Vercel AI SDK (migration v36 —
  subagent_provider_neutral_persistence_v0_27)
Master's new v35 + v36 collide with my v35 + v36 (v0.28 takes layer).
Renumbered my migrations:
- takes_and_synthesis_evidence: v35 → v37
- access_tokens_permissions: v36 → v38
Updated assertions:
- src/commands/migrations/v0_28_0.ts schema-version gate: >= 36 → >= 38
- skills/migrations/v0.28.0.md: v35/v36 → v37/v38 references
- CHANGELOG.md v0.28.0 entry: schema migrations row + section header
Conflicts resolved:
- VERSION → 0.28.0 (kept ours)
- package.json → 0.28.0 + master's new ai-sdk deps (@ai-sdk/anthropic,
  @ai-sdk/google, @ai-sdk/openai, @ai-sdk/openai-compatible, ai,
  eventsource-parser, zod)
- src/cli.ts CLI_ONLY → kept both 'providers' (master v0.27) and 'takes',
  'think' (mine)
- CHANGELOG.md → kept v0.28.0 entry on top, master's v0.26.8 below
Verified post-merge:
- bun install: ai-sdk deps installed
- bun run typecheck: PASS
- bun run verify: PASS (privacy + jsonb + progress + test-isolation + wasm
  + admin-build + typecheck)
- 203 tests pass: 88 migration + 99 v0.28 unit + 16 model-config serial
- 80 master-side tests pass (oauth + http-transport)
* refactor(core): extract SSRF helpers from integrations.ts to core/url-safety.ts
src/core/git-remote.ts (next commit) needs isInternalUrl etc. but importing
from src/commands/ would invert the layering boundary (no existing
src/core/ file imports from src/commands/). Extract the SSRF helpers
(parseOctet, hostnameToOctets, isPrivateIpv4, isInternalUrl) into a new
src/core/url-safety.ts and have integrations.ts re-export for backward
compat. test/integrations.test.ts continues to pass without changes (110
existing tests, 214 expects).
Why this matters for v0.28: the upcoming sources --url feature reuses
this SSRF gate for git-clone URL validation. Codex review caught that
re-rolling weaker URL classification would regress on the IPv6/v4-mapped/
metadata/CGNAT bypass forms that integrations.ts already handles.
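The bypass forms named above come down to normalizing every numeric hostname spelling to octets before classifying. A minimal sketch, assuming the helper names from this commit (parseOctet, hostnameToOctets, isPrivateIpv4 are real exports; the exact signatures here are assumptions):

```typescript
// Normalize hex (0x7f), octal (0177), decimal, and single-integer
// (2130706433) hostname forms before any private-range check, so an
// attacker can't spell 127.0.0.1 a way the classifier doesn't recognize.
function parseOctet(s: string): number | null {
  let v: number;
  if (/^0x[0-9a-f]+$/i.test(s)) v = parseInt(s, 16);   // hex form
  else if (/^0[0-7]+$/.test(s)) v = parseInt(s, 8);     // octal form
  else if (/^\d+$/.test(s)) v = parseInt(s, 10);        // decimal form
  else return null;
  return v >= 0 && v <= 0xffffffff ? v : null;
}

export function hostnameToOctets(host: string): number[] | null {
  const parts = host.split(".").map(parseOctet);
  if (parts.some((p) => p === null)) return null;
  const nums = parts as number[];
  if (nums.length === 1) {
    // single-integer form: 2130706433 → 127.0.0.1
    const n = nums[0];
    return [(n >>> 24) & 255, (n >>> 16) & 255, (n >>> 8) & 255, n & 255];
  }
  return nums.length === 4 && nums.every((n) => n <= 255) ? nums : null;
}

export function isPrivateIpv4(octets: number[]): boolean {
  const [a, b] = octets;
  return a === 10 || a === 127 || (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) || (a === 169 && b === 254) ||
    (a === 100 && b >= 64 && b <= 127); // CGNAT 100.64/10
}
```

The octal-before-decimal ordering matters: `0177` must parse base-8, not base-10.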
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(core): add git-remote module — SSRF-defensive clone/pull + state probe
New src/core/git-remote.ts (~210 lines) for v0.28's remote-source feature:
- GIT_SSRF_FLAGS exported const: -c http.followRedirects=false,
-c protocol.file.allow=never, -c protocol.ext.allow=never,
--no-recurse-submodules. Single source of truth shared by cloneRepo
and pullRepo so a future flag added to one path lands on both.
Closes the SSRF surfaces codex flagged: DNS rebinding via redirects,
.gitmodules as a second-fetch surface, file:// scheme in remotes.
- parseRemoteUrl: https-only, rejects embedded credentials and path
traversal, delegates internal-target classification to isInternalUrl
from url-safety.ts (covers RFC1918, link-local, loopback, IPv6, CGNAT
100.64/10, metadata hostnames, hex/octal/single-int bypass forms).
GBRAIN_ALLOW_PRIVATE_REMOTES=1 escape hatch with stderr warning is
needed for self-hosted git over Tailscale (CGNAT trips the gate).
- cloneRepo: --depth=1 default (full clone via depth: 0); refuses
non-empty destDirs; spawns git via execFileSync (no shell injection)
with GIT_TERMINAL_PROMPT=0 + askpass=/bin/false to prevent credential
prompts. timeoutMs default 600s.
- pullRepo: -C path + GIT_SSRF_FLAGS + pull --ff-only, same env confine.
- validateRepoState: 6-state decision tree (missing | not-a-dir |
no-git | corrupted | url-drift | healthy). Used by performSync's
re-clone branch to recover from rm'd clone dirs and refuse syncs on
url-drift or corruption.
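The shared-flag-set idea above can be sketched as follows. GIT_SSRF_FLAGS and cloneRepo are names from this commit; the argv-builder split and the exact option layout are assumptions:

```typescript
import { execFileSync } from "node:child_process";

// Single source of truth for the SSRF-defensive git flags, shared by
// clone and pull so a future flag added to one path lands on both.
export const GIT_SSRF_FLAGS: readonly string[] = [
  "-c", "http.followRedirects=false",
  "-c", "protocol.file.allow=never",
  "-c", "protocol.ext.allow=never",
  "--no-recurse-submodules",
];

// Building argv separately from spawning keeps the flag set testable
// without running git (the fake-git harness pattern).
export function buildCloneArgs(url: string, destDir: string, depth = 1): string[] {
  const args = ["clone", ...GIT_SSRF_FLAGS];
  if (depth > 0) args.push(`--depth=${depth}`); // depth: 0 → full clone
  return [...args, url, destDir];
}

export function cloneRepo(url: string, destDir: string, depth = 1, timeoutMs = 600_000): void {
  // execFileSync takes an argv array and spawns no shell → no shell injection.
  execFileSync("git", buildCloneArgs(url, destDir, depth), {
    env: { ...process.env, GIT_TERMINAL_PROMPT: "0", GIT_ASKPASS: "/bin/false" },
    timeout: timeoutMs,
    stdio: "pipe",
  });
}
```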
test/git-remote.test.ts (304 lines, 32 tests): GIT_SSRF_FLAGS exact
shape, all parseRemoteUrl rejection cases including dedicated CGNAT
100.64/10 with/without GBRAIN_ALLOW_PRIVATE_REMOTES (codex T3 case),
fake-git harness for argv assertions on cloneRepo/pullRepo, all 6
validateRepoState branches.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(core): add scope hierarchy + ALLOWED_SCOPES allowlist
New src/core/scope.ts (~120 lines) for v0.28's scoped MCP feature.
Hierarchy:
- admin implies all (escape hatch)
- write implies read
- sources_admin and users_admin are siblings (different axes —
sources-mgmt vs user-account-mgmt; neither implies the other)
Exported:
- hasScope(grantedScopes, requiredScope): the canonical scope check.
Replaces exact-string-match at three call sites in upcoming commits
(serve-http.ts:673, oauth-provider.ts:365 F3 refresh, oauth-provider.ts:498
token issuance). Without this rewrite, an admin-grant token would
fail to refresh down to sources_admin (codex finding).
- ALLOWED_SCOPES set + ALLOWED_SCOPES_LIST sorted array (deterministic
for OAuth metadata wire format and drift-check output).
- assertAllowedScopes / InvalidScopeError: registration-time gate so
tokens with bogus scope strings (read flying-unicorn) get rejected
with RFC 6749 §5.2 invalid_scope at auth.ts:296 + DCR /register +
registerClientManual. Today's behavior accepts any string silently.
- parseScopeString: space-separated wire format → array.
Forward-compat: hasScope ignores unknown granted scopes rather than
throwing, so pre-allowlist tokens with weird scope strings continue
working without crashes (registration is the gate, runtime is best-effort).
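A minimal sketch of the hierarchy table and the two helpers described above (hasScope and parseScopeString are names from this commit; the table encoding is an assumption):

```typescript
// admin implies all; write implies read; the *_admin scopes are siblings
// on different axes and imply nothing beyond themselves.
const IMPLIES: Record<string, string[]> = {
  admin: ["admin", "write", "read", "sources_admin", "users_admin"],
  write: ["write", "read"],
  read: ["read"],
  sources_admin: ["sources_admin"],
  users_admin: ["users_admin"],
};

export function hasScope(granted: string[], required: string): boolean {
  // Unknown granted scopes are ignored rather than throwing, so
  // pre-allowlist tokens with weird scope strings keep working.
  return granted.some((g) => (IMPLIES[g] ?? []).includes(required));
}

export function parseScopeString(s: string | undefined | null): string[] {
  // space-separated OAuth wire format → array
  return (s ?? "").split(" ").filter(Boolean);
}
```

Under this shape an admin grant satisfies a sources_admin check (the F3 refresh case), while a sources_admin grant still fails a users_admin check.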
test/scope.test.ts (178 lines, 35 tests): hierarchy table including
all-implies for admin, sibling non-implication of *_admin scopes,
write→read but not the reverse, F3 refresh-token subset semantics
under hasScope, ALLOWED_SCOPES_LIST sorted-pinning, allowlist
rejection cases, parseScopeString edge cases (undefined/null/empty).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* build(admin): scope-constants mirror + drift CI for src/core/scope.ts
The admin React SPA's tsconfig.json sets include: ['src'], scoping compilation to admin/src/,
so it cannot directly import ../../src/core/scope.ts. The plan considered
widening the include or generating a single source of truth; both options
either couple the SPA to the gbrain monorepo or add a build step. Eng
review picked the boring choice: hand-maintained mirror at
admin/src/lib/scope-constants.ts plus a CI drift check.
Files:
- admin/src/lib/scope-constants.ts: hand-maintained ALLOWED_SCOPES_LIST
duplicate, sorted alphabetically to match src/core/scope.ts.
- scripts/check-admin-scope-drift.sh: extracts the list from each file
via awk, normalizes via tr/sort, diffs. Exits 0 on match, 1 on drift
(with full breakdown of which scopes diverged), 2 on internal error.
Tested both passing and corrupted paths.
- package.json: wires check:admin-scope-drift into both `verify` and
`check:all` so any update to src/core/scope.ts that forgets the
admin-side mirror fails the build.
The Agents.tsx scope-checkbox sites (5 hardcoded locations) get updated
in a later commit to import from this constants file.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(oauth): hasScope hierarchy + ALLOWED_SCOPES allowlist at registration
Switch three call sites in oauth-provider.ts from exact-string-match to
hasScope() so the v0.28 sources_admin and users_admin scopes — and the
admin-implies-all + write-implies-read hierarchy in src/core/scope.ts —
work end to end:
- F3 refresh-token subset enforcement at line 365: previously rejected
admin → sources_admin refresh because exact-match treated them as
unrelated scopes. gstack /setup-gbrain Path 4 needs admin tokens to
refresh down to least-privilege sources_admin scope; this fix lands
that path.
- Token issuance intersection at line 498 (client_credentials grant):
same hasScope swap so a client whose stored grant is `admin` can mint
tokens including any implied scope.
- registerClient (DCR /register) and registerClientManual: validate
every scope string against ALLOWED_SCOPES via assertAllowedScopes.
Pre-fix the system silently accepted `--scopes "read flying-unicorn"`
and persisted the bogus string in oauth_clients.scope. Post-fix the
caller gets RFC 6749 §5.2 invalid_scope. Existing rows with
pre-allowlist scopes keep working (allowlist gates registration only).
Tests amended in test/oauth.test.ts:
- T1 (eng-review): admin grant CAN refresh down to sources_admin
- T1 sibling: write grant CANNOT refresh up to sources_admin
- ALLOWED_SCOPES allowlist coverage (manual + DCR paths, all 5 valid)
- Scope-annotation contract tests widened to accept the v0.28 union
62 OAuth tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(serve-http): hasScope at /mcp + advertise full ALLOWED_SCOPES
Two changes against src/commands/serve-http.ts:
- Line 195: scopesSupported on the mcpAuthRouter options switches from the
hardcoded ['read','write','admin'] to Array.from(ALLOWED_SCOPES_LIST).
Without this, /.well-known/oauth-authorization-server keeps reporting
the old triple, so MCP clients (Claude Desktop, ChatGPT, Perplexity)
cannot discover the v0.28 sources_admin and users_admin scopes via
standard discovery — they would have to be pre-configured out of band.
- Line 673: request-time scope check on /mcp swaps
authInfo.scopes.includes(requiredScope) for hasScope(...). This was
the most-cited codex finding: without it, sources_admin tokens could
not even satisfy a `read`-scoped op (sources_admin doesn't include
the literal string "read"). hasScope routes through the hierarchy
table in src/core/scope.ts so admin implies all and write implies
read at the gate too.
T2 amendment in test/e2e/serve-http-oauth.test.ts: assert
/.well-known/oauth-authorization-server includes all 5 scopes in
scopes_supported. Pre-v0.28 the list was hardcoded to ['read','write',
'admin'] and this assertion would have failed. (The test is
Postgres-gated; runs under bun run test:e2e with DATABASE_URL set.)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(core): sources-ops module — atomic clone + symlink-safe cleanup
src/core/sources-ops.ts (~470 lines): pure async functions extracted from
src/commands/sources.ts so the CLI handlers and the new MCP ops share
one implementation.
addSource: D3 atomicity contract from the eng review.
1. Validate id (matches existing SOURCE_ID_RE).
2. Q4 pre-flight SELECT — fail loudly with structured `source_id_taken`
before any clone work. Pre-fix the existing CLI used INSERT…ON
CONFLICT DO NOTHING which silently no-op'd; with clone-first that
would orphan the temp dir.
3. parseRemoteUrl gate (delegates to isInternalUrl from url-safety.ts).
4. Clone into $GBRAIN_HOME/clones/.tmp/<id>-<rand>/ via the new
git-remote helpers.
5. INSERT row with local_path=<final clone dir>, config.remote_url=<url>.
6. fs.renameSync(tmp/, final/). Rollback on either-side failure unlinks
the temp dir; rename-failed path also DELETEs the just-INSERTed row
best-effort.
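The clone-then-rename steps above can be sketched as a pure control-flow skeleton. This is a sketch under stated assumptions: addSourceAtomic and the callback shape are hypothetical names for illustration, with clone/INSERT/DELETE injected so the rollback paths are visible:

```typescript
import { mkdirSync, mkdtempSync, renameSync, rmSync } from "node:fs";
import { join } from "node:path";

// D3 atomicity: work in a temp dir, publish via rename, roll back on
// either-side failure. Callback shape is an assumption, not the real API.
export function addSourceAtomic(
  id: string,
  clonesDir: string,
  ops: { clone: (dest: string) => void; insertRow: () => void; deleteRow: () => void },
): string {
  mkdirSync(join(clonesDir, ".tmp"), { recursive: true });
  const tmp = mkdtempSync(join(clonesDir, ".tmp", `${id}-`)); // clones/.tmp/<id>-<rand>/
  const final = join(clonesDir, id);
  try {
    ops.clone(tmp);   // step 4: clone into the temp dir
    ops.insertRow();  // step 5: INSERT row pointing at the final dir
  } catch (e) {
    rmSync(tmp, { recursive: true, force: true }); // rollback unlinks the temp dir
    throw e;
  }
  try {
    renameSync(tmp, final); // step 6: atomic publish
  } catch (e) {
    rmSync(tmp, { recursive: true, force: true });
    try { ops.deleteRow(); } catch {} // rename-failed path: best-effort row cleanup
    throw e;
  }
  return final;
}
```

The rename is the commit point: until it runs, nothing exists at the final path, so a SIGKILL mid-clone leaves only a sweepable .tmp orphan.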
removeSource: clone-cleanup with realpath+lstat confinement matching
validateUploadPath() shape at src/core/operations.ts:61. String startsWith
is symlink-unsafe and would let $GBRAIN_HOME/clones/<id> → /etc resolve
out of the confine. Two defenses layered:
- isPathContained (realpath-resolves both sides + parent-with-sep
string check) rejects symlinks whose target falls outside the
confine.
- lstat-then-isSymbolicLink check refuses symlinks whose realpath
happens to land back inside the confine (defense in depth).
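The first defense above could look like this. isPathContained is the name from this commit; the exact semantics here are an assumption:

```typescript
import { realpathSync } from "node:fs";
import { resolve, sep } from "node:path";

// realpath-resolve both sides, then parent-with-separator string check.
// Plain startsWith is symlink-unsafe: clones/<id> → /etc would pass it;
// realpath resolves the link target first, so the escape is caught.
export function isPathContained(parent: string, child: string): boolean {
  try {
    const realParent = realpathSync(resolve(parent));
    const realChild = realpathSync(resolve(child));
    // separator check so /a/bc never passes for parent /a/b
    return realChild === realParent || realChild.startsWith(realParent + sep);
  } catch {
    return false; // fail closed when either side doesn't resolve
  }
}
```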
getSourceStatus: returns clone_state via validateRepoState (the 6-state
decision tree from git-remote.ts). Lets a remote MCP caller diagnose
"healthy | missing | not-a-dir | no-git | url-drift | corrupted" without
SSH access to the brain host. listSources additionally exposes
remote_url so callers can see which sources are auto-managed.
recloneIfMissing: T4 follow-up for `gbrain sources restore` after the
clone dir was autopurged — re-clones via the same temp + rename
atomicity contract. Idempotent (returns false when clone is already
healthy).
test/sources-ops.test.ts (~470 lines, 24 tests): pre-flight collision
(Q4), happy paths for both --path and --url, all four D3 rollback paths
(clone-fail before INSERT, INSERT-fail after clone, rename-fail
post-INSERT, atomic temp-dir cleanup), symlink-target-OUTSIDE-clones
(realpath confinement), symlink-target-INSIDE-clones (lstat-check),
removeSource refuses to delete user-supplied paths, refuses "default"
source, getSourceStatus clone_state branches, T4 recloneIfMissing
recovery + idempotent + no-op for path-only sources, isPathContained
unit tests covering subtree / outside / symlink-escape / fail-closed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(operations): whoami + sources_{add,list,remove,status} MCP ops
Five new ops in src/core/operations.ts auto-flow through src/mcp/tool-defs.ts
so MCP clients (Claude Desktop, ChatGPT, Perplexity, OpenClaw) get them via
standard tools/list discovery — no SDK or transport code changes needed.
Operation.scope union widened to add 'sources_admin' and 'users_admin' (the
v0.28 hierarchy from src/core/scope.ts).
whoami (scope: read): introspect calling identity over MCP.
- Returns `{transport: 'oauth', client_id, client_name, scopes, expires_at}`
for OAuth clients (clientId starts with gbrain_cl_).
- Returns `{transport: 'legacy', token_name, scopes, expires_at: null}`
for grandfathered access_tokens.
- Returns `{transport: 'local', scopes: []}` when ctx.remote === false.
Empty scopes (NOT ['read','write','admin']) is the D2 decision —
returning OAuth-shaped scopes for local callers would resurrect the
v0.26.9 footgun where code conditionally trusted on
`auth.scopes.includes('admin')` instead of `ctx.remote === false`.
- Q3 fail-closed: throws unknown_transport when remote=true AND auth is
missing OR ctx.remote is the literal `undefined` (cast bypass guard).
A future transport that forgets to thread auth doesn't get a free
pass.
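The transport decision tree above, as a sketch — the ctx/auth field names are assumptions, while the D2 empty-scopes rule and Q3 fail-closed rule come from this commit:

```typescript
type AuthInfo = { clientId?: string; tokenName?: string; scopes: string[]; expiresAt?: string | null };
type OpCtx = { remote: boolean | undefined; auth?: AuthInfo };

export function whoami(ctx: OpCtx): Record<string, unknown> {
  if (ctx.remote === false) {
    // D2: local callers get EMPTY scopes, not an OAuth-shaped triple,
    // so nothing can conditionally trust scopes.includes('admin') locally.
    return { transport: "local", scopes: [] };
  }
  if (ctx.remote === undefined || !ctx.auth) {
    // Q3 fail-closed: remote (or cast-bypassed undefined) with missing
    // auth gets no free pass.
    throw new Error("unknown_transport");
  }
  if (ctx.auth.clientId?.startsWith("gbrain_cl_")) {
    return {
      transport: "oauth",
      client_id: ctx.auth.clientId,
      scopes: ctx.auth.scopes,
      expires_at: ctx.auth.expiresAt ?? null,
    };
  }
  // grandfathered access_tokens
  return { transport: "legacy", token_name: ctx.auth.tokenName, scopes: ctx.auth.scopes, expires_at: null };
}
```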
sources_add (sources_admin, mutating): register a source by --path
(existing v0.17 behavior) or --url (v0.28 federated remote-clone path).
Calls into addSource from sources-ops.ts which owns the temp-dir +
rename atomicity.
sources_list (read): list registered sources with page counts, federated
flag, and remote_url. The remote_url field is new — lets a remote MCP
caller see which sources are auto-managed.
sources_remove (sources_admin, mutating): cascade-delete a source +
symlink-safe clone cleanup. Requires confirm_destructive: true when the
source has data.
sources_status (read): per-source diagnostic returning clone_state
('healthy' | 'missing' | 'not-a-dir' | 'no-git' | 'url-drift' |
'corrupted' | 'not-applicable') — lets a remote MCP caller diagnose a
busted clone without SSH access to the brain host.
test/whoami.test.ts (9 tests): pinned transport-detection for all four
return shapes including Q3 fail-closed throw under both auth=undefined
and remote=undefined cast-bypass paths.
test/sources-mcp.test.ts (16 tests): op-metadata pins (scope, mutating,
localOnly), functional handler shape against PGLite, hasScope-driven
scope-enforcement smoke test simulating the serve-http.ts:673 gate
(read-only token rejected for sources_add; sources_admin token allowed;
admin token allowed for everything; gstack /setup-gbrain Path 4 token
covers all 4 ops), SSRF gate at the op layer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(sync): re-clone fallback when clone is missing/no-git/corrupted
src/commands/sync.ts gets a v0.28-aware front-half. When the source has
config.remote_url, performSync calls validateRepoState before the existing
fast-forward pull path:
- 'healthy' → fall through to existing pull (unchanged)
- 'missing' | 'no-git' | 'not-a-dir' → loud stderr "auto-recovery:
re-cloning <id>", then recloneIfMissing handles the temp-dir + rename.
Sync continues from the freshly-cloned head.
- 'corrupted' → throw with structured hint pointing at sources remove
+ add (no syncing wrong state).
- 'url-drift' → throw with hint pointing at the (deferred) sources
rebase-clone command.
Closes the operator-confidence gap: rm -rf $GBRAIN_HOME/clones/<id>/ no
longer breaks future syncs. The next sync sees the missing dir and
recovers via the recorded URL.
src/core/operations.ts: extend ErrorCode with 'unknown_transport' so
whoami's Q3 fail-closed path types check.
test/sources-resync-recovery.test.ts (12 tests): full validateRepoState
state matrix exercised under fake-git, recloneIfMissing recovery from
each degraded state, idempotent on healthy clones, the sync.ts:320
integration path that drives the recovery.
test/sources-ops.test.ts + test/sources-mcp.test.ts: drop the
GBRAIN_PGLITE_SNAPSHOT-disable line so these tests stop forcing cold
init across the parallel-shard runner. With snapshot allowed, init time
drops from 6+s to ~50ms and parallel runs stay under the 5s hook
timeout.
test/sources-mcp.test.ts: tighten scope literal-type so tsc keeps the
union narrow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(cli): sources add --url + restore re-clone, thin-wrapper refactor
src/commands/sources.ts now delegates the data-mutation work to
src/core/sources-ops.ts (added in the previous commit). The CLI handler
parses argv, calls into addSource, and formats output.
Two new flags on `gbrain sources add`:
- `--url <https-url>` : federated remote-clone path (clone + INSERT +
rename, atomic rollback on failure).
- `--clone-dir <path>` : override the default
$GBRAIN_HOME/clones/<id>/ destination.
Validation rejects mutually-exclusive `--url` + `--path`. Errors from
the ops layer (SourceOpError) propagate through the CLI's standard
error wrapper in src/cli.ts so existing tests that assert throw shape
keep passing.
`gbrain sources restore <id>` (T4 from eng review): if the source has a
remote_url AND the on-disk clone was autopurged, call recloneIfMissing
before declaring success. Clone errors print a WARN with recovery
hints rather than failing the restore — the DB row is what restore
guarantees; the clone is best-effort.
54 sources-related tests pass (existing test/sources.test.ts +
sources-ops + sources-mcp).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(doctor,cycle): orphan-clones surface + autopilot purge phase (P1)
addSource's atomicity contract uses a temp dir that gets renamed to the
final clone path. If the process is SIGKILL'd between clone-finish and
rename, the temp dir orphans on disk. Without sweeping these, a brain
server accumulates gigabytes over months of failed `sources add --url`
attempts.
Two layers:
1. `gbrain doctor` now surfaces stale entries. A new orphan_clones check
walks $GBRAIN_HOME/clones/.tmp/, names anything older than 24h, and
prints a warn with disk-byte estimate. Operators see the leak before
`df` complains.
2. The autopilot cycle's existing `purge` phase grows a substep that
nukes .tmp/ entries past the same 72h TTL the page-soft-delete purge
uses. Operator behavior stays uniform across all soft-delete-style
surfaces.
Both layers are filesystem-only (no DB). On a brain that never used
--url cloning, both are no-ops.
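A minimal sketch of the .tmp sweep shared by both layers, assuming a flat clones/.tmp layout; sweepOrphanClones and its return shape are hypothetical names for illustration:

```typescript
import { readdirSync, statSync, rmSync } from "node:fs";
import { join } from "node:path";

// Remove .tmp entries older than ttlMs. Filesystem-only, no DB; on a
// brain that never used --url cloning the readdir fails and we no-op.
export function sweepOrphanClones(clonesDir: string, ttlMs: number, now = Date.now()): string[] {
  const tmpDir = join(clonesDir, ".tmp");
  const purged: string[] = [];
  let entries: string[];
  try { entries = readdirSync(tmpDir); } catch { return purged; } // missing .tmp → no-op
  for (const name of entries) {
    const full = join(tmpDir, name);
    if (now - statSync(full).mtimeMs > ttlMs) {
      rmSync(full, { recursive: true, force: true });
      purged.push(name);
    }
  }
  return purged;
}
```

Doctor would call this with a dry-run/report variant and a 24h TTL; the autopilot purge phase would use the 72h TTL it already applies to page soft-deletes.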
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* build(admin): scope checkboxes source from scope-constants mirror + dist
admin/src/pages/Agents.tsx Register Client modal:
- useState default sources from ALLOWED_SCOPES_LIST (defaulting `read`
to true, others false; unchanged UX for the common case).
- Scope checkbox map iterates ALLOWED_SCOPES_LIST instead of the old
hardcoded ['read','write','admin'].
Without this commit, even with the v0.28.1 server-side scope hierarchy,
operators registering an OAuth client from the admin UI cannot tick the
new sources_admin / users_admin scopes — defeats the whole gstack
/setup-gbrain Path 4 unblock.
The drift-check CI gate (scripts/check-admin-scope-drift.sh) ensures
this list stays in sync with src/core/scope.ts going forward.
admin/dist/* rebuilt via `cd admin && bun run build`. Old hash bundle
removed; new bundle (224.96 kB / 68.70 kB gzip).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: v0.28.1 — remote-source MCP + scope hierarchy + whoami
VERSION + package.json: bump to 0.28.1 (per CLAUDE.md branch-scoped
versioning rule — this branch adds substantial new features on top of
v0.28.0).
CHANGELOG.md: new top-level entry for v0.28.1 in the gstack/Garry voice
(no AI vocabulary, no em dashes, real numbers + commands). Lead
paragraph names what the user can now do that they couldn't before.
"Numbers that matter" table calls out the +5 MCP ops, +2 OAuth scopes,
and the 4-to-0 SSH-step number for gstack /setup-gbrain Path 4. "What
this means for you" closer ties the work to the operator workflow shift.
"To take advantage of v0.28.1" block has paste-ready upgrade commands
including the admin SPA rebuild step. Itemized changes section
describes the architecture cleanly without exposing scope-string
internals to public attack-surface enumeration (per CLAUDE.md
responsible-disclosure rule).
TODOS.md: file 6 follow-ups under a new "Remote-source MCP follow-ups
(v0.28.1)" section: token rotation, migration introspection in
get_health, Accept-header friendliness, sources rebase-clone for
URL-drift recovery, --filter=blob:none partial-clone option, and the
chunker_version PGLite-schema parity codex caught.
README.md: short subsection under the existing sources CLI listing
that names the new --url flag and what auto-recovery does. Capability
framing (no scope-string enumeration).
llms.txt + llms-full.txt: regenerated via `bun run build:llms` so the
documentation bundle reflects the v0.28.1 entry. The build-llms
generator's drift check passes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(e2e): sources-remote-mcp — full gstack /setup-gbrain Path 4 round-trip
Spins up `gbrain serve --http` against real Postgres with a fake-git binary
in PATH (so `git clone` is exercised end-to-end without network), registers
two OAuth clients (sources_admin + read-only), mints tokens, calls the new
v0.28.1 MCP ops via /mcp, and asserts the gstack /setup-gbrain Path 4 flow
works end to end.
12 tests cover the full lifecycle:
- whoami over HTTP MCP returns transport=oauth + the right scopes
- /.well-known/oauth-authorization-server advertises all 5 scopes
- sources_add: clone fires, INSERT lands, row carries config.remote_url
- sources_status: clone_state=healthy after add
- sources_list: surfaces remote_url for the new source
- SSRF rejection: sources_add with RFC1918 URL fails at parseRemoteUrl gate
- Scope enforcement: read-only token gets insufficient_scope on sources_add
- Read-only token CAN call sources_list (read-scoped op)
- ALLOWED_SCOPES allowlist: CLI register-client rejects bogus scope
- Recovery: rm clone dir + sources_status reports clone_state=missing
- sources_remove: cascades + cleans up the auto-managed clone dir
Subprocess env threading replicates the v0.26.2 bun execSync inheritance
pattern — bun does NOT inherit process.env mutations, so every CLI
subprocess call passes env: { ...process.env } explicitly.
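The env-threading pattern above, sketched as a tiny wrapper (runCli is a hypothetical name; the explicit `{ ...process.env }` snapshot is the point):

```typescript
import { execSync } from "node:child_process";

// bun subprocesses do not inherit process.env mutations made in the test,
// so every CLI call passes an explicit env snapshot plus overrides.
export function runCli(cmd: string, extra: Record<string, string> = {}): string {
  return execSync(cmd, {
    env: { ...process.env, ...extra }, // explicit snapshot, not implicit inheritance
    encoding: "utf8",
  });
}
```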
Cleanup contract mirrors test/e2e/serve-http-oauth.test.ts: revoke any
clients we registered, force-kill the server subprocess on SIGTERM
timeout, surface cleanup failures to stderr without throwing so real
test failures aren't masked.
The base table list in helpers.ts (ALL_TABLES) doesn't include sources
or oauth_clients, so this test explicitly truncates them in beforeAll
to avoid Q4 pre-flight collisions on re-run.
Skipped gracefully when DATABASE_URL is unset.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: codex adversarial review — confine remote sources_admin + close SSRF gaps
Pre-ship adversarial review (codex exec) caught five issues. Four ship in
this commit; the fifth (DNS rebinding) is filed as v0.28.x follow-up.
CRITICAL — `sources_admin` tokens over HTTP MCP could plant content at any
host path. The MCP op exposed `path` and `clone_dir` to remote callers; the
op layer trusted them verbatim, then auto-recovery's rm -rf on degraded
state turned that into arbitrary delete primitives. src/core/operations.ts
sources_add handler now drops both fields when ctx.remote !== false. Local
CLI keeps the override (operator trust). Loud logger.warn when a remote
caller tries — visible in the SSE feed without leaking values.
HIGH — Steady-state `git pull --ff-only` bypassed GIT_SSRF_FLAGS entirely.
The legacy helper at src/commands/sync.ts:192 spawned git without the
-c http.followRedirects=false -c protocol.{file,ext}.allow=never
--no-recurse-submodules set that cloneRepo applies. Every recurring sync
was reopening the redirect/submodule/protocol bypass. Routed the call site
at sync.ts:381 through pullRepo from git-remote.ts so initial clone and
ongoing pull share one defensive flag set.
MEDIUM — listSources ignored its `include_archived` flag. The op
advertised the param but the function destructured it as `_opts` and
queried every row. Archived sources' ids, local_paths, and remote_urls
were leaking to read-scoped MCP callers by default. Filter in SQL
(`WHERE archived IS NOT TRUE` unless the flag is set) so archived rows
never reach the wire.
PARTIAL HIGH — IPv6 ULA fc00::/7 and link-local fe80::/10 were not in
the isInternalUrl bypass list. Only ::1/:: and IPv4-mapped IPv6 were
blocked. Added regex-based ULA + link-local rejection to url-safety.ts.
Test coverage:
- test/git-remote.test.ts: 4 new IPv6 cases (ULA fc-prefix + fd-prefix,
link-local fe80::, public IPv6 still allowed).
- test/sources-mcp.test.ts: 3 new cases pinning the remote/local
asymmetry (clone_dir override silently ignored over MCP, path nulled,
local CLI keeps the override).
- test/sources-mcp.test.ts: 2 new cases for include_archived honored.
DNS rebinding (codex finding #3): the current gate is lexical only.
A deliberate attacker who controls a hostname's A/AAAA records can still
resolve to an internal IP. Closing this requires async DNS resolution +
revalidation; filed as v0.28.x follow-up in TODOS.md so the API change
surface (parseRemoteUrl becomes async, every caller updates) lands in
its own PR.
323 tests pass (9 files); 4071 unit tests pass (full suite).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: rebump v0.28.1 → v0.28.2 (master collision)
Caught after PR creation. master is at v0.28.1 already; this branch
forked from garrytan/v0.28-release at v0.28.0 and naively bumped to
v0.28.1 without checking the master queue. CI version-gate would have
rejected at merge time (requires VERSION strictly greater than
master's).
Root cause: I bumped VERSION mechanically during plan implementation
(echo "0.28.1" > VERSION) without consulting the queue-aware allocator
at bin/gstack-next-version. /ship Step 12's idempotency check then
classified state as ALREADY_BUMPED and the workflow's "queue drift"
comparison was the safety net I should have hit — but I skipped it.
Files updated:
- VERSION + package.json: 0.28.1 → 0.28.2
- CHANGELOG.md: header + "To take advantage of v0.28.2" subsection
- README.md: sources --url note version reference
- TODOS.md: 7 follow-up entries' version references
- llms.txt + llms-full.txt: regenerated
PR title rewrite via gstack-pr-title-rewrite.sh handled in a separate
gh pr edit call; CI version-gate now passes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
….28.6
Master shipped three v0.28.x patch releases without the takes feature while
v0.28-release was in flight:
- v0.28.1: zombie process accumulation + health endpoint timeout (#637)
- v0.28.3: restart-sweep — detect dropped Telegram messages (#675)
- v0.28.4: skillify cross-modal eval quality gate (#674)
Master's v0.28.0 slot was consumed without the takes layer ever landing, so
this release ships the original takes feature as v0.28.6 (skipping v0.28.5
to leave space for any in-flight master patches). The migration orchestrator
file (v0_28_0.ts) and migration skill doc (skills/migrations/v0.28.0.md)
keep their original version keys — those identify the migration version,
not the release version.
Conflicts resolved:
- VERSION → 0.28.6 (was 0.28.0; master had 0.28.4)
- package.json → 0.28.6 (auto-merged ai-sdk deps from master's v0.27)
- CHANGELOG.md → renamed top entry "## [0.28.0]" → "## [0.28.6]" with date
  2026-05-06; rebuilt the "To take advantage of" block (was truncated by
  stale === markers from a prior merge); preserved master's
  v0.28.4/v0.28.3/v0.28.1 entries beneath
- src/cli.ts auto-merged (CLI_ONLY has providers + takes/think both)
Verified post-merge:
- bun run verify: PASS (privacy + jsonb + progress + test-isolation + wasm +
  admin-build + typecheck)
- 133 tests pass: migrate + apply-migrations + takes-engine + takes-fence
- migrations v37 (takes) + v38 (access_tokens_permissions) apply cleanly on
  top of master's v35 (auto-RLS) + v36 (subagent persistence)
… v0.28.6
While preparing the takes release as v0.28.6, the remote branch landed
v0.28.2 (remote-source MCP + scope hierarchy + whoami, PR #690). Pulling
that into local while keeping the takes feature on its v0.28.6 slot.
Conflicts resolved:
- VERSION → 0.28.6 (kept ours; remote was 0.28.2)
- package.json → 0.28.6 (kept ours)
- CHANGELOG.md → kept "## [0.28.6]" header on top; inserted remote's
  "## [0.28.2]" entry between v0.28.3 and v0.28.1 in version-descending
  order. Dropped the duplicate "## [0.28.0]" header from remote since that
  was the original takes release that I renamed to v0.28.6.
- TODOS.md → kept BOTH sides' new TODO entries (cross-modal-eval follow-ups
  + v0.28.2 follow-ups; non-overlapping content).
Verified post-merge:
- bun run verify: PASS (privacy + jsonb + progress + test-isolation + wasm +
  admin-build + admin-scope-drift + typecheck)
- 118 tests pass: migrate + apply-migrations + takes-engine
- Migration sequence intact: v37 (takes) + v38 (access_tokens_permissions)
  on top of master's v35 + v36
Master shipped v0.28.5 (PGLite upgrade wedge + embedding dim corruption +
bun-link foot-gun fix wave, PR #697). This release stays on v0.28.6.
Conflicts resolved:
- VERSION → 0.28.6 (kept ours; master had 0.28.5)
- package.json version → 0.28.6
- package.json scripts → kept BOTH new check scripts: my
  check:admin-scope-drift (from v0.28.2 cherry) + master's check:cli-exec
  (new in v0.28.5). Verify pipeline now runs both; check:all runs both.
- CHANGELOG.md → kept "## [0.28.6]" header on top; inserted master's full
  v0.28.5 entry between v0.28.6 and v0.28.4 in version-descending order.
  The "## To take advantage of v0.28.5" interleaved conflict was untangled
  by extracting master's entry from origin/master:CHANGELOG.md rather than
  trying to weave the two "to take advantage of" blocks back together
  inline.
Verified post-merge:
- bun run verify: PASS (privacy + jsonb + progress + test-isolation + wasm +
  admin-build + admin-scope-drift + cli-exec + typecheck)
- 121 tests pass: migrate + apply-migrations + takes-engine
- CHANGELOG order intact: 0.28.6 → 0.28.5 → 0.28.4 → 0.28.3 → 0.28.2 → 0.28.1
garrytan added a commit that referenced this pull request on May 8, 2026
* v0.28 model-config: unified resolveModel with 6-tier precedence + alias resolution
Replaces every hardcoded `claude-*-X` and per-phase `dream.<phase>.model`
config key with a single resolver. Hierarchy:
1. CLI flag (--model)
2. New-key config (e.g. models.dream.synthesize)
3. Old-key config (deprecated dream.synthesize.model, dream.patterns.model)
— read with stderr deprecation warning, one-per-process
4. Global default (models.default)
5. Env var (GBRAIN_MODEL or caller-supplied)
6. Hardcoded fallback
Aliases (`opus`, `sonnet`, `haiku`, `gemini`, `gpt`) resolve at the end so
any tier can use a short name. User-defined `models.aliases.<name>` config
overrides built-ins. Cycle-safe (depth 2 break). Unknown alias passes
through unchanged so users can pass full provider IDs without registering.
When new-key + old-key are BOTH set (Codex P1 #11 fix), new-key wins and
stderr warns "deprecated config X ignored; Y is set and wins". When only
old-key is set, it's honored with a softer "rename to Y before v0.30"
warning. Both warnings emit once per (key, process) — a Set memo prevents
log spam in long-running daemons.
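A sketch of the resolver's shape under the rules above; tier sources and alias IDs are placeholders, not the shipped config keys:

```typescript
// Illustrative 6-tier precedence + alias resolution. The real resolver
// in model-config.ts reads config/env and emits deprecation warnings;
// this sketch only pins the ordering and the depth-2 alias break.
type ModelConfig = {
  newKey?: string;                  // e.g. models.dream.synthesize
  oldKey?: string;                  // deprecated per-phase key
  globalDefault?: string;           // models.default
  aliases?: Record<string, string>; // user-defined, overrides built-ins
};

const BUILTIN_ALIASES: Record<string, string> = { opus: "example-opus-id" }; // placeholder ID

function resolveModel(cfg: ModelConfig, cliFlag?: string, envVar?: string, fallback = "example-default"): string {
  // Tiers 1-6: first defined value wins; new-key beats old-key.
  const raw = cliFlag ?? cfg.newKey ?? cfg.oldKey ?? cfg.globalDefault ?? envVar ?? fallback;
  // Alias resolution runs last so any tier can use a short name.
  const aliases = { ...BUILTIN_ALIASES, ...cfg.aliases };
  let id = raw;
  for (let depth = 0; depth < 2; depth++) {
    const next = aliases[id];
    if (next === undefined) break; // unknown alias passes through unchanged
    id = next;                     // depth-2 break keeps cycles from looping
  }
  return id;
}
```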
Migrated call sites: synthesize.ts (model + verdictModel), patterns.ts
(model). subagent.ts and search/expansion.ts will be migrated later in v0.28
(they stay compatible until then).
Tests: test/model-config.test.ts — 11 cases pinning the 6-tier ordering,
alias resolution + cycle break, deprecated-key warning emit-once, and
unknown-alias pass-through. All pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* v0.28 takes-fence: parser/renderer/upserter + chunker strip (privacy P0 fix)
src/core/takes-fence.ts — pure functions for the fenced markdown surface:
- parseTakesFence(body) — extracts ParsedTake[] from `<!--- gbrain:takes:begin/end -->`
blocks. Strict on canonical form, lenient on hand-edits with warnings
(TAKES_FENCE_UNBALANCED, TAKES_TABLE_MALFORMED, TAKES_ROW_NUM_COLLISION).
Strikethrough `~~claim~~` → active=false; date ranges `since → until`
split into sinceDate/untilDate.
- renderTakesFence(takes) — round-trip safe with parseTakesFence.
- upsertTakeRow(body, row) — append-only per CEO-D6 + eng-D9. Creates a
fresh `## Takes` section if no fence present. row_num is monotonic
(max + 1, never gap-filled — keeps cross-page refs and synthesis_evidence
stable forever).
- supersedeRow(body, oldRow, replacement) — strikes through old row's claim
AND appends the new row at end. Both rows preserved in markdown for
git-blame archaeology.
- stripTakesFence(body) — removes the fenced block entirely. Used by the
chunker so takes content lives ONLY in the takes table.
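The strip behaves roughly like this sketch (marker spelling taken from the fence description above; the shipped parser is stricter):

```typescript
// Minimal stripTakesFence sketch: everything between the begin/end
// HTML-comment markers (inclusive) is removed so chunkers never index it.
const BEGIN = "<!--- gbrain:takes:begin -->";
const END = "<!--- gbrain:takes:end -->";

function stripTakesFence(body: string): string {
  const start = body.indexOf(BEGIN);
  if (start === -1) return body; // no fence: body untouched
  const end = body.indexOf(END, start);
  if (end === -1) return body.slice(0, start); // unbalanced: drop the tail defensively
  return body.slice(0, start) + body.slice(end + END.length);
}
```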
Codex P0 #3 fix: src/core/chunkers/recursive.ts now calls stripTakesFence()
before computing chunk boundaries. Without this, page chunks would contain
the rendered takes table and the per-token MCP allow-list would be
bypassed at the index layer (token bound to takes_holders=['world'] would
see garry's hunches via page hits). Doctor's takes_fence_chunk_leak check
(plan-side) asserts no chunk contains the begin marker.
Tests: 15 cases covering canonical parse, strikethrough, date range, fence
unbalanced detection, malformed-row skip + warning, row_num collision
detection, round-trip render, append-only upsert into existing fence,
fresh-section creation, monotonic row_num under hand-edit gaps, supersede
flow, stripTakesFence verifying takes content removed AND surrounding
prose preserved. Existing chunker tests still pass (15 + 15 = 30).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* v0.28 page-lock: PID-liveness file lock for atomic markdown read-modify-write
src/core/page-lock.ts — per-page file lock at
~/.gbrain/page-locks/<sha256-of-slug>.lock so two concurrent `gbrain takes
add` calls or `takes seed --refresh` from autopilot can't race on the
same `<slug>.md` read-modify-write. Eng-review fold: reuses the v0.17
cycle.lock pattern (mtime + PID liveness) but per-slug.
Differences from cycle.ts's lock:
- SHA-256 of slug for safe filenames (slashes, unicode, etc.)
- Same-pid + fresh mtime = LIVE (cycle.ts assumes one lock per process and
reclaims same-pid; page-lock allows concurrent locks for DIFFERENT slugs
in one process). mtime expiry still rescues post-crash leftovers.
- 5-min TTL (vs cycle's 30 min — page edits are short)
- `withPageLock(slug, fn)` convenience wrapper with default 30s timeout
API:
- acquirePageLock(slug, opts) → handle | null (poll-with-timeout)
- handle.refresh() / handle.release() (idempotent — only releases if pid matches)
- withPageLock(slug, fn, opts) — acquire + run + release-in-finally
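The liveness decision reduces to a small table; this sketch isolates it as a pure function with assumed probe fields (the real lock stats the file and signals the PID):

```typescript
// Sketch of the page-lock liveness decision. Field names on LockProbe
// are illustrative; the shipped code derives them from fs.stat and
// process.kill(pid, 0).
const PAGE_LOCK_TTL_MS = 5 * 60 * 1000; // 5-min TTL (vs cycle's 30 min)

type LockProbe = {
  holderPid: number;
  mtimeAgeMs: number; // now - lock file mtime
  pidAlive: boolean;  // the recorded PID still responds to signal 0
};

// Returns true when the lock must be respected (acquire returns null).
function lockIsLive(p: LockProbe): boolean {
  if (p.mtimeAgeMs > PAGE_LOCK_TTL_MS) return false; // stale mtime rescues crash leftovers
  if (!p.pidAlive) return false;                     // dead PID: reclaim
  // Unlike cycle.ts, same-pid does NOT reclaim: one process may hold
  // locks for different slugs concurrently, so same-pid + fresh mtime = LIVE.
  return true;
}
```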
Tests: 10 cases — fresh acquire, live holder returns null, stale-mtime
reclaim, dead-PID reclaim, refresh updates timestamp, foreign-pid release
is no-op, withPageLock callback runs and releases on success/failure,
timeout-throws when held, SHA-256 filename safety for slashes/unicode.
All pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* v0.28 extract-takes: dual-path phase (fs|db) + since/until_date as TEXT
src/core/cycle/extract-takes.ts — new phase that materializes the takes
table from fenced markdown blocks. Two paths mirror src/commands/extract.ts:
- extractTakesFromFs: walk *.md under repoPath, parse fences, batch upsert
- extractTakesFromDb: iterate engine.getAllSlugs(), parse each page's
compiled_truth+timeline, batch upsert (mutation-immune snapshot iteration)
Single dispatcher extractTakes(opts) routes by source. Honors:
- slugs filter for incremental re-extract (pipes from sync→extract)
- dryRun: count would-be upserts, write nothing
- rebuild: DELETE FROM takes WHERE page_id = $1 before re-insert (clean
slate when markdown is canonical and DB has drifted)
Schema fix: since_date/until_date were DATE in the original v31 migration.
Spec uses partial dates ('2017-01', '2026-04-29 → 2026-06') that Postgres
DATE rejects. Changed to TEXT in both the Postgres and PGLite blocks so
parser-rendered ranges round-trip cleanly. Loses the ability to do
date-range arithmetic in SQL, but date math on opinion timelines is
out of scope for v0.28 anyway. utils.ts dateOrNull now annotated as
v0.28 TEXT-aware.
Migration v31 has not been deployed yet (this branch is the v0.28 release
candidate), so the type swap is free. No data migration needed.
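A sketch of the range split the parser performs on TEXT dates; the function name and return shape are illustrative:

```typescript
// Partial dates like '2017-01' and ranges like '2026-04-29 → 2026-06'
// are stored verbatim as TEXT; the fence parser splits at the arrow.
function splitDateRange(raw: string): { since: string | null; until: string | null } {
  const parts = raw.split("→").map((s) => s.trim());
  if (parts.length === 2) return { since: parts[0] || null, until: parts[1] || null };
  return { since: raw.trim() || null, until: null };
}
```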
Tests: test/extract-takes.test.ts — 5 cases against PGLite covering full
walk + fence-skip on no-fence pages, takes-table populated post-extract,
incremental slugs filter, dry-run no-write, rebuild=true clears + re-inserts
ad-hoc rows. test/takes-engine.test.ts (16), test/takes-fence.test.ts (15)
all still pass — 36/36 takes tests green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* v0.28 takes CLI: list, search, add, update, supersede, resolve
src/commands/takes.ts — surfaces the engine methods + takes-fence library
through a single `gbrain takes <subcommand>` entrypoint:
takes <slug> list with filters + sort
takes search "<query>" pg_trgm keyword search across all takes
takes add <slug> --claim ... ... append (markdown + DB, atomic via lock)
takes update <slug> --row N ... mutable-fields update (markdown + DB)
takes supersede <slug> --row N ... strikethrough old + append new
takes resolve <slug> --row N --outcome record bet resolution (immutable)
Markdown is canonical. Every mutate command:
1. acquires the per-page file lock (withPageLock)
2. re-reads the .md file
3. applies the edit via takes-fence (upsertTakeRow / supersedeRow)
4. writes the .md file back
5. mirrors to the DB via the engine method
6. releases the lock (auto via finally)
Resolve currently writes only to DB — surfacing resolved_* in the markdown
table is deferred to v0.29 (the takes-fence renderer's column set is
fixed at # | claim | kind | who | weight | since | source per spec).
Wired into src/cli.ts dispatch + CLI_ONLY allowlist. Help text follows the
project convention (orphans/embed/extract pattern). --dir flag overrides
sync.repo_path config when working outside the configured brain.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* v0.28 MCP + auth: takes_list / takes_search / think ops + per-token allow-list
OperationContext gains takesHoldersAllowList — server-side filter for
takes.holder field threaded from access_tokens.permissions through dispatch
into the engine SQL. Closes Codex P0 #3 at the dispatch layer (chunker
strip already closed the page-content side in the previous commit).
src/core/operations.ts — three new ops:
- takes_list: lists takes with holder/kind/active/resolved filters; honors
ctx.takesHoldersAllowList for MCP-bound calls
- takes_search: pg_trgm keyword search; honors allow-list
- think: op surface registered (returns not_implemented envelope until
Lane D's pipeline lands). Remote callers cannot save/take per Codex P1 #7.
src/mcp/dispatch.ts — DispatchOpts.takesHoldersAllowList threads into
buildOperationContext.
src/mcp/http-transport.ts — validateToken now reads
access_tokens.permissions.takes_holders, defaults to ['world'] when the
column is absent or malformed (default-deny on private hunches).
auth.takesHoldersAllowList passed to dispatchToolCall.
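The default-deny parse amounts to something like this sketch (shape assumed from the description; the shipped validateToken does more):

```typescript
// Any absent or malformed permissions shape collapses to ['world'] so
// private holders stay hidden unless explicitly granted.
function parseTakesHolders(permissions: unknown): string[] {
  if (typeof permissions === "object" && permissions !== null) {
    const th = (permissions as Record<string, unknown>)["takes_holders"];
    if (Array.isArray(th) && th.length > 0 && th.every((x) => typeof x === "string")) {
      return th as string[];
    }
  }
  return ["world"]; // default-deny on private hunches
}
```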
src/mcp/server.ts (stdio) — defaults to takesHoldersAllowList: ['world']
since stdio has no per-token auth. Operators wanting full visibility use
`gbrain call <op>` directly (sets remote=false).
src/commands/auth.ts — `gbrain auth create <name> --takes-holders w,g,b`
flag persists the per-token list; new `auth permissions <name>
set-takes-holders <list>` updates an existing token.
Tests: test/takes-mcp-allowlist.test.ts — 8 cases against PGLite proving
the threading: local-CLI sees all holders, ['world'] returns only public,
['world','garry'] returns 2/3, no-overlap returns empty (no fallback),
search honors allow-list, remote save/take on think rejected with
not_implemented envelope.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* v0.28.0: ship-prep — VERSION, CHANGELOG, migration orchestrator, skill
Closes the v0.28 ship-prep cycle. Bumps VERSION + package.json + bun.lock
to 0.28.0. v0_28_0 migration orchestrator runs three idempotent phases on
upgrade:
- Schema verify: asserts schema_version >= 32 (migrations v31 + v32 already
applied by the schema runner during gbrain upgrade); fails cleanly if not.
- Backfill takes: runs `extractTakes(engine, { source: 'db' })` inline so
any pre-existing fenced takes tables in markdown populate the takes
index. Idempotent; ON CONFLICT DO UPDATE keeps the table in sync.
- Re-chunk TODO: queues a pending-host-work entry asking the host agent
to re-import pages with takes content so the v0.28 chunker-strip rule
(Codex P0 #3 fix) applies retroactively. Pages imported under v0.28+
already have takes content stripped from chunks at index time; this
TODO catches up legacy pages.
skills/migrations/v0.28.0.md — agent-readable upgrade guide. Walks
through doctor verification, deprecated-key migration, MCP token
visibility configuration, and a "try the takes layer" smoke test.
CHANGELOG.md — v0.28.0 release-summary in the GStack voice (no AI
vocabulary, no em dashes, real numbers from git diff stat) + the
mandatory "To take advantage of v0.28.0" block + itemized changes by
subsystem (schema, engine, markdown surface, model config, MCP+auth,
CLI, tests, accepted risks).
Final test sweep: 65/65 v0.28 tests pass across 6 files. typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* v0.28 think pipeline: gather → sanitize → synthesize → cite-render → CLI
src/core/think/sanitize.ts — prompt-injection defense for take claims:
14 jailbreak patterns (ignore-prior, role-jailbreak, close-take tag,
DAN, system-prompt overrides, eval-shell hooks) plus structural framing
(takes wrapped in <take id="..."> tags the model is told to treat as
DATA). Length-cap at 500 chars. Renders evidence blocks for the prompt.
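A reduced sketch of the sanitize step; the patterns shown are examples, not the shipped 14:

```typescript
// Illustrative subset: a few jailbreak patterns, the 500-char cap, and
// the <take id> DATA framing. Real pattern list lives in sanitize.ts.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore (all )?(prior|previous) instructions/i, // ignore-prior
  /<\/take>/i,                                    // close-take tag escape
  /you are now [a-z]+/i,                          // role-jailbreak / DAN
];

function sanitizeTakeForPrompt(claim: string, id: string): string {
  let safe = claim.slice(0, 500); // length cap
  for (const p of INJECTION_PATTERNS) safe = safe.replace(p, "[removed]");
  // Structural framing: the model is told content inside <take> is DATA.
  return `<take id="${id}">${safe}</take>`;
}
```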
src/core/think/prompt.ts — system prompt + structured-output schema.
Hard rules: cite every claim, mark hunches/low-weight explicitly,
surface conflicts (never silently pick), surface gaps. JSON schema
with answer + citations[] + gaps[]. Prompt adapts to anchor / time
window / save flag.
src/core/think/cite-render.ts — structured citations + regex fallback
(Codex P1 #4 fold). normalizeStructuredCitations validates the model's
structured output; parseInlineCitations is the body-scan fallback when
the model omits the structured field. resolveCitations dispatches and
records CITATIONS_REGEX_FALLBACK warning when used.
src/core/think/gather.ts — 4-stream parallel retrieval:
1. hybridSearch (pages, existing primitive)
2. searchTakes (keyword, pg_trgm)
3. searchTakesVector (vector, when embedQuestion fn supplied)
4. traversePaths (graph, when --anchor set)
RRF fusion (k=60). Each stream wrapped in try/catch — partial gather
beats no synthesis. Honors takesHoldersAllowList for MCP-bound calls.
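RRF fusion over the streams reduces to the standard formula; this sketch assumes plain id lists per stream:

```typescript
// Reciprocal rank fusion with k=60: each stream contributes
// 1 / (k + rank) per id, summed across streams, then sorted.
function rrfFuse(streams: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const stream of streams) {
    stream.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}
```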
src/core/think/index.ts — runThink orchestrator + persistSynthesis:
INTENT (regex classify) → GATHER → render evidence blocks → resolveModel
('models.think' → 'models.default' → GBRAIN_MODEL → opus) → LLM call
(injectable client) → JSON parse with code-fence + fallback strip →
resolveCitations → ThinkResult. persistSynthesis writes a synthesis
page + synthesis_evidence rows (page_id resolved per slug; page-level
citations skip evidence). Degrades gracefully without ANTHROPIC_API_KEY.
Round-loop scaffolding in place (only the rounds=1 path is exercised in v0.28).
src/commands/think.ts — `gbrain think "<question>"` CLI. Flag parsing
strips --anchor, --rounds, --save, --take, --model, --since, --until,
--json. Local CLI = remote=false, so save/take honored. Human-readable
output by default; --json for agent consumption.
operations.ts — `think` op now calls runThink (was a not_implemented
stub). Remote callers can't save/take per Codex P1 #7. Returns full
ThinkResult plus saved_slug + evidence_inserted.
cli.ts — wired into dispatch + CLI_ONLY allowlist.
Tests: test/think-pipeline.test.ts — 18 cases against PGLite covering
sanitize patterns, structural rendering, citation parsing (structured +
regex fallback + dedup + invalid-slug rejection), gather streams +
allow-list filter, full pipeline with stub client, malformed-LLM
fallback path, no-API-key graceful degradation, persistSynthesis writes
page + evidence rows. All pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* v0.28 dream phases: auto-think + drift + budget meter (Codex P1 #10 fold)
src/core/anthropic-pricing.ts — USD/1M-tokens map for Claude 4.7 family
plus older aliases. estimateMaxCostUsd returns null on unpriced models so
the meter caller can warn-once and bypass the gate.
src/core/cycle/budget-meter.ts — cumulative cost ledger. Each submit
estimates max-cost from (model + estimatedInputTokens + maxOutputTokens),
accumulates per-cycle, refuses next submit when projected > cap. Codex
P1 #10 fold: non-Anthropic models (gemini, gpt) bypass with one stderr
warn per process and `unpriced=true` on the result. Budget=0 disables
the gate. Audit trail at ~/.gbrain/audit/dream-budget-YYYY-Www.jsonl.
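The gate logic, sketched with placeholder prices (not real Anthropic pricing):

```typescript
// Cumulative cost ledger sketch. Prices and model IDs are illustrative;
// unpriced models bypass the gate with unpriced=true.
const PRICE_PER_MTOK_USD: Record<string, { input: number; output: number }> = {
  "example-model": { input: 3, output: 15 }, // placeholder numbers
};

class BudgetMeter {
  private spentUsd = 0;
  private capUsd: number;
  constructor(capUsd: number) { this.capUsd = capUsd; }

  submit(model: string, inputTokens: number, maxOutputTokens: number): { allowed: boolean; unpriced: boolean } {
    if (this.capUsd === 0) return { allowed: true, unpriced: false }; // budget=0 disables the gate
    const price = PRICE_PER_MTOK_USD[model];
    if (!price) return { allowed: true, unpriced: true }; // non-Anthropic bypass
    const maxCost = (inputTokens * price.input + maxOutputTokens * price.output) / 1_000_000;
    if (this.spentUsd + maxCost > this.capUsd) return { allowed: false, unpriced: false };
    this.spentUsd += maxCost; // accumulate only on allowed submits
    return { allowed: true, unpriced: false };
  }
}
```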
src/core/cycle/auto-think.ts — auto_think dream phase. Reads
dream.auto_think.{enabled,questions,max_per_cycle,budget,cooldown_days,
auto_commit}. Iterates configured questions through runThink with the
BudgetMeter pre-checking each submit. Cooldown timestamp written ONLY on
success (matches v0.23 synthesize pattern — retries after partial
failures pick back up). When auto_commit=true, persists synthesis pages
via persistSynthesis. Default-disabled.
src/core/cycle/drift.ts — drift dream phase scaffold. Reads
dream.drift.{enabled,lookback_days,budget,auto_update}. Surfaces takes
in the soft band (weight 0.3-0.85, unresolved) that have recent timeline
evidence on the same page. v0.28 ships the orchestration; the LLM judge
that proposes weight adjustments lands in v0.29. modelId + meter wired
now so the ledger captures gate state for callers that opt in.
Tests:
- test/budget-meter.test.ts (7 cases) — pricing-map coverage, allow path,
cumulative-deny, budget=0 disabled, unpriced bypass+warn-once, ledger
captures all events, ISO-week filename branch.
- test/auto-think-phase.test.ts (9 cases) — auto_think enable/skip,
questions empty, success → cooldown ts written, cooldown blocks rerun,
budget exhausted → partial. drift not_enabled, soft-band candidate
detection, complete + dry-run paths.
All pass. Typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* v0.28 e2e Postgres: takes engine + extract + MCP allow-list (12 cases)
test/e2e/takes-postgres.test.ts — full v0.28 takes pipeline against real
Postgres (gated on DATABASE_URL). 12 cases:
- addTakesBatch upsert via unnest() bind path (Postgres-specific)
- listTakes filters: holder, kind, sort=weight, takesHoldersAllowList
- searchTakes pg_trgm + allow-list filter
- supersedeTake transactional path (BEGIN/COMMIT semantics)
- resolveTake immutability — second resolve throws TAKE_ALREADY_RESOLVED
- synthesis_evidence FK CASCADE on take delete
- countStaleTakes + listStaleTakes filter active+null
- extractTakesFromDb populates takes from fenced markdown
- MCP dispatch with takesHoldersAllowList=['world'] returns only world
- MCP dispatch local-CLI path returns all holders
- MCP dispatch takes_search honors allow-list
- think op forces remote_persisted_blocked even for save+take
postgres-engine.ts: addTakesBatch boolean[] serialization fix.
postgres-js auto-detects element type from JS arrays; for booleans it
mis-detects as scalar. Cast through text[] (`'true' | 'false'`) then
SQL-cast to boolean[] — same pattern other batch methods rely on for
type-stable bind shapes.
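The workaround, sketched; the SQL fragment is illustrative, not the literal shipped query:

```typescript
// Serialize booleans to 'true'/'false' text on the JS side so postgres-js
// binds a type-stable text[], then SQL-cast back to boolean[].
function boolsToTextArray(values: boolean[]): string[] {
  return values.map((v) => (v ? "true" : "false"));
}

// Illustrative bind-site shape only:
const exampleSql = "INSERT INTO takes (active) SELECT * FROM unnest($1::text[]::boolean[])";
```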
test/e2e/helpers.ts: setupDB now (a) tolerates non-existent tables in
TRUNCATE (for fresh DBs where v31 hasn't yet created takes/synthesis_evidence)
and (b) calls engine.initSchema() to actually run migrations.
test/takes-mcp-allowlist.test.ts: updated 2 think-op cases to match
Lane D's landed pipeline. They previously asserted not_implemented
envelopes; now they assert remote_persisted_blocked + NO_ANTHROPIC_API_KEY
graceful-degrade behavior.
Run: DATABASE_URL=postgres://localhost:5435/gbrain_test bun test test/e2e/takes-postgres.test.ts
Result: 12/12 pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* v0.28 dream phases: local DreamPhaseResult type (avoid premature CyclePhase enum extension)
cycle.ts's PhaseResult is shaped {phase, status, summary, details} with a
narrow PhaseStatus enum ('ok'|'warn'|'fail'|'skipped') and CyclePhase enum
that doesn't yet include 'auto_think'/'drift'. The phases ship standalone
in v0.28 (cycle.ts dispatcher integration is v0.28.x); using PhaseResult
forced premature enum extension.
Introduces DreamPhaseResult exported from auto-think.ts:
{ name: 'auto_think'|'drift'; status: 'complete'|'partial'|'failed'|'skipped';
detail: string; totals?: Record<string,number>; duration_ms: number }
drift.ts re-exports the same type. When v0.28.x wires the dispatcher, the
adapter at the call site can map DreamPhaseResult → PhaseResult cleanly.
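The eventual adapter could look like this sketch; the status mapping shown is a plausible choice, not a shipped decision:

```typescript
// DreamPhaseResult shape copied from the commit description; PhaseResult
// is abbreviated to the fields the mapping touches.
type DreamPhaseResult = {
  name: "auto_think" | "drift";
  status: "complete" | "partial" | "failed" | "skipped";
  detail: string;
  totals?: Record<string, number>;
  duration_ms: number;
};
type PhaseResult = { phase: string; status: "ok" | "warn" | "fail" | "skipped"; summary: string };

function toPhaseResult(r: DreamPhaseResult): PhaseResult {
  const statusMap = { complete: "ok", partial: "warn", failed: "fail", skipped: "skipped" } as const;
  return { phase: r.name, status: statusMap[r.status], summary: r.detail };
}
```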
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* v0.28 e2e: access_tokens.permissions JSONB end-to-end (5 cases)
test/e2e/auth-permissions.test.ts — closes the v0.28 token-allow-list
verification loop against real Postgres. Exercises:
- Migration v32 default backfill: new tokens created without a permissions
column get {takes_holders: ["world"]} via the schema DEFAULT clause.
- Explicit ["world","garry"] → dispatch.takes_list filters to those
holders only; brain hunches stay hidden from this token.
- ["world"] default-deny token → takes_search hits filtered to public claims.
- {} permissions row (operator tampered) gracefully defaults to ["world"]
via the HTTP transport's validateToken parsing.
- revoked_at IS NOT NULL → token excluded from active token query.
Avoids the postgres-js JSONB double-encode trap (CLAUDE.md memory): pass
the object directly to executeRaw, no JSON.stringify, no ::jsonb cast.
All 5 pass against pgvector/pgvector:pg16 on port 5435. Combined v0.28
test sweep: 116/116 across 11 files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* v0.28 e2e: chunker takes-strip integration test (Codex P0 #3 verification)
test/e2e/chunker-takes-strip.test.ts — verifies the chunker actually
strips fenced takes content end-to-end through the import pipeline.
This is the Codex P0 #3 fix's verification path: takes content lives
ONLY in the takes table for retrieval, never duplicated in
content_chunks where the per-token MCP allow-list cannot reach.
5 cases:
- chunkText (unit) output never contains TAKES_FENCE_BEGIN/END markers
- chunkText output never contains fenced claim text
- chunkText output retains non-fence prose (no over-stripping)
- importFromContent end-to-end: imported page has chunks but none
contain fenced content
- takes_fence_chunk_leak doctor invariant: zero rows globally where
chunk_text matches `<!--- gbrain:takes:%`
Final v0.28 test sweep:
121 pass, 0 fail, 336 expect() calls, 12 files
Coverage: schema migrations, engine methods (PGLite + Postgres),
takes-fence parser, page-lock, extract phase, takes CLI engine
surface, model config 6-tier resolver, MCP+auth allow-list,
think pipeline (gather + sanitize + cite-render + synthesize),
auto-think + drift + budget meter, JSONB end-to-end, chunker
strip integration. ~95% of v0.28 surface area covered.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix CI: apply-migrations skippedFuture arrays + http-transport SQL mock
Two CI failures from PR #563:
test/apply-migrations.test.ts (2 fails) — `buildPlan` tests assert exact
skippedFuture arrays at fixed installed-version stamps. Adding v0.28.0 to
the migration registry means it shows up in skippedFuture when the test
runs at installed=0.11.1 / installed=0.12.0. Append '0.28.0' to both
hardcoded arrays.
test/http-transport.test.ts (8 fails) — the FakeEngine mock string-prefix
matches `SELECT id, name FROM access_tokens` to return a row. v0.28's
validateToken now selects `SELECT id, name, permissions FROM access_tokens`
to read the per-token takes_holders allow-list. Mock returned [] on the
new query → validateToken treated every token as invalid → 401.
Fix: mock now matches both query shapes. validTokens row gets a default
`{takes_holders: ['world']}` permission injected when caller didn't
supply one (mirrors the migration v32 column DEFAULT). Updated
FakeEngineConfig type to allow tests to pass explicit permissions.
Verification:
bun test test/apply-migrations.test.ts → 18/18 pass
bun test test/http-transport.test.ts → 24/24 pass
bun run typecheck → clean
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix CI: add scope annotations to v0.28 ops (takes_list/takes_search/think)
test/oauth.test.ts enforces an invariant from master's v0.26 OAuth landing:
every Operation must have `scope: 'read' | 'write' | 'admin'`, and any op
flagged `mutating: true` must be 'write' or 'admin'. My v0.28 ops were added
before master shipped v0.26 + the new invariant; the merge surfaced the gap.
Annotations:
- takes_list → read
- takes_search → read
- think → write (mutating: true; --save persists synthesis page)
Verification:
bun test test/oauth.test.ts → 42/42 pass
bun run typecheck → clean
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(v0.28.1): export INJECTION_PATTERNS for shared sanitization
The same pattern set protects takes from prompt-injection (think/sanitize.ts)
and now retrieved chat content in the LongMemEval harness. One source of
truth for both surfaces; adding a new pattern in this file automatically
covers benchmarks too.
Existing consumers (sanitizeTakeForPrompt, renderTakesBlock) keep working
unchanged. Verified via test/think-pipeline.test.ts (18 pass, 0 fail).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.28.1): longmemeval harness — reset-in-place over in-memory PGLite
One in-memory PGLiteEngine per benchmark run; TRUNCATE between questions
with runtime-enumerated tables via pg_tables so future schema migrations
don't silently leak across questions. Infrastructure tables (sources,
config, gbrain_cycle_locks, subagent_rate_leases) preserved across resets
so initSchema-seeded rows like sources.'default' survive (FK target for
pages.source_id).
Files:
- src/eval/longmemeval/harness.ts: createBenchmarkBrain + resetTables +
withBenchmarkBrain. ~50 lines, no class wrapper.
- src/eval/longmemeval/adapter.ts: pure haystackToPages() converter.
Slug prefix `chat/` (verified non-matching against DEFAULT_SOURCE_BOOSTS).
- src/eval/longmemeval/sanitize.ts: re-uses INJECTION_PATTERNS from
think/sanitize.ts; wraps each session in <chat_session id date> tags;
4000-char cap.
- test/longmemeval-sanitize.test.ts: 12 cases pinning the F8 contract.
Hermetic: no DATABASE_URL, no API keys.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.28.1): gbrain eval longmemeval CLI command
Run the LongMemEval public benchmark against gbrain's hybrid retrieval.
Dataset is a positional path (download from xiaowu0162/longmemeval on HF).
Per-question loop wraps everything in try/catch; one bad question doesn't
kill the run; an error JSONL line is emitted instead.
Wiring:
- src/cli.ts: pre-dispatch bypass for `eval longmemeval` so the user's
~/.gbrain brain is never opened. Hermeticity gate verified: --help works
on machines with no gbrain config.
- src/commands/eval-longmemeval.ts: arg parsing, JSONL emit (LF + UTF-8
pinned), hybridSearch with optional expandQuery from search/expansion.ts,
resolveModel from model-config.ts (6-tier chain), ThinkLLMClient injection
seam from think/index.ts, structural <chat_session> framing.
- test/eval-longmemeval.test.ts: 12 cases covering harness lifecycle,
reset clears all tables, schema-migration robustness, p50/p99 speed gate
(warm reset+import+search target <500ms), adapter shape, source-boost
regression guard, end-to-end with stubbed LLM, JSONL format guard,
per-question failure handling.
- test/fixtures/longmemeval-mini.jsonl: 5 hand-authored questions with
keyword-friendly overlap so --keyword-only works in CI.
Speed: warm reset+import 5 pages+search p50=25.9ms p99=30.3ms locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(v0.28.1): bump VERSION + CHANGELOG
VERSION + package.json synchronized at 0.28.1. CHANGELOG entry uses the
release-summary voice + "To take advantage of v0.28.1" block per CLAUDE.md.
Sequential release on garrytan/v0.28-release; lands after v0.28.0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: surface v0.28.1 LongMemEval CLI across project docs
- README.md: add EVAL section to Commands reference (eval --qrels, export,
prune, replay, longmemeval); add v0.28.1 announce paragraph next to the
v0.25.0 BrainBench-Real intro.
- CLAUDE.md: add Key files entry for src/eval/longmemeval/ +
src/commands/eval-longmemeval.ts; add "Key commands added in v0.28.1"
subsection (mirrors the v0.26.5 / v0.25.0 pattern); inventory
test/eval-longmemeval.test.ts + test/longmemeval-sanitize.test.ts under
the unit-test list.
- docs/eval-bench.md: cross-link from the "What it actually does" section
to LongMemEval as the third evaluation axis (public benchmark,
ground-truth labels, full QA pipeline); append "Public benchmarks:
LongMemEval (v0.28.1)" section with architecture, flags table, and
perf numbers.
- CONTRIBUTING.md: append a paragraph after the eval-replay block pointing
contributors at gbrain eval longmemeval for public-benchmark coverage.
- AGENTS.md: extend the existing eval-retrieval bullet with a one-line
mention of gbrain eval longmemeval.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* v0.28.2 feat: remote-source MCP + scope hierarchy + whoami (#690)
* refactor(core): extract SSRF helpers from integrations.ts to core/url-safety.ts
src/core/git-remote.ts (next commit) needs isInternalUrl etc. but importing
from src/commands/ would invert the layering boundary (no existing
src/core/ file imports from src/commands/). Extract the SSRF helpers
(parseOctet, hostnameToOctets, isPrivateIpv4, isInternalUrl) into a new
src/core/url-safety.ts and have integrations.ts re-export for backward
compat. test/integrations.test.ts continues to pass without changes (110
existing tests, 214 expects).
Why this matters for v0.28: the upcoming sources --url feature reuses
this SSRF gate for git-clone URL validation. Codex review caught that
re-rolling weaker URL classification would regress on the IPv6/v4-mapped/
metadata/CGNAT bypass forms that integrations.ts already handles.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(core): add git-remote module — SSRF-defensive clone/pull + state probe
New src/core/git-remote.ts (~210 lines) for v0.28's remote-source feature:
- GIT_SSRF_FLAGS exported const: -c http.followRedirects=false,
-c protocol.file.allow=never, -c protocol.ext.allow=never,
--no-recurse-submodules. Single source of truth shared by cloneRepo
and pullRepo so a future flag added to one path lands on both.
Closes the SSRF surfaces codex flagged: DNS rebinding via redirects,
.gitmodules as a second-fetch surface, file:// scheme in remotes.
- parseRemoteUrl: https-only, rejects embedded credentials and path
traversal, delegates internal-target classification to isInternalUrl
from url-safety.ts (covers RFC1918, link-local, loopback, IPv6, CGNAT
100.64/10, metadata hostnames, hex/octal/single-int bypass forms).
A GBRAIN_ALLOW_PRIVATE_REMOTES=1 escape hatch (with a stderr warning)
covers self-hosted git over Tailscale, whose CGNAT addresses trip the gate.
- cloneRepo: --depth=1 default (full clone via depth: 0); refuses
non-empty destDirs; spawns git via execFileSync (no shell injection)
with GIT_TERMINAL_PROMPT=0 + askpass=/bin/false to prevent credential
prompts. timeoutMs default 600s.
- pullRepo: -C path + GIT_SSRF_FLAGS + pull --ff-only, same env confine.
- validateRepoState: 6-state decision tree (missing | not-a-dir |
no-git | corrupted | url-drift | healthy). Used by performSync's
re-clone branch to recover from rm'd clone dirs and to refuse syncs on
url-drift or corruption.
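A simplified parseRemoteUrl sketch (the internal-target classification from url-safety.ts is omitted here):

```typescript
// https-only, no embedded credentials, no path traversal. The real
// implementation also runs isInternalUrl against the resolved host.
function parseRemoteUrlSketch(raw: string): { ok: true; url: URL } | { ok: false; reason: string } {
  // Check traversal on the raw string: the URL constructor would
  // normalize '..' segments away before we could see them.
  if (raw.includes("..")) return { ok: false, reason: "path traversal" };
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: "unparseable" };
  }
  if (url.protocol !== "https:") return { ok: false, reason: "https-only" };
  if (url.username || url.password) return { ok: false, reason: "embedded credentials" };
  return { ok: true, url };
}
```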
test/git-remote.test.ts (304 lines, 32 tests): GIT_SSRF_FLAGS exact
shape, all parseRemoteUrl rejection cases including dedicated CGNAT
100.64/10 with/without GBRAIN_ALLOW_PRIVATE_REMOTES (codex T3 case),
fake-git harness for argv assertions on cloneRepo/pullRepo, all 6
validateRepoState branches.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(core): add scope hierarchy + ALLOWED_SCOPES allowlist
New src/core/scope.ts (~120 lines) for v0.28's scoped MCP feature.
Hierarchy:
- admin implies all (escape hatch)
- write implies read
- sources_admin and users_admin are siblings (different axes —
sources-mgmt vs user-account-mgmt; neither implies the other)
Exported:
- hasScope(grantedScopes, requiredScope): the canonical scope check.
Replaces exact-string-match at three call sites in upcoming commits
(serve-http.ts:673, oauth-provider.ts:365 F3 refresh, oauth-provider.ts:498
token issuance). Without this rewrite, an admin-grant token would
fail to refresh down to sources_admin (codex finding).
- ALLOWED_SCOPES set + ALLOWED_SCOPES_LIST sorted array (deterministic
for OAuth metadata wire format and drift-check output).
- assertAllowedScopes / InvalidScopeError: registration-time gate so
tokens with bogus scope strings (read flying-unicorn) get rejected
with RFC 6749 §5.2 invalid_scope at auth.ts:296 + DCR /register +
registerClientManual. Today's behavior accepts any string silently.
- parseScopeString: space-separated wire format → array.
Forward-compat: hasScope ignores unknown granted scopes rather than
throwing, so pre-allowlist tokens with weird scope strings continue
working without crashes (registration is the gate, runtime is best-effort).
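The hierarchy and wire-format helpers above can be sketched as a flat implies-table. This is a minimal reconstruction assuming the five scopes named in this PR; the real src/core/scope.ts may structure the table differently.

```typescript
// Assumed implies-table: admin implies all, write implies read,
// sources_admin and users_admin are siblings that imply only themselves.
const IMPLIES: Record<string, string[]> = {
  admin: ["admin", "read", "write", "sources_admin", "users_admin"],
  write: ["write", "read"],
  read: ["read"],
  sources_admin: ["sources_admin"],
  users_admin: ["users_admin"],
};

// Canonical scope check: unknown granted scopes are ignored rather than
// thrown on (registration is the gate, runtime is best-effort).
export function hasScope(granted: string[], required: string): boolean {
  return granted.some((g) => (IMPLIES[g] ?? []).includes(required));
}

// Space-separated wire format -> array; tolerates null/undefined/extra spaces.
export function parseScopeString(scope: string | null | undefined): string[] {
  return (scope ?? "").split(" ").filter(Boolean);
}
```

Under this shape, the F3 refresh-subset check becomes "every requested scope passes hasScope against the original grant", which is why an admin grant can refresh down to sources_admin.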
test/scope.test.ts (178 lines, 35 tests): hierarchy table including
all-implies for admin, sibling non-implication of *_admin scopes,
write→read but not the reverse, F3 refresh-token subset semantics
under hasScope, ALLOWED_SCOPES_LIST sorted-pinning, allowlist
rejection cases, parseScopeString edge cases (undefined/null/empty).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* build(admin): scope-constants mirror + drift CI for src/core/scope.ts
The admin React SPA's tsconfig.json scopes its `include: ['src']` to
admin/src/, so it cannot directly import ../../src/core/scope.ts. The plan
considered
widening the include or generating a single source of truth; both options
either couple the SPA to the gbrain monorepo or add a build step. Eng
review picked the boring choice: hand-maintained mirror at
admin/src/lib/scope-constants.ts plus a CI drift check.
Files:
- admin/src/lib/scope-constants.ts: hand-maintained ALLOWED_SCOPES_LIST
duplicate, sorted alphabetically to match src/core/scope.ts.
- scripts/check-admin-scope-drift.sh: extracts the list from each file
via awk, normalizes via tr/sort, diffs. Exits 0 on match, 1 on drift
(with full breakdown of which scopes diverged), 2 on internal error.
Tested both passing and corrupted paths.
- package.json: wires check:admin-scope-drift into both `verify` and
`check:all` so any update to src/core/scope.ts that forgets the
admin-side mirror fails the build.
The Agents.tsx scope-checkbox sites (5 hardcoded locations) get updated
in a later commit to import from this constants file.
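The extract-normalize-diff shape of the drift check can be sketched as below. The awk/grep extraction pattern is an assumption about how each file lays out its ALLOWED_SCOPES_LIST literal; the real scripts/check-admin-scope-drift.sh may parse differently. Requires bash (process substitution).

```shell
#!/usr/bin/env bash
set -u

# One scope per line, sorted: single-quoted identifiers inside the
# ALLOWED_SCOPES_LIST array literal (assumed layout).
extract_scopes() {
  awk '/ALLOWED_SCOPES_LIST/,/\]/' "$1" | grep -o "'[a-z_]*'" | tr -d "'" | sort
}

# Exit codes mirror the contract above: 0 match, 1 drift, 2 internal error.
check_drift() {
  local a b
  a=$(extract_scopes "$1") || return 2
  b=$(extract_scopes "$2") || return 2
  if [ "$a" = "$b" ]; then
    return 0
  fi
  diff <(printf '%s\n' "$a") <(printf '%s\n' "$b")  # breakdown of divergence
  return 1
}
```

Sorting both sides before the diff is what makes the hand-maintained mirror order-insensitive while the source file stays pinned alphabetical.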
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(oauth): hasScope hierarchy + ALLOWED_SCOPES allowlist at registration
Switch three call sites in oauth-provider.ts from exact-string-match to
hasScope() so the v0.28 sources_admin and users_admin scopes — and the
admin-implies-all + write-implies-read hierarchy in src/core/scope.ts —
work end to end:
- F3 refresh-token subset enforcement at line 365: previously rejected
admin → sources_admin refresh because exact-match treated them as
unrelated scopes. gstack /setup-gbrain Path 4 needs admin tokens to
refresh down to least-privilege sources_admin scope; this fix lands
that path.
- Token issuance intersection at line 498 (client_credentials grant):
same hasScope swap so a client whose stored grant is `admin` can mint
tokens including any implied scope.
- registerClient (DCR /register) and registerClientManual: validate
every scope string against ALLOWED_SCOPES via assertAllowedScopes.
Pre-fix the system silently accepted `--scopes "read flying-unicorn"`
and persisted the bogus string in oauth_clients.scope. Post-fix the
caller gets RFC 6749 §5.2 invalid_scope. Existing rows with
pre-allowlist scopes keep working (allowlist gates registration only).
Tests amended in test/oauth.test.ts:
- T1 (eng-review): admin grant CAN refresh down to sources_admin
- T1 sibling: write grant CANNOT refresh up to sources_admin
- ALLOWED_SCOPES allowlist coverage (manual + DCR paths, all 5 valid)
- Scope-annotation contract tests widened to accept the v0.28 union
62 OAuth tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(serve-http): hasScope at /mcp + advertise full ALLOWED_SCOPES
Two changes against src/commands/serve-http.ts:
- Line 195: scopesSupported on the mcpAuthRouter options switches from the
hardcoded ['read','write','admin'] to Array.from(ALLOWED_SCOPES_LIST).
Without this, /.well-known/oauth-authorization-server keeps reporting
the old triple, so MCP clients (Claude Desktop, ChatGPT, Perplexity)
cannot discover the v0.28 sources_admin and users_admin scopes via
standard discovery — they would have to be pre-configured out of band.
- Line 673: request-time scope check on /mcp swaps
authInfo.scopes.includes(requiredScope) for hasScope(...). This was
the most-cited codex finding: without it, sources_admin tokens could
not even satisfy a `read`-scoped op (sources_admin doesn't include
the literal string "read"). hasScope routes through the hierarchy
table in src/core/scope.ts so admin implies all and write implies
read at the gate too.
T2 amendment in test/e2e/serve-http-oauth.test.ts: assert
/.well-known/oauth-authorization-server includes all 5 scopes in
scopes_supported. Pre-v0.28 the list was hardcoded to ['read','write',
'admin'] and this assertion would have failed. (The test is
Postgres-gated; runs under bun run test:e2e with DATABASE_URL set.)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(core): sources-ops module — atomic clone + symlink-safe cleanup
src/core/sources-ops.ts (~470 lines): pure async functions extracted from
src/commands/sources.ts so the CLI handlers and the new MCP ops share
one implementation.
addSource: D3 atomicity contract from the eng review.
1. Validate id (matches existing SOURCE_ID_RE).
2. Q4 pre-flight SELECT — fail loudly with structured `source_id_taken`
before any clone work. Pre-fix the existing CLI used INSERT…ON
CONFLICT DO NOTHING which silently no-op'd; with clone-first that
would orphan the temp dir.
3. parseRemoteUrl gate (delegates to isInternalUrl from url-safety.ts).
4. Clone into $GBRAIN_HOME/clones/.tmp/<id>-<rand>/ via the new
git-remote helpers.
5. INSERT row with local_path=<final clone dir>, config.remote_url=<url>.
6. fs.renameSync(tmp/, final/). Rollback on either-side failure unlinks
the temp dir; rename-failed path also DELETEs the just-INSERTed row
best-effort.
removeSource: clone-cleanup with realpath+lstat confinement matching
validateUploadPath() shape at src/core/operations.ts:61. String startsWith
is symlink-unsafe and would let $GBRAIN_HOME/clones/<id> → /etc resolve
out of the confine. Two defenses layered:
- isPathContained (realpath-resolves both sides + parent-with-sep
string check) rejects symlinks whose target falls outside the
confine.
- lstat-then-isSymbolicLink check refuses symlinks whose realpath
happens to land back inside the confine (defense in depth).
getSourceStatus: returns clone_state via validateRepoState (the 6-state
decision tree from git-remote.ts). Lets a remote MCP caller diagnose
"healthy | missing | not-a-dir | no-git | url-drift | corrupted" without
SSH access to the brain host. listSources additionally exposes
remote_url so callers can see which sources are auto-managed.
recloneIfMissing: T4 follow-up for `gbrain sources restore` after the
clone dir was autopurged — re-clones via the same temp + rename
atomicity contract. Idempotent (returns false when clone is already
healthy).
test/sources-ops.test.ts (~470 lines, 24 tests): pre-flight collision
(Q4), happy paths for both --path and --url, all four D3 rollback paths
(clone-fail before INSERT, INSERT-fail after clone, rename-fail
post-INSERT, atomic temp-dir cleanup), symlink-target-OUTSIDE-clones
(realpath confinement), symlink-target-INSIDE-clones (lstat-check),
removeSource refuses to delete user-supplied paths, refuses "default"
source, getSourceStatus clone_state branches, T4 recloneIfMissing
recovery + idempotent + no-op for path-only sources, isPathContained
unit tests covering subtree / outside / symlink-escape / fail-closed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(operations): whoami + sources_{add,list,remove,status} MCP ops
Five new ops in src/core/operations.ts auto-flow through src/mcp/tool-defs.ts
so MCP clients (Claude Desktop, ChatGPT, Perplexity, OpenClaw) get them via
standard tools/list discovery — no SDK or transport code changes needed.
Operation.scope union widened to add 'sources_admin' and 'users_admin' (the
v0.28 hierarchy from src/core/scope.ts).
whoami (scope: read): introspect calling identity over MCP.
- Returns `{transport: 'oauth', client_id, client_name, scopes, expires_at}`
for OAuth clients (clientId starts with gbrain_cl_).
- Returns `{transport: 'legacy', token_name, scopes, expires_at: null}`
for grandfathered access_tokens.
- Returns `{transport: 'local', scopes: []}` when ctx.remote === false.
Empty scopes (NOT ['read','write','admin']) is the D2 decision —
returning OAuth-shaped scopes for local callers would resurrect the
v0.26.9 footgun where code conditionally trusted on
`auth.scopes.includes('admin')` instead of `ctx.remote === false`.
- Q3 fail-closed: throws unknown_transport when remote=true AND auth is
missing OR ctx.remote is the literal `undefined` (cast bypass guard).
A future transport that forgets to thread auth doesn't get a free
pass.
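The four return shapes and the Q3 fail-closed ordering can be sketched as follows. The AuthInfo/Ctx shapes and the gbrain_cl_ prefix check are assumptions drawn from this commit message, not the real operations.ts types.

```typescript
type AuthInfo = {
  clientId?: string;
  tokenName?: string;
  scopes: string[];
  expiresAt?: string | null;
};
type Ctx = { remote?: boolean; auth?: AuthInfo };

export function whoami(ctx: Ctx) {
  if (ctx.remote === false) {
    // D2: empty scopes for local callers, NOT ['read','write','admin'],
    // so nothing can trust scope strings instead of ctx.remote === false.
    return { transport: "local", scopes: [] as string[] };
  }
  // Q3 fail-closed: remote with missing auth, or ctx.remote left as the
  // literal undefined by a cast bypass, both throw.
  if (ctx.remote !== true || !ctx.auth) {
    throw new Error("unknown_transport");
  }
  if (ctx.auth.clientId?.startsWith("gbrain_cl_")) {
    return {
      transport: "oauth",
      client_id: ctx.auth.clientId,
      scopes: ctx.auth.scopes,
      expires_at: ctx.auth.expiresAt ?? null,
    };
  }
  return {
    transport: "legacy",
    token_name: ctx.auth.tokenName,
    scopes: ctx.auth.scopes,
    expires_at: null,
  };
}
```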
sources_add (sources_admin, mutating): register a source by --path
(existing v0.17 behavior) or --url (v0.28 federated remote-clone path).
Calls into addSource from sources-ops.ts which owns the temp-dir +
rename atomicity.
sources_list (read): list registered sources with page counts, federated
flag, and remote_url. The remote_url field is new — lets a remote MCP
caller see which sources are auto-managed.
sources_remove (sources_admin, mutating): cascade-delete a source +
symlink-safe clone cleanup. Requires confirm_destructive: true when the
source has data.
sources_status (read): per-source diagnostic returning clone_state
('healthy' | 'missing' | 'not-a-dir' | 'no-git' | 'url-drift' |
'corrupted' | 'not-applicable') — lets a remote MCP caller diagnose a
busted clone without SSH access to the brain host.
test/whoami.test.ts (9 tests): pinned transport-detection for all four
return shapes including Q3 fail-closed throw under both auth=undefined
and remote=undefined cast-bypass paths.
test/sources-mcp.test.ts (16 tests): op-metadata pins (scope, mutating,
localOnly), functional handler shape against PGLite, hasScope-driven
scope-enforcement smoke test simulating the serve-http.ts:673 gate
(read-only token rejected for sources_add; sources_admin token allowed;
admin token allowed for everything; gstack /setup-gbrain Path 4 token
covers all 4 ops), SSRF gate at the op layer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(sync): re-clone fallback when clone is missing/no-git/corrupted
src/commands/sync.ts gets a v0.28-aware front-half. When the source has
config.remote_url, performSync calls validateRepoState before the existing
fast-forward pull path:
- 'healthy' → fall through to existing pull (unchanged)
- 'missing' | 'no-git' | 'not-a-dir' → loud stderr "auto-recovery:
  re-cloning <id>", then recloneIfMissing handles the temp-dir + rename.
  Sync continues from the freshly-cloned head.
- 'corrupted' → throw with structured hint pointing at sources remove
+ add (no syncing wrong state).
- 'url-drift' → throw with hint pointing at the (deferred) sources
rebase-clone command.
Closes the operator-confidence gap: rm -rf $GBRAIN_HOME/clones/<id>/ no
longer breaks future syncs. The next sync sees the missing dir and
recovers via the recorded URL.
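The front-half decision above can be sketched as a switch over the validateRepoState union. recloneIfMissing and the RepoState union are named in the surrounding commits; preSyncRecover and its glue are illustrative.

```typescript
type RepoState =
  | "healthy" | "missing" | "not-a-dir"
  | "no-git" | "corrupted" | "url-drift";

export async function preSyncRecover(
  state: RepoState,
  id: string,
  recloneIfMissing: () => Promise<boolean>,  // temp-dir + rename contract
  warn: (msg: string) => void,               // loud stderr channel
): Promise<void> {
  switch (state) {
    case "healthy":
      return;  // fall through to the existing fast-forward pull
    case "missing":
    case "no-git":
    case "not-a-dir":
      warn(`auto-recovery: re-cloning ${id}`);
      await recloneIfMissing();
      return;  // sync continues from the freshly-cloned head
    case "corrupted":
      throw new Error(
        `source ${id}: clone corrupted; run sources remove + add (refusing to sync wrong state)`,
      );
    case "url-drift":
      throw new Error(
        `source ${id}: remote URL drifted; see the (deferred) sources rebase-clone command`,
      );
  }
}
```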
src/core/operations.ts: extend ErrorCode with 'unknown_transport' so
whoami's Q3 fail-closed path types check.
test/sources-resync-recovery.test.ts (12 tests): full validateRepoState
state matrix exercised under fake-git, recloneIfMissing recovery from
each degraded state, idempotent on healthy clones, the sync.ts:320
integration path that drives the recovery.
test/sources-ops.test.ts + test/sources-mcp.test.ts: drop the
GBRAIN_PGLITE_SNAPSHOT-disable line so these tests stop forcing cold
init across the parallel-shard runner. With snapshot allowed, init time
drops from 6+s to ~50ms and parallel runs stay under the 5s hook
timeout.
test/sources-mcp.test.ts: tighten scope literal-type so tsc keeps the
union narrow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(cli): sources add --url + restore re-clone, thin-wrapper refactor
src/commands/sources.ts now delegates the data-mutation work to
src/core/sources-ops.ts (added in the previous commit). The CLI handler
parses argv, calls into addSource, and formats output.
Two new flags on `gbrain sources add`:
- `--url <https-url>` : federated remote-clone path (clone + INSERT +
rename, atomic rollback on failure).
- `--clone-dir <path>` : override the default
$GBRAIN_HOME/clones/<id>/ destination.
Validation rejects mutually-exclusive `--url` + `--path`. Errors from
the ops layer (SourceOpError) propagate through the CLI's standard
error wrapper in src/cli.ts so existing tests that assert throw shape
keep passing.
`gbrain sources restore <id>` (T4 from eng review): if the source has a
remote_url AND the on-disk clone was autopurged, call recloneIfMissing
before declaring success. Clone errors print a WARN with recovery
hints rather than failing the restore — the DB row is what restore
guarantees; the clone is best-effort.
54 sources-related tests pass (existing test/sources.test.ts +
sources-ops + sources-mcp).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(doctor,cycle): orphan-clones surface + autopilot purge phase (P1)
addSource's atomicity contract uses a temp dir that gets renamed to the
final clone path. If the process is SIGKILL'd between clone-finish and
rename, the temp dir orphans on disk. Without sweeping these, a brain
server accumulates gigabytes over months of failed `sources add --url`
attempts.
Two layers:
1. `gbrain doctor` now surfaces stale entries. A new orphan_clones check
walks $GBRAIN_HOME/clones/.tmp/, names anything older than 24h, and
prints a warn with disk-byte estimate. Operators see the leak before
`df` complains.
2. The autopilot cycle's existing `purge` phase grows a substep that
nukes .tmp/ entries past the same 72h TTL the page-soft-delete purge
uses. Operator behavior stays uniform across all soft-delete-style
surfaces.
Both layers are filesystem-only (no DB). On a brain that never used
--url cloning, both are no-ops.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* build(admin): scope checkboxes source from scope-constants mirror + dist
admin/src/pages/Agents.tsx Register Client modal:
- useState default sources from ALLOWED_SCOPES_LIST (defaulting `read`
to true, others false; unchanged UX for the common case).
- Scope checkbox map iterates ALLOWED_SCOPES_LIST instead of the old
hardcoded ['read','write','admin'].
Without this commit, even with the v0.28.1 server-side scope hierarchy,
operators registering an OAuth client from the admin UI cannot tick the
new sources_admin / users_admin scopes — defeats the whole gstack
/setup-gbrain Path 4 unblock.
The drift-check CI gate (scripts/check-admin-scope-drift.sh) ensures
this list stays in sync with src/core/scope.ts going forward.
admin/dist/* rebuilt via `cd admin && bun run build`. Old hash bundle
removed; new bundle (224.96 kB / 68.70 kB gzip).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: v0.28.1 — remote-source MCP + scope hierarchy + whoami
VERSION + package.json: bump to 0.28.1 (per CLAUDE.md branch-scoped
versioning rule — this branch adds substantial new features on top of
v0.28.0).
CHANGELOG.md: new top-level entry for v0.28.1 in the gstack/Garry voice
(no AI vocabulary, no em dashes, real numbers + commands). Lead
paragraph names what the user can now do that they couldn't before.
"Numbers that matter" table calls out the +5 MCP ops, +2 OAuth scopes,
and the 4-to-0 SSH-step number for gstack /setup-gbrain Path 4. "What
this means for you" closer ties the work to the operator workflow shift.
"To take advantage of v0.28.1" block has paste-ready upgrade commands
including the admin SPA rebuild step. Itemized changes section
describes the architecture cleanly without exposing scope-string
internals to public attack-surface enumeration (per CLAUDE.md
responsible-disclosure rule).
TODOS.md: file 6 follow-ups under a new "Remote-source MCP follow-ups
(v0.28.1)" section: token rotation, migration introspection in
get_health, Accept-header friendliness, sources rebase-clone for
URL-drift recovery, --filter=blob:none partial-clone option, and the
chunker_version PGLite-schema parity codex caught.
README.md: short subsection under the existing sources CLI listing
that names the new --url flag and what auto-recovery does. Capability
framing (no scope-string enumeration).
llms.txt + llms-full.txt: regenerated via `bun run build:llms` so the
documentation bundle reflects the v0.28.1 entry. The build-llms
generator's drift check passes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(e2e): sources-remote-mcp — full gstack /setup-gbrain Path 4 round-trip
Spins up `gbrain serve --http` against real Postgres with a fake-git binary
in PATH (so `git clone` is exercised end-to-end without network), registers
two OAuth clients (sources_admin + read-only), mints tokens, calls the new
v0.28.1 MCP ops via /mcp, and asserts the gstack /setup-gbrain Path 4 flow
works end to end.
12 tests cover the full lifecycle:
- whoami over HTTP MCP returns transport=oauth + the right scopes
- /.well-known/oauth-authorization-server advertises all 5 scopes
- sources_add: clone fires, INSERT lands, row carries config.remote_url
- sources_status: clone_state=healthy after add
- sources_list: surfaces remote_url for the new source
- SSRF rejection: sources_add with RFC1918 URL fails at parseRemoteUrl gate
- Scope enforcement: read-only token gets insufficient_scope on sources_add
- Read-only token CAN call sources_list (read-scoped op)
- ALLOWED_SCOPES allowlist: CLI register-client rejects bogus scope
- Recovery: rm clone dir + sources_status reports clone_state=missing
- sources_remove: cascades + cleans up the auto-managed clone dir
Subprocess env threading replicates the v0.26.2 bun execSync inheritance
pattern — bun does NOT inherit process.env mutations, so every CLI
subprocess call passes env: { ...process.env } explicitly.
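The env-threading pattern can be shown in miniature; runWithEnv is an illustrative wrapper, and the point is the explicit `env: { ...process.env }` snapshot rather than relying on implicit inheritance of runtime mutations.

```typescript
import { execSync } from "node:child_process";

// Snapshot the (possibly mutated) process.env at call time and pass it
// explicitly, per the v0.26.2 bun execSync inheritance pattern.
export function runWithEnv(cmd: string): string {
  return execSync(cmd, { env: { ...process.env } }).toString();
}
```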
Cleanup contract mirrors test/e2e/serve-http-oauth.test.ts: revoke any
clients we registered, force-kill the server subprocess on SIGTERM
timeout, surface cleanup failures to stderr without throwing so real
test failures aren't masked.
The base table list in helpers.ts (ALL_TABLES) doesn't include sources
or oauth_clients, so this test explicitly truncates them in beforeAll
to avoid Q4 pre-flight collisions on re-run.
Skipped gracefully when DATABASE_URL is unset.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: codex adversarial review — confine remote sources_admin + close SSRF gaps
Pre-ship adversarial review (codex exec) caught five issues. Four ship in
this commit; the fifth (DNS rebinding) is filed as v0.28.x follow-up.
CRITICAL — `sources_admin` tokens over HTTP MCP could plant content at any
host path. The MCP op exposed `path` and `clone_dir` to remote callers; the
op layer trusted them verbatim, then auto-recovery's rm -rf on degraded
state turned that into arbitrary delete primitives. src/core/operations.ts
sources_add handler now drops both fields when ctx.remote !== false. Local
CLI keeps the override (operator trust). Loud logger.warn when a remote
caller tries — visible in the SSE feed without leaking values.
HIGH — Steady-state `git pull --ff-only` bypassed GIT_SSRF_FLAGS entirely.
The legacy helper at src/commands/sync.ts:192 spawned git without the
-c http.followRedirects=false -c protocol.{file,ext}.allow=never
--no-recurse-submodules set that cloneRepo applies. Every recurring sync
was reopening the redirect/submodule/protocol bypass. Routed the call site
at sync.ts:381 through pullRepo from git-remote.ts so initial clone and
ongoing pull share one defensive flag set.
MEDIUM — listSources ignored its `include_archived` flag. The op
advertised the param but the function destructured it as `_opts` and
queried every row. Archived sources' ids, local_paths, and remote_urls
were leaking to read-scoped MCP callers by default. Filter in SQL
(`WHERE archived IS NOT TRUE` unless the flag is set) so archived rows
never reach the wire.
PARTIAL HIGH — IPv6 ULA fc00::/7 and link-local fe80::/10 were not in
the isInternalUrl bypass list. Only ::1/:: and IPv4-mapped IPv6 were
blocked. Added regex-based ULA + link-local rejection to url-safety.ts.
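A minimal sketch of the added checks, covering canonical textual forms only; the real url-safety.ts also rejects ::1/:: and IPv4-mapped addresses, omitted here, and isInternalIpv6 is an illustrative name.

```typescript
// ULA fc00::/7 means a first group starting fc.. or fd..; link-local
// fe80::/10 means a first group fe80-febf. Regexes assume the bracketed
// or bare host string uses standard (non-exotic) formatting.
export function isInternalIpv6(host: string): boolean {
  const h = host.replace(/^\[|\]$/g, "").toLowerCase(); // strip URL brackets
  return /^f[cd][0-9a-f]{0,2}:/.test(h)   // ULA fc00::/7
    || /^fe[89ab][0-9a-f]:/.test(h);      // link-local fe80::/10
}
```

This stays lexical, which is exactly the DNS-rebinding limitation called out below: it inspects the hostname string, not what the name resolves to.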
Test coverage:
- test/git-remote.test.ts: 4 new IPv6 cases (ULA fc-prefix + fd-prefix,
link-local fe80::, public IPv6 still allowed).
- test/sources-mcp.test.ts: 3 new cases pinning the remote/local
asymmetry (clone_dir override silently ignored over MCP, path nulled,
local CLI keeps the override).
- test/sources-mcp.test.ts: 2 new cases for include_archived honored.
DNS rebinding (codex finding #3): the current gate is lexical only.
A deliberate attacker who controls a hostname's A/AAAA records can still
resolve to an internal IP. Closing this requires async DNS resolution +
revalidation; filed as v0.28.x follow-up in TODOS.md so the API change
surface (parseRemoteUrl becomes async, every caller updates) lands in
its own PR.
323 tests pass (9 files); 4071 unit tests pass (full suite).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: rebump v0.28.1 → v0.28.2 (master collision)
Caught after PR creation. master is at v0.28.1 already; this branch
forked from garrytan/v0.28-release at v0.28.0 and naively bumped to
v0.28.1 without checking the master queue. CI version-gate would have
rejected at merge time (requires VERSION strictly greater than
master's).
Root cause: I bumped VERSION mechanically during plan implementation
(echo "0.28.1" > VERSION) without consulting the queue-aware allocator
at bin/gstack-next-version. /ship Step 12's idempotency check then
classified state as ALREADY_BUMPED and the workflow's "queue drift"
comparison was the safety net I should have hit — but I skipped it.
Files updated:
- VERSION + package.json: 0.28.1 → 0.28.2
- CHANGELOG.md: header + "To take advantage of v0.28.2" subsection
- README.md: sources --url note version reference
- TODOS.md: 7 follow-up entries' version references
- llms.txt + llms-full.txt: regenerated
PR title rewrite via gstack-pr-title-rewrite.sh handled in a separate
gh pr edit call; CI version-gate now passes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(todos): close longmemeval-publication, file 4 follow-up TODOs
Full 500-question 4-adapter LongMemEval _s benchmark landed at
github.com/garrytan/gbrain-evals#main:ced01f0. gbrain-hybrid 97.60% R@5,
+1.0pt over MemPal raw 96.6%. Replacing the now-stale "needs full run"
TODO with closure + 4 grounded follow-ups:
1. Timeline-aware retrieval signal for temporal-reasoning questions
(P2 — closes the only category we lose to MemPal-raw)
2. Per-question batch consolidation for ~10x cold-cache speedup
(P3 — makes daily benchmark CI gate practical)
3. LongMemEval _m split run (P3 — differentiated, not yet published
by MemPal)
4. Cheaper-embedding-model recipe (P4 — recall-cost tradeoff curve)
Each TODO has the standard What/Why/Pros/Cons/Context/Depends-on shape per
the gbrain TODOS-format convention.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(llms): regenerate llms-full.txt to match merged CLAUDE.md
CI test/build-llms.test.ts asserts the committed llms.txt/llms-full.txt
are byte-for-byte identical to what scripts/build-llms.ts produces. The
master merge brought in v0.28.9/v0.28.10/v0.28.11 + multimodal embedding
notes that updated CLAUDE.md; the bundle was stale.
No content changes. Pure regeneration via `bun run build:llms`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(changelog): rewrite v0.28.12 entry — lead with the LongMemEval result
Old entry buried the headline ("LongMemEval lands in the box…") under
process detail (hermetic CI test count, 25.9ms p50, schema-table
runtime enumeration). The reader cares what gbrain DOES — not how we
plumbed the harness.
New entry leads with the actual number — 97.60% R@5 on the public
LongMemEval _s split, beating MemPalace raw by 1.0pt — followed by
the per-category win table that proves gbrain ties or beats MemPal in
5 of 6 question types and shows the +7.1pt assistant-voice lift.
Links to the full gbrain-evals report (97.60% headline + full
methodology + reproducible runner) so curious readers can dig deeper.
Two honest findings published in plain text: vector-only is
essentially tied with hybrid at K=5, and query expansion via Haiku is
a clean null result on this dataset. Better to publish the null than
hide it.
Reproduction block updated to match the actual gbrain-evals workflow
(clone + bun install + dataset download + bash batch runner). The
prior "download / run / hand to evaluate_qa.py" block stayed for the
in-tree CLI path.
Regenerated llms-full.txt to keep the build-llms regen-drift guard
green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request on May 10, 2026
…-path, sync, multi-source, privacy) (#776)
* fix: bootstrap forward-references for v39-v41 schema replay
Three column-with-index forward references in the embedded schema blob were
missing from applyForwardReferenceBootstrap, so any brain at config.version
< 39 (Postgres) or < 41 (PGLite) wedges before the migration runner can
advance.
Reproduced end-to-end on a PlanetScale Postgres brain stuck at
config.version=34 trying to upgrade to v0.30.0:
ERROR: column "effective_date" does not exist
ERROR: column cc.modality does not exist
(After upgrading, gbrain search and gbrain reindex-frontmatter both fail.)
The schema-blob references that crash before migrations run:
- v39 (multimodal_dual_column_v0_27_1): CREATE INDEX
  idx_chunks_embedding_image ON content_chunks USING hnsw
  (embedding_image vector_cosine_ops) WHERE embedding_image IS NOT NULL;
- v41 (pages_recency_columns): CREATE INDEX pages_coalesce_date_idx
  ON pages ((COALESCE(effective_date, updated_at)));
PGLite already covered v39 (lines 273+, 308+, 382-392). Postgres and PGLite
both lacked v40+v41 coverage. This commit adds:
- Postgres engine probe + branch for v39 (modality, embedding_image) — was
  entirely missing on Postgres, so Postgres brains < v39 hit the wedge that
  PGLite already protected against.
- Both engines: probe + branch for v40+v41. Bootstraps all five additive
  pages columns (emotional_weight, effective_date, effective_date_source,
  import_filename, salience_touched_at) gated on `effective_date_exists`
  as the proxy.
- test/schema-bootstrap-coverage.test.ts: extends
  REQUIRED_BOOTSTRAP_COVERAGE with the six new columns AND the pre-test
  DROP block so both the per-target assertion test and the end-to-end
  "bootstrap + SCHEMA_SQL replay" test exercise the new coverage.
All 5 tests in schema-bootstrap-coverage pass. typecheck clean. Bootstrap
stays additive-columns-only. Indexes are left to schema replay / migrations
as before.
* fix(deps): declare @jsquash/png and heic-decode
Both packages are direct imports in src/core/import-file.ts (decodeIfNeeded
for HEIC/AVIF → PNG) but only @jsquash/avif was declared. bun --compile
fails on a fresh install:
error: Could not resolve: "@jsquash/png/encode.js"
error: Could not resolve: "heic-decode"
Adds the missing declarations so npm install / bun install bring them in.
Versions chosen as latest at time of fix:
@jsquash/png ^3.1.1
heic-decode ^2.1.0
* fix(backfill-effective-date): replace bare BEGIN/COMMIT with engine.transaction()
postgres.js refuses bare BEGIN/COMMIT on pooled connections with
UNSAFE_TRANSACTION. The migration runner and other call sites already use
engine.transaction() (which routes through sql.begin() with a reserved
backend) — backfill-effective-date.ts was the holdout.
Reproduces on PlanetScale Postgres (us-east-4.pg.psdb.cloud) running the
v0.29.1 orchestrator's Phase B against a brain that has any rows needing
backfill:
Reindex ok ... UNSAFE_TRANSACTION: Only use sql.begin, sql.reserved or max: 1
Switches the per-batch transaction to engine.transaction(async tx => …).
The SET LOCAL statement_timeout still scopes to the transaction; UPDATE
runs through the tx-scoped engine. ROLLBACK on error happens automatically
via sql.begin's contract. Equivalent fix shape to existing usages in
src/core/postgres-engine.ts (lines 703, 806, 925) and the migration runner
in src/core/migrate.ts (line 2147).
* fix(v0_29_1): connect engine before use in Phase B and Phase C
phaseBBackfill() and phaseCVerify() build their own engine via
createEngine(toEngineConfig(cfg)) but never call engine.connect(). This
worked accidentally before because executeRaw lazily falls back to
db.getConnection(), but engine.transaction() (added in the companion
backfill fix) requires a connected backend and surfaces the missing-connect
with:
No database connection: connect() has not been called.
Fix: Run gbrain init --supabase or gbrain init --url <connection_string>
Other orchestrators in the same directory get this right — v0_28_0.ts:181
already does `await engine.connect(engineConfig)` right after createEngine.
Aligning v0_29_1 with that pattern.
After this + the backfill fix, v0.29.1 orchestrator runs to 'complete' on a
fresh upgrade with backfill-needed rows, instead of wedging at 'partial'
status.
Note: anyone hitting the wedged state after the prior failures will need
`gbrain apply-migrations --force-retry 0.29.1` once before the next
apply-migrations --yes succeeds (the 3-consecutive-partials guard in
apply-migrations.ts is still active).
* fix: connect engine in v0.29.1 migration
* fix(upgrade): detectBunLink fails because bun resolves symlinks in argv[1]
bun resolves the entire symlink chain before setting process.argv[1], so
lstatSync(argv1).isSymbolicLink() always returns false for bun-link
installs, short-circuiting the git-config walk that would correctly
identify the repo.
Remove the symlink gate — argv[1] is already the real path inside the
checkout, which is what the walk needs.
Also: return { repoRoot } so the upgrade path can auto-execute git pull +
bun install via execFileSync (no shell injection surface).
Fixes #368, supersedes incomplete v0.28.5 fix for #656.
* fix(oauth): clamp authorize() requested scopes against client.scope (RFC 6749 §3.3)
The MCP SDK's authorize handler
(`@modelcontextprotocol/sdk/.../auth/handlers/authorize.js`) splits
`?scope=...` verbatim and forwards the parsed list to the provider, so the
provider has to clamp against the client's registered grant. v0.28.11
`authorize()` (src/core/oauth-provider.ts:235-259) inserted
`params.scopes || []` raw into `oauth_codes`, so a `read`-registered
client requesting `?scope=admin` had `['admin']` stored and
exchangeAuthorizationCode issued a fully-admin access token at /token
exchange.
The asymmetry is the bug: the other two grant entry points already clamp.
`exchangeClientCredentials` (line 513-515) filters requested scopes through
`hasScope(allowedScopes, s)`, and `exchangeRefreshToken`'s F3 (line
372-380) enforces RFC 6749 §6 subset against the original grant.
authorize() lined up with neither.
Fix mirrors the client_credentials filter shape so all three grant entry
points clamp consistently:
const allowedScopes = parseScopeString(client.scope);
const grantedScopes = (params.scopes || []).filter(s => hasScope(allowedScopes, s));
Empty/omitted requested scope keeps storing `[]` (existing shape, not a
security boundary). The clamped subset is what the client sees in the
`scope` field of the token response, which is the spec-compliant signal
that the grant was reduced.
Test coverage:
- New: authorize clamps requested scopes against client.scope (RFC 6749
  §3.3) — read-only client requests ['read','write','admin'] and the
  issued token carries only ['read'].
- New: authorize subset request returns subset — 'read write' client
  requesting ['read'] gets ['read'] (regression guard against
  over-clamping).
The existing v0.26.9 oauth.test.ts pins F3 (refresh clamp) but had no
authorize-side coverage, which is why the regression survived.
* fix(sync): handle detached HEAD by skipping pull and ingesting local working tree
* fix(sync): --skip-failed acks pre-existing unacked failures up-front
The recovery flow that doctor + printSyncResult both advertise was broken:
1. User has files with bad YAML → they hit the failure log + sync stays
   blocked at last_commit.
2. User fixes the YAML.
3. User re-runs `gbrain sync` — sync succeeds, advances last_commit.
4. `gbrain doctor` still reports N unacked failures from step 1 because
   sync-failures.jsonl is append-only history, never auto-cleared.
5. doctor message says: "use 'gbrain sync --skip-failed' to acknowledge".
6. User runs `gbrain sync --skip-failed` → "Already up to date." → log
   unchanged.
The bug: --skip-failed only acknowledges failures from the CURRENT run.
performSync's ack path is gated on `failedFiles.length > 0` after sync — it never fires when the diff is empty (because the user already fixed the bad files) or when the sync is up to date. So the documented recovery sequence is a no-op exactly when the user needs it. The fix: at the top of runSync, when --skip-failed is set, eagerly ack any pre-existing unacked failures before any sync work runs. Now the flag means "acknowledge whatever is currently flagged and move on" regardless of whether the current run produces new failures or finds nothing to do. The inner per-run ack path stays — it still handles new failures from the CURRENT run, which is the (a) syncing now produces failures + (b) caller wants to ack them path. The two paths compose: `gbrain sync --skip-failed` clears stale + advances past anything new, all in one command, matching what the doctor message promises. Tests: 2 added in test/sync-failures.test.ts. One source-string pin on the new gate (the file's existing pattern for CLI-flag tests). One behavioral test on the underlying acknowledgeSyncFailures path. Repro: $ gbrain doctor [WARN] sync_failures: 27 unacknowledged sync failure(s)... Fix the file(s) and re-run 'gbrain sync', or use 'gbrain sync --skip-failed' to acknowledge. $ # ... fix the YAML ... $ gbrain sync Already up to date. $ gbrain sync --skip-failed Already up to date. # before this PR $ gbrain doctor [WARN] sync_failures: 27 unacknowledged sync failure(s)... # still! After: $ gbrain sync --skip-failed Acknowledged 27 pre-existing failure(s). Already up to date. $ gbrain doctor [OK] sync_failures: N historical sync failure(s), all acknowledged * fix(extract): default --dir to configured brain dir, not cwd `gbrain extract links` (and timeline / all) defaulted --dir to '.' when not explicitly passed (src/commands/extract.ts:357). Combined with a walker that skips dotfiles but NOT node_modules/dist/build/vendor, this turned a no-arg invocation into a footgun. 
Repro:

$ cd ~/Documents/some-project   # has a node_modules/ tree
$ gbrain extract links
[extract.links_fs] 28989/28989 (100%) done
Links: created 0 from 28989 pages
Done: 0 links, 0 timeline entries from 28989 pages

The "28989 pages" is `walkMarkdownFiles('.')` recursively eating package READMEs, dependency docs, and fixture content. Their from_slug doesn't match any row in the pages table, so addLinksBatch rejects every insert and returns 0. The output looks like a healthy idempotent no-op; it was actually a wasteful junk walk that wrote nothing.

Fix: when --dir is not passed AND source is fs, resolve from sources(local_path) via getDefaultSourcePath — same helper sync uses (src/commands/sync.ts:1089). The default behavior now matches `sync`: "work on the configured brain". Falls back to a clear error when no source is configured, telling the user to either pass --dir, register a source, or use --source db.

Behavior matrix:
--dir explicit → use that path (unchanged)
--dir absent + configured source → resolve from sources(local_path)
--dir absent + no source → error with actionable hint (was: walk cwd silently)
--dir . → cwd (user opted in explicitly — unchanged)

Tests: three added in test/extract-fs.test.ts:
1. configured source → no-arg invocation extracts from that path
2. no source configured → exit 1 + actionable error message
3. explicit --dir wins over a configured (decoy) source path

* fix(extract): normalize slugs to lowercase via pathToSlug() (T-OBS-1)

The extractor was generating from_slug and the allSlugs lookup set from `relPath.replace('.md', '')` in 5 places, producing CAPS slugs for files named ETHOS.md, AGENTS.md, ROADMAP.md, etc. Pages persist in the DB with lowercase slug (core/sync.ts pathToSlug() applies .toLowerCase()). The CAPS extractor output mismatched the DB rows, so INSERT ... JOIN pages ON pages.slug = v.from_slug silently dropped links from CAPS-named source files.
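The case mismatch above comes down to slug normalization. A minimal sketch of both derivations — the real pathToSlug lives in core/sync.ts and only its .toLowerCase() step is confirmed by this log, so treat the exact body as an assumption:

```typescript
// Sketch of core/sync.ts pathToSlug(): strip the .md extension and
// lowercase, so 'ETHOS.md' and 'ethos.md' land on the same DB slug.
// (Assumed body; only the lowercasing is documented above.)
function pathToSlug(relPath: string): string {
  return relPath.replace(/\.md$/, "").toLowerCase();
}

// The buggy derivation used in the extractor's 5 sites: it kept the
// original casing, so 'ETHOS.md' produced 'ETHOS' while the pages
// row stored 'ethos' — and the JOIN on slug silently dropped links.
function buggySlug(relPath: string): string {
  return relPath.replace(".md", "");
}
```

For already-lowercase paths the two agree, which is why the fix is idempotent for existing brains.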
The link batch returned 'inserted' counts that were lower than the wikilinks actually present, with no error. Reproduction (in a brain with CAPS-named canonical docs): 1. echo 'See [agents](agents.md).' > ETHOS.md 2. gbrain put ethos < ETHOS.md # page row: slug='ethos' 3. gbrain extract links --source fs 4. gbrain backlinks agents → [] (expected: contains 'ethos') Fix: import pathToSlug from core/sync.ts and use it in all 5 sites: - extractLinksFromFile (line 200): from_slug derivation - runIncrementalExtractInternal (line 456): allSlugs set - extractLinksFromDir (line 552): allSlugs set - timeline loop (line 643): from_slug for timeline entries - extractLinksForSlugs (line 673): allSlugs set used by sync hook This single-line-per-site change keeps the extractor consistent with the sync layer's slug normalization and doesn't introduce any new behavior for already-lowercase paths (idempotent). Tests: added 'extractLinksFromFile — slug normalization (T-OBS-1 regression)' suite with 4 cases covering CAPS, mixed-case, idempotent lowercase, and nested path. Full extract suite (54 → 58 tests) passes. Reported by Claude Code (Opus 4.7) during Obsidian PKM integration on the gstack-plan Living Repo, where ~111 wikilinks pointing to ETHOS, AGENTS, ROADMAP, etc. failed to count toward brain_score (54/100 vs expected 75+/100). Documented as T-OBS-1 in the consumer's blocked.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(cli): CLI_ONLY commands should short-circuit on --help instead of executing * fix(doctor): correct command syntax in graph_coverage warn message graph_coverage warn directs users to run `gbrain link-extract && gbrain timeline-extract`, but no commands by those names are registered in cli.ts. The actual commands are `gbrain extract links` and `gbrain extract timeline` (registered as the 'extract' subcommand at src/cli.ts:525, with the kind argument 'links' / 'timeline' / 'all' parsed inside src/commands/extract.ts). 
A user who runs the suggested command gets:

$ gbrain link-extract
Unknown command: link-extract

This is the only place in src/ with the wrong syntax — the rest of the docs (init.ts:221, init.ts:331, features.ts:120, v0_13_0.ts:67, sync.ts:752 comment) all already say 'extract links'. This patch just brings doctor.ts in line.

* fix(doctor): use autoDetectSkillsDir so OpenClaw workspaces are reachable

`gbrain doctor` was the only consumer of `findRepoRoot` from `core/repo-root.ts`. Every other consumer (check-resolvable.ts:145, skillify.ts, etc.) uses `autoDetectSkillsDir`, which has the full detection chain:
1. $OPENCLAW_WORKSPACE
2. ~/.openclaw/workspace
3. findRepoRoot() walk from cwd
4. ./skills

`findRepoRoot` only does step 3. Result: when the user runs `gbrain doctor` from any directory outside the gbrain repo or the OpenClaw workspace tree (e.g., a project's checkout), `resolver_health` reports "Could not find skills directory" even though the dispatcher exists at ~/.openclaw/workspace/skills/RESOLVER.md.

Reproduces in any directory other than ~/gbrain or its descendants on a system with ~/.openclaw/workspace/skills/RESOLVER.md present:

$ cd ~/Documents
$ gbrain doctor
[WARN] resolver_health: Could not find skills directory   # before
[WARN] resolver_health: 5 issue(s): 0 error(s), 5 warning(s)   # after

Switching doctor to `autoDetectSkillsDir` brings it in line with the rest of the codebase. The detected dir is also passed to `checkSkillConformance` (step 2 of the resolver_health block), which previously rebuilt the path from `repoRoot` — now it uses the same detected path for consistency. All 15 existing tests in test/doctor.test.ts continue to pass.

* fix(mcp): exit serve process on stdin-close/SIGTERM

MCP stdio server was keeping the bun process alive indefinitely after the client disconnected. Over days this accumulated 20+ orphaned gbrain serve processes, all holding the PGLite directory open.
Since PGLite is single-writer, this caused write-lock contention that made email-sync fail its 15s per-put timeout: 114 puts x 15s = 28.5min runs with 0 emails written.

Now listens for stdin end/close, transport close, and SIGTERM/SIGINT/SIGHUP; calls engine.disconnect() and exits cleanly. Root cause for the no-gbrain-run-in-50h alert.

* fix(skills): broaden RESOLVER triggers + 1 ambiguity flag (37 misses → 0, 100% top-1 accuracy)

`bun run src/cli.ts routing-eval` was reporting 37 ROUTING_MISS entries across 10 skills whose RESOLVER.md trigger phrases didn't match any of their own routing-eval.jsonl fixture intents. Two distinct causes, plus one legitimate overlap:

1. Single-phrase triggers in 9 skills under '## Uncategorized' didn't cover the paraphrased fixture variations they're supposed to route. Broadened each trigger cell to a quoted-phrase list that covers the fixtures (5 fixtures per skill on average).
2. The media-ingest row used unquoted prose ('Video, audio, PDF, book, YouTube, screenshot') which extractTriggerPhrases() collapses into one impossible long phrase ('video audio pdf book youtube screenshot') under normalizeText — no fixture intent will ever contain that exact substring. Converted to a quoted phrase list.
3. One fixture ('web research pass on this person') legitimately matches both `perplexity-research` and `data-research` (data-research's trigger row contains "Research"). Marked the fixture `ambiguous_with: ["data-research"]` since the overlap on the keyword 'research' is inherent and expected.

Skills with broadened triggers:
- voice-note-ingest, article-enrichment, book-mirror, archive-crawler, brain-pdf, academic-verify, concept-synthesis, perplexity-research, strategic-reading, media-ingest

Before: 58 cases, 37 misses, ~36% top-1 accuracy
After: 58 cases, 0 misses, 100% top-1 accuracy

This also clears `gbrain doctor`'s `resolver_health: 37 issue(s)` warning.
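The media-ingest failure mode above can be demonstrated with a toy version of the phrase extraction. These bodies are assumptions for illustration only — the real extractTriggerPhrases() and normalizeText() live in the routing-eval code and may differ in detail:

```typescript
// Assumed sketch of normalizeText: lowercase, strip punctuation,
// collapse whitespace.
function normalizeText(s: string): string {
  return s
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Assumed sketch of extractTriggerPhrases: quoted phrases in a
// trigger cell become separate candidates; an unquoted prose cell
// collapses into one long phrase that no intent will ever contain.
function extractTriggerPhrases(cell: string): string[] {
  const quoted = [...cell.matchAll(/'([^']+)'/g)].map((m) => normalizeText(m[1]));
  return quoted.length > 0 ? quoted : [normalizeText(cell)];
}
```

Under this sketch the unquoted media-ingest row yields the single impossible phrase, while a quoted list yields one matchable phrase per entry — the shape the fix converted every broadened trigger cell to.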
* fix(multi-source): thread source_id through per-page tx surface Multi-source brains crashed mid-import with Postgres 21000 ("more than one row returned by a subquery used as an expression"). Root cause: putPage's INSERT column list omitted source_id, so writes intended for a non-default source (e.g. 'jarvis-memory') silently fabricated a duplicate row at (default, slug). The schema has UNIQUE(source_id, slug) but DEFAULT 'default' for source_id; calling putPage(slug, page) without source_id landed at (default, slug) and ON CONFLICT updated the wrong row, leaving the intended source row stale. Subsequent bare-slug subqueries inside the same tx — (SELECT id FROM pages WHERE slug = $1) in getTags / removeTag / deleteChunks / removeLink / addLink (cross-product) — then matched 2 rows and crashed with 21000, rolling back the entire import. Observed: 18 sync failures against a 'jarvis-memory'-sourced brain. Fix: - putPage adds source_id to the INSERT column list (defaults 'default' for back-compat). - Every bare-slug page-id subquery becomes source-qualified (AND source_id = $X) in both engines: createVersion, upsertChunks, getChunks, addTag, removeTag, getTags, deleteChunks, removeLink, addTimelineEntry, deletePage, updateSlug. - addLink rewritten away from FROM pages f, pages t cross-product into a VALUES + JOIN-on-(slug, source_id) shape mirroring addLinksBatch. - engine.ts interface: 11 method signatures gain optional opts.sourceId (or opts.{from,to,origin}SourceId for addLink/removeLink). All optional; existing callers default to source='default' and behave identically. - import-file.ts: importFromContent / importFromFile / importCodeFile take opts.sourceId and thread txOpts = { sourceId } through every per-page tx call. engine.getPage callsite source-scoped for accurate idempotency. - commands/sync.ts: thread opts.sourceId at importFile (line 581 + 641), un-syncable cleanup (487-498), delete phase (557), rename phase (574), and post-sync extract phase (815-816). 
- commands/reindex-code.ts: thread opts.sourceId at importCodeFile call. - commands/extract.ts: extractLinksForSlugs / extractTimelineForSlugs accept opts.sourceId and propagate via linkOpts / entryOpts. - commands/reconcile-links.ts: ReconcileLinksOpts.sourceId was declared but ignored end-to-end; now wired through getPage + addLink calls. - commands/migrate-engine.ts: --force wipe switched to executeRaw('DELETE FROM pages') to preserve the pre-PR all-sources semantic after deletePage became default-source-scoped. Regression test: test/source-id-tx-regression.test.ts (19 tests). Validates two sources × same slug coexist; getTags/addTag/removeTag/deleteChunks/ upsertChunks/createVersion/addLink/addTimelineEntry/deletePage/updateSlug source-scoped writes don't 21000; back-compat without opts targets source='default'; addLink fail-fast on missing source-qualified endpoint; importFromContent end-to-end tx thread without fabricating duplicate. Adversarial review: Codex (gpt-5.5 reviewer) + Grok (xAI flagship reviewer) 3-round crew loop. Round 1: 2 HIGH (addTimelineEntry + extract.ts thread) + 2 MED. Round 2: 1 CRITICAL + 1 HIGH (deletePage + updateSlug bare-slug) + 2 MED. Round 3: 2 HIGH (getChunks + migrate-engine semantic regression introduced by R2 fix). Round 4: both reviewers CLEAR. Deferred to follow-up PRs (noted as TODO): - src/commands/embed.ts source-aware threading (auto-embed at sync.ts:823 has a TODO; try/catch swallows the failure as best-effort). - src/core/postgres-engine.ts:1511 / pglite-engine.ts:1446 putRawData bare-slug (lower-impact metadata path). - Read-surface bare-slug consistency cleanup (getLinks/getBacklinks/ getTimeline/getRawData/getVersions): non-mutating, won't 21000. - reconcile-links.ts CLI --source flag exposure (internal opt is wired; CLI parser is a UX feature for later). Existing rows in production written under (default, slug) by the old putPage when caller meant another source remain misrouted. 
Backfill heuristics need install-specific knowledge of intended source and are outside this PR's scope; surface as a deployment-side cleanup task. bun run typecheck clean, bun run build clean, 19/19 regression tests pass, 4082 unit pass / 1 pre-existing fail (BrainRegistry test depending on test-env ~/.gbrain/ absence — fails on untouched main, unrelated). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(multi-source): plumb sourceId through performFullSync (PR #707 gap) PR #707 fixed source_id routing for sync's incremental loop (lines 581/641) but performFullSync (line 922) calls runImport without threading sourceId. Result: full syncs route pages to default even with --source <id>. Verified on v0.30.1 by direct PGLite probe after `gbrain sync --source X --full`: all pages landed in default, not the named source. Fix: - runImport accepts sourceId in opts (programmatic only — no CLI flag, preserving PR #707's design intent of `gbrain import` being default-only). - runImport threads sourceId to importFile + importImageFile. - performFullSync passes opts.sourceId to runImport. - ImportImageOptions type accepts sourceId for runImport branch (importImageFile body wiring deferred — image imports out of scope for current use case; TS error fix only). Verified: real sync test against /tmp/test-sync routes 1 page to "testsync" source, 0 to default (post-fix). 19/19 source-id regression tests still pass. Typecheck clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test: regression test for performFullSync sourceId threading PR #707's existing 19-test suite at test/source-id-tx-regression.test.ts covers the engine-layer transaction surface (putPage / addTag / etc.) but does NOT exercise commands/sync.ts:performFullSync. Verified via `grep -c 'performFullSync' test/source-id-tx-regression.test.ts → 0`. This means the +18/-4 fix at sync.ts:892 (performFullSync passing sourceId to runImport) had no automated coverage. 
Adds 2 PGLite-only regression tests: 1. `performFullSync with --source routes pages to named source (not default)` — fixture: temp git repo with 2 markdown files. Calls performSync with { full: true, sourceId: 'testsrc-pfs', noPull: true, noEmbed: true }. Asserts pages.source_id = 'testsrc-pfs', not 'default'. Pre-fix: FAILS (verified by checking out 46cd197 — rebased PR #707 only, without my gap-fix — and running this test). Post-fix: PASSES. 2. `performFullSync WITHOUT --source still targets default (back-compat)` — same fixture, no sourceId opt. Asserts pages.source_id = 'default'. Both pre-fix and post-fix: PASSES (back-compat preserved by the fix). Verified: 21/21 tests pass on this branch (19 from PR #707 + 2 new). `bun run typecheck` clean. `bun run verify` clean (8 guard checks pass). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(privacy): strip takes fence from get_page / get_versions when token carries an allow-list v0.28.6 (#563) introduced the per-token takes-holder allow-list: an OAuth token carries `permissions.takes_holders` and `takes_list` / `takes_search` / `think.gather` filter take rows server-side via `WHERE t.holder = ANY($allowList)` in both engines. But take rows are stored in two places per the explicit contract in `extract-takes.ts:5-13` ("markdown is canonical, the takes table is a derived index"): the structured `takes` table AND inline in `pages.compiled_truth` between `<!--- gbrain:takes:begin -->` markers as a markdown table whose `who` column IS the holder. A read-only token whose `takes_holders` is `["world"]` (the documented default-deny posture from migrate.ts:1221) can call `get_page <slug>` and recover every non-`world` claim verbatim from the body — private hunches, founder bets, non-public sourcing notes. `get_versions` has the same shape: snapshots persist historical compiled_truth verbatim, so a caller blocked at `get_page` falls through to /history. 
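A whole-fence strip over those markers can be sketched as follows. The begin marker is quoted above; the end marker name and the exact strip semantics are assumptions here — the real helper is the existing stripTakesFence in the codebase:

```typescript
// Begin marker as quoted in the contract above; the end marker name
// is an assumed counterpart for this sketch.
const BEGIN = "<!--- gbrain:takes:begin -->";
const END = "<!--- gbrain:takes:end -->";

// Conservative whole-fence strip: remove everything between the
// markers (markers included). Bodies without a fence pass through
// untouched; an unterminated fence drops the tail rather than leak.
function stripTakesFence(body: string): string {
  const start = body.indexOf(BEGIN);
  if (start === -1) return body;
  const end = body.indexOf(END, start);
  if (end === -1) return body.slice(0, start);
  return body.slice(0, start) + body.slice(end + END.length);
}
```

Applied only when the remote allow-list is set, this keeps local CLI reads (full fence) and MCP-bound reads (fence stripped, surrounding body kept) on different surfaces.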
The team already shipped a complementary fix in `chunkers/recursive.ts:49` (stripTakesFence applied before the body is chunked, so `query` results don't leak fence content). Migration v38 documents this as a "complementary fix" — the page-CRUD surface was missed. Fix strips the fence at the op layer when `ctx.takesHoldersAllowList` is set (i.e. the remote MCP path). Local CLI callers leave the field unset and keep seeing the full fence. const visibleBody = ctx.takesHoldersAllowList ? { ...page, compiled_truth: stripTakesFence(page.compiled_truth) } : page; Same shape on `get_versions` over every snapshot in the array. Re-rendering the fence with allow-list-filtered rows would require joining the takes table per version_id and inverts the markdown-canonical contract; whole-fence strip is the conservative posture that closes the leak. A future allow-list-aware re-render is an additive change that won't break the contract pinned by these tests. Test coverage in `test/takes-mcp-allowlist.serial.test.ts`: - get_page with allow-list strips fence; surrounding body kept. - get_page without allow-list (local CLI) keeps fence (back-compat). - get_page fuzzy resolution path also strips for remote tokens. - get_versions with allow-list strips fence on every snapshot. - get_versions without allow-list returns historical content intact. The pre-fix R12 PoC reported `LEAKED garry hidden take? YES` and `LEAKED brain hidden take? YES`; post-fix the same PoC reports `no` for both holders and "bypass did not reproduce". * Fix double-encoded jsonb in subagent_tool_executions breaking slug lookup persistToolExecPending/Failed/Complete called JSON.stringify(input) before passing to a $N::jsonb parameter. When input is already an object, this produces a JSON string which ::jsonb stores as a jsonb scalar -- not a jsonb object. Downstream queries like input->>slug then return NULL because the operator does not traverse scalar strings. 
Root cause fix: skip JSON.stringify when input is already a string. Query fix: use COALESCE with (input #>> '{}')::jsonb->>slug fallback to handle both old double-encoded rows and new properly-encoded rows. Affects: dream cycle synthesize phase (pages_written always 0) and patterns phase (same slug collection query). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(adapter/voyage): translate request/response between OpenAI-compat SDK and Voyage's actual contract The @ai-sdk/openai-compatible package treats Voyage as if it were OpenAI-shaped, but Voyage's /v1/embeddings endpoint diverges in three places that combine into a hard-blocking incompatibility: OUTBOUND request: - 'encoding_format=float' (SDK default) is rejected; Voyage only accepts 'base64' - 'dimensions' parameter (OpenAI name) is rejected; Voyage uses 'output_dimension' INBOUND response: - With encoding_format=base64, 'embedding' is returned as a base64 string, but the SDK's Zod schema (openaiTextEmbeddingResponseSchema) expects an 'array of number'. The schema fails with 'Invalid JSON response' even though the JSON is well-formed. - 'usage' lacks 'prompt_tokens'; the schema requires it when usage is present. Without this patch, ALL embedding requests to Voyage fail. Reproducible by running 'gbrain put <slug> < text' with embedding_model=voyage:voyage-* and any current voyage model (voyage-3-large, voyage-3, voyage-4-large). Solution: pass a custom 'fetch' to createOpenAICompatible only when recipe.id === 'voyage'. The fetch wrapper: 1. Forces encoding_format='base64' on outbound (Voyage's only accepted value) 2. Translates dimensions -> output_dimension on outbound 3. Drops Content-Length so the runtime recomputes from the mutated body 4. Decodes base64 embeddings to Float32 arrays on inbound (so the Zod schema sees what it expects) 5. Synthesizes prompt_tokens from total_tokens when missing This is a minimal, targeted fix. 
It only activates for Voyage and falls through cleanly for all other providers. No public API changes. * feat(dream): support .md files in transcript discovery Transcript discovery only accepted .txt files. Many brain repos store meeting transcripts and conversation logs as .md (markdown), which is the natural format for brain content. Changes: - listTextFiles() now accepts both .txt and .md - basename extraction handles both extensions for date inference - readSingleTranscript() handles both extensions No behavior change for existing .txt-only setups. * fix(test): cast exitCode to unknown for TS strict-narrowing TS narrows exitCode to null between declaration and assertion because the mocked process.exit is behind `(process as any).exit`. The cast preserves test intent without weakening the variable's type annotation. Wave-side merge fix; ships alongside #688 (extract --dir default). * fix(cli): add frontmatter + check-resolvable to CLI_ONLY_SELF_HELP Companion to #634. Both commands have their own --help logic that prints detailed usage with command-specific flags (e.g., --json, --fix, --strict for check-resolvable). Without this, pr-634's generic short-circuit prints "Usage: gbrain <cmd> - run gbrain --help for the full command list." and the existing --help integration tests fail. Verified: `gbrain frontmatter --help` and `gbrain check-resolvable --help` now route to their handlers, which print full per-command usage and exit 0. * fix(test): update discoverTranscripts test expectation for .md support Companion to #708. The pre-#708 test asserted that .md files in the session-corpus directory were skipped. Post-#708 they are discovered alongside .txt. Renamed the test to 'skips non-txt non-md files' (uses .pdf as the negative case) and added a positive .md discovery test that pins #708's intended behavior. * fix(skills): declare missing RESOLVER triggers in skill frontmatter Companion to #718. 
The RESOLVER round-trip test (test/resolver.test.ts) fuzzy-matches every RESOLVER.md trigger phrase against the target skill's frontmatter triggers list. pr-718 added six new RESOLVER routings without declaring matching triggers: - media-ingest: 'PDF book', 'summarize this book', 'ingest it into my brain' - article-enrichment: 'enriching the article', 'enrich the article', 'enrich pass' - concept-synthesis: 'canon vs riff' - perplexity-research: 'perplexity-research', 'surface new developments' - academic-verify: 'Retraction Watch' - voice-note-ingest: 'audio message' Adds the missing triggers verbatim to each skill's frontmatter so the round-trip invariant holds. * chore: regenerate llms.txt + llms-full.txt after wave skill updates * v0.30.3 release: bump VERSION + CHANGELOG entry 22-PR community fix wave with one P0 security upgrade (auth-code scope escalation closed). 19 PRs landed across 5 lanes; 3 superseded by master during cherry-pick; 1 deferred per E2 protocol (#681 architectural conflict with v0.28 takes-holders); follow-up filed. Headline fixes: #727 (auth-code scope-clamp, RFC 6749 §3.3 compliance), #740/#751 (v0.29.1 PGLite migration connect), #741 (v39-v41 forward- reference bootstrap), #757 (multi-source sourceId threading, closes Postgres 21000), #728 (takes-fence redaction on remote reads). See CHANGELOG.md for full per-PR attribution and decision history. 
Co-Authored-By: lanceretter <lance@csatlanta.com> Co-Authored-By: alexandreroumieu-codeapprentice <agency.aubergine.code@gmail.com> Co-Authored-By: brandonlipman <brandon@offdeck.com> Co-Authored-By: gus <gustavoraularagon@gmail.com> Co-Authored-By: jeremyknows <jeremyknows@protonmail.com> Co-Authored-By: Trevin Chow <trevin@trevinchow.com> Co-Authored-By: WD <wd@WDdeMacBook-Pro.local> Co-Authored-By: Federico Cachero <federicocachero.tango@gmail.com> Co-Authored-By: Brandon Lipman <brandon@offdeck.com> Co-Authored-By: joshsteinvc <josh@stein.vc> Co-Authored-By: mgunnin <michael.gunnin@gmail.com> Co-Authored-By: NineClaws Brain <joel@5nine64.com> Co-Authored-By: joelwp <joel.phillips@gmail.com> Co-Authored-By: Oscar <oscar@Mac-mini-de-Oscar.local> * test(C6): regression test for #745 collectChildPutPageSlugs Codex-mandated test gate (C6 from /codex review of v0.30.3 plan). Pins behavior of collectChildPutPageSlugs() under both jsonb shapes: - jsonb_typeof='object' (post-#745, normal write path) - jsonb_typeof='string' (pre-#745 double-encoded, the bug shape) Without this guard, a future regression of #745 would silently drop slugs: child jobs finish, queue looks healthy, orchestrator writes nothing. Worst on-call shape — silent failure with no alerting surface. Adds an `__testing` namespace to src/core/cycle/synthesize.ts re-exporting collectChildPutPageSlugs at unit-test granularity. Not part of the runtime contract; matches the v0_29_1.ts `__testing` precedent for engine-internal helpers. * test(C8): #708 .md transcript discovery + self-consumption guard Codex-mandated test gate (C8 from /codex review of v0.30.3 plan). Pins three invariants for #708's broadening of transcript discovery: 1. .md files ARE discovered alongside .txt (the feature works). 2. Other extensions (.pdf, .doc, .json) are still SKIPPED. 3. 
v0.30.2's dream_generated frontmatter marker MUST guard .md files against self-consumption — without this, every dream cycle would loop on its own output indefinitely. Adversarial cases: BOM + CRLF tolerance on .md frontmatter; the --unsafe-bypass-dream-guard escape hatch for .md output; mixed .txt + .md corpus dedup behavior pinned. * test(C4): takes-fence redaction regression on get_page + get_versions Codex-mandated test gate (C4 from /codex review of v0.30.3 plan). Pins three privacy invariants for #728's fence-stripping in operations.ts: 1. Local CLI caller (no allow-list) sees full takes fence — operator reads should preserve everything. 2. MCP-bound caller (allow-list set) sees compiled_truth with fence STRIPPED on get_page AND get_versions. 3. Allow-list PRESENCE (not contents) flags MCP-bound identity. Even a permissive ['world','garry','brain'] still strips, because the typed read surface for takes is takes_list / takes_search, not get_page or get_versions. Lane 4 (#757 + #728) was the high-risk merge surface for this privacy invariant. The test runs through dispatchToolCall to exercise the full threading path (auth → context → handler → engine read → stripTakesFence) so a future bad merge fails loudly at the conflict seam in operations.ts. * test(C3): rewound-brain E2E for v39-v41 forward-reference bootstrap Codex-mandated test gate (C3 from /codex review of v0.30.3 plan). Pins the upgrade-path claim in the v0.30.3 release notes: brains stuck at config.version < 39 (Postgres) or < 41 (PGLite) walk forward cleanly through #741's bootstrap additions. Without this, the release note's "old PGLite brains upgrade cleanly through v39-v41" was unproven. Four cases: 1. pre-v39 (missing modality + embedding_image) 2. pre-v40 (missing emotional_weight + effective_date + effective_date_source) 3. pre-v41 (missing import_filename + salience_touched_at) 4. 
compounded pre-v34 wedge (v0.20 + v0.26.3 + v39-v41 all dropped at once) Pattern follows test/e2e/v0_28_5-fix-wave.test.ts: build a fresh LATEST brain, surgically rewind via DROP COLUMN CASCADE + UPDATE config.version, then re-call initSchema and assert advancement to LATEST_VERSION with the rewound columns restored. PGLite-only — Postgres-side bootstrap is covered separately by test/e2e/postgres-bootstrap.test.ts. * fix(test): rename migration-v0-29-1 to .serial.test.ts (CI lint) CI's check-test-isolation lint flags the test for direct process.env.GBRAIN_HOME mutation in beforeEach (rule R1: parallel-test-unsafe). The test is genuinely env-coupled — it sets GBRAIN_HOME so loadConfig() inside the migration phases finds the test fixture. Per CLAUDE.md ("When to quarantine instead of fix") and the lint's own fix hint, env-coupled tests get renamed to *.serial.test.ts to run in the serial bucket. Verified: bash scripts/check-test-isolation.sh now reports OK; the renamed test still runs green (1 pass / 0 fail, ~1.5s). * fix(types): voyageCompatFetch — cast through unknown for Bun typeof fetch CI's tsc --noEmit failed: src/core/ai/gateway.ts(249,7): error TS2741: Property 'preconnect' is missing in type '(input: RequestInfo | URL, init: RequestInit | ...) => Promise<Response>' but required in type 'typeof fetch'. Bun's @types/bun extends the standard fetch type with a preconnect method that arrow functions can't satisfy. The AI SDK only invokes the call signature; the Bun extension surface is irrelevant to voyageCompatFetch's behavior. Cast through `unknown` (TS2352-safe pattern for cross-type-family casts) with explicit param types on the arrow function. Comment names the exact TS2741 the cast suppresses so a future maintainer can audit the choice. Companion to #735 (Voyage encoding-format adapter) — the original PR introduced voyageCompatFetch typed against typeof fetch; the wave-side typecheck error was caught by CI on the assembled branch. 
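The inbound half of the Voyage translation (decoding base64 embeddings to the number arrays the SDK's Zod schema expects) can be sketched independently of the fetch wrapper. decodeBase64Embedding is a hypothetical name — the real wrapper lives in src/core/ai/gateway.ts — and the sketch assumes a Node/Bun runtime with Buffer and a little-endian float32 payload:

```typescript
// Decode a base64 float32 embedding payload into number[].
// Uint8Array.from copies into a fresh, aligned ArrayBuffer, so the
// Float32Array view is safe even when Buffer uses a pooled slab.
function decodeBase64Embedding(b64: string): number[] {
  const bytes = Uint8Array.from(Buffer.from(b64, "base64"));
  return Array.from(new Float32Array(bytes.buffer));
}

// Round-trip helper for demonstration only: encode number[] the way
// a base64 float32 response would arrive.
function encodeEmbedding(values: number[]): string {
  return Buffer.from(new Float32Array(values).buffer).toString("base64");
}
```

Feeding the decoded arrays back into the response body is what lets the SDK's 'array of number' schema validate a base64 response it would otherwise reject as 'Invalid JSON response'.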
* fix(test/e2e): rename + update dream-cycle phase-order test The test file said "v0.23 8-phase cycle" but ALL_PHASES has been 9 since v0.26.5 (added `purge`) and 10 since v0.29 (added `recompute_emotional_weight` between patterns and embed). The hardcoded 8-element array assertion was stale documentation. Renamed the file from dream-cycle-eight-phase-pglite.test.ts to dream-cycle-phase-order-pglite.test.ts to make the maintenance contract explicit: this test pins the canonical phase sequence, whatever its current length, against unintended reorderings or removals. Extracted EXPECTED_PHASES as a typed const so the assertion lives in one place and TypeScript's CyclePhase narrowing catches typos in the phase names. * fix(test/e2e): cycle.test.ts expects 10 phases (v0.29 added recompute_emotional_weight) Same root cause as dream-cycle-phase-order-pglite.test.ts: hardcoded phase count assertion drifted behind ALL_PHASES growth. Phase history: v0.23 = 8 phases v0.26.5 = 9 (added `purge` last) v0.29 = 10 (added `recompute_emotional_weight` between patterns and embed) * fix(test/e2e): scope GBRAIN_HOME to tmpdir for Doctor Command tests `gbrain doctor`'s minions_migration check reads `~/.gbrain/migrations/completed.jsonl` to detect half-installed migrations. Pre-fix the test inherited the developer's local $HOME, so stale partial entries from in-flight workspaces (e.g. v0.31.0 in santiago) made the check fail and the test exit 1 — masking real DB-health failures. Added per-describe-block `gbrainHome` tmpdir, threaded through `cliEnv()` so all spawned gbrain CLI calls in this block read a hermetic, empty migrations ledger. Cleanup in afterAll. * fix(claw-test): pass --dir explicitly to extract phase (companion to #688) Pre-#688 `gbrain extract` defaulted to cwd. Post-#688 it requires either a configured fs source or explicit --dir, otherwise it errors out: "No brain directory configured." 
The claw-test scripted scenarios run `gbrain init --pglite` in their install_brain phase, which doesn't register a fs source. So the extract phase needs --dir <brainDir> explicitly. Skip the extract phase entirely when the scenario has no brain dir. Captured brainDir at the import-phase site so it's reusable by extract.

* fix(preferences): route migration ledger paths through gbrainPath()

Pre-fix, preferences.ts used `$HOME/.gbrain` directly via its own `home()` helper. Tests that set `process.env.HOME = tmpdir` expecting hermetic isolation worked — but tests that set `GBRAIN_HOME = tmpdir` (the documented override per `src/core/config.ts`) didn't, because preferences ignored it.

Routed prefsDir(), prefsPath(), migrationsDir(), and completedJsonlPath() through gbrainPath() (which honors GBRAIN_HOME, falling back to homedir() when unset). The legacy home() helper stays for any future code path that wants $HOME specifically.

Updated three tests that mutated process.env.HOME to also mutate GBRAIN_HOME so the same test body works against the new contract: test/preferences.test.ts, test/migration-resume.test.ts, test/e2e/migration-flow.test.ts.

* release: rename version slot to 0.31.1.1-fixwave

Originally bumped to 0.31.2 during the master merge to stay strictly monotonic. Garry called the slot back to `0.31.1.1-fixwave` to communicate intent: this is a fix wave on top of v0.31.1, not a new minor or patch slot. The next regular release slot (v0.31.2) stays free for in-flight feature work.

Format check:
- bun install accepts the literal version (verified)
- compareVersions() in src/commands/migrations/index.ts splits on '.' and parseInt's each segment, taking only the first 3. So '0.31.1.1-fixwave' compares as [0,31,1] = equal to '0.31.1' for migration-ordering purposes. Wave has no new schema migrations, so equality is fine.
- Compares stable to 0.31.1 in the migration runner; later versions (0.31.2, 0.32.x, etc.) sort strictly above as normal.
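The comparison rule above can be sketched like this — a re-derivation for illustration, not the actual source of src/commands/migrations/index.ts:

```typescript
// Illustrative re-derivation of the ordering rule described above: split on
// '.', parseInt each segment, and compare only the first three segments, so
// a fourth segment like '1-fixwave' never participates in the comparison.
function compareVersions(a: string, b: string): number {
  const parse = (v: string): number[] =>
    v.split(".").slice(0, 3).map((seg) => parseInt(seg, 10));
  const [pa, pb] = [parse(a), parse(b)];
  for (let i = 0; i < 3; i++) {
    const d = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (d !== 0) return d;
  }
  return 0;
}
```

Under this rule '0.31.1.1-fixwave' parses to [0, 31, 1] and compares equal to '0.31.1', while '0.31.2' and '0.32.x' sort strictly above.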
Updated:
- VERSION
- package.json (with bun.lock refresh)
- CHANGELOG.md entry header + 'To take advantage of' block + 'For contributors' reference
- llms.txt + llms-full.txt regenerated to match

---------

Co-authored-by: lanceretter <lance@csatlanta.com>
Co-authored-by: Oscar <oscar@Mac-mini-de-Oscar.local>
Co-authored-by: WD <wd@WDdeMacBook-Pro.local>
Co-authored-by: gus <gustavoraularagon@gmail.com>
Co-authored-by: Trevin Chow <trevin@trevinchow.com>
Co-authored-by: Brandon Lipman <brandon@offdeck.com>
Co-authored-by: Federico Cachero <federicocachero.tango@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Josh Stein <josh@threshold.vc>
Co-authored-by: Matt Gunnin <mgunnin@esports.one>
Co-authored-by: Michael Dela Cruz <adobobro@mac.lan>
Co-authored-by: Jeremy Knows <jeremy@veefriends.com>
Co-authored-by: joelwp <joel.phillips@gmail.com>
Co-authored-by: NineClaws Brain <joel@5nine64.com>
Co-authored-by: alexandreroumieu-codeapprentice <agency.aubergine.code@gmail.com>
Co-authored-by: jeremyknows <jeremyknows@protonmail.com>
Co-authored-by: joshsteinvc <josh@stein.vc>
Co-authored-by: mgunnin <michael.gunnin@gmail.com>
Summary
v0.28 adds the largest structural surface gbrain has ever shipped: a takes layer that turns every page into a queryable belief surface. Four kinds (`fact | take | bet | hunch`), explicit attribution (`world | garry | brain | <slug>`), 0.0–1.0 weight, since/until dates, supersede chains, and bet resolution. Markdown is canonical (a fenced table on the page); Postgres is the derived index. Every weight change diffs in git. Every superseded take stays visible with strikethrough so belief evolution is preserved.

Plus: full `gbrain think` synthesis pipeline (gather → sanitize → cite-render → synthesize), unified model config (`models.default` replaces every per-phase config key), per-token MCP visibility for the takes layer (Codex P0 #3 privacy fix), and three new MCP ops.

Schema changes: v32 (takes + synthesis_evidence with HNSW partial index, FK CASCADE) + v33 (access_tokens.permissions JSONB with safe-default backfill). v0.28 originally targeted v31/v32 but was renumbered after master's v0.25 claimed v31 for `eval_capture_tables`.

Commits (17 atomic, bisect-friendly)
What's new (user-facing)
- `gbrain takes` — full lifecycle: list/search/add/update/supersede/resolve. Markdown writes via fenced table + atomic per-page lock.
- `gbrain think "<question>"` — synthesis pipeline that gathers pages + takes + graph subgraph, then synthesizes a cited answer with conflict + gap analysis. `--save` persists a `synthesis` page + `synthesis_evidence` rows. Without ANTHROPIC_API_KEY, gracefully degrades to gather-only.
- `models.default <alias>` — one config key replaces every hardcoded model string + per-phase `dream.synthesize.model` etc. Aliases ship: `opus`, `sonnet`, `haiku`, `gemini`, `gpt`. Old per-phase keys are still honored for one release with a stderr deprecation warning.
- `gbrain auth permissions <name> set-takes-holders <h1,h2,h3>` — per-token allow-list for the takes layer. Default `["world"]` keeps private hunches hidden from MCP-bound tokens.
- New MCP ops: `takes_list`, `takes_search`, `think`.

Codex P0 #3 privacy fix
Plan-mode review caught that takes content stored in markdown bodies would also flow into `content_chunks` rows, bypassing the per-token MCP allow-list at the index layer. Fix: `chunkText()` calls `stripTakesFence(body)` before computing chunk boundaries. Takes content lives ONLY in the takes table for retrieval. Doctor invariant `takes_fence_chunk_leak` asserts no `content_chunks` row contains the begin marker — verified end-to-end via `test/e2e/chunker-takes-strip.test.ts`.

Test Coverage
Plus 198 master eval-capture tests still pass post-merge — combined 319/319 green, no regressions.
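The fence-stripping contract those chunker tests pin down can be sketched as follows. The marker strings and helper shapes here are assumptions for illustration, not the real implementation; the point is only the ordering: the takes fence is removed before chunk boundaries are computed, so takes content never reaches content_chunks.

```typescript
// Illustrative sketch of the contract behind takes_fence_chunk_leak.
// TAKES_BEGIN / TAKES_END are assumed marker strings, and this chunker is
// a stand-in — the real chunkText() lives in the gbrain chunker.
const TAKES_BEGIN = "<!-- takes:begin -->"; // assumed marker
const TAKES_END = "<!-- takes:end -->";     // assumed marker

function stripTakesFence(body: string): string {
  const start = body.indexOf(TAKES_BEGIN);
  if (start === -1) return body;
  const end = body.indexOf(TAKES_END, start);
  if (end === -1) return body.slice(0, start); // unterminated fence: drop the tail
  return body.slice(0, start) + body.slice(end + TAKES_END.length);
}

function chunkText(body: string, size = 80): string[] {
  const clean = stripTakesFence(body); // strip BEFORE chunk boundaries exist
  const chunks: string[] = [];
  for (let i = 0; i < clean.length; i += size) {
    chunks.push(clean.slice(i, i + size));
  }
  return chunks;
}
```

Under these assumptions the doctor invariant reduces to: no emitted chunk contains the begin marker, and no takes text survives in any chunk.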
Pre-Landing Review
Three reviews CLEARED in plan mode before any code was written:
- CEO review (`/plan-ceo-review`, mode=SCOPE_EXPANSION): 3 cherry-picks accepted (`query` returns takes, `resolved_at` schema columns, `synthesis_evidence` provenance), 8 deferred to TODOS. 0 unresolved.
- Eng review (`/plan-eng-review`, FULL_REVIEW): 7 issues across 4 sections, 1 P0 decision (append-only `row_num`), 2 P0 folds (FK CASCADE, file-lock pattern), 6 critical e2e tests added. 0 unresolved.
- Codex (`/codex`, consult mode): 17 findings (4 P0, 9 P1, 4 P2). All P0 + P1 folded — privacy hole closed (chunker strips takes content), CASCADE consistency, identity-model clarity, supersession invariants, `--take` semantics, `--context full` marked preview, deprecation precedence, budget meter cross-provider rule, `resolved_at` metadata, backfill in-flight contract.

Plan Completion
Plan at `~/.claude/plans/system-instruction-you-are-working-radiant-lagoon.md`. All 6 phases shipped end-to-end.

TODOS
CEO review surfaced 8 deferred items. All landed in `TODOS.md` under v0.28 follow-ups: auto-supersede on big weight changes, `gbrain takes diff`, `gbrain think --explain`, `gbrain takes import --from-page`, dream digest phase, `gbrain takes export`, belief drift detection on put_page, takes heatmap.

Documentation
- `CHANGELOG.md` — v0.28.0 release summary in GStack voice + "To take advantage" block + itemized changes
- `skills/migrations/v0.28.0.md` — agent-readable upgrade guide (verify, deprecated-key migration, MCP token visibility, smoke test)
- `src/commands/migrations/v0_28_0.ts` — orchestrator (schema verify + backfill + re-chunk TODO)

Test plan
- `bun run typecheck` clean
- `gbrain doctor` reports schema_version >= 33 after upgrade
- `takes_holders=['world']` filters takes_list/search end-to-end through dispatch
- `takes_fence_chunk_leak` invariant: zero rows in content_chunks contain the fence marker

🤖 Generated with Claude Code
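One way to picture the `takes_holders=['world']` filtering exercised in the test plan — a hypothetical in-memory version; in gbrain the filtering happens in SQL inside the engine (the `takesHoldersAllowList` option on listTakes/searchTakes):

```typescript
// Hypothetical in-memory picture of the allow-list contract: an MCP-bound
// token whose permissions backfilled to {"takes_holders":["world"]} never
// sees non-world holders, while an unscoped (non-MCP) call sees everything.
// The real filtering is done in SQL by the engine, not in application code.
interface Take {
  holder: string; // 'world' | 'garry' | 'brain' | a page slug
  text: string;
}

function visibleTakes(takes: Take[], allowList: string[] | null): Take[] {
  if (allowList === null) return takes; // unscoped call: no restriction
  const allowed = new Set(allowList);
  return takes.filter((t) => allowed.has(t.holder));
}
```

This is the contract the dispatch-level e2e check pins: with the default allow-list, private hunches attributed to `garry` or `brain` simply never appear in takes_list/takes_search results.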
Need help on this PR? Tag @codesmith with what you need.