Skillpack Section 16: Deterministic Collectors — Code for Data, LLMs for Judgment #13

Merged
garrytan merged 1 commit into master from skillpack-section-16-deterministic-collectors
Apr 9, 2026

@garrytan garrytan commented Apr 9, 2026

New section for the GBrain Skillpack.

Pattern: When LLMs keep failing at mechanical formatting (links, URLs, IDs), move that work to a deterministic script. Feed the LLM pre-formatted data. Code handles integrity, LLM handles judgment.

Example: Email inbox triage where Gmail links kept getting dropped. Fixed by a Node.js collector that generates links from message IDs — code cannot forget a URL.

All examples use user@example.com placeholder data. No real PII.
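
A minimal sketch of the collector pattern described above, assuming a hypothetical `RawMessage` shape and the commonly seen Gmail web URL layout (`/mail/u/<account>/#inbox/<id>`); neither is taken from the skillpack itself. The point is only that the link is derived from the message ID in code, so the LLM receives it pre-formatted and never has to reproduce it.

```typescript
// Hypothetical shape of a message returned by a mail API; only `id` matters here.
interface RawMessage {
  id: string;
  from: string;
  subject: string;
}

interface CollectedMessage extends RawMessage {
  link: string;
}

// Assumed URL pattern for opening a message in the Gmail web UI.
function gmailLink(messageId: string, accountIndex = 0): string {
  return `https://mail.google.com/mail/u/${accountIndex}/#inbox/${encodeURIComponent(messageId)}`;
}

// Deterministic step: every message gets a link, with no model involved.
function collect(messages: RawMessage[]): CollectedMessage[] {
  return messages.map((m) => ({ ...m, link: gmailLink(m.id) }));
}

// Render the pre-formatted block that is handed to the LLM for triage.
function renderForPrompt(messages: CollectedMessage[]): string {
  return messages.map((m) => `- [${m.subject}](${m.link}) from ${m.from}`).join("\n");
}
```

The LLM is then asked only to rank or summarize the rendered lines, copying each line through unchanged.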

…dgment

Pattern for when LLMs keep failing at mechanical formatting tasks despite
prompt fixes: move the mechanical work to deterministic code and feed the
LLM pre-formatted data. Real example: email URL generation.
@garrytan garrytan merged commit 00217fe into master Apr 9, 2026
2 checks passed
garrytan added a commit that referenced this pull request Apr 28, 2026
garrytan added a commit that referenced this pull request Apr 30, 2026
)

* feat: storage tiering — git-tracked vs supabase-only directories

Brain repos are scaling to 200K+ files. Bulk data (tweets, articles, transcripts)
bloats git repos and slows operations. New storage config in gbrain.yml lets
users declare git-tracked and supabase-only directories.

Changes:
- New config: storage.git_tracked and storage.supabase_only in gbrain.yml
- gbrain sync auto-manages .gitignore for supabase-only paths
- gbrain export --restore-only restores missing supabase-only files from DB
- New gbrain storage status command shows tier breakdown
- Config validation warns on conflicts
- 8 tests passing, full docs at docs/storage-tiering.md

Backward compatible — systems without gbrain.yml work unchanged.

* feat: add getDefaultSourcePath() typed accessor (step 1/15)

Single source of truth for "what brain repo are we operating against?"
Replaces ad-hoc raw SQL in storage.ts:38 (Issue #3 of eng review). Used by
both gbrain storage status and gbrain export --restore-only.

Returns null on miss, throws on DB error. Composes with the existing
resolveSourceId chain so it honors --source flag / GBRAIN_SOURCE env /
.gbrain-source dotfile / longest-prefix CWD match / brain-level default.

4 new test cases covering happy path, missing local_path, DB error
propagation, and CWD-prefix resolution priority.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: replace gray-matter with dedicated YAML parser (step 2/15)

The original storage-config.ts called gray-matter on a delimiter-less YAML
file. Gray-matter only parses YAML inside `---` frontmatter blocks; without
delimiters, it returns `{data: {}}`. Result: loadStorageConfig() always
returned null, the entire feature was a silent no-op for every user.

Original eng review's P0 confidence-9 finding (Issue #1).

Replaces gray-matter with a small dedicated parser for the gbrain.yml shape
(top-level `storage:` section, two array-valued nested keys). Yaml-lite was
considered first, but its flat key:value design doesn't handle nested
arrays. The dedicated parser is ~50 lines and trades expressiveness for
zero-dep, predictable parsing of a file format we control.
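
A rough sketch of what such a dedicated parser could look like, assuming the file shape described above (a top-level `storage:` key with array-valued nested keys). The function name `parseStorageYaml` and its return type are illustrative, not the actual storage-config.ts code, and comment stripping is deliberately naive.

```typescript
type StorageSection = Record<string, string[]>;

// Parses only the narrow shape:
//   storage:
//     some_key:
//       - value
// Returns null when no storage section with keys is found.
function parseStorageYaml(text: string): StorageSection | null {
  const section: StorageSection = {};
  let inStorage = false;
  let current: string[] | null = null;

  for (const rawLine of text.split(/\r?\n/)) {
    // Naive comment stripping: assumes " #" never appears inside a value.
    const line = rawLine.replace(/\s+#.*$|^#.*$/, "").trimEnd();
    if (line.trim() === "") continue;

    const indent = line.length - line.trimStart().length;
    const body = line.trim();

    if (indent === 0) {
      // Any top-level key other than `storage:` ends the section.
      inStorage = body === "storage:";
      current = null;
      continue;
    }
    if (!inStorage) continue;

    const keyMatch = /^([A-Za-z_][\w-]*):\s*(\[\s*\])?$/.exec(body);
    if (keyMatch) {
      current = [];
      section[keyMatch[1]] = current;
      continue;
    }
    const itemMatch = /^-\s+(.+)$/.exec(body);
    if (itemMatch && current) {
      // Strip one pair of matching single or double quotes.
      current.push(itemMatch[1].replace(/^(['"])(.*)\1$/, "$2"));
    }
  }
  return Object.keys(section).length > 0 ? section : null;
}
```

Unlike a frontmatter parser, this reads a delimiter-less file directly, which is the case the original implementation silently missed.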

Adds the Issue #1B sanity warning (locked B): when gbrain.yml exists but
has no storage section (or empty arrays), warn once-per-process so the
user sees their config didn't take. The single test that would have caught
the original P0 — write a real gbrain.yml, call loadStorageConfig, assert
non-null — now exists.

Also tightens loadStorageConfig per D36: distinguishes "absent" (silent
null) from "unreadable" (throws). The previous code silently swallowed
read errors, hiding broken installs.

8 new test cases: real-disk happy path, comments + blank lines, quoted
values, missing storage section warning, empty section warning,
once-per-process warning suppression, unreadable file behavior, and the
existing helper tests (validation, tier matching, edge cases) all still
pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: rename storage keys to db_tracked/db_only (step 3/15)

The vendor-specific names "supabase_only" and "git_tracked" hardcoded a
backend (Supabase) into the config schema. gbrain ships two engines —
PGLite and Postgres-via-Supabase. The canonical distinction is "lives in
the brain DB only" vs "lives in the brain DB and on disk under git." Both
work on either engine.

Renamed throughout (Issue #4 of eng review):
  git_tracked    → db_tracked
  supabase_only  → db_only
  isGitTracked() → isDbTracked()
  isSupabaseOnly() → isDbOnly()
  StorageTier 'git_tracked'/'supabase_only' → 'db_tracked'/'db_only'

Backward compatibility (D3 lock):
  loadStorageConfig accepts both shapes. Loader resolution order per the
  eng-review pass-2 finding: parse YAML → if canonical keys present use
  them, else if deprecated keys present map to canonical AND emit
  once-per-process deprecation warning → THEN run validation.
  Validation always sees the canonical shape so error messages reference
  db_tracked/db_only regardless of which keys the user wrote.

  The deprecation warning suggests `gbrain doctor --fix` for an automated
  rename (D72 — fix path lands in step 7).

  When both shapes coexist in one file, canonical wins and a stronger
  warning fires ("deprecated keys ignored — remove them").
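
The resolution order above can be sketched roughly as follows. `resolveStorageKeys`, `RawStorage`, and the warning texts are hypothetical; the once-per-process suppression and the later validation step are left out to keep the example short.

```typescript
interface StorageConfig {
  db_tracked: string[];
  db_only: string[];
}

// Parsed YAML may carry canonical keys, deprecated keys, or both.
type RawStorage = Partial<
  Record<"db_tracked" | "db_only" | "git_tracked" | "supabase_only", string[]>
>;

function resolveStorageKeys(raw: RawStorage, warn: (msg: string) => void): StorageConfig {
  const hasCanonical = raw.db_tracked !== undefined || raw.db_only !== undefined;
  const hasDeprecated = raw.git_tracked !== undefined || raw.supabase_only !== undefined;

  if (hasCanonical) {
    // Canonical keys win; deprecated keys in the same file are ignored.
    if (hasDeprecated) warn("deprecated keys ignored; remove them");
    return { db_tracked: raw.db_tracked ?? [], db_only: raw.db_only ?? [] };
  }
  if (hasDeprecated) {
    // Map deprecated names onto the canonical shape before validation runs.
    warn("git_tracked/supabase_only are deprecated; rename to db_tracked/db_only");
    return { db_tracked: raw.git_tracked ?? [], db_only: raw.supabase_only ?? [] };
  }
  return { db_tracked: [], db_only: [] };
}
```

Because the function always returns the canonical shape, any validator that runs afterwards never needs to know the deprecated names.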

The aliases isGitTracked/isSupabaseOnly are kept for now to avoid churning
the sync.ts / export.ts / storage.ts call sites in this commit; they'll be
removed in a follow-up step. Storage.ts's tier-bucket initializers and
output strings are updated. ASCII output replaces unicode box-drawing per D10.

gbrain.yml example file updated to canonical keys with explanatory
comments.

2 new test cases: deprecated-key fallback (asserts both shapes load
correctly with warning), canonical-wins-over-deprecated (asserts the
"both shapes coexist" path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: add slugPrefix to PageFilters with engine-side filter (step 4/15)

Issue #13 of the eng review: storage.ts and export.ts loaded every page
in the brain (limit: 1_000_000) to check tier membership. On the 200K-page
brains this feature targets, that's the wall-clock and memory landmine
the feature exists to fix.

Adds an optional `slugPrefix` field to PageFilters. Both engines implement
it as `WHERE slug LIKE prefix || '%' ESCAPE '\'`, with literal escaping of
LIKE metacharacters (%, _, \) so user-supplied prefixes like `media/x/`
are treated as exact string prefixes.
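
For illustration, the escaping step might look like the sketch below. `escapeLikePrefix` and `slugPrefixClause` are hypothetical names; only the SQL shape (`LIKE prefix || '%' ESCAPE '\'`) comes from the description above.

```typescript
// Escape the three LIKE metacharacters (\, %, _) in a single pass so that a
// backslash already present in the prefix is not escaped twice.
function escapeLikePrefix(prefix: string): string {
  return prefix.replace(/[\\%_]/g, (ch) => `\\${ch}`);
}

// Hypothetical helper: returns a parameterized clause plus its bound value.
// The user-supplied prefix is never interpolated into the SQL text itself.
function slugPrefixClause(prefix: string, paramIndex: number): { sql: string; param: string } {
  return {
    sql: `slug LIKE $${paramIndex} || '%' ESCAPE '\\'`,
    param: escapeLikePrefix(prefix),
  };
}
```

With this, a prefix such as `safe_dir/` is bound as `safe\_dir/`, so the underscore matches only a literal underscore rather than any single character.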

Performance: the (source_id, slug) UNIQUE constraint on the pages table
gives both engines a btree index that supports LIKE-prefix range scans.
An EXPLAIN on Postgres confirms the index range scan rather than a seq
scan. PGLite has the same index shape via pglite-schema.ts.

Consumers updated:
  - export.ts: --slug-prefix flag now goes engine-side (no in-memory
    .filter(...)). The --restore-only path queries each db_only directory
    with slugPrefix in a loop instead of one full-table scan, with seen-set
    deduplication and disk-existence check inline.
  - storage.ts: keeps the full-scan path because storage-status needs the
    "unspecified" bucket count, which can't be computed without enumerating
    every page. Comment notes that step 5 (single-walk filesystem scan)
    will reduce per-page disk syscall cost.

2 new test cases on PGLiteEngine: slugPrefix happy path (3 tier dirs,
asserts only matching slugs return) and metacharacter escape regression
(asserts safe/ doesn't match unrelated slugs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* perf: single-walk filesystem scan via walkBrainRepo() (step 5/15)

Issue #14 of the eng review: storage.ts called existsSync + statSync
per-page in a synchronous loop. On a 200K-page brain that's 400K syscalls
serialized. Wall-clock landmine.

Adds src/core/disk-walk.ts with walkBrainRepo(repoPath) — one recursive
readdirSync walk, builds a Map<slug, {size, mtimeMs}>. Storage.ts looks
up each DB page in the map (O(1)) instead of stat-checking on demand.
Slug derivation matches the pages-table convention: people/alice.md on
disk becomes people/alice as the map key.

Skipped during walk:
  - dot-directories (.git, .gbrain, .vscode, etc) — not part of the brain
    namespace
  - node_modules — guards against accidentally walking into imported repos
  - non-.md files (sidecar JSON, binaries) — tracked by the brain through
    the files table, not by slug

Reusable: future commands (gbrain doctor's storage_tiering check, the
optional autopilot tier-fix path) get the same walk for free.

9 new test cases: empty dir, nonexistent dir, top-level files, nested
dirs, dot-dir skipping, node_modules skipping, non-.md filtering, size
capture, mtimeMs capture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: path-segment matching for tier directories (step 6/15)

Issue #5 + D6 of the eng review: tier matching used slug.startsWith(dir),
which falsely matches 'media/xerox/foo' against 'media/x' if a user wrote
the directory without a trailing slash.

The new matcher requires the configured directory to end with `/` and
treats it as a canonical path-segment ancestor:

  media/x/   matches  media/x/tweet-1       ✓
  media/x/   doesn't  media/xerox/foo       ✗
  media/x    refused  media/x/tweet-1       (matcher requires trailing /)

Non-canonical input (no trailing slash) is refused outright. Step 7's
auto-normalizing validator converts user-written 'media/x' → 'media/x/'
on load, so the matcher never sees non-canonical input from real configs.
The behavior tested here is the strict matcher's contract.
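
The strict matcher contract can be illustrated in a few lines. The name `matchesTierDir` is hypothetical, and "refused" is modeled here as returning `false`; the real matcher might signal refusal differently.

```typescript
// Matches only when `dir` is a canonical path-segment ancestor of `slug`.
function matchesTierDir(slug: string, dir: string): boolean {
  // Non-canonical input (no trailing slash) is refused outright.
  if (!dir.endsWith("/")) return false;
  // With the trailing slash in place, startsWith cannot match a sibling
  // directory that merely shares a name prefix, such as media/xerox/.
  return slug.startsWith(dir);
}
```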

Regression test pins the media/xerox collision case explicitly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: auto-normalize trailing-slash, throw on tier overlap (step 7/15)

D7+D8 of the eng review: validation was warnings-only. Users miss warnings.
Now:

  - Cosmetic: missing trailing slash auto-corrected, one-time info note
    showing what changed ("normalized 2 storage paths: 'people' →
    'people/', 'media/x' → 'media/x/'"). Once-per-process to keep noise low.

  - Semantic: same directory in both tiers throws StorageConfigError.
    Ambiguous routing — does media/ win as db_tracked or db_only? — is a
    real bug the user must fix. Caller propagates to the CLI for a clean
    exit-1 with actionable message.

loadStorageConfig now applies normalize+validate after merging deprecated
keys, so the path-segment matcher (step 6) only ever sees canonical
trailing-slash directories.
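
A sketch of the normalize-then-validate step, assuming a hypothetical `normalizeAndValidate` function and note format (ASCII arrows are used here). Only exact duplicates across tiers are treated as overlap in this example; the once-per-process suppression of the note is omitted.

```typescript
class StorageConfigError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "StorageConfigError";
  }
}

interface StorageConfig {
  db_tracked: string[];
  db_only: string[];
}

function normalizeAndValidate(config: StorageConfig): { config: StorageConfig; notes: string[] } {
  const changed: string[] = [];

  // Cosmetic fix: append the missing trailing slash and remember what changed.
  const fix = (dir: string): string => {
    if (dir.endsWith("/")) return dir;
    changed.push(`'${dir}' -> '${dir}/'`);
    return `${dir}/`;
  };
  const normalized: StorageConfig = {
    db_tracked: config.db_tracked.map(fix),
    db_only: config.db_only.map(fix),
  };

  // Semantic error: the same directory in both tiers makes routing ambiguous.
  const overlap = normalized.db_tracked.filter((d) => normalized.db_only.includes(d));
  if (overlap.length > 0) {
    throw new StorageConfigError(`directory listed in both tiers: ${overlap.join(", ")}`);
  }

  const notes =
    changed.length > 0 ? [`normalized ${changed.length} storage paths: ${changed.join(", ")}`] : [];
  return { config: normalized, notes };
}
```

Normalizing before the overlap check matters: `media` and `media/` only collide once both are in canonical form.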

The pure validateStorageConfig is kept for callers who want the warnings
list without the auto-fix side effects (gbrain doctor's reporting path).

2 new test cases: auto-normalize round-trip with warning text assertion,
overlap throws StorageConfigError.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: wire manageGitignore into runSync, only on success (step 8/15)

Issue #2 of the eng review: manageGitignore was defined and never
invoked. Docs claimed "auto-managed by gbrain" — false. Users hit a
.gitignore that never updated and committed db_only directories anyway.

Wire-up: runSync now calls manageGitignore after each successful
performSync return, in both watch and one-shot modes.

Eng review pass-2 finding #1: skip on dry_run AND blocked_by_failures
status. A sync that aborted partway has stale state; mutating .gitignore
based on a partially-loaded config invites drift. Failure-skip test
added (uses .gitignore-as-a-directory to simulate write failure;
asserts warning fired and disk wasn't corrupted).

Hardened manageGitignore itself with three additional behaviors:

  - GBRAIN_NO_GITIGNORE=1 escape hatch (D23) for shared-repo setups
    where a maintainer wants gbrain to leave .gitignore alone.

  - Submodule detection (D49). When repoPath/.git is a regular file
    (gitdir: ... pointer), the repo is a git submodule. Submodule
    .gitignore changes don't survive parent submodule updates, so we
    skip with an actionable warning ("add db_only directories to your
    parent repo's .gitignore manually").

  - Graceful failure (D9). Read errors, write errors, and
    StorageConfigError (overlap from step 7) all log a warning and
    return — sync's primary job (moving data) shouldn't die because of
    a side-effect on .gitignore.
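
One way to get the idempotency and rule preservation mentioned in this commit is to keep gbrain's entries inside a delimited block and rewrite only that block. The sketch below is an assumption about the approach: the marker strings and the `applyManagedBlock` name are invented for illustration, and the file I/O, submodule check, and GBRAIN_NO_GITIGNORE handling are left out.

```typescript
// Hypothetical markers delimiting the section that the tool owns.
const BEGIN = "# >>> gbrain db_only (managed) >>>";
const END = "# <<< gbrain db_only (managed) <<<";

// Pure function: takes current .gitignore text, returns the updated text.
function applyManagedBlock(existing: string, dbOnlyDirs: string[]): string {
  // No db_only entries: leave the file exactly as it is.
  if (dbOnlyDirs.length === 0) return existing;

  const lines = existing.split("\n");
  const start = lines.indexOf(BEGIN);
  const end = lines.indexOf(END);

  // Drop any previous managed block; everything else is user-written and kept.
  const kept =
    start !== -1 && end > start ? [...lines.slice(0, start), ...lines.slice(end + 1)] : lines;
  while (kept.length > 0 && kept[kept.length - 1] === "") kept.pop();

  const head = kept.length > 0 ? [...kept, ""] : [];
  return [...head, BEGIN, ...dbOnlyDirs, END].join("\n") + "\n";
}
```

Running the function twice with the same input yields the same text, so repeated syncs do not accumulate duplicate entries.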

manageGitignore is now exported (previously private) so the
storage-sync test file can hit it directly without spinning up sync.

9 new test cases: no-op without gbrain.yml, no-op with empty db_only,
happy-path append, idempotency (run twice, single entry), preservation
of user-written rules, GBRAIN_NO_GITIGNORE skip, submodule skip,
.git-directory normal path, write-failure graceful warning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: D5 resolution chain for --restore-only and storage status (step 9/15)

D5 of the eng review: gbrain export --restore-only without --repo
silently fell through to the regular export path, dumping every page in
the database to the wrong directory. Hard regression risk.

Now exits 1 with an actionable message when --restore-only has no
--repo AND no configured default source. Resolution order:
  1. Explicit --repo flag
  2. Typed sources.getDefault() (reuses step 1's accessor)
  3. Hard error — never fall through to cwd
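
The three-step order above, reduced to a sketch. `resolveRestoreRepo` and the error text are hypothetical, and the default source path is passed in as an already-resolved value so the example stays synchronous.

```typescript
// Resolution for --restore-only: flag, then configured default, then hard error.
function resolveRestoreRepo(repoFlag: string | undefined, defaultSourcePath: string | null): string {
  if (repoFlag) return repoFlag; // 1. explicit --repo wins
  if (defaultSourcePath) return defaultSourcePath; // 2. configured default source
  // 3. never fall through to the current working directory
  throw new Error("--restore-only requires --repo or a configured default source");
}
```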

storage.ts:38 also bypassed BrainEngine with raw SQL and a bare
try/catch (Issue #3 + Issue #9). Replaced with the same typed
getDefaultSourcePath() — single source of truth, errors propagate
cleanly to the user, no silent cwd fallback.

Regular export (no --restore-only) keeps its current behavior per D26:
exports include everything, --repo is optional.

4 new test cases on PGLite in-memory:
  - hard-errors with no --repo + no default
  - explicit --repo wins
  - falls back to sources default local_path
  - non-restore export does not require --repo

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: split storage.ts into pure data + JSON + human formatters (step 10/15)

Issue #10 of the eng review: getStorageStatus and runStorageStatus mixed
data gathering, JSON serialization, and human-readable output in one
function. Hard to test, hard to reuse, mismatched the orphans.ts pattern
that CLAUDE.md cites as the precedent.

Now three pure functions + a thin dispatcher:

  getStorageStatus(engine, repoPath) — async, returns StorageStatusResult.
    Side effects: engine.listPages + one walkBrainRepo (Issue #14).
    Exported so MCP exposure (D14) and gbrain doctor (D13) can consume the
    same data without re-running the loop.

  formatStorageStatusJson(result) — pure, returns indented JSON. Stable
    contract on the StorageStatusResult shape, suitable for orchestrators.

  formatStorageStatusHuman(result) — pure, returns ASCII text (D10 — no
    unicode box-drawing). Composable into other commands later.

  runStorageStatus(engine, args) — thin dispatcher: parses --repo /
    --json, calls getStorageStatus, picks a formatter, prints.

8 new test cases on the formatters: JSON parse round-trip, null-config
fallback, missing-files capped at 10 with rollup, ASCII-only assertion
(D10 regression guard), warnings inline, configuration listing, disk-
usage block omitted when zero bytes.

The StorageStatusResult interface is now exported as a public type, so
gbrain doctor's storage_tiering check can build its own findings from
the same shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* types: distinct PageCountsByTier and DiskUsageByTier (step 11/15)

Issue #11 of the eng review: pagesByTier (page counts) and
diskUsageByTier (byte totals) shared the same structural type
(Record<StorageTier, number>). Both are tier-keyed numeric maps but
carry semantically different units. A future bug that swaps them at a
call site (e.g., displaying disk bytes where the count belongs) wouldn't
trip the compiler.

Replaced with distinct nominal types via a brand field. Structurally
identical at runtime (no overhead) but compile-time disjoint —
TypeScript catches accidental cross-assignment.

  PageCountsByTier   { db_tracked, db_only, unspecified } : numbers (count)
  DiskUsageByTier    { db_tracked, db_only, unspecified } : numbers (bytes)
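
A sketch of how such branding is commonly done in TypeScript, assuming phantom `unique symbol` brand keys. The constructor helper names are hypothetical, and the actual commit may use a differently shaped brand field.

```typescript
type StorageTier = "db_tracked" | "db_only" | "unspecified";

// Phantom brands: declared for the type checker only, nothing is emitted at runtime.
declare const pageCountsBrand: unique symbol;
declare const diskUsageBrand: unique symbol;

type PageCountsByTier = Record<StorageTier, number> & { readonly [pageCountsBrand]: true };
type DiskUsageByTier = Record<StorageTier, number> & { readonly [diskUsageBrand]: true };

// The only sanctioned way to create each map; the cast attaches the brand.
function emptyPageCounts(): PageCountsByTier {
  return { db_tracked: 0, db_only: 0, unspecified: 0 } as PageCountsByTier;
}

function emptyDiskUsage(): DiskUsageByTier {
  return { db_tracked: 0, db_only: 0, unspecified: 0 } as DiskUsageByTier;
}

// The following line would be rejected by the compiler because the brands differ:
// const wrong: PageCountsByTier = emptyDiskUsage();
```

At runtime both values are plain three-key objects, so JSON output and arithmetic are unaffected.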

Both initialized in getStorageStatus, both threaded into
StorageStatusResult, both consumed by formatStorageStatusHuman /
formatStorageStatusJson without further changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: PGLite soft-warn + full lifecycle test (step 12/15)

D4: storage tiering on PGLite is a partial feature. The "DB" the pages
live in IS the local file gbrain uses for everything else, so "db_only"
has no real offload effect. The .gitignore management still helps
(keeps bulk content out of git history), so we warn and proceed —
not refuse.

Two warning sites (once-per-process each via module-local flags):
  - storage status: warns at runStorageStatus entry
  - sync: warns inside manageGitignore when engineKind='pglite' and
    config has db_only entries

Both phrased actionably ("To get full tiering, migrate to Postgres
with `gbrain migrate --to supabase`").

manageGitignore signature now takes an optional `engineKind` param.
runSync passes engine.kind. Stand-alone callers (tests, future
gbrain doctor --fix path) can omit it.

New test: test/storage-pglite.test.ts — D8 + D4 lifecycle. 6 cases:
engine.kind assertion, getStorageStatus loading gbrain.yml + reporting
tier counts, manageGitignore PGLite-warn (once per process), Postgres
no-warn, slugPrefix on PGLite, end-to-end (config + putPage + status
+ gitignore).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: add trailing-newline CI guard (step 14/15)

Issue #7 of the eng review: all four new files in the original
storage-tiering branch lacked POSIX trailing newlines. Linters complain,
git diffs phantom-flag every future edit. We've been adding newlines as
each file landed; this commit catches the regression class.

scripts/check-trailing-newline.sh:
  - sibling to check-jsonb-pattern.sh / check-progress-to-stdout.sh per
    CLAUDE.md's CI guard pattern
  - portable to bash 3.2 (macOS default; no mapfile, no associative arrays)
  - covers src/**, test/**, gbrain.yml, top-level *.md
  - reports each missing file by path and exits 1

Wired into `bun run test` between progress-to-stdout and typecheck.

Also fixed docs/storage-tiering.md (pre-existing missing newline from
the original branch — caught by the new guard on first run).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.23.0 — VERSION, CHANGELOG, README, CLAUDE.md, storage-tiering.md (step 15/15)

VERSION → 0.23.0 (minor bump for new feature surface).

CHANGELOG entry in Garry voice with the canonical format:
  - Two-line bold headline ("Storage tiering, finally working...")
  - Lead paragraph naming what was broken before and what users get now
  - "Numbers that matter" before/after table for the 6 things that
    actually changed
  - "What this means for your brain" closer
  - "To take advantage of v0.23.0" self-repair block (per CLAUDE.md
    convention) — 6 numbered steps users can follow
  - Itemized changes split into critical fixes / new+renamed surface /
    architecture cleanup / tests + CI guards

CLAUDE.md "Key files" gains four new entries: storage-config.ts,
disk-walk.ts, the v0.23.0 storage.ts shape, and gbrain.yml itself.

README.md gains a new "Storage tiering" section between Skillify and
Getting Data In with the canonical example + commands + link to the
full guide.

docs/storage-tiering.md rewritten end-to-end with canonical key names
(db_tracked / db_only), v0.23.0 hardening details (idempotency,
submodule detection, GBRAIN_NO_GITIGNORE, dry-run gating), the
resolution chain for --restore-only, the auto-normalize +
throw-on-overlap validator, and the PGLite engine note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: e2e Postgres lifecycle for storage tiering (step 16/16)

Per the v0.23.0 plan: full lifecycle E2E against real Postgres.

  - engine.kind === 'postgres' assertion
  - Full lifecycle: write 4 pages (1 db_tracked, 2 db_only, 1 unspecified)
    → getStorageStatus reports correct tier counts → human formatter
    renders → manageGitignore writes managed block → idempotency check
    → getDefaultSourcePath() resolves the configured local_path.
  - Container restart simulation: 2 db_only pages in DB, files missing
    on disk → status.missingFiles.length === 2 → slugPrefix engine
    filter on Postgres returns exactly the tier slugs.
  - slugPrefix index-based range scan regression: 50 media/x/* + 50
    people/p-* pages → slugPrefix='media/x/' returns exactly 50.
  - getDefaultSourcePath returns null when default source has no
    local_path (the hard-error path that replaces the original silent
    cwd fallback).
  - manageGitignore on Postgres engine does NOT emit the PGLite
    soft-warn (cross-engine assertion).

Skips gracefully when DATABASE_URL is unset, per CLAUDE.md E2E pattern.
Run via: DATABASE_URL=... bun test test/e2e/storage-tiering.test.ts

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: rebump version 0.23.0 → 0.22.9

Reverts the minor bump back to a patch-style version on the v0.22 line.
Storage tiering ships within the v0.22.x train alongside the recent
fix waves. Updates VERSION, package.json, CHANGELOG header + body refs,
CLAUDE.md Key files annotations, README.md section heading, and the
docs/storage-tiering.md backward-compat note.

* chore: bump version 0.22.9 → 0.22.11

Sibling workspaces claimed v0.22.10 in the queue. This branch advances
to v0.22.11 to keep the version monotonic on master.

Updates VERSION, package.json, CHANGELOG header + body refs, CLAUDE.md
Key files annotations, README.md section heading, and the
docs/storage-tiering.md backward-compat note.

* fix: address Codex pre-landing review findings (4 fixes)

Codex found 4 real issues during pre-landing review of v0.22.11 diff:

[P0] export --restore-only fell through to full export when
storageConfig was null (no gbrain.yml present). On older or
misconfigured brains, the recovery command would silently dump the
entire database. src/commands/export.ts now refuses with an actionable
error before any page query fires — matches the D5 lock spirit
("never silently fall through").

[P1] manageGitignore wire-up only fired when --repo was passed
explicitly. performSync resolves the repo from sync.repo_path or
sources.local_path, so the common `gbrain sync` path (after
setup, no flag) never updated .gitignore. src/commands/sync.ts now
uses the same source-resolver chain as the rest of /ship: opts.repoPath
→ getDefaultSourcePath → null. Fires in both watch and one-shot modes.

[P2] getDefaultSourcePath only consulted sources.local_path, missing
the legacy global sync.repo_path config key that pre-v0.18 brains use.
Added a fallback to engine.getConfig('sync.repo_path') when the
sources row has NULL local_path. Pre-v0.18 brains now work without
forcing a `gbrain sources add . --path .` migration.

[P2] sync --all multi-source loop never called manageGitignore even
though src.local_path was already known. Each source now gets its own
gitignore update on successful sync.

Tests:
  - test/storage-export.test.ts: replaced the old "falls through to
    full export" test with one that asserts the new refusal path
    (storage-tiering config required for --restore-only).
  - test/source-resolver.test.ts: added a fallback test exercising the
    legacy sync.repo_path code path for pre-v0.18 brains.
  - All 78 storage-tiering tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate llms.txt + llms-full.txt for v0.22.11

Per CLAUDE.md: "Run `bun run build:llms` after adding a new doc."
The README's new Storage tiering section + the rewritten
docs/storage-tiering.md changed the inlined bundle. test/build-llms.test.ts
catches the drift and was failing on master pre-regen.

* fix: typecheck error in disk-walk.ts (CI #73350475897)

tsc --noEmit failed in CI because ReturnType<typeof readdirSync> with
withFileTypes:true picks an overload union that includes
Dirent<Buffer<ArrayBufferLike>>. Strict tsc treats entry.name as Buffer,
so .startsWith / .endsWith / string comparisons all blew up.

Annotate the variable as Dirent[] (string-based) and cast through unknown,
matching the pattern sync.ts already uses for its own filesystem walk.
Same runtime behavior; clean typecheck.

Tests still 9/9.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: root <root@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
FUSED-ID pushed a commit to FUSED-ID/gbrain-cli-fork that referenced this pull request May 3, 2026
…, resolve, synthesis_evidence

garrytan added a commit that referenced this pull request May 7, 2026
…low-list (#563)

* v0.28 schema: takes + synthesis_evidence (v31) + access_tokens.permissions (v32)

Migration v31 adds the takes table (typed/weighted/attributed claims) and
synthesis_evidence (provenance for `gbrain think` outputs). Page-scoped via
page_id FK (slug isn't unique alone in v0.18+ multi-source). HNSW partial
index on embedding for active rows. ON DELETE CASCADE on synthesis_evidence
so deleting a source take cascades the provenance row.

Migration v32 adds access_tokens.permissions JSONB with safe-default
backfill (`{"takes_holders":["world"]}`). Default keeps non-world holders
hidden from MCP-bound tokens until the operator explicitly grants access
via the v0.28 auth permissions CLI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 engine: addTakesBatch, listTakes, searchTakes/Vector, supersede, resolve, synthesis_evidence

Extends BrainEngine with the takes domain object. Both engines implement the
same surface; PGLite uses manual `$N` placeholders, Postgres uses postgres-js
unnest() — same shape as addLinksBatch and addTimelineEntriesBatch.

Methods:
- addTakesBatch (upsert via ON CONFLICT (page_id, row_num) DO UPDATE)
- listTakes (filter by holder/kind/active/resolved, takesHoldersAllowList
  for MCP-bound calls, sortBy weight/since_date/created_at)
- searchTakes / searchTakesVector (pg_trgm + cosine; honor allow-list)
- countStaleTakes / listStaleTakes (mirror countStaleChunks pattern;
  embedding column intentionally omitted from listStale payload)
- updateTake (mutable fields only; throws TAKE_ROW_NOT_FOUND)
- supersedeTake (transactional: insert new at next row_num, mark old
  active=false, set superseded_by; throws TAKE_RESOLVED_IMMUTABLE on
  resolved bets)
- resolveTake (sets resolved_*; throws TAKE_ALREADY_RESOLVED on re-resolve;
  resolution is immutable per Codex P1 #13 fold)
- addSynthesisEvidence (provenance persist; ON CONFLICT DO NOTHING)
- getTakeEmbeddings (parallel to getEmbeddingsByChunkIds)

Types live in src/core/engine.ts adjacent to LinkBatchInput. Page-scoped
via page_id (slug not unique in v0.18+ multi-source). PageType gains
'synthesis'. takeRowToTake mapper in utils.ts handles Date → ISO string
normalization.

Tests: test/takes-engine.test.ts — 16 cases against PGLite covering
upsert/list/filter/search happy paths, takesHoldersAllowList isolation,
the four invariant errors (TAKE_ROW_NOT_FOUND, TAKES_WEIGHT_CLAMPED,
TAKE_RESOLVED_IMMUTABLE, TAKE_ALREADY_RESOLVED), supersede flow, resolve
metadata round-trip, FK CASCADE on synthesis_evidence when source take
deletes. All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 model-config: unified resolveModel with 6-tier precedence + alias resolution

Replaces every hardcoded `claude-*-X` and per-phase `dream.<phase>.model`
config key with a single resolver. Hierarchy:

  1. CLI flag (--model)
  2. New-key config (e.g. models.dream.synthesize)
  3. Old-key config (deprecated dream.synthesize.model, dream.patterns.model)
     — read with stderr deprecation warning, one-per-process
  4. Global default (models.default)
  5. Env var (GBRAIN_MODEL or caller-supplied)
  6. Hardcoded fallback

Aliases (`opus`, `sonnet`, `haiku`, `gemini`, `gpt`) resolve at the end so
any tier can use a short name. User-defined `models.aliases.<name>` config
overrides built-ins. Cycle-safe (depth 2 break). Unknown alias passes
through unchanged so users can pass full provider IDs without registering.

When new-key + old-key are BOTH set (Codex P1 #11 fix), new-key wins and
stderr warns "deprecated config X ignored; Y is set and wins". When only
old-key is set, it's honored with a softer "rename to Y before v0.30"
warning. Both warnings emit once per (key, process) — a Set memo prevents
log spam in long-running daemons.
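
The precedence and alias rules above can be sketched roughly as follows. This is an illustrative sketch, not the actual resolver: `ModelSources`, `resolveModelSketch`, and the placeholder alias targets are assumptions, and "depth 2 break" is read here as "follow at most two alias hops".

```typescript
// Hypothetical input shape: one optional value per precedence tier.
interface ModelSources {
  cliFlag?: string;       // tier 1: --model
  newKey?: string;        // tier 2: e.g. models.dream.synthesize
  oldKey?: string;        // tier 3: deprecated dream.<phase>.model
  globalDefault?: string; // tier 4: models.default
  envVar?: string;        // tier 5: GBRAIN_MODEL
  fallback: string;       // tier 6: hardcoded fallback
}

// Placeholder targets only; these are NOT real provider model IDs.
const BUILTIN_ALIASES = new Map<string, string>([
  ["opus", "provider/opus-placeholder"],
  ["sonnet", "provider/sonnet-placeholder"],
  ["haiku", "provider/haiku-placeholder"],
]);

// User aliases override built-ins; unknown names pass through unchanged.
function resolveAlias(
  name: string,
  userAliases: Map<string, string> = new Map(),
): string {
  const table = new Map<string, string>([...BUILTIN_ALIASES, ...userAliases]);
  let current = name;
  // Assumed reading of "depth 2 break": stop after two hops so a cycle cannot loop forever.
  for (let hop = 0; hop < 2; hop++) {
    const next = table.get(current);
    if (next === undefined) return current;
    current = next;
  }
  return current;
}

// First non-empty tier wins; aliases resolve at the very end.
function resolveModelSketch(
  src: ModelSources,
  userAliases?: Map<string, string>,
): string {
  const tiers = [src.cliFlag, src.newKey, src.oldKey, src.globalDefault, src.envVar];
  const chosen = tiers.find((v) => v !== undefined && v !== "") ?? src.fallback;
  return resolveAlias(chosen, userAliases);
}
```

Resolving aliases last means every tier, including the env var and the hardcoded fallback, can use a short name. The sketch leaves out the deprecation warnings.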

Migrated call sites: synthesize.ts (model + verdictModel), patterns.ts
(model). subagent.ts and search/expansion.ts to be migrated later in v0.28
(staying compatible until then).

Tests: test/model-config.test.ts — 11 cases pinning the 6-tier ordering,
alias resolution + cycle break, deprecated-key warning emit-once, and
unknown-alias pass-through. All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 takes-fence: parser/renderer/upserter + chunker strip (privacy P0 fix)

src/core/takes-fence.ts — pure functions for the fenced markdown surface:
- parseTakesFence(body) — extracts ParsedTake[] from `<!--- gbrain:takes:begin/end -->`
  blocks. Strict on canonical form, lenient on hand-edits with warnings
  (TAKES_FENCE_UNBALANCED, TAKES_TABLE_MALFORMED, TAKES_ROW_NUM_COLLISION).
  Strikethrough `~~claim~~` → active=false; date ranges `since → until`
  split into sinceDate/untilDate.
- renderTakesFence(takes) — round-trip safe with parseTakesFence.
- upsertTakeRow(body, row) — append-only per CEO-D6 + eng-D9. Creates a
  fresh `## Takes` section if no fence present. row_num is monotonic
  (max + 1, never gap-filled — keeps cross-page refs and synthesis_evidence
  stable forever).
- supersedeRow(body, oldRow, replacement) — strikes through old row's claim
  AND appends the new row at end. Both rows preserved in markdown for
  git-blame archaeology.
- stripTakesFence(body) — removes the fenced block entirely. Used by the
  chunker so takes content lives ONLY in the takes table.
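
A minimal sketch of the strip step and the monotonic row_num rule, for illustration only. The exact marker strings, the function names, and the choice to start numbering at 1 are assumptions; the real takes-fence.ts may differ.

```typescript
// Assumed marker spelling, inferred from the begin/end shorthand above.
const TAKES_FENCE_BEGIN = "<!--- gbrain:takes:begin -->";
const TAKES_FENCE_END = "<!--- gbrain:takes:end -->";

// Remove every balanced fenced block; leave unbalanced input untouched.
function stripTakesFenceSketch(body: string): string {
  let out = body;
  for (;;) {
    const start = out.indexOf(TAKES_FENCE_BEGIN);
    if (start === -1) return out;
    const end = out.indexOf(TAKES_FENCE_END, start);
    if (end === -1) return out; // unbalanced fence: do not guess where it ends
    out = out.slice(0, start) + out.slice(end + TAKES_FENCE_END.length);
  }
}

// Append-only numbering: always max + 1, never reuse a gap left by a hand-edit.
function nextRowNum(existingRowNums: number[]): number {
  return existingRowNums.length === 0 ? 1 : Math.max(...existingRowNums) + 1;
}
```

With rows 1, 2, and 5 present, the next row is 6 rather than 3, so references to earlier row numbers keep pointing at the same claim.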

Codex P0 #3 fix: src/core/chunkers/recursive.ts now calls stripTakesFence()
before computing chunk boundaries. Without this, page chunks would contain
the rendered takes table and the per-token MCP allow-list would be
bypassed at the index layer (token bound to takes_holders=['world'] would
see garry's hunches via page hits). Doctor's takes_fence_chunk_leak check
(plan-side) asserts no chunk contains the begin marker.

Tests: 15 cases covering canonical parse, strikethrough, date range, fence
unbalanced detection, malformed-row skip + warning, row_num collision
detection, round-trip render, append-only upsert into existing fence,
fresh-section creation, monotonic row_num under hand-edit gaps, supersede
flow, stripTakesFence verifying takes content removed AND surrounding
prose preserved. Existing chunker tests still pass (15 + 15 = 30).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 page-lock: PID-liveness file lock for atomic markdown read-modify-write

src/core/page-lock.ts — per-page file lock at
~/.gbrain/page-locks/<sha256-of-slug>.lock so two concurrent `gbrain takes
add` calls or `takes seed --refresh` from autopilot can't race on the
same `<slug>.md` read-modify-write. Eng-review fold: reuses the v0.17
cycle.lock pattern (mtime + PID liveness) but per-slug.

Differences from cycle.ts's lock:
- SHA-256 of slug for safe filenames (slashes, unicode, etc.)
- Same-pid + fresh mtime = LIVE (cycle.ts assumes one lock per process and
  reclaims same-pid; page-lock allows concurrent locks for DIFFERENT slugs
  in one process). mtime expiry still rescues post-crash leftovers.
- 5-min TTL (vs cycle's 30 min — page edits are short)
- `withPageLock(slug, fn)` convenience wrapper with default 30s timeout

API:
- acquirePageLock(slug, opts) → handle | null (poll-with-timeout)
- handle.refresh() / handle.release() (idempotent — only releases if pid matches)
- withPageLock(slug, fn, opts) — acquire + run + release-in-finally
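
The liveness rule (fresh mtime plus a live PID) could look roughly like this. It is a sketch under assumptions: `LockInfo`, `isLockLive`, and the injected `pidAlive` parameter are hypothetical names, and the real page-lock.ts likely handles more edge cases.

```typescript
import { createHash } from "node:crypto";

// 5-minute TTL, as stated above.
const PAGE_LOCK_TTL_MS = 5 * 60 * 1000;

// Hash the slug so slashes and unicode never reach the filesystem.
function lockFileName(slug: string): string {
  return createHash("sha256").update(slug).digest("hex") + ".lock";
}

// Signal 0 probes for existence without delivering a signal.
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// Hypothetical shape for what is read back from a lock file.
interface LockInfo {
  pid: number;
  mtimeMs: number;
}

// A lock is live only if it is both fresh and held by a running process.
function isLockLive(
  lock: LockInfo,
  nowMs: number,
  pidAlive: (pid: number) => boolean = isPidAlive,
): boolean {
  if (nowMs - lock.mtimeMs > PAGE_LOCK_TTL_MS) return false; // stale: reclaimable
  return pidAlive(lock.pid);
}
```

Checking mtime before the PID means a crashed process whose PID has since been reused still loses the lock once the TTL expires.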

Tests: 10 cases — fresh acquire, live holder returns null, stale-mtime
reclaim, dead-PID reclaim, refresh updates timestamp, foreign-pid release
is no-op, withPageLock callback runs and releases on success/failure,
timeout-throws when held, SHA-256 filename safety for slashes/unicode.
All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 extract-takes: dual-path phase (fs|db) + since/until_date as TEXT

src/core/cycle/extract-takes.ts — new phase that materializes the takes
table from fenced markdown blocks. Two paths mirror src/commands/extract.ts:

- extractTakesFromFs: walk *.md under repoPath, parse fences, batch upsert
- extractTakesFromDb: iterate engine.getAllSlugs(), parse each page's
  compiled_truth+timeline, batch upsert (mutation-immune snapshot iteration)

Single dispatcher extractTakes(opts) routes by source. Honors:
- slugs filter for incremental re-extract (pipes from sync→extract)
- dryRun: count would-be upserts, write nothing
- rebuild: DELETE FROM takes WHERE page_id = $1 before re-insert (clean
  slate when markdown is canonical and DB has drifted)

Schema fix: since_date/until_date were DATE in the original v31 migration.
Spec uses partial dates ('2017-01', '2026-04-29 → 2026-06') that Postgres
DATE rejects. Changed to TEXT in both the Postgres and PGLite blocks so
parser-rendered ranges round-trip cleanly. Loses the ability to do
date-range arithmetic in SQL, but date math on opinion timelines is
out of scope for v0.28 anyway. utils.ts dateOrNull now annotated as
v0.28 TEXT-aware.

Migration v31 has not been deployed yet (this branch is the v0.28 release
candidate), so the type swap is free. No data migration needed.

Tests: test/extract-takes.test.ts — 5 cases against PGLite covering full
walk + fence-skip on no-fence pages, takes-table populated post-extract,
incremental slugs filter, dry-run no-write, rebuild=true clears + re-inserts
ad-hoc rows. test/takes-engine.test.ts (16), test/takes-fence.test.ts (15)
all still pass — 36/36 takes tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 takes CLI: list, search, add, update, supersede, resolve

src/commands/takes.ts — surfaces the engine methods + takes-fence library
through a single `gbrain takes <subcommand>` entrypoint:

  takes <slug>                          list with filters + sort
  takes search "<query>"                pg_trgm keyword search across all takes
  takes add <slug> --claim ... ...      append (markdown + DB, atomic via lock)
  takes update <slug> --row N ...       mutable-fields update (markdown + DB)
  takes supersede <slug> --row N ...    strikethrough old + append new
  takes resolve <slug> --row N --outcome  record bet resolution (immutable)

Markdown is canonical. Every mutate command:
  1. acquires the per-page file lock (withPageLock)
  2. re-reads the .md file
  3. applies the edit via takes-fence (upsertTakeRow / supersedeRow)
  4. writes the .md file back
  5. mirrors to the DB via the engine method
  6. releases the lock (auto via finally)

Resolve currently writes only to DB — surfacing resolved_* in the markdown
table is deferred to v0.29 (the takes-fence renderer's column set is
fixed at # | claim | kind | who | weight | since | source per spec).

Wired into src/cli.ts dispatch + CLI_ONLY allowlist. Help text follows the
project convention (orphans/embed/extract pattern). --dir flag overrides
sync.repo_path config when working outside the configured brain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 MCP + auth: takes_list / takes_search / think ops + per-token allow-list

OperationContext gains takesHoldersAllowList — server-side filter for
takes.holder field threaded from access_tokens.permissions through dispatch
into the engine SQL. Closes Codex P0 #3 at the dispatch layer (chunker
strip already closed the page-content side in the previous commit).

src/core/operations.ts — three new ops:
- takes_list: lists takes with holder/kind/active/resolved filters; honors
  ctx.takesHoldersAllowList for MCP-bound calls
- takes_search: pg_trgm keyword search; honors allow-list
- think: op surface registered (returns not_implemented envelope until
  Lane D's pipeline lands). Remote callers cannot save/take per Codex P1 #7.

src/mcp/dispatch.ts — DispatchOpts.takesHoldersAllowList threads into
buildOperationContext.

src/mcp/http-transport.ts — validateToken now reads
access_tokens.permissions.takes_holders, defaults to ['world'] when the
column is absent or malformed (default-deny on private hunches).
auth.takesHoldersAllowList passed to dispatchToolCall.
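
As a rough illustration of the default-deny parsing and the filter it feeds, assuming hypothetical helper names (`parseTakesHolders`, `filterByHolder`) and an in-memory filter in place of the engine SQL:

```typescript
// Anything absent or malformed collapses to the public-only default.
function parseTakesHolders(permissions: unknown): string[] {
  const fallback = ["world"];
  if (typeof permissions !== "object" || permissions === null) return fallback;
  const raw = (permissions as Record<string, unknown>)["takes_holders"];
  if (!Array.isArray(raw)) return fallback;
  if (!raw.every((h): h is string => typeof h === "string")) return fallback;
  // Assumption: a well-formed list is honored as-is, even when empty.
  return raw;
}

// undefined allow-list models the local CLI path (remote=false): no filtering.
function filterByHolder<T extends { holder: string }>(
  takes: T[],
  allowList?: string[],
): T[] {
  if (allowList === undefined) return takes;
  return takes.filter((t) => allowList.includes(t.holder));
}
```

A token whose list matches no holder gets an empty result rather than falling back to a wider view.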

src/mcp/server.ts (stdio) — defaults to takesHoldersAllowList: ['world']
since stdio has no per-token auth. Operators wanting full visibility use
`gbrain call <op>` directly (sets remote=false).

src/commands/auth.ts — `gbrain auth create <name> --takes-holders w,g,b`
flag persists the per-token list; new `auth permissions <name>
set-takes-holders <list>` updates an existing token.

Tests: test/takes-mcp-allowlist.test.ts — 8 cases against PGLite proving
the threading: local-CLI sees all holders, ['world'] returns only public,
['world','garry'] returns 2/3, no-overlap returns empty (no fallback),
search honors allow-list, remote save/take on think rejected with
not_implemented envelope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28.0: ship-prep — VERSION, CHANGELOG, migration orchestrator, skill

Closes the v0.28 ship-prep cycle. Bumps VERSION + package.json + bun.lock
to 0.28.0. v0_28_0 migration orchestrator runs three idempotent phases on
upgrade:

- Schema verify: asserts schema_version >= 32 (migrations v31 + v32 already
  applied by the schema runner during gbrain upgrade); fails clean if not.
- Backfill takes: runs `extractTakes(engine, { source: 'db' })` inline so

  any pre-existing fenced takes tables in markdown populate the takes
  index. Idempotent; ON CONFLICT DO UPDATE keeps the table in sync.
- Re-chunk TODO: queues a pending-host-work entry asking the host agent
  to re-import pages with takes content so the v0.28 chunker-strip rule
  (Codex P0 #3 fix) applies retroactively. Pages imported under v0.28+
  already have takes content stripped from chunks at index time; this
  TODO catches up legacy pages.

skills/migrations/v0.28.0.md — agent-readable upgrade guide. Walks
through doctor verification, deprecated-key migration, MCP token
visibility configuration, and a "try the takes layer" smoke test.

CHANGELOG.md — v0.28.0 release-summary in the GStack voice (no AI
vocabulary, no em dashes, real numbers from git diff stat) + the
mandatory "To take advantage of v0.28.0" block + itemized changes by
subsystem (schema, engine, markdown surface, model config, MCP+auth,
CLI, tests, accepted risks).

Final test sweep: 65/65 v0.28 tests pass across 6 files. typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 think pipeline: gather → sanitize → synthesize → cite-render → CLI

src/core/think/sanitize.ts — prompt-injection defense for take claims:
14 jailbreak patterns (ignore-prior, role-jailbreak, close-take tag,
DAN, system-prompt overrides, eval-shell hooks) plus structural framing
(takes wrapped in <take id="..."> tags the model is told to treat as
DATA). Length-cap at 500 chars. Renders evidence blocks for the prompt.
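
The structural-framing half of this defense might be sketched as below. The tag-neutralizing regex, the `[removed-tag]` replacement text, and the function name are assumptions made for illustration; the 14 jailbreak patterns themselves are not reproduced here.

```typescript
// 500-char cap, as stated above.
const MAX_CLAIM_CHARS = 500;

// Wrap one claim as inert data so the model can be told to treat it as DATA.
function renderTakeAsData(id: string, claim: string): string {
  // Neutralize any embedded <take ...> or </take> so a claim cannot close its own wrapper.
  const neutralized = claim.replace(/<\/?take\b[^>]*>/gi, "[removed-tag]");
  // Cap after neutralizing so the emitted body never exceeds the limit.
  const capped = neutralized.slice(0, MAX_CLAIM_CHARS);
  // Strip characters that could break out of the id attribute.
  const safeId = id.replace(/["<>]/g, "");
  return `<take id="${safeId}">${capped}</take>`;
}
```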

src/core/think/prompt.ts — system prompt + structured-output schema.
Hard rules: cite every claim, mark hunches/low-weight explicitly,
surface conflicts (never silently pick), surface gaps. JSON schema
with answer + citations[] + gaps[]. Prompt adapts to anchor / time
window / save flag.

src/core/think/cite-render.ts — structured citations + regex fallback
(Codex P1 #4 fold). normalizeStructuredCitations validates the model's
structured output; parseInlineCitations is the body-scan fallback when
the model omits the structured field. resolveCitations dispatches and
records CITATIONS_REGEX_FALLBACK warning when used.

src/core/think/gather.ts — 4-stream parallel retrieval:
  1. hybridSearch (pages, existing primitive)
  2. searchTakes (keyword, pg_trgm)
  3. searchTakesVector (vector, when embedQuestion fn supplied)
  4. traversePaths (graph, when --anchor set)
RRF fusion (k=60). Each stream wrapped in try/catch — partial gather
beats no synthesis. Honors takesHoldersAllowList for MCP-bound calls.
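
For reference, reciprocal rank fusion with k=60 scores each item as the sum of 1 / (k + rank) over the streams that returned it. A minimal sketch, assuming each stream is already an ordered list of IDs (the function name and the string-ID shape are illustrative):

```typescript
// Fuse ranked ID lists; rank is 1-based within each stream.
function rrfFuse(
  streams: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const stream of streams) {
    stream.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

A stream that threw and contributed an empty list simply adds nothing to the sums, which fits the partial-gather behavior described above.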

src/core/think/index.ts — runThink orchestrator + persistSynthesis:
INTENT (regex classify) → GATHER → render evidence blocks → resolveModel
('models.think' → 'models.default' → GBRAIN_MODEL → opus) → LLM call
(injectable client) → JSON parse with code-fence + fallback strip →
resolveCitations → ThinkResult. persistSynthesis writes a synthesis
page + synthesis_evidence rows (page_id resolved per slug; page-level
citations skip evidence). Degrades gracefully without ANTHROPIC_API_KEY.
Round-loop scaffolding in place (rounds=1 only path exercised in v0.28).

src/commands/think.ts — `gbrain think "<question>"` CLI. Flag parsing
strips --anchor, --rounds, --save, --take, --model, --since, --until,
--json. Local CLI = remote=false, so save/take honored. Human-readable
output by default; --json for agent consumption.

operations.ts — `think` op now calls runThink (was a not_implemented
stub). Remote callers can't save/take per Codex P1 #7. Returns full
ThinkResult plus saved_slug + evidence_inserted.

cli.ts — wired into dispatch + CLI_ONLY allowlist.

Tests: test/think-pipeline.test.ts — 18 cases against PGLite covering
sanitize patterns, structural rendering, citation parsing (structured +
regex fallback + dedup + invalid-slug rejection), gather streams +
allow-list filter, full pipeline with stub client, malformed-LLM
fallback path, no-API-key graceful degradation, persistSynthesis writes
page + evidence rows. All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 dream phases: auto-think + drift + budget meter (Codex P1 #10 fold)

src/core/anthropic-pricing.ts — USD/1M-tokens map for Claude 4.7 family
plus older aliases. estimateMaxCostUsd returns null on unpriced models so
the meter caller can warn-once and bypass the gate.

src/core/cycle/budget-meter.ts — cumulative cost ledger. Each submit
estimates max-cost from (model + estimatedInputTokens + maxOutputTokens),
accumulates per-cycle, refuses next submit when projected > cap. Codex
P1 #10 fold: non-Anthropic models (gemini, gpt) bypass with one stderr
warn per process and `unpriced=true` on the result. Budget=0 disables
the gate. Audit trail at ~/.gbrain/audit/dream-budget-YYYY-Www.jsonl.
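
The gating logic can be pictured with a small in-memory sketch. The class name, the result shape, and the prices used in any example call are placeholders rather than real pricing, and the audit trail and warn-once behavior are left out.

```typescript
// USD per 1M tokens; values supplied by the caller are placeholders.
interface PricePerMillion {
  input: number;
  output: number;
}

interface SubmitDecision {
  allowed: boolean;
  unpriced: boolean;
  projectedUsd: number;
}

class BudgetMeterSketch {
  private spentUsd = 0;
  private readonly capUsd: number;
  private readonly pricing: Map<string, PricePerMillion>;

  constructor(capUsd: number, pricing: Map<string, PricePerMillion>) {
    this.capUsd = capUsd;
    this.pricing = pricing;
  }

  // Pre-check one submit against the cumulative per-cycle total.
  submit(model: string, estimatedInputTokens: number, maxOutputTokens: number): SubmitDecision {
    const price = this.pricing.get(model);
    if (price === undefined) {
      // Unpriced model: bypass the gate and flag it.
      return { allowed: true, unpriced: true, projectedUsd: this.spentUsd };
    }
    const maxCost =
      (estimatedInputTokens * price.input + maxOutputTokens * price.output) / 1_000_000;
    const projected = this.spentUsd + maxCost;
    // A cap of 0 disables the gate entirely.
    if (this.capUsd !== 0 && projected > this.capUsd) {
      return { allowed: false, unpriced: false, projectedUsd: projected };
    }
    this.spentUsd = projected;
    return { allowed: true, unpriced: false, projectedUsd: projected };
  }
}
```

Charging the worst case (max output tokens) up front keeps the meter conservative: a cycle can finish under budget but should not overshoot it.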

src/core/cycle/auto-think.ts — auto_think dream phase. Reads
dream.auto_think.{enabled,questions,max_per_cycle,budget,cooldown_days,
auto_commit}. Iterates configured questions through runThink with the
BudgetMeter pre-checking each submit. Cooldown timestamp written ONLY on
success (matches v0.23 synthesize pattern — retries after partial
failures pick back up). When auto_commit=true, persists synthesis pages
via persistSynthesis. Default-disabled.

src/core/cycle/drift.ts — drift dream phase scaffold. Reads
dream.drift.{enabled,lookback_days,budget,auto_update}. Surfaces takes
in the soft band (weight 0.3-0.85, unresolved) that have recent timeline
evidence on the same page. v0.28 ships the orchestration; the LLM judge
that proposes weight adjustments lands in v0.29. modelId + meter wired
now so the ledger captures gate state for callers that opt in.

Tests:
- test/budget-meter.test.ts (7 cases) — pricing-map coverage, allow path,
  cumulative-deny, budget=0 disabled, unpriced bypass+warn-once, ledger
  captures all events, ISO-week filename branch.
- test/auto-think-phase.test.ts (9 cases) — auto_think enable/skip,
  questions empty, success → cooldown ts written, cooldown blocks rerun,
  budget exhausted → partial. drift not_enabled, soft-band candidate
  detection, complete + dry-run paths.

All pass. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 e2e Postgres: takes engine + extract + MCP allow-list (12 cases)

test/e2e/takes-postgres.test.ts — full v0.28 takes pipeline against real
Postgres (gated on DATABASE_URL). 12 cases:
- addTakesBatch upsert via unnest() bind path (Postgres-specific)
- listTakes filters: holder, kind, sort=weight, takesHoldersAllowList
- searchTakes pg_trgm + allow-list filter
- supersedeTake transactional path (BEGIN/COMMIT semantics)
- resolveTake immutability — second resolve throws TAKE_ALREADY_RESOLVED
- synthesis_evidence FK CASCADE on take delete
- countStaleTakes + listStaleTakes filter active+null
- extractTakesFromDb populates takes from fenced markdown
- MCP dispatch with takesHoldersAllowList=['world'] returns only world
- MCP dispatch local-CLI path returns all holders
- MCP dispatch takes_search honors allow-list
- think op forces remote_persisted_blocked even for save+take

postgres-engine.ts: addTakesBatch boolean[] serialization fix.
postgres-js auto-detects element type from JS arrays; for booleans it
mis-detects as scalar. Cast through text[] (`'true' | 'false'`) then
SQL-cast to boolean[] — same pattern other batch methods rely on for
type-stable bind shapes.

test/e2e/helpers.ts: setupDB now (a) tolerates non-existent tables in
TRUNCATE (for fresh DBs where v31 hasn't yet created takes/synthesis_evidence)
and (b) calls engine.initSchema() to actually run migrations.

test/takes-mcp-allowlist.test.ts: updated 2 think-op cases to match
Lane D's landed pipeline. They previously asserted not_implemented
envelopes; now they assert remote_persisted_blocked + NO_ANTHROPIC_API_KEY
graceful-degrade behavior.

Run: DATABASE_URL=postgres://localhost:5435/gbrain_test bun test test/e2e/takes-postgres.test.ts
Result: 12/12 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 dream phases: local DreamPhaseResult type (avoid premature CyclePhase enum extension)

cycle.ts's PhaseResult is shaped {phase, status, summary, details} with a
narrow PhaseStatus enum ('ok'|'warn'|'fail'|'skipped') and CyclePhase enum
that doesn't yet include 'auto_think'/'drift'. The phases ship standalone
in v0.28 (cycle.ts dispatcher integration is v0.28.x); using PhaseResult
forced premature enum extension.

Introduces DreamPhaseResult exported from auto-think.ts:
  { name: 'auto_think'|'drift'; status: 'complete'|'partial'|'failed'|'skipped';
    detail: string; totals?: Record<string,number>; duration_ms: number }

drift.ts re-exports the same type. When v0.28.x wires the dispatcher, the
adapter at the call site can map DreamPhaseResult → PhaseResult cleanly.
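
One way such an adapter could look, purely as a sketch: the status mapping (complete→ok, partial→warn, failed→fail) and the `PhaseResultSketch` field types are assumptions, since only the field names of PhaseResult are given above.

```typescript
type DreamPhaseStatus = "complete" | "partial" | "failed" | "skipped";

interface DreamPhaseResult {
  name: "auto_think" | "drift";
  status: DreamPhaseStatus;
  detail: string;
  totals?: Record<string, number>;
  duration_ms: number;
}

type PhaseStatus = "ok" | "warn" | "fail" | "skipped";

// Stand-in for cycle.ts's PhaseResult; `phase` is a plain string here
// because the CyclePhase enum does not include these phases yet.
interface PhaseResultSketch {
  phase: string;
  status: PhaseStatus;
  summary: string;
  details: Record<string, unknown>;
}

// Assumed mapping between the two status vocabularies.
const STATUS_MAP: Record<DreamPhaseStatus, PhaseStatus> = {
  complete: "ok",
  partial: "warn",
  failed: "fail",
  skipped: "skipped",
};

function toPhaseResult(r: DreamPhaseResult): PhaseResultSketch {
  return {
    phase: r.name,
    status: STATUS_MAP[r.status],
    summary: r.detail,
    details: { totals: r.totals ?? {}, duration_ms: r.duration_ms },
  };
}
```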

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 e2e: access_tokens.permissions JSONB end-to-end (5 cases)

test/e2e/auth-permissions.test.ts — closes the v0.28 token-allow-list
verification loop against real Postgres. Exercises:

- Migration v32 default backfill: new tokens created without a permissions
  column get {takes_holders: ["world"]} via the schema DEFAULT clause.
- Explicit ["world","garry"] → dispatch.takes_list filters to those
  holders only; brain hunches stay hidden from this token.
- ["world"] default-deny token → takes_search hits filtered to public claims.
- {} permissions row (operator tampered) gracefully defaults to ["world"]
  via the HTTP transport's validateToken parsing.
- revoked_at IS NOT NULL → token excluded from active token query.

Avoids the postgres-js JSONB double-encode trap (CLAUDE.md memory): pass
the object directly to executeRaw, no JSON.stringify, no ::jsonb cast.

All 5 pass against pgvector/pgvector:pg16 on port 5435. Combined v0.28
test sweep: 116/116 across 11 files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 e2e: chunker takes-strip integration test (Codex P0 #3 verification)

test/e2e/chunker-takes-strip.test.ts — verifies the chunker actually
strips fenced takes content end-to-end through the import pipeline.
This is the Codex P0 #3 fix's verification path: takes content lives
ONLY in the takes table for retrieval, never duplicated in
content_chunks where the per-token MCP allow-list cannot reach.

5 cases:
- chunkText (unit) output never contains TAKES_FENCE_BEGIN/END markers
- chunkText output never contains fenced claim text
- chunkText output retains non-fence prose (no over-stripping)
- importFromContent end-to-end: imported page has chunks but none
  contain fenced content
- takes_fence_chunk_leak doctor invariant: zero rows globally where
  chunk_text matches `<!--- gbrain:takes:%`

Final v0.28 test sweep:
  121 pass, 0 fail, 336 expect() calls, 12 files
  Coverage: schema migrations, engine methods (PGLite + Postgres),
  takes-fence parser, page-lock, extract phase, takes CLI engine
  surface, model config 6-tier resolver, MCP+auth allow-list,
  think pipeline (gather + sanitize + cite-render + synthesize),
  auto-think + drift + budget meter, JSONB end-to-end, chunker
  strip integration. ~95% of v0.28 surface area covered.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix CI: apply-migrations skippedFuture arrays + http-transport SQL mock

Two CI failures from PR #563:

test/apply-migrations.test.ts (2 fails) — `buildPlan` tests assert exact
skippedFuture arrays at fixed installed-version stamps. Adding v0.28.0 to
the migration registry means it shows up in skippedFuture when the test
runs at installed=0.11.1 / installed=0.12.0. Append '0.28.0' to both
hardcoded arrays.

test/http-transport.test.ts (8 fails) — the FakeEngine mock string-prefix
matches `SELECT id, name FROM access_tokens` to return a row. v0.28's
validateToken now selects `SELECT id, name, permissions FROM access_tokens`
to read the per-token takes_holders allow-list. Mock returned [] on the
new query → validateToken treated every token as invalid → 401.

Fix: mock now matches both query shapes. validTokens row gets a default
`{takes_holders: ['world']}` permission injected when caller didn't
supply one (mirrors the migration v32 column DEFAULT). Updated
FakeEngineConfig type to allow tests to pass explicit permissions.

Verification:
  bun test test/apply-migrations.test.ts → 18/18 pass
  bun test test/http-transport.test.ts   → 24/24 pass
  bun run typecheck                       → clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix CI: add scope annotations to v0.28 ops (takes_list/takes_search/think)

test/oauth.test.ts enforces an invariant from master's v0.26 OAuth landing:
every Operation must have `scope: 'read' | 'write' | 'admin'`, and any op
flagged `mutating: true` must be 'write' or 'admin'. My v0.28 ops were added
before master shipped v0.26 + the new invariant; the merge surfaced the gap.

Annotations:
- takes_list   → read
- takes_search → read
- think        → write (mutating: true; --save persists synthesis page)

Verification:
  bun test test/oauth.test.ts → 42/42 pass
  bun run typecheck            → clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28.2 feat: remote-source MCP + scope hierarchy + whoami (#690)

* refactor(core): extract SSRF helpers from integrations.ts to core/url-safety.ts

src/core/git-remote.ts (next commit) needs isInternalUrl etc. but importing
from src/commands/ would invert the layering boundary (no existing
src/core/ file imports from src/commands/). Extract the SSRF helpers
(parseOctet, hostnameToOctets, isPrivateIpv4, isInternalUrl) into a new
src/core/url-safety.ts and have integrations.ts re-export for backward
compat. test/integrations.test.ts continues to pass without changes (110
existing tests, 214 expects).

Why this matters for v0.28: the upcoming sources --url feature reuses
this SSRF gate for git-clone URL validation. Codex review caught that
re-rolling weaker URL classification would regress on the IPv6/v4-mapped/
metadata/CGNAT bypass forms that integrations.ts already handles.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): add git-remote module — SSRF-defensive clone/pull + state probe

New src/core/git-remote.ts (~210 lines) for v0.28's remote-source feature:

- GIT_SSRF_FLAGS exported const: -c http.followRedirects=false,
  -c protocol.file.allow=never, -c protocol.ext.allow=never,
  --no-recurse-submodules. Single source of truth shared by cloneRepo
  and pullRepo so a future flag added to one path lands on both.
  Closes the SSRF surfaces codex flagged: DNS rebinding via redirects,
  .gitmodules as a second-fetch surface, file:// scheme in remotes.

- parseRemoteUrl: https-only, rejects embedded credentials and path
  traversal, delegates internal-target classification to isInternalUrl
  from url-safety.ts (covers RFC1918, link-local, loopback, IPv6, CGNAT
  100.64/10, metadata hostnames, hex/octal/single-int bypass forms).
  GBRAIN_ALLOW_PRIVATE_REMOTES=1 escape hatch with stderr warning is
  needed for self-hosted git over Tailscale (CGNAT trips the gate).

- cloneRepo: --depth=1 default (full clone via depth: 0); refuses
  non-empty destDirs; spawns git via execFileSync (no shell injection)
  with GIT_TERMINAL_PROMPT=0 + askpass=/bin/false to prevent credential
  prompts. timeoutMs default 600s.

- pullRepo: -C path + GIT_SSRF_FLAGS + pull --ff-only, same env confine.

- validateRepoState: 6-state decision tree (missing | not-a-dir |
  no-git | corrupted | url-drift | healthy). Used by performSync's
  re-clone branch to recover from removed clone dirs and refuse syncs on
  url-drift or corruption.
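
The URL gate in parseRemoteUrl might be approximated as follows. This is a sketch, not the module's code: the function name, the error strings, and the injected `isInternal` callback (standing in for isInternalUrl, whose real signature is not shown here) are assumptions, and the GBRAIN_ALLOW_PRIVATE_REMOTES escape hatch is omitted.

```typescript
// Validate a remote before any git process is spawned.
function parseRemoteUrlSketch(
  raw: string,
  isInternal: (url: URL) => boolean,
): URL {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    throw new Error("invalid_url");
  }
  if (url.protocol !== "https:") throw new Error("https_only");
  if (url.username !== "" || url.password !== "") {
    throw new Error("embedded_credentials");
  }
  // The URL parser normalizes dot segments away, so inspect the raw input.
  if (/\.\.|%2e/i.test(raw)) throw new Error("path_traversal");
  if (isInternal(url)) throw new Error("internal_target");
  return url;
}
```

Checking the raw string for traversal matters because `new URL("https://host/a/../b")` has already collapsed the path to `/b` by the time `pathname` is read.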

test/git-remote.test.ts (304 lines, 32 tests): GIT_SSRF_FLAGS exact
shape, all parseRemoteUrl rejection cases including dedicated CGNAT
100.64/10 with/without GBRAIN_ALLOW_PRIVATE_REMOTES (codex T3 case),
fake-git harness for argv assertions on cloneRepo/pullRepo, all 6
validateRepoState branches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): add scope hierarchy + ALLOWED_SCOPES allowlist

New src/core/scope.ts (~120 lines) for v0.28's scoped MCP feature.

Hierarchy:
  - admin implies all (escape hatch)
  - write implies read
  - sources_admin and users_admin are siblings (different axes —
    sources-mgmt vs user-account-mgmt; neither implies the other)

Exported:
  - hasScope(grantedScopes, requiredScope): the canonical scope check.
    Replaces exact-string-match at three call sites in upcoming commits
    (serve-http.ts:673, oauth-provider.ts:365 F3 refresh, oauth-provider.ts:498
    token issuance). Without this rewrite, an admin-grant token would
    fail to refresh down to sources_admin (codex finding).
  - ALLOWED_SCOPES set + ALLOWED_SCOPES_LIST sorted array (deterministic
    for OAuth metadata wire format and drift-check output).
  - assertAllowedScopes / InvalidScopeError: registration-time gate so
    tokens with bogus scope strings (read flying-unicorn) get rejected
    with RFC 6749 §5.2 invalid_scope at auth.ts:296 + DCR /register +
    registerClientManual. Today's behavior accepts any string silently.
  - parseScopeString: space-separated wire format → array.

Forward-compat: hasScope ignores unknown granted scopes rather than
throwing, so pre-allowlist tokens with weird scope strings continue
working without crashes (registration is the gate, runtime is best-effort).
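
A compact sketch of the hierarchy check, encoding only the three rules listed above (admin implies all, write implies read, the two *_admin scopes are siblings). The table layout and the `Sketch` name are illustrative; the five scope names are those mentioned in this PR.

```typescript
// Each granted scope maps to the set of scopes it satisfies.
const SCOPE_IMPLIES = new Map<string, readonly string[]>([
  ["admin", ["admin", "read", "write", "sources_admin", "users_admin"]],
  ["write", ["write", "read"]],
  ["read", ["read"]],
  ["sources_admin", ["sources_admin"]],
  ["users_admin", ["users_admin"]],
]);

// Unknown granted scopes are ignored rather than thrown on (forward-compat).
function hasScopeSketch(grantedScopes: string[], requiredScope: string): boolean {
  return grantedScopes.some(
    (granted) => SCOPE_IMPLIES.get(granted)?.includes(requiredScope) ?? false,
  );
}

// Space-separated wire format to array; tolerates undefined, null, and empty.
function parseScopeStringSketch(scope: string | null | undefined): string[] {
  if (!scope) return [];
  return scope.split(" ").filter((s) => s.length > 0);
}
```

Under this table an admin grant can refresh down to sources_admin, while a write grant cannot refresh up to it.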

test/scope.test.ts (178 lines, 35 tests): hierarchy table including
all-implies for admin, sibling non-implication of *_admin scopes,
write→read but not the reverse, F3 refresh-token subset semantics
under hasScope, ALLOWED_SCOPES_LIST sorted-pinning, allowlist
rejection cases, parseScopeString edge cases (undefined/null/empty).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(admin): scope-constants mirror + drift CI for src/core/scope.ts

The admin React SPA's tsconfig.json scopes include: ['src'] to admin/src/,
so it cannot directly import ../../src/core/scope.ts. The plan considered
widening the include or generating a single source of truth; both options
either couple the SPA to the gbrain monorepo or add a build step. Eng
review picked the boring choice: hand-maintained mirror at
admin/src/lib/scope-constants.ts plus a CI drift check.

Files:
  - admin/src/lib/scope-constants.ts: hand-maintained ALLOWED_SCOPES_LIST
    duplicate, sorted alphabetically to match src/core/scope.ts.
  - scripts/check-admin-scope-drift.sh: extracts the list from each file
    via awk, normalizes via tr/sort, diffs. Exits 0 on match, 1 on drift
    (with full breakdown of which scopes diverged), 2 on internal error.
    Tested both passing and corrupted paths.
  - package.json: wires check:admin-scope-drift into both `verify` and
    `check:all` so any update to src/core/scope.ts that forgets the
    admin-side mirror fails the build.

The Agents.tsx scope-checkbox sites (5 hardcoded locations) get updated
in a later commit to import from this constants file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(oauth): hasScope hierarchy + ALLOWED_SCOPES allowlist at registration

Switch three call sites in oauth-provider.ts from exact-string-match to
hasScope() so the v0.28 sources_admin and users_admin scopes — and the
admin-implies-all + write-implies-read hierarchy in src/core/scope.ts —
work end to end:

- F3 refresh-token subset enforcement at line 365: previously rejected
  admin → sources_admin refresh because exact-match treated them as
  unrelated scopes. gstack /setup-gbrain Path 4 needs admin tokens to
  refresh down to least-privilege sources_admin scope; this fix lands
  that path.

- Token issuance intersection at line 498 (client_credentials grant):
  same hasScope swap so a client whose stored grant is `admin` can mint
  tokens including any implied scope.

- registerClient (DCR /register) and registerClientManual: validate
  every scope string against ALLOWED_SCOPES via assertAllowedScopes.
  Pre-fix the system silently accepted `--scopes "read flying-unicorn"`
  and persisted the bogus string in oauth_clients.scope. Post-fix the
  caller gets RFC 6749 §5.2 invalid_scope. Existing rows with
  pre-allowlist scopes keep working (allowlist gates registration only).

Tests amended in test/oauth.test.ts:
- T1 (eng-review): admin grant CAN refresh down to sources_admin
- T1 sibling: write grant CANNOT refresh up to sources_admin
- ALLOWED_SCOPES allowlist coverage (manual + DCR paths, all 5 valid)
- Scope-annotation contract tests widened to accept the v0.28 union

62 OAuth tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(serve-http): hasScope at /mcp + advertise full ALLOWED_SCOPES

Two changes against src/commands/serve-http.ts:

- Line 195: scopesSupported on the mcpAuthRouter options switches from the
  hardcoded ['read','write','admin'] to Array.from(ALLOWED_SCOPES_LIST).
  Without this, /.well-known/oauth-authorization-server keeps reporting
  the old triple, so MCP clients (Claude Desktop, ChatGPT, Perplexity)
  cannot discover the v0.28 sources_admin and users_admin scopes via
  standard discovery — they would have to be pre-configured out of band.

- Line 673: request-time scope check on /mcp swaps
  authInfo.scopes.includes(requiredScope) for hasScope(...). This was
  the most-cited codex finding: without it, sources_admin tokens could
  not even satisfy a `read`-scoped op (sources_admin doesn't include
  the literal string "read"). hasScope routes through the hierarchy
  table in src/core/scope.ts so admin implies all and write implies
  read at the gate too.

T2 amendment in test/e2e/serve-http-oauth.test.ts: assert
/.well-known/oauth-authorization-server includes all 5 scopes in
scopes_supported. Pre-v0.28 the list was hardcoded to ['read','write',
'admin'] and this assertion would have failed. (The test is
Postgres-gated; runs under bun run test:e2e with DATABASE_URL set.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): sources-ops module — atomic clone + symlink-safe cleanup

src/core/sources-ops.ts (~470 lines): pure async functions extracted from
src/commands/sources.ts so the CLI handlers and the new MCP ops share
one implementation.

addSource: D3 atomicity contract from the eng review.
  1. Validate id (matches existing SOURCE_ID_RE).
  2. Q4 pre-flight SELECT — fail loudly with structured `source_id_taken`
     before any clone work. Pre-fix the existing CLI used INSERT…ON
     CONFLICT DO NOTHING which silently no-op'd; with clone-first that
     would orphan the temp dir.
  3. parseRemoteUrl gate (delegates to isInternalUrl from url-safety.ts).
  4. Clone into $GBRAIN_HOME/clones/.tmp/<id>-<rand>/ via the new
     git-remote helpers.
  5. INSERT row with local_path=<final clone dir>, config.remote_url=<url>.
  6. fs.renameSync(tmp/, final/). Rollback on either-side failure unlinks
     the temp dir; rename-failed path also DELETEs the just-INSERTed row
     best-effort.
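
The rollback half of the contract (steps 4 to 6) can be sketched as below. This is not the code in src/core/sources-ops.ts: `AddSourceDeps`, `addSourceSketch`, and every callback name are hypothetical stand-ins so the ordering can be shown without git or a database. Steps 1 to 3 (id validation, pre-flight SELECT, URL gate) are omitted.

```typescript
// Hypothetical dependency bundle; each callback stands in for a real helper.
interface AddSourceDeps {
  clone(url: string, dest: string): void;
  insertRow(id: string, localPath: string, url: string): void;
  deleteRow(id: string): void;
  rename(from: string, to: string): void;
  removeDir(dir: string): void;
}

function addSourceSketch(
  id: string,
  url: string,
  tmpDir: string,
  finalDir: string,
  deps: AddSourceDeps,
): void {
  // Step 4: clone into the temp dir. On failure nothing was INSERTed yet.
  try {
    deps.clone(url, tmpDir);
  } catch (err) {
    deps.removeDir(tmpDir);
    throw err;
  }
  // Step 5: the row records the FINAL path, not the temp path.
  try {
    deps.insertRow(id, finalDir, url);
  } catch (err) {
    deps.removeDir(tmpDir);
    throw err;
  }
  // Step 6: rename. On failure, drop the temp dir and best-effort DELETE the row.
  try {
    deps.rename(tmpDir, finalDir);
  } catch (err) {
    deps.removeDir(tmpDir);
    try {
      deps.deleteRow(id);
    } catch {
      // best-effort: the original rename error is the one worth surfacing
    }
    throw err;
  }
}
```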

removeSource: clone-cleanup with realpath+lstat confinement matching
validateUploadPath() shape at src/core/operations.ts:61. String startsWith
is symlink-unsafe and would let $GBRAIN_HOME/clones/<id> → /etc resolve
out of the confine. Two defenses layered:
  - isPathContained (realpath-resolves both sides + parent-with-sep
    string check) rejects symlinks whose target falls outside the
    confine.
  - lstat-then-isSymbolicLink check refuses symlinks whose realpath
    happens to land back inside the confine (defense in depth).
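
A rough sketch of the two layered checks, assuming a POSIX filesystem. `isPathContained` here is a stand-in written for illustration and `isSafeToRemove` is a hypothetical wrapper name; neither is the repo's implementation.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Layer 1: resolve both sides with realpath, then compare against the
// confine dir plus a trailing separator. Fails closed on any fs error.
function isPathContained(candidate: string, confineDir: string): boolean {
  try {
    const realCandidate = fs.realpathSync(candidate);
    const realConfine = fs.realpathSync(confineDir);
    return realCandidate.startsWith(realConfine + path.sep);
  } catch {
    return false;
  }
}

// Layer 2: even when the realpath lands inside the confine, refuse to
// treat a symlink as a removable clone dir (defense in depth).
function isSafeToRemove(candidate: string, confineDir: string): boolean {
  if (!isPathContained(candidate, confineDir)) return false;
  return !fs.lstatSync(candidate).isSymbolicLink();
}
```

The trailing `path.sep` matters: without it, a sibling such as `clones-evil/` would pass a plain prefix comparison against `clones`.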

getSourceStatus: returns clone_state via validateRepoState (the 6-state
decision tree from git-remote.ts). Lets a remote MCP caller diagnose
"healthy | missing | not-a-dir | no-git | url-drift | corrupted" without
SSH access to the brain host. listSources additionally exposes
remote_url so callers can see which sources are auto-managed.

recloneIfMissing: T4 follow-up for `gbrain sources restore` after the
clone dir was autopurged — re-clones via the same temp + rename
atomicity contract. Idempotent (returns false when clone is already
healthy).

test/sources-ops.test.ts (~470 lines, 24 tests): pre-flight collision
(Q4), happy paths for both --path and --url, all four D3 rollback paths
(clone-fail before INSERT, INSERT-fail after clone, rename-fail
post-INSERT, atomic temp-dir cleanup), symlink-target-OUTSIDE-clones
(realpath confinement), symlink-target-INSIDE-clones (lstat-check),
removeSource refuses to delete user-supplied paths, refuses "default"
source, getSourceStatus clone_state branches, T4 recloneIfMissing
recovery + idempotent + no-op for path-only sources, isPathContained
unit tests covering subtree / outside / symlink-escape / fail-closed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(operations): whoami + sources_{add,list,remove,status} MCP ops

Five new ops in src/core/operations.ts auto-flow through src/mcp/tool-defs.ts
so MCP clients (Claude Desktop, ChatGPT, Perplexity, OpenClaw) get them via
standard tools/list discovery — no SDK or transport code changes needed.

Operation.scope union widened to add 'sources_admin' and 'users_admin' (the
v0.28 hierarchy from src/core/scope.ts).

whoami (scope: read): introspect calling identity over MCP.
  - Returns `{transport: 'oauth', client_id, client_name, scopes, expires_at}`
    for OAuth clients (clientId starts with gbrain_cl_).
  - Returns `{transport: 'legacy', token_name, scopes, expires_at: null}`
    for grandfathered access_tokens.
  - Returns `{transport: 'local', scopes: []}` when ctx.remote === false.
    Empty scopes (NOT ['read','write','admin']) is the D2 decision —
    returning OAuth-shaped scopes for local callers would resurrect the
    v0.26.9 footgun where code gated trust on
    `auth.scopes.includes('admin')` instead of `ctx.remote === false`.
  - Q3 fail-closed: throws unknown_transport when remote=true AND auth is
    missing OR ctx.remote is the literal `undefined` (cast bypass guard).
    A future transport that forgets to thread auth doesn't get a free
    pass.
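
The transport detection above can be sketched as follows. The `Ctx` and `AuthInfo` field names are assumptions made for illustration; only the return shapes, the `gbrain_cl_` prefix, and the fail-closed rule come from this commit message.

```typescript
// Hypothetical context shapes; the real OperationContext may differ.
interface AuthInfo {
  clientId?: string;
  clientName?: string;
  tokenName?: string;
  scopes: string[];
  expiresAt?: number | null;
}
interface Ctx {
  remote?: boolean;
  auth?: AuthInfo;
}

type Whoami =
  | { transport: "oauth"; client_id: string; client_name: string | null; scopes: string[]; expires_at: number | null }
  | { transport: "legacy"; token_name: string | null; scopes: string[]; expires_at: null }
  | { transport: "local"; scopes: string[] };

function whoamiSketch(ctx: Ctx): Whoami {
  // Local callers are identified by remote === false, never by scopes.
  if (ctx.remote === false) return { transport: "local", scopes: [] };
  // Fail closed: remote is undefined (cast bypass) or auth was not threaded.
  if (ctx.remote !== true || ctx.auth === undefined) {
    throw new Error("unknown_transport");
  }
  const auth = ctx.auth;
  if (auth.clientId !== undefined && auth.clientId.startsWith("gbrain_cl_")) {
    return {
      transport: "oauth",
      client_id: auth.clientId,
      client_name: auth.clientName ?? null,
      scopes: auth.scopes,
      expires_at: auth.expiresAt ?? null,
    };
  }
  return {
    transport: "legacy",
    token_name: auth.tokenName ?? null,
    scopes: auth.scopes,
    expires_at: null,
  };
}
```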

sources_add (sources_admin, mutating): register a source by --path
  (existing v0.17 behavior) or --url (v0.28 federated remote-clone path).
  Calls into addSource from sources-ops.ts which owns the temp-dir +
  rename atomicity.

sources_list (read): list registered sources with page counts, federated
  flag, and remote_url. The remote_url field is new — lets a remote MCP
  caller see which sources are auto-managed.

sources_remove (sources_admin, mutating): cascade-delete a source +
  symlink-safe clone cleanup. Requires confirm_destructive: true when the
  source has data.

sources_status (read): per-source diagnostic returning clone_state
  ('healthy' | 'missing' | 'not-a-dir' | 'no-git' | 'url-drift' |
  'corrupted' | 'not-applicable') — lets a remote MCP caller diagnose a
  busted clone without SSH access to the brain host.

test/whoami.test.ts (9 tests): pinned transport-detection for all four
return shapes including Q3 fail-closed throw under both auth=undefined
and remote=undefined cast-bypass paths.

test/sources-mcp.test.ts (16 tests): op-metadata pins (scope, mutating,
localOnly), functional handler shape against PGLite, hasScope-driven
scope-enforcement smoke test simulating the serve-http.ts:673 gate
(read-only token rejected for sources_add; sources_admin token allowed;
admin token allowed for everything; gstack /setup-gbrain Path 4 token
covers all 4 ops), SSRF gate at the op layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sync): re-clone fallback when clone is missing/no-git/corrupted

src/commands/sync.ts gets a v0.28-aware front-half. When the source has
config.remote_url, performSync calls validateRepoState before the existing
fast-forward pull path:

  - 'healthy'    → fall through to existing pull (unchanged)
  - 'missing' / 'no-git' / 'not-a-dir'
                 → loud stderr "auto-recovery: re-cloning <id>", then
                   recloneIfMissing handles the temp-dir + rename. Sync
                   continues from the freshly-cloned head.
  - 'corrupted'  → throw with structured hint pointing at sources remove
                   + add (no syncing wrong state).
  - 'url-drift'  → throw with hint pointing at the (deferred) sources
                   rebase-clone command.
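
The state table above maps naturally onto an exhaustive switch. A sketch, with `planSync` and `SyncAction` as hypothetical names and the hint strings paraphrased rather than copied from sync.ts:

```typescript
type CloneState =
  | "healthy"
  | "missing"
  | "not-a-dir"
  | "no-git"
  | "url-drift"
  | "corrupted";

type SyncAction =
  | { kind: "pull" }
  | { kind: "reclone" }
  | { kind: "fail"; hint: string };

// Decide what sync should do before touching the clone.
function planSync(state: CloneState): SyncAction {
  switch (state) {
    case "healthy":
      return { kind: "pull" };
    case "missing":
    case "no-git":
    case "not-a-dir":
      // Recoverable: re-clone from the recorded remote_url.
      return { kind: "reclone" };
    case "corrupted":
      return { kind: "fail", hint: "run sources remove, then sources add" };
    case "url-drift":
      return { kind: "fail", hint: "needs the deferred sources rebase-clone command" };
  }
}
```

Leaving out a `default` branch lets the compiler flag any state added to `CloneState` later without a matching case.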

Closes the operator-confidence gap: rm -rf $GBRAIN_HOME/clones/<id>/ no
longer breaks future syncs. The next sync sees the missing dir and
recovers via the recorded URL.

src/core/operations.ts: extend ErrorCode with 'unknown_transport' so
whoami's Q3 fail-closed path type-checks.

test/sources-resync-recovery.test.ts (12 tests): full validateRepoState
state matrix exercised under fake-git, recloneIfMissing recovery from
each degraded state, idempotent on healthy clones, the sync.ts:320
integration path that drives the recovery.

test/sources-ops.test.ts + test/sources-mcp.test.ts: drop the
GBRAIN_PGLITE_SNAPSHOT-disable line so these tests stop forcing cold
init across the parallel-shard runner. With snapshot allowed, init time
drops from 6+s to ~50ms and parallel runs stay under the 5s hook
timeout.

test/sources-mcp.test.ts: tighten scope literal-type so tsc keeps the
union narrow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): sources add --url + restore re-clone, thin-wrapper refactor

src/commands/sources.ts now delegates the data-mutation work to
src/core/sources-ops.ts (added in the previous commit). The CLI handler
parses argv, calls into addSource, and formats output.

Two new flags on `gbrain sources add`:
  - `--url <https-url>` : federated remote-clone path (clone + INSERT +
    rename, atomic rollback on failure).
  - `--clone-dir <path>` : override the default
    $GBRAIN_HOME/clones/<id>/ destination.

Validation rejects mutually-exclusive `--url` + `--path`. Errors from
the ops layer (SourceOpError) propagate through the CLI's standard
error wrapper in src/cli.ts so existing tests that assert throw shape
keep passing.

`gbrain sources restore <id>` (T4 from eng review): if the source has a
remote_url AND the on-disk clone was autopurged, call recloneIfMissing
before declaring success. Clone errors print a WARN with recovery
hints rather than failing the restore — the DB row is what restore
guarantees; the clone is best-effort.

54 sources-related tests pass (existing test/sources.test.ts +
sources-ops + sources-mcp).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(doctor,cycle): orphan-clones surface + autopilot purge phase (P1)

addSource's atomicity contract uses a temp dir that gets renamed to the
final clone path. If the process is SIGKILL'd between clone-finish and
rename, the temp dir orphans on disk. Without sweeping these, a brain
server accumulates gigabytes over months of failed `sources add --url`
attempts.

Two layers:

1. `gbrain doctor` now surfaces stale entries. A new orphan_clones check
   walks $GBRAIN_HOME/clones/.tmp/, names anything older than 24h, and
   prints a warn with disk-byte estimate. Operators see the leak before
   `df` complains.

2. The autopilot cycle's existing `purge` phase grows a substep that
   nukes .tmp/ entries past the same 72h TTL the page-soft-delete purge
   uses. Operator behavior stays uniform across all soft-delete-style
   surfaces.

Both layers are filesystem-only (no DB). On a brain that never used
--url cloning, both are no-ops.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(admin): scope checkboxes source from scope-constants mirror + dist

admin/src/pages/Agents.tsx Register Client modal:
  - useState default sources from ALLOWED_SCOPES_LIST (defaulting `read`
    to true, others false; unchanged UX for the common case).
  - Scope checkbox map iterates ALLOWED_SCOPES_LIST instead of the old
    hardcoded ['read','write','admin'].

Without this commit, even with the v0.28.1 server-side scope hierarchy,
operators registering an OAuth client from the admin UI cannot tick the
new sources_admin / users_admin scopes, which defeats the whole gstack
/setup-gbrain Path 4 unblock.

The drift-check CI gate (scripts/check-admin-scope-drift.sh) ensures
this list stays in sync with src/core/scope.ts going forward.

admin/dist/* rebuilt via `cd admin && bun run build`. Old hash bundle
removed; new bundle (224.96 kB / 68.70 kB gzip).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.28.1 — remote-source MCP + scope hierarchy + whoami

VERSION + package.json: bump to 0.28.1 (per CLAUDE.md branch-scoped
versioning rule — this branch adds substantial new features on top of
v0.28.0).

CHANGELOG.md: new top-level entry for v0.28.1 in the gstack/Garry voice
(no AI vocabulary, no em dashes, real numbers + commands). Lead
paragraph names what the user can now do that they couldn't before.
"Numbers that matter" table calls out the +5 MCP ops, +2 OAuth scopes,
and the 4-to-0 SSH-step number for gstack /setup-gbrain Path 4. "What
this means for you" closer ties the work to the operator workflow shift.
"To take advantage of v0.28.1" block has paste-ready upgrade commands
including the admin SPA rebuild step. Itemized changes section
describes the architecture cleanly without exposing scope-string
internals to public attack-surface enumeration (per CLAUDE.md
responsible-disclosure rule).

TODOS.md: file 6 follow-ups under a new "Remote-source MCP follow-ups
(v0.28.1)" section: token rotation, migration introspection in
get_health, Accept-header friendliness, sources rebase-clone for
URL-drift recovery, --filter=blob:none partial-clone option, and the
chunker_version PGLite-schema parity codex caught.

README.md: short subsection under the existing sources CLI listing
that names the new --url flag and what auto-recovery does. Capability
framing (no scope-string enumeration).

llms.txt + llms-full.txt: regenerated via `bun run build:llms` so the
documentation bundle reflects the v0.28.1 entry. The build-llms
generator's drift check passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): sources-remote-mcp — full gstack /setup-gbrain Path 4 round-trip

Spins up `gbrain serve --http` against real Postgres with a fake-git binary
in PATH (so `git clone` is exercised end-to-end without network), registers
two OAuth clients (sources_admin + read-only), mints tokens, calls the new
v0.28.1 MCP ops via /mcp, and asserts the gstack /setup-gbrain Path 4 flow
works end to end.

12 tests cover the full lifecycle:
- whoami over HTTP MCP returns transport=oauth + the right scopes
- /.well-known/oauth-authorization-server advertises all 5 scopes
- sources_add: clone fires, INSERT lands, row carries config.remote_url
- sources_status: clone_state=healthy after add
- sources_list: surfaces remote_url for the new source
- SSRF rejection: sources_add with RFC1918 URL fails at parseRemoteUrl gate
- Scope enforcement: read-only token gets insufficient_scope on sources_add
- Read-only token CAN call sources_list (read-scoped op)
- ALLOWED_SCOPES allowlist: CLI register-client rejects bogus scope
- Recovery: rm clone dir + sources_status reports clone_state=missing
- sources_remove: cascades + cleans up the auto-managed clone dir

Subprocess env threading replicates the v0.26.2 bun execSync inheritance
pattern — bun does NOT inherit process.env mutations, so every CLI
subprocess call passes env: { ...process.env } explicitly.

Cleanup contract mirrors test/e2e/serve-http-oauth.test.ts: revoke any
clients we registered, force-kill the server subprocess on SIGTERM
timeout, surface cleanup failures to stderr without throwing so real
test failures aren't masked.

The base table list in helpers.ts (ALL_TABLES) doesn't include sources
or oauth_clients, so this test explicitly truncates them in beforeAll
to avoid Q4 pre-flight collisions on re-run.

Skipped gracefully when DATABASE_URL is unset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: codex adversarial review — confine remote sources_admin + close SSRF gaps

Pre-ship adversarial review (codex exec) caught five issues. Four ship in
this commit; the fifth (DNS rebinding) is filed as v0.28.x follow-up.

CRITICAL — `sources_admin` tokens over HTTP MCP could plant content at any
host path. The MCP op exposed `path` and `clone_dir` to remote callers; the
op layer trusted them verbatim, then auto-recovery's rm -rf on degraded
state turned that into arbitrary delete primitives. src/core/operations.ts
sources_add handler now drops both fields when ctx.remote !== false. Local
CLI keeps the override (operator trust). Loud logger.warn when a remote
caller tries — visible in the SSE feed without leaking values.

HIGH — Steady-state `git pull --ff-only` bypassed GIT_SSRF_FLAGS entirely.
The legacy helper at src/commands/sync.ts:192 spawned git without the
-c http.followRedirects=false -c protocol.{file,ext}.allow=never
--no-recurse-submodules set that cloneRepo applies. Every recurring sync
was reopening the redirect/submodule/protocol bypass. Routed the call site
at sync.ts:381 through pullRepo from git-remote.ts so initial clone and
ongoing pull share one defensive flag set.

MEDIUM — listSources ignored its `include_archived` flag. The op
advertised the param but the function destructured it as `_opts` and
queried every row. Archived sources' ids, local_paths, and remote_urls
were leaking to read-scoped MCP callers by default. Filter in SQL
(`WHERE archived IS NOT TRUE` unless the flag is set) so archived rows
never reach the wire.

PARTIAL HIGH — IPv6 ULA fc00::/7 and link-local fe80::/10 were not in
the isInternalUrl blocklist. Only ::1/:: and IPv4-mapped IPv6 were
blocked. Added regex-based ULA + link-local rejection to url-safety.ts.
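
For reference, a lexical check for the two ranges might look like the sketch below. The regexes are written from the prefix definitions (fc00::/7 covers first hextets fc00 through fdff; fe80::/10 covers fe80 through febf) and are not copied from url-safety.ts. `isInternalIPv6` is a hypothetical name, and IPv4-mapped handling is left out.

```typescript
// fc00::/7: the first byte is 0xfc or 0xfd, so the first hextet is fcXX or fdXX.
const ULA_RE = /^f[cd][0-9a-f]{2}:/i;
// fe80::/10: the first ten bits are 1111111010, so the first hextet is fe80..febf.
const LINK_LOCAL_RE = /^fe[89ab][0-9a-f]:/i;

// Accepts a bare IPv6 literal or the bracketed form URL.hostname returns.
function isInternalIPv6(host: string): boolean {
  const h = host.replace(/^\[|\]$/g, "").toLowerCase();
  if (h === "::1" || h === "::") return true;
  return ULA_RE.test(h) || LINK_LOCAL_RE.test(h);
}
```

Requiring four hex digits in the first hextet is deliberate: `fd::1` expands to `00fd:...`, which is outside fc00::/7 and should not match.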

Test coverage:
- test/git-remote.test.ts: 4 new IPv6 cases (ULA fc-prefix + fd-prefix,
  link-local fe80::, public IPv6 still allowed).
- test/sources-mcp.test.ts: 3 new cases pinning the remote/local
  asymmetry (clone_dir override silently ignored over MCP, path nulled,
  local CLI keeps the override).
- test/sources-mcp.test.ts: 2 new cases for include_archived honored.

DNS rebinding (codex finding #3): the current gate is lexical only.
A deliberate attacker who controls a hostname's A/AAAA records can still
resolve to an internal IP. Closing this requires async DNS resolution +
revalidation; filed as v0.28.x follow-up in TODOS.md so the API change
surface (parseRemoteUrl becomes async, every caller updates) lands in
its own PR.

323 tests pass (9 files); 4071 unit tests pass (full suite).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: rebump v0.28.1 → v0.28.2 (master collision)

Caught after PR creation. master is at v0.28.1 already; this branch
forked from garrytan/v0.28-release at v0.28.0 and naively bumped to
v0.28.1 without checking the master queue. CI version-gate would have
rejected at merge time (requires VERSION strictly greater than
master's).

Root cause: I bumped VERSION mechanically during plan implementation
(echo "0.28.1" > VERSION) without consulting the queue-aware allocator
at bin/gstack-next-version. /ship Step 12's idempotency check then
classified state as ALREADY_BUMPED and the workflow's "queue drift"
comparison was the safety net I should have hit — but I skipped it.

Files updated:
- VERSION + package.json: 0.28.1 → 0.28.2
- CHANGELOG.md: header + "To take advantage of v0.28.2" subsection
- README.md: sources --url note version reference
- TODOS.md: 7 follow-up entries' version references
- llms.txt + llms-full.txt: regenerated

PR title rewrite via gstack-pr-title-rewrite.sh handled in a separate
gh pr edit call; CI version-gate now passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

garrytan added a commit that referenced this pull request May 8, 2026
* v0.28 schema: takes + synthesis_evidence (v31) + access_tokens.permissions (v32)

Migration v31 adds the takes table (typed/weighted/attributed claims) and
synthesis_evidence (provenance for `gbrain think` outputs). Page-scoped via
page_id FK (slug isn't unique alone in v0.18+ multi-source). HNSW partial
index on embedding for active rows. ON DELETE CASCADE on synthesis_evidence
so deleting a source take cascades the provenance row.

Migration v32 adds access_tokens.permissions JSONB with safe-default
backfill (`{"takes_holders":["world"]}`). Default keeps non-world holders
hidden from MCP-bound tokens until the operator explicitly grants access
via the v0.28 auth permissions CLI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 engine: addTakesBatch, listTakes, searchTakes/Vector, supersede, resolve, synthesis_evidence

Extends BrainEngine with the takes domain object. Both engines implement the
same surface; PGLite uses manual `$N` placeholders, Postgres uses postgres-js
unnest() — same shape as addLinksBatch and addTimelineEntriesBatch.

Methods:
- addTakesBatch (upsert via ON CONFLICT (page_id, row_num) DO UPDATE)
- listTakes (filter by holder/kind/active/resolved, takesHoldersAllowList
  for MCP-bound calls, sortBy weight/since_date/created_at)
- searchTakes / searchTakesVector (pg_trgm + cosine; honor allow-list)
- countStaleTakes / listStaleTakes (mirror countStaleChunks pattern;
  embedding column intentionally omitted from listStale payload)
- updateTake (mutable fields only; throws TAKE_ROW_NOT_FOUND)
- supersedeTake (transactional: insert new at next row_num, mark old
  active=false, set superseded_by; throws TAKE_RESOLVED_IMMUTABLE on
  resolved bets)
- resolveTake (sets resolved_*; throws TAKE_ALREADY_RESOLVED on re-resolve;
  resolution is immutable per Codex P1 #13 fold)
- addSynthesisEvidence (provenance persist; ON CONFLICT DO NOTHING)
- getTakeEmbeddings (parallel to getEmbeddingsByChunkIds)

Types live in src/core/engine.ts adjacent to LinkBatchInput. Page-scoped
via page_id (slug not unique in v0.18+ multi-source). PageType gains
'synthesis'. takeRowToTake mapper in utils.ts handles Date → ISO string
normalization.

Tests: test/takes-engine.test.ts — 16 cases against PGLite covering
upsert/list/filter/search happy paths, takesHoldersAllowList isolation,
the four invariant errors (TAKE_ROW_NOT_FOUND, TAKES_WEIGHT_CLAMPED,
TAKE_RESOLVED_IMMUTABLE, TAKE_ALREADY_RESOLVED), supersede flow, resolve
metadata round-trip, FK CASCADE on synthesis_evidence when source take
deletes. All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 model-config: unified resolveModel with 6-tier precedence + alias resolution

Replaces every hardcoded `claude-*-X` and per-phase `dream.<phase>.model`
config key with a single resolver. Hierarchy:

  1. CLI flag (--model)
  2. New-key config (e.g. models.dream.synthesize)
  3. Old-key config (deprecated dream.synthesize.model, dream.patterns.model)
     — read with stderr deprecation warning, one-per-process
  4. Global default (models.default)
  5. Env var (GBRAIN_MODEL or caller-supplied)
  6. Hardcoded fallback

Aliases (`opus`, `sonnet`, `haiku`, `gemini`, `gpt`) resolve at the end so
any tier can use a short name. User-defined `models.aliases.<name>` config
overrides built-ins. Cycle-safe (depth 2 break). Unknown alias passes
through unchanged so users can pass full provider IDs without registering.

When new-key + old-key are BOTH set (Codex P1 #11 fix), new-key wins and
stderr warns "deprecated config X ignored; Y is set and wins". When only
old-key is set, it's honored with a softer "rename to Y before v0.30"
warning. Both warnings emit once per (key, process) — a Set memo prevents
log spam in long-running daemons.
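
A compact sketch of the precedence chain plus alias resolution, under stated assumptions: `ResolveInput` and its field names are hypothetical, the built-in alias targets are placeholder strings rather than real model IDs, and the deprecation warnings are omitted. "Depth 2 break" is read here as "follow at most two alias hops".

```typescript
// Hypothetical input bundle: one optional value per tier, highest first.
interface ResolveInput {
  cliFlag?: string;        // tier 1
  newKey?: string;         // tier 2
  oldKey?: string;         // tier 3 (deprecated key)
  globalDefault?: string;  // tier 4
  envVar?: string;         // tier 5
  fallback: string;        // tier 6
  userAliases?: Record<string, string>;
}

// Placeholder targets only; these are NOT real provider model IDs.
const BUILTIN_ALIASES: Record<string, string> = {
  opus: "placeholder-opus-id",
  sonnet: "placeholder-sonnet-id",
};

// Follow at most two hops so an a -> b -> a cycle cannot loop forever.
function resolveAlias(name: string, aliases: Map<string, string>): string {
  let current = name;
  for (let hop = 0; hop < 2; hop++) {
    const next = aliases.get(current);
    if (next === undefined) break; // unknown alias passes through unchanged
    current = next;
  }
  return current;
}

function resolveModelSketch(input: ResolveInput): string {
  const chosen =
    input.cliFlag ??
    input.newKey ??
    input.oldKey ??
    input.globalDefault ??
    input.envVar ??
    input.fallback;
  // User-defined aliases are applied after built-ins so they override them.
  const aliases = new Map<string, string>([
    ...Object.entries(BUILTIN_ALIASES),
    ...Object.entries(input.userAliases ?? {}),
  ]);
  return resolveAlias(chosen, aliases);
}
```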

Migrated call sites: synthesize.ts (model + verdictModel), patterns.ts
(model). subagent.ts and search/expansion.ts to be migrated later in v0.28
(staying compatible until then).

Tests: test/model-config.test.ts — 11 cases pinning the 6-tier ordering,
alias resolution + cycle break, deprecated-key warning emit-once, and
unknown-alias pass-through. All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 takes-fence: parser/renderer/upserter + chunker strip (privacy P0 fix)

src/core/takes-fence.ts — pure functions for the fenced markdown surface:
- parseTakesFence(body) — extracts ParsedTake[] from `<!--- gbrain:takes:begin/end -->`
  blocks. Strict on canonical form, lenient on hand-edits with warnings
  (TAKES_FENCE_UNBALANCED, TAKES_TABLE_MALFORMED, TAKES_ROW_NUM_COLLISION).
  Strikethrough `~~claim~~` → active=false; date ranges `since → until`
  split into sinceDate/untilDate.
- renderTakesFence(takes) — round-trip safe with parseTakesFence.
- upsertTakeRow(body, row) — append-only per CEO-D6 + eng-D9. Creates a
  fresh `## Takes` section if no fence present. row_num is monotonic
  (max + 1, never gap-filled — keeps cross-page refs and synthesis_evidence
  stable forever).
- supersedeRow(body, oldRow, replacement) — strikes through old row's claim
  AND appends the new row at end. Both rows preserved in markdown for
  git-blame archaeology.
- stripTakesFence(body) — removes the fenced block entirely. Used by the
  chunker so takes content lives ONLY in the takes table.

Codex P0 #3 fix: src/core/chunkers/recursive.ts now calls stripTakesFence()
before computing chunk boundaries. Without this, page chunks would contain
the rendered takes table and the per-token MCP allow-list would be
bypassed at the index layer (token bound to takes_holders=['world'] would
see garry's hunches via page hits). Doctor's takes_fence_chunk_leak check
(plan-side) asserts no chunk contains the begin marker.
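
To make the strip-before-chunk step concrete, here is a minimal sketch. The exact marker strings are assumed from the `begin/end` shorthand above, and dropping everything after an unclosed begin marker is a fail-closed choice made for this example; the real stripTakesFence may handle that case differently.

```typescript
// ASSUMPTION: marker spellings expanded from the "begin/end" shorthand.
const TAKES_BEGIN = "<!--- gbrain:takes:begin -->";
const TAKES_END = "<!--- gbrain:takes:end -->";

// Remove every fenced takes block so chunk text never contains take rows.
function stripTakesFenceSketch(body: string): string {
  let out = body;
  for (;;) {
    const start = out.indexOf(TAKES_BEGIN);
    if (start === -1) return out;
    const end = out.indexOf(TAKES_END, start + TAKES_BEGIN.length);
    if (end === -1) {
      // Unbalanced fence: drop the remainder rather than risk leaking takes.
      return out.slice(0, start);
    }
    out = out.slice(0, start) + out.slice(end + TAKES_END.length);
  }
}
```

A chunker would call this on the page body first and compute boundaries on the result, so private holders' rows never reach the page index.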

Tests: 15 cases covering canonical parse, strikethrough, date range, fence
unbalanced detection, malformed-row skip + warning, row_num collision
detection, round-trip render, append-only upsert into existing fence,
fresh-section creation, monotonic row_num under hand-edit gaps, supersede
flow, stripTakesFence verifying takes content removed AND surrounding
prose preserved. Existing chunker tests still pass (15 + 15 = 30).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 page-lock: PID-liveness file lock for atomic markdown read-modify-write

src/core/page-lock.ts — per-page file lock at
~/.gbrain/page-locks/<sha256-of-slug>.lock so two concurrent `gbrain takes
add` calls or `takes seed --refresh` from autopilot can't race on the
same `<slug>.md` read-modify-write. Eng-review fold: reuses the v0.17
cycle.lock pattern (mtime + PID liveness) but per-slug.

Differences from cycle.ts's lock:
- SHA-256 of slug for safe filenames (slashes, unicode, etc.)
- Same-pid + fresh mtime = LIVE (cycle.ts assumes one lock per process and
  reclaims same-pid; page-lock allows concurrent locks for DIFFERENT slugs
  in one process). mtime expiry still rescues post-crash leftovers.
- 5-min TTL (vs cycle's 30 min — page edits are short)
- `withPageLock(slug, fn)` convenience wrapper with default 30s timeout

API:
- acquirePageLock(slug, opts) → handle | null (poll-with-timeout)
- handle.refresh() / handle.release() (idempotent — only releases if pid matches)
- withPageLock(slug, fn, opts) — acquire + run + release-in-finally
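
Two pieces of this design are easy to show in isolation: the slug-to-filename mapping and the liveness decision. The sketch below is illustrative; `lockFileName`, `isLockLive`, and `LockInfo` are hypothetical names, and acquisition, polling, and release are left out.

```typescript
import { createHash } from "node:crypto";

const PAGE_LOCK_TTL_MS = 5 * 60 * 1000; // 5-min TTL from the notes above

// Hash the slug so slashes and unicode never reach the filesystem.
function lockFileName(slug: string): string {
  return createHash("sha256").update(slug, "utf8").digest("hex") + ".lock";
}

interface LockInfo {
  pid: number;
  mtimeMs: number;
}

// Signal 0 probes for existence without delivering a signal.
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as { code?: string }).code === "EPERM";
  }
}

// A lock is live only when it is fresh AND its holder is still running.
// Same-pid locks count as live, so one process can hold several slugs.
function isLockLive(
  lock: LockInfo,
  nowMs: number,
  pidAlive: (pid: number) => boolean = isPidAlive,
): boolean {
  if (nowMs - lock.mtimeMs > PAGE_LOCK_TTL_MS) return false;
  return pidAlive(lock.pid);
}
```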

Tests: 10 cases — fresh acquire, live holder returns null, stale-mtime
reclaim, dead-PID reclaim, refresh updates timestamp, foreign-pid release
is no-op, withPageLock callback runs and releases on success/failure,
timeout-throws when held, SHA-256 filename safety for slashes/unicode.
All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 extract-takes: dual-path phase (fs|db) + since/until_date as TEXT

src/core/cycle/extract-takes.ts — new phase that materializes the takes
table from fenced markdown blocks. Two paths mirror src/commands/extract.ts:

- extractTakesFromFs: walk *.md under repoPath, parse fences, batch upsert
- extractTakesFromDb: iterate engine.getAllSlugs(), parse each page's
  compiled_truth+timeline, batch upsert (mutation-immune snapshot iteration)

Single dispatcher extractTakes(opts) routes by source. Honors:
- slugs filter for incremental re-extract (pipes from sync→extract)
- dryRun: count would-be upserts, write nothing
- rebuild: DELETE FROM takes WHERE page_id = $1 before re-insert (clean
  slate when markdown is canonical and DB has drifted)

Schema fix: since_date/until_date were DATE in the original v31 migration.
Spec uses partial dates ('2017-01', '2026-04-29 → 2026-06') that Postgres
DATE rejects. Changed to TEXT in both the Postgres and PGLite blocks so
parser-rendered ranges round-trip cleanly. Loses the ability to do
date-range arithmetic in SQL, but date math on opinion timelines is
out of scope for v0.28 anyway. utils.ts dateOrNull now annotated as
v0.28 TEXT-aware.

Migration v31 has not been deployed yet (this branch is the v0.28 release
candidate), so the type swap is free. No data migration needed.

Tests: test/extract-takes.test.ts — 5 cases against PGLite covering full
walk + fence-skip on no-fence pages, takes-table populated post-extract,
incremental slugs filter, dry-run no-write, rebuild=true clears + re-inserts
ad-hoc rows. test/takes-engine.test.ts (16), test/takes-fence.test.ts (15)
all still pass — 36/36 takes tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 takes CLI: list, search, add, update, supersede, resolve

src/commands/takes.ts — surfaces the engine methods + takes-fence library
through a single `gbrain takes <subcommand>` entrypoint:

  takes <slug>                          list with filters + sort
  takes search "<query>"                pg_trgm keyword search across all takes
  takes add <slug> --claim ... ...      append (markdown + DB, atomic via lock)
  takes update <slug> --row N ...       mutable-fields update (markdown + DB)
  takes supersede <slug> --row N ...    strikethrough old + append new
  takes resolve <slug> --row N --outcome  record bet resolution (immutable)

Markdown is canonical. Every mutate command:
  1. acquires the per-page file lock (withPageLock)
  2. re-reads the .md file
  3. applies the edit via takes-fence (upsertTakeRow / supersedeRow)
  4. writes the .md file back
  5. mirrors to the DB via the engine method
  6. releases the lock (auto via finally)

Resolve currently writes only to DB — surfacing resolved_* in the markdown
table is deferred to v0.29 (the takes-fence renderer's column set is
fixed at # | claim | kind | who | weight | since | source per spec).

Wired into src/cli.ts dispatch + CLI_ONLY allowlist. Help text follows the
project convention (orphans/embed/extract pattern). --dir flag overrides
sync.repo_path config when working outside the configured brain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 MCP + auth: takes_list / takes_search / think ops + per-token allow-list

OperationContext gains takesHoldersAllowList — server-side filter for
takes.holder field threaded from access_tokens.permissions through dispatch
into the engine SQL. Closes Codex P0 #3 at the dispatch layer (chunker
strip already closed the page-content side in the previous commit).

src/core/operations.ts — three new ops:
- takes_list: lists takes with holder/kind/active/resolved filters; honors
  ctx.takesHoldersAllowList for MCP-bound calls
- takes_search: pg_trgm keyword search; honors allow-list
- think: op surface registered (returns not_implemented envelope until
  Lane D's pipeline lands). Remote callers cannot save/take per Codex P1 #7.

src/mcp/dispatch.ts — DispatchOpts.takesHoldersAllowList threads into
buildOperationContext.

src/mcp/http-transport.ts — validateToken now reads
access_tokens.permissions.takes_holders, defaults to ['world'] when the
column is absent or malformed (default-deny on private hunches).
auth.takesHoldersAllowList passed to dispatchToolCall.
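
The default-deny read can be sketched as a small parser. `takesHoldersFromPermissions` is a hypothetical helper name, and treating an empty array as a valid "no holders" list is an assumption, not something this commit states.

```typescript
// Returns the per-token allow-list, falling back to ["world"] whenever the
// stored value is absent or malformed (default-deny on private holders).
function takesHoldersFromPermissions(permissions: unknown): string[] {
  const fallback = ["world"];
  if (typeof permissions !== "object" || permissions === null) return fallback;
  const holders = (permissions as Record<string, unknown>)["takes_holders"];
  if (!Array.isArray(holders)) return fallback;
  if (!holders.every((h): h is string => typeof h === "string")) return fallback;
  return holders;
}
```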

src/mcp/server.ts (stdio) — defaults to takesHoldersAllowList: ['world']
since stdio has no per-token auth. Operators wanting full visibility use
`gbrain call <op>` directly (sets remote=false).

src/commands/auth.ts — `gbrain auth create <name> --takes-holders w,g,b`
flag persists the per-token list; new `auth permissions <name>
set-takes-holders <list>` updates an existing token.

Tests: test/takes-mcp-allowlist.test.ts — 8 cases against PGLite proving
the threading: local-CLI sees all holders, ['world'] returns only public,
['world','garry'] returns 2/3, no-overlap returns empty (no fallback),
search honors allow-list, remote save/take on think rejected with
not_implemented envelope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28.0: ship-prep — VERSION, CHANGELOG, migration orchestrator, skill

Closes the v0.28 ship-prep cycle. Bumps VERSION + package.json + bun.lock
to 0.28.0. v0_28_0 migration orchestrator runs three idempotent phases on
upgrade:

- Schema verify: asserts schema_version >= 32 (migrations v31 + v32 already
  applied by the schema runner during gbrain upgrade); fails clean if not.
- Backfill takes: runs `extractTakes(engine, { source: 'db' })` inline so
  any pre-existing fenced takes tables in markdown populate the takes
  index. Idempotent; ON CONFLICT DO UPDATE keeps the table in sync.
- Re-chunk TODO: queues a pending-host-work entry asking the host agent
  to re-import pages with takes content so the v0.28 chunker-strip rule
  (Codex P0 #3 fix) applies retroactively. Pages imported under v0.28+
  already have takes content stripped from chunks at index time; this
  TODO catches up legacy pages.

skills/migrations/v0.28.0.md — agent-readable upgrade guide. Walks
through doctor verification, deprecated-key migration, MCP token
visibility configuration, and a "try the takes layer" smoke test.

CHANGELOG.md — v0.28.0 release-summary in the GStack voice (no AI
vocabulary, no em dashes, real numbers from git diff stat) + the
mandatory "To take advantage of v0.28.0" block + itemized changes by
subsystem (schema, engine, markdown surface, model config, MCP+auth,
CLI, tests, accepted risks).

Final test sweep: 65/65 v0.28 tests pass across 6 files. typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 think pipeline: gather → sanitize → synthesize → cite-render → CLI

src/core/think/sanitize.ts — prompt-injection defense for take claims:
14 jailbreak patterns (ignore-prior, role-jailbreak, close-take tag,
DAN, system-prompt overrides, eval-shell hooks) plus structural framing
(takes wrapped in <take id="..."> tags the model is told to treat as
DATA). Length-cap at 500 chars. Renders evidence blocks for the prompt.
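
A minimal sketch of the sanitize-then-frame idea described above. The three
patterns are a hypothetical subset (the commit describes 14), and the names
`SKETCH_PATTERNS`, `sanitizeClaim`, and `renderTake` are illustrative, not the
module's real exports.

```typescript
// Hypothetical subset of jailbreak patterns; the real list is longer.
const SKETCH_PATTERNS: RegExp[] = [
  /ignore (?:all )?(?:prior|previous) instructions/gi, // ignore-prior
  /you are now [^.\n]*/gi, // role-jailbreak
  /<\/?take\b[^>]*>/gi, // close-take tag smuggling
];

const MAX_CLAIM_CHARS = 500;

// Neutralize known jailbreak phrases, then cap the length.
function sanitizeClaim(raw: string): string {
  let text = raw;
  for (const pattern of SKETCH_PATTERNS) {
    text = text.replace(pattern, "[filtered]");
  }
  return text.slice(0, MAX_CLAIM_CHARS);
}

// Wrap the sanitized claim so the prompt can tell the model it is DATA.
function renderTake(id: number, claim: string): string {
  return `<take id="${id}">${sanitizeClaim(claim)}</take>`;
}
```

Filtering the `<take>` tag itself matters: without it a claim could close its
own wrapper and smuggle text outside the DATA frame.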

src/core/think/prompt.ts — system prompt + structured-output schema.
Hard rules: cite every claim, mark hunches/low-weight explicitly,
surface conflicts (never silently pick), surface gaps. JSON schema
with answer + citations[] + gaps[]. Prompt adapts to anchor / time
window / save flag.

src/core/think/cite-render.ts — structured citations + regex fallback
(Codex P1 #4 fold). normalizeStructuredCitations validates the model's
structured output; parseInlineCitations is the body-scan fallback when
the model omits the structured field. resolveCitations dispatches and
records CITATIONS_REGEX_FALLBACK warning when used.

src/core/think/gather.ts — 4-stream parallel retrieval:
  1. hybridSearch (pages, existing primitive)
  2. searchTakes (keyword, pg_trgm)
  3. searchTakesVector (vector, when embedQuestion fn supplied)
  4. traversePaths (graph, when --anchor set)
RRF fusion (k=60). Each stream wrapped in try/catch — partial gather
beats no synthesis. Honors takesHoldersAllowList for MCP-bound calls.
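
For readers unfamiliar with RRF, a sketch of reciprocal rank fusion with k=60
plus the "partial gather beats no synthesis" wrapper. It assumes each stream
returns ids ordered best-first; `rrfFuse` and `gatherSafely` are hypothetical
names, not the functions in gather.ts.

```typescript
// Each stream is a list of ids ordered best-first.
function rrfFuse(
  streams: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const stream of streams) {
    stream.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Run every stream in parallel; a failing stream contributes nothing
// instead of failing the whole gather.
async function gatherSafely(
  streams: Array<() => Promise<string[]>>,
): Promise<string[][]> {
  return Promise.all(
    streams.map(async (run) => {
      try {
        return await run();
      } catch {
        return [];
      }
    }),
  );
}
```

An id that appears in several streams accumulates score from each, so
agreement across retrieval methods outranks a single top hit.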

src/core/think/index.ts — runThink orchestrator + persistSynthesis:
INTENT (regex classify) → GATHER → render evidence blocks → resolveModel
('models.think' → 'models.default' → GBRAIN_MODEL → opus) → LLM call
(injectable client) → JSON parse with code-fence + fallback strip →
resolveCitations → ThinkResult. persistSynthesis writes a synthesis
page + synthesis_evidence rows (page_id resolved per slug; page-level
citations skip evidence). Degrades gracefully without ANTHROPIC_API_KEY.
Round-loop scaffolding in place (rounds=1 only path exercised in v0.28).
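
The "JSON parse with code-fence + fallback strip" step could look roughly like
this. It is a sketch under assumptions: `parseModelJson` is a hypothetical
name, and the exact fallback order in index.ts is not stated in this commit.

```typescript
// Build the fence marker at runtime so this sketch does not embed one.
const FENCE = "`".repeat(3);
const FENCED_BODY = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE);

// Returns the parsed value, or null when nothing parseable is found.
function parseModelJson(raw: string): unknown {
  // 1. Prefer the body of a fenced block when the model emitted one.
  const fenced = raw.match(FENCED_BODY);
  const candidate = (fenced?.[1] ?? raw).trim();
  try {
    return JSON.parse(candidate);
  } catch {
    // fall through to the brace-slice fallback
  }
  // 2. Fallback: strip chatter around the outermost {...} span.
  const start = candidate.indexOf("{");
  const end = candidate.lastIndexOf("}");
  if (start !== -1 && end > start) {
    try {
      return JSON.parse(candidate.slice(start, end + 1));
    } catch {
      // give up; the caller takes the malformed-output path
    }
  }
  return null;
}
```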

src/commands/think.ts — `gbrain think "<question>"` CLI. Flag parsing
strips --anchor, --rounds, --save, --take, --model, --since, --until,
--json. Local CLI = remote=false, so save/take honored. Human-readable
output by default; --json for agent consumption.

operations.ts — `think` op now calls runThink (was a not_implemented
stub). Remote callers can't save/take per Codex P1 #7. Returns full
ThinkResult plus saved_slug + evidence_inserted.

cli.ts — wired into dispatch + CLI_ONLY allowlist.

Tests: test/think-pipeline.test.ts — 18 cases against PGLite covering
sanitize patterns, structural rendering, citation parsing (structured +
regex fallback + dedup + invalid-slug rejection), gather streams +
allow-list filter, full pipeline with stub client, malformed-LLM
fallback path, no-API-key graceful degradation, persistSynthesis writes
page + evidence rows. All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 dream phases: auto-think + drift + budget meter (Codex P1 #10 fold)

src/core/anthropic-pricing.ts — USD/1M-tokens map for Claude 4.7 family
plus older aliases. estimateMaxCostUsd returns null on unpriced models so
the meter caller can warn-once and bypass the gate.

src/core/cycle/budget-meter.ts — cumulative cost ledger. Each submit
estimates max-cost from (model + estimatedInputTokens + maxOutputTokens),
accumulates per-cycle, refuses next submit when projected > cap. Codex
P1 #10 fold: non-Anthropic models (gemini, gpt) bypass with one stderr
warn per process and `unpriced=true` on the result. Budget=0 disables
the gate. Audit trail at ~/.gbrain/audit/dream-budget-YYYY-Www.jsonl.
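
A sketch of the gate logic described above, without the audit ledger. The
price table holds a placeholder entry, not real rates, and the class shape is
an assumption; only the behaviors (max-cost estimate, cumulative deny,
budget=0 disables, unpriced bypass with one warning) come from the commit.

```typescript
interface ModelPrice {
  inputPerMTok: number; // USD per 1M input tokens
  outputPerMTok: number; // USD per 1M output tokens
}

// Placeholder entry for illustration only; not a real model or rate.
const PRICES_USD = new Map<string, ModelPrice>([
  ["example-priced-model", { inputPerMTok: 10, outputPerMTok: 50 }],
]);

function estimateMaxCostUsd(
  model: string,
  inputTokens: number,
  maxOutputTokens: number,
): number | null {
  const price = PRICES_USD.get(model);
  if (price === undefined) return null; // unpriced model
  return (
    (inputTokens * price.inputPerMTok + maxOutputTokens * price.outputPerMTok) /
    1_000_000
  );
}

class BudgetMeterSketch {
  private spentUsd = 0;
  private warned = false;
  private readonly capUsd: number;

  constructor(capUsd: number) {
    this.capUsd = capUsd;
  }

  check(model: string, inputTokens: number, maxOutputTokens: number) {
    const cost = estimateMaxCostUsd(model, inputTokens, maxOutputTokens);
    if (cost === null) {
      if (!this.warned) {
        console.error(`budget: no price for ${model}; bypassing gate`);
        this.warned = true; // warn once per meter in this sketch
      }
      return { allowed: true, unpriced: true };
    }
    // capUsd === 0 disables the gate entirely.
    if (this.capUsd !== 0 && this.spentUsd + cost > this.capUsd) {
      return { allowed: false, unpriced: false };
    }
    this.spentUsd += cost;
    return { allowed: true, unpriced: false };
  }
}
```

Estimating from `maxOutputTokens` rather than actual usage makes the gate
conservative: a submit is refused if its worst case would exceed the cap.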

src/core/cycle/auto-think.ts — auto_think dream phase. Reads
dream.auto_think.{enabled,questions,max_per_cycle,budget,cooldown_days,
auto_commit}. Iterates configured questions through runThink with the
BudgetMeter pre-checking each submit. Cooldown timestamp written ONLY on
success (matches v0.23 synthesize pattern — retries after partial
failures pick back up). When auto_commit=true, persists synthesis pages
via persistSynthesis. Default-disabled.

src/core/cycle/drift.ts — drift dream phase scaffold. Reads
dream.drift.{enabled,lookback_days,budget,auto_update}. Surfaces takes
in the soft band (weight 0.3-0.85, unresolved) that have recent timeline
evidence on the same page. v0.28 ships the orchestration; the LLM judge
that proposes weight adjustments lands in v0.29. modelId + meter wired
now so the ledger captures gate state for callers that opt in.

Tests:
- test/budget-meter.test.ts (7 cases) — pricing-map coverage, allow path,
  cumulative-deny, budget=0 disabled, unpriced bypass+warn-once, ledger
  captures all events, ISO-week filename branch.
- test/auto-think-phase.test.ts (9 cases) — auto_think enable/skip,
  questions empty, success → cooldown ts written, cooldown blocks rerun,
  budget exhausted → partial. drift not_enabled, soft-band candidate
  detection, complete + dry-run paths.

All pass. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 e2e Postgres: takes engine + extract + MCP allow-list (12 cases)

test/e2e/takes-postgres.test.ts — full v0.28 takes pipeline against real
Postgres (gated on DATABASE_URL). 12 cases:
- addTakesBatch upsert via unnest() bind path (Postgres-specific)
- listTakes filters: holder, kind, sort=weight, takesHoldersAllowList
- searchTakes pg_trgm + allow-list filter
- supersedeTake transactional path (BEGIN/COMMIT semantics)
- resolveTake immutability — second resolve throws TAKE_ALREADY_RESOLVED
- synthesis_evidence FK CASCADE on take delete
- countStaleTakes + listStaleTakes filter active+null
- extractTakesFromDb populates takes from fenced markdown
- MCP dispatch with takesHoldersAllowList=['world'] returns only world
- MCP dispatch local-CLI path returns all holders
- MCP dispatch takes_search honors allow-list
- think op forces remote_persisted_blocked even for save+take

postgres-engine.ts: addTakesBatch boolean[] serialization fix.
postgres-js auto-detects element type from JS arrays; for booleans it
mis-detects as scalar. Cast through text[] (`'true' | 'false'`) then
SQL-cast to boolean[] — same pattern other batch methods rely on for
type-stable bind shapes.

test/e2e/helpers.ts: setupDB now (a) tolerates non-existent tables in
TRUNCATE (for fresh DBs where v31 hasn't yet created takes/synthesis_evidence)
and (b) calls engine.initSchema() to actually run migrations.

test/takes-mcp-allowlist.test.ts: updated 2 think-op cases to match
Lane D's landed pipeline. They previously asserted not_implemented
envelopes; now they assert remote_persisted_blocked + NO_ANTHROPIC_API_KEY
graceful-degrade behavior.

Run: DATABASE_URL=postgres://localhost:5435/gbrain_test bun test test/e2e/takes-postgres.test.ts
Result: 12/12 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 dream phases: local DreamPhaseResult type (avoid premature CyclePhase enum extension)

cycle.ts's PhaseResult is shaped {phase, status, summary, details} with a
narrow PhaseStatus enum ('ok'|'warn'|'fail'|'skipped') and CyclePhase enum
that doesn't yet include 'auto_think'/'drift'. The phases ship standalone
in v0.28 (cycle.ts dispatcher integration is v0.28.x); using PhaseResult
forced premature enum extension.

Introduces DreamPhaseResult exported from auto-think.ts:
  { name: 'auto_think'|'drift'; status: 'complete'|'partial'|'failed'|'skipped';
    detail: string; totals?: Record<string,number>; duration_ms: number }

drift.ts re-exports the same type. When v0.28.x wires the dispatcher, the
adapter at the call site can map DreamPhaseResult → PhaseResult cleanly.
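
One way the later adapter might look. The two result shapes are taken from the
text above; the status mapping (complete→ok, partial→warn, failed→fail) and
typing `phase` as a plain string are assumptions, since the dispatcher wiring
has not landed.

```typescript
type PhaseStatus = "ok" | "warn" | "fail" | "skipped";

interface PhaseResult {
  phase: string; // assumption: the real field is the CyclePhase enum
  status: PhaseStatus;
  summary: string;
  details: Record<string, unknown>;
}

interface DreamPhaseResult {
  name: "auto_think" | "drift";
  status: "complete" | "partial" | "failed" | "skipped";
  detail: string;
  totals?: Record<string, number>;
  duration_ms: number;
}

// Assumed mapping; the commit does not specify it.
const STATUS_MAP: Record<DreamPhaseResult["status"], PhaseStatus> = {
  complete: "ok",
  partial: "warn",
  failed: "fail",
  skipped: "skipped",
};

function toPhaseResult(r: DreamPhaseResult): PhaseResult {
  return {
    phase: r.name,
    status: STATUS_MAP[r.status],
    summary: r.detail,
    details: { totals: r.totals ?? {}, duration_ms: r.duration_ms },
  };
}
```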

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 e2e: access_tokens.permissions JSONB end-to-end (5 cases)

test/e2e/auth-permissions.test.ts — closes the v0.28 token-allow-list
verification loop against real Postgres. Exercises:

- Migration v32 default backfill: tokens inserted without an explicit
  permissions value get {takes_holders: ["world"]} via the schema DEFAULT clause.
- Explicit ["world","garry"] → dispatch.takes_list filters to those
  holders only; brain hunches stay hidden from this token.
- ["world"] default-deny token → takes_search hits filtered to public claims.
- {} permissions row (operator tampered) gracefully defaults to ["world"]
  via the HTTP transport's validateToken parsing.
- revoked_at IS NOT NULL → token excluded from active token query.
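
The default-deny parsing in the `{}` case can be sketched as follows.
`parseTakesHolders` is a hypothetical helper; how validateToken treats an
explicitly empty array is not stated above, so this sketch passes it through
unchanged.

```typescript
const DEFAULT_TAKES_HOLDERS: readonly string[] = ["world"];

// Accepts whatever came back from the permissions JSONB column.
function parseTakesHolders(permissions: unknown): string[] {
  if (typeof permissions === "object" && permissions !== null) {
    const holders = (permissions as { takes_holders?: unknown }).takes_holders;
    if (
      Array.isArray(holders) &&
      holders.every((h): h is string => typeof h === "string")
    ) {
      return [...holders];
    }
  }
  // Missing, tampered, or malformed: fall back to public-only visibility.
  return [...DEFAULT_TAKES_HOLDERS];
}
```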

Avoids the postgres-js JSONB double-encode trap (CLAUDE.md memory): pass
the object directly to executeRaw, no JSON.stringify, no ::jsonb cast.

All 5 pass against pgvector/pgvector:pg16 on port 5435. Combined v0.28
test sweep: 116/116 across 11 files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 e2e: chunker takes-strip integration test (Codex P0 #3 verification)

test/e2e/chunker-takes-strip.test.ts — verifies the chunker actually
strips fenced takes content end-to-end through the import pipeline.
This is the Codex P0 #3 fix's verification path: takes content lives
ONLY in the takes table for retrieval, never duplicated in
content_chunks where the per-token MCP allow-list cannot reach.

5 cases:
- chunkText (unit) output never contains TAKES_FENCE_BEGIN/END markers
- chunkText output never contains fenced claim text
- chunkText output retains non-fence prose (no over-stripping)
- importFromContent end-to-end: imported page has chunks but none
  contain fenced content
- takes_fence_chunk_leak doctor invariant: zero rows globally where
  chunk_text matches `<!--- gbrain:takes:%`

Final v0.28 test sweep:
  121 pass, 0 fail, 336 expect() calls, 12 files
  Coverage: schema migrations, engine methods (PGLite + Postgres),
  takes-fence parser, page-lock, extract phase, takes CLI engine
  surface, model config 6-tier resolver, MCP+auth allow-list,
  think pipeline (gather + sanitize + cite-render + synthesize),
  auto-think + drift + budget meter, JSONB end-to-end, chunker
  strip integration. ~95% of v0.28 surface area covered.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix CI: apply-migrations skippedFuture arrays + http-transport SQL mock

Two CI failures from PR #563:

test/apply-migrations.test.ts (2 fails) — `buildPlan` tests assert exact
skippedFuture arrays at fixed installed-version stamps. Adding v0.28.0 to
the migration registry means it shows up in skippedFuture when the test
runs at installed=0.11.1 / installed=0.12.0. Append '0.28.0' to both
hardcoded arrays.

test/http-transport.test.ts (8 fails) — the FakeEngine mock string-prefix
matches `SELECT id, name FROM access_tokens` to return a row. v0.28's
validateToken now selects `SELECT id, name, permissions FROM access_tokens`
to read the per-token takes_holders allow-list. Mock returned [] on the
new query → validateToken treated every token as invalid → 401.

Fix: mock now matches both query shapes. validTokens row gets a default
`{takes_holders: ['world']}` permission injected when caller didn't
supply one (mirrors the migration v32 column DEFAULT). Updated
FakeEngineConfig type to allow tests to pass explicit permissions.

Verification:
  bun test test/apply-migrations.test.ts → 18/18 pass
  bun test test/http-transport.test.ts   → 24/24 pass
  bun run typecheck                       → clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix CI: add scope annotations to v0.28 ops (takes_list/takes_search/think)

test/oauth.test.ts enforces an invariant from master's v0.26 OAuth landing:
every Operation must have `scope: 'read' | 'write' | 'admin'`, and any op
flagged `mutating: true` must be 'write' or 'admin'. My v0.28 ops were added
before master shipped v0.26 + the new invariant; the merge surfaced the gap.

Annotations:
- takes_list   → read
- takes_search → read
- think        → write (mutating: true; --save persists synthesis page)
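
The invariant that test/oauth.test.ts enforces can be restated as a small
check. This is a sketch: `OperationMeta` and `scopeViolations` are
illustrative names, and the real test's structure is not shown in this commit.

```typescript
type Scope = "read" | "write" | "admin";

interface OperationMeta {
  name: string;
  scope?: Scope;
  mutating?: boolean;
}

// Returns one message per op that breaks the invariant.
function scopeViolations(ops: OperationMeta[]): string[] {
  const problems: string[] = [];
  for (const op of ops) {
    if (op.scope === undefined) {
      problems.push(`${op.name}: missing scope`);
    } else if (op.mutating === true && op.scope === "read") {
      problems.push(`${op.name}: mutating op must be 'write' or 'admin'`);
    }
  }
  return problems;
}
```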

Verification:
  bun test test/oauth.test.ts → 42/42 pass
  bun run typecheck            → clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(v0.28.1): export INJECTION_PATTERNS for shared sanitization

The same pattern set protects takes from prompt-injection (think/sanitize.ts)
and now retrieved chat content in the LongMemEval harness. One source of
truth for both surfaces; adding a new pattern in this file automatically
covers benchmarks too.

Existing consumers (sanitizeTakeForPrompt, renderTakesBlock) keep working
unchanged. Verified via test/think-pipeline.test.ts (18 pass, 0 fail).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(v0.28.1): longmemeval harness — reset-in-place over in-memory PGLite

One in-memory PGLiteEngine per benchmark run; TRUNCATE between questions
with runtime-enumerated tables via pg_tables so future schema migrations
don't silently leak across questions. Infrastructure tables (sources,
config, gbrain_cycle_locks, subagent_rate_leases) preserved across resets
so initSchema-seeded rows like sources.'default' survive (FK target for
pages.source_id).
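
The reset can be sketched as "enumerate, filter, truncate". The preserved
table names come from the text above; the `public` schema filter, the CASCADE
clause, and the helper name `buildTruncateSql` are assumptions.

```typescript
// Tables whose initSchema-seeded rows must survive a reset.
const PRESERVED_TABLES = new Set([
  "sources",
  "config",
  "gbrain_cycle_locks",
  "subagent_rate_leases",
]);

// Run this first and feed the resulting names to buildTruncateSql.
const LIST_TABLES_SQL =
  "SELECT tablename FROM pg_tables WHERE schemaname = 'public'";

// Returns one TRUNCATE statement, or null when nothing needs clearing.
function buildTruncateSql(tableNames: string[]): string | null {
  const targets = tableNames.filter((t) => !PRESERVED_TABLES.has(t)).sort();
  if (targets.length === 0) return null;
  const quoted = targets.map((t) => `"${t.replace(/"/g, '""')}"`);
  return `TRUNCATE TABLE ${quoted.join(", ")} CASCADE`;
}
```

Because the table list is read at runtime, a table added by a later migration
is truncated without anyone updating the harness.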

Files:
- src/eval/longmemeval/harness.ts: createBenchmarkBrain + resetTables +
  withBenchmarkBrain. ~50 lines, no class wrapper.
- src/eval/longmemeval/adapter.ts: pure haystackToPages() converter.
  Slug prefix `chat/` (verified non-matching against DEFAULT_SOURCE_BOOSTS).
- src/eval/longmemeval/sanitize.ts: re-uses INJECTION_PATTERNS from
  think/sanitize.ts; wraps each session in <chat_session id date> tags;
  4000-char cap.
- test/longmemeval-sanitize.test.ts: 12 cases pinning the F8 contract.

Hermetic: no DATABASE_URL, no API keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(v0.28.1): gbrain eval longmemeval CLI command

Run the LongMemEval public benchmark against gbrain's hybrid retrieval.
Dataset is a positional path (download from xiaowu0162/longmemeval on HF).
Per-question loop wraps everything in try/catch, so one bad question doesn't
kill the run; an error JSONL line is emitted instead.

Wiring:
- src/cli.ts: pre-dispatch bypass for `eval longmemeval` so the user's
  ~/.gbrain brain is never opened. Hermeticity gate verified: --help works
  on machines with no gbrain config.
- src/commands/eval-longmemeval.ts: arg parsing, JSONL emit (LF + UTF-8
  pinned), hybridSearch with optional expandQuery from search/expansion.ts,
  resolveModel from model-config.ts (6-tier chain), ThinkLLMClient injection
  seam from think/index.ts, structural <chat_session> framing.
- test/eval-longmemeval.test.ts: 12 cases covering harness lifecycle,
  reset clears all tables, schema-migration robustness, p50/p99 speed gate
  (warm reset+import+search target <500ms), adapter shape, source-boost
  regression guard, end-to-end with stubbed LLM, JSONL format guard,
  per-question failure handling.
- test/fixtures/longmemeval-mini.jsonl: 5 hand-authored questions with
  keyword-friendly overlap so --keyword-only works in CI.

Speed: warm reset+import 5 pages+search p50=25.9ms p99=30.3ms locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(v0.28.1): bump VERSION + CHANGELOG

VERSION + package.json synchronized at 0.28.1. CHANGELOG entry uses the
release-summary voice + "To take advantage of v0.28.1" block per CLAUDE.md.

Sequential release on garrytan/v0.28-release; lands after v0.28.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: surface v0.28.1 LongMemEval CLI across project docs

- README.md: add EVAL section to Commands reference (eval --qrels, export,
  prune, replay, longmemeval); add v0.28.1 announce paragraph next to the
  v0.25.0 BrainBench-Real intro.
- CLAUDE.md: add Key files entry for src/eval/longmemeval/ +
  src/commands/eval-longmemeval.ts; add "Key commands added in v0.28.1"
  subsection (mirrors the v0.26.5 / v0.25.0 pattern); inventory
  test/eval-longmemeval.test.ts + test/longmemeval-sanitize.test.ts under
  the unit-test list.
- docs/eval-bench.md: cross-link from the "What it actually does" section
  to LongMemEval as the third evaluation axis (public benchmark,
  ground-truth labels, full QA pipeline); append "Public benchmarks:
  LongMemEval (v0.28.1)" section with architecture, flags table, and
  perf numbers.
- CONTRIBUTING.md: append a paragraph after the eval-replay block pointing
  contributors at gbrain eval longmemeval for public-benchmark coverage.
- AGENTS.md: extend the existing eval-retrieval bullet with a one-line
  mention of gbrain eval longmemeval.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28.2 feat: remote-source MCP + scope hierarchy + whoami (#690)

* refactor(core): extract SSRF helpers from integrations.ts to core/url-safety.ts

src/core/git-remote.ts (next commit) needs isInternalUrl etc. but importing
from src/commands/ would invert the layering boundary (no existing
src/core/ file imports from src/commands/). Extract the SSRF helpers
(parseOctet, hostnameToOctets, isPrivateIpv4, isInternalUrl) into a new
src/core/url-safety.ts and have integrations.ts re-export for backward
compat. test/integrations.test.ts continues to pass without changes (110
existing tests, 214 expects).

Why this matters for v0.28: the upcoming sources --url feature reuses
this SSRF gate for git-clone URL validation. Codex review caught that
re-rolling weaker URL classification would regress on the IPv6/v4-mapped/
metadata/CGNAT bypass forms that integrations.ts already handles.
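
To make the CGNAT point concrete, a deliberately simplified sketch of a
private-IPv4 check. It handles dotted-decimal hosts only; the hex, octal,
single-integer, and IPv6 forms the real helpers cover are out of scope here,
and the function names are hypothetical.

```typescript
type Octets = [number, number, number, number];

// Dotted-decimal only. Anything else returns null in this sketch.
function parseDottedIpv4(host: string): Octets | null {
  const parts = host.split(".");
  if (parts.length !== 4 || !parts.every((p) => /^\d{1,3}$/.test(p))) {
    return null;
  }
  const nums = parts.map(Number);
  if (nums.some((n) => n > 255)) return null;
  return nums as Octets;
}

function isPrivateIpv4Sketch(host: string): boolean {
  const octets = parseDottedIpv4(host);
  if (octets === null) return false; // not an IPv4 literal; other checks apply
  const [a, b] = octets;
  return (
    a === 0 || // "this network"
    a === 10 || // RFC1918 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // RFC1918 172.16.0.0/12
    (a === 192 && b === 168) || // RFC1918 192.168.0.0/16
    a === 127 || // loopback
    (a === 169 && b === 254) || // link-local, includes cloud metadata
    (a === 100 && b >= 64 && b <= 127) // CGNAT 100.64.0.0/10
  );
}
```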

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): add git-remote module — SSRF-defensive clone/pull + state probe

New src/core/git-remote.ts (~210 lines) for v0.28's remote-source feature:

- GIT_SSRF_FLAGS exported const: -c http.followRedirects=false,
  -c protocol.file.allow=never, -c protocol.ext.allow=never,
  --no-recurse-submodules. Single source of truth shared by cloneRepo
  and pullRepo so a future flag added to one path lands on both.
  Closes the SSRF surfaces codex flagged: DNS rebinding via redirects,
  .gitmodules as a second-fetch surface, file:// scheme in remotes.

- parseRemoteUrl: https-only, rejects embedded credentials and path
  traversal, delegates internal-target classification to isInternalUrl
  from url-safety.ts (covers RFC1918, link-local, loopback, IPv6, CGNAT
  100.64/10, metadata hostnames, hex/octal/single-int bypass forms).
  GBRAIN_ALLOW_PRIVATE_REMOTES=1 escape hatch with stderr warning is
  needed for self-hosted git over Tailscale (CGNAT trips the gate).

- cloneRepo: --depth=1 default (full clone via depth: 0); refuses
  non-empty destDirs; spawns git via execFileSync (no shell injection)
  with GIT_TERMINAL_PROMPT=0 + askpass=/bin/false to prevent credential
  prompts. timeoutMs default 600s.

- pullRepo: -C path + GIT_SSRF_FLAGS + pull --ff-only, same env confine.

- validateRepoState: 6-state decision tree (missing | not-a-dir |
  no-git | corrupted | url-drift | healthy). Used by performSync's
  re-clone branch to recover from removed clone dirs and refuse syncs on
  url-drift or corruption.
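
A sketch of the syntactic half of the parseRemoteUrl gate (scheme,
credentials, traversal), using the standard `URL` class. The internal-target
check is omitted, and the result shape and reason strings are hypothetical.

```typescript
type ParseResult =
  | { ok: true; url: URL }
  | { ok: false; reason: string };

function parseRemoteUrlSketch(raw: string): ParseResult {
  // Check traversal on the raw string: the URL parser normalizes ".."
  // segments away, so inspecting url.pathname afterwards is too late.
  if (/(^|\/)(\.|%2e){2}(\/|$)/i.test(raw)) {
    return { ok: false, reason: "path_traversal" };
  }
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: "invalid_url" };
  }
  if (url.protocol !== "https:") {
    return { ok: false, reason: "https_only" };
  }
  if (url.username !== "" || url.password !== "") {
    return { ok: false, reason: "embedded_credentials" };
  }
  return { ok: true, url };
}
```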

test/git-remote.test.ts (304 lines, 32 tests): GIT_SSRF_FLAGS exact
shape, all parseRemoteUrl rejection cases including dedicated CGNAT
100.64/10 with/without GBRAIN_ALLOW_PRIVATE_REMOTES (codex T3 case),
fake-git harness for argv assertions on cloneRepo/pullRepo, all 6
validateRepoState branches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): add scope hierarchy + ALLOWED_SCOPES allowlist

New src/core/scope.ts (~120 lines) for v0.28's scoped MCP feature.

Hierarchy:
  - admin implies all (escape hatch)
  - write implies read
  - sources_admin and users_admin are siblings (different axes —
    sources-mgmt vs user-account-mgmt; neither implies the other)

Exported:
  - hasScope(grantedScopes, requiredScope): the canonical scope check.
    Replaces exact-string-match at three call sites in upcoming commits
    (serve-http.ts:673, oauth-provider.ts:365 F3 refresh, oauth-provider.ts:498
    token issuance). Without this rewrite, an admin-grant token would
    fail to refresh down to sources_admin (codex finding).
  - ALLOWED_SCOPES set + ALLOWED_SCOPES_LIST sorted array (deterministic
    for OAuth metadata wire format and drift-check output).
  - assertAllowedScopes / InvalidScopeError: registration-time gate so
    tokens with bogus scope strings (read flying-unicorn) get rejected
    with RFC 6749 §5.2 invalid_scope at auth.ts:296 + DCR /register +
    registerClientManual. Today's behavior accepts any string silently.
  - parseScopeString: space-separated wire format → array.

Forward-compat: hasScope ignores unknown granted scopes rather than
throwing, so pre-allowlist tokens with weird scope strings continue
working without crashes (registration is the gate, runtime is best-effort).
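
The hierarchy above fits in a small lookup table. This sketch encodes only the
three listed rules; it is not the contents of src/core/scope.ts, and any
further implication would need to be added to the table.

```typescript
const ALLOWED_SCOPES_LIST = [
  "admin",
  "read",
  "sources_admin",
  "users_admin",
  "write",
] as const;
type Scope = (typeof ALLOWED_SCOPES_LIST)[number];

// granted scope -> every scope it satisfies (including itself)
const IMPLIES = new Map<string, readonly Scope[]>([
  ["admin", ALLOWED_SCOPES_LIST], // admin implies all
  ["write", ["write", "read"]], // write implies read
  ["read", ["read"]],
  ["sources_admin", ["sources_admin"]], // siblings: no cross-implication
  ["users_admin", ["users_admin"]],
]);

// Unknown granted scopes are ignored rather than throwing.
function hasScope(granted: readonly string[], required: Scope): boolean {
  return granted.some((g) => IMPLIES.get(g)?.includes(required) ?? false);
}

// Space-separated wire format -> array; tolerates null/undefined/empty.
function parseScopeString(wire: string | null | undefined): string[] {
  return (wire ?? "").split(/\s+/).filter((s) => s.length > 0);
}
```

Usage at a gate would be along the lines of
`hasScope(parseScopeString(token.scope), op.scope)`, where `token` and `op`
are placeholders for whatever the call site has in hand.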

test/scope.test.ts (178 lines, 35 tests): hierarchy table including
all-implies for admin, sibling non-implication of *_admin scopes,
write→read but not the reverse, F3 refresh-token subset semantics
under hasScope, ALLOWED_SCOPES_LIST sorted-pinning, allowlist
rejection cases, parseScopeString edge cases (undefined/null/empty).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(admin): scope-constants mirror + drift CI for src/core/scope.ts

The admin React SPA's tsconfig.json scopes include: ['src'] to admin/src/,
so it cannot directly import ../../src/core/scope.ts. The plan considered
widening the include or generating a single source of truth; each option
either couples the SPA to the gbrain monorepo or adds a build step. Eng
review picked the boring choice: hand-maintained mirror at
admin/src/lib/scope-constants.ts plus a CI drift check.

Files:
  - admin/src/lib/scope-constants.ts: hand-maintained ALLOWED_SCOPES_LIST
    duplicate, sorted alphabetically to match src/core/scope.ts.
  - scripts/check-admin-scope-drift.sh: extracts the list from each file
    via awk, normalizes via tr/sort, diffs. Exits 0 on match, 1 on drift
    (with full breakdown of which scopes diverged), 2 on internal error.
    Tested both passing and corrupted paths.
  - package.json: wires check:admin-scope-drift into both `verify` and
    `check:all` so any update to src/core/scope.ts that forgets the
    admin-side mirror fails the build.

The Agents.tsx scope-checkbox sites (5 hardcoded locations) get updated
in a later commit to import from this constants file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(oauth): hasScope hierarchy + ALLOWED_SCOPES allowlist at registration

Switch three call sites in oauth-provider.ts from exact-string-match to
hasScope() so the v0.28 sources_admin and users_admin scopes — and the
admin-implies-all + write-implies-read hierarchy in src/core/scope.ts —
work end to end:

- F3 refresh-token subset enforcement at line 365: previously rejected
  admin → sources_admin refresh because exact-match treated them as
  unrelated scopes. gstack /setup-gbrain Path 4 needs admin tokens to
  refresh down to least-privilege sources_admin scope; this fix lands
  that path.

- Token issuance intersection at line 498 (client_credentials grant):
  same hasScope swap so a client whose stored grant is `admin` can mint
  tokens including any implied scope.

- registerClient (DCR /register) and registerClientManual: validate
  every scope string against ALLOWED_SCOPES via assertAllowedScopes.
  Pre-fix the system silently accepted `--scopes "read flying-unicorn"`
  and persisted the bogus string in oauth_clients.scope. Post-fix the
  caller gets RFC 6749 §5.2 invalid_scope. Existing rows with
  pre-allowlist scopes keep working (allowlist gates registration only).

Tests amended in test/oauth.test.ts:
- T1 (eng-review): admin grant CAN refresh down to sources_admin
- T1 sibling: write grant CANNOT refresh up to sources_admin
- ALLOWED_SCOPES allowlist coverage (manual + DCR paths, all 5 valid)
- Scope-annotation contract tests widened to accept the v0.28 union

62 OAuth tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(serve-http): hasScope at /mcp + advertise full ALLOWED_SCOPES

Two changes against src/commands/serve-http.ts:

- Line 195: scopesSupported on the mcpAuthRouter options switches from the
  hardcoded ['read','write','admin'] to Array.from(ALLOWED_SCOPES_LIST).
  Without this, /.well-known/oauth-authorization-server keeps reporting
  the old triple, so MCP clients (Claude Desktop, ChatGPT, Perplexity)
  cannot discover the v0.28 sources_admin and users_admin scopes via
  standard discovery — they would have to be pre-configured out of band.

- Line 673: request-time scope check on /mcp swaps
  authInfo.scopes.includes(requiredScope) for hasScope(...). This was
  the most-cited codex finding: without it, sources_admin tokens could
  not even satisfy a `read`-scoped op (sources_admin doesn't include
  the literal string "read"). hasScope routes through the hierarchy
  table in src/core/scope.ts so admin implies all and write implies
  read at the gate too.

T2 amendment in test/e2e/serve-http-oauth.test.ts: assert
/.well-known/oauth-authorization-server includes all 5 scopes in
scopes_supported. Pre-v0.28 the list was hardcoded to ['read','write',
'admin'] and this assertion would have failed. (The test is
Postgres-gated; runs under bun run test:e2e with DATABASE_URL set.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): sources-ops module — atomic clone + symlink-safe cleanup

src/core/sources-ops.ts (~470 lines): pure async functions extracted from
src/commands/sources.ts so the CLI handlers and the new MCP ops share
one implementation.

addSource: D3 atomicity contract from the eng review.
  1. Validate id (matches existing SOURCE_ID_RE).
  2. Q4 pre-flight SELECT — fail loudly with structured `source_id_taken`
     before any clone work. Pre-fix the existing CLI used INSERT…ON
     CONFLICT DO NOTHING which silently no-op'd; with clone-first that
     would orphan the temp dir.
  3. parseRemoteUrl gate (delegates to isInternalUrl from url-safety.ts).
  4. Clone into $GBRAIN_HOME/clones/.tmp/<id>-<rand>/ via the new
     git-remote helpers.
  5. INSERT row with local_path=<final clone dir>, config.remote_url=<url>.
  6. fs.renameSync(tmp/, final/). Rollback on either-side failure unlinks
     the temp dir; rename-failed path also DELETEs the just-INSERTed row
     best-effort.
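
The clone-then-insert-then-rename ordering can be sketched with injected
dependencies. This is a synchronous simplification: the real functions are
described as async, step 3 (the URL gate) is left out, and `AddSourceDeps` and
`addSourceSketch` are hypothetical names.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical seams standing in for the DB and git helpers.
interface AddSourceDeps {
  idTaken(id: string): boolean;
  clone(url: string, destDir: string): void;
  insertRow(id: string, localPath: string, url: string): void;
  deleteRow(id: string): void;
}

function addSourceSketch(
  clonesDir: string,
  id: string,
  url: string,
  deps: AddSourceDeps,
): string {
  // Pre-flight: fail loudly before any clone work happens.
  if (deps.idTaken(id)) throw new Error("source_id_taken");

  const tmpRoot = path.join(clonesDir, ".tmp");
  fs.mkdirSync(tmpRoot, { recursive: true });
  const tmpDir = fs.mkdtempSync(path.join(tmpRoot, `${id}-`));
  const finalDir = path.join(clonesDir, id);

  let inserted = false;
  try {
    deps.clone(url, tmpDir);
    deps.insertRow(id, finalDir, url);
    inserted = true;
    fs.renameSync(tmpDir, finalDir);
    return finalDir;
  } catch (err) {
    // Rollback: always drop the temp dir; drop the row if it was written.
    fs.rmSync(tmpDir, { recursive: true, force: true });
    if (inserted) {
      try {
        deps.deleteRow(id);
      } catch {
        // best-effort, as in the text above
      }
    }
    throw err;
  }
}
```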

removeSource: clone-cleanup with realpath+lstat confinement matching
validateUploadPath() shape at src/core/operations.ts:61. String startsWith
is symlink-unsafe and would let $GBRAIN_HOME/clones/<id> → /etc resolve
out of the confine. Two defenses layered:
  - isPathContained (realpath-resolves both sides + parent-with-sep
    string check) rejects symlinks whose target falls outside the
    confine.
  - lstat-then-isSymbolicLink check refuses symlinks whose realpath
    happens to land back inside the confine (defense in depth).
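
The two layered defenses can be sketched with Node's `fs` module.
`isPathContained` is the name used above; `isSafeToRemove` is a hypothetical
wrapper, and both bodies are an illustration rather than the shipped code.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Resolve symlinks on both sides, then require a strict parent-with-sep
// prefix. Any resolution error fails closed.
function isPathContained(candidate: string, confineDir: string): boolean {
  try {
    const realCandidate = fs.realpathSync(candidate);
    const realConfine = fs.realpathSync(confineDir);
    return realCandidate.startsWith(realConfine + path.sep);
  } catch {
    return false;
  }
}

// Defense in depth: refuse symlinks outright, even when their target
// resolves back inside the confine.
function isSafeToRemove(candidate: string, confineDir: string): boolean {
  try {
    if (fs.lstatSync(candidate).isSymbolicLink()) return false;
  } catch {
    return false;
  }
  return isPathContained(candidate, confineDir);
}
```

Appending `path.sep` before the prefix check keeps a sibling such as
`clones-evil/` from passing as if it were inside `clones/`.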

getSourceStatus: returns clone_state via validateRepoState (the 6-state
decision tree from git-remote.ts). Lets a remote MCP caller diagnose
"healthy | missing | not-a-dir | no-git | url-drift | corrupted" without
SSH access to the brain host. listSources additionally exposes
remote_url so callers can see which sources are auto-managed.

recloneIfMissing: T4 follow-up for `gbrain sources restore` after the
clone dir was autopurged — re-clones via the same temp + rename
atomicity contract. Idempotent (returns false when clone is already
healthy).

test/sources-ops.test.ts (~470 lines, 24 tests): pre-flight collision
(Q4), happy paths for both --path and --url, all four D3 rollback paths
(clone-fail before INSERT, INSERT-fail after clone, rename-fail
post-INSERT, atomic temp-dir cleanup), symlink-target-OUTSIDE-clones
(realpath confinement), symlink-target-INSIDE-clones (lstat-check),
removeSource refuses to delete user-supplied paths, refuses "default"
source, getSourceStatus clone_state branches, T4 recloneIfMissing
recovery + idempotent + no-op for path-only sources, isPathContained
unit tests covering subtree / outside / symlink-escape / fail-closed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(operations): whoami + sources_{add,list,remove,status} MCP ops

Five new ops in src/core/operations.ts auto-flow through src/mcp/tool-defs.ts
so MCP clients (Claude Desktop, ChatGPT, Perplexity, OpenClaw) get them via
standard tools/list discovery — no SDK or transport code changes needed.

Operation.scope union widened to add 'sources_admin' and 'users_admin' (the
v0.28 hierarchy from src/core/scope.ts).

whoami (scope: read): introspect calling identity over MCP.
  - Returns `{transport: 'oauth', client_id, client_name, scopes, expires_at}`
    for OAuth clients (clientId starts with gbrain_cl_).
  - Returns `{transport: 'legacy', token_name, scopes, expires_at: null}`
    for grandfathered access_tokens.
  - Returns `{transport: 'local', scopes: []}` when ctx.remote === false.
    Empty scopes (NOT ['read','write','admin']) is the D2 decision —
    returning OAuth-shaped scopes for local callers would resurrect the
    v0.26.9 footgun where code gated trust on
    `auth.scopes.includes('admin')` instead of `ctx.remote === false`.
  - Q3 fail-closed: throws unknown_transport when remote=true AND auth is
    missing OR ctx.remote is the literal `undefined` (cast bypass guard).
    A future transport that forgets to thread auth doesn't get a free
    pass.
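
A sketch of the transport detection and the fail-closed branch. The return
shapes follow the text above; the `OpContext` and `AuthInfo` field names are
assumptions about what the real context carries.

```typescript
// Assumed shape of the per-request auth info.
interface AuthInfo {
  clientId?: string;
  clientName?: string;
  tokenName?: string;
  scopes: string[];
  expiresAt?: number;
}

interface OpContext {
  remote?: boolean;
  auth?: AuthInfo;
}

type WhoAmI =
  | { transport: "oauth"; client_id: string; client_name: string; scopes: string[]; expires_at: number | null }
  | { transport: "legacy"; token_name: string; scopes: string[]; expires_at: null }
  | { transport: "local"; scopes: string[] };

function whoamiSketch(ctx: OpContext): WhoAmI {
  // Local CLI: trust comes from remote === false, never from scopes.
  if (ctx.remote === false) return { transport: "local", scopes: [] };

  // Fail closed: remote must be literally true AND auth must be present.
  if (ctx.remote !== true || ctx.auth === undefined) {
    throw new Error("unknown_transport");
  }

  const auth = ctx.auth;
  if (auth.clientId !== undefined && auth.clientId.startsWith("gbrain_cl_")) {
    return {
      transport: "oauth",
      client_id: auth.clientId,
      client_name: auth.clientName ?? "",
      scopes: auth.scopes,
      expires_at: auth.expiresAt ?? null,
    };
  }
  return {
    transport: "legacy",
    token_name: auth.tokenName ?? "",
    scopes: auth.scopes,
    expires_at: null,
  };
}
```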

sources_add (sources_admin, mutating): register a source by --path
  (existing v0.17 behavior) or --url (v0.28 federated remote-clone path).
  Calls into addSource from sources-ops.ts which owns the temp-dir +
  rename atomicity.

sources_list (read): list registered sources with page counts, federated
  flag, and remote_url. The remote_url field is new — lets a remote MCP
  caller see which sources are auto-managed.

sources_remove (sources_admin, mutating): cascade-delete a source +
  symlink-safe clone cleanup. Requires confirm_destructive: true when the
  source has data.

sources_status (read): per-source diagnostic returning clone_state
  ('healthy' | 'missing' | 'not-a-dir' | 'no-git' | 'url-drift' |
  'corrupted' | 'not-applicable') — lets a remote MCP caller diagnose a
  busted clone without SSH access to the brain host.

test/whoami.test.ts (9 tests): pinned transport-detection for all four
return shapes including Q3 fail-closed throw under both auth=undefined
and remote=undefined cast-bypass paths.

test/sources-mcp.test.ts (16 tests): op-metadata pins (scope, mutating,
localOnly), functional handler shape against PGLite, hasScope-driven
scope-enforcement smoke test simulating the serve-http.ts:673 gate
(read-only token rejected for sources_add; sources_admin token allowed;
admin token allowed for everything; gstack /setup-gbrain Path 4 token
covers all 4 ops), SSRF gate at the op layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sync): re-clone fallback when clone is missing/no-git/corrupted

src/commands/sync.ts gets a v0.28-aware front-half. When the source has
config.remote_url, performSync calls validateRepoState before the existing
fast-forward pull path:

  - 'healthy'    → fall through to existing pull (unchanged)
  - 'missing'    → loud stderr "auto-recovery: re-cloning <id>", then
  'no-git'         recloneIfMissing handles the temp-dir + rename. Sync
  'not-a-dir'      continues from the freshly-cloned head.
  - 'corrupted'  → throw with structured hint pointing at sources remove
                   + add (no syncing wrong state).
  - 'url-drift'  → throw with hint pointing at the (deferred) sources
                   rebase-clone command.

Closes the operator-confidence gap: rm -rf $GBRAIN_HOME/clones/<id>/ no
longer breaks future syncs. The next sync sees the missing dir and
recovers via the recorded URL.
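
The state dispatch above can be sketched as a single switch. This is an
illustration, not the code in sync.ts: the state names come from this
message, while planSync, SyncPlan, and the handling of 'not-applicable'
(treated here like 'healthy') are assumptions.

```typescript
// Clone states as listed for sources_status / validateRepoState.
type CloneState =
  | "healthy"
  | "missing"
  | "not-a-dir"
  | "no-git"
  | "url-drift"
  | "corrupted"
  | "not-applicable";

// Hypothetical result shape: what the sync front-half decides to do.
type SyncPlan =
  | { action: "pull" }
  | { action: "reclone"; notice: string }
  | { action: "fail"; hint: string };

// Hypothetical helper mapping a clone state to the next sync step.
function planSync(sourceId: string, state: CloneState): SyncPlan {
  switch (state) {
    case "healthy":
    case "not-applicable": // assumption: nothing to recover, use the pull path
      return { action: "pull" };
    case "missing":
    case "no-git":
    case "not-a-dir":
      // Recoverable: announce loudly, then re-clone from the recorded URL.
      return { action: "reclone", notice: `auto-recovery: re-cloning ${sourceId}` };
    case "corrupted":
      // Never sync from a state we cannot trust.
      return { action: "fail", hint: `sources remove ${sourceId}, then sources add` };
    case "url-drift":
      return { action: "fail", hint: "sources rebase-clone (deferred)" };
  }
}
```

Keeping the decision separate from the side effects makes the full state
matrix cheap to test without touching git or the filesystem.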

src/core/operations.ts: extend ErrorCode with 'unknown_transport' so
whoami's Q3 fail-closed path type-checks.

test/sources-resync-recovery.test.ts (12 tests): full validateRepoState
state matrix exercised under fake-git, recloneIfMissing recovery from
each degraded state, idempotent on healthy clones, the sync.ts:320
integration path that drives the recovery.

test/sources-ops.test.ts + test/sources-mcp.test.ts: drop the
GBRAIN_PGLITE_SNAPSHOT-disable line so these tests stop forcing cold
init across the parallel-shard runner. With snapshot allowed, init time
drops from 6+s to ~50ms and parallel runs stay under the 5s hook
timeout.

test/sources-mcp.test.ts: tighten scope literal-type so tsc keeps the
union narrow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): sources add --url + restore re-clone, thin-wrapper refactor

src/commands/sources.ts now delegates the data-mutation work to
src/core/sources-ops.ts (added in the previous commit). The CLI handler
parses argv, calls into addSource, and formats output.

Two new flags on `gbrain sources add`:
  - `--url <https-url>` : federated remote-clone path (clone + INSERT +
    rename, atomic rollback on failure).
  - `--clone-dir <path>` : override the default
    $GBRAIN_HOME/clones/<id>/ destination.

Validation rejects mutually-exclusive `--url` + `--path`. Errors from
the ops layer (SourceOpError) propagate through the CLI's standard
error wrapper in src/cli.ts so existing tests that assert throw shape
keep passing.

`gbrain sources restore <id>` (T4 from eng review): if the source has a
remote_url AND the on-disk clone was autopurged, call recloneIfMissing
before declaring success. Clone errors print a WARN with recovery
hints rather than failing the restore — the DB row is what restore
guarantees; the clone is best-effort.

54 sources-related tests pass (existing test/sources.test.ts +
sources-ops + sources-mcp).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(doctor,cycle): orphan-clones surface + autopilot purge phase (P1)

addSource's atomicity contract uses a temp dir that gets renamed to the
final clone path. If the process is SIGKILL'd between clone-finish and
rename, the temp dir orphans on disk. Without sweeping these, a brain
server accumulates gigabytes over months of failed `sources add --url`
attempts.

Two layers:

1. `gbrain doctor` now surfaces stale entries. A new orphan_clones check
   walks $GBRAIN_HOME/clones/.tmp/, names anything older than 24h, and
   prints a warn with disk-byte estimate. Operators see the leak before
   `df` complains.

2. The autopilot cycle's existing `purge` phase grows a substep that
   nukes .tmp/ entries past the same 72h TTL the page-soft-delete purge
   uses. Operator behavior stays uniform across all soft-delete-style
   surfaces.

Both layers are filesystem-only (no DB). On a brain that never used
--url cloning, both are no-ops.
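
A minimal sketch of the age check both layers rely on, assuming entries
are judged by mtime. The helper names (listTmpEntries, staleEntries) and
the TmpEntry shape are hypothetical; only the .tmp/ location and the
24h / 72h thresholds come from this message.

```typescript
import { existsSync, readdirSync, statSync } from "node:fs";
import { join } from "node:path";

// Hypothetical shape for one entry under $GBRAIN_HOME/clones/.tmp/.
interface TmpEntry {
  name: string;
  mtimeMs: number;
}

const HOUR_MS = 60 * 60 * 1000;

// Pure filter: entries whose age exceeds the TTL.
function staleEntries(entries: TmpEntry[], nowMs: number, ttlHours: number): TmpEntry[] {
  return entries.filter((entry) => nowMs - entry.mtimeMs > ttlHours * HOUR_MS);
}

// Filesystem-only listing. A brain that never cloned has no .tmp/ dir,
// so this returns an empty list and the caller becomes a no-op.
function listTmpEntries(tmpDir: string): TmpEntry[] {
  if (!existsSync(tmpDir)) return [];
  return readdirSync(tmpDir).map((name) => ({
    name,
    mtimeMs: statSync(join(tmpDir, name)).mtimeMs,
  }));
}
```

Under these assumptions the doctor check would call staleEntries with
24 and only report, while the purge substep would call it with 72 and
delete what comes back.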

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(admin): scope checkboxes source from scope-constants mirror + dist

admin/src/pages/Agents.tsx Register Client modal:
  - useState default sources from ALLOWED_SCOPES_LIST (defaulting `read`
    to true, others false; unchanged UX for the common case).
  - Scope checkbox map iterates ALLOWED_SCOPES_LIST instead of the old
    hardcoded ['read','write','admin'].

Without this commit, even with the v0.28.1 server-side scope hierarchy,
operators registering an OAuth client from the admin UI cannot tick the
new sources_admin / users_admin scopes — defeats the whole gstack
/setup-gbrain Path 4 unblock.

The drift-check CI gate (scripts/check-admin-scope-drift.sh) ensures
this list stays in sync with src/core/scope.ts going forward.

admin/dist/* rebuilt via `cd admin && bun run build`. Old hash bundle
removed; new bundle (224.96 kB / 68.70 kB gzip).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.28.1 — remote-source MCP + scope hierarchy + whoami

VERSION + package.json: bump to 0.28.1 (per CLAUDE.md branch-scoped
versioning rule — this branch adds substantial new features on top of
v0.28.0).

CHANGELOG.md: new top-level entry for v0.28.1 in the gstack/Garry voice
(no AI vocabulary, no em dashes, real numbers + commands). Lead
paragraph names what the user can now do that they couldn't before.
"Numbers that matter" table calls out the +5 MCP ops, +2 OAuth scopes,
and the 4-to-0 SSH-step number for gstack /setup-gbrain Path 4. "What
this means for you" closer ties the work to the operator workflow shift.
"To take advantage of v0.28.1" block has paste-ready upgrade commands
including the admin SPA rebuild step. Itemized changes section
describes the architecture cleanly without exposing scope-string
internals to public attack-surface enumeration (per CLAUDE.md
responsible-disclosure rule).

TODOS.md: file 6 follow-ups under a new "Remote-source MCP follow-ups
(v0.28.1)" section: token rotation, migration introspection in
get_health, Accept-header friendliness, sources rebase-clone for
URL-drift recovery, --filter=blob:none partial-clone option, and the
chunker_version PGLite-schema parity codex caught.

README.md: short subsection under the existing sources CLI listing
that names the new --url flag and what auto-recovery does. Capability
framing (no scope-string enumeration).

llms.txt + llms-full.txt: regenerated via `bun run build:llms` so the
documentation bundle reflects the v0.28.1 entry. The build-llms
generator's drift check passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): sources-remote-mcp — full gstack /setup-gbrain Path 4 round-trip

Spins up `gbrain serve --http` against real Postgres with a fake-git binary
in PATH (so `git clone` is exercised end-to-end without network), registers
two OAuth clients (sources_admin + read-only), mints tokens, calls the new
v0.28.1 MCP ops via /mcp, and asserts the gstack /setup-gbrain Path 4 flow
works end to end.

12 tests cover the full lifecycle:
- whoami over HTTP MCP returns transport=oauth + the right scopes
- /.well-known/oauth-authorization-server advertises all 5 scopes
- sources_add: clone fires, INSERT lands, row carries config.remote_url
- sources_status: clone_state=healthy after add
- sources_list: surfaces remote_url for the new source
- SSRF rejection: sources_add with RFC1918 URL fails at parseRemoteUrl gate
- Scope enforcement: read-only token gets insufficient_scope on sources_add
- Read-only token CAN call sources_list (read-scoped op)
- ALLOWED_SCOPES allowlist: CLI register-client rejects bogus scope
- Recovery: rm clone dir + sources_status reports clone_state=missing
- sources_remove: cascades + cleans up the auto-managed clone dir

Subprocess env threading replicates the v0.26.2 bun execSync inheritance
pattern — bun does NOT inherit process.env mutations, so every CLI
subprocess call passes env: { ...process.env } explicitly.
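
The env threading looks roughly like this. subprocessEnv and runCli are
hypothetical helper names, not the ones in the test file; the only part
taken from this message is passing env: { ...process.env } explicitly.

```typescript
import { execSync } from "node:child_process";

// Hypothetical helper: snapshot process.env at call time (including any
// mutations the test made earlier) and layer per-call overrides on top.
function subprocessEnv(
  overrides: Record<string, string> = {},
): Record<string, string | undefined> {
  return { ...process.env, ...overrides };
}

// Hypothetical wrapper: every CLI subprocess gets the explicit snapshot
// instead of relying on implicit inheritance.
function runCli(command: string, overrides: Record<string, string> = {}): string {
  return execSync(command, { env: subprocessEnv(overrides), encoding: "utf8" });
}
```

Building the env object at call time, not at module load, is what lets
variables set in beforeAll reach the subprocess.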

Cleanup contract mirrors test/e2e/serve-http-oauth.test.ts: revoke any
clients we registered, force-kill the server subprocess on SIGTERM
timeout, surface cleanup failures to stderr without throwing so real
test failures aren't masked.

The base table list in helpers.ts (ALL_TABLES) doesn't include sources
or oauth_clients, so this test explicitly truncates them in beforeAll
to avoid Q4 pre-flight collisions on re-run.

Skipped gracefully when DATABASE_URL is unset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: codex adversarial review — confine remote sources_admin + close SSRF gaps

Pre-ship adversarial review (codex exec) caught five issues. Four ship in
this commit; the fifth (DNS rebinding) is filed as v0.28.x follow-up.

CRITICAL — `sources_admin` tokens over HTTP MCP could plant content at any
host path. The MCP op exposed `path` and `clone_dir` to remote callers; the
op layer trusted them verbatim, then auto-recovery's rm -rf on degraded
state turned that into arbitrary delete primitives. src/core/operations.ts
sources_add handler now drops both fields when ctx.remote !== false. Local
CLI keeps the override (operator trust). Loud logger.warn when a remote
caller tries — visible in the SSE feed without leaking values.
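
A sketch of the confinement rule, assuming a params object and a ctx
with an optional remote flag. The function name, the interfaces, and the
warning text are hypothetical; the ctx.remote !== false test and the two
dropped fields are from this message.

```typescript
// Hypothetical parameter and context shapes for the sources_add op.
interface SourcesAddParams {
  id: string;
  url?: string;
  path?: string;
  clone_dir?: string;
}

interface OpContext {
  remote?: boolean;
}

// Anything that is not provably local is treated as remote (fail-closed),
// so an undefined ctx.remote also loses the filesystem overrides.
function confineRemoteParams(
  ctx: OpContext,
  params: SourcesAddParams,
  warn: (message: string) => void,
): SourcesAddParams {
  if (ctx.remote === false) return params; // local CLI: operator trust

  const { path, clone_dir, ...safe } = params;
  if (path !== undefined || clone_dir !== undefined) {
    // Log that an attempt happened without echoing the supplied values.
    warn("sources_add: ignoring path/clone_dir from remote caller");
  }
  return safe;
}
```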

HIGH — Steady-state `git pull --ff-only` bypassed GIT_SSRF_FLAGS entirely.
The legacy helper at src/commands/sync.ts:192 spawned git without the
-c http.followRedirects=false -c protocol.{file,ext}.allow=never
--no-recurse-submodules set that cloneRepo applies. Every recurring sync
was reopening the redirect/submodule/protocol bypass. Routed the call site
at sync.ts:381 through pullRepo from git-remote.ts so initial clone and
ongoing pull share one defensive flag set.
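
One way to keep clone and pull on the same flag set is to build both
argv arrays from one constant. The flag values are the ones named above;
the array layout and the cloneArgs / pullArgs helpers are assumptions
about how git-remote.ts might arrange them, not its actual code.

```typescript
// Config overrides that must precede the git subcommand.
const GIT_SSRF_FLAGS: readonly string[] = [
  "-c", "http.followRedirects=false",
  "-c", "protocol.file.allow=never",
  "-c", "protocol.ext.allow=never",
];

// Hypothetical argv builders. Both reuse the same prefix, so a future
// call site cannot spawn git without the defensive set.
function cloneArgs(url: string, dest: string): string[] {
  return [...GIT_SSRF_FLAGS, "clone", "--no-recurse-submodules", url, dest];
}

function pullArgs(): string[] {
  return [...GIT_SSRF_FLAGS, "pull", "--ff-only", "--no-recurse-submodules"];
}
```

The arrays are meant to be passed to a spawn-style API as an argument
list, never joined into a shell string.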

MEDIUM — listSources ignored its `include_archived` flag. The op
advertised the param but the function destructured it as `_opts` and
queried every row. Archived sources' ids, local_paths, and remote_urls
were leaking to read-scoped MCP callers by default. Filter in SQL
(`WHERE archived IS NOT TRUE` unless the flag is set) so archived rows
never reach the wire.
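
A sketch of the filter, assuming the query is assembled as a string.
The WHERE clause is the one quoted above; the function name, the column
list, and the ORDER BY are hypothetical.

```typescript
// Hypothetical query builder: archived rows are excluded in SQL unless
// the caller explicitly opts in, so they never leave the database.
function listSourcesSql(opts: { include_archived?: boolean } = {}): string {
  const where = opts.include_archived === true ? "" : " WHERE archived IS NOT TRUE";
  return `SELECT id, local_path, config FROM sources${where} ORDER BY id`;
}
```

IS NOT TRUE keeps rows where archived is NULL as well as false, which a
plain `archived = false` comparison would drop.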

PARTIAL HIGH — IPv6 ULA fc00::/7 and link-local fe80::/10 were not in
the isInternalUrl blocklist. Only ::1/:: and IPv4-mapped IPv6 were
blocked. Added regex-based ULA + link-local rejection to url-safety.ts.
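
A lexical check for the two ranges could look like the sketch below.
The patterns and the isPrivateIpv6 name are illustrative and are not
copied from url-safety.ts; they assume the host is an IPv6 literal with
a full four-digit first hextet, optionally wrapped in URL brackets.

```typescript
// fc00::/7 covers first hextets fc00 through fdff.
const IPV6_ULA = /^f[cd][0-9a-f]{2}:/i;
// fe80::/10 covers first hextets fe80 through febf.
const IPV6_LINK_LOCAL = /^fe[89ab][0-9a-f]:/i;

// Hypothetical helper: true when the host is a ULA or link-local literal.
function isPrivateIpv6(host: string): boolean {
  const bare = host.replace(/^\[|\]$/g, ""); // strip URL-style brackets
  return IPV6_ULA.test(bare) || IPV6_LINK_LOCAL.test(bare);
}
```

Like the rest of the gate this is purely lexical, so it says nothing
about what a hostname resolves to.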

Test coverage:
- test/git-remote.test.ts: 4 new IPv6 cases (ULA fc-prefix + fd-prefix,
  link-local fe80::, public IPv6 still allowed).
- test/sources-mcp.test.ts: 3 new cases pinning the remote/local
  asymmetry (clone_dir override silently ignored over MCP, path nulled,
  local CLI keeps the override).
- test/sources-mcp.test.ts: 2 new cases for include_archived honored.

DNS rebinding (codex finding #3): the current gate is lexical only.
A deliberate attacker who controls a hostname's A/AAAA records can still
resolve to an internal IP. Closing this requires async DNS resolution +
revalidation; filed as v0.28.x follow-up in TODOS.md so the API change
surface (parseRemoteUrl becomes async, every caller updates) lands in
its own PR.

323 tests pass (9 files); 4071 unit tests pass (full suite).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: rebump v0.28.1 → v0.28.2 (master collision)

Caught after PR creation. master is at v0.28.1 already; this branch
forked from garrytan/v0.28-release at v0.28.0 and naively bumped to
v0.28.1 without checking the master queue. CI version-gate would have
rejected at merge time (requires VERSION strictly greater than
master's).
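
The gate's rule reduces to a numeric, component-wise comparison. A
minimal sketch, assuming plain dotted-integer versions; parseVersion and
isStrictlyGreater are hypothetical names and this is not the CI script.

```typescript
// Split "0.28.10" into [0, 28, 10]. Assumes dotted integers only.
function parseVersion(version: string): number[] {
  return version.trim().split(".").map((part) => Number.parseInt(part, 10));
}

// True only when branch > master. Equal versions fail the gate.
function isStrictlyGreater(branch: string, master: string): boolean {
  const a = parseVersion(branch);
  const b = parseVersion(master);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return false;
}
```

Comparing numbers instead of strings matters once the patch component
reaches two digits: as strings, "0.28.10" sorts before "0.28.9".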

Root cause: I bumped VERSION mechanically during plan implementation
(echo "0.28.1" > VERSION) without consulting the queue-aware allocator
at bin/gstack-next-version. /ship Step 12's idempotency check then
classified state as ALREADY_BUMPED and the workflow's "queue drift"
comparison was the safety net I should have hit — but I skipped it.

Files updated:
- VERSION + package.json: 0.28.1 → 0.28.2
- CHANGELOG.md: header + "To take advantage of v0.28.2" subsection
- README.md: sources --url note version reference
- TODOS.md: 7 follow-up entries' version references
- llms.txt + llms-full.txt: regenerated

PR title rewrite via gstack-pr-title-rewrite.sh handled in a separate
gh pr edit call; CI version-gate now passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(todos): close longmemeval-publication, file 4 follow-up TODOs

Full 500-question 4-adapter LongMemEval _s benchmark landed at
github.com/garrytan/gbrain-evals#main:ced01f0. gbrain-hybrid 97.60% R@5,
+1.0pt over MemPal raw 96.6%. Replacing the now-stale "needs full run"
TODO with closure + 4 grounded follow-ups:

  1. Timeline-aware retrieval signal for temporal-reasoning questions
     (P2 — closes the only category we lose to MemPal-raw)
  2. Per-question batch consolidation for ~10x cold-cache speedup
     (P3 — makes daily benchmark CI gate practical)
  3. LongMemEval _m split run (P3 — differentiated, not yet published
     by MemPal)
  4. Cheaper-embedding-model recipe (P4 — recall-cost tradeoff curve)

Each TODO has the standard What/Why/Pros/Cons/Context/Depends-on shape per
the gbrain TODOS-format convention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(llms): regenerate llms-full.txt to match merged CLAUDE.md

CI test/build-llms.test.ts asserts the committed llms.txt/llms-full.txt
are byte-for-byte identical to what scripts/build-llms.ts produces. The
master merge brought in v0.28.9/v0.28.10/v0.28.11 + multimodal embedding
notes that updated CLAUDE.md; the bundle was stale.

No content changes. Pure regeneration via `bun run build:llms`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(changelog): rewrite v0.28.12 entry — lead with the LongMemEval result

Old entry buried the headline ("LongMemEval lands in the box…") under
process detail (hermetic CI test count, 25.9ms p50, schema-table
runtime enumeration). The reader cares what gbrain DOES — not how we
plumbed the harness.

New entry leads with the actual number — 97.60% R@5 on the public
LongMemEval _s split, beating MemPalace raw by 1.0pt — followed by
the per-category win table that proves gbrain ties or beats MemPal in
5 of 6 question types and shows the +7.1pt assistant-voice lift.

Links to the full gbrain-evals report (97.60% headline + full
methodology + reproducible runner) so curious readers can dig deeper.

Two honest findings published in plain text: vector-only is
essentially tied with hybrid at K=5, and query expansion via Haiku is
a clean null result on this dataset. Better to publish the null than
hide it.

Reproduction block updated to match the actual gbrain-evals workflow
(clone + bun install + dataset download + bash batch runner). The
prior "download / run / hand to evaluate_qa.py" block stayed for the
in-tree CLI path.

Regenerated llms-full.txt to keep the build-llms regen-drift guard
green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>