v0.35.5.0 fix wave: bootstrap + orphans + think MCP + worktree + walker #1111

Merged
garrytan merged 10 commits into master from garrytan/dreamy-thompson on May 17, 2026

@garrytan

Summary

Six-commit correctness wave. Five user-visible bugs closed; one structural CI guard added so the bootstrap forward-reference bug class cannot recur in the same shape.

Test Coverage

Coverage diagram per wave commit, with [CRITICAL] regression tests pinned per IRON RULE:

[Commit 1] Bootstrap probes + MIGRATIONS introspection      9 tests
  ├── 7 new column probes × 2 engines [TESTED]
  ├── MIGRATIONS introspection contract [TESTED]
  ├── parseBaseTableColumns comment-stripping fix [TESTED]
  └── Planted-bug regression [TESTED]

[Commit 2] Orphans deleted_at filter (both sides)            3 new tests
  ├── Soft-deleted page NOT orphan [★★★ CRITICAL]
  ├── Live page w/ deleted source → IS orphan [★★★ CRITICAL codex C11]
  └── Live link smoke [★★ regression]

[Commit 3] runThink → gateway.chat adapter                   9 new tests
  ├── Response shape conversion [★★★]
  ├── Stop reason mapping [★★★]
  ├── Model-id normalization (bare + prefixed) [★★]
  ├── Unknown provider → null [★★]
  ├── No ANTHROPIC_API_KEY → null (legacy signal preserved) [★★★ CRITICAL]
  └── hasAnthropicKey env read [★★]

[Commit 4] Worktree path-segment discriminator               5 new tests
  ├── Submodule relative gitdir/modules/ [★★ D49 regression]
  ├── Submodule absolute gitdir/modules/ [★★ absorbed-submodule edge]
  ├── Worktree absolute gitdir/worktrees/ [★★★ CRITICAL closes #889]
  ├── Worktree relative gitdir/worktrees/ [★★]
  └── Malformed .git → MANAGE [★★ catch behavior]

[Commit 5] pruneDir + descent-time exclusion                13 new tests
  ├── isSyncable rejects node_modules [★★★ CRITICAL latent-bug regression]
  ├── pruneDir blocks node_modules/dot-prefix/ops/*.raw [★★]
  └── Walker integration tests [★★]

[Commit 6 codex-P1] DDL connection threading                preserved by 15 bootstrap tests

Tests: ~85 net new test cases. 6707 passing / 0 failing across the full unit-test suite; 285 passing / 0 failing across the affected wave files.

Coverage gate: PASS. Every code path has a regression test; every CRITICAL latent-bug fix is pinned.

Pre-Landing Review

Pre-landing review ran via /plan-eng-review (all 12 decisions resolved in ~/.claude/plans/ok-i-spun-up-dreamy-thompson.md). One Codex P1 from the /ship adversarial review was folded in:

  • Codex P1 (caught during /ship): Bootstrap was using this.sql (instance/pooler pool) while initSchema held the advisory lock on the DDL connection. Fixed by threading the DDL conn through applyForwardReferenceBootstrap. Pre-existing issue, but the v0.35.5.0 wave is explicitly about Supabase upgrade-wedge correctness — couldn't ship with the connection mismatch still in place.

Plan Completion

All 7 implementation tasks from ~/.claude/plans/ok-i-spun-up-dreamy-thompson.md landed:

  • ✓ T1 — bootstrap 7 new probes (both engines)
  • ✓ T2 — MIGRATIONS introspection CI guard
  • ✓ T3 — orphans deleted_at filter (both sides per D11)
  • ✓ T4 — gateway adapter (concrete D10 spec — 4 fixes per codex C7/C8/C9/C10)
  • ✓ T5 — path-segment worktree discriminator (D4)
  • ✓ T6 — pruneDir + descent-time exclusion + transcript predicate (D12)
  • ✓ T7 — TODOS.md v0.36.x follow-ups
  • ✓ Bonus: Codex-P1 DDL connection threading (caught during /ship adversarial review)

Documentation

CLAUDE.md updated with v0.35.5.0 annotations across the 5 modified source files. llms-full.txt regenerated to match. README.md, CONTRIBUTING.md, AGENTS.md, INSTALL_FOR_AGENTS.md unchanged — no user-facing CLI surface added or renamed, only correctness fixes to existing commands. Coverage: all shipped fixes have adequate documentation.

Test plan

  • bun run verify — 4 pre-checks + typecheck clean
  • bun run test — 6707/0 unit tests across 19 serial files + 8-shard parallel run
  • 285/0 across the wave's affected test surface (bootstrap, orphans, sync, storage-sync, think-pipeline, think-gateway-adapter, extract, extract-fs)
  • All [CRITICAL] regression tests pinned per IRON RULE
  • Codex P1 from pre-landing review fixed in-wave
  • Real-Postgres E2E recommended (bun run ci:local) before merge

🤖 Generated with Claude Code

garrytan and others added 10 commits May 17, 2026 08:22
…d* + add MIGRATIONS introspection guard

Adds 7 new forward-reference probes to applyForwardReferenceBootstrap on
both engines, closes the column-only forward-ref class via a new
MIGRATIONS-source introspection contract test.

New probes:
- files.source_id + files.page_id (v18 forward refs)
- oauth_clients.source_id + oauth_clients.federated_read (v60+v61+v65)
- sources.archived + archived_at + archive_expires_at (v34 promoted from JSONB)

The sources.archived* columns are the codex-flagged class: they're added
inline in v34's CREATE TABLE definition but `CREATE TABLE IF NOT EXISTS
sources` is a no-op on pre-v34 brains, so downstream visibility filters
(search/list_pages) trip on old brains. needsPagesBootstrap now folds
archive columns into its CREATE TABLE so pre-v0.18 brains get a v34-shape
sources in one go; needsSourcesArchive then only fires on the pre-v34
case (sources exists, archive cols don't).

Closes the structural bug class via test/helpers/extract-added-columns.ts:
reads src/core/migrate.ts as text and extracts every ALTER TABLE ADD
COLUMN. The new contract test asserts every (table, column) pair is
covered by EITHER the bootstrap's ALTER TABLE statements, the bootstrap's
CREATE TABLE definitions, OR the schema blob's CREATE TABLE bodies. The
column-only class (no index, no FK; just an inline CREATE TABLE column
the schema blob can't add to existing tables) is now caught at PR time.

Source-text introspection catches all three migration shapes uniformly:
- top-level `sql:` field
- `sqlFor.postgres` / `sqlFor.pglite` overrides
- handler-body `engine.runMigration(N, \`ALTER TABLE ...\`)` (v34 shape)
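
To illustrate the source-text introspection described above, here is a minimal sketch. It is not the code in test/helpers/extract-added-columns.ts: the function names `extractAddedColumns` and `findUncovered`, the regexes, and the `table.column` key format are all assumptions made for illustration. Because it scans plain text, the same pass would see `sql:` fields, `sqlFor` overrides, and template literals inside handler bodies.

```typescript
// Hypothetical sketch; names and regexes are illustrative, not the repo's helper.
type AddedColumn = { table: string; column: string };

// Scan migration source text for ALTER TABLE ... ADD COLUMN statements.
function extractAddedColumns(source: string): AddedColumn[] {
  const found: AddedColumn[] = [];
  // One ALTER TABLE statement, up to the terminating semicolon or the
  // closing backtick of a template literal.
  const alterRe = /ALTER\s+TABLE\s+(?:IF\s+EXISTS\s+)?(?:ONLY\s+)?"?(\w+)"?\s+([^;`]*)/gi;
  let alter: RegExpExecArray | null;
  while ((alter = alterRe.exec(source)) !== null) {
    const table = alter[1].toLowerCase();
    // A single ALTER TABLE may carry several ADD COLUMN clauses.
    const addRe = /ADD\s+COLUMN\s+(?:IF\s+NOT\s+EXISTS\s+)?"?(\w+)"?/gi;
    let add: RegExpExecArray | null;
    while ((add = addRe.exec(alter[2])) !== null) {
      found.push({ table, column: add[1].toLowerCase() });
    }
  }
  return found;
}

// Contract check: every added column must appear in the covered set, which
// would be built from the bootstrap ALTERs, bootstrap CREATEs, and schema blob.
function findUncovered(added: AddedColumn[], covered: Set<string>): AddedColumn[] {
  return added.filter((c) => !covered.has(`${c.table}.${c.column}`));
}

// Column types here are placeholders, not taken from the real migrations.
const sample =
  "ALTER TABLE files ADD COLUMN IF NOT EXISTS source_id TEXT, ADD COLUMN page_id INTEGER; " +
  "ALTER TABLE sources ADD COLUMN archived BOOLEAN;";
const uncovered = findUncovered(
  extractAddedColumns(sample),
  new Set(["files.source_id", "files.page_id"]),
);
console.log(uncovered); // [ { table: "sources", column: "archived" } ]
```

A contract test built this way fails as soon as a migration adds a column that none of the three coverage sources mention, which is the planted-bug scenario listed under Tests.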

Pre-existing parseBaseTableColumns parser bug fixed: now strips `--` line
comments and `/* ... */` blocks before identifying column names. Without
this, a column preceded by a comment was silently dropped. Catches
pages.page_kind and others that were silently uncovered.
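
The comment-stripping step can be sketched as follows. This is an assumed shape, not the actual parseBaseTableColumns: `stripSqlComments` and `parseColumnNames` are hypothetical names, and the sketch does not handle `--` or `/*` appearing inside SQL string literals.

```typescript
// Hypothetical sketch of comment stripping before column-name detection.
function stripSqlComments(sql: string): string {
  return sql
    .replace(/\/\*[\s\S]*?\*\//g, " ") // block comments: /* ... */
    .replace(/--[^\n]*/g, ""); // line comments: -- to end of line
}

// Split a CREATE TABLE body on top-level commas and take the first token of
// each entry as the column name, skipping table-level constraints.
function parseColumnNames(body: string): string[] {
  const clean = stripSqlComments(body);
  const names: string[] = [];
  let depth = 0;
  let current = "";
  const flush = (): void => {
    const first = current.trim().split(/\s+/)[0];
    if (first && !/^(PRIMARY|FOREIGN|UNIQUE|CHECK|CONSTRAINT)$/i.test(first)) {
      names.push(first.replace(/"/g, "").toLowerCase());
    }
    current = "";
  };
  for (let i = 0; i < clean.length; i++) {
    const ch = clean[i];
    if (ch === "(") depth++;
    if (ch === ")") depth--;
    if (ch === "," && depth === 0) flush();
    else current += ch;
  }
  flush();
  return names;
}

const body = `
  id SERIAL PRIMARY KEY,
  -- kind of page
  page_kind TEXT NOT NULL DEFAULT 'note', /* legacy */
  score NUMERIC(5, 2),
  PRIMARY KEY (id)
`;
console.log(parseColumnNames(body)); // [ "id", "page_kind", "score" ]
```

In this sketch, skipping the stripping step would make the first token of the second entry `--` rather than `page_kind`, so the column would not be recognized.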

13 columns added by migrations but not in PGLITE_SCHEMA_SQL are exempted
with a unified rationale: they have no schema-blob forward reference;
migration handles all upgrade paths cleanly. Refreshing the schema blob
is a separate concern.

Issues closed: #1018 (v60 oauth_clients), #974 (files.source_id/page_id),
#820 (v0.13.0 migration files.page_id cascade); pre-empts the
sources.archived class before any pre-v34 brain trips on it.

Tests:
- 9 cases in test/schema-bootstrap-coverage.test.ts (5 existing + 4 new)
- helper-level unit tests cover SQL shape variants (IF NOT EXISTS,
  quoted identifiers, ALTER TABLE IF EXISTS ONLY, multi-statement)
- planted-bug regression verifies the gate actually catches new uncovered
  columns

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…urce sides

Closes #1021. The v0.26.5 soft-delete invariant requires that
findOrphanPages exclude both:
  1. Candidate pages that are themselves soft-deleted
  2. Inbound links from soft-deleted source pages

Pre-fix, findOrphanPages had no deleted_at filter at all. Soft-deleted
pages with no inbound links were counted as orphans (inflating counts).
Pre-codex-tension-D11, only the candidate-side filter was planned.
Codex C11 caught the second case: a live page that has ONE inbound link
from a soft-deleted source page was hidden from orphan results — the
link still existed in the links table, the EXISTS subquery saw it, the
page looked "linked." Now the inner JOIN on pages enforces
src.deleted_at IS NULL.
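
The two-sided rule can be modeled in memory as below. This is a sketch of the invariant only, not the SQL that findOrphanPages runs; the `Page` and `Link` shapes and the name `findOrphanPagesModel` are assumptions for illustration.

```typescript
// Hypothetical in-memory model of the two-sided soft-delete filter.
type Page = { id: number; deletedAt: Date | null };
type Link = { fromPageId: number; toPageId: number };

function findOrphanPagesModel(pages: Page[], links: Link[]): Page[] {
  // Source side: only links that originate from live pages count as inbound.
  const liveIds = new Set(
    pages.filter((p) => p.deletedAt === null).map((p) => p.id),
  );
  return pages.filter(
    (candidate) =>
      // Candidate side: a soft-deleted page is never reported as an orphan.
      candidate.deletedAt === null &&
      !links.some(
        (l) => l.toPageId === candidate.id && liveIds.has(l.fromPageId),
      ),
  );
}

const pages: Page[] = [
  { id: 1, deletedAt: new Date(0) }, // soft-deleted, no inbound links
  { id: 2, deletedAt: null }, // live, only inbound link is from deleted page 1
  { id: 3, deletedAt: null }, // live, linked from live page 2
];
const links: Link[] = [
  { fromPageId: 1, toPageId: 2 },
  { fromPageId: 2, toPageId: 3 },
];
console.log(findOrphanPagesModel(pages, links).map((p) => p.id)); // [ 2 ]
```

Page 1 is excluded by the candidate-side filter, page 2 is reported because its only inbound link comes from a soft-deleted source, and page 3 is not an orphan because its inbound link is live.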

Three regression tests pin the contract:
- soft-deleted page with no inbound → NOT orphan
- live page with ONLY inbound link from soft-deleted source → IS orphan
- live page with live inbound → NOT orphan (smoke check that the new
  filters don't break unchanged behavior)

Engine parity: same SQL shape on both Postgres and PGLite engines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pre-fix, runThink instantiated `new Anthropic()` directly and read
ANTHROPIC_API_KEY from process.env. Claude Desktop's stdio MCP launch
doesn't inherit shell env, so `gbrain config set anthropic_api_key sk-...`
(writes to ~/.gbrain/config.json) never reached the SDK and every MCP
think call degraded to "no LLM available."

The adapter routes through gateway.chat() — the canonical seam per
CLAUDE.md. Gateway reads the API key from gbrain config OR env, picks
up prompt caching, rate-leases, retry, and the test seam
(__setChatTransportForTests) that v0.31.12 established.

Per plan-eng-review D10 (cross-model tension with codex C7+C8+C9+C10),
four spec points landed:

  1. Drop `new Anthropic()` direct path entirely. Every non-stub LLM
     call from runThink routes through gateway.

  2. Real availability check (NOT a false-positive `getChatModel()`
     truthy). `tryBuildGatewayClient` probes both the recipe (resolveRecipe
     throws AIConfigError on unknown providers) AND the API key (reads
     process.env + loadConfig at the gbrain config layer for parity with
     gateway's own auth resolution). Returns null on miss; runThink takes
     the graceful "no LLM available" early-return preserving the legacy
     NO_ANTHROPIC_API_KEY warning signal.

  3. Model-id normalization. resolveModel returns bare anthropic ids
     (claude-opus-4-7); gateway.chat needs provider:model. Adapter
     auto-prefixes anthropic: when the id is bare. Provider:model strings
     pass through unchanged.

  4. Response-shape conversion. ChatResult → Anthropic.Message via
     chatResultToMessage. mapStopReason translates gateway's
     provider-neutral stop reasons (end / length / tool_calls / refusal /
     content_filter / other) to Anthropic's stop_reason ('end_turn' /
     'max_tokens' / 'tool_use'); refusal/content_filter/other fall through
     to end_turn (no Anthropic equivalent). Usage tokens pass through.
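
Spec points 3 and 4 can be sketched as two small pure functions. This is an assumed shape rather than the adapter's code: `normalizeModelId` is a hypothetical name, the type aliases are illustrative, and the pairing of end / length / tool_calls with end_turn / max_tokens / tool_use is inferred from the order in which the commit message lists them.

```typescript
// Hypothetical sketch of model-id normalization and stop-reason mapping.
type GatewayStopReason =
  | "end"
  | "length"
  | "tool_calls"
  | "refusal"
  | "content_filter"
  | "other";
type AnthropicStopReason = "end_turn" | "max_tokens" | "tool_use";

// Bare ids get the anthropic: prefix; provider:model strings pass through.
function normalizeModelId(id: string): string {
  return id.indexOf(":") === -1 ? `anthropic:${id}` : id;
}

// Provider-neutral stop reasons map onto Anthropic's stop_reason values.
function mapStopReason(reason: GatewayStopReason): AnthropicStopReason {
  switch (reason) {
    case "length":
      return "max_tokens";
    case "tool_calls":
      return "tool_use";
    default:
      // "end", plus refusal / content_filter / other, which the commit
      // message says fall through to end_turn.
      return "end_turn";
  }
}

console.log(normalizeModelId("claude-opus-4-7")); // anthropic:claude-opus-4-7
console.log(normalizeModelId("anthropic:claude-opus-4-7")); // unchanged
console.log(mapStopReason("content_filter")); // end_turn
```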

`opts.client` injection preserved (test seam — see ThinkLLMClient).
`opts.stubResponse` preserved (pure-test escape).

Tests:
  - test/think-gateway-adapter.test.ts (9 cases): response shape, stop
    reason mapping, model-id normalization (bare + prefixed), provider
    unknown returns null, ANTHROPIC_API_KEY absent returns null
    (regression for legacy graceful degradation), hasAnthropicKey reads
    process.env correctly. Uses withEnv per the test-isolation contract.
  - test/think-pipeline.serial.test.ts (17 existing cases): unchanged;
    the graceful-degradation case at line 213 still produces the
    NO_ANTHROPIC_API_KEY warning because tryBuildGatewayClient returns
    null when no key is configured, taking the legacy early-return path.

Closes #952.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…atch (closes #889)

Pre-fix, `manageGitignore` treated every `.git`-as-file as a submodule
and skipped gitignore management. Both submodules AND worktrees use
`.git` as a file (not a directory), so the legacy
`statSync.isFile()` check couldn't discriminate. Worktrees got
misclassified as submodules and their .gitignore wasn't managed.

Per plan-eng-review D4 (chose path-segment match over absolute-vs-
relative path heuristic): the gitdir path contains:
  - `/modules/<name>` for submodules (skip — managed by parent repo)
  - `/worktrees/<name>` for worktrees (MANAGE — first-class repo)

Both are documented Git internal layouts, stable across all 4
{relative, absolute} × {modules, worktrees} combinations including the
absorbed-submodule edge case from `git submodule absorbgitdirs` (where
the submodule's gitdir flips to an absolute path).

Malformed `.git` file (no `gitdir:` prefix, IO error) → MANAGE, preserving
the pre-#889 catch{} fail-closed-toward-managing semantics.
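
A minimal sketch of the path-segment discriminator, assuming the contents of the `.git` file have already been read into a string. `classifyGitFile` and `GitignoreDecision` are hypothetical names. Scanning from the right and ignoring the final segment (the submodule or worktree name) is a choice made for this sketch, not something the commit specifies.

```typescript
// Hypothetical sketch; not the actual manageGitignore implementation.
type GitignoreDecision = "manage" | "skip";

function classifyGitFile(content: string): GitignoreDecision {
  const match = /^gitdir:\s*(.+)$/m.exec(content);
  if (!match) return "manage"; // malformed .git file: fall back to managing
  const segments = match[1].trim().split(/[\\/]+/);
  // Walk from the right, skipping the last segment, so the nearest
  // modules/ or worktrees/ marker decides the classification.
  for (let i = segments.length - 2; i >= 0; i--) {
    if (segments[i] === "worktrees") return "manage";
    if (segments[i] === "modules") return "skip";
  }
  return "manage";
}

console.log(classifyGitFile("gitdir: ../.git/modules/vendor-lib\n")); // skip
console.log(classifyGitFile("gitdir: /home/u/repo/.git/worktrees/feature-x\n")); // manage
console.log(classifyGitFile("not a gitdir line")); // manage
```

The decision depends only on which marker segment appears, so relative and absolute gitdir paths classify the same way.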

Tests (5 new + 1 regression renamed):
  - REGRESSION: submodule relative gitdir/modules/ → skip (D49 contract)
  - absorbed submodule absolute gitdir/modules/ → skip (edge case)
  - CRITICAL: worktree absolute gitdir/worktrees/ → MANAGE (closes #889)
  - worktree relative gitdir/worktrees/ → MANAGE
  - malformed .git file → MANAGE (preserves catch behavior)
  - regular .git directory → MANAGE (existing smoke)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…redicate (closes #923, #202)

Per plan-eng-review D12 (cross-model tension with codex C12+C13), three
structural changes:

1. Extract `pruneDir(name)` helper in src/core/sync.ts. Returns false for
   directory names walkers must NEVER descend into: `node_modules` (latent
   bug — no leading dot), dot-prefix dirs (`.git`, `.obsidian`, `.raw`,
   `.cache`, etc.), `ops`, and `*.raw` sidecar dirs (gbrain convention —
   `people/pedro.raw/` holds raw source for pedro.md). Walkers consult it
   at descent time BEFORE recursion, saving the IO cost of walking entire
   vendor / hidden / sidecar subtrees only to filter them at file-emit time.

2. `isSyncable` itself gains the same exclusion set (via pruneDir on each
   path segment). Closes the latent bug where node_modules markdown files
   slipped through: `node_modules/some-pkg/README.md` returned true pre-fix
   because the legacy dot-prefix check only blocked `.node_modules` (with
   a leading dot), not the actual `node_modules`. CRITICAL regression test
   in test/sync.test.ts pins the contract per IRON RULE.

3. Two walkers rewritten to use pruneDir at descent + per-walker file
   predicate at emit:
   - `walkMarkdownFiles` (src/commands/extract.ts): pruneDir + isSyncable
     ({strategy:'markdown'}). Pre-fix this walker had ONLY an ad-hoc
     dot-prefix exclusion and didn't call isSyncable at all — descended
     into node_modules, emitted markdown files from there, ignored README/
     ops/.raw filters.
   - `listTextFiles` (src/core/cycle/transcript-discovery.ts): pruneDir +
     own .txt/.md predicate. DOES NOT use isSyncable({strategy:'markdown'})
     because transcripts accept .txt and don't share markdown sync's
     README/ops exclusions (codex C12). Also made RECURSIVE — pre-fix
     it walked only the top dir, so transcripts in `corpus/2026/` were
     invisible (codex C14 — descent-time pruning is the right shape but
     the test would have passed vacuously on a non-recursive walker).
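
The descent-time pruning described in the three points above can be sketched as follows. `pruneDir` follows the commit's stated contract (false means "do not descend"), but the bodies are assumptions; `isSyncableSketch`, the in-memory `Tree` type, and `walk` are hypothetical stand-ins that avoid touching the filesystem and omit the real isSyncable's README and strategy handling.

```typescript
// Hypothetical sketch; not the code in src/core/sync.ts.
function pruneDir(name: string): boolean {
  if (name === "node_modules") return false; // no leading dot, so listed explicitly
  if (name.startsWith(".")) return false; // .git, .obsidian, .raw, .cache, ...
  if (name === "ops") return false;
  if (name.endsWith(".raw")) return false; // sidecar dirs such as people/pedro.raw/
  return true;
}

// Path-level check: every directory segment must pass pruneDir.
function isSyncableSketch(relPath: string): boolean {
  const segments = relPath.split("/");
  const file = segments.pop() ?? "";
  return segments.every((s) => pruneDir(s)) && file.endsWith(".md");
}

// In-memory directory tree: null marks a file, an object marks a directory.
type Tree = { [name: string]: Tree | null };

// Recursive walker that consults pruneDir BEFORE descending.
function walk(tree: Tree, prefix: string = ""): string[] {
  const out: string[] = [];
  for (const name of Object.keys(tree)) {
    const child = tree[name];
    const path = prefix ? `${prefix}/${name}` : name;
    if (child === null) {
      if (name.endsWith(".md")) out.push(path);
    } else if (pruneDir(name)) {
      out.push(...walk(child, path));
    }
  }
  return out;
}

const tree: Tree = {
  people: { "pedro.md": null, "pedro.raw": { "source.md": null } },
  node_modules: { "some-pkg": { "README.md": null } },
  ".git": { "notes.md": null },
  corpus: { "2026": { "call.md": null } },
};
console.log(walk(tree)); // [ "people/pedro.md", "corpus/2026/call.md" ]
console.log(isSyncableSketch("node_modules/some-pkg/README.md")); // false
```

Because the check happens at descent, the walker never visits anything under node_modules, .git, or pedro.raw, and the nested corpus/2026 file shows why the walker has to be recursive for the pruning test to be meaningful.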

Verified blast radius before adding node_modules: every existing
isSyncable caller (sync.ts:558-561 sync filter, frontmatter.ts:264 validate,
brain-writer.ts:305 reverse-write, import.ts:454 import filter) wants
node_modules excluded — this is a latent-bug fix, not a behavior change
for any legitimate caller.

Tests:
- 7 new isSyncable cases including the node_modules CRITICAL regression
- 6 new pruneDir cases (node_modules, dot-prefix, ops, *.raw, content
  dirs that should pass, empty-string default)
- Existing extract.test.ts + extract-fs.test.ts unchanged and passing

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…bootstrap parity

Two follow-up TODOs filed during the v0.35.5.0 dreamy-thompson wave:

1. runThink full rewrite (D5+D7 from plan-eng-review): drop the
   ThinkLLMClient indirection now that v0.36 routes through gateway.chat.
   12+ tests need migration to __setChatTransportForTests. Blocked by
   this wave landing.

2. Supabase parity test for applyForwardReferenceBootstrap (codex C6
   residual): real Docker Postgres E2E catches schema correctness but
   not Supabase pooler/direct-pool routing. The probe uses this.sql but
   PostgresEngine.initSchema chooses a DDL connection; the divergence
   has caused multiple historical wedges (#699, #820 lineage).

Both entries include full context per the CLAUDE.md TODOS-format spec
(what, why, pros, cons, blocked-by, plan reference).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…otstrap

Codex adversarial review during /ship caught a P1: initSchema selected a
DDL connection, took pg_advisory_lock(42) on it, but
applyForwardReferenceBootstrap used `this.sql` (the instance pool) inside.
Bootstrap probes ran outside the lock scope on a different connection.

Failure mode: two concurrent gbrain instances could BOTH enter the
bootstrap block on Supabase transaction-pooler setups because the
advisory lock was held on a different connection than the one running
ALTER TABLE. The pooler's statement_timeout could also kill the probes
mid-flight without affecting the lock-holder, leaving an inconsistent
schema state.

Fix: applyForwardReferenceBootstrap now accepts an optional connection
parameter. initSchema passes the DDL conn (the one holding the lock).
this.sql remains the fallback for any unit-test path that calls bootstrap
directly. PGLite engine doesn't need this change — single connection,
no pooler.
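
The shape of the fix can be sketched with a toy engine. This is an illustration under stated assumptions, not PostgresEngine: the `Sql` function type, the `EngineSketch` class, the single ALTER statement, its column type, and the unlock in `finally` are all hypothetical. Only the optional connection parameter, the `this.sql` fallback, and `pg_advisory_lock(42)` come from the commit message.

```typescript
// Hypothetical sketch of threading the lock-holding connection through bootstrap.
type Sql = (query: string) => Promise<unknown[]>;

class EngineSketch {
  private readonly sql: Sql; // instance pool, possibly behind a pooler

  constructor(sql: Sql) {
    this.sql = sql;
  }

  // Optional connection; falls back to the instance pool for direct callers.
  async applyForwardReferenceBootstrap(conn: Sql = this.sql): Promise<void> {
    await conn("ALTER TABLE files ADD COLUMN IF NOT EXISTS source_id TEXT");
  }

  async initSchema(ddlConn: Sql): Promise<void> {
    await ddlConn("SELECT pg_advisory_lock(42)");
    try {
      // Same connection that holds the lock, so the DDL runs inside its scope.
      await this.applyForwardReferenceBootstrap(ddlConn);
    } finally {
      await ddlConn("SELECT pg_advisory_unlock(42)");
    }
  }
}

// Recording stand-ins for real connections.
const poolLog: string[] = [];
const ddlLog: string[] = [];
const engine = new EngineSketch(async (q) => { poolLog.push(q); return []; });
engine
  .initSchema(async (q) => { ddlLog.push(q); return []; })
  .then(() => console.log(ddlLog.length, poolLog.length)); // 3 0
```

In this sketch every statement issued during initSchema lands on the DDL connection, while a direct call to applyForwardReferenceBootstrap() with no argument still uses the instance pool.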

This was pre-existing (every prior probe used this.sql), but the v0.35.5.0
wave is explicitly about fixing the Supabase upgrade-wedge class. Codex's
position was correct: don't ship the wave with the underlying connection
mismatch still there. The Supabase parity TEST FIXTURE follow-up remains
on TODOS.md (test infra needed to PROVE the fix works under real pooler
topology), but the bug itself is closed.

15/15 bootstrap tests pass. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Six-correctness-fix wave: bootstrap forward-ref class (4 issues + 1 pre-empt),
orphans soft-delete leak (both sides), runThink → gateway.chat adapter,
git worktree vs submodule discriminator, walker pruneDir + descent-time
exclusion, plus a Codex-P1 catch during /ship that threaded the DDL
connection through applyForwardReferenceBootstrap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fold v0.35.5.0 file-level annotations into CLAUDE.md:
- postgres-engine.ts + pglite-engine.ts: 7 new applyForwardReferenceBootstrap
  probes (files.source_id/page_id, oauth_clients.source_id/federated_read,
  sources.archived/archived_at/archive_expires_at) + DDL connection threading
- test/schema-bootstrap-coverage.test.ts: new MIGRATIONS-source introspection
  guard + parseBaseTableColumns comment-stripping fix
- src/core/sync.ts: new pruneDir helper + manageGitignore worktree
  discriminator
- src/core/think/index.ts (new entry): runThink gateway adapter for MCP
  stdio key resolution
- src/core/operations.ts (new entry): findOrphanPages soft-delete filter

Regenerate llms-full.txt via bun run build:llms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@garrytan garrytan merged commit 4446e9f into master May 17, 2026
7 checks passed
ChenyqThu pushed a commit to ChenyqThu/jarvis-knowledge-os-v2 that referenced this pull request May 17, 2026
§6.26 in JARVIS-ARCHITECTURE.md: 9-version upstream sync (108 commits,
v0.34.4 → v0.35.6.0), only 2 real conflicts, PR garrytan#1017 superseded by
v0.35.5.0 garrytan#1111 bootstrap fixwave, brain_score 80/100 unchanged,
3138 pages preserved, ~1h end-to-end vs 3-3.5h plan estimate.

TODO.md: header bumped to post-v0.35.6.0 state, PR-2 entry marked CLOSED
2026-05-17 with supersede link.

README.md: upstream_compat bumped >= 0.35.6.0.

CONSOLIDATION-PLAN.md: Last reviewed bumped to 2026-05-17.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
brandonlipman added a commit to brandonlipman/gbrain that referenced this pull request May 29, 2026
* upstream/master:
  v0.37.0.0 feat(skillpack): registry cathedral — third-party publish + install + 10/10 quality bar (garrytan#1208)
  v0.36.6.0 feat: cross-modal search wave (text↔image + unified column + LLM intent) (garrytan#1165)
  v0.36.5.0 feat: secure DATABASE_URL access for shell jobs (inherit: ["database_url"]) (garrytan#1192)
  v0.36.4.0 feat: brain-health-100 — autonomous remediation via doctor --remediate + Minions (garrytan#1193)
  fix(docs): comprehensive drift audit — contradictions, broken links, stale refs (garrytan#1201)
  v0.36.3.0 feat: dynamic embedding column selection for search (garrytan#1164)
  v0.36.2.0 feat: ZeroEntropy as default + zero-based README rewrite (garrytan#1136)
  v0.36.1.1 fix-wave: community PR triage + 28 atomic fixes (garrytan#1182)
  v0.36.1.0 Hindsight calibration wave: brain learns how you tend to be wrong (garrytan#1139)
  v0.36.0.0 feat(skillpack): scaffold + reference + harvest (retire managed-block install) (garrytan#1130)
  v0.35.8.0 feat(cycle): phantom-page redirect inside extract_facts (garrytan#1138)
  v0.35.7.0 feat: temporal trajectory + founder scorecard (Phases 2-4) (garrytan#1131)
  v0.35.6.0 feat(search): floor-ratio gate for metadata boost stages (closes garrytan#1091) (garrytan#1129)
  v0.35.5.1 fix(doctor): stop counting clean supervisor exits as crashes (garrytan#1108)
  v0.35.5.0 fix wave: bootstrap + orphans + think MCP + worktree + walker (garrytan#1111)
  v0.35.4.0 fix(doctor,entities): supervisor crash classification + bare-name resolver + 58x perf + stub guard observability (garrytan#1085)
  v0.35.3.1 feat(eval): temporal-aware contradiction probe + verdict enum (garrytan#1052)
  v0.35.3.0 fix wave: extract_facts items + git --no-recurse-submodules placement (garrytan#1053)

# Conflicts:
#	src/core/postgres-engine.ts
#	test/schema-bootstrap-coverage.test.ts