
v0.40.3.0 feat: contextual retrieval + cache invalidation gate + 4 deferred-item closures #1323

Merged
garrytan merged 28 commits into master from garrytan/v0.40.3.0-contextual-retrieval
May 23, 2026

@garrytan

Owner

Summary

Two big things shipped, plus four CLI improvements that close v0.40.3.0 deferral items.

Contextual retrieval cathedral (T1-T11 of the original v0.40.3.0 plan):

  • Every chunk is prefixed with its page title before the embed call: <context>Page Title</context>\n followed by the chunk text is what gets embedded, while the canonical chunk_text stays unwrapped for snippets, FTS, the reranker, and debug output. The design is asymmetric: Voyage and ZeroEntropy distinguish query/document inputType natively, and users of symmetric OpenAI embeddings still benefit from document-side orientation.
  • Three-tier ladder: conservative (no wrapper), balanced (title-only, default), tokenmax (per-chunk Haiku synopsis — $1-5 backfill / 10K pages).
  • Migration v90 (renamed from original v81 because master claimed v82-v88 via v0.38/v0.39 cathedrals): pages.contextual_retrieval_mode, pages.corpus_generation, sources.contextual_retrieval_mode, sources.trust_frontmatter_overrides, query_cache.page_generations.
  • New contextual_reindex_per_chunk Minion handler (PROTECTED).
  • Post-upgrade cost prompt; --unsafe-bypass-dream-guard retained.
  • Soft kill switch: gbrain config set search.contextual_retrieval_disabled true.
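
As a rough illustration of the title-tier wrap, the sketch below builds the embedder input while leaving the stored text alone. `wrapForEmbedding` and the `Chunk` shape are hypothetical names for this example, not the PR's actual API (the real helpers live in src/core/embedding-context.ts), and the sample title and text are made up.

```typescript
// Hypothetical shape for illustration; not the PR's actual type.
interface Chunk {
  chunk_text: string; // canonical text used for snippets, FTS, reranker, debug output
}

// Build the string sent to the embedder. The chunk itself is never mutated.
function wrapForEmbedding(title: string | null, chunk: Chunk): string {
  if (!title) return chunk.chunk_text; // no title available: embed the raw text
  return `<context>${title}</context>\n${chunk.chunk_text}`;
}

const chunk: Chunk = { chunk_text: "Revenue grew 12% in Q3." };
const embedInput = wrapForEmbedding("Acme Corp", chunk);
```

Only `embedInput` would go to the embedding provider; `chunk.chunk_text` is what gets stored.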

Cache invalidation gate (the deferred v0.40.3.0+ item):

  • Migration v91: pages.generation BIGINT + query_cache.max_generation_at_store BIGINT + bump_page_generation_fn trigger (BEFORE INSERT OR UPDATE on 10-column allow-list per D6) + pages_generation_idx btree.
  • Two-layer gate at lookup: Layer 1 cheap MAX(generation) bookmark via index (O(log N)); Layer 2 per-page snapshot via jsonb_each + LEFT JOIN pages. Pre-upgrade cache rows with empty {} snapshot are vacuously valid (IRON-RULE backward compat).
  • src/core/search/query-cache-gate.ts (NEW): pure helpers (buildPageGenerationsSnapshot, validateCacheRowAgainstPages, CACHE_GATE_WHERE_CLAUSE).
  • KNOBS_HASH_VERSION 3 → 5 (skipped past 4, reserved by salem's pending graph-signals work). One-time cache-row invalidation on upgrade; refills within TTL.
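
The two-layer check can be pictured as a pure function over a cache row and the current page generations. This is a minimal sketch under assumptions: `isCacheRowValid` and its argument shapes are hypothetical, and the way the layers combine (Layer 1 as a fast accept, Layer 2 as the per-page fallback) is inferred from the description above rather than taken from query-cache-gate.ts.

```typescript
// Hypothetical shapes for illustration only.
type Snapshot = Record<string, number>; // page_id -> generation at store time

interface CacheRow {
  max_generation_at_store: number;
  page_generations: Snapshot;
}

// currentGenerations is assumed to contain live (not soft-deleted) pages only.
function isCacheRowValid(
  row: CacheRow,
  currentMaxGeneration: number,
  currentGenerations: Map<string, number>,
): boolean {
  const ids = Object.keys(row.page_generations);
  // Pre-upgrade rows carry an empty snapshot and are vacuously valid.
  if (ids.length === 0) return true;
  // Layer 1: nothing in the brain has been bumped since the row was stored.
  if (currentMaxGeneration <= row.max_generation_at_store) return true;
  // Layer 2: every page the row depends on must still exist at its stored generation.
  return ids.every((id) => currentGenerations.get(id) === row.page_generations[id]);
}
```

Under this reading, inserting an unrelated page raises MAX(generation) but still yields a HIT, while updating or soft-deleting any page in the snapshot yields a MISS.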

Four deferred-item closures (the other v0.40.3.0+ deferrals):

  • gbrain mounts enable/disable/trust-frontmatter/untrust-frontmatter <id> — 4 new verbs sharing the runSetMountFlag helper. MountEntry interface extended with trust_frontmatter_overrides?: boolean (default false; mounts opt in explicitly per D15 security posture).
  • gbrain sources set-cr-mode <id> <none|title|per_chunk_synopsis> — per-source CR mode override. Pass unset/default/"" to clear (NULL falls through to global). Missing source ID fails loudly with paste-ready gbrain sources list hint.
  • gbrain config set search.mode tokenmax — mode-switch banner fires with cost preview + (on TTY + active Minion worker) offers to submit gbrain reindex --markdown as a Minion job. Non-TTY callers get a paste-ready hint to stderr. Suppress with GBRAIN_NO_MODE_SWITCH_UX=1 for CI fixtures.
  • RemediationStep refactor: Remediation lifted from brain-score-recommendations.ts into src/core/remediation-step.ts as RemediationStep with makeRemediationStep factory using canonical-JSON-based deterministic IDs (codex D12 Bug 2). Doctor's integrity + sync_failures checks emit RemediationSteps via the factory. 3 new Minion handlers (lint-fix, integrity-auto, sync-retry-failed) wire into the consumer pattern. sync-skip-failed deliberately NOT emitted per codex D12 Bug 3 (auto-skip hides data loss; remains as direct CLI).
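
To make the "canonical-JSON-based deterministic IDs" idea concrete, here is a small sketch. `canonicalJson` sorts object keys before serializing, so two parameter objects that differ only in key order hash to the same ID. The `stepId` helper, its `job`/`params` fields, and the choice of a 16-character SHA-256 prefix are assumptions for illustration, not the shape of makeRemediationStep.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with object keys sorted so logically equal values give identical strings.
function canonicalJson(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalJson).join(",")}]`;
  const keys = Object.keys(value).sort();
  const body = keys.map((k) => `${JSON.stringify(k)}:${canonicalJson(value[k] as Json)}`);
  return `{${body.join(",")}}`;
}

// Hypothetical ID derivation: hash the canonical form of the step's identity.
function stepId(job: string, params: Json): string {
  return createHash("sha256").update(canonicalJson({ job, params })).digest("hex").slice(0, 16);
}
```

Because the ID depends only on content, re-running doctor with the same findings would propose the same step rather than a duplicate.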

Other folded-in work:

  • Race fix (D24): NULL→non-NULL upsert race between concurrent embed workers closed via embedded_at tiebreaker. Closes long-standing v0.35.x TODO.
  • Master merge: absorbed v0.38.0.0 (ingestion cathedral), v0.38.1.0 (agents), v0.38.2.0 (doctor), v0.39.0.0 (brainstorm cathedral), v0.39.1.0 (schema packs), v0.39.2.0 (autopilot/cycle-lock) — 6 commits.

Test Coverage

SCHEMA + TRIGGER
  ├── migration v90 (contextual_retrieval_columns)
  │   └── [★★★] migrate.test.ts structural + PGLite roundtrip
  └── migration v91 (pages.generation + trigger + idx)
      ├── [★★★] migrate.test.ts v91 structural
      ├── [★★★] operations.test.ts trigger fires on each content column
      ├── [★★★] operations.test.ts trigger fires on INSERT
      └── [★★★] schema-bootstrap-coverage REQUIRED set extension

CACHE GATE
  ├── query-cache-gate.ts (15 unit cases)
  │   └── validator + snapshot builder + SQL fragment regression
  └── lookup/store SQL (6 e2e cases)
      ├── store→HIT, content UPDATE→MISS, INSERT→HIT (codex #4 case)
      ├── legacy row→HIT (IRON-RULE backward compat)
      └── soft-delete→MISS, multi-page partial bump→MISS

MODE-SWITCH UX (18 unit cases)
  ├── summarizeTransition 5-cell matrix + invalid fallthrough
  ├── probeWorkerAvailable 3 branches (PGLite-backed)
  └── buildReindexIdempotencyKey content-stable invariance (codex D12 Bug 1)

MOUNTS/SOURCES CLI
  ├── mounts-cli.test.ts (5 new cases: enable/disable, trust cycles, rejections, idempotent enable)
  └── sources-set-cr-mode.test.ts (10 cases: 3 modes × happy, unset × 2, rejections × 3, round-trip)

REMEDIATIONSTEP (17 unit cases)
  ├── canonicalJson determinism (codex D12 Bug 2)
  ├── idempotencyKey shape + invariance
  └── makeRemediationStep factory + back-compat alias

CAPTURE-GENERATION REGRESSION (3 e2e cases, codex #5 strengthened)
  ├── INSERT path bumps MAX(generation)
  ├── UPDATE path bumps generation
  └── idempotent re-capture does NOT bump

Tests: 9389 → 9482 (+93 new across 7 new test files + 5 extended files)

Pre-Landing Review

Two rounds of /plan-eng-review + one /codex outside voice on top of the original v0.40.3.0 codex pass. 8 codex findings absorbed before merge:

  • Trigger allow-list too narrow. Original plan watched 6 columns. Codex flagged title/type/page_kind/content_hash as user-visible. Widened to 10 columns.
  • BEFORE UPDATE missed INSERTs. ingest_capture uses INSERT ... ON CONFLICT DO UPDATE; fresh slugs took INSERT path. Fixed: trigger now BEFORE INSERT OR UPDATE; INSERT branch sets generation = COALESCE(MAX, 0) + 1.
  • Migration version collision. Master claimed v82-v88. Our v81 → renamed to v90 (skipping v89 reserved by trajectory-routing). New trigger work at v91.
  • KNOBS_HASH_VERSION collision with salem. Both waves wanted v=4. Per D8 we bumped to v=5.
  • D3 regression test too weak. Codex pointed out that the INSERT path (codex #4) needs explicit coverage. The T11 test now exercises both INSERT + UPDATE paths.
  • v0.38 provenance columns verified at implementation time — not in trigger allow-list (channel metadata, not content).
  • Index shape simplified to plain (generation) (not DESC). CONCURRENTLY on Postgres.
  • 6 master commits absorbed, not 5 (v0.39.2.0 autopilot/cycle-lock was missed in the original count).

TODOS

No TODOs.md updates in this PR (the deferred items list at the top of v0.40.3.0's CHANGELOG was the de facto TODO; this PR closes them).

Test plan

  • bun run verify: clean (typecheck + 7 pre-checks)
  • bun run test: 9482 pass / 0 fail / 0 skip in 410-516s
  • bun run test:e2e: included via parallel suite (DATABASE_URL not required for the new e2e cache-gate + capture-regression tests — they use PGLite)
  • Smoke test against fresh PGLite brain documented in skills/migrations/v0.40.3.0.md

🤖 Generated with Claude Code

garrytan and others added 28 commits May 22, 2026 12:26
Five additive columns + Page/SourceRow type extensions + CRMode discriminated
union land the schema foundation for v0.40.3.0 contextual retrieval. All
columns are NULL-tolerant; existing rows continue working unchanged until
the post-upgrade reembed sweep catches up.

Schema (migration v81 + schema.sql + pglite-schema.ts mirror):
- pages.contextual_retrieval_mode TEXT NULL — tier the page was last
  embedded under. NULL on pre-v81 rows; drift detection treats NULL as
  'none' for reindex predicates.
- pages.corpus_generation TEXT NULL — composite hash of
  (synopsis_prompt_version, haiku_model, title_wrapper_version,
  embedding_model) per D27 P1-5. Document-side provenance for the
  v0.40.3.0 query_cache.page_generations invalidation contract.
- sources.contextual_retrieval_mode TEXT NULL — per-source override.
  CLI-write-only per D15 security gate.
- sources.trust_frontmatter_overrides BOOLEAN DEFAULT FALSE — per-source
  mount-frontmatter trust gate per D15. Host source (id='default') is
  always trusted in the resolver regardless of column value.
- query_cache.page_generations JSONB DEFAULT '{}' — D27 P1-5 invalidation
  contract foundation. Per-row tag of {page_id: corpus_generation} so
  lookup can LEFT JOIN against current pages and exclude stale rows.

Types (src/core/types.ts + src/core/sources-ops.ts):
- New CR_MODES = ['none', 'title', 'per_chunk_synopsis'] as const +
  CRMode type union + isCRMode() type guard for parsing untrusted
  frontmatter / config values.
- Page interface extended with contextual_retrieval_mode + corpus_generation
  (optional, NULL-tolerant for pre-v81 rows).
- SourceRow interface extended with contextual_retrieval_mode +
  trust_frontmatter_overrides (optional for pre-v81 brains).

Bootstrap coverage:
- All four pages/sources columns are in PGLITE_SCHEMA_SQL CREATE TABLE
  bodies (fresh installs get them at initSchema time).
- query_cache.page_generations is exempt because query_cache itself is
  migration-created (added in v55, not in PGLITE_SCHEMA_SQL). Same
  rationale as the existing query_cache.knobs_hash exemption.

Pinned by the migrate.test.ts v81 round-trip + the schema-bootstrap-coverage
parser (which also gained the query_cache.page_generations exemption).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bumps the markdown chunker version so the post-upgrade reembed sweep finds
every page on the old chunker version and re-embeds it through the new
contextual-retrieval wrapper path. Chunk boundaries themselves are
unchanged from v2 — the bump forces re-embed (not re-chunk) so existing
pages pick up the wrapper without recomputing chunk splits.

JSDoc on MARKDOWN_CHUNKER_VERSION updated to document the v3 semantic
("chunks embed with optional contextual retrieval wrapper per Anthropic's
published methodology"). Pins the dependency between the chunker version
bump and the upcoming src/core/contextual-retrieval-service.ts (T5).

Test fixture in test/chunkers/recursive.test.ts updated to assert v3 with
a brief comment on the bump rationale so future contributors see the
v0.40.3.0 reason inline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four new pure modules under src/core/ that the upcoming service layer (T5)
and Minion handler (T6) compose. All four are testable in isolation; no
engine I/O, no filesystem reads outside the synopsis source-text fallback
chain (which is invoked by the service, not the modules themselves).

src/core/contextual-retrieval-resolver.ts (D5+D6+D15+D26 P0-4):
- resolveContextualRetrievalMode() walks the three-source override chain:
  page frontmatter > source row > global mode bundle. Returns a tagged
  result with source attribution + invalid_frontmatter_value (D13) +
  frontmatter_rejected_untrusted_mount (D15) for doctor surfacing.
- crModeDistinct() helper for D26 P0-4 IS DISTINCT FROM semantics on
  app-side CRMode comparisons (NULL-aware; fixes the drift bug Codex pass 2
  caught, where a plain != comparison misses NULL).
- HOST_SOURCE_ID = 'default' always trusted regardless of
  trust_frontmatter_overrides; mount sources require the explicit flag
  per D15 security gate.

src/core/embedding-context.ts (D20-T1 + D20-T4 + Codex T5 title-weakness):
- buildContextualPrefix(title, synopsis) → null | wrapped block. Handles
  title-only, summary-only, both, or neither.
- wrapChunkForEmbedding(text, prefix, chunkSource) short-circuits on
  chunk_source='fenced_code' per D20-T4 (code chunks inside markdown
  pages skip the wrapper — prepending page title to a code block doesn't
  help cross-modal retrieval).
- sanitizeTitle/sanitizeSynopsis strip </context> (injection vector) and
  collapse whitespace + cap at 300 chars.
- extractFirstTwoSentences() pure regex with CJK_SENTENCE_DELIMITERS
  from src/core/cjk.ts for the title-tier free fallback path.
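
A sketch of the sanitizing and short-circuit behavior listed above, under stated assumptions: the function bodies and the MAX_CONTEXT_CHARS name are guesses at one reasonable implementation, and `wrapChunk` is a hypothetical stand-in for wrapChunkForEmbedding. The strip step loops until the string stops changing so that a nested payload such as `</con</context>text>` cannot reassemble into a closing tag.

```typescript
const MAX_CONTEXT_CHARS = 300; // cap taken from the commit text

// Strip closing-tag injections, collapse whitespace, trim, and cap the length.
function sanitizeTitle(raw: string): string {
  let out = raw;
  let prev: string;
  do {
    prev = out;
    out = out.replace(/<\/context>/gi, "");
  } while (out !== prev);
  return out.replace(/\s+/g, " ").trim().slice(0, MAX_CONTEXT_CHARS);
}

// Code chunks and chunks without a prefix are embedded as-is.
function wrapChunk(text: string, prefix: string | null, chunkSource: string): string {
  if (prefix === null || chunkSource === "fenced_code") return text;
  return `${prefix}\n${text}`;
}
```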

src/core/page-summary.ts (D27 P1-2 + D27 P1-4 + D21 reversal):
- generatePerChunkSynopsis() routes through gateway.chat(tier='utility').
- Richer failure envelope per D27 P1-2: refusal/empty/malformed (→ D14
  page-level fall-back) vs auth_failure/rate_limit/timeout/network/
  provider_5xx (→ retry per gateway, or throw to Minion retry).
- buildSynopsisCacheKey() composes the LRU key per D27 P1-4:
  (content_hash, chunk_index, corpus_generation, source_text_hash).
- DELIBERATELY no calibration injection — D21 reversed D7's calibration-
  aware acceptance. Mutable answer-time bias tags don't belong in static
  document vectors. Query-side personalization is the v0.41+ home.
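
The failure envelope splits into two handling paths, which a small classifier can make explicit. The failure kinds below are the ones named above; `actionForFailure`, `synopsisCacheKey`, and the `:` separator are hypothetical, shown only to illustrate the routing and the four-part cache key.

```typescript
type SynopsisFailureKind =
  | "refusal" | "empty" | "malformed"
  | "auth_failure" | "rate_limit" | "timeout" | "network" | "provider_5xx";

type FailureAction = "page_fallback" | "retry";

// Content-level failures demote the whole page; infrastructure failures are retried.
function actionForFailure(kind: SynopsisFailureKind): FailureAction {
  switch (kind) {
    case "refusal":
    case "empty":
    case "malformed":
      return "page_fallback";
    case "auth_failure":
    case "rate_limit":
    case "timeout":
    case "network":
    case "provider_5xx":
      return "retry";
  }
}

// Four-part key: any change to content, position, provenance, or source text misses the cache.
function synopsisCacheKey(
  contentHash: string,
  chunkIndex: number,
  corpusGeneration: string,
  sourceTextHash: string,
): string {
  return [contentHash, chunkIndex, corpusGeneration, sourceTextHash].join(":");
}
```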

src/core/audit-synopsis.ts (D17, mirrors v0.35.0.0 rerank-audit precedent):
- Failure-only JSONL writer at ~/.gbrain/audit/synopsis-failures-YYYY-Www.jsonl
  with ISO-week rotation. Deliberately no success logging (10K+ pages per
  backfill would generate 10K+ JSONL rows of noise; failure signal is the
  actionable one).
- summarizeSynopsisFailures() aggregator returns SynopsisFailureSummary
  for doctor's synopsis_refusal_rate check.
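
For the ISO-week rotation, a filename helper might look like this. It is a sketch assuming UTC dates and the standard "Thursday of the same week" ISO 8601 rule; `isoWeekStamp` and `auditFileName` are hypothetical names, and the real module may compute the week differently.

```typescript
// Return the ISO 8601 week label (e.g. "2026-W21") for a date, evaluated in UTC.
function isoWeekStamp(date: Date): string {
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
  const day = d.getUTCDay() || 7; // Monday = 1 ... Sunday = 7
  d.setUTCDate(d.getUTCDate() + 4 - day); // move to the Thursday of this ISO week
  const yearStart = Date.UTC(d.getUTCFullYear(), 0, 1);
  const week = Math.ceil(((d.getTime() - yearStart) / 86400000 + 1) / 7);
  return `${d.getUTCFullYear()}-W${String(week).padStart(2, "0")}`;
}

function auditFileName(date: Date): string {
  return `synopsis-failures-${isoWeekStamp(date)}.jsonl`;
}
```

Every day from Monday through Sunday maps to the same file, and the first days of January can belong to the previous year's last week.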

Clean typecheck across the four modules. Tests land in T14 alongside the
service + Minion handler so the test layer can integrate the full path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three-tier wrapper ladder gated by search.mode lands in the bundle. The
per-mode defaults match the cost-tier philosophy (D2):

  conservative → 'none'                (minimum surface)
  balanced     → 'title'                (free at runtime; pure string concat)
  tokenmax     → 'per_chunk_synopsis'   (Anthropic's published method)

Plus the D18 soft kill switch (contextual_retrieval_disabled) so a single
config-key flip neutralizes wrapping for queries AND new embeds without
touching the migration path.

src/core/search/mode.ts:
- ModeBundle: contextual_retrieval: CRMode + contextual_retrieval_disabled.
- All three frozen MODE_BUNDLES updated with the per-tier defaults.
- SearchKeyOverrides + SearchPerCallOpts: both fields optional in the
  per-key config + per-call surfaces.
- resolveSearchMode's pick chain threads both new fields through the
  standard per-call > per-key > mode bundle precedence ladder.
- KNOBS_HASH_VERSION 3→4. Two new entries appended to knobsHash() parts
  list (append-only per CDX2-F13 convention): cr=${cr_mode} +
  crd=${0|1}. A query against a tokenmax-mode brain can no longer be
  served from a cache row written when the brain was on balanced — they
  sit in different embedding spaces.
- SEARCH_MODE_CONFIG_KEYS: 'search.contextual_retrieval' +
  'search.contextual_retrieval_disabled' added.
- loadOverridesFromConfig reads both keys; CR_MODES guard rejects typos
  (drift typos still fall through to mode default per D13 sync-failure
  semantics; this is the no-typo path).
- Imports CR_MODES + CRMode from src/core/types.ts.
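
The precedence ladder and the two appended hash parts can be sketched as below. The names `MODE_DEFAULTS`, `resolveKnobs`, and `crHashParts` are hypothetical, and the `false` default for contextual_retrieval_disabled in every bundle is an assumption; only the per-tier CR defaults and the `cr=` / `crd=` part formats come from the commit text.

```typescript
type CRMode = "none" | "title" | "per_chunk_synopsis";
type SearchMode = "conservative" | "balanced" | "tokenmax";

interface CRKnobs {
  contextual_retrieval: CRMode;
  contextual_retrieval_disabled: boolean;
}

const MODE_DEFAULTS: Record<SearchMode, CRKnobs> = {
  conservative: { contextual_retrieval: "none", contextual_retrieval_disabled: false },
  balanced: { contextual_retrieval: "title", contextual_retrieval_disabled: false },
  tokenmax: { contextual_retrieval: "per_chunk_synopsis", contextual_retrieval_disabled: false },
};

// Precedence: per-call > per-key config > mode bundle.
function resolveKnobs(
  mode: SearchMode,
  perKey: Partial<CRKnobs> = {},
  perCall: Partial<CRKnobs> = {},
): CRKnobs {
  const bundle = MODE_DEFAULTS[mode];
  return {
    contextual_retrieval:
      perCall.contextual_retrieval ?? perKey.contextual_retrieval ?? bundle.contextual_retrieval,
    contextual_retrieval_disabled:
      perCall.contextual_retrieval_disabled ??
      perKey.contextual_retrieval_disabled ??
      bundle.contextual_retrieval_disabled,
  };
}

// The two entries appended to the knobs-hash parts list.
function crHashParts(k: CRKnobs): string[] {
  return [`cr=${k.contextual_retrieval}`, `crd=${k.contextual_retrieval_disabled ? 1 : 0}`];
}
```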

src/commands/search.ts:
- KNOB_DESCRIPTIONS picks up the two new entries so `gbrain search modes`
  dashboard renders them with description copy.

test/search-mode.test.ts:
- Three canonical bundle tests updated with the per-tier CR defaults.
- KNOBS_HASH_VERSION expectation bumped 3→4 with inline rationale.

Clean typecheck + 42 search-mode tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two writers racing on the same chunk (autopilot sync + manual `embed --stale`
+ contextual reindex) previously raced last-writer-wins via the text-
unchanged branch's `COALESCE(EXCLUDED.embedding, content_chunks.embedding)`.
Pre-v0.40.3 the cost of an overwrite was one wasted ~$0.000001 text-
embedding-3-large call. With v0.40.3's per-chunk Haiku synopsis on tokenmax,
the cost rises ~300x to ~$0.0003 per overwritten chunk plus the discarded
synopsis work. On a 10K-page tokenmax brain, a few percent overwrite rate
during concurrent backfill+sync wastes $1-5 of Haiku spend silently.

Fix (mirrored exactly in postgres-engine.ts + pglite-engine.ts so both
engines stay parity-pinned):

  embedding = CASE
    WHEN EXCLUDED.chunk_text != content_chunks.chunk_text THEN EXCLUDED.embedding
    WHEN content_chunks.embedding IS NULL THEN EXCLUDED.embedding
    WHEN EXCLUDED.embedded_at IS NOT NULL
         AND (content_chunks.embedded_at IS NULL OR EXCLUDED.embedded_at > content_chunks.embedded_at)
         THEN EXCLUDED.embedding
    ELSE content_chunks.embedding
  END,
  embedded_at = CASE
    WHEN EXCLUDED.chunk_text != content_chunks.chunk_text AND EXCLUDED.embedding IS NULL THEN NULL
    WHEN content_chunks.embedding IS NULL AND EXCLUDED.embedding IS NOT NULL THEN EXCLUDED.embedded_at
    WHEN EXCLUDED.embedded_at IS NOT NULL
         AND (content_chunks.embedded_at IS NULL OR EXCLUDED.embedded_at > content_chunks.embedded_at)
         THEN EXCLUDED.embedded_at
    ELSE content_chunks.embedded_at
  END,

The two columns move together via aligned CASE WHEN logic — embedding +
embedded_at stay consistent so `embed --stale` (predicate
`embedding IS NULL`) keeps working correctly.

Behavior summary for the text-unchanged branch:
  - existing embedding NULL → take new (cold path, no race)
  - new is fresher (embedded_at > existing) → take new
  - otherwise → keep existing (slower writer with stale embedding loses)
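
The same decision table can be modeled as a pure function, which is handy for reasoning about the race without a database. This is an illustrative transcription of the two CASE expressions above, branch for branch; `ChunkState` and `mergeOnConflict` are hypothetical names, not engine code.

```typescript
interface ChunkState {
  chunk_text: string;
  embedding: number[] | null;
  embedded_at: Date | null;
}

// existing = content_chunks row, incoming = EXCLUDED row.
function mergeOnConflict(
  existing: ChunkState,
  incoming: ChunkState,
): Pick<ChunkState, "embedding" | "embedded_at"> {
  const textChanged = incoming.chunk_text !== existing.chunk_text;
  const incomingFresher =
    incoming.embedded_at !== null &&
    (existing.embedded_at === null ||
      incoming.embedded_at.getTime() > existing.embedded_at.getTime());

  let embedding: number[] | null;
  if (textChanged) embedding = incoming.embedding;
  else if (existing.embedding === null) embedding = incoming.embedding;
  else if (incomingFresher) embedding = incoming.embedding;
  else embedding = existing.embedding; // slower writer with a stale embedding loses

  let embedded_at: Date | null;
  if (textChanged && incoming.embedding === null) embedded_at = null;
  else if (existing.embedding === null && incoming.embedding !== null) embedded_at = incoming.embedded_at;
  else if (incomingFresher) embedded_at = incoming.embedded_at;
  else embedded_at = existing.embedded_at;

  return { embedding, embedded_at };
}
```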

Closes the v0.35.x TODOS.md item that flagged this race pre-existing.
v0.40.3 fold-in lands the fix when the wave amplifies the cost vector,
per D24 in the eng-review pass.

100 pglite-engine tests pass + clean typecheck. E2E concurrent-writer
test lands in T14 alongside the broader test suite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Centerpiece service module. Single source of truth for "re-embed one page
with the active CR mode" — composed by import-file.ts (sync time),
reindex.ts (batch sweep), and the contextual-reindex-per-chunk Minion
handler (T6). Closes the drift class Codex pass 2 P1-1 flagged: each
consumer no longer hand-rolls the embed-then-stamp flow, so there's
literally no way for them to diverge.

src/core/contextual-retrieval-service.ts:
- reembedPageWithContextualRetrieval() implements the D26 P0-2 two-phase
  build pattern.
  PHASE 1 (in-memory, no DB writes):
    - Load page + source + chunks
    - Resolve effective CR mode (resolver) with optional kill-switch
      short-circuit per D18
    - 'none' tier: skip wrap, stamp column, return early (records page
      is up-to-date relative to current state so reindex sweep doesn't
      re-walk it)
    - 'title' tier: pure string concat with sanitized title prefix
    - 'per_chunk_synopsis' tier: read source text via fallback chain (D11),
      generate synopsis per chunk SEQUENTIALLY within page (D10), batch
      embedBatch ONCE per page (D27 P2-2). Rate-leasing hooks
      (acquireSynopsisLease/releaseSynopsisLease) supplied by the Minion
      handler; inline callers rely on gateway-level retry.
    - On refusal/empty/malformed (per D27 P1-2): RESTART PHASE 1 at
      'title' tier — D14 page-level consistency (whole page demoted, no
      mid-state on disk).
  PHASE 2 (single DB transaction):
    - tx.upsertChunks() — chunk_text stays canonical per D20-T1; only
      the wrapped string went to the embedder, not into the column.
    - tx.updatePageContextualRetrievalState() — stamps both columns
      atomically with the PHASE 2 chunk upsert (same transaction).
- computeCorpusGeneration() composes the document-side provenance hash
  per D27 P1-5: sha256(cr_mode + synopsis_prompt_version + haiku_model
  + title_wrapper_version + embedding_model_tag).slice(0,16). Future
  prompt edits or model bumps invalidate prior cache rows via the
  query_cache.page_generations LEFT JOIN (lands in T11).
- computeSourceTextHash() for D27 P1-4 synopsis cache key composition.
- expectedModeForPageSourceOnly() helper for the T9 reindex sweep
  predicate.
- ReembedPageResult discriminated union: success | skipped (4 reasons)
  | page_fallback (refusal triggered D14) | transient_error | permanent_error.
  Each consumer dispatches on `kind` to decide retry / surface / commit.
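
A sketch of the provenance hash, assuming Node's built-in crypto module. The five inputs and the 16-character slice follow the description above; the `|` separator, the `CorpusGenerationInput` shape, and the way the fields are joined are assumptions, since the commit text only shows them concatenated.

```typescript
import { createHash } from "node:crypto";

// Hypothetical input shape; field names follow the commit text.
interface CorpusGenerationInput {
  cr_mode: string;
  synopsis_prompt_version: string;
  haiku_model: string;
  title_wrapper_version: string;
  embedding_model_tag: string;
}

// Any change to a provenance field yields a different 16-hex-char generation tag.
function computeCorpusGeneration(input: CorpusGenerationInput): string {
  const material = [
    input.cr_mode,
    input.synopsis_prompt_version,
    input.haiku_model,
    input.title_wrapper_version,
    input.embedding_model_tag,
  ].join("|"); // separator is an assumption; it keeps adjacent fields from running together
  return createHash("sha256").update(material).digest("hex").slice(0, 16);
}
```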

New engine method (added to BrainEngine interface + both engines):
- updatePageContextualRetrievalState(slug, sourceId, mode, corpusGeneration):
  narrow UPDATE of just the two CR-state columns + updated_at. Skips
  soft-deleted rows. Mirrors refreshPageBody's narrow-update pattern so
  we don't fire createVersion on every tier upgrade (which would bloat
  page_versions).

Clean typecheck + 272 existing tests pass (no regressions).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Thin handler (D23) that wires the global Haiku rate-leaser (D26 P0-3) +
delegates re-embed work to contextual-retrieval-service.ts (T5). One job
per page (D10). Submitted by the mode-switch hook (T10), the reindex
sweep (T9), and doctor --remediate (T13).

src/core/minions/handlers/contextual-reindex-per-chunk.ts:
- makeContextualReindexHandler(opts) factory closure.
- Per-chunk Haiku call wrapped in acquireLease/releaseLease against the
  shared key 'anthropic:utility:contextual-synopsis'. Default RPM cap is
  50 (Anthropic Haiku 4.5 published limit); operators on a tier with
  higher quota override via GBRAIN_CONTEXTUAL_HAIKU_RPM env var.
- D27 P2-1 source-id derivation: payload carries only page_slug;
  handler loads the page row and uses its source_id as authoritative.
  Optional expected_source_id field on the payload triggers
  UnrecoverableError on mismatch (stale/malicious payload defense).
- Result classification:
    success / page_fallback (D14)        → ok
    transient_error                       → throw (Minion retries)
    permanent_error                       → UnrecoverableError → dead-letter
- 60s poll-wait per Haiku call when the rate-lease is saturated; gives
  up with explicit error rather than blocking forever.
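
The result classification maps naturally onto an exhaustive switch over the service's discriminated union. In this sketch the `UnrecoverableError` class is a local stand-in for the Minion framework's error type, `settle` is a hypothetical name, and treating `skipped` as ok is an assumption (the table above does not list it).

```typescript
type ReembedPageResult =
  | { kind: "success" }
  | { kind: "skipped"; reason: string }
  | { kind: "page_fallback" }
  | { kind: "transient_error"; message: string }
  | { kind: "permanent_error"; message: string };

// Local stand-in for the framework's non-retryable error type.
class UnrecoverableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "UnrecoverableError";
  }
}

// Return "ok" for terminal-good outcomes; throw so the job runner retries or dead-letters.
function settle(result: ReembedPageResult): "ok" {
  switch (result.kind) {
    case "success":
    case "page_fallback":
    case "skipped": // assumption: nothing left to do, so the job completes
      return "ok";
    case "transient_error":
      throw new Error(result.message); // plain throw: the runner retries
    case "permanent_error":
      throw new UnrecoverableError(result.message); // goes to dead-letter
  }
}
```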

src/core/minions/protected-names.ts:
- contextual_reindex_per_chunk added to PROTECTED_JOB_NAMES with comment
  documenting the cost vector (1-50 Haiku calls per page, bulk MCP
  submission could drain user's Anthropic budget).

src/commands/jobs.ts:
- registerBuiltinHandlers wires the new handler via dynamic import.
- Registered ABOVE autopilot-cycle so the handler is available when
  doctor --remediate proposes contextual_retrieval_coverage steps.

Clean typecheck.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…umns

import-file.ts now resolves the effective CR mode for each page at embed
time and applies the wrapper inline. Per D20-T1 critical invariant, the
stored chunk_text stays canonical (powers FTS, snippets, reranker, debug);
only the wrapped string goes to the embedder.

Inline path scope (cost-discipline choice):
- title-tier: inline wrap is free (pure string concat). Applied directly.
- per_chunk_synopsis tier: TOO EXPENSIVE for the inline import path
  (one Haiku call per chunk on every sync would compound into hours of
  blocking per `gbrain sync`). The inline path lands the page at the
  title tier; the Minion-driven contextual reindex (T6 handler) upgrades
  it to per_chunk_synopsis later when the user accepts the cost prompt
  in the mode-switch hook (T10). Per D3 explicit-consent contract.
- 'none' tier (conservative mode, or wrapping disabled via the kill switch): no wrapping,
  raw chunk_text → embedder unchanged from pre-v0.40.3 behavior.

Code chunks (chunk_source='fenced_code') always bypass wrapping per
D20-T4 — wrapChunkForEmbedding short-circuits.

Stamping (alongside putPage in the same transaction):
- pages.contextual_retrieval_mode → tier the page was just embedded at
- pages.corpus_generation → composite hash via computeCorpusGeneration
  from the service module. NULL when 'none' tier or noEmbed=true.

Override chain: page frontmatter > source row > global mode bundle (D5+D6).
Mount-frontmatter trust gate (D15) — currently lookup uses defaults for
source row; future T9 reindex sweep + T10 mode-switch hook can pass a
richer source row when the per-source override lands.

Kill switch (D18): when search.contextual_retrieval_disabled=true, the
resolver short-circuits to 'none' and the wrapper is skipped.

Clean typecheck + 251 unit tests pass (migrate + pglite-engine +
import-file all green).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`gbrain reindex --markdown` predicate widens from chunker_version drift
alone to also catch contextual_retrieval_mode IS NULL — the v0.40.3.0
upgrade-path signal that a page has never been evaluated against the
CR ladder (pre-v81 brains where the column is freshly NULL after the
migration ran).

Pages enter the sweep when EITHER:
  (a) chunker_version < MARKDOWN_CHUNKER_VERSION (existing behavior)
  (b) contextual_retrieval_mode IS NULL (new — D26 P0-1 + D26 P0-4 prep)
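
Expressed as an app-side predicate, the widened sweep condition is a simple disjunction. `PageRow` and `needsReindex` are hypothetical names for this sketch; the actual sweep presumably evaluates the equivalent condition in SQL.

```typescript
const MARKDOWN_CHUNKER_VERSION = 3; // value after the T2 bump

interface PageRow {
  chunker_version: number;
  contextual_retrieval_mode: string | null; // NULL means never evaluated against the CR ladder
}

// A page enters the sweep when either signal fires.
function needsReindex(page: PageRow): boolean {
  return (
    page.chunker_version < MARKDOWN_CHUNKER_VERSION || // (a) chunker drift
    page.contextual_retrieval_mode === null // (b) never stamped
  );
}
```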

Since chunker_version 2→3 (T2) already forces every pre-v0.40 page into
(a), the IS NULL clause is effectively belt-and-suspenders for the
case where a brain runs the migration but the chunker_version bump
somehow didn't propagate (concurrent upgrade race, manual SQL edit, etc.).

The re-import path uses importFromContent with forceRechunk:true
(existing v0.32.7 behavior) which bypasses the content_hash short-
circuit so the v0.40.3.0 import-file.ts wrapper application path (T7)
actually applies. Each re-imported page picks up the active CR tier and
stamps contextual_retrieval_mode + corpus_generation atomically.

Page-frontmatter overrides are honored at re-import time (importFromFile
re-parses YAML and the resolver picks the per-page tier). The frontmatter-
mismatch drift case Codex P0-1 called for (user removes override after
initial import) is partially handled here via the IS NULL+forceRechunk
path; a v0.41+ wave can add the explicit "frontmatter may contain
override" candidate path if real users hit drift the current predicate
misses.

Clean typecheck + 230 unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The existing post-upgrade-reembed.ts prompt fires automatically on
`gbrain upgrade` because T2 bumped MARKDOWN_CHUNKER_VERSION 2→3. Prompt
copy extended to explain WHY the re-embed is happening — without this,
users see a "chunker-bump" prompt and wonder if it's a routine internal
refresh vs the actual headline feature ship.

formatReembedPrompt now appends a [contextual retrieval] line below the
chunker-bump cost summary, mentioning that v0.40.3.0 wraps each chunk
with its page title before embedding (Anthropic's published method).

What the user sees on upgrade:
  [chunker-bump] Will re-embed ~N markdown pages via {model}, est.
  ~$X.XX, ~Ymin. Press Ctrl-C within Zs to abort.
  [contextual retrieval] v0.40.3.0 wraps each chunk with its page
  title before embedding (Anthropic's published method).

Title-tier wrap is free at runtime (pure string concat, no Haiku) so
the cost number stays unchanged from the chunker-bump-only case. The
per-chunk Haiku synopsis tier is OPT-IN via
`gbrain config set search.mode tokenmax` post-upgrade, which fires the
contextual_reindex_per_chunk Minion handler (T6) for the backfill.

T10 mode-switch hook in src/commands/config.ts (the explicit per-mode
cost prompt UX on `gbrain config set search.mode tokenmax`) is deferred
to v0.40.3.1 — the explicit-consent contract (D3) is satisfied by the
existing post-upgrade prompt for the title-tier path that the wave
ships by default. The Minion handler from T6 + the protected-name
guard ensure that any direct Minion submission for the per-chunk path
is gated on the CLI/doctor-remediate trust boundary.

Kill switch (D18): the contextual_retrieval_disabled config key is
honored at import time (T7) and in the service (T5) — when true, the
resolver short-circuits to 'none' regardless of mode bundle. No
hybridSearch changes needed: queries embed raw text already; the kill
switch only affects NEW embeds. Existing wrapped vectors keep serving
queries via cosine similarity (asymmetric retrieval is preserved).

11 upgrade-reembed-prompt tests pass + clean typecheck.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
T11 (query_cache.page_generations contract): the DB column shipped in
T1 migration v81 + KNOBS_HASH_VERSION 4 bump in T4 invalidates the
common-case cache contamination (full-brain mode upgrade). The LEFT JOIN
read-side gate per Codex P1-5 — for the edge case where a brain is mid-
reindex and some pages are stamped at corpus_generation N+1 while others
are still at N — is deferred to v0.40.3.1. In practice, the post-upgrade
reembed prompt fires automatically + completes before search resumes on
healthy brains, so the edge case is narrow. CHANGELOG documents the
limitation.

T12 (generic RemediationStep contract): the existing recommendation
registry shape (sync/embed/backlinks/extract hardcoded) is extended via
the doctor check below rather than refactored to a generic registry.
Codex P1-6 called for the refactor; v0.40.3.1+ can absorb it once a real
second consumer requires the same registration shape.

T13 (contextual_retrieval_coverage doctor check):
- New checkContextualRetrievalCoverage() in src/commands/doctor.ts.
- Two SQL signals: pages.chunker_version < current + pages.contextual_
  retrieval_mode IS NULL. Single COUNT...FILTER query is cheap on every
  brain size.
- Audit summary line: reads ~/.gbrain/audit/synopsis-failures-*.jsonl
  via the v0.40.3.0 audit-synopsis module (T3). >5% page-level fallback
  rate surfaces explicitly so operators see the Haiku refusal signal.
- Paste-ready fix: `gbrain reindex --markdown` — the v0.32.7 + v0.40.3.0
  sweep covers both chunker_version drift AND CR mode drift per T9.
- Status: ok when fully aligned + no recent failures; warn when drift
  exists (with the paste-ready fix in the message).
- Wired into the standard doctor run alongside the other v0.36+ checks
  (abandoned_threads, calibration_freshness, etc.).

Sources/mounts CLI surfaces (set-cr-mode + trust-frontmatter) deferred
— the post-upgrade-reembed prompt + the per-page frontmatter override
path cover the v0.40.3.0 operational workflow. Per-source override CLI
is a power-user feature that can ship in v0.40.4+ once real federated-
brain users surface specific friction.

48 doctor tests pass + clean typecheck.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Test suite for the v0.40.3.0 contextual retrieval wave. 77 new test
cases across 5 files, all green. Pins every IRON-RULE invariant
end-to-end so future contributors can't silently regress the wave.

test/contextual-retrieval-resolver.test.ts (29 tests):
- 9-combo override matrix (page-fm > source-row > global, all
  permutations).
- D15 mount-trust gate: host always trusted, mounts honor only when
  trust_frontmatter_overrides=true, rejected frontmatter surfaces via
  result.frontmatter_rejected_untrusted_mount for doctor.
- D13 invalid frontmatter (typo + non-string + empty): falls through
  to source/global with raw value in invalid_frontmatter_value.
- D18 kill switch: short-circuits to 'none' regardless of overrides.
- D26 P0-4 crModeDistinct: NULL-aware comparison, matches SQL IS
  DISTINCT FROM semantics on every combination of NULL/defined args.

test/embedding-context.test.ts (21 tests):
- buildContextualPrefix: title-only, synopsis-only, both, neither.
- wrapChunkForEmbedding: non-code wraps; D20-T4 fenced_code ALWAYS
  bypasses; null prefix passes through; image_asset wraps as text.
- sanitizeTitle: </context> injection stripped (case-insensitive),
  whitespace collapsed, 300-char cap, trim semantics.
- extractFirstTwoSentences: English boundaries, question marks, CJK
  delimiters, run-on cap, empty input, no-delimiter passthrough.
- modeRequiresHaiku / modeRequiresWrapper guards.
- D20-T1 IRON-RULE regression test: wrapping does not mutate input
  string reference (so caller's chunk_text safely flows to upsert).
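
A minimal sketch of the title-only wrapping path these tests describe, assuming the `<context>Title</context>\nchunk` format from the PR summary. The function bodies, the `ChunkKind` union, and the repeat-until-stable stripping loop are illustrative assumptions, not the shipped code.

```typescript
// Hypothetical sketch; names follow the test descriptions, bodies are assumed.
type ChunkKind = "text" | "fenced_code" | "image_asset";

const MAX_TITLE_CHARS = 300;

function sanitizeTitle(raw: string): string {
  let title = raw;
  let previous: string;
  // Strip closing-tag injection case-insensitively. Repeat until stable so a
  // nested payload cannot reassemble into a new closing tag after one pass.
  do {
    previous = title;
    title = title.replace(/<\/context>/gi, "");
  } while (title !== previous);
  // Collapse whitespace runs, trim, then cap the length.
  return title.replace(/\s+/g, " ").trim().slice(0, MAX_TITLE_CHARS).trim();
}

function wrapChunkForEmbedding(
  chunkText: string,
  prefix: string | null,
  kind: ChunkKind
): string {
  // Code fences always bypass the wrapper; a null prefix passes through.
  if (kind === "fenced_code" || prefix === null) return chunkText;
  // Returns a new string; the caller's chunkText is left untouched and can
  // flow to storage unwrapped.
  return "<context>" + prefix + "</context>\n" + chunkText;
}
```

Only the wrapped return value would go to the embedder; the original `chunkText` stays the canonical value for snippets and full-text search.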

test/contextual-retrieval-service-pure.test.ts (16 tests):
- computeCorpusGeneration: 16-char hex, deterministic, mode-sensitive,
  model-sensitive, TITLE_WRAPPER_VERSION stable.
- computeSourceTextHash: D27 P1-4 cache invalidation key composition.
- expectedModeForPageSourceOnly (T9 reindex predicate helper): kill
  switch returns none, source override beats global, invalid override
  falls through, all CR modes round-trip.

test/audit-synopsis.test.ts (11 tests):
- ISO-week filename rotation (stable for same week, different days).
- logSynopsisFailure round-trip: kind, page_level_fallback flag,
  multi-event accumulation, detail 200-char cap.
- summarizeSynopsisFailures aggregation: null on empty, by_kind counts,
  page_level_fallback_rate math.
- Missing audit file returns empty (silent no-op).
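
The ISO-week rotation property (same file for every day in a week, a new file the following Monday) can be illustrated with the sketch below. The filename pattern `synopsis-failures-YYYY-Www.jsonl` and the function names are hypothetical; only the ISO-week rule itself is standard.

```typescript
// Hypothetical sketch of ISO-week audit filename rotation.
// The "synopsis-failures-" prefix and ".jsonl" layout are assumptions.
function isoWeekOf(date: Date): { year: number; week: number } {
  // Work in UTC so the result does not depend on the host timezone.
  const t = new Date(
    Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate())
  );
  // ISO weekdays run Monday=1 .. Sunday=7.
  const day = t.getUTCDay() === 0 ? 7 : t.getUTCDay();
  // Shift to the Thursday of this week; its year is the ISO week-year.
  t.setUTCDate(t.getUTCDate() + 4 - day);
  const yearStart = Date.UTC(t.getUTCFullYear(), 0, 1);
  const dayOfYear = (t.getTime() - yearStart) / 86400000 + 1;
  return { year: t.getUTCFullYear(), week: Math.ceil(dayOfYear / 7) };
}

function auditFilenameFor(date: Date): string {
  const iso = isoWeekOf(date);
  const week = iso.week < 10 ? "0" + iso.week : String(iso.week);
  return "synopsis-failures-" + iso.year + "-W" + week + ".jsonl";
}
```

Keying on the ISO week-year rather than the calendar year keeps the last days of December and the first days of January in one file when they share a week.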

test/e2e/contextual-retrieval-pglite.test.ts (5 tests, hermetic PGLite + gateway stub):
- IRON RULE #1 (D20-T1): wrapper text in embedder input but NEVER in
  content_chunks.chunk_text after import — pins the canonical
  chunk_text separation invariant end-to-end.
- IRON RULE #2 (D14 stamping): pages.contextual_retrieval_mode AND
  pages.corpus_generation are set after every import.
- IRON RULE: chunker_version stamps to current MARKDOWN_CHUNKER_VERSION
  (3 for v0.40.3.0).
- D5 per-page frontmatter override: `contextual_retrieval: none` makes
  the embedder receive UNWRAPPED text; mode column stamped 'none'.
- T9 reindex predicate: pages with contextual_retrieval_mode IS NULL
  enter the sweep regardless of chunker_version.

462 tests pass across all v0.40.3.0 + adjacent suites (migrate,
pglite-engine, search-mode, doctor, import-file, upgrade-reembed-prompt,
schema-bootstrap-coverage, recursive chunker, all five new files).
Zero regressions, clean typecheck.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
VERSION 0.37.11.0 → 0.40.3.0 with package.json sync. CHANGELOG entry
follows the CLAUDE.md ELI10-lead voice rule: opens with "Your search
now understands what each chunk is about, not just what words are in
it," lays out the tier ladder with a real cost table, calls out the
chunk_text storage separation (D20-T1) with a concrete example, and
includes the "Things to watch" + "What we caught and fixed before
merging" sections per the format spec.

CHANGELOG also includes the canonical "To take advantage of v0.40.3.0"
self-repair block with the manual `gbrain apply-migrations --yes` +
`gbrain reindex --markdown` recovery path for users whose
`gbrain upgrade` post-upgrade-reembed didn't fully fire.

skills/migrations/v0.40.3.0.md walks the agent through the mechanical
upgrade flow, the opt-up to tokenmax path with the realistic backfill
cost table, the opt-out soft kill switch flip, and the per-page
frontmatter override with the D15 mount-trust note. Matches the
v0.13.0 + v0.32.7 migration doc structure so agent muscle memory
works.

llms-full.txt + llms.txt regenerated via `bun run build:llms` to pick
up the CHANGELOG + migration doc additions. test/build-llms.test.ts
passes.

Also moved test/audit-synopsis.test.ts → test/audit-synopsis.serial.test.ts
to satisfy the check-test-isolation lint (the test mutates
GBRAIN_AUDIT_DIR via beforeAll/afterAll for a fixture dir, which the
parallel runner forbids in *.test.ts files; serial quarantine is the
canonical fix per CLAUDE.md test-isolation rules).

`bun run verify` passes (typecheck + 4 CI gate checks). 469 tests
across all v0.40.3.0 + adjacent suites pass with 0 failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-T15 test gap-fill: covers the two highest-leverage spots that the
T14 suite didn't exercise.

test/contextual-retrieval-doctor.serial.test.ts (8 tests, .serial because
the doctor check reads the audit JSONL via GBRAIN_AUDIT_DIR env mutation):
- empty-brain → ok
- fully-aligned brain (chunker_version current + mode stamped) → ok
- chunker_version drift → warn with paste-ready `gbrain reindex --markdown`
- NULL mode column → warn surfaces "never evaluated against CR ladder"
- both drift conditions together → warn with both messages
- soft-deleted pages NOT counted (deleted_at filter works)
- non-markdown (code) pages NOT counted (page_kind filter works)
- audit JSONL refusal event surfaces in the failure-summary line

test/e2e/concurrent-embed-race.test.ts (3 tests, D24 regression guard):
- cold path: existing embedding NULL → take new (no-race case)
- IRON RULE: fresher write wins over stale write when text unchanged.
  Pre-fix this would have been last-writer-wins via COALESCE; post-fix
  the fresher embedded_at survives. Pinned by a raw SQL upsert with an
  explicit -5min embedded_at to simulate the slower writer.
- text change with no new embedding → both embedding + embedded_at
  reset to NULL (consistent state so embed --stale picks up).
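
The three scenarios above can be read as one merge rule. The sketch below models that rule as a pure function; the real fix lives in the SQL upsert, so the type and function names here are hypothetical and only the outcomes are taken from the test descriptions.

```typescript
// Hypothetical in-memory model of the D24 upsert semantics.
interface ChunkEmbeddingState {
  chunkText: string;
  embedding: number[] | null;
  embeddedAt: Date | null;
}

function timeOf(d: Date | null): number {
  return d === null ? -Infinity : d.getTime();
}

function mergeChunkWrite(
  existing: ChunkEmbeddingState,
  incoming: ChunkEmbeddingState
): ChunkEmbeddingState {
  if (existing.chunkText !== incoming.chunkText) {
    // Text changed: the old vector is invalid. With no new embedding, both
    // embedding and embeddedAt reset to null so a stale sweep picks it up.
    return {
      chunkText: incoming.chunkText,
      embedding: incoming.embedding,
      embeddedAt: incoming.embedding === null ? null : incoming.embeddedAt,
    };
  }
  // Cold path: nothing stored yet, so take whatever the writer brought.
  if (existing.embedding === null) return incoming;
  // A write without an embedding never clobbers a stored vector.
  if (incoming.embedding === null) return existing;
  // Text unchanged and both sides embedded: the fresher embeddedAt wins,
  // regardless of which writer committed last.
  return timeOf(incoming.embeddedAt) > timeOf(existing.embeddedAt)
    ? incoming
    : existing;
}
```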

Cross-shard contamination fix: race test calls configureGateway with
embedding_dimensions=1536 BEFORE initSchema so the PGLite vector column
sizes consistently regardless of what other tests in the same shard
process configured first. Without this, running the race test alongside
the pglite-e2e test triggered "expected 1280 dimensions, not 1536"
when the gateway was left in its default ZE-1280 state by a prior file.

`bun run verify` passes (typecheck + 5 CI gate checks). 88 tests pass
across all v0.40.3.0 + new gap-fill files in one combined run; zero
shared-state contamination.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ns v81→v90

Absorbs master's v0.38 (ingestion cathedral), v0.38.1 (agents), v0.38.2 (doctor),
v0.39.0 (brainstorm cost cathedral), v0.39.1 (schema packs), and the v0.40.x VERSION
bump on top.

Conflict resolutions:
- VERSION → 0.40.5.0 (this wave's slot; v0.40.4.0 claimed by salem PR #1300)
- package.json → 0.40.5.0
- src/core/migrate.ts → took master's v81 (pages_provenance_columns) + v82-v88;
  appended our contextual_retrieval_columns as v90 (skipped v89 reserved by
  garrytan/v0.40.2.0-trajectory-routing per D7 inspection)
- src/core/search/mode.ts → KNOBS_HASH_VERSION 4→5 (per D8 sequencing behind
  salem's pending v=4 graph signals); both schema_pack hash fields (master)
  and contextual_retrieval hash fields (this branch) preserved
- src/core/types.ts → both v0.38 provenance Page fields and v0.40.3 CR fields
  preserved on the Page interface
- CHANGELOG.md → took master as baseline; v0.40.5.0 entry lands in T9 docs phase
- bun.lock → bun install refreshed to pick up chokidar@^4.0.3 (v0.38 dep)

bun run typecheck passes after merge.
… + index

Migration v90 (renamed from v0.40.3.0 v81 on master merge per D2/D7):
- 5 additive columns (pages.contextual_retrieval_mode, pages.corpus_generation,
  sources.contextual_retrieval_mode, sources.trust_frontmatter_overrides,
  query_cache.page_generations) for the contextual retrieval wave.

Migration v91 (NEW per D6 + codex #4 + codex #8):
- pages.generation BIGINT NOT NULL DEFAULT 1 (per-page generation counter)
- query_cache.max_generation_at_store BIGINT NOT NULL DEFAULT 0 (Layer 1 bookmark)
- bump_page_generation_fn() trigger function:
  - BEFORE INSERT: NEW.generation := COALESCE(MAX(generation), 0) + 1 — codex #4
    INSERT coverage so cache rows stored before a new page existed invalidate
    correctly.
  - BEFORE UPDATE: bumps generation only when at least one allow-list
    column IS DISTINCT FROM its previous value (compiled_truth, timeline,
    frontmatter, deleted_at, contextual_retrieval_mode, title, type,
    page_kind, corpus_generation, content_hash); the allow-list was
    widened per D6 to catch user-visible mutations.
- CREATE INDEX CONCURRENTLY pages_generation_idx ON pages (generation) so
  MAX(generation) for the bookmark check is O(log N) — codex #8 confirmed
  plain btree, no DESC necessary.
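
As a rough model of the trigger's decision logic (the trigger itself is PL/pgSQL and is not reproduced here), the sketch below treats rows as plain string maps. The helper names and the string-only column representation are simplifying assumptions; the allow-list and the INSERT formula come from the description above.

```typescript
// Hypothetical TypeScript model of bump_page_generation_fn's decisions.
const GENERATION_ALLOW_LIST = [
  "compiled_truth", "timeline", "frontmatter", "deleted_at",
  "contextual_retrieval_mode", "title", "type", "page_kind",
  "corpus_generation", "content_hash",
];

// Simplification: every column is modeled as a serialized string or null.
type PageRow = { [column: string]: string | null | undefined };

// BEFORE UPDATE: bump only when an allow-listed column IS DISTINCT FROM
// its old value (null-aware comparison).
function shouldBumpGeneration(oldRow: PageRow, newRow: PageRow): boolean {
  return GENERATION_ALLOW_LIST.some((col) => {
    const before = oldRow[col] === undefined ? null : oldRow[col];
    const after = newRow[col] === undefined ? null : newRow[col];
    return before !== after;
  });
}

// BEFORE INSERT: COALESCE(MAX(generation), 0) + 1.
function generationForInsert(existingGenerations: number[]): number {
  return existingGenerations.reduce((max, g) => Math.max(max, g), 0) + 1;
}
```

Columns outside the allow-list (a hypothetical `updated_at`, for example) never bump the counter in this model, so bookkeeping writes leave cached results valid.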

Mirrored in src/schema.sql, src/core/pglite-schema.ts CREATE TABLE body
(trigger included so fresh PGLite installs get it from the schema blob, not
just migration replay).

Extended REQUIRED_BOOTSTRAP_COVERAGE with pages.contextual_retrieval_mode,
pages.corpus_generation, sources.contextual_retrieval_mode,
sources.trust_frontmatter_overrides, pages.generation. Probes added to
applyForwardReferenceBootstrap on both engines + matching ALTER blocks for
pre-v90/pre-v91 brains.

COLUMN_EXEMPTIONS extended: query_cache.max_generation_at_store (same
rationale as page_generations — query_cache is migration-only, not in
PGLITE_SCHEMA_SQL).

Test results:
- bun test test/migrate.test.ts: 140 pass / 0 fail
- bun test test/schema-bootstrap-coverage.test.ts: 9 pass / 0 fail
- bun run typecheck: clean
New pure module src/core/search/query-cache-gate.ts:
- buildPageGenerationsSnapshot(engine, pageIds) builds the {pageId: gen}
  snapshot + MAX(generation) bookmark in one round trip via UNION ALL.
  Pre-v91 brains (no generation column) fall back to empty snapshot +
  zero bookmark — backward compat with legacy rows preserved.
- validateCacheRowAgainstPages() — pure validator for unit testing.
- CACHE_GATE_WHERE_CLAUSE exported as a SQL fragment that lookup() embeds
  in its WHERE clause. Two-layer gate per D11:
    Layer 1 (cheap): (SELECT MAX(generation) FROM pages) <=
                     qc.max_generation_at_store
    Layer 2 (per-page): jsonb_each + LEFT JOIN pages to detect deletes
                        + bumped pages on the cached result set.
  Legacy compat: rows with empty {} snapshot are vacuously valid (Layer 2
  short-circuits) — IRON-RULE pinned.
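
The two-layer gate can be sketched as a pure validator along the following lines. This is an illustration under assumptions, not the shipped validateCacheRowAgainstPages(): the parameter shapes are hypothetical, and Layer 1 is only sound if every bump moves a page above the previous global maximum, which the sketch takes for granted.

```typescript
// Hypothetical sketch of the two-layer cache gate as a pure function.
interface CacheRowGate {
  maxGenerationAtStore: number;                     // Layer 1 bookmark
  pageGenerations: { [pageId: string]: number };    // Layer 2 snapshot
}

function isCacheRowValid(
  row: CacheRowGate,
  currentMaxGeneration: number,
  currentGenerations: { [pageId: string]: number | undefined }
): boolean {
  // Layer 1 (cheap): nothing anywhere has been bumped since the row was stored.
  if (currentMaxGeneration <= row.maxGenerationAtStore) return true;

  // Layer 2 (per-page): every page in the cached result set must still exist
  // at the generation captured when the row was stored.
  const pageIds = Object.keys(row.pageGenerations);
  for (let i = 0; i < pageIds.length; i++) {
    const current = currentGenerations[pageIds[i]];
    if (current === undefined) return false;                          // page is gone
    if (current !== row.pageGenerations[pageIds[i]]) return false;    // page was bumped
  }
  // An empty snapshot falls through to here, so legacy rows stay valid.
  return true;
}
```

An unrelated INSERT trips Layer 1 but leaves the snapshot intact, so the row still serves; an UPDATE to any cached page fails Layer 2 and forces a miss.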

query-cache.ts wiring:
- lookup() table-aliased to `qc` so the gate fragment can reference
  qc.max_generation_at_store + qc.page_generations. WHERE clause adds
  `AND ${CACHE_GATE_WHERE_CLAUSE}` after the existing similarity + TTL +
  knobs_hash filters.
- store() captures the snapshot via the pure helper, then INSERTs both
  page_generations JSONB and max_generation_at_store BIGINT alongside
  the existing columns. ON CONFLICT (id) DO UPDATE refreshes both.

Test coverage (15 unit + 6 e2e):
- test/query-cache-gate.test.ts: 15 cases covering pure validator
  branches (vacuous valid, bookmark short-circuit, single/multi/partial
  bumps, deleted page, codex D11 critical case), PGLite-backed snapshot
  builder (empty pageIds, populated pageIds, integer JSONB shape,
  non-existent IDs skipped, bump-after-update), SQL shape regression
  on CACHE_GATE_WHERE_CLAUSE.
- test/e2e/cache-gate-pglite.test.ts: 6 cases covering store → HIT,
  content UPDATE → MISS, INSERT new page → HIT (codex #4 case where
  bookmark fires but snapshot intact serves correctly), legacy row →
  HIT (IRON-RULE backward compat), soft-delete → MISS (trigger path),
  multi-page partial bump → MISS.

Test results:
- bun test test/query-cache-gate.test.ts test/query-cache.test.ts
  test/query-cache-isolation.test.ts test/e2e/cache-gate-pglite.test.ts:
  33 pass / 0 fail
- bun run typecheck: clean

Note: hard-delete (raw DELETE FROM pages) is not covered by the trigger
(BEFORE INSERT OR UPDATE doesn't fire on DELETE). Production uses
soft-delete via deleted_at (trigger allow-list catches NULL → timestamp
distinction). Hard-delete via admin-only `gbrain pages purge-deleted` is
best-effort cache-wise — acceptable for the rare admin path.
New module src/core/search/mode-switch-ux.ts:
- summarizeTransition(old, new): pure 5-cell matrix (no_change /
  narrowing / broadening / tokenmax_opt_in / invalid_new_mode) + reindex
  command + cost estimate + paste-ready callout lines.
- probeWorkerAvailable(engine): worker liveness proxy. gbrain has no
  minion_workers heartbeat table yet (B7 follow-up from v0.19.1), so we
  use a proxy: the latest minion_jobs activity inside a 10-min query
  window. Activity within 2 min = active; older than 2 min but within
  10 min = stale; no activity in the window = never_seen.
- buildReindexIdempotencyKey(): content-stable per codex D12 Bug 1.
  Pattern: cr-backfill:<source_id>:<chunker_version>:<mode>. NOT
  timestamp-based — two retries against same brain state dedupe.
- runModeSwitchUx(): orchestrator. Honors GBRAIN_NO_MODE_SWITCH_UX=1
  (full skip), non-TTY (print paste-ready hints to stderr), yesFlag
  (auto-submit reindex). For tokenmax_opt_in + TTY + worker probe
  active: submits via MinionQueue.add with allowProtectedSubmit=true.
  For probe = stale or never_seen: loud-fail per D3 with a "start a
  worker OR run inline" recovery hint — closes the silent-stall
  footgun.
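
The liveness thresholds and the content-stable key pattern can be illustrated as follows. This is a sketch under assumptions: the real probe queries minion_jobs through the engine, and the parameter lists of both functions are hypothetical (the PR shows only the key pattern and the 2-min / 10-min thresholds).

```typescript
// Hypothetical sketch of the worker-liveness classification and the
// content-stable idempotency key. Parameter lists are assumptions.
type WorkerProbe = "active" | "stale" | "never_seen";

const ACTIVE_WINDOW_MS = 2 * 60 * 1000;
const QUERY_WINDOW_MS = 10 * 60 * 1000;

function classifyWorker(lastJobActivity: Date | null, now: Date): WorkerProbe {
  if (lastJobActivity === null) return "never_seen";
  const ageMs = now.getTime() - lastJobActivity.getTime();
  // Activity outside the query window would not be returned at all.
  if (ageMs >= QUERY_WINDOW_MS) return "never_seen";
  return ageMs <= ACTIVE_WINDOW_MS ? "active" : "stale";
}

// Built only from brain state, never from a timestamp, so two retries
// against the same state produce the same key and dedupe.
function buildReindexIdempotencyKey(
  sourceId: string,
  chunkerVersion: number,
  mode: string
): string {
  return "cr-backfill:" + sourceId + ":" + chunkerVersion + ":" + mode;
}
```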

src/commands/config.ts hook (~30 LOC):
- Captures the OLD search.mode BEFORE setConfig so summarizeTransition
  classifies correctly.
- Fires runModeSwitchUx() AFTER setConfig persisted, wrapped in
  try/catch so UX failures never break the config-set that already
  landed.
- Best-effort: failures emit `[mode-switch] UX hook failed (non-fatal)`
  to stderr.

Test coverage (18 cases):
- summarizeTransition: 8 cases covering all 5 transition kinds + null
  inputs + tokenmax-as-first-set + invalid mode.
- probeWorkerAvailable: 4 cases via real PGLite — never_seen / active /
  stale (seeded via minion_jobs) + threshold constant assertion.
- buildReindexIdempotencyKey: 6 cases pinning content-stable contract
  (codex D12 Bug 1) — identical inputs match, different inputs differ,
  consecutive calls match despite time delta (NOT timestamp-based).

Test results:
- bun test test/mode-switch-ux.test.ts: 18 pass / 0 fail
- bun run typecheck: clean
…-frontmatter}

Four new mounts CLI verbs per D4:
- gbrain mounts enable <id>             — re-enable a disabled mount
- gbrain mounts disable <id>            — toggle off without removing
- gbrain mounts trust-frontmatter <id>  — let this mount's per-page
                                          contextual_retrieval_mode
                                          frontmatter override the source
                                          default. Off by default for
                                          mounted brains; host is always
                                          trusted.
- gbrain mounts untrust-frontmatter <id> — clear the trust flag.

Implementation:
- src/core/brain-registry.ts MountEntry interface extended with
  trust_frontmatter_overrides?: boolean. loadMounts() projection threads
  the field through with default false (mounts opt in explicitly per D4
  + D15 security posture).
- src/commands/mounts.ts: new runSetMountFlag() helper handles all 4
  verbs via a shared file-write path. Missing-mount loud rejection
  (GBrainError with list-hint). Host brain rejection. Idempotent: no-op
  when current value already matches. Cache refresh after each write
  so host agents see the new flag immediately.

Test infrastructure:
- GBRAIN_MOUNTS_PATH env override on getMountsPath() in BOTH
  brain-registry.ts AND mounts.ts (the latter has its own
  copy — two source-of-truth paths). Reason: libuv caches homedir()
  on some platforms, so withFakeHome's HOME mutation isn't picked up
  by tests calling runMounts(). Production callers don't set the env.

Test coverage (5 new cases):
- enable → disable → enable cycle persists
- trust-frontmatter → untrust → trust cycle preserves other fields
- missing mount id → loud rejection with list-hint (closes the
  critical gap from idempotent-pebble Failure Modes table)
- host brain rejection: cannot trust-frontmatter "host"
- enable on already-enabled mount: no-op (idempotent)

Test results:
- bun test test/mounts-cli.test.ts test/brain-registry.serial.test.ts:
  54 pass / 0 fail
- bun run typecheck: clean
New verb `gbrain sources set-cr-mode <id> <mode>` per D5:
- Mode argument validated against CR_MODES via isCRMode (closed enum:
  none | title | per_chunk_synopsis).
- "unset" / "default" / "" clears the column to NULL (falls through to
  the global search.mode bundle).
- Loud rejection on:
  - Missing id/mode → exit 2, prints usage
  - Invalid mode → exit 2, lists valid options
  - Missing source id → exit 4, paste-ready `gbrain sources list` hint
    (closes the idempotent-pebble Failure Modes critical gap)
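
The argument-validation rules above can be sketched as a pure parser. The result type and function name are hypothetical; the exit-4 branch (unknown source id) needs a database lookup and is left out of the sketch.

```typescript
// Hypothetical sketch of set-cr-mode argument validation.
const CR_MODES = ["none", "title", "per_chunk_synopsis"] as const;
type CRMode = (typeof CR_MODES)[number];

function isCRMode(value: string): value is CRMode {
  return (CR_MODES as readonly string[]).indexOf(value) !== -1;
}

type SetCrModeParse =
  | { ok: true; id: string; mode: CRMode | null }
  | { ok: false; exitCode: number; message: string };

function parseSetCrModeArgs(args: string[]): SetCrModeParse {
  const id = args[0];
  const mode = args[1];
  if (id === undefined || mode === undefined) {
    return { ok: false, exitCode: 2, message: "usage: gbrain sources set-cr-mode <id> <mode>" };
  }
  // "unset" / "default" / "" clear the column to NULL (fall through to global).
  if (mode === "unset" || mode === "default" || mode === "") {
    return { ok: true, id: id, mode: null };
  }
  if (!isCRMode(mode)) {
    return { ok: false, exitCode: 2, message: "invalid mode; valid options: " + CR_MODES.join(", ") };
  }
  return { ok: true, id: id, mode: mode };
}
```

Returning a result object instead of calling `process.exit` directly keeps the "no mutation on invalid input" property easy to test.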

src/commands/sources.ts wired into the switch dispatch + help text
updated. isCRMode + CR_MODES lazy-imported per existing import pattern
in this file.

Test coverage (10 cases):
- happy path for all 3 valid CRMode values
- unset path via "unset" + "default" both clear to NULL
- invalid mode → exit 2 + no mutation
- missing source id → exit 4
- missing arguments → exit 2 with usage
- missing mode (only id) → exit 2 + no mutation
- round-trip preserves other fields (name)

Test results:
- bun test test/sources-set-cr-mode.test.ts: 10 pass / 0 fail
- bun run typecheck: clean
New canonical module src/core/remediation-step.ts:
- RemediationStep interface (lifted from brain-score-recommendations.ts).
  Same shape; rename to "Step" suffix per D6 for clarity ("a step in a
  remediation plan").
- RemediationSeverity + RemediationStatus type re-exports.
- canonicalJson(value): zero-dep canonical serialization — sorts object
  keys recursively before stringify. Per codex D12 Bug 2: identical
  logical params hash identically regardless of insertion order.
- idempotencyKey(source, job, params): shape
  <source>:<job>:sha8(canonicalJson(params)). Lifted from the legacy
  inline idemKey helper so future check authors don't drift.
- makeRemediationStep(opts): canonical factory. Defaults id to the
  idempotency key (override for human-readable like 'sync.repo').
  Status defaults to 'remediable'. All check authors should use this;
  hand-rolling is the drift hazard the refactor closes.
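
The key-ordering guarantee behind canonicalJson can be sketched in a few lines. This is an illustrative version, not the module's code: the `Json` type is an assumption, and the sha8 hashing step of idempotencyKey is omitted.

```typescript
// Hypothetical sketch of a zero-dependency canonical JSON serializer:
// object keys are sorted recursively, array order is preserved.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function canonicalJson(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    // Array order is meaningful, so it is kept as-is.
    return "[" + value.map((item) => canonicalJson(item)).join(",") + "]";
  }
  // Sort keys so insertion order never affects the output.
  const keys = Object.keys(value).sort();
  const parts = keys.map(
    (key) => JSON.stringify(key) + ":" + canonicalJson(value[key] as Json)
  );
  return "{" + parts.join(",") + "}";
}
```

Hashing this string instead of raw `JSON.stringify` output is what makes logically identical params produce the same idempotency key.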

src/core/brain-score-recommendations.ts:
- Removed the local Remediation + RemediationSeverity + RemediationStatus
  definitions.
- Re-exports them from remediation-step.ts so existing callers (e.g.
  doctor.ts) still resolve. Also re-exports Remediation as an alias
  for RemediationStep so import paths can migrate gradually.
- Imports the Remediation type alias internally so the (substantial)
  existing computeRecommendations body keeps compiling without a sed pass.

Test coverage (17 cases):
- canonicalJson: key-ordering determinism (3 cases), nested objects,
  array order preservation, primitive types, codex D12 Bug 2 regression
- idempotencyKey: shape regex, content invariance, key-ordering
  invariance, source/job/params differentiation
- makeRemediationStep: default id, explicit id override, default status,
  canonical-JSON invariance, all-opts thread-through
- back-compat: `import { Remediation } from brain-score-recommendations`
  still resolves to RemediationStep (compile + runtime check)

Test results:
- bun test test/remediation-step.test.ts: 17 pass / 0 fail
- bun test test/brain-score-recommendations.test.ts test/doctor.test.ts:
  70 pass / 0 fail (back-compat preserved)
- bun run typecheck: clean

Per D6 + D8: T8b in next commit wires lint, integrity, sync_failures
doctor checks to emit RemediationStep via the new factory.
…+ 3 Minion handlers

Doctor checks now emit RemediationStep via makeRemediationStep():
- `integrity` check (when bareHits > 0) emits integrity-auto step.
  Severity escalates to 'high' when bareHits > 50. Deterministic; $0 cost.
- `sync_failures` check (when unacked > 0) emits sync-retry-failed step.
  Severity escalates to 'high' when count >= 10. Content-stable params
  (failure_count + oldest_failure timestamp) per codex D12 Bug 2.
- sync-skip-failed DELIBERATELY NOT emitted per D12 Bug 3 (auto-skipping
  failed syncs hides data loss). Operators retain `gbrain sync --skip-failed`
  as a direct CLI option.

Lint doctor check NOT wired — there is no `lint` check in doctor.ts
today; the lint workflow is the standalone `gbrain lint` command. Adding
a doctor lint check is a v0.41+ TODO when it justifies its own complete
section.

Three new Minion handlers in registerBuiltinHandlers (NOT in
PROTECTED_JOB_NAMES — they're thin wrappers around already-shipping CLI
commands, idempotent, no shell exec, MCP-safe):
- lint-fix       → runLintCore({ fix: true })
- integrity-auto → runIntegrity(['auto'])
- sync-retry-failed → runSync(['--retry-failed'])

Check.remediation field shape upgrade:
- Was: inline Array<{...}> shape.
- Now: RemediationStep[] from the canonical
  src/core/remediation-step.ts. Check authors `import { makeRemediationStep }`
  and emit through the factory.

Test results:
- bun test test/doctor.test.ts: 48 pass / 0 fail (zero regression on
  the doctor surface; new remediation fields are additive)
- bun run typecheck: clean
The v0.38 ingestion cathedral added a new write path to pages via the
`ingest_capture` Minion handler. The v0.40.5.0 cache-invalidation gate
relies on pages.generation being bumped by EVERY write path via the
BEFORE INSERT OR UPDATE trigger.

This file pins that the new v0.38 capture write path correctly bumps
generation through three scenarios:

1. INSERT path (codex #4 INSERT coverage): ingest_capture with a fresh
   slug creates a page with generation = MAX(generation) + 1 so any
   cache row stored before the new page existed has its bookmark fire.
2. UPDATE path: ingest_capture with an existing slug + new content →
   trigger fires on content-column IS DISTINCT FROM and bumps generation.
3. Idempotent UPDATE: capture with the SAME content → trigger
   short-circuits, no bump. Cache freshness preserved on re-runs.

Per codex #5 strengthening: noEmbed: true is set explicitly so the test
doesn't require API keys (test runs against pure PGLite).

Test results:
- bun test test/e2e/capture-generation-regression.test.ts: 3 pass / 0 fail
- bun run typecheck: clean
…+ llms regen

Single combined v0.40.5.0 CHANGELOG entry folds in v0.40.3.0 contextual
retrieval content + v0.40.5.0 wave additions (cache gate + mode-switch
UX + mounts/sources CLI + RemediationStep refactor). Voice per CLAUDE.md:
ELI10 lead, plain language, paste-ready commands, tier table, "Things
to watch", "What we caught and fixed before merging" (summarizes the
8 codex findings + 3 design decisions in user-facing terms), "Itemized
changes", "## To take advantage of v0.40.5.0" mandatory self-repair
block.

CLAUDE.md: new section "Key commands added in v0.40.5.0 (contextual
retrieval + cache gate + 4 CLI verbs)" listing the 4 new mount verbs,
sources set-cr-mode, mode-switch UX, KNOBS_HASH_VERSION bump, 3 new
Minion handlers, and the 3 new modules (remediation-step,
query-cache-gate, mode-switch-ux).

skills/migrations/v0.40.5.0.md: new migration skill with feature_pitch
frontmatter for the auto-update agent. Documents the 6 master commits
merged in, migration v90 (renumber from v81) + v91 (trigger), the
optional opt-up to tokenmax, per-source CR mode overrides, mount
frontmatter trust, the soft kill switch, and the backward-compat
guarantees.

bun run build:llms refreshed llms.txt + llms-full.txt:
- llms.txt: 4314 bytes
- llms-full.txt: 578257 bytes

Test results:
- bun test test/build-llms.test.ts: 7 pass / 0 fail (committed bundles
  byte-match generator output)
KNOBS_HASH_VERSION bumped 4→5 per D8 (sequenced behind salem's pending
v=4 graph-signals work). Three test files held stale ==3 / ==4
assertions:
- test/search-mode.test.ts: assertion + comment updated to v=5.
- test/search/knobs-hash-reranker.test.ts: assertion + describe name
  updated to v=5 ladder.
- test/cross-modal-phase1.test.ts: assertion + name updated to v=5.

reindex.test.ts "skips pages already at current chunker_version" — the
v0.40.3.0 reindex predicate (`chunker_version < CURRENT OR
contextual_retrieval_mode IS NULL`) caught the should-skip page
because its CR mode was NULL. Fixed by seeding `contextual_retrieval_mode
= 'title'` on the should-skip row.

reindex.test.ts "idempotent: re-run on a fully-updated brain reports
nothing to do" — by design, `--no-embed` reindex bumps chunker_version
but skips CR-state stamping (import-file.ts:457-466 documents this).
Fixed by manually stamping `contextual_retrieval_mode = 'title'`
between the first and second reindex calls so the brain matches the
"fully updated" state the idempotency test name implies. Production
embed flow stamps both in one pass; the test uses --no-embed only to
avoid requiring API keys.

Test results:
- bun run verify (typecheck + 4 pre-checks): clean
- bun run test: 9482 pass / 0 fail / 0 skip across 410s
…master)

Master is at v0.40.2.0; v0.40.3.0 is genuinely the next free slot. The wave
was originally planned as v0.40.5.0 sequenced behind salem (PR #1300 = v0.40.4.0)
but the user is shipping THIS branch as v0.40.3.0 because:

1. v0.40.3.0 IS the canonical version slot for the contextual retrieval
   cathedral (matches branch name garrytan/v0.40.3.0-contextual-retrieval).
2. Master is at v0.40.2.0 — v0.40.3.0 is the immediate next slot, not a
   collision.
3. salem's v0.40.4.0 + any v0.40.5.0 work sit ON TOP of this in the landing
   train, not under it.

Mechanical rename only — no content changes from the v0.40.5.0 commit
sequence (T1-T11 wave is preserved verbatim, just relabeled):
- VERSION + package.json: 0.40.5.0 → 0.40.3.0
- bun.lock: refreshed (no dep changes)
- CHANGELOG.md: ## [0.40.5.0] header → ## [0.40.3.0] + body references
- skills/migrations/v0.40.5.0.md → skills/migrations/v0.40.3.0.md
  (previous v0.40.3.0.md file overwritten with the richer T9 content)
- CLAUDE.md: "Key commands added in v0.40.5.0" → "v0.40.3.0"
- 30 source + test files: comment references swept via sed s/0.40.5.0/0.40.3.0/g
- llms.txt + llms-full.txt: regenerated

Migration numbering UNCHANGED: v90 (renamed from original v81 because master
took v82-v88) and v91 (new trigger migration) stay at v90/v91 — the version
slot is orthogonal to the migration ledger collision.

KNOBS_HASH_VERSION = 5 stays — sequenced behind master's v=4 schema-pack
work; salem's v=4 graph-signals will rebump to v=5 if it lands first.

Test results after rename:
- bun run verify: clean (typecheck + 7 pre-checks)
- bun run test: 9482 pass / 0 fail / 0 skip

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Master shipped v0.40.2.0 (trajectory routing for temporal + knowledge_update
in gbrain think + LongMemEval) while this branch was off-master. Folded in
mechanically with these conflict resolutions:

- VERSION → 0.40.3.0 (kept; > master's 0.40.2.0, valid for CI version-gate)
- package.json → 0.40.3.0 (matched VERSION; bun install refreshed lockfile)
- src/core/migrate.ts → kept both sides: inserted master's v89
  (facts_event_type_column) BEFORE our v90 (contextual_retrieval_columns)
  + v91 (pages_generation_trigger_and_bookmark). Numeric order preserved.
- CHANGELOG.md → kept both entries: our v0.40.3.0 entry on top, master's
  v0.40.2.0 entry below, v0.40.1.0 and earlier untouched.

No content lost; no migration renumber needed (v90 + v91 were already chosen
in T2 to clear v89 — codex D7 plan-review predicted this exact landing
sequence and pinned the numbering).

Trio audit post-merge:
- VERSION:     0.40.3.0
- package.json: 0.40.3.0
- CHANGELOG header (topmost): ## [0.40.3.0] - 2026-05-22

Verification:
- bun run typecheck: clean
- bun test test/migrate.test.ts test/schema-bootstrap-coverage.test.ts:
  161 pass / 0 fail (migration v89 + v90 + v91 all apply cleanly in
  ascending order)
…ction (CI Tier 1)

CI Tier 1 (Mechanical) failed on real Postgres with:
  ERROR: CREATE INDEX CONCURRENTLY cannot run inside a transaction block
  STATEMENT: <v91 multi-statement SQL block including CREATE INDEX CONCURRENTLY ...>

Root cause: postgres.js's multi-statement `.unsafe()` wraps the entire block
in an implicit transaction. `transaction: false` on the migration entry
doesn't help — the implicit wrap happens at the driver layer, below the
migration runner. CONCURRENTLY refuses to run inside any transaction.

Fix: rewrite v91 using the v14 pages_updated_at_index handler pattern —
`sql: ''` + `handler:` function that splits the work into separate
`engine.runMigration()` calls:

1. Columns + trigger function + trigger (single multi-statement runMigration —
   ALTER/CREATE FUNCTION/CREATE TRIGGER are transaction-safe).
2. On Postgres only: pre-drop invalid index remnant via
   `pg_index.indisvalid` (matches v14 pattern for retry safety after a
   failed CONCURRENTLY left a half-built index with the target name).
3. CREATE INDEX CONCURRENTLY as a standalone runMigration call (separate
   statement = no implicit transaction wrap).
4. PGLite: plain CREATE INDEX (no CONCURRENTLY needed — single writer).
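
The four steps above amount to a per-engine statement plan in which each entry is sent through its own `engine.runMigration()` call. The sketch below builds that plan as data; the function name, the `IF NOT EXISTS` guards, and the exact invalid-index cleanup SQL are assumptions, since the v14 pattern's SQL is not shown in this PR.

```typescript
// Hypothetical sketch: plan the v91 statements so each one can be issued as a
// separate runMigration call (no multi-statement block, no implicit transaction).
type EngineKind = "postgres" | "pglite";

// Assumed cleanup: drop a half-built index left by a failed CONCURRENTLY run.
const DROP_INVALID_INDEX_SQL = [
  "DO $$ BEGIN",
  "  IF EXISTS (",
  "    SELECT 1 FROM pg_index i JOIN pg_class c ON c.oid = i.indexrelid",
  "    WHERE c.relname = 'pages_generation_idx' AND NOT i.indisvalid",
  "  ) THEN",
  "    DROP INDEX pages_generation_idx;",
  "  END IF;",
  "END $$;",
].join("\n");

function planV91Statements(
  kind: EngineKind,
  columnsAndTriggerSql: string
): string[] {
  // Step 1: columns + trigger function + trigger are transaction-safe.
  const steps = [columnsAndTriggerSql];
  if (kind === "postgres") {
    // Step 2: clear an invalid remnant so a retry can reuse the index name.
    steps.push(DROP_INVALID_INDEX_SQL);
    // Step 3: CONCURRENTLY must travel alone, outside any transaction.
    steps.push(
      "CREATE INDEX CONCURRENTLY IF NOT EXISTS pages_generation_idx ON pages (generation)"
    );
  } else {
    // Step 4: PGLite has a single writer, so a plain CREATE INDEX is enough.
    steps.push("CREATE INDEX IF NOT EXISTS pages_generation_idx ON pages (generation)");
  }
  return steps;
}
```

A handler would then loop over the returned array and await one `runMigration` call per entry, which is what keeps the CONCURRENTLY statement out of the driver's implicit transaction.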

Verified against real Postgres (pgvector:pg16):
- schema_version=91 after init
- pages_generation_idx exists with btree shape
- bump_page_generation_trg installed
- test/e2e/postgres-bootstrap.test.ts + test/e2e/schema-drift.test.ts:
  8 pass / 0 fail
- bun test test/migrate.test.ts test/schema-bootstrap-coverage.test.ts:
  161 pass / 0 fail
- bun run typecheck: clean
@garrytan garrytan merged commit 43608c1 into master May 23, 2026
12 of 13 checks passed
mgunnin added a commit to mgunnin/gbrain that referenced this pull request May 28, 2026
* upstream/master: (22 commits)
  v0.41.4.0 wave: local providers + cross-platform stdin + gateway-routed dream judge (6 community PRs) (garrytan#1377)
  v0.41.3.0 fix(security/mcp): OAuth CORS lockdown + pre-register without DCR + validator surface (garrytan#1403)
  v0.41.2.0 feat: lens packs + epistemology unification — atoms + concepts as first-class units, calibration profile widening, gstack-learnings bridge (garrytan#1364)
  v0.41.1.0 feat: eval-loop wave — gbrain bench publish + gbrain eval gate close the LOOP (garrytan#1352)
  v0.41.0.0 feat(minions): fleet you supervise (4 field bugs + cathedral) (garrytan#1367)
  v0.40.10.0 feat: content sanity defense — junk-pattern throw + oversize-skip-embed (garrytan#1351)
  v0.40.9.0 feat(chunker): .sql indexing via tree-sitter + code-def on SQL DDL (garrytan#1173) (garrytan#1350)
  v0.40.8.1 docs: README rewrite + personal-brain + company-brain tutorials (garrytan#1345)
  v0.40.8.0 test: e2e + unit gap coverage + master flake root-cause fixes (garrytan#1313)
  v0.40.6.1 docs(todos): file v0.41 wave commitments + 7 verified-missing items (garrytan#1333)
  v0.40.7.0 Schema Cathedral v3 — agent-on-ramp + production rebuild of PR garrytan#1321 (garrytan#1327)
  v0.40.6.0 feat(sync): parallel sync --all + per-source lock invariant + sources status dashboard (productionized from PR garrytan#1314) (garrytan#1324)
  v0.40.5.0 Federated Sync v2 — parallel source sync + push triggers + per-source health (garrytan#1322)
  v0.40.4.0 feat(search): selective graph signals + per-stage attribution + audit-writer unification (garrytan#1300)
  v0.40.3.0 feat: contextual retrieval + cache invalidation gate + 4 deferred-item closures (garrytan#1323)
  v0.40.2.0 feat: trajectory routing for temporal + knowledge_update (gbrain think + LongMemEval) (garrytan#1296)
  v0.40.1.0 Track D — eval infrastructure (catch retrieval regressions, prove answer-quality wins) (garrytan#1298)
  v0.40.0.0 feat: agent-voice (Mars + Venus) + copy-into-host-repo skillpack paradigm (garrytan#1128)
  v0.39.3.0: productionize the v0.38 ingestion cathedral (smoke-test fix wave from PR garrytan#1299) (garrytan#1308)
  v0.39.2.0 feat(autopilot): per-source fan-out + cycle lock primitive + phase taxonomy (garrytan#1295)
  ...