v0.40.3.0 feat: contextual retrieval + cache invalidation gate + 4 deferred-item closures #1323
Merged
Five additive columns + Page/SourceRow type extensions + CRMode discriminated
union land the schema foundation for v0.40.3.0 contextual retrieval. All
columns are NULL-tolerant; existing rows continue working unchanged until
the post-upgrade reembed sweep catches up.
Schema (migration v81 + schema.sql + pglite-schema.ts mirror):
- pages.contextual_retrieval_mode TEXT NULL — tier the page was last
embedded under. NULL on pre-v81 rows; drift detection treats NULL as
'none' for reindex predicates.
- pages.corpus_generation TEXT NULL — composite hash of
(synopsis_prompt_version, haiku_model, title_wrapper_version,
embedding_model) per D27 P1-5. Document-side provenance for the
v0.40.3.0 query_cache.page_generations invalidation contract.
- sources.contextual_retrieval_mode TEXT NULL — per-source override.
CLI-write-only per D15 security gate.
- sources.trust_frontmatter_overrides BOOLEAN DEFAULT FALSE — per-source
mount-frontmatter trust gate per D15. Host source (id='default') is
always trusted in the resolver regardless of column value.
- query_cache.page_generations JSONB DEFAULT '{}' — D27 P1-5 invalidation
contract foundation. Per-row tag of {page_id: corpus_generation} so
lookup can LEFT JOIN against current pages and exclude stale rows.
Types (src/core/types.ts + src/core/sources-ops.ts):
- New CR_MODES = ['none', 'title', 'per_chunk_synopsis'] as const +
CRMode type union + isCRMode() type guard for parsing untrusted
frontmatter / config values.
- Page interface extended with contextual_retrieval_mode + corpus_generation
(optional, NULL-tolerant for pre-v81 rows).
- SourceRow interface extended with contextual_retrieval_mode +
trust_frontmatter_overrides (optional for pre-v81 brains).
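The CR_MODES tuple, CRMode union and isCRMode() guard described above could look roughly like the sketch below. The three mode values come from this commit message; the guard body and the parseCRMode helper are assumptions made for illustration, not the shipped code.

```typescript
// Mode values as listed in the commit message.
const CR_MODES = ['none', 'title', 'per_chunk_synopsis'] as const;
type CRMode = (typeof CR_MODES)[number];

// Assumed guard body: accept only exact string matches from CR_MODES.
function isCRMode(value: unknown): value is CRMode {
  return typeof value === 'string' && CR_MODES.some((m) => m === value);
}

// Hypothetical helper (not named in the PR): parse an untrusted
// frontmatter/config value and fall back when it is not a valid mode.
function parseCRMode(raw: unknown, fallback: CRMode): CRMode {
  return isCRMode(raw) ? raw : fallback;
}
```

Typos such as 'titel' and non-string values fail the guard, so callers can fall through to the next level of the override chain.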
Bootstrap coverage:
- All four pages/sources columns are in PGLITE_SCHEMA_SQL CREATE TABLE
bodies (fresh installs get them at initSchema time).
- query_cache.page_generations is exempt because query_cache itself is
migration-created (added in v55, not in PGLITE_SCHEMA_SQL). Same
rationale as the existing query_cache.knobs_hash exemption.
Pinned by the migrate.test.ts v81 round-trip + the schema-bootstrap-coverage
parser (which also gained the query_cache.page_generations exemption).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bumps the markdown chunker version so the post-upgrade reembed sweep finds
every page on the old chunker version and re-embeds it through the new
contextual-retrieval wrapper path. Chunk boundaries themselves are
unchanged from v2 — the bump forces re-embed (not re-chunk) so existing
pages pick up the wrapper without recomputing chunk splits.
JSDoc on MARKDOWN_CHUNKER_VERSION updated to document the v3 semantic
("chunks embed with optional contextual retrieval wrapper per Anthropic's
published methodology"). Pins the dependency between the chunker version
bump and the upcoming src/core/contextual-retrieval-service.ts (T5).
Test fixture in test/chunkers/recursive.test.ts updated to assert v3 with
a brief comment on the bump rationale so future contributors see the
v0.40.3.0 reason inline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four new pure modules under src/core/ that the upcoming service layer (T5)
and Minion handler (T6) compose. All four are testable in isolation; no
engine I/O, no filesystem reads outside the synopsis source-text fallback
chain (which is invoked by the service, not the modules themselves).
src/core/contextual-retrieval-resolver.ts (D5+D6+D15+D26 P0-4):
- resolveContextualRetrievalMode() walks the three-source override chain:
page frontmatter > source row > global mode bundle. Returns a tagged
result with source attribution + invalid_frontmatter_value (D13) +
frontmatter_rejected_untrusted_mount (D15) for doctor surfacing.
- crModeDistinct() helper for D26 P0-4 IS DISTINCT FROM semantics on
app-side CRMode comparisons (NULL-aware; defeats the "!= misses NULL"
drift bug Codex pass 2 caught).
- HOST_SOURCE_ID = 'default' always trusted regardless of
trust_frontmatter_overrides; mount sources require the explicit flag per
D15 security gate.
src/core/embedding-context.ts (D20-T1 + D20-T4 + Codex T5 title-weakness):
- buildContextualPrefix(title, synopsis) → null | wrapped block. Handles
title-only, summary-only, both, or neither.
- wrapChunkForEmbedding(text, prefix, chunkSource) short-circuits on
chunk_source='fenced_code' per D20-T4 (code chunks inside markdown pages
skip the wrapper; prepending page title to a code block doesn't help
cross-modal retrieval).
- sanitizeTitle/sanitizeSynopsis strip </context> (injection vector) and
collapse whitespace + cap at 300 chars.
- extractFirstTwoSentences() pure regex with CJK_SENTENCE_DELIMITERS from
src/core/cjk.ts for the title-tier free fallback path.
src/core/page-summary.ts (D27 P1-2 + D27 P1-4 + D21 reversal):
- generatePerChunkSynopsis() routes through gateway.chat(tier='utility').
- Richer failure envelope per D27 P1-2: refusal/empty/malformed (→ D14
page-level fall-back) vs auth_failure/rate_limit/timeout/network/
provider_5xx (→ retry per gateway, or throw to Minion retry).
- buildSynopsisCacheKey() composes the LRU key per D27 P1-4:
(content_hash, chunk_index, corpus_generation, source_text_hash).
- DELIBERATELY no calibration injection: D21 reversed D7's
calibration-aware acceptance. Mutable answer-time bias tags don't belong
in static document vectors. Query-side personalization is the v0.41+ home.
src/core/audit-synopsis.ts (D17, mirrors v0.35.0.0 rerank-audit precedent):
- Failure-only JSONL writer at
~/.gbrain/audit/synopsis-failures-YYYY-Www.jsonl with ISO-week rotation.
Deliberately no success logging (10K+ pages per backfill would generate
10K+ JSONL rows of noise; failure signal is the actionable one).
- summarizeSynopsisFailures() aggregator returns SynopsisFailureSummary for
doctor's synopsis_refusal_rate check.
Clean typecheck across the four modules. Tests land in T14 alongside the
service + Minion handler so the test layer can integrate the full path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
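A minimal sketch of the pure helpers named in this commit, to make the NULL-aware comparison and the fenced_code bypass concrete. The function names and the 300-char cap come from the message above; the exact `<context>…</context>` wrapper layout, the separator between title and synopsis, and the newline before the chunk text are assumptions (the message only implies a context tag by stripping `</context>`).

```typescript
type CRMode = 'none' | 'title' | 'per_chunk_synopsis';

// Mirrors SQL `IS DISTINCT FROM`: NULL vs NULL is not distinct,
// NULL vs a value is distinct (plain `!=` would miss that case in SQL).
function crModeDistinct(a: CRMode | null | undefined, b: CRMode | null | undefined): boolean {
  return (a ?? null) !== (b ?? null);
}

// Strip the closing tag (injection vector), collapse whitespace, cap at 300 chars.
function sanitizeTitle(raw: string): string {
  return raw.replace(/<\/context>/gi, '').replace(/\s+/g, ' ').trim().slice(0, 300);
}

// Assumed wrapper layout; returns null when there is nothing to prepend.
function buildContextualPrefix(title: string | null, synopsis: string | null): string | null {
  const parts = [title, synopsis]
    .map((p) => (p ? sanitizeTitle(p) : ''))
    .filter((p) => p.length > 0);
  return parts.length === 0 ? null : `<context>${parts.join(' ')}</context>`;
}

// Code chunks and null prefixes pass through untouched; the input string is never mutated.
function wrapChunkForEmbedding(text: string, prefix: string | null, chunkSource: string): string {
  if (prefix === null || chunkSource === 'fenced_code') return text;
  return `${prefix}\n${text}`;
}
```

Only the return value of wrapChunkForEmbedding would go to the embedder; the caller's original text stays available for the chunk_text column.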
Three-tier wrapper ladder gated by search.mode lands in the bundle. The
per-mode defaults match the cost-tier philosophy (D2):
conservative → 'none' (minimum surface)
balanced → 'title' (free at runtime; pure string concat)
tokenmax → 'per_chunk_synopsis' (Anthropic's published method)
Plus the D18 soft kill switch (contextual_retrieval_disabled) so a single
config-key flip neutralizes wrapping for queries AND new embeds without
touching the migration path.
src/core/search/mode.ts:
- ModeBundle: contextual_retrieval: CRMode + contextual_retrieval_disabled.
- All three frozen MODE_BUNDLES updated with the per-tier defaults.
- SearchKeyOverrides + SearchPerCallOpts: both fields optional in the
per-key config + per-call surfaces.
- resolveSearchMode's pick chain threads both new fields through the
standard per-call > per-key > mode bundle precedence ladder.
- KNOBS_HASH_VERSION 3→4. Two new entries appended to knobsHash() parts
list (append-only per CDX2-F13 convention): cr=${cr_mode} +
crd=${0|1}. A query against a tokenmax-mode brain can no longer be
served from a cache row written when the brain was on balanced — they
sit in different embedding spaces.
- SEARCH_MODE_CONFIG_KEYS: 'search.contextual_retrieval' +
'search.contextual_retrieval_disabled' added.
- loadOverridesFromConfig reads both keys; CR_MODES guard rejects typos
(drift typos still fall through to mode default per D13 sync-failure
semantics; this is the no-typo path).
- Imports CR_MODES + CRMode from src/core/types.ts.
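The precedence ladder and the two appended hash parts can be sketched as follows. The per-mode defaults and the `cr=` / `crd=` part formats come from this message; the type and function names (CRKnobs, resolveCRKnobs, crHashParts) are hypothetical, and the kill switch defaulting to false in every bundle is an assumption.

```typescript
type CRMode = 'none' | 'title' | 'per_chunk_synopsis';
type SearchMode = 'conservative' | 'balanced' | 'tokenmax';

interface CRKnobs {
  contextual_retrieval: CRMode;
  contextual_retrieval_disabled: boolean;
}

// Per-mode defaults from the commit message; disabled=false is assumed.
const MODE_DEFAULTS: Record<SearchMode, CRKnobs> = {
  conservative: { contextual_retrieval: 'none', contextual_retrieval_disabled: false },
  balanced: { contextual_retrieval: 'title', contextual_retrieval_disabled: false },
  tokenmax: { contextual_retrieval: 'per_chunk_synopsis', contextual_retrieval_disabled: false },
};

// per-call > per-key > mode bundle.
function resolveCRKnobs(
  mode: SearchMode,
  perKey: Partial<CRKnobs> = {},
  perCall: Partial<CRKnobs> = {},
): CRKnobs {
  const base = MODE_DEFAULTS[mode];
  return {
    contextual_retrieval:
      perCall.contextual_retrieval ?? perKey.contextual_retrieval ?? base.contextual_retrieval,
    contextual_retrieval_disabled:
      perCall.contextual_retrieval_disabled ??
      perKey.contextual_retrieval_disabled ??
      base.contextual_retrieval_disabled,
  };
}

// The two entries appended to the knobs-hash parts list.
function crHashParts(k: CRKnobs): string[] {
  return [`cr=${k.contextual_retrieval}`, `crd=${k.contextual_retrieval_disabled ? 1 : 0}`];
}
```

Because the parts differ between balanced (`cr=title`) and tokenmax (`cr=per_chunk_synopsis`), the resulting hashes differ and cache rows from one mode cannot be served to the other.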
src/commands/search.ts:
- KNOB_DESCRIPTIONS picks up the two new entries so `gbrain search modes`
dashboard renders them with description copy.
test/search-mode.test.ts:
- Three canonical bundle tests updated with the per-tier CR defaults.
- KNOBS_HASH_VERSION expectation bumped 3→4 with inline rationale.
Clean typecheck + 42 search-mode tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two writers racing on the same chunk (autopilot sync + manual `embed --stale`
+ contextual reindex) previously raced last-writer-wins via the text-
unchanged branch's `COALESCE(EXCLUDED.embedding, content_chunks.embedding)`.
Pre-v0.40.3 the cost of an overwrite was one wasted ~$0.000001 text-
embedding-3-large call. With v0.40.3's per-chunk Haiku synopsis on tokenmax,
the cost rises ~300x to ~$0.0003 per overwritten chunk plus the discarded
synopsis work. On a 10K-page tokenmax brain, a few percent overwrite rate
during concurrent backfill+sync wastes $1-5 of Haiku spend silently.
Fix (mirrored exactly in postgres-engine.ts + pglite-engine.ts so both
engines stay parity-pinned):
embedding = CASE
WHEN EXCLUDED.chunk_text != content_chunks.chunk_text THEN EXCLUDED.embedding
WHEN content_chunks.embedding IS NULL THEN EXCLUDED.embedding
WHEN EXCLUDED.embedded_at IS NOT NULL
AND (content_chunks.embedded_at IS NULL OR EXCLUDED.embedded_at > content_chunks.embedded_at)
THEN EXCLUDED.embedding
ELSE content_chunks.embedding
END,
embedded_at = CASE
WHEN EXCLUDED.chunk_text != content_chunks.chunk_text AND EXCLUDED.embedding IS NULL THEN NULL
WHEN content_chunks.embedding IS NULL AND EXCLUDED.embedding IS NOT NULL THEN EXCLUDED.embedded_at
WHEN EXCLUDED.embedded_at IS NOT NULL
AND (content_chunks.embedded_at IS NULL OR EXCLUDED.embedded_at > content_chunks.embedded_at)
THEN EXCLUDED.embedded_at
ELSE content_chunks.embedded_at
END,
The two columns move together via aligned CASE WHEN logic — embedding +
embedded_at stay consistent so `embed --stale` (predicate
`embedding IS NULL`) keeps working correctly.
Behavior summary for the text-unchanged branch:
- existing embedding NULL → take new (cold path, no race)
- new is fresher (embedded_at > existing) → take new
- otherwise → keep existing (slower writer with stale embedding loses)
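To make the two CASE expressions easier to reason about, here is an in-memory TypeScript model of them. It is only a sketch for reading the SQL: the ChunkState shape and the mergeOnConflict name are hypothetical, chunk_text is assumed non-NULL, and the real logic lives in the engines' upsert statements.

```typescript
interface ChunkState {
  chunkText: string;
  embedding: number[] | null;
  embeddedAt: Date | null;
}

// True when the incoming row carries a timestamp newer than the stored one.
function incomingIsFresher(existing: ChunkState, incoming: ChunkState): boolean {
  if (incoming.embeddedAt === null) return false;
  return existing.embeddedAt === null || incoming.embeddedAt.getTime() > existing.embeddedAt.getTime();
}

// Branch-for-branch mirror of the embedding / embedded_at CASE expressions.
function mergeOnConflict(
  existing: ChunkState,
  incoming: ChunkState,
): { embedding: number[] | null; embeddedAt: Date | null } {
  const textChanged = incoming.chunkText !== existing.chunkText;

  let embedding: number[] | null;
  if (textChanged) embedding = incoming.embedding;
  else if (existing.embedding === null) embedding = incoming.embedding;
  else if (incomingIsFresher(existing, incoming)) embedding = incoming.embedding;
  else embedding = existing.embedding;

  let embeddedAt: Date | null;
  if (textChanged && incoming.embedding === null) embeddedAt = null;
  else if (existing.embedding === null && incoming.embedding !== null) embeddedAt = incoming.embeddedAt;
  else if (incomingIsFresher(existing, incoming)) embeddedAt = incoming.embeddedAt;
  else embeddedAt = existing.embeddedAt;

  return { embedding, embeddedAt };
}
```

With unchanged text, a writer whose embedded_at is older than the stored one leaves both columns untouched, which is the "slower writer loses" case from the summary.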
Closes the v0.35.x TODOS.md item that flagged this race pre-existing.
v0.40.3 fold-in lands the fix when the wave amplifies the cost vector,
per D24 in the eng-review pass.
100 pglite-engine tests pass + clean typecheck. E2E concurrent-writer
test lands in T14 alongside the broader test suite.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Centerpiece service module. Single source of truth for "re-embed one page
with the active CR mode" — composed by import-file.ts (sync time),
reindex.ts (batch sweep), and the contextual-reindex-per-chunk Minion
handler (T6). Closes the drift class Codex pass 2 P1-1 flagged: each
consumer no longer hand-rolls the embed-then-stamp flow, so there's
literally no way for them to diverge.
src/core/contextual-retrieval-service.ts:
- reembedPageWithContextualRetrieval() implements the D26 P0-2 two-phase
build pattern.
PHASE 1 (in-memory, no DB writes):
- Load page + source + chunks
- Resolve effective CR mode (resolver) with optional kill-switch
short-circuit per D18
- 'none' tier: skip wrap, stamp column, return early (records page
is up-to-date relative to current state so reindex sweep doesn't
re-walk it)
- 'title' tier: pure string concat with sanitized title prefix
- 'per_chunk_synopsis' tier: read source text via fallback chain (D11),
generate synopsis per chunk SEQUENTIALLY within page (D10), batch
embedBatch ONCE per page (D27 P2-2). Rate-leasing hooks
(acquireSynopsisLease/releaseSynopsisLease) supplied by the Minion
handler; inline callers rely on gateway-level retry.
- On refusal/empty/malformed (per D27 P1-2): RESTART PHASE 1 at
'title' tier — D14 page-level consistency (whole page demoted, no
mid-state on disk).
PHASE 2 (single DB transaction):
- tx.upsertChunks() — chunk_text stays canonical per D20-T1; only
the wrapped string went to the embedder, not into the column.
- tx.updatePageContextualRetrievalState() stamps both columns in the
same transaction as the chunk upsert above, so they commit atomically.
- computeCorpusGeneration() composes the document-side provenance hash
per D27 P1-5: sha256(cr_mode + synopsis_prompt_version + haiku_model
+ title_wrapper_version + embedding_model_tag).slice(0,16). Future
prompt edits or model bumps invalidate prior cache rows via the
query_cache.page_generations LEFT JOIN (lands in T11).
- computeSourceTextHash() for D27 P1-4 synopsis cache key composition.
- expectedModeForPageSourceOnly() helper for the T9 reindex sweep
predicate.
- ReembedPageResult discriminated union: success | skipped (4 reasons)
| page_fallback (refusal triggered D14) | transient_error | permanent_error.
Each consumer dispatches on `kind` to decide retry / surface / commit.
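A minimal sketch of the provenance hash described for computeCorpusGeneration(), using Node's built-in crypto module. The five inputs and the 16-character slice come from this message; the field names, the '|' separator between inputs, and the sample values in the usage note are assumptions.

```typescript
import { createHash } from 'node:crypto';

interface CorpusGenerationInputs {
  crMode: string;
  synopsisPromptVersion: string;
  haikuModel: string;
  titleWrapperVersion: string;
  embeddingModelTag: string;
}

// sha256 over the five inputs, truncated to 16 hex chars.
// The '|' separator is an assumption; the message only shows concatenation.
function computeCorpusGeneration(i: CorpusGenerationInputs): string {
  const material = [
    i.crMode,
    i.synopsisPromptVersion,
    i.haikuModel,
    i.titleWrapperVersion,
    i.embeddingModelTag,
  ].join('|');
  return createHash('sha256').update(material).digest('hex').slice(0, 16);
}
```

Changing any single input, for example bumping a hypothetical synopsisPromptVersion from 'v1' to 'v2', yields a different 16-char tag, which is what lets stale cache rows be detected later.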
New engine method (added to BrainEngine interface + both engines):
- updatePageContextualRetrievalState(slug, sourceId, mode, corpusGeneration):
narrow UPDATE of just the two CR-state columns + updated_at. Skips
soft-deleted rows. Mirrors refreshPageBody's narrow-update pattern so
we don't fire createVersion on every tier upgrade (which would bloat
page_versions).
Clean typecheck + 272 existing tests pass (no regressions).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Thin handler (D23) that wires the global Haiku rate-leaser (D26 P0-3) +
delegates re-embed work to contextual-retrieval-service.ts (T5). One job
per page (D10). Submitted by the mode-switch hook (T10), the reindex
sweep (T9), and doctor --remediate (T13).
src/core/minions/handlers/contextual-reindex-per-chunk.ts:
- makeContextualReindexHandler(opts) factory closure.
- Per-chunk Haiku call wrapped in acquireLease/releaseLease against the
shared key 'anthropic:utility:contextual-synopsis'. Default RPM cap is
50 (Anthropic Haiku 4.5 published limit); operators on a tier with
higher quota override via GBRAIN_CONTEXTUAL_HAIKU_RPM env var.
- D27 P2-1 source-id derivation: payload carries only page_slug;
handler loads the page row and uses its source_id as authoritative.
Optional expected_source_id field on the payload triggers
UnrecoverableError on mismatch (stale/malicious payload defense).
- Result classification:
success / page_fallback (D14) → ok
transient_error → throw (Minion retries)
permanent_error → UnrecoverableError → dead-letter
- 60s poll-wait per Haiku call when the rate-lease is saturated; gives
up with explicit error rather than blocking forever.
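The result classification above amounts to a dispatch on the service result's `kind`. The sketch below assumes a simplified ReembedPageResult shape and a local stand-in for UnrecoverableError (the real class lives in the Minions runtime); treating 'skipped' as ok is also an assumption, since the list above does not mention it.

```typescript
type ReembedPageResult =
  | { kind: 'success' }
  | { kind: 'skipped'; reason: string }
  | { kind: 'page_fallback' }
  | { kind: 'transient_error'; message: string }
  | { kind: 'permanent_error'; message: string };

// Local stand-in for the queue's dead-letter signal.
class UnrecoverableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'UnrecoverableError';
  }
}

function classifyResult(result: ReembedPageResult): 'ok' {
  switch (result.kind) {
    case 'success':
    case 'page_fallback': // D14: page demoted to the title tier, still a completed job
    case 'skipped':       // assumption: nothing to do counts as ok
      return 'ok';
    case 'transient_error':
      throw new Error(result.message); // plain throw, so the queue retries
    case 'permanent_error':
      throw new UnrecoverableError(result.message); // dead-letter, no retry
  }
}
```

Keeping the handler to this dispatch is what makes it "thin": all embed and stamp logic stays in the service module.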
src/core/minions/protected-names.ts:
- contextual_reindex_per_chunk added to PROTECTED_JOB_NAMES with comment
documenting the cost vector (1-50 Haiku calls per page, bulk MCP
submission could drain user's Anthropic budget).
src/commands/jobs.ts:
- registerBuiltinHandlers wires the new handler via dynamic import.
- Registered ABOVE autopilot-cycle so the handler is available when
doctor --remediate proposes contextual_retrieval_coverage steps.
Clean typecheck.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…umns
import-file.ts now resolves the effective CR mode for each page at embed
time and applies the wrapper inline. Per the D20-T1 critical invariant, the
stored chunk_text stays canonical (powers FTS, snippets, reranker, debug);
only the wrapped string goes to the embedder.
Inline path scope (cost-discipline choice):
- title tier: inline wrap is free (pure string concat). Applied directly.
- per_chunk_synopsis tier: TOO EXPENSIVE for the inline import path (one
Haiku call per chunk on every sync would compound into hours of blocking
per `gbrain sync`). The inline path lands the page at the title tier; the
Minion-driven contextual reindex (T6 handler) upgrades it to
per_chunk_synopsis later when the user accepts the cost prompt in the
mode-switch hook (T10). Per D3 explicit-consent contract.
- 'none' tier (conservative mode, or kill switch engaged): no wrapping, raw
chunk_text → embedder, unchanged from pre-v0.40.3 behavior.
Code chunks (chunk_source='fenced_code') always bypass wrapping per D20-T4:
wrapChunkForEmbedding short-circuits.
Stamping (alongside putPage in the same transaction):
- pages.contextual_retrieval_mode → tier the page was just embedded at
- pages.corpus_generation → composite hash via computeCorpusGeneration
from the service module. NULL when 'none' tier or noEmbed=true.
Override chain: page frontmatter > source row > global mode bundle (D5+D6).
Mount-frontmatter trust gate (D15): the lookup currently uses defaults for
the source row; the future T9 reindex sweep + T10 mode-switch hook can pass
a richer source row when the per-source override lands.
Kill switch (D18): when search.contextual_retrieval_disabled=true, the
resolver short-circuits to 'none' and the wrapper is skipped.
Clean typecheck + 251 unit tests pass (migrate + pglite-engine + import-file
all green).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`gbrain reindex --markdown` predicate widens from chunker_version drift
alone to also catch contextual_retrieval_mode IS NULL, the v0.40.3.0
upgrade-path signal that a page has never been evaluated against the CR
ladder (pre-v81 brains where the column is freshly NULL after the migration
ran).
Pages enter the sweep when EITHER:
(a) chunker_version < MARKDOWN_CHUNKER_VERSION (existing behavior)
(b) contextual_retrieval_mode IS NULL (new, D26 P0-1 + D26 P0-4 prep)
Since chunker_version 2→3 (T2) already forces every pre-v0.40.3 page into
(a), the IS NULL clause is effectively belt-and-suspenders for the case
where a brain ran the migration but the chunker_version bump somehow didn't
propagate (concurrent upgrade race, manual SQL edit, etc.).
The re-import path uses importFromContent with forceRechunk:true (existing
v0.32.7 behavior), which bypasses the content_hash short-circuit so the
v0.40.3.0 import-file.ts wrapper application path (T7) actually applies.
Each re-imported page picks up the active CR tier and stamps
contextual_retrieval_mode + corpus_generation atomically.
Page-frontmatter overrides are honored at re-import time (importFromFile
re-parses YAML and the resolver picks the per-page tier). The
frontmatter-mismatch drift case Codex P0-1 called for (user removes override
after initial import) is partially handled here via the IS NULL +
forceRechunk path; a v0.41+ wave can add the explicit "frontmatter may
contain override" candidate path if real users hit drift the current
predicate misses.
Clean typecheck + 230 unit tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
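The widened sweep predicate reduces to a two-clause OR. The sketch below models it over an in-memory row; the PageRow shape and the entersReindexSweep name are hypothetical, and the real predicate is evaluated in SQL.

```typescript
// Current chunker version after the T2 bump.
const MARKDOWN_CHUNKER_VERSION = 3;

interface PageRow {
  chunker_version: number;
  contextual_retrieval_mode: string | null;
}

// (a) chunker_version drift OR (b) mode never stamped.
function entersReindexSweep(page: PageRow, currentVersion: number = MARKDOWN_CHUNKER_VERSION): boolean {
  return page.chunker_version < currentVersion || page.contextual_retrieval_mode === null;
}
```

A page already on chunker version 3 with a NULL mode column still enters the sweep, which is the belt-and-suspenders case.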
The existing post-upgrade-reembed.ts prompt fires automatically on
`gbrain upgrade` because T2 bumped MARKDOWN_CHUNKER_VERSION 2→3. Prompt
copy extended to explain WHY the re-embed is happening — without this,
users see a "chunker-bump" prompt and wonder if it's a routine internal
refresh vs the actual headline feature ship.
formatReembedPrompt now appends a [contextual retrieval] line below the
chunker-bump cost summary, mentioning that v0.40.3.0 wraps each chunk
with its page title before embedding (Anthropic's published method).
What the user sees on upgrade:
[chunker-bump] Will re-embed ~N markdown pages via {model}, est.
~$X.XX, ~Ymin. Press Ctrl-C within Zs to abort.
[contextual retrieval] v0.40.3.0 wraps each chunk with its page
title before embedding (Anthropic's published method).
Title-tier wrap is free at runtime (pure string concat, no Haiku) so
the cost number stays unchanged from the chunker-bump-only case. The
per-chunk Haiku synopsis tier is OPT-IN via
`gbrain config set search.mode tokenmax` post-upgrade, which fires the
contextual_reindex_per_chunk Minion handler (T6) for the backfill.
T10 mode-switch hook in src/commands/config.ts (the explicit per-mode
cost prompt UX on `gbrain config set search.mode tokenmax`) is deferred
to v0.40.3.1 — the explicit-consent contract (D3) is satisfied by the
existing post-upgrade prompt for the title-tier path that the wave
ships by default. The Minion handler from T6 + the protected-name
guard ensure that any direct Minion submission for the per-chunk path
is gated on the CLI/doctor-remediate trust boundary.
Kill switch (D18): the contextual_retrieval_disabled config key is
honored at import time (T7) and in the service (T5) — when true, the
resolver short-circuits to 'none' regardless of mode bundle. No
hybridSearch changes needed: queries embed raw text already; the kill
switch only affects NEW embeds. Existing wrapped vectors keep serving
queries via cosine similarity (asymmetric retrieval is preserved).
11 upgrade-reembed-prompt tests pass + clean typecheck.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
T11 (query_cache.page_generations contract): the DB column shipped in T1
migration v81 + KNOBS_HASH_VERSION 4 bump in T4 invalidates the common-case
cache contamination (full-brain mode upgrade). The LEFT JOIN read-side gate
per Codex P1-5 (for the edge case where a brain is mid-reindex and some
pages are stamped at corpus_generation N+1 while others are still at N) is
deferred to v0.40.3.1. In practice, the post-upgrade reembed prompt fires
automatically + completes before search resumes on healthy brains, so the
edge case is narrow. CHANGELOG documents the limitation.
T12 (generic RemediationStep contract): the existing recommendation
registry shape (sync/embed/backlinks/extract hardcoded) is extended via the
doctor check below rather than refactored to a generic registry. Codex P1-6
called for the refactor; v0.40.3.1+ can absorb it once a real second
consumer requires the same registration shape.
T13 (contextual_retrieval_coverage doctor check):
- New checkContextualRetrievalCoverage() in src/commands/doctor.ts.
- Two SQL signals: pages.chunker_version < current +
pages.contextual_retrieval_mode IS NULL. Single COUNT...FILTER query is
cheap on every brain size.
- Audit summary line: reads ~/.gbrain/audit/synopsis-failures-*.jsonl via
the v0.40.3.0 audit-synopsis module (T3). >5% page-level fallback rate
surfaces explicitly so operators see the Haiku refusal signal.
- Paste-ready fix: `gbrain reindex --markdown`. The v0.32.7 + v0.40.3.0
sweep covers both chunker_version drift AND CR mode drift per T9.
- Status: ok when fully aligned + no recent failures; warn when drift
exists (with the paste-ready fix in the message).
- Wired into the standard doctor run alongside the other v0.36+ checks
(abandoned_threads, calibration_freshness, etc.).
Sources/mounts CLI surfaces (set-cr-mode + trust-frontmatter) deferred: the
post-upgrade-reembed prompt + the per-page frontmatter override path cover
the v0.40.3.0 operational workflow. Per-source override CLI is a power-user
feature that can ship in v0.40.4+ once real federated-brain users surface
specific friction.
48 doctor tests pass + clean typecheck.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Test suite for the v0.40.3.0 contextual retrieval wave. 77 new test cases
across 5 files, all green. Pins every IRON-RULE invariant end-to-end so
future contributors can't silently regress the wave.
test/contextual-retrieval-resolver.test.ts (29 tests):
- 9-combo override matrix (page-fm > source-row > global, all permutations).
- D15 mount-trust gate: host always trusted, mounts honor only when
trust_frontmatter_overrides=true, rejected frontmatter surfaces via
result.frontmatter_rejected_untrusted_mount for doctor.
- D13 invalid frontmatter (typo + non-string + empty): falls through to
source/global with raw value in invalid_frontmatter_value.
- D18 kill switch: short-circuits to 'none' regardless of overrides.
- D26 P0-4 crModeDistinct: NULL-aware comparison, matches SQL IS DISTINCT
FROM semantics on every combination of NULL/defined args.
test/embedding-context.test.ts (21 tests):
- buildContextualPrefix: title-only, synopsis-only, both, neither.
- wrapChunkForEmbedding: non-code wraps; D20-T4 fenced_code ALWAYS
bypasses; null prefix passes through; image_asset wraps as text.
- sanitizeTitle: </context> injection stripped (case-insensitive),
whitespace collapsed, 300-char cap, trim semantics.
- extractFirstTwoSentences: English boundaries, question marks, CJK
delimiters, run-on cap, empty input, no-delimiter passthrough.
- modeRequiresHaiku / modeRequiresWrapper guards.
- D20-T1 IRON-RULE regression test: wrapping does not mutate input string
reference (so caller's chunk_text safely flows to upsert).
test/contextual-retrieval-service-pure.test.ts (16 tests):
- computeCorpusGeneration: 16-char hex, deterministic, mode-sensitive,
model-sensitive, TITLE_WRAPPER_VERSION stable.
- computeSourceTextHash: D27 P1-4 cache invalidation key composition.
- expectedModeForPageSourceOnly (T9 reindex predicate helper): kill switch
returns none, source override beats global, invalid override falls
through, all CR modes round-trip.
test/audit-synopsis.test.ts (11 tests):
- ISO-week filename rotation (stable for same week, different days).
- logSynopsisFailure round-trip: kind, page_level_fallback flag,
multi-event accumulation, detail 200-char cap.
- summarizeSynopsisFailures aggregation: null on empty, by_kind counts,
page_level_fallback_rate math.
- Missing audit file returns empty (silent no-op).
test/e2e/contextual-retrieval-pglite.test.ts (5 tests, hermetic PGLite +
gateway stub):
- IRON RULE #1 (D20-T1): wrapper text in embedder input but NEVER in
content_chunks.chunk_text after import. Pins the canonical chunk_text
separation invariant end-to-end.
- IRON RULE #2 (D14 stamping): pages.contextual_retrieval_mode AND
pages.corpus_generation are set after every import.
- IRON RULE: chunker_version stamps to current MARKDOWN_CHUNKER_VERSION
(3 for v0.40.3.0).
- D5 per-page frontmatter override: `contextual_retrieval: none` makes the
embedder receive UNWRAPPED text; mode column stamped 'none'.
- T9 reindex predicate: pages with contextual_retrieval_mode IS NULL enter
the sweep regardless of chunker_version.
462 tests pass across all v0.40.3.0 + adjacent suites (migrate,
pglite-engine, search-mode, doctor, import-file, upgrade-reembed-prompt,
schema-bootstrap-coverage, recursive chunker, all five new files). Zero
regressions, clean typecheck.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
VERSION 0.37.11.0 → 0.40.3.0 with package.json sync.
CHANGELOG entry follows the CLAUDE.md ELI10-lead voice rule: opens with
"Your search now understands what each chunk is about, not just what words
are in it," lays out the tier ladder with a real cost table, calls out the
chunk_text storage separation (D20-T1) with a concrete example, and includes
the "Things to watch" + "What we caught and fixed before merging" sections
per the format spec.
CHANGELOG also includes the canonical "To take advantage of v0.40.3.0"
self-repair block with the manual `gbrain apply-migrations --yes` +
`gbrain reindex --markdown` recovery path for users whose `gbrain upgrade`
post-upgrade-reembed didn't fully fire.
skills/migrations/v0.40.3.0.md walks the agent through the mechanical
upgrade flow, the opt-up to tokenmax path with the realistic backfill cost
table, the opt-out soft kill switch flip, and the per-page frontmatter
override with the D15 mount-trust note. Matches the v0.13.0 + v0.32.7
migration doc structure so agent muscle memory works.
llms-full.txt + llms.txt regenerated via `bun run build:llms` to pick up the
CHANGELOG + migration doc additions. test/build-llms.test.ts passes.
Also moved test/audit-synopsis.test.ts → test/audit-synopsis.serial.test.ts
to satisfy the check-test-isolation lint (the test mutates GBRAIN_AUDIT_DIR
via beforeAll/afterAll for a fixture dir, which the parallel runner forbids
in *.test.ts files; serial quarantine is the canonical fix per CLAUDE.md
test-isolation rules).
`bun run verify` passes (typecheck + 4 CI gate checks). 469 tests across all
v0.40.3.0 + adjacent suites pass with 0 failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-T15 test gap-fill: covers the two highest-leverage spots that the T14
suite didn't exercise.
test/contextual-retrieval-doctor.serial.test.ts (8 tests, .serial because
the doctor check reads the audit JSONL via GBRAIN_AUDIT_DIR env mutation):
- empty-brain → ok
- fully-aligned brain (chunker_version current + mode stamped) → ok
- chunker_version drift → warn with paste-ready `gbrain reindex --markdown`
- NULL mode column → warn surfaces "never evaluated against CR ladder"
- both drift conditions together → warn with both messages
- soft-deleted pages NOT counted (deleted_at filter works)
- non-markdown (code) pages NOT counted (page_kind filter works)
- audit JSONL refusal event surfaces in the failure-summary line
test/e2e/concurrent-embed-race.test.ts (3 tests, D24 regression guard):
- cold path: existing embedding NULL → take new (no-race case)
- IRON RULE: fresher write wins over stale write when text unchanged.
Pre-fix this would have been last-writer-wins via COALESCE; post-fix the
fresher embedded_at survives. Pinned by raw SQL upsert with an explicit
-5min embedded_at to simulate the slower writer.
- text change with no new embedding → both embedding + embedded_at reset
to NULL (consistent state so embed --stale picks up).
Cross-shard contamination fix: race test calls configureGateway with
embedding_dimensions=1536 BEFORE initSchema so the PGLite vector column
sizes consistently regardless of what other tests in the same shard process
configured first. Without this, running the race test alongside the
pglite-e2e test triggered "expected 1280 dimensions, not 1536" when the
gateway was left in its default ZE-1280 state by a prior file.
`bun run verify` passes (typecheck + 5 CI gate checks). 88 tests pass across
all v0.40.3.0 + new gap-fill files in one combined run; zero shared-state
contamination.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ns v81→v90
Absorbs master's v0.38 (ingestion cathedral), v0.38.1 (agents), v0.38.2
(doctor), v0.39.0 (brainstorm cost cathedral), v0.39.1 (schema packs), and
the v0.40.x VERSION bump on top.
Conflict resolutions:
- VERSION → 0.40.5.0 (this wave's slot; v0.40.4.0 claimed by salem PR #1300)
- package.json → 0.40.5.0
- src/core/migrate.ts → took master's v81 (pages_provenance_columns) +
v82-v88; appended our contextual_retrieval_columns as v90 (skipped v89
reserved by garrytan/v0.40.2.0-trajectory-routing per D7 inspection)
- src/core/search/mode.ts → KNOBS_HASH_VERSION 4→5 (per D8 sequencing
behind salem's pending v=4 graph signals); both schema_pack hash fields
(master) and contextual_retrieval hash fields (this branch) preserved
- src/core/types.ts → both v0.38 provenance Page fields and v0.40.3 CR
fields preserved on the Page interface
- CHANGELOG.md → took master as baseline; v0.40.5.0 entry lands in T9 docs
phase
- bun.lock → bun install refreshed to pick up chokidar@^4.0.3 (v0.38 dep)
bun run typecheck passes after merge.
… + index
Migration v90 (renamed from v0.40.3.0 v81 on master merge per D2/D7):
- 5 additive columns (pages.contextual_retrieval_mode,
pages.corpus_generation, sources.contextual_retrieval_mode,
sources.trust_frontmatter_overrides, query_cache.page_generations) for the
contextual retrieval wave.
Migration v91 (NEW per D6 + codex #4 + codex #8):
- pages.generation BIGINT NOT NULL DEFAULT 1 (per-page generation counter)
- query_cache.max_generation_at_store BIGINT NOT NULL DEFAULT 0 (Layer 1
bookmark)
- bump_page_generation_fn() trigger function:
- BEFORE INSERT: NEW.generation := COALESCE(MAX(generation), 0) + 1.
Codex #4 INSERT coverage so cache rows stored before a new page existed
invalidate correctly.
- BEFORE UPDATE: bumps generation only when allow-list columns IS DISTINCT
FROM (compiled_truth, timeline, frontmatter, deleted_at,
contextual_retrieval_mode, title, type, page_kind, corpus_generation,
content_hash) per D6 widened to catch user-visible mutations.
- CREATE INDEX CONCURRENTLY pages_generation_idx ON pages (generation) so
MAX(generation) for the bookmark check is O(log N). Codex #8 confirmed
plain btree, no DESC necessary.
Mirrored in src/schema.sql, src/core/pglite-schema.ts CREATE TABLE body
(trigger included so fresh PGLite installs get it from the schema blob, not
just migration replay).
Extended REQUIRED_BOOTSTRAP_COVERAGE with pages.contextual_retrieval_mode,
pages.corpus_generation, sources.contextual_retrieval_mode,
sources.trust_frontmatter_overrides, pages.generation. Probes added to
applyForwardReferenceBootstrap on both engines + matching ALTER blocks for
pre-v90/pre-v91 brains.
COLUMN_EXEMPTIONS extended: query_cache.max_generation_at_store (same
rationale as page_generations: query_cache is migration-only, not in
PGLITE_SCHEMA_SQL).
Test results:
- bun test test/migrate.test.ts: 140 pass / 0 fail
- bun test test/schema-bootstrap-coverage.test.ts: 9 pass / 0 fail
- bun run typecheck: clean
New pure module src/core/search/query-cache-gate.ts:
- buildPageGenerationsSnapshot(engine, pageIds) builds the {pageId: gen}
snapshot + MAX(generation) bookmark in one round trip via UNION ALL.
Pre-v91 brains (no generation column) fall back to empty snapshot +
zero bookmark — backward compat with legacy rows preserved.
- validateCacheRowAgainstPages() — pure validator for unit testing.
- CACHE_GATE_WHERE_CLAUSE exported as a SQL fragment that lookup() embeds
in its WHERE clause. Two-layer gate per D11:
Layer 1 (cheap): (SELECT MAX(generation) FROM pages) <=
qc.max_generation_at_store
Layer 2 (per-page): jsonb_each + LEFT JOIN pages to detect deletes
+ bumped pages on the cached result set.
Legacy compat: rows with empty {} snapshot are vacuously valid (Layer 2
short-circuits) — IRON-RULE pinned.
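The two-layer decision described above can be sketched as a pure function. This is an illustration only: the type names, field names, and the `validateCacheRow` signature are assumptions, not the actual exports of query-cache-gate.ts, and the real gate runs as SQL inside lookup().

```typescript
// Hypothetical shapes; the real module's types are not shown in this PR text.
type PageGenerations = Record<string, number>;

interface CacheRowGate {
  maxGenerationAtStore: number;     // Layer 1 bookmark captured at store() time
  pageGenerations: PageGenerations; // Layer 2 snapshot: {pageId: generation}
}

/**
 * Returns true when a cached row may still be served.
 * currentGenerations maps pageId -> current generation for pages that still exist.
 */
function validateCacheRow(
  row: CacheRowGate,
  currentMaxGeneration: number,
  currentGenerations: Map<string, number>,
): boolean {
  // Layer 1 (cheap): nothing anywhere in pages has been bumped since store().
  if (currentMaxGeneration <= row.maxGenerationAtStore) return true;

  // Layer 2 (per-page): every page in the cached result set must still exist
  // at the generation recorded in the snapshot. An empty snapshot (legacy row)
  // has nothing to check, so it is vacuously valid.
  for (const [pageId, storedGeneration] of Object.entries(row.pageGenerations)) {
    const current = currentGenerations.get(pageId);
    if (current === undefined) return false;        // page deleted
    if (current !== storedGeneration) return false; // page bumped
  }
  return true;
}
```

Under this reading, inserting an unrelated page trips Layer 1 but still serves the row because Layer 2 finds the snapshot intact, which matches the "INSERT new page → HIT" e2e case listed below.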
query-cache.ts wiring:
- lookup() table-aliased to `qc` so the gate fragment can reference
qc.max_generation_at_store + qc.page_generations. WHERE clause adds
`AND ${CACHE_GATE_WHERE_CLAUSE}` after the existing similarity + TTL +
knobs_hash filters.
- store() captures the snapshot via the pure helper, then INSERTs both
page_generations JSONB and max_generation_at_store BIGINT alongside
the existing columns. ON CONFLICT (id) DO UPDATE refreshes both.
Test coverage (15 unit + 6 e2e):
- test/query-cache-gate.test.ts: 15 cases covering pure validator
branches (vacuous valid, bookmark short-circuit, single/multi/partial
bumps, deleted page, codex D11 critical case), PGLite-backed snapshot
builder (empty pageIds, populated pageIds, integer JSONB shape,
non-existent IDs skipped, bump-after-update), SQL shape regression
on CACHE_GATE_WHERE_CLAUSE.
- test/e2e/cache-gate-pglite.test.ts: 6 cases covering store → HIT,
content UPDATE → MISS, INSERT new page → HIT (codex #4 case where
bookmark fires but snapshot intact serves correctly), legacy row →
HIT (IRON-RULE backward compat), soft-delete → MISS (trigger path),
multi-page partial bump → MISS.
Test results:
- bun test test/query-cache-gate.test.ts test/query-cache.test.ts
test/query-cache-isolation.test.ts test/e2e/cache-gate-pglite.test.ts:
33 pass / 0 fail
- bun run typecheck: clean
Note: hard-delete (raw DELETE FROM pages) is not covered by the trigger
(BEFORE INSERT OR UPDATE doesn't fire on DELETE). Production uses
soft-delete via deleted_at (trigger allow-list catches NULL → timestamp
distinction). Hard-delete via admin-only `gbrain pages purge-deleted` is
best-effort cache-wise — acceptable for the rare admin path.
New module src/core/search/mode-switch-ux.ts:
- summarizeTransition(old, new): pure 5-cell matrix (no_change / narrowing / broadening / tokenmax_opt_in / invalid_new_mode) + reindex command + cost estimate + paste-ready callout lines.
- probeWorkerAvailable(engine): worker liveness proxy. gbrain has no minion_workers heartbeat table yet (B7 follow-up from v0.19.1), so we use a proxy: minion_jobs activity within a 10-min query window. Within 2 min = active; >2min but <10min = stale; nothing = never_seen.
- buildReindexIdempotencyKey(): content-stable per codex D12 Bug 1. Pattern: cr-backfill:<source_id>:<chunker_version>:<mode>. NOT timestamp-based: two retries against the same brain state dedupe.
- runModeSwitchUx(): orchestrator. Honors GBRAIN_NO_MODE_SWITCH_UX=1 (full skip), non-TTY (print paste-ready hints to stderr), yesFlag (auto-submit reindex). For tokenmax_opt_in + TTY + worker probe active: submits via MinionQueue.add with allowProtectedSubmit=true. For probe = stale or never_seen: loud-fail per D3 with a "start a worker OR run inline" recovery hint, which closes the silent-stall footgun.
src/commands/config.ts hook (~30 LOC):
- Captures the OLD search.mode BEFORE setConfig so summarizeTransition classifies correctly.
- Fires runModeSwitchUx() AFTER setConfig persisted, wrapped in try/catch so UX failures never break the config-set that already landed.
- Best-effort: failures emit `[mode-switch] UX hook failed (non-fatal)` to stderr.
Test coverage (18 cases):
- summarizeTransition: 8 cases covering all 5 transition kinds + null inputs + tokenmax-as-first-set + invalid mode.
- probeWorkerAvailable: 4 cases via real PGLite: never_seen / active / stale (seeded via minion_jobs) + threshold constant assertion.
- buildReindexIdempotencyKey: 6 cases pinning the content-stable contract (codex D12 Bug 1): identical inputs match, different inputs differ, consecutive calls match despite time delta (NOT timestamp-based).
Test results:
- bun test test/mode-switch-ux.test.ts: 18 pass / 0 fail
- bun run typecheck: clean
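The worker-probe thresholds and the content-stable key pattern can be sketched as two pure helpers. The function names, parameter types, and the handling of the exact 2-minute boundary are assumptions for illustration; the real probeWorkerAvailable queries minion_jobs through the engine.

```typescript
type WorkerProbe = "active" | "stale" | "never_seen";

const ACTIVE_WINDOW_MS = 2 * 60_000;  // "within 2 min = active"
const QUERY_WINDOW_MS = 10 * 60_000;  // jobs older than this fall outside the query

// lastActivityMs: timestamp of the most recent minion_jobs activity, or null if none.
function classifyWorker(lastActivityMs: number | null, nowMs: number): WorkerProbe {
  if (lastActivityMs === null) return "never_seen";
  const ageMs = nowMs - lastActivityMs;
  if (ageMs <= ACTIVE_WINDOW_MS) return "active"; // boundary choice is an assumption
  if (ageMs < QUERY_WINDOW_MS) return "stale";
  return "never_seen"; // outside the 10-min window, the query would return nothing
}

// Content-stable key: built only from brain state, never from the clock,
// so two retries against the same state produce the same key and dedupe.
function buildReindexKey(sourceId: string, chunkerVersion: number, mode: string): string {
  return `cr-backfill:${sourceId}:${chunkerVersion}:${mode}`;
}
```

Keeping the classifier free of I/O makes the threshold constants directly testable, which is what the "threshold constant assertion" case above appears to pin.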
…-frontmatter}
Four new mounts CLI verbs per D4:
- gbrain mounts enable <id> — re-enable a disabled mount
- gbrain mounts disable <id> — toggle off without removing
- gbrain mounts trust-frontmatter <id> — let this mount's per-page
contextual_retrieval_mode
frontmatter override the source
default. Off by default for
mounted brains; host is always
trusted.
- gbrain mounts untrust-frontmatter <id> — clear the trust flag.
Implementation:
- src/core/brain-registry.ts MountEntry interface extended with
trust_frontmatter_overrides?: boolean. loadMounts() projection threads
the field through with default false (mounts opt in explicitly per D4
+ D15 security posture).
- src/commands/mounts.ts: new runSetMountFlag() helper handles all 4
verbs via a shared file-write path. Missing-mount loud rejection
(GBrainError with list-hint). Host brain rejection. Idempotent: no-op
when current value already matches. Cache refresh after each write
so host agents see the new flag immediately.
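A minimal sketch of the shared flag-setting logic, assuming an in-memory list of mounts. The `setMountFlag` name, the `FLAG_DEFAULTS` table (in particular `enabled` defaulting to true), and the error wording are hypothetical; the real runSetMountFlag also writes the mounts file and refreshes the cache.

```typescript
interface MountEntry {
  id: string;
  enabled?: boolean;
  trust_frontmatter_overrides?: boolean;
}

type MountFlag = "enabled" | "trust_frontmatter_overrides";

// Assumed defaults: trust is off by default per the commit message;
// treating a missing `enabled` as true is an assumption of this sketch.
const FLAG_DEFAULTS: Record<MountFlag, boolean> = {
  enabled: true,
  trust_frontmatter_overrides: false,
};

function setMountFlag(
  mounts: readonly MountEntry[],
  id: string,
  flag: MountFlag,
  value: boolean,
): { mounts: MountEntry[]; changed: boolean } {
  // Host brain is always trusted and is not a mount, so reject it loudly.
  if (id === "host") throw new Error(`cannot set ${flag} on the host brain`);

  const target = mounts.find((m) => m.id === id);
  if (!target) {
    const known = mounts.map((m) => m.id).join(", ") || "(none)";
    throw new Error(`mount "${id}" not found; known mounts: ${known}`);
  }

  // Idempotent: no write when the effective value already matches.
  const current = target[flag] ?? FLAG_DEFAULTS[flag];
  if (current === value) return { mounts: [...mounts], changed: false };

  const next = mounts.map((m) => {
    if (m.id !== id) return m;
    const updated: MountEntry = { ...m }; // copy so other fields are preserved
    updated[flag] = value;
    return updated;
  });
  return { mounts: next, changed: true };
}
```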
Test infrastructure:
- GBRAIN_MOUNTS_PATH env override on getMountsPath() in BOTH
brain-registry.ts AND mounts.ts (the latter has its own
copy — two source-of-truth paths). Reason: libuv caches homedir()
on some platforms, so withFakeHome's HOME mutation isn't picked up
by tests calling runMounts(). Production callers don't set the env.
Test coverage (5 new cases):
- enable → disable → enable cycle persists
- trust-frontmatter → untrust → trust cycle preserves other fields
- missing mount id → loud rejection with list-hint (closes the
critical gap from idempotent-pebble Failure Modes table)
- host brain rejection: cannot trust-frontmatter "host"
- enable on already-enabled mount: no-op (idempotent)
Test results:
- bun test test/mounts-cli.test.ts test/brain-registry.serial.test.ts:
54 pass / 0 fail
- bun run typecheck: clean
New verb `gbrain sources set-cr-mode <id> <mode>` per D5:
- Mode argument validated against CR_MODES via isCRMode (closed enum:
none | title | per_chunk_synopsis).
- "unset" / "default" / "" clears the column to NULL (falls through to
the global search.mode bundle).
- Loud rejection on:
- Missing id/mode → exit 2, prints usage
- Invalid mode → exit 2, lists valid options
- Missing source id → exit 4, paste-ready `gbrain sources list` hint
(closes the idempotent-pebble Failure Modes critical gap)
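The argument-validation rules above can be sketched as a small pure parser. `CR_MODES` and `isCRMode` mirror the definitions described for src/core/types.ts; the `parseCrModeArg` name, its result shape, and the message text are assumptions. The exit-4 path for an unknown source id needs a database lookup and is left out.

```typescript
const CR_MODES = ["none", "title", "per_chunk_synopsis"] as const;
type CRMode = (typeof CR_MODES)[number];

function isCRMode(value: unknown): value is CRMode {
  return typeof value === "string" && (CR_MODES as readonly string[]).includes(value);
}

type ParseResult =
  | { ok: true; mode: CRMode | null } // null means "clear the column to NULL"
  | { ok: false; exitCode: 2; message: string };

function parseCrModeArg(raw: string | undefined): ParseResult {
  if (raw === undefined) {
    return { ok: false, exitCode: 2, message: "usage: gbrain sources set-cr-mode <id> <mode>" };
  }
  // "unset" / "default" / "" clear the override so the global setting applies.
  if (raw === "unset" || raw === "default" || raw === "") {
    return { ok: true, mode: null };
  }
  if (isCRMode(raw)) return { ok: true, mode: raw };
  return {
    ok: false,
    exitCode: 2,
    message: `invalid mode "${raw}"; valid options: ${CR_MODES.join(" | ")}`,
  };
}
```

Returning a result object instead of calling process.exit keeps the parser testable; a caller would map `exitCode` onto the process exit status.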
src/commands/sources.ts wired into the switch dispatch + help text
updated. isCRMode + CR_MODES lazy-imported per existing import pattern
in this file.
Test coverage (10 cases):
- happy path for all 3 valid CRMode values
- unset path via "unset" + "default" both clear to NULL
- invalid mode → exit 2 + no mutation
- missing source id → exit 4
- missing arguments → exit 2 with usage
- missing mode (only id) → exit 2 + no mutation
- round-trip preserves other fields (name)
Test results:
- bun test test/sources-set-cr-mode.test.ts: 10 pass / 0 fail
- bun run typecheck: clean
New canonical module src/core/remediation-step.ts:
- RemediationStep interface (lifted from brain-score-recommendations.ts).
Same shape; rename to "Step" suffix per D6 for clarity ("a step in a
remediation plan").
- RemediationSeverity + RemediationStatus type re-exports.
- canonicalJson(value): zero-dep canonical serialization — sorts object
keys recursively before stringify. Per codex D12 Bug 2: identical
logical params hash identically regardless of insertion order.
- idempotencyKey(source, job, params): shape
<source>:<job>:sha8(canonicalJson(params)). Lifted from the legacy
inline idemKey helper so future check authors don't drift.
- makeRemediationStep(opts): canonical factory. Defaults id to the
idempotency key (override for human-readable like 'sync.repo').
Status defaults to 'remediable'. All check authors should use this;
hand-rolling is the drift hazard the refactor closes.
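A minimal sketch of the canonical-JSON and idempotency-key idea, assuming `sha8` means the first 8 hex characters of a SHA-256 digest (the commit message does not say which hash is used). The helper bodies are illustrative, not the contents of remediation-step.ts.

```typescript
import { createHash } from "node:crypto";

// Recursively rebuild objects with sorted keys; arrays keep their order.
function sortKeys(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(sortKeys);
  if (value !== null && typeof value === "object") {
    const input = value as Record<string, unknown>;
    const output: Record<string, unknown> = {};
    for (const key of Object.keys(input).sort()) {
      output[key] = sortKeys(input[key]);
    }
    return output;
  }
  return value; // primitives pass through unchanged
}

function canonicalJson(value: unknown): string {
  return JSON.stringify(sortKeys(value));
}

// Assumption: sha8 = first 8 hex chars of SHA-256 over the canonical string.
function sha8(text: string): string {
  return createHash("sha256").update(text).digest("hex").slice(0, 8);
}

function idempotencyKey(source: string, job: string, params: unknown): string {
  return `${source}:${job}:${sha8(canonicalJson(params))}`;
}
```

Because the hash is taken over the sorted serialization, `{ a: 1, b: 2 }` and `{ b: 2, a: 1 }` produce the same key, which is the insertion-order invariance the codex D12 Bug 2 regression pins.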
src/core/brain-score-recommendations.ts:
- Removed the local Remediation + RemediationSeverity + RemediationStatus
definitions.
- Re-exports them from remediation-step.ts so existing callers (e.g.
doctor.ts) still resolve. Also re-exports Remediation as an alias
for RemediationStep so import paths can migrate gradually.
- Imports the Remediation type alias internally so the (substantial) existing
computeRecommendations body keeps compiling without a sed pass.
Test coverage (17 cases):
- canonicalJson: key-ordering determinism (3 cases), nested objects,
array order preservation, primitive types, codex D12 Bug 2 regression
- idempotencyKey: shape regex, content invariance, key-ordering
invariance, source/job/params differentiation
- makeRemediationStep: default id, explicit id override, default status,
canonical-JSON invariance, all-opts threadthrough
- back-compat: `import { Remediation } from brain-score-recommendations`
still resolves to RemediationStep (compile + runtime check)
Test results:
- bun test test/remediation-step.test.ts: 17 pass / 0 fail
- bun test test/brain-score-recommendations.test.ts test/doctor.test.ts:
70 pass / 0 fail (back-compat preserved)
- bun run typecheck: clean
Per D6 + D8: T8b in next commit wires lint, integrity, sync_failures
doctor checks to emit RemediationStep via the new factory.
…+ 3 Minion handlers
Doctor checks now emit RemediationStep via makeRemediationStep():
- `integrity` check (when bareHits > 0) emits integrity-auto step.
Severity escalates to 'high' when bareHits > 50. Deterministic; $0 cost.
- `sync_failures` check (when unacked > 0) emits sync-retry-failed step.
Severity escalates to 'high' when count >= 10. Content-stable params
(failure_count + oldest_failure timestamp) per codex D12 Bug 2.
- sync-skip-failed DELIBERATELY NOT emitted per D12 Bug 3 (auto-skipping
failed syncs hides data loss). Operators retain `gbrain sync --skip-failed`
as a direct CLI option.
Lint doctor check NOT wired — there is no `lint` check in doctor.ts
today; the lint workflow is the standalone `gbrain lint` command. Adding
a doctor lint check is a v0.41+ TODO when it justifies its own complete
section.
Three new Minion handlers in registerBuiltinHandlers (NOT in
PROTECTED_JOB_NAMES — they're thin wrappers around already-shipping CLI
commands, idempotent, no shell exec, MCP-safe):
- lint-fix → runLintCore({ fix: true })
- integrity-auto → runIntegrity(['auto'])
- sync-retry-failed → runSync(['--retry-failed'])
Check.remediation field shape upgrade:
- Was: inline Array<{...}> shape.
- Now: RemediationStep[] from the canonical
src/core/remediation-step.ts. Check authors `import { makeRemediationStep }`
and emit through the factory.
Test results:
- bun test test/doctor.test.ts: 48 pass / 0 fail (zero regression on
the doctor surface; new remediation fields are additive)
- bun run typecheck: clean
The v0.38 ingestion cathedral added a new write path to pages via the `ingest_capture` Minion handler. The v0.40.5.0 cache-invalidation gate relies on pages.generation being bumped by EVERY write path via the BEFORE INSERT OR UPDATE trigger.
This file pins that the new v0.38 capture write path correctly bumps generation through three scenarios:
1. INSERT path (codex #4 INSERT coverage): ingest_capture with a fresh slug creates a page with generation = MAX(generation) + 1 so any cache row stored before the new page existed has its bookmark fire.
2. UPDATE path: ingest_capture with an existing slug + new content → trigger fires on content-column IS DISTINCT FROM and bumps generation.
3. Idempotent UPDATE: capture with the SAME content → trigger short-circuits, no bump. Cache freshness preserved on re-runs.
Per codex #5 strengthening: noEmbed: true is set explicitly so the test doesn't require API keys (test runs against pure PGLite).
Test results:
- bun test test/e2e/capture-generation-regression.test.ts: 3 pass / 0 fail
- bun run typecheck: clean
…+ llms regen
Single combined v0.40.5.0 CHANGELOG entry folds in v0.40.3.0 contextual retrieval content + v0.40.5.0 wave additions (cache gate + mode-switch UX + mounts/sources CLI + RemediationStep refactor). Voice per CLAUDE.md: ELI10 lead, plain language, paste-ready commands, tier table, "Things to watch", "What we caught and fixed before merging" (summarizes the 8 codex findings + 3 design decisions in user-facing terms), "Itemized changes", "## To take advantage of v0.40.5.0" mandatory self-repair block.
CLAUDE.md: new section "Key commands added in v0.40.5.0 (contextual retrieval + cache gate + 4 CLI verbs)" listing the 4 new mount verbs, sources set-cr-mode, mode-switch UX, KNOBS_HASH_VERSION bump, 3 new Minion handlers, and the 3 new modules (remediation-step, query-cache-gate, mode-switch-ux).
skills/migrations/v0.40.5.0.md: new migration skill with feature_pitch frontmatter for the auto-update agent. Documents the 6 master commits merged in, migration v90 (renumber from v81) + v91 (trigger), the optional opt-up to tokenmax, per-source CR mode overrides, mount frontmatter trust, the soft kill switch, and the backward-compat guarantees.
bun run build:llms refreshed llms.txt + llms-full.txt:
- llms.txt: 4314 bytes
- llms-full.txt: 578257 bytes
Test results:
- bun test test/build-llms.test.ts: 7 pass / 0 fail (committed bundles byte-match generator output)
KNOBS_HASH_VERSION bumped 4→5 per D8 (sequenced behind salem's pending v=4 graph-signals work). Three test files held stale ==3 / ==4 assertions:
- test/search-mode.test.ts: assertion + comment updated to v=5.
- test/search/knobs-hash-reranker.test.ts: assertion + describe name updated to v=5 ladder.
- test/cross-modal-phase1.test.ts: assertion + name updated to v=5.
reindex.test.ts "skips pages already at current chunker_version": the v0.40.3.0 reindex predicate (`chunker_version < CURRENT OR contextual_retrieval_mode IS NULL`) caught the should-skip page because its CR mode was NULL. Fixed by seeding `contextual_retrieval_mode = 'title'` on the should-skip row.
reindex.test.ts "idempotent: re-run on a fully-updated brain reports nothing to do": by design, `--no-embed` reindex bumps chunker_version but skips CR-state stamping (import-file.ts:457-466 documents this). Fixed by manually stamping `contextual_retrieval_mode = 'title'` between the first and second reindex calls so the brain matches the "fully updated" state the idempotency test name implies. Production embed flow stamps both in one pass; the test uses --no-embed only to avoid requiring API keys.
Test results:
- bun run verify (typecheck + 4 pre-checks): clean
- bun run test: 9482 pass / 0 fail / 0 skip across 410s
…master)
Master is at v0.40.2.0; v0.40.3.0 is genuinely the next free slot. The wave was originally planned as v0.40.5.0 sequenced behind salem (PR #1300 = v0.40.4.0) but the user is shipping THIS branch as v0.40.3.0 because:
1. v0.40.3.0 IS the canonical version slot for the contextual retrieval cathedral (matches branch name garrytan/v0.40.3.0-contextual-retrieval).
2. Master is at v0.40.2.0, so v0.40.3.0 is the immediate next slot, not a collision.
3. salem's v0.40.4.0 + any v0.40.5.0 work sit ON TOP of this in the landing train, not under it.
Mechanical rename only; no content changes from the v0.40.5.0 commit sequence (T1-T11 wave is preserved verbatim, just relabeled):
- VERSION + package.json: 0.40.5.0 → 0.40.3.0
- bun.lock: refreshed (no dep changes)
- CHANGELOG.md: ## [0.40.5.0] header → ## [0.40.3.0] + body references
- skills/migrations/v0.40.5.0.md → skills/migrations/v0.40.3.0.md (previous v0.40.3.0.md file overwritten with the richer T9 content)
- CLAUDE.md: "Key commands added in v0.40.5.0" → "v0.40.3.0"
- 30 source + test files: comment references swept via sed s/0.40.5.0/0.40.3.0/g
- llms.txt + llms-full.txt: regenerated
Migration numbering UNCHANGED: v90 (renamed from original v81 because master took v82-v88) and v91 (new trigger migration) stay at v90/v91; the version slot is orthogonal to the migration ledger collision.
KNOBS_HASH_VERSION = 5 stays, sequenced behind master's v=4 schema-pack work; salem's v=4 graph-signals will rebump to v=5 if it lands first.
Test results after rename:
- bun run verify: clean (typecheck + 7 pre-checks)
- bun run test: 9482 pass / 0 fail / 0 skip
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Master shipped v0.40.2.0 (trajectory routing for temporal + knowledge_update in gbrain think + LongMemEval) while this branch was off-master. Folded in mechanically with these conflict resolutions:
- VERSION → 0.40.3.0 (kept; > master's 0.40.2.0, valid for CI version-gate)
- package.json → 0.40.3.0 (matched VERSION; bun install refreshed lockfile)
- src/core/migrate.ts → kept both sides: inserted master's v89 (facts_event_type_column) BEFORE our v90 (contextual_retrieval_columns) + v91 (pages_generation_trigger_and_bookmark). Numeric order preserved.
- CHANGELOG.md → kept both entries: our v0.40.3.0 entry on top, master's v0.40.2.0 entry below, v0.40.1.0 and earlier untouched.
No content lost; no migration renumber needed (v90 + v91 were already chosen in T2 to clear v89; codex D7 plan-review predicted this exact landing sequence and pinned the numbering).
Trio audit post-merge:
- VERSION: 0.40.3.0
- package.json: 0.40.3.0
- CHANGELOG header (topmost): ## [0.40.3.0] - 2026-05-22
Verification:
- bun run typecheck: clean
- bun test test/migrate.test.ts test/schema-bootstrap-coverage.test.ts: 161 pass / 0 fail (migration v89 + v90 + v91 all apply cleanly in ascending order)
…ction (CI Tier 1)
CI Tier 1 (Mechanical) failed on real Postgres with:
ERROR: CREATE INDEX CONCURRENTLY cannot run inside a transaction block
STATEMENT: <v91 multi-statement SQL block including CREATE INDEX CONCURRENTLY ...>
Root cause: postgres.js's multi-statement `.unsafe()` wraps the entire block in an implicit transaction. `transaction: false` on the migration entry doesn't help, because the implicit wrap happens at the driver layer, below the migration runner. CONCURRENTLY refuses to run inside any transaction.
Fix: rewrite v91 using the v14 pages_updated_at_index handler pattern: `sql: ''` + `handler:` function that splits the work into separate `engine.runMigration()` calls:
1. Columns + trigger function + trigger (single multi-statement runMigration; ALTER/CREATE FUNCTION/CREATE TRIGGER are transaction-safe).
2. On Postgres only: pre-drop invalid index remnant via `pg_index.indisvalid` (matches v14 pattern for retry safety after a failed CONCURRENTLY left a half-built index with the target name).
3. CREATE INDEX CONCURRENTLY as a standalone runMigration call (separate statement = no implicit transaction wrap).
4. PGLite: plain CREATE INDEX (no CONCURRENTLY needed; single writer).
Verified against real Postgres (pgvector:pg16):
- schema_version=91 after init
- pages_generation_idx exists with btree shape
- bump_page_generation_trg installed
- test/e2e/postgres-bootstrap.test.ts + test/e2e/schema-drift.test.ts: 8 pass / 0 fail
- bun test test/migrate.test.ts test/schema-bootstrap-coverage.test.ts: 161 pass / 0 fail
- bun run typecheck: clean
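The index half of that handler split (steps 2 to 4) can be sketched as a pure planner that returns one SQL string per `runMigration` call. Everything here is an assumption for illustration: the `MigrationEngine` interface, the function names, and the exact SQL text are not taken from src/core/migrate.ts, and step 1 (columns + trigger) is omitted.

```typescript
// Hypothetical engine surface; the real engine interface is not shown in this PR text.
interface MigrationEngine {
  kind: "postgres" | "pglite";
  runMigration(sql: string): Promise<void>;
}

// Each array element is meant to be sent as its OWN runMigration call so the
// driver never batches CONCURRENTLY together with other statements.
function planGenerationIndexSteps(kind: MigrationEngine["kind"]): string[] {
  if (kind === "pglite") {
    // Single writer: a plain index build is enough.
    return ["CREATE INDEX IF NOT EXISTS pages_generation_idx ON pages (generation)"];
  }
  return [
    // Retry safety: drop a half-built index left behind by a failed CONCURRENTLY run.
    `DO $$ BEGIN
       IF EXISTS (
         SELECT 1 FROM pg_index i
         JOIN pg_class c ON c.oid = i.indexrelid
         WHERE c.relname = 'pages_generation_idx' AND NOT i.indisvalid
       ) THEN
         DROP INDEX pages_generation_idx;
       END IF;
     END $$`,
    // Standalone statement: no implicit transaction wrap from a multi-statement batch.
    "CREATE INDEX CONCURRENTLY IF NOT EXISTS pages_generation_idx ON pages (generation)",
  ];
}

async function applyGenerationIndex(engine: MigrationEngine): Promise<void> {
  for (const sql of planGenerationIndexSteps(engine.kind)) {
    await engine.runMigration(sql); // one call per statement
  }
}
```

Separating the plan from its execution keeps the "CONCURRENTLY is always alone in its call" property checkable without a database.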
mgunnin
added a commit
to mgunnin/gbrain
that referenced
this pull request
May 28, 2026
* upstream/master: (22 commits) v0.41.4.0 wave: local providers + cross-platform stdin + gateway-routed dream judge (6 community PRs) (garrytan#1377) v0.41.3.0 fix(security/mcp): OAuth CORS lockdown + pre-register without DCR + validator surface (garrytan#1403) v0.41.2.0 feat: lens packs + epistemology unification — atoms + concepts as first-class units, calibration profile widening, gstack-learnings bridge (garrytan#1364) v0.41.1.0 feat: eval-loop wave — gbrain bench publish + gbrain eval gate close the LOOP (garrytan#1352) v0.41.0.0 feat(minions): fleet you supervise (4 field bugs + cathedral) (garrytan#1367) v0.40.10.0 feat: content sanity defense — junk-pattern throw + oversize-skip-embed (garrytan#1351) v0.40.9.0 feat(chunker): .sql indexing via tree-sitter + code-def on SQL DDL (garrytan#1173) (garrytan#1350) v0.40.8.1 docs: README rewrite + personal-brain + company-brain tutorials (garrytan#1345) v0.40.8.0 test: e2e + unit gap coverage + master flake root-cause fixes (garrytan#1313) v0.40.6.1 docs(todos): file v0.41 wave commitments + 7 verified-missing items (garrytan#1333) v0.40.7.0 Schema Cathedral v3 — agent-on-ramp + production rebuild of PR garrytan#1321 (garrytan#1327) v0.40.6.0 feat(sync): parallel sync --all + per-source lock invariant + sources status dashboard (productionized from PR garrytan#1314) (garrytan#1324) v0.40.5.0 Federated Sync v2 — parallel source sync + push triggers + per-source health (garrytan#1322) v0.40.4.0 feat(search): selective graph signals + per-stage attribution + audit-writer unification (garrytan#1300) v0.40.3.0 feat: contextual retrieval + cache invalidation gate + 4 deferred-item closures (garrytan#1323) v0.40.2.0 feat: trajectory routing for temporal + knowledge_update (gbrain think + LongMemEval) (garrytan#1296) v0.40.1.0 Track D — eval infrastructure (catch retrieval regressions, prove answer-quality wins) (garrytan#1298) v0.40.0.0 feat: agent-voice (Mars + Venus) + copy-into-host-repo skillpack paradigm 
(garrytan#1128) v0.39.3.0: productionize the v0.38 ingestion cathedral (smoke-test fix wave from PR garrytan#1299) (garrytan#1308) v0.39.2.0 feat(autopilot): per-source fan-out + cycle lock primitive + phase taxonomy (garrytan#1295) ...
Summary
Two big things shipped, plus four CLI improvements that close v0.40.3.0 deferral items.
Contextual retrieval cathedral (T1-T11 of the original v0.40.3.0 plan):
- `<context>Page Title</context>\nchunk text` goes into the vector; the canonical `chunk_text` stays unwrapped for snippets, FTS, reranker, debug output. Asymmetric design (Voyage + ZeroEntropy distinguish query/document inputType natively; OpenAI symmetric users still benefit from document-side orientation).
- `conservative` (no wrapper), `balanced` (title-only, default), `tokenmax` (per-chunk Haiku synopsis; $1-5 backfill / 10K pages).
- `contextual_reindex_per_chunk` Minion handler (PROTECTED).
- `--unsafe-bypass-dream-guard` retained.
- `gbrain config set search.contextual_retrieval_disabled true`.
Cache invalidation gate (the deferred v0.40.3.0+ item):
- `pages.generation BIGINT` + `query_cache.max_generation_at_store BIGINT` + `bump_page_generation_fn` trigger (BEFORE INSERT OR UPDATE on 10-column allow-list per D6) + `pages_generation_idx` btree.
- Rows with an empty `{}` snapshot are vacuously valid (IRON-RULE backward compat).
- `src/core/search/query-cache-gate.ts` (NEW): pure helpers (`buildPageGenerationsSnapshot`, `validateCacheRowAgainstPages`, `CACHE_GATE_WHERE_CLAUSE`).
Four deferred-item closures (the other v0.40.3.0+ deferrals):
- `gbrain mounts enable/disable/trust-frontmatter/untrust-frontmatter <id>`: 4 new verbs sharing the `runSetMountFlag` helper. MountEntry interface extended with `trust_frontmatter_overrides?: boolean` (default false; mounts opt in explicitly per D15 security posture).
- `gbrain sources set-cr-mode <id> <none|title|per_chunk_synopsis>`: per-source CR mode override. Pass `unset`/`default`/`""` to clear (NULL falls through to global). Missing source ID fails loudly with paste-ready `gbrain sources list` hint.
- `gbrain config set search.mode tokenmax`: mode-switch banner fires with cost preview + (on TTY + active Minion worker) offers to submit `gbrain reindex --markdown` as a Minion job. Non-TTY callers get a paste-ready hint to stderr. Suppress with `GBRAIN_NO_MODE_SWITCH_UX=1` for CI fixtures.
- `Remediation` lifted from brain-score-recommendations.ts into `src/core/remediation-step.ts` as `RemediationStep` with `makeRemediationStep` factory using canonical-JSON-based deterministic IDs (codex D12 Bug 2). Doctor's integrity + sync_failures checks emit RemediationSteps via the factory. 3 new Minion handlers (`lint-fix`, `integrity-auto`, `sync-retry-failed`) wire into the consumer pattern. `sync-skip-failed` deliberately NOT emitted per codex D12 Bug 3 (auto-skip hides data loss; remains as direct CLI).
Other folded-in work:
- `embedded_at` tiebreaker. Closes long-standing v0.35.x TODO.
Test Coverage
Tests: 9389 → 9482 (+93 new across 7 new test files + 5 extended files)
Pre-Landing Review
Two rounds of `/plan-eng-review` + one `/codex` outside voice on top of the original v0.40.3.0 codex pass. 8 codex findings absorbed before merge:
- `ingest_capture` uses `INSERT ... ON CONFLICT DO UPDATE`; fresh slugs took INSERT path. Fixed: trigger now BEFORE INSERT OR UPDATE; INSERT branch sets `generation = COALESCE(MAX, 0) + 1`.
- `(generation)` (not DESC). CONCURRENTLY on Postgres.
TODOS
No TODOs.md updates in this PR (the deferred items list at the top of v0.40.3.0's CHANGELOG was the de facto TODO; this PR closes them).
Test plan
🤖 Generated with Claude Code