v0.41.34.0 feat(search): retrieval cathedral — max-pool + title + alias + evidence by garrytan · Pull Request #1657 · garrytan/gbrain

garrytan · 2026-05-30T16:17:17Z

Summary

Supersedes the docs-only RFC in #1616 (closed) with the verified diagnosis + the complete fix. A real miss prompted it: an agent searched "Greek amphitheater," didn't confidently find the existing concept page (it came back at a weak 0.64 via a throwaway body chunk), and wrote a duplicate stub on top of 2,000 words. Four layers + a contract so that can't happen again:

Retrieval layers

Per-page max-pool (6f5028a1) — searchVector (both engines) pools to one row per page (its best chunk) via a shared buildBestPerPagePoolCte; a strong page can't be crowded out by another page's chunks.
Title-phrase boost (dff45961) — a query that's a phrase in page.title gets a bounded, floor-gated boost (search.title_boost).
Alias hop (3cd0f1fd, 91669d23) — frontmatter aliases: project into a new page_aliases table (migration v108) + a query-time hop, so chosen names ("Hall of Light", "明堂") resolve to the page. gbrain reindex --aliases backfills existing pages.
Evidence contract (e34f60df) — every result carries evidence + create_safety so the agent's don't-duplicate decision keys off why a page matched, not a fuzzy score.

Surface + UX

search op → cheap-hybrid everywhere (CLI + MCP); query stays full-control; search.mcp_keyword_only escape hatch. gbrain search "<text>" routes to hybrid; modes/stats/tune stay subcommands (d17bb075).
gbrain search diagnose "<q>" --target <slug> — Phase-0 retrieval trace (7a3e4d2a).
Per-call --mode for local/trusted callers (remote uses configured mode).

Eval + observability

NamedThingBench: gbrain eval retrieval-quality + hermetic CI gate over 7 query families (1d697fbe).
Rank-1 match-score drift telemetry (migration v109) in gbrain search stats (eee121a2).

Fixes

A page represented by its weakest chunk (the incident).
Source-isolation P0s (322e2990): per-page pool and alias hop now key on (source_id, slug), not bare slug — caught by the codex adversarial pass.

Test Coverage

~120 new test cases across the wave (max-pool dilution regression, title-match, alias engine round-trip + hop + source-isolation, evidence classifier, per-call mode, dispatch routes, telemetry round-trip, NamedThingBench harness + hermetic gate, pre-migration fail-open, diagnose). Real-Postgres engine-parity 23/23 (incl. the composite-key pooling). Full unit suite green (exit 0); bun run verify 29/29.

Pre-Landing Review

Built test-first throughout; bun run verify green on every step.

Adversarial Review

Codex (cross-model) caught 2 genuine P0 source-isolation bugs (per-page pool + alias hop keyed on slug-only) — both fixed in-wave (322e2990), real-PG parity re-verified. P1/P2 (evidence-label calibration, deep-pagination candidate budget, telemetry rolling-deploy gap) filed as TODOs — documented tradeoffs, not blockers.

Plan Completion

All 11 plan workstreams (T0–T10) DONE. Plan: ~/.claude/plans/system-instruction-you-are-working-frolicking-brooks.md (CEO/eng/codex outside-voice reviewed, 19 decisions).

TODOS

3 follow-ups filed (evidence calibration, page-bounded pagination, telemetry rolling-deploy).

Documentation

Corrected RETRIEVAL_MAXPOOL_INCIDENT.md (closes #1616's narrative), RETRIEVAL.md named-thing section, METRIC_GLOSSARY.md (Hit@1/3, avg_rank1_score, create_safety), CLAUDE.md key-files + llms regenerated.

To take advantage

gbrain upgrade applies migrations v108 + v109. Then gbrain reindex --aliases to backfill names on existing pages. The search op is now hybrid by default; set search.mcp_keyword_only true for the old keyword-only behavior.

Test plan

Full unit suite green (exit 0, ~3650 tests)
Real-Postgres engine-parity 23/23
bun run verify 29/29

🤖 Generated with Claude Code

T1 of the retrieval-cathedral wave (supersedes #1616). Vector search returned chunk-grain top-k with no DISTINCT ON, so a page could be represented by a weak chunk while a hub page's chunks crowded a distinct page's strong chunk out of the candidate set entirely. Keyword search always pooled per page; the vector path did not.

- New shared buildBestPerPagePoolCte() in sql-ranking.ts: single source of truth consumed by searchKeyword + searchVector across postgres + pglite, so the two engines can't drift (the recurring parity bug class).
- searchVector both engines: compute score as a select-list expr (HNSW ORDER BY stays pure-distance), pool DISTINCT ON (slug) over the full candidate set before the user LIMIT, deterministic tiebreak (slug, score DESC, page_id ASC, chunk_id ASC).
- All keyword pooling blocks refactored onto the shared builder (DRY).
- Regression test: a hub page's chunks no longer crowd out a distinct page's strong chunk; results are one-per-page by best chunk. Fails on old path.

Verified: real-Postgres engine-parity 22/22, PGLite hermetic suite green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
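The pooling rule described in this commit can be sketched in memory. This is a minimal illustration, not the SQL that buildBestPerPagePoolCte emits: the `ChunkRow` shape, the `beats` helper, and the sample rows are all hypothetical, and the final page ordering is an assumption.

```typescript
// Hypothetical row shape for illustration; gbrain's real columns may differ.
interface ChunkRow {
  slug: string;
  page_id: number;
  chunk_id: number;
  score: number;
}

// True when `a` should represent the page instead of `b`:
// score DESC, then page_id ASC, then chunk_id ASC.
function beats(a: ChunkRow, b: ChunkRow): boolean {
  if (a.score !== b.score) return a.score > b.score;
  if (a.page_id !== b.page_id) return a.page_id < b.page_id;
  return a.chunk_id < b.chunk_id;
}

// Pool the full candidate set to one row per slug, then apply the caller's limit.
function bestPerPage(rows: ChunkRow[], limit: number): ChunkRow[] {
  const best = new Map<string, ChunkRow>();
  for (const row of rows) {
    const current = best.get(row.slug);
    if (current === undefined || beats(row, current)) best.set(row.slug, row);
  }
  return Array.from(best.values())
    .sort((a, b) => b.score - a.score || a.slug.localeCompare(b.slug))
    .slice(0, limit);
}

// A hub page with three strong chunks and a distinct page with one strong chunk.
const candidates: ChunkRow[] = [
  { slug: "hub", page_id: 1, chunk_id: 10, score: 0.9 },
  { slug: "hub", page_id: 1, chunk_id: 11, score: 0.88 },
  { slug: "hub", page_id: 1, chunk_id: 12, score: 0.87 },
  { slug: "concept", page_id: 2, chunk_id: 20, score: 0.86 },
  { slug: "concept", page_id: 2, chunk_id: 21, score: 0.4 },
];

const pooled = bestPerPage(candidates, 2);
console.log(pooled.map((r) => `${r.slug}:${r.chunk_id}`).join(", "));
// prints "hub:10, concept:20"
```

A chunk-grain top-2 over the same rows would return two hub chunks and drop the concept page entirely, which is the crowding-out this commit removes.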

T2 of the retrieval-cathedral wave. A query that is a phrase from a page's title ("Greek amphitheater" -> "The Mingtang - Indoor Greek Amphitheater") matched a weak body chunk instead of being recognized as a title hit. Names of things deserve weight.

- New pure title-match.ts: isTitlePhraseMatch (contiguous token-run inside page.title OR exact full-title match). Precision guards: >= 2 content tokens OR exact full-title; stopword filter; token-boundary match (no raw substring). Reused by the eval later so production + bench can't drift.
- applyTitleBoost post-fusion stage in hybrid.ts: reads page.title (not the brittle "first chunk"), floor-ratio-gated, stamps title_match_boost for --explain, never touches base_score (the agent's dedup confidence).
- ModeBundle.title_boost knob (1.25, on in all modes - cheap gated correctness fix), search.title_boost config key, dashboard description.
- KNOBS_HASH_VERSION 6 -> 7 so a boost-on cache write can't serve a boost-off lookup; all version-pin + canonical-bundle assertions updated.
- 18 new tests (matcher 13 + stage 5); typecheck clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
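A minimal sketch of the title-phrase matcher, assuming a simple punctuation/whitespace tokenizer and a small illustrative stopword list. The real title-match.ts is not shown in this PR, so the tokenizer, the stopword set, and the choice to match the run on the raw query tokens are assumptions.

```typescript
// Illustrative stopword list; the real filter is not shown in the PR.
const STOPWORDS = new Set(["the", "a", "an", "of", "and", "in", "on", "for", "to"]);

// Lowercase and split on whitespace and common punctuation (assumed tokenizer).
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[\s\-_.,:;!?()"'\/]+/)
    .filter((tok) => tok.length > 0);
}

function isTitlePhraseMatch(query: string, title: string): boolean {
  const q = tokenize(query);
  const t = tokenize(title);
  if (q.length === 0 || t.length === 0) return false;

  // An exact full-title match is always accepted.
  if (q.join(" ") === t.join(" ")) return true;

  // Otherwise require at least two content (non-stopword) tokens.
  const contentTokens = q.filter((tok) => !STOPWORDS.has(tok));
  if (contentTokens.length < 2) return false;

  // Contiguous token run inside the title: whole tokens only, never raw substrings.
  for (let i = 0; i + q.length <= t.length; i++) {
    if (q.every((tok, j) => t[i + j] === tok)) return true;
  }
  return false;
}

const title = "The Mingtang - Indoor Greek Amphitheater";
console.log(isTitlePhraseMatch("Greek amphitheater", title)); // prints true
console.log(isTitlePhraseMatch("Greek amp", title)); // prints false (no token-boundary match)
console.log(isTitlePhraseMatch("amphitheater", title)); // prints false (one content token)
```

The two rejected queries show the precision guards at work: a raw substring does not count, and a single content word is too weak to earn a boost.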

…rch-max-pool

Free-text alias resolution for search. gbrain stored a page's chosen names in pages.frontmatter `aliases:` JSONB but search never consulted them, so a query like "Hall of Light" or "明堂" couldn't surface the "Mingtang" page.

DELIBERATELY SEPARATE from slug_aliases (re-grounded against current code):

- slug_aliases: old-slug -> canonical-slug (wikilink/get_page redirect, populated only from concept-redirect conversions)
- page_aliases: normalized free-text name -> canonical slug (search hop)

Overloading slug_aliases would muddy two distinct semantics, so this is a new table, not an extension (honors DRY by keeping concepts separate).

- src/core/search/alias-normalize.ts: ONE normalizeAlias() (NFKC + lowercase + ws-collapse + quote-strip) + normalizeAliasList() shared by the write (ingest) and read (search) paths so they match on the same key (CQ2).
- Migration v108 page_aliases (source_id, alias_norm, slug); btree (source_id, alias_norm) for indexed-equality hop, NOT ILIKE; unique TRIPLE (not source_id+alias_norm) so two pages may claim one alias: collisions reported + resolved at query time, not blocked at ingest (Codex#8). Mirror in pglite-schema.ts; Postgres fresh gets it from the migration.
- engine.resolveAliases(aliasNorms, {sourceId|sourceIds}) read + setPageAliases(slug, source, aliasNorms) write, both engines, source-scoped.
- 17 tests: normalize round-trip, collision, source-scope, replace, clear.

Ingest projection + the hybridSearch alias hop land next (T3 wiring).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
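The normalization steps named above (NFKC, lowercase, whitespace collapse, quote strip) might look roughly like the sketch below. The step order, the exact set of quote characters, stripping only surrounding quotes, and de-duplicating the list are assumptions; the real alias-normalize.ts may differ.

```typescript
// Quote characters treated as strippable in this sketch (an assumed set).
const SURROUNDING_QUOTES = /^["'\u2018\u2019\u201C\u201D]+|["'\u2018\u2019\u201C\u201D]+$/g;

// One normalizer shared by the write and read paths so both sides produce the same key.
function normalizeAlias(raw: string): string {
  return raw
    .normalize("NFKC") // fold full-width and other compatibility forms
    .toLowerCase()
    .replace(/\s+/g, " ") // collapse whitespace runs
    .trim()
    .replace(SURROUNDING_QUOTES, "") // strip leading/trailing quotes only
    .trim();
}

// Normalize a list, dropping empties and duplicates while keeping first-seen order.
function normalizeAliasList(raws: string[]): string[] {
  const seen = new Set<string>();
  for (const raw of raws) {
    const norm = normalizeAlias(raw);
    if (norm !== "") seen.add(norm);
  }
  return Array.from(seen);
}

console.log(normalizeAlias('  "Hall   of  Light" ')); // prints "hall of light"
console.log(normalizeAlias("明堂")); // prints "明堂"
console.log(normalizeAliasList(["Hall of Light", "hall  of light", ""]).length); // prints 1
```

Because ingest and search call the same function, a query typed with different spacing or casing still lands on the stored `alias_norm` by plain equality, which is what makes the indexed lookup possible.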

Wires the page_aliases data layer into ingest (write) and hybridSearch (read) so a query that is a page's declared chosen name surfaces that page, the named-thing class neither max-pool nor title-boost can fix (true synonyms with zero surface overlap: "Hall of Light" / "明堂" -> the Mingtang page).

- Ingest projection (import-file.ts): after the page write commits, normalizeAliasList(frontmatter.aliases) -> engine.setPageAliases. Always called (even []) so removing an alias clears its row; content_hash includes non-timestamp frontmatter so alias edits reach this path, not the skip branch. Fail-soft + pre-v108-safe (isUndefinedTableError swallowed).
- applyAliasHop (hybrid.ts), AFTER rerank so a named query reliably surfaces its page: FULL normalized-query exact match only (no substring/n-grams), skip >6-token prose queries, present-boost 1.10x / inject absent canonical at top-of-organic + epsilon (never absolute 1.0, D3), collisions alpha-ordered + capped at 3, fail-open on pre-v108 / lookup error (D9). Stamps alias_hit for the T4 evidence contract.
- SearchResult.alias_hit attribution field.
- 8 tests: inject/boost/CJK/no-match/long-skip/collision + ingest projection round-trip + alias-removal-clears. 73 pass across the T1/T2/T3 + import suite.

Backfill of existing pages' aliases lands as T8 (reindex --aliases).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
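The alias-hop rules listed above can be sketched as a pure function over an already-ranked list. Everything here is illustrative: the `Hit` shape, the `lookup` callback standing in for resolveAliases, the inline query normalization, and the epsilon value are assumptions, and the sketch ignores source scoping.

```typescript
// Hypothetical result shape; the real SearchResult carries many more fields.
interface Hit {
  slug: string;
  score: number;
  alias_hit?: boolean;
}

const ALIAS_BOOST = 1.1; // present-boost from the commit message
const EPSILON = 1e-6; // assumed value; the PR only says "epsilon"
const MAX_QUERY_TOKENS = 6;
const MAX_COLLISIONS = 3;

function applyAliasHop(
  query: string,
  results: Hit[],
  lookup: (aliasNorm: string) => string[], // stand-in for resolveAliases
): Hit[] {
  // Full normalized-query exact match only; long prose queries are skipped.
  const norm = query.normalize("NFKC").toLowerCase().replace(/\s+/g, " ").trim();
  if (norm === "" || norm.split(" ").length > MAX_QUERY_TOKENS) return results;

  let canonicals: string[];
  try {
    // Collisions are alpha-ordered and capped.
    canonicals = lookup(norm).slice().sort().slice(0, MAX_COLLISIONS);
  } catch {
    return results; // fail-open on lookup error (for example, a missing table)
  }
  if (canonicals.length === 0) return results;

  const topOrganic = results.reduce((max, r) => Math.max(max, r.score), 0);

  // Boost canonicals that are already present.
  const out: Hit[] = results.map((r) =>
    canonicals.indexOf(r.slug) !== -1
      ? { ...r, score: r.score * ALIAS_BOOST, alias_hit: true }
      : r,
  );

  // Inject absent canonicals just above the best organic score, never at absolute 1.0.
  for (const slug of canonicals) {
    if (!out.some((r) => r.slug === slug)) {
      out.push({ slug, score: topOrganic + EPSILON, alias_hit: true });
    }
  }
  return out.sort((a, b) => b.score - a.score);
}

const organic: Hit[] = [
  { slug: "hub", score: 0.8 },
  { slug: "other", score: 0.5 },
];
const aliasTable = (alias: string): string[] =>
  alias === "hall of light" ? ["mingtang"] : [];

console.log(applyAliasHop("Hall  of Light", organic, aliasTable).map((r) => r.slug).join(", "));
// prints "mingtang, hub, other"
```

Injecting relative to the top organic score keeps the injected page first without pretending it earned a perfect score, which matters once downstream consumers read the score as confidence.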

… per-call mode (T4)

The agent-facing fix for the incident's ROOT behavior: tonight the agent read a single blended 0.64 score, decided "no strong match, safe to write a new page", and wrote a duplicate on a developed concept page. A blended RRF/cosine score is not a calibrated probability, so the don't-duplicate decision must key off WHY a page matched, not a raw number.

- evidence.ts: classifyEvidence (alias_hit > exact_title_match > high_vector_match > keyword_exact > weak_semantic) + createSafetyFor (exists | probable | unknown). stampEvidence runs at the end of every hybrid return path (main + both keyword fallbacks). SearchResult gains evidence + create_safety. The agent keys don't-duplicate off create_safety='exists', not a score threshold.
- search op → cheap-hybrid everywhere (D4/D15): full vector+keyword+RRF+pool+title+alias, expansion OFF (no per-call LLM cost); `query` stays full-control. search.mcp_keyword_only escape hatch (D17) keeps the old keyword-only behavior for operators who don't want query text sent to an embedding provider.
- Alias hop + evidence now also run on the keyword-only fallback paths (the named-thing fix is most valuable exactly when vector is unavailable).
- Per-call `mode` (D5): honored ONLY for local/trusted callers (ctx.remote === false) so a remote OAuth client can't escalate to costly tokenmax; local + unknown mode rejects loudly; threaded into resolveSearchMode + the cache key.
- 30 tests (evidence classifier incl. before/after-incident cases, per-call mode gate, alias hop). Updated mcp-eval-capture to the new cheap-hybrid contract.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
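A rough sketch of the evidence contract follows. The precedence order comes from the commit message; the `MatchSignals` shape, the 0.85 floor for high_vector_match, and above all the mapping from evidence label to create_safety are assumptions made for illustration, not the shipped evidence.ts.

```typescript
type Evidence =
  | "alias_hit"
  | "exact_title_match"
  | "high_vector_match"
  | "keyword_exact"
  | "weak_semantic";

type CreateSafety = "exists" | "probable" | "unknown";

// Hypothetical per-result signals; the real inputs are not shown in the PR.
interface MatchSignals {
  aliasHit: boolean;
  exactTitleMatch: boolean;
  keywordExact: boolean;
  baseScore: number;
}

const HIGH_VECTOR_FLOOR = 0.85; // assumed threshold

// Precedence: alias_hit > exact_title_match > high_vector_match > keyword_exact > weak_semantic.
function classifyEvidence(s: MatchSignals): Evidence {
  if (s.aliasHit) return "alias_hit";
  if (s.exactTitleMatch) return "exact_title_match";
  if (s.baseScore >= HIGH_VECTOR_FLOOR) return "high_vector_match";
  if (s.keywordExact) return "keyword_exact";
  return "weak_semantic";
}

// Assumed mapping: name-level evidence means the page exists; strong but
// unnamed evidence is only probable; everything else is unknown.
function createSafetyFor(evidence: Evidence): CreateSafety {
  switch (evidence) {
    case "alias_hit":
    case "exact_title_match":
      return "exists";
    case "high_vector_match":
    case "keyword_exact":
      return "probable";
    default:
      return "unknown";
  }
}

// The incident: a blended 0.64 score, but the query was a phrase in the title.
const incident: MatchSignals = {
  aliasHit: false,
  exactTitleMatch: true,
  keywordExact: false,
  baseScore: 0.64,
};
console.log(createSafetyFor(classifyEvidence(incident))); // prints "exists"
```

Under these assumptions the same 0.64 result that previously read as "no strong match" now carries an explicit reason, so an agent has no need to interpret the number at all.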

After T4 made the `search` op cheap-hybrid, `gbrain search "x"` already does the right thing — but `gbrain search modes/stats/tune` would have run a hybrid search for the literal word "modes" instead of opening the config dashboard (the op intercepts before the unreachable handleCliOnly dashboard path). Add a pre-dispatch interception in main(): `search` + subArgs[0] in {modes,stats,tune} → runSearch dashboard (with the v0.41.6.0 read-only connect+ dispatch 10s timeout preserved); everything else (free-text) falls through to the cheap-hybrid `search` op. Subprocess test pins all three routes: modes/stats → dashboard, free-text → search op ("No results", not "Unknown subcommand"). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
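The pre-dispatch interception amounts to a small routing decision on the first sub-argument. The sketch below is illustrative only: the `Route` type and `routeSearch` name are hypothetical, and the real main() also handles timeouts and connection setup.

```typescript
// Hypothetical route type for illustration.
type Route =
  | { kind: "dashboard"; subcommand: string }
  | { kind: "search"; text: string };

// Reserved words that open the config dashboard instead of running a search.
const DASHBOARD_SUBCOMMANDS = new Set(["modes", "stats", "tune"]);

function routeSearch(subArgs: string[]): Route {
  const first = subArgs[0];
  if (first !== undefined && DASHBOARD_SUBCOMMANDS.has(first)) {
    return { kind: "dashboard", subcommand: first };
  }
  // Everything else is free text for the cheap-hybrid search op.
  return { kind: "search", text: subArgs.join(" ") };
}

console.log(routeSearch(["stats"]).kind); // prints "dashboard"
console.log(routeSearch(["Greek amphitheater"]).kind); // prints "search"
console.log(routeSearch(["modes of transport"]).kind); // prints "search"
```

The third call shows why the check is on the whole first argument: a quoted phrase that merely begins with a reserved word is still treated as free text.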

The eval that makes the retrieval-maxpool incident impossible to reintroduce silently. 7 query families, each a failure class the incident exposed: title-substring, generic-to-named, alias-synonym, multi-chunk-dilution, short-vs-rich, graph-relationship, hard-negative.

- src/eval/retrieval-quality/harness.ts: pure scoring (Hit@1/Hit@3/MRR per family) + injected SearchFn (CLI uses hybridSearch; tests stub it) + evaluateGate. D12 gate: hard-gate the families that ARE the bug from day one (title-substring Hit@1>=0.95, alias-synonym Hit@1>=0.98, dilution Hit@3=1.0), warn-then-enforce the softer families. Env-overridable floors.
- `gbrain eval retrieval-quality <fixture.jsonl> [--json] [--source]` + dispatch in eval.ts. Exit 0 PASS / 1 FAIL / 2 USAGE.
- Synthetic fixture (placeholder names only, privacy-grep guarded) + hermetic gate test: seeds a synthetic brain, forces the keyword+title+alias path (embed transport stubbed to throw: free, deterministic), asserts the bug families pass. The vector max-pool guarantee is pinned separately by searchvector-maxpool.test.ts.
- CI gate: the hermetic test is a normal unit test, so it runs in every PR shard; the gate is live on every change.
- 23 tests (harness unit + hermetic gate + fixture privacy guard).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
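The per-family scoring and the hard gate can be sketched as two pure functions. The `EvalCase` and `FamilyScore` shapes are hypothetical, and mapping the "dilution" floor to the `multi-chunk-dilution` family name is an assumption; only the three floor values come from the commit message.

```typescript
// Hypothetical fixture row: the expected slug and the ranked slugs a search returned.
interface EvalCase {
  family: string;
  expected: string;
  ranked: string[];
}

interface FamilyScore {
  n: number;
  hitAt1: number;
  hitAt3: number;
  mrr: number;
}

function scoreFamilies(cases: EvalCase[]): Map<string, FamilyScore> {
  const sums = new Map<string, FamilyScore>();
  for (const c of cases) {
    let s = sums.get(c.family);
    if (s === undefined) {
      s = { n: 0, hitAt1: 0, hitAt3: 0, mrr: 0 };
      sums.set(c.family, s);
    }
    const rank = c.ranked.indexOf(c.expected) + 1; // 0 means "not found"
    s.n += 1;
    if (rank === 1) s.hitAt1 += 1;
    if (rank >= 1 && rank <= 3) s.hitAt3 += 1;
    if (rank >= 1) s.mrr += 1 / rank;
  }
  // Convert counts to rates per family.
  const out = new Map<string, FamilyScore>();
  sums.forEach((s, family) => {
    out.set(family, { n: s.n, hitAt1: s.hitAt1 / s.n, hitAt3: s.hitAt3 / s.n, mrr: s.mrr / s.n });
  });
  return out;
}

interface Floor {
  family: string;
  metric: "hitAt1" | "hitAt3";
  min: number;
}

// Hard-gated floors from the commit message.
const HARD_FLOORS: Floor[] = [
  { family: "title-substring", metric: "hitAt1", min: 0.95 },
  { family: "alias-synonym", metric: "hitAt1", min: 0.98 },
  { family: "multi-chunk-dilution", metric: "hitAt3", min: 1.0 },
];

// Returns one message per violated floor; an empty array means the gate passes.
function evaluateGate(scores: Map<string, FamilyScore>, floors: Floor[] = HARD_FLOORS): string[] {
  const failures: string[] = [];
  for (const f of floors) {
    const s = scores.get(f.family);
    if (s === undefined) continue; // family absent from the fixture: not gated in this sketch
    if (s[f.metric] < f.min) failures.push(`${f.family} ${f.metric}=${s[f.metric]} < ${f.min}`);
  }
  return failures;
}

const sampleCases: EvalCase[] = [
  { family: "title-substring", expected: "a", ranked: ["a", "b"] },
  { family: "title-substring", expected: "a", ranked: ["b", "a"] },
  { family: "alias-synonym", expected: "m", ranked: ["m"] },
];
console.log(evaluateGate(scoreFamilies(sampleCases)));
// prints [ "title-substring hitAt1=0.5 < 0.95" ]
```

Keeping the scoring pure and passing the search function in from outside is what lets the same harness run against a stub in CI and against the real pipeline from the CLI.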

Standing observability so a retrieval regression is caught before a human hits it in chat (like tonight). Aggregate columns on search_telemetry (NOT per-query rows, D10): sum_rank1_score + count_rank1 + 3 coarse buckets (<0.6 / 0.6-0.85 / >=0.85). The mean rank-1 base_score is the headline; a downward drift = retrieval quality regressing.

- hybrid.ts: capture rank-1 base_score at all three return paths, thread through emitMeta → recordSearchTelemetry opts (like results_count).
- telemetry.ts: Bucket + record + flush ON CONFLICT-add + readSearchStats expose avg_rank1_score (null when no samples, so no NaN) + rank1_distribution.
- Migration v109 ADD COLUMN IF NOT EXISTS (both engines; search_telemetry lives only in migration v57, so the v57+v109 chain covers fresh + upgrade). Columns exempted in schema-bootstrap-coverage (no forward-ref index → no bootstrap need).
- `gbrain search stats` surfaces the avg + bucket line; JSON envelope auto-carries the fields. "true-positive" wording dropped per Codex#14: production has no labels, so this is an unlabeled rank-1 score histogram; labeled calibration lives in NamedThingBench (T6).
- 3 round-trip tests (mean+buckets, no-result excluded, empty=null).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
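The aggregate described above (a running sum, a count, and three coarse buckets) can be sketched without any database. The field names and helper functions here are hypothetical stand-ins for the columns and telemetry.ts; only the bucket boundaries and the null-when-empty rule come from the commit message.

```typescript
// In-memory stand-in for the aggregate columns (names are illustrative).
interface Rank1Aggregate {
  sum: number;
  count: number;
  low: number; // score < 0.6
  mid: number; // 0.6 <= score < 0.85
  high: number; // score >= 0.85
}

function emptyAggregate(): Rank1Aggregate {
  return { sum: 0, count: 0, low: 0, mid: 0, high: 0 };
}

// Record the rank-1 base score of one search; no-result searches pass null and are excluded.
function recordRank1(agg: Rank1Aggregate, score: number | null): void {
  if (score === null) return;
  agg.sum += score;
  agg.count += 1;
  if (score < 0.6) agg.low += 1;
  else if (score < 0.85) agg.mid += 1;
  else agg.high += 1;
}

// Mean rank-1 score, or null when there are no samples (avoids NaN from 0/0).
function avgRank1Score(agg: Rank1Aggregate): number | null {
  return agg.count === 0 ? null : agg.sum / agg.count;
}

const agg = emptyAggregate();
recordRank1(agg, 0.64); // the incident's score lands in the middle bucket
recordRank1(agg, 0.9);
recordRank1(agg, null); // a search with no results

console.log(agg.count, agg.low, agg.mid, agg.high); // prints 2 0 1 1
console.log(avgRank1Score(emptyAggregate())); // prints null
```

Storing only additive counters means concurrent writers can merge by addition, which fits the ON CONFLICT-add flush mentioned above.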

Import-time projection (T3) covers new + changed pages; this backfills EXISTING pages whose frontmatter `aliases:` predate v108 / the projection. Walks listAllPageRefs (cheap cross-source (source_id, slug) enumeration), reads each page's frontmatter aliases, writes page_aliases via setPageAliases. Idempotent (setPageAliases replaces) so re-running is convergent — no op-checkpoint needed (fast, no embedding). --dry-run reports would-write counts, --source narrows, --limit caps, --json envelope, progress reporter. Wired into the `reindex` dispatch alongside --markdown / --multimodal. 4 tests: backfill from array + comma-scalar frontmatter, --dry-run writes nothing, idempotent second run. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
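A minimal sketch of the backfill loop, assuming pages are already enumerated in memory. `PageRef`, `readAliases`, `backfillAliases`, and the `Map` standing in for the page_aliases table are hypothetical; normalization is omitted to keep the sketch short.

```typescript
// Hypothetical page reference with parsed frontmatter.
interface PageRef {
  source_id: string;
  slug: string;
  frontmatter: Record<string, unknown>;
}

// Accept both frontmatter shapes: an array of strings or a comma-separated scalar.
function readAliases(frontmatter: Record<string, unknown>): string[] {
  const raw = frontmatter["aliases"];
  const items: unknown[] = Array.isArray(raw)
    ? raw
    : typeof raw === "string"
      ? raw.split(",")
      : [];
  const out: string[] = [];
  for (const item of items) {
    if (typeof item === "string" && item.trim() !== "") out.push(item.trim());
  }
  return out;
}

// Replace (never append) each page's alias set, so a re-run converges to the same state.
// Returns the number of aliases written, or that would be written under dryRun.
function backfillAliases(
  pages: PageRef[],
  store: Map<string, string[]>, // stand-in for the page_aliases table
  dryRun: boolean,
): number {
  let total = 0;
  for (const page of pages) {
    const aliases = readAliases(page.frontmatter);
    total += aliases.length;
    if (!dryRun) store.set(`${page.source_id}/${page.slug}`, aliases);
  }
  return total;
}

const pages: PageRef[] = [
  { source_id: "default", slug: "mingtang-example", frontmatter: { aliases: ["Hall of Light", "明堂"] } },
  { source_id: "default", slug: "other-example", frontmatter: { aliases: "First Name, Second Name" } },
];

const table = new Map<string, string[]>();
console.log(backfillAliases(pages, table, true), table.size); // prints 4 0
console.log(backfillAliases(pages, table, false), table.size); // prints 4 2
console.log(backfillAliases(pages, table, false), table.size); // prints 4 2
```

The third call returning the same state as the second is the idempotence property that lets the command skip checkpointing.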

Pins that pre-v108 brains (no page_aliases table) keep working: applyAliasHop returns input unchanged + doesn't throw, importFromContent with frontmatter aliases still imports (projection swallows table-missing via isUndefinedTableError), and resolveAliases surfaces the error for the caller to catch. Completes the T9 mandatory regression set (dilution → searchvector-maxpool, dispatch → cli-search-dispatch, MCP contract → mcp-eval-capture, engine parity → engine-parity 22/22, pre-migration → here). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… (T0)

The operator-facing trace the user runs against the production brain to pin which retrieval layer surfaces (or misses) a target page. This is the diagnostic the plan front-loaded so we don't ship a fix that doesn't move the incident.

`gbrain search diagnose "<query>" --target <slug> [--json] [--source]` reports, for the target: keyword rank+score, vector rank+score (skipped/graceful if no embedding provider), whether the query is a registered alias, and the hybrid final rank + evidence + create_safety + which boosts fired (title/alias). The verdict names the layer that surfaces the target at rank 1 (or "none"), telling you whether the lever is max-pool/innerLimit (vector) vs title/alias. Wired into the `search` dispatch alongside modes/stats/tune (60s timeout since it runs real retrieval). 2 hermetic tests (alias-query trace + title-phrase trace).

For the Mingtang incident, run:

gbrain search diagnose "Greek amphitheater" --target projects/new-greek-theater/concept_v0

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ssary (T10)

- RETRIEVAL_MAXPOOL_INCIDENT.md: replaces closed PR #1616's RFC with the verified record: what happened, the disease, the corrections to the RFC's mechanics (search was keyword-only, --mode unthreaded, hybrid already pooled at dedup, aliases dead to search), the four-layer fix that shipped, and the triage commands (search diagnose / reindex --aliases / search stats / eval retrieval-quality).
- RETRIEVAL.md: new "Named-thing retrieval" section documenting per-page pool + title boost + alias hop + the evidence contract, reconciling the doc with the shipped pipeline (closes the doc/reality gap).
- metric-glossary.ts + regenerated METRIC_GLOSSARY.md: Hit@1, Hit@3, avg_rank1_score (drift signal, not labeled accuracy), and create_safety (the evidence contract) now carry plain-English glossary entries.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…fixup)

The banned-name literal list itself tripped check-privacy/check-test-real-names. Replace it with the load-bearing assertion: every fixture slug must be an *-example placeholder (no real brain page can be referenced).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… adversarial)

Codex outside-voice caught two source-isolation P0s in the retrieval wave, the exact class the v0.34.1 seal guards. Both fixed before merge.

P0-1: buildBestPerPagePoolCte pooled on `slug` alone. In a federated brain, two pages with the same slug in different sources collapsed before ranking/pagination (the neighbor-source page dropped). Now DISTINCT ON (COALESCE(source_id,'default'), slug), a composite key matching dedup.ts's pageKey. Also fixes the PRE-EXISTING keyword-path bug (best_per_page was slug-only before this wave); real-PG parity 23/23.

P0-2: the alias hop dropped source_id. resolveAliases returned bare slugs and applyAliasHop hydrated via getPage(slug, undefined), so a federated caller could get the default-source page injected or the right allowed-source page suppressed. resolveAliases now returns {slug, source_id} pairs; applyAliasHop matches by (source_id, slug) and fetches each canonical in its OWN source.

Regression tests: alias hop boosts only the aliased source (not same-slug in another source); resolveAliases keeps cross-source same-slug distinct.

Deferred as documented tradeoffs (TODO): evidence high_vector_match label uses blended base_score not pure cosine; deep-pagination candidate budget is chunk-bounded; telemetry writes swallow errors pre-v109 on rolling deploys.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
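The P0-1 fix comes down to the pooling key. The sketch below contrasts a slug-only key with a composite key in memory; `PooledRow`, `poolBy`, the `::` separator, and the sample sources are illustrative assumptions, not the shipped SQL or dedup.ts.

```typescript
// Hypothetical row shape; source_id may be null for legacy rows.
interface PooledRow {
  source_id: string | null;
  slug: string;
  score: number;
}

// Composite key mirroring COALESCE(source_id, 'default') plus slug.
// The "::" separator is arbitrary for this sketch.
function pageKey(row: PooledRow): string {
  return `${row.source_id ?? "default"}::${row.slug}`;
}

// Keep the best-scoring row per key.
function poolBy(rows: PooledRow[], keyOf: (row: PooledRow) => string): PooledRow[] {
  const best = new Map<string, PooledRow>();
  for (const row of rows) {
    const key = keyOf(row);
    const current = best.get(key);
    if (current === undefined || row.score > current.score) best.set(key, row);
  }
  return Array.from(best.values());
}

// Three distinct pages that happen to share a slug across sources.
const federated: PooledRow[] = [
  { source_id: "team-a", slug: "concepts/hall", score: 0.9 },
  { source_id: "team-b", slug: "concepts/hall", score: 0.7 },
  { source_id: null, slug: "concepts/hall", score: 0.5 },
];

console.log(poolBy(federated, (r) => r.slug).length); // prints 1 (the old slug-only bug)
console.log(poolBy(federated, pageKey).length); // prints 3
```

With the slug-only key, a caller scoped to team-b would never see its own page because team-a's higher-scoring row had already won the pool.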

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Version trio + CHANGELOG header + CLAUDE.md key-file annotations + TODOS heading + regenerated llms bundles, all moved to 0.41.34.0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…rch-max-pool # Conflicts: # CHANGELOG.md # CLAUDE.md # TODOS.md # VERSION # llms-full.txt # package.json # src/core/migrate.ts

…rch-max-pool # Conflicts: # CHANGELOG.md # CLAUDE.md # TODOS.md # VERSION # llms-full.txt # package.json # src/core/migrate.ts # src/core/search/hybrid.ts

Two CI failures surfaced after the master merges that brought the branch to 111 migrations:

1. shard 1, `ALL_METRICS roster > matches the renderer output (no orphans)`: the merge took master's `renderMetricGlossaryMarkdown` whose `groups` array lacked this branch's 4 retrieval-quality keys (hit@1, hit@3, avg_rank1_score, create_safety). `ALL_METRICS` (derived via Object.keys) kept them, so the roster test saw 4 orphans. The freshness check (check:eval-glossary) passed because renderer-output == committed doc (it can't catch a renderer that drops a metric; the roster test can). Restored the "Retrieval-Quality / Evidence Metrics (NamedThingBench)" group + regenerated docs/eval/METRIC_GLOSSARY.md.
2. shard 2: facts-anti-loop's two engine-dependent put_page tests failed while the two engine-free extractFactsFromTurn tests passed (the signature of a partially-failed beforeAll). This file has a documented PGLite-cold-start-under-deep-shard-load timeout history; the 30s budget was tuned for 95 migrations and the chain is now 111. Bumped to 60s.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Follow-up to the prior hook-timeout bump, which was the wrong theory: the [58ms]/[71ms] body times in the re-run prove beforeAll did NOT time out — the engine connects and the two put_page tests run and fail for real, while the two engine-free extractFactsFromTurn tests in the same file pass. put_page (via dispatchToolCall) touches process-global singletons (the facts queue + the AI gateway used by importFromContent's embed step). Some sibling file in the 78-file shard-2 process leaves residual global state that makes put_page's pre-backstop path fail on the CI runner. The failure is NOT reproducible alone, in a Linux oven/bun:1 container, or in a full local shard-2 run (1172 pass) — only on the GitHub runner, deterministically. Per CLAUDE.md's test-isolation rules, a test coupled to shared process state belongs in its own process. Renamed to *.serial.test.ts so it runs in the dedicated serial-tests job (scripts/run-serial-tests.sh spawns a fresh `bun test` per serial file), where it passes deterministically; test-shard.sh excludes serial files from the matrix. Updated the comment to reflect the real cause and refreshed the test-weights.json key. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The prior serial-move theory was incomplete. The real, single root cause behind all three shard failures (2, 5, 10) is cross-file AI-gateway config pollution within a shard's bun process:

- A test calls configureGateway() and doesn't restore the gateway on exit. The legacy-embedding preload pins OpenAI/1536 ONCE at process start and re-pins per-test ONLY when the gateway slot is empty, so a leaker that reconfigured the gateway to the v0.37 default (zeroentropyai:zembed-1 / 1280-d) and never reset poisons every later file in the shard.
- Victim A (shard 5, test/search/searchvector-maxpool.test.ts): runs initSchema in beforeAll under the leaked gateway → content_chunks.embedding becomes vector(1280) → inserting its hardcoded 1536-d basis vectors throws pgvector CheckExpectedDim.
- Victims B/C (shard 10 facts-backstop-gating, shard 2 facts-anti-loop): put_page's importFromContent embeds by design (embed failure PROPAGATES, Codex C2). Under a leaked fake-key gateway the embed step 401s and put_page returns isError → the backstop assertions fail.

My branch's shard re-partition (added test files + weight changes) merely co-located leakers with victims; the hazard was latent.

Fixes (root cause + self-sufficient victims):

- test/search/rerank.test.ts (the shard-5 leaker): add afterAll(resetGateway). Its stub omits embedding_model, so it fell back to the ZE/1280 default; now it restores the empty slot so the preload re-pins legacy for the next file.
- test/search/searchvector-maxpool.test.ts: pin configureGateway(openai/1536) in beforeAll BEFORE initSchema (initSchema runs before any preload beforeEach, so it can't rely on the inherited slot).
- test/facts-backstop-gating.test.ts + test/facts-anti-loop.test.ts: reset the gateway in beforeEach so put_page's embed is a graceful no-op; reverted anti-loop from the serial quarantine back into the matrix (the serial move was the wrong fix for a gateway-state problem).

Validated deterministically: a non-resetting leaker that poisons the gateway to ZE, run first in one bun process, no longer breaks any of the three victims (14/14 pass). verify 29/29, typecheck clean, isolation lint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Upstream shipped again after the last sync (v0.41.34.0 "retrieval cathedral" — garrytan#1657), which took version 0.41.34.0 (ours) and added migrations v110 (page_aliases) + v111 (search_telemetry_rank1_columns). Re-synced so PR garrytan#1388 merges clean. Re-resolutions: - Version: bumped 0.41.34.0 → 0.41.35.0 (next after upstream's 0.41.34.0). - Migration: our links_link_source_widen renumbered v110 → v112 (upstream now owns v110 + v111). Full-union CHECK preserved ('markdown','frontmatter','manual','mentions','wikilink-resolved'). - CHANGELOG: our 0.41.35.0 entry on top, upstream's 0.41.34.0 below. - schema-embedded.ts + llms-full.txt regenerated; link_source union verified intact in schema.sql + pglite-schema.ts after auto-merge against the new page_aliases / alias-hop schema. Verified: typecheck clean, privacy/jsonb/source-id guards pass, 376 surface + E2E tests green; migrations apply [110] page_aliases ... [112] wikilink-basename. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* upstream/master: v0.41.36.0 feat(mcp): publish agent skills (list_skills / get_skill) for thin clients (garrytan#1661) v0.41.35.0 feat(guardrails): vendor-neutral content guardrail seams (supersedes garrytan#1652) (garrytan#1660) v0.41.34.0 feat(search): retrieval cathedral — max-pool + title + alias + evidence (garrytan#1657) v0.41.33.0 feat(search): intent-aware adaptive return-sizing + agent-facing query param (garrytan#1640) v0.41.32.0 fix(staleness): commit-relative sync staleness (supersedes garrytan#1623) (garrytan#1656) v0.41.31.0 feat(embed): delta-aware sync --all cost gate + real stale-embedding semantics (garrytan#1632) v0.41.30.0 fix(brainstorm/lsd): --save writes the advertised .md file via canonical ingestion path (garrytan#1655) # Conflicts: # src/core/operations.ts

garrytan and others added 18 commits May 29, 2026 20:23

Merge remote-tracking branch 'origin/master' into garrytan/vector-sea…

ea86a3c

…rch-max-pool

chore: bump version and changelog (v0.41.30.0)

8ce525e

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

docs: v0.41.30.0 retrieval cathedral — CLAUDE.md key files + llms regen

4b20f86

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chore: renumber release v0.41.30.0 → v0.41.34.0 (queue moved)

148ebfd

Version trio + CHANGELOG header + CLAUDE.md key-file annotations + TODOS heading + regenerated llms bundles, all moved to 0.41.34.0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

garrytan changed the title ~~v0.41.30.0 feat(search): retrieval cathedral — max-pool + title + alias + evidence~~ v0.41.34.0 feat(search): retrieval cathedral — max-pool + title + alias + evidence May 30, 2026

garrytan and others added 5 commits May 30, 2026 10:42

Merge remote-tracking branch 'origin/master' into garrytan/vector-sea…

967e011

…rch-max-pool # Conflicts: # CHANGELOG.md # CLAUDE.md # TODOS.md # VERSION # llms-full.txt # package.json # src/core/migrate.ts

Merge remote-tracking branch 'origin/master' into garrytan/vector-sea…

0c9fb4b

…rch-max-pool # Conflicts: # CHANGELOG.md # CLAUDE.md # TODOS.md # VERSION # llms-full.txt # package.json # src/core/migrate.ts # src/core/search/hybrid.ts

garrytan merged commit d6db3f0 into master May 30, 2026
21 checks passed

This was referenced Jun 8, 2026

feat: pluggable embedding adapter with dynamic schema dimensions #49

Closed

Add optional Gemini embeddings support #58

Closed

feat: multi-provider embedding support (7 providers incl. China: DashScope, DeepSeek, Zhipu) #59

Closed

garrytan merged 23 commits into master from garrytan/vector-search-max-pool
