v0.40.2.0 feat: trajectory routing for temporal + knowledge_update (gbrain think + LongMemEval) by garrytan · Pull Request #1296 · garrytan/gbrain

garrytan · 2026-05-22T15:50:59Z

Summary

v0.40.2.0 trajectory routing wave — closes the gap between gbrain's
typed-claim substrate (shipped v0.35.7, currently dormant at answer-gen
time) and the production gbrain think surface that should be grounding
temporal/knowledge-update answers with it.

Six commits, ~3.1K LOC, 81 new tests:

- feat(facts) (Commit 1): Substrate. Migration v82 adds nullable
  facts.event_type. TrajectoryPoint.event_type + TrajectoryOpts.kind
  filter. New shared src/core/trajectory-format.ts consumed by both
  think and longmemeval (no DRY violation). INJECTION_PATTERNS
  extended to escape adversarial </trajectory> sequences.
  Founder-scorecard + eval-trajectory pass kind: 'metric' explicitly
  for clarity (no behavior change; they already skipped NULL-metric rows).
- feat(think) (Commit 2): gbrain think trajectory integration,
  default ON. New src/core/think/intent.ts (regex-first classifier,
  no LLM call on the 'other' fast path). New
  src/core/think/entity-extract.ts shared with longmemeval.
  Per-candidate findTrajectory with 5s Promise.race timeout + concurrency
  cap 3. MCP think op handler extracts sourceScopeOpts(ctx) for
  federated-read OAuth client scoping. think.trajectory_enabled=false
  is the kill switch.
- feat(longmemeval) (Commit 3): Inline Haiku claim extractor.
  Content-hash cache (cuts 3-iteration benchmark run from ~$1.50 to
  ~$0.50 when sessions repeat across questions). Per-question alias
  map collapses "Marco" + "Marco Smith" + "marco" to one slug; fresh
  map per question, no cross-question leak. Fail-open on malformed
  JSON / Haiku throw / insert collision.
- feat(longmemeval) (Commit 4): Intent routing + prompt splice +
  methodology disclosure. Per-question classifyIntent prefers
  dataset's question_type field, falls back to the shared regex set
  from think (single source of truth). --no-trajectory CLI flag for
  A/B baselining. JSON envelope adds intent, trajectory_points,
  entity_resolved, resolution_source, methodology_note: "extractor=haiku-preprocess-full-haystack-v1" per the Codex D1
  disclosure contract: the published temporal-reasoning number is
  "gbrain + Haiku-preprocess pipeline" vs "gbrain alone", NOT
  directly comparable to LongMemEval's published baselines without
  this note.
- merge master + chore: bump: master jumped to v0.38.0.0
  (ingestion cathedral: gbrain capture + write-through + IngestionSource contract, #1275) mid-wave; my v81 facts_event_type_column
  migration renumbered to v82. Engine + test code updated to v82.
  Master's v81 (pages_provenance_columns) tests remain intact.
- docs: CLAUDE.md, README.md, AGENTS.md updated for the new
  surface. llms-full.txt regenerated.
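
The regex-first classifier from Commit 2 can be pictured roughly as follows. This is a minimal sketch, not the code in src/core/think/intent.ts: the patterns and the name `classifyIntentSketch` are illustrative assumptions, and only the three intent labels, the knowledge-update-over-temporal precedence, and the non-string defense come from this PR.

```typescript
type IntentSketch = "temporal" | "knowledge_update" | "other";

// Illustrative patterns only; the real regex set is not shown in this PR.
const KNOWLEDGE_UPDATE_PATTERNS: RegExp[] = [
  /\b(current|currently|latest|now|most recent)\b/i,
  /\b(changed|updated|still)\b/i,
];
const TEMPORAL_PATTERNS: RegExp[] = [
  /\bwhen\b/i,
  /\bhow long (ago|since)\b/i,
  /\b(before|after|since|until)\b/i,
];

// Pure function: no LLM call and no SQL on any path.
function classifyIntentSketch(question: unknown): IntentSketch {
  // Defensive: non-string or blank input routes to the 'other' fast path.
  if (typeof question !== "string" || question.trim() === "") return "other";
  // Knowledge-update is checked first so it wins when both sets match.
  if (KNOWLEDGE_UPDATE_PATTERNS.some((re) => re.test(question))) return "knowledge_update";
  if (TEMPORAL_PATTERNS.some((re) => re.test(question))) return "temporal";
  return "other";
}
```

Because the patterns carry no `g` flag, `RegExp.test` keeps no state between calls, so the classifier stays deterministic across questions.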

Test Coverage

| Test file | Cases |
| --- | --- |
| `test/trajectory-format.test.ts` | 17 (grouping, caps, supersession annotation, adversarial `</trajectory>` escape) |
| `test/engine-parity-event-type.test.ts` | 6 (PGLite round-trip + kind filter matrix) |
| `test/regressions/v0_40_2_0-trajectory-backcompat.test.ts` | 4 (byte-identical founder-scorecard math with/without event rows) |
| `test/think-intent.test.ts` | 14 (temporal/KU/other, precedence, non-string defense) |
| `test/think-entity-extract.test.ts` | 10 (retrieved-slug + noun-phrase sources, stop-word + leading-verb stripping, dedup) |
| `test/think-trajectory-injection.test.ts` | 7 (intent routing, kill switches, empty-trajectory skip, throw caught by Promise.allSettled) |
| `test/longmemeval-extract.test.ts` | 13 (JSON repair, alias map per-question scope, content-hash cache, fail-open) |
| `test/longmemeval-intent.test.ts` | 9 (dataset label → Intent mapping for all 6 LongMemEval labels) |
| `test/longmemeval-trajectory-routing.test.ts` | 4 (end-to-end stubbed, methodology_note presence, perf gate) |

272 tests pass across 14 impacted suites; `bun run verify` clean. No regressions.

Coverage gate: every changed codepath has a corresponding test. Fail-open
paths exercised (engine throw, Haiku throw, malformed JSON, empty array,
invalid records).

Pre-Landing Review

3 review passes completed during planning, all CLEARED:

CEO Review: premise reframed (typed-claim shape vs event-chronology shape distinction); single-PR bisect-commit slicing per the "bisect commits over PR splits" rule.
Eng Review: verified migration slot, buildThinkUserMessage injection point, no back-compat filter needed (callers already defensively skip NULL-metric rows). D1 decided default-ON rollout.
Codex Outside Voice: 18 findings; 6 load-bearing folded as design decisions (alias-map wording fix, resolveEntitySlugWithSource resolution_source signal, prompt-placement preserving BOTH calibration and default ordering, INJECTION_PATTERNS for </trajectory>, 5s findTrajectory + 10s extractor timeouts, doctor check deferred to v0.40.3+, real-LLM spot-check added, success metric broadened). The benchmark methodology contamination was the load-bearing decision — accepted with explicit CHANGELOG + JSON-envelope disclosure.

Plan: ~/.claude/plans/system-instruction-you-are-working-crystalline-owl.md

Documentation

CLAUDE.md: extended the v0.35.7 trajectory entry with migration v82 + facts.event_type column + TrajectoryOpts.kind filter; documented the new shared src/core/trajectory-format.ts helper; extended src/core/think/index.ts entry with v0.40.2.0 trajectory injection (default ON, think.trajectory_enabled config key, sourceScopeOpts threading, GBRAIN_THINK_DEBUG env); extended src/commands/eval-longmemeval.ts entry with the inline Haiku extractor + intent routing + methodology disclosure note.
README.md: added a v0.40.2.0 banner explaining gbrain think now grounds temporal/knowledge-update answers in the typed-claim timeline by default, with the opt-out config key and the LongMemEval methodology note disclosed.
AGENTS.md: extended the v0.35.7 trajectory bullet to note gbrain think now uses this substrate automatically + the kind: 'event' | 'all' filter for non-metric event rows.
llms-full.txt: regenerated via `bun run build:llms` to match CLAUDE.md edits (CI `test/build-llms.test.ts` gate).

Test plan

All 272 trajectory + think + longmemeval + impacted-suite tests pass
`bun run verify` clean (17 pre-checks)
Migration v82 applies cleanly on both PGLite + Postgres parity test
CHANGELOG + VERSION + package.json all agree at v0.40.2.0
llms-full.txt regenerated to match CLAUDE.md edits

Open follow-ups for v0.40.3+

trajectory_health doctor check (deferred per Codex P16 — premature on a column that's mostly NULL in production until users populate event_type via cycle phase).
Trajectory injection in gbrain auto-think, gbrain dream synthesize, calibration recall-footer.
Structured event fields (event_type TEXT alone is impoverished for things like "moved to SF" → needs object/actor/location).
Production extract_facts cycle phase event extraction (so production users get event rows in their facts table without manual seeding).
Real-LLM full LongMemEval run (3 seeds per condition, paired-bootstrap CI) — the methodology spec is in the plan file; the actual numbers come from the post-merge measurement run.

🤖 Generated with Claude Code

…2.0 Commit 1)

Substrate work for v0.40.2.0 Track B (trajectory routing for temporal + knowledge_update). This commit lands the schema + the shared formatter; think wiring + LongMemEval extractor + intent routing come in Commits 2-4.

Migration v81 (facts_event_type_column): ALTER TABLE facts ADD COLUMN event_type TEXT (nullable, metadata-only). Lets the v0.35.4 typed-claim substrate carry event-shaped rows (event_type='meeting'/'job_change'/'location_change') alongside the metric-shaped rows (claim_metric/claim_value etc) it has carried since v67. Temporal-reasoning questions ("when did I last meet Marco") need the event shape; the metric shape doesn't fit them.

Engine changes (pglite + postgres parity):
- TrajectoryPoint.event_type: string | null added; projection in both findTrajectory SQL paths returns the column.
- TrajectoryOpts.kind?: 'metric' | 'event' | 'all' added (default 'all'). Defensive opt that future-proofs filtering once event rows accumulate.
- Both engines apply the new kind filter at SQL level when set.

Back-compat (codex outside-voice concern): Existing callers (founder-scorecard, eval-trajectory) already defensively skip metric === null rows in their per-metric math. Event-only rows (metric=NULL, event_type='meeting') ride through invisibly to those callers, verified by the new regression test that asserts byte-identical computeFounderScorecard + computeTrajectoryStats output with and without event rows in the input. Both callers now pass kind:'metric' explicitly for call-site clarity (no behavior change).

MCP find_trajectory op:
- event_type added to the wire-shape map.
- kind param added to the op declaration (enum metric/event/all).

Shared formatter (src/core/trajectory-format.ts, new): formatTrajectoryBlock(points, entitySlug, opts) is a sibling shape to renderTakesBlock + renderChatBlock. Groups by (metric ?? event_type). Per-metric cap 20, total cap 100 (prompt-budget guardrail). For knowledge_update intent, annotates value-change rows with "(superseded prior)", the explicit signal codex flagged was missing from default RRF-ordered retrieval. Promoted to src/core/ so both gbrain think (Commit 2) and the LongMemEval harness (Commit 4) consume one source of truth.

Prompt-injection coverage (codex Problem 10): src/core/think/sanitize.ts INJECTION_PATTERNS extended with three new entries (close-trajectory, open-trajectory, xml-attr-inject) so adversarial </trajectory> sequences in extracted text get escaped before reaching the model. Parity with the existing </take> coverage.

Tests (all hermetic, no DATABASE_URL):
- test/trajectory-format.test.ts (17 cases, all green): grouping, caps, sanitization, supersession annotation, determinism, provenance, text-cap, adversarial </trajectory> escape.
- test/engine-parity-event-type.test.ts (6 cases): PGLite round-trip of the column + kind filter matrix.
- test/regressions/v0_40_2_0-trajectory-backcompat.test.ts (4 cases): pins the byte-identical-output contract that founder-scorecard's per-metric math ignores event rows.
- test/migrate.test.ts: v81 round-trip verified via existing structural assertion harness.
- 209 tests across 5 impacted suites pass; bun run verify clean (17 pre-checks including privacy, jsonb, type, fuzz purity).

Plan: ~/.claude/plans/system-instruction-you-are-working-crystalline-owl.md

GSTACK REVIEW REPORT: CEO + ENG + CODEX CLEARED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
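
The grouping, caps, and supersession annotation that Commit 1 describes for the shared formatter could look something like the sketch below. It is not the code in src/core/trajectory-format.ts. The row fields other than `metric` and `event_type`, the line layout, the `[escaped-tag]` replacement, the choice to keep the most recent rows under the cap, and the names `PointSketch`, `escapeTrajectoryTags`, and `formatTrajectorySketch` are all assumptions made for illustration.

```typescript
// Hypothetical row shape; only metric and event_type are named in the commit message.
interface PointSketch {
  metric: string | null;
  event_type: string | null;
  value: string | null;
  valid_from: string; // ISO date, so string order matches time order
  text: string;
}

const PER_GROUP_CAP = 20; // "per-metric cap 20"
const TOTAL_CAP = 100; // "total cap 100"

// Neutralize anything that looks like an opening or closing trajectory tag.
function escapeTrajectoryTags(s: string): string {
  return s.replace(/<\s*\/?\s*trajectory\b[^>]*>/gi, "[escaped-tag]");
}

function formatTrajectorySketch(
  points: PointSketch[],
  entitySlug: string,
  intent: "temporal" | "knowledge_update",
): string {
  // Group by (metric ?? event_type), as the commit message states.
  const groups = new Map<string, PointSketch[]>();
  for (const p of points) {
    const key = p.metric ?? p.event_type;
    if (key === null) continue; // neither shape: nothing to group under
    groups.set(key, [...(groups.get(key) ?? []), p]);
  }

  const lines: string[] = [];
  let emitted = 0;
  for (const key of [...groups.keys()].sort()) {
    if (emitted >= TOTAL_CAP) break;
    const rows = [...(groups.get(key) ?? [])]
      .sort((a, b) => a.valid_from.localeCompare(b.valid_from))
      .slice(-PER_GROUP_CAP); // assumption: keep the most recent rows
    lines.push(`${escapeTrajectoryTags(key)}:`);
    rows.forEach((row, i) => {
      if (emitted >= TOTAL_CAP) return;
      // A row whose value differs from the previous row in its group is a value change.
      const changed = i > 0 && row.value !== rows[i - 1].value;
      const note = intent === "knowledge_update" && changed ? " (superseded prior)" : "";
      lines.push(`  ${row.valid_from}: ${escapeTrajectoryTags(row.text)}${note}`);
      emitted++;
    });
  }
  if (emitted === 0) return ""; // empty block: the caller can skip the header
  const slug = escapeTrajectoryTags(entitySlug).replace(/"/g, "");
  return [`<trajectory entity="${slug}">`, ...lines, "</trajectory>"].join("\n");
}
```

Sorting the group keys and the rows before emitting keeps the output deterministic for a given input, which matters for byte-level comparison in tests.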

….40.2.0 Commit 2)

Wires the v0.40.2.0 substrate (Commit 1's facts.event_type column + formatTrajectoryBlock) into the production `gbrain think` surface. Default ON; flip `think.trajectory_enabled=false` to opt out.

New pure modules (zero engine dependency):
- src/core/think/intent.ts: classifyIntent(question) does regex-first routing into 'temporal' | 'knowledge_update' | 'other'. KU wins over temporal when both match. The 'other' fast path short-circuits with zero SQL.
- src/core/think/entity-extract.ts: extractCandidateEntities() pulls high-precision candidates from retrieval slugs (people/, companies/, organizations/, deals/) and medium-precision noun phrases from the question. Word-level tokenization + stop-word boundaries stitch "Blue Bottle" as one candidate while splitting "I last meet Marco" correctly. Leading-verb stripper drops "meet", "visit" etc so "marco" surfaces cleanly. Cap of 5 per question.

Engine-touching wiring (src/core/think/index.ts):
- RunThinkOpts gains 4 fields: withTrajectory (default true), sourceId, allowedSources, remote.
- readThinkTrajectoryEnabled() reads the config kill switch; default true; survives missing config table on legacy brains.
- Trajectory orchestration sits between gather and prompt assembly: intent classify → extract candidates → per-candidate resolveEntitySlugWithSource → skip fallback_slugify → 5s timeout Promise.race + 3-wide concurrency cap → formatTrajectoryBlock. Any error degrades to "no block" + TRAJECTORY_INJECTION_FAILED warning; the think call itself never crashes from trajectory.
- On success, TRAJECTORY_INJECTED_<N>_POINTS warning records the count for downstream telemetry.

Prompt placement (src/core/think/prompt.ts), Codex Problem 6 fix: buildThinkUserMessage's trajectoryBlock slot honors BOTH existing orderings: calibration mode inserts trajectory between calibration and question; default mode inserts between retrieval and the output instruction. NO third ordering is introduced. Empty trajectoryBlock skips the "Known trajectory:" header entirely (don't cue the model we tried).

Resolution-source signal (src/core/entities/resolve.ts), Codex Problem 5: New companion resolveEntitySlugWithSource() returns {slug, source: 'exact_page' | 'fuzzy_match' | 'fallback_slugify'} so trajectory routing can skip fallback-only resolutions; querying findTrajectory on an invented slug always returns [] and wastes a SQL round-trip. The original resolveEntitySlug keeps its contract for pre-v0.40 callers.

MCP think op handler (src/core/operations.ts): Extracts sourceScopeOpts(ctx) into scalar sourceId + allowedSources + remote, threads through to runThink. CLI callers omit (engine default source, remote=false). Mirrors the same source-scope discipline applied to all other read paths in v0.34.1.0.

Sanitization: Commit 1 already extended INJECTION_PATTERNS for </trajectory>; consumed here.

Test coverage (all hermetic, no DATABASE_URL, no API keys):
- test/think-intent.test.ts (14 cases): temporal, KU, other, precedence (KU wins when both match), defensive non-string inputs.
- test/think-entity-extract.test.ts (10 cases): retrieved-slug source, noun-phrase source, stop-word stripping, leading-verb stripping, dedup across sources, 5-candidate cap.
- test/think-trajectory-injection.test.ts (7 cases against PGLite in-memory): temporal intent injection happy path with superseded-prior annotation, "other" intent short-circuit, withTrajectory: false bypass, think.trajectory_enabled=false config bypass, empty-trajectory skip, engine.findTrajectory throw is caught (Promise.allSettled defense), TRAJECTORY_INJECTED warning count.
- Existing test/think-pipeline.serial.test.ts re-asserted unchanged (10 cases: calibration mode parity, gather, sanitization, cite-render all intact).

72 tests pass across 7 impacted suites; bun run verify clean (17 pre-checks). Defaulted on per CEO + Eng D1; kill switch via config.

Plan: ~/.claude/plans/system-instruction-you-are-working-crystalline-owl.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
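
The per-candidate lookup in Commit 2 combines a 5s `Promise.race` timeout with a 3-wide concurrency cap. A minimal sketch of that combination follows, assuming hypothetical helper names (`withTimeout`, `mapWithLimit`, `lookupAll`) and a caller-supplied `find` function standing in for findTrajectory. The settled-result shape is hand-rolled here for self-containment, whereas the tests in this PR mention `Promise.allSettled`.

```typescript
type SettledSketch<T> =
  | { status: "fulfilled"; value: T }
  | { status: "rejected"; reason: unknown };

// Reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer); // do not keep the event loop alive
  });
}

// Run `fn` over `items` with at most `limit` calls in flight; never throws.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<SettledSketch<R>[]> {
  const results: SettledSketch<R>[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claimed synchronously, so workers never share an index
      try {
        results[i] = { status: "fulfilled", value: await fn(items[i]) };
      } catch (reason) {
        results[i] = { status: "rejected", reason }; // one failure does not stop the rest
      }
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, () => worker()));
  return results;
}

// Usage with the numbers from the commit message: cap 3, timeout 5s.
function lookupAll(
  slugs: string[],
  find: (slug: string) => Promise<unknown[]>,
): Promise<SettledSketch<unknown[]>[]> {
  return mapWithLimit(slugs, 3, (slug) => withTimeout(find(slug), 5_000));
}
```

Note that the timeout only stops waiting; it does not cancel the underlying query. Whether the real engine call supports cancellation is not stated in this PR.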

…(v0.40.2.0 Commit 3)

Populates the LongMemEval benchmark brain's facts table inline at import time so Commit 4's intent routing has data to retrieve. Per the CHANGELOG D1 decision, this is full-haystack preprocessing, disclosed explicitly in the benchmark output's methodology_note field (Commit 4).

New module src/eval/longmemeval/extract.ts: extractAndInsertClaims({engine, client, model, sessionSlug, sessionId, sessionBody, sourceId, aliasMap})
- Hashes the session body (sha256) for cache lookup.
- Cache hit → reuses parsed claims (cuts a 3-iteration benchmark run from $1.50 to $0.50 when sessions repeat across questions, as they do in LongMemEval).
- Cache miss → one Haiku call. System prompt asks for {entity, metric, value, unit, period, event_type, valid_from, text}[] JSON. New parseExtractedJsonArray() helper does fence-strip + parse (parseModelJSON from cross-modal-eval is shaped for scored objects, not arrays; a different parser is needed here).
- Per-record validateClaim() drops malformed records (missing entity, bad date) silently; the rest land in NewFact rows.
- Per-question AliasMap (Codex Problem 4, semantics pinned): "Marco" + "Marco Smith" + "marco" in the SAME question collapse to one slug via first-mention-wins canonicalization. Across questions, the harness creates a fresh map (no leak).
- Real-page-aware entity resolution via the v0.40.2.0 resolveEntitySlugWithSource (Commit 2). Slugify-fallback rows still insert (we need the data); the resolution_source signal is only consulted at trajectory retrieval time (Commit 4).
- Bulk insert via engine.insertFacts with the `gbrain-allow-direct-insert` allow-list comment per the check-system-of-record CI guard contract; the benchmark brain is ephemeral in-memory PGLite, so no markdown source-of-truth applies.
- Fail-open posture: Haiku throw, malformed JSON, insert collision all return inserted=0 without throwing. One bad session never kills the per-question loop.
- getCacheStats() exposes hits/misses/size for the per-run stderr telemetry Codex Problem 14 asked for (empirical hit-rate reporting; the optimistic claim self-verifies).

Substrate plumbing (extends Commit 1):
- NewFact.event_type?: string | null added in engine.ts so the extractor can pass event-shaped rows through to insertFacts.
- PGLite engine + Postgres engine insertFacts() now persist event_type. Param-positional dispatch extended to 20/21 placeholders (null-embedding vs embedding-present); tx.unsafe vector cast on Postgres path unchanged.

Test coverage (test/longmemeval-extract.test.ts, 13 cases, hermetic):
- Happy path: typed-claim + event rows both insert with correct kind (event_type='meeting' → kind='event'; claim_metric='mrr' → kind='fact').
- Alias map: per-session collapsing ("Marco" + "Marco Smith"), cross-session persistence within one question, fresh map per question (caller-clears semantics pinned).
- Content-hash cache: identical body → cache hit, only ONE Haiku call across two sessions; different bodies miss; getCacheStats reports hits/misses/size.
- Fail-open: malformed JSON, Haiku throw, empty array output, invalid records (missing entity, bad date); none crash, 0 inserted in each case.

55 tests pass across 4 impacted suites; bun run verify clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-crystalline-owl.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
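
The content-hash cache and the fence-strip parse from Commit 3 can be sketched as below. This is an illustration under assumptions, not the code in src/eval/longmemeval/extract.ts: `extractCached`, `stripFence`, `parseArrayLenient`, `cacheStatsSketch`, and the `callModel` callback are hypothetical names, and whether the real extractor caches an empty result after a parse failure is not stated in this PR.

```typescript
import { createHash } from "node:crypto";

type ClaimSketch = Record<string, unknown>;

// Built at runtime so this example can itself sit inside a fenced block.
const FENCE = "`".repeat(3);

// Drop a leading fence line (with optional language tag) and a trailing fence.
function stripFence(raw: string): string {
  let s = raw.trim();
  if (s.startsWith(FENCE)) {
    const firstNewline = s.indexOf("\n");
    s = firstNewline === -1 ? "" : s.slice(firstNewline + 1);
  }
  if (s.endsWith(FENCE)) s = s.slice(0, -FENCE.length);
  return s.trim();
}

// Fail-open parse: anything that is not a JSON array of objects yields [].
function parseArrayLenient(raw: string): ClaimSketch[] {
  try {
    const parsed: unknown = JSON.parse(stripFence(raw));
    if (!Array.isArray(parsed)) return [];
    return parsed.filter(
      (x): x is ClaimSketch => typeof x === "object" && x !== null && !Array.isArray(x),
    );
  } catch {
    return [];
  }
}

const cache = new Map<string, ClaimSketch[]>();
const stats = { hits: 0, misses: 0 };

// One model call per distinct session body; repeats are served from the cache.
async function extractCached(
  body: string,
  callModel: (body: string) => Promise<string>,
): Promise<ClaimSketch[]> {
  const key = createHash("sha256").update(body).digest("hex");
  const cached = cache.get(key);
  if (cached !== undefined) {
    stats.hits++;
    return cached;
  }
  stats.misses++;
  try {
    const claims = parseArrayLenient(await callModel(body));
    cache.set(key, claims);
    return claims;
  } catch {
    return []; // fail-open: a model error yields zero claims, never a throw
  }
}

function cacheStatsSketch(): { hits: number; misses: number; size: number } {
  return { hits: stats.hits, misses: stats.misses, size: cache.size };
}
```

Keying on the body hash alone means the same session text hits the cache regardless of its session id or slug, which matches the cache-key invariance the later test audit pins.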

…ology disclosure (v0.40.2.0 Commit 4)

The final wiring: per-question intent classification + trajectory call + block splice into the answer-gen prompt. Plus the methodology disclosure stamps that close out the Codex D1 contract.

New module src/eval/longmemeval/intent.ts: classifyIntent(q) prefers q.question_type from the dataset (LongMemEval ships labels like 'temporal-reasoning', 'knowledge-update', 'single-session-user') before falling back to the SHARED regex set imported from src/core/think/intent.ts. Single source of truth for the regex: think and longmemeval cannot drift.

Harness wiring in src/commands/eval-longmemeval.ts:
- runEvalLongMemEval() spawns an extractor model via resolveModel (tier:'utility' → haiku) when trajectory routing is enabled. Calls resetExtractorState() once per benchmark run so the content-hash cache + counters start clean.
- runOneQuestion() creates a FRESH per-question AliasMap (Codex Problem 4: first-mention-wins canonicalization stays scoped to one question, never leaks across).
- Per session: after importFromContent lands, extractAndInsertClaims populates the facts table. Fail-open if the Haiku call errors; next session keeps going.
- After hybridSearch returns: classifyIntent(q) routes temporal/knowledge_update through extractCandidateEntities (the SHARED helper from Commit 2's think/entity-extract) → per-candidate findTrajectory with 5s Promise.race timeout → formatTrajectoryBlock. First candidate with a non-empty trajectory wins.
- generateAnswer() splices the trajectory block BEFORE the Retrieved sessions block. Empty block (no entity match / no points) → no "Known trajectory:" header (don't cue the model we tried).
- JSON envelope gains 5 fields per question when trajectory routing is on: intent, trajectory_points, entity_resolved, resolution_source, methodology_note. methodology_note also written to stderr at run completion.

Resolution-source gate DIVERGES from think (intentional): In the think production path, fallback_slugify results are skipped because querying invented slugs wastes SQL; production brains have canonical pages. In the LongMemEval benchmark, there ARE no canonical pages; both the extractor and the lookup go through slugify-fallback on the same free-form name, so they cohere on the same slug. Applying the think-path gate here would permanently block trajectory injection on the benchmark. Comment in runOneQuestion documents the divergence.

New CLI flag --no-trajectory: Bypasses BOTH the Haiku extractor AND the per-question intent routing. Used by the measurement protocol to baseline default-on vs no-trajectory across 3 seeds per condition with paired-bootstrap CI. Documented in the help text.

New RunOpts fields:
- extractorClient?: ThinkLLMClient, a separate stub from the answer-gen client so tests can isolate the two surfaces.
- extractorModel?: string, a model override for the Haiku call.

methodology_note = 'extractor=haiku-preprocess-full-haystack-v1' stamped on:
- Every per-question JSON envelope row.
- Stderr summary at run completion.

This is the Codex D1 contract: the temporal-reasoning delta we publish is "gbrain + Haiku-preprocess pipeline" vs "gbrain alone", not directly comparable to LongMemEval's published baselines without that disclosure.

Extractor cache hit-rate stderr summary (Codex Problem 14): '[longmemeval] extractor.cache_hits: 412 / 489 sessions (84.2%, cached_bodies=412)' gives empirical verification of the optimistic hit-rate claim. The optimistic number self-verifies per run.

Test coverage (all hermetic, no API keys):
- test/longmemeval-intent.test.ts (9 cases): dataset question_type → Intent mapping for all six LongMemEval labels; dataset label trumps question-text signal; unknown labels fall through to the regex classifier.
- test/longmemeval-trajectory-routing.test.ts (4 cases): end-to-end through runEvalLongMemEval with both clients stubbed: trajectory block lands in answer-gen prompt for temporal intent + absent for 'other'; --no-trajectory bypasses extractor AND injection AND omits envelope fields; methodology_note stamped on every routed row; perf gate preserved (< 10s for 2-question fixture).

118 tests pass across 11 impacted suites; bun run verify clean.

Wave complete. CHANGELOG draft + measurement plan live in the plan file. v0.40.2.0 ready for /ship after a real-LLM spot-check run.

Plan: ~/.claude/plans/system-instruction-you-are-working-crystalline-owl.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
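
The label-first routing in Commit 4 (dataset `question_type` wins, unknown labels fall through to the shared regex classifier) might be sketched like this. Only the three labels quoted in the commit message are mapped; mapping 'single-session-user' to 'other' is an assumption, as are the names `classifyFromDataset` and `LABEL_TO_INTENT`, and the regex classifier is passed in as a parameter so the block stays self-contained.

```typescript
type RoutedIntent = "temporal" | "knowledge_update" | "other";

// A Map avoids accidental hits on inherited object keys such as "constructor".
const LABEL_TO_INTENT = new Map<string, RoutedIntent>([
  ["temporal-reasoning", "temporal"],
  ["knowledge-update", "knowledge_update"],
  ["single-session-user", "other"], // assumption: non-temporal labels skip routing
]);

function classifyFromDataset(
  q: { question: string; question_type?: unknown },
  regexFallback: (text: string) => RoutedIntent,
): RoutedIntent {
  // A known dataset label trumps whatever the question text looks like.
  if (typeof q.question_type === "string") {
    const mapped = LABEL_TO_INTENT.get(q.question_type);
    if (mapped !== undefined) return mapped;
  }
  // Missing or unknown label: defer to the shared regex classifier.
  return regexFallback(q.question);
}
```

Taking the fallback as an argument mirrors the single-source-of-truth intent: the harness would hand in the same classifier that think uses rather than keeping its own copy of the patterns.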

Master shipped v0.38.0.0 (ingestion-cathedral wave, #1275) which claimed migration slot v81 with `pages_provenance_columns`. The v0.40.2.0 trajectory-routing wave's `facts_event_type_column` migration is renumbered to v82. Engine + test code that reference the new migration are updated to v82. Master's v81 tests (`pages_provenance_columns`) remain intact and test master's migration unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

v0.40.2.0 trajectory routing wave: gbrain think now grounds answers about temporal/knowledge-update questions in the typed-claim timeline the brain has been quietly building via the extract_facts cycle phase. Default ON; flip think.trajectory_enabled=false to opt out.

LongMemEval-side wiring lands the same plumbing in the benchmark harness with explicit methodology disclosure (extractor=haiku-preprocess-full-haystack-v1) in the JSON envelope and stderr summary: the published temporal-reasoning number is "gbrain + Haiku-preprocess" vs "gbrain alone", not directly comparable to LongMemEval's published baselines without that disclosure.

Plan: ~/.claude/plans/system-instruction-you-are-working-crystalline-owl.md

3 review passes: CEO + ENG + CODEX all CLEARED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

CLAUDE.md, README.md, AGENTS.md extended with the v0.40.2.0 trajectory routing surface: gbrain think integration (default ON via think.trajectory_enabled config key), facts.event_type schema column + TrajectoryPoint.event_type + TrajectoryOpts.kind filter, shared formatTrajectoryBlock helper in src/core/trajectory-format.ts, LongMemEval extractor + intent routing + methodology disclosure, migration v82. llms-full.txt regenerated to match CLAUDE.md edits (CI test/build-llms gate). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… v82-v85)

Master shipped v0.38.1.0 (provider-agnostic subagent loop, #1289) which claimed migration slots v82-v85:
- v82: subagent_tool_executions_stable_id
- v83: mcp_spend_reservations
- v84: oauth_clients_budget_usd_per_day
- v85: oauth_clients_agent_binding

The v0.40.2.0 trajectory-routing wave's `facts_event_type_column` migration is renumbered to v86. Engine + test + CLAUDE.md references updated.

CHANGELOG reconstructed: v0.40.2.0 entry kept at the top (our entry), master's v0.38.1.0 entry inserted below, both intact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Audit of the trajectory-routing wave's test surface vs the shipped code surfaced 7 gaps. All filled, all green. Total: 343 tests across 17 impacted suites (was 272 pre-fill).

Gap 1: Migration v86 structural tests (11 new in test/migrate.test.ts):
- v86 entry exists with documented name + idempotent
- exactly one event_type column add to facts
- IF NOT EXISTS guard
- column is nullable (no NOT NULL, no DEFAULT regression guard)
- does NOT create any index (event_type is selectivity-poor)
- does NOT touch any other table (blast-radius pin)
- does NOT carry a sqlFor override (engine-shared SQL contract)
- PGLite round-trip: column exists with right type + nullable
- event_type INSERT/SELECT round-trip
- NULL round-trip for legacy + metric-only rows
- LATEST_VERSION >= 86 contract pin

Gap 2: resolveEntitySlugWithSource branch coverage (12 new in test/entity-resolve.test.ts):
- exact_page branch (full slug, slug-shape match)
- fuzzy_match branch (Title-cased display name, bare first name via prefix expansion)
- fallback_slugify branch (unseeded name, multi-word non-match phrase, accented input)
- null tail (empty + whitespace)
- back-compat parity with resolveEntitySlug for both exact_page and fallback_slugify branches

Gap 3: INJECTION_PATTERNS dedicated coverage for new entries (18 new in test/think-sanitize-trajectory.test.ts):
- close-trajectory entry registered + matches canonical and whitespace/case variations
- open-trajectory entry registered + matches both no-attr and with-attrs forms
- xml-attr-inject strips entity=/metric=/event_type=/kind=
- does NOT strip non-trajectory attribute names (class/id/title)
- combined multi-vector attack: all three patterns fire
- formatTrajectoryBlock end-to-end with adversarial extractor text: one live </trajectory> (the wrapper, not the injection); one entity= attribute (the wrapper, not the injection)
- pattern ordering invariant: new entries land after close-take

Gap 4: runThink calibration-mode placement contract (3 new in test/think-trajectory-injection.test.ts):
- default mode: question → pages → takes → trajectory → instruction
- calibration mode: pages → takes → calibration → trajectory → question → instruction (Codex P6: no third ordering invented)
- empty trajectory in calibration mode preserves the existing calibration shape (no false-positive cue)

Gap 5: runThink resolution_source != fallback_slugify gate (1 new in test/think-trajectory-injection.test.ts):
- candidate that only matches via fallback_slugify is NOT queried (think-path divergence from longmemeval-path which accepts it)

Gap 6: E2E for runThink trajectory injection (7 new in test/e2e/think-trajectory-pglite.test.ts):
- full pipeline lands <trajectory> block in answer-gen prompt
- knowledge_update intent annotates value-change rows with (superseded prior)
- 'other' intent short-circuits (no block, no SQL)
- think.trajectory_enabled=false config bypasses entire path
- empty brain → graceful no-op (no crash, no block)
- multi-entity deterministic ordering
- adversarial </trajectory> in seeded fact text is escaped before reaching the LLM (end-to-end sanitization gate)

Gap 7: longmemeval extractor stress + persistence pins (6 new in test/longmemeval-extract.test.ts):
- alias map cross-session stress with 12 sessions in one question; all 12 rows collapse under ONE entity_slug
- different entities stay separate across many sessions
- embedding + embedded_at both NULL on benchmark-inserted rows (regression guard against accidental embed-on-write)
- row_num sequential + source_markdown_slug stamped per session (v0.32.2 partial UNIQUE index contract)
- source field stamped "longmemeval:extractor" (audit-tag pin)
- cache key invariance: same body hash hits cache across different sessionId/slug

bun run verify clean (17 pre-checks). No regressions in any of the 14 non-new impacted suites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
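
The first-mention-wins alias behavior that Gap 7 stresses ("Marco" + "Marco Smith" + "marco" collapsing to one slug within a question, with a fresh map per question) can be illustrated with the sketch below. The matching rule used here (one normalized mention being a whole-word prefix of the other) and the slug format are assumptions; the PR does not spell out how the real AliasMap decides that two mentions refer to the same entity.

```typescript
// Hypothetical stand-in for the per-question AliasMap; create one per question.
class AliasMapSketch {
  // normalized mention -> canonical slug
  private readonly canonical = new Map<string, string>();

  resolve(mention: string): string {
    const norm = mention.trim().toLowerCase().replace(/\s+/g, " ");
    if (norm === "") return "";
    const known = this.canonical.get(norm);
    if (known !== undefined) return known;

    // Assumed rule: "marco" and "marco smith" match because one is a
    // whole-word prefix of the other. The earlier mention keeps its slug.
    for (const [alias, slug] of this.canonical) {
      if (alias.startsWith(norm + " ") || norm.startsWith(alias + " ")) {
        this.canonical.set(norm, slug);
        return slug;
      }
    }

    // First mention of a new entity: mint a slug from the normalized text.
    const slug = norm.replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
    this.canonical.set(norm, slug);
    return slug;
  }
}

// Usage: a fresh map per question means aliases never leak across questions.
function slugsForQuestion(mentions: string[]): string[] {
  const aliases = new AliasMapSketch();
  return mentions.map((m) => aliases.resolve(m));
}
```

Under this rule the canonical slug depends on mention order inside a question, which is what first-mention-wins implies: a question that sees "Marco Smith" first would settle on a different slug than one that sees "Marco" first.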

… v86)

Master shipped v0.38.2.0 (#1297 doctor frontmatter scan) and v0.39.0.0 (#1283 brainstorm cost cathedral). v0.39.0.0 claimed migration v86 with `page_links_view_alias`.

The v0.40.2.0 trajectory-routing wave's `facts_event_type_column` migration renumbers v86 → v87. All references updated in:
- src/core/migrate.ts: migration entry now v87, renumber comment notes the full v81→v82→v86→v87 history across three master merges.
- src/core/engine.ts, src/core/pglite-engine.ts, src/core/postgres-engine.ts: inline comments bumped to v87.
- test/migrate.test.ts: my describe blocks (11 structural + 4 round-trip cases) bumped to v87. LATEST_VERSION assertion bumped to >= 87.
- CLAUDE.md: v0.40.2.0 entry mentions v87. Master's v0.39.0.0 references to v86 (page_links_view_alias) preserved intact.
- CHANGELOG: reconstructed cleanly, with the v0.40.2.0 entry at top with v87 reference and master's v0.39.0.0 + v0.38.2.0 + v0.38.1.0 entries inserted in order below.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The schema-bootstrap-coverage CI guard (test/schema-bootstrap-coverage.test.ts) enforces that every ALTER TABLE ADD COLUMN in MIGRATIONS is covered by applyForwardReferenceBootstrap OR by PGLITE_SCHEMA_SQL's CREATE TABLE bodies OR by COLUMN_EXEMPTIONS.

v0.40.2.0's migration v87 adds facts.event_type but deliberately ships without a bootstrap probe because:
- No CREATE INDEX in PGLITE_SCHEMA_SQL references event_type
- No FK references event_type
- All existing callers (founder-scorecard, eval-trajectory, gbrain think trajectory injection) defensively skip NULL-metric rows in per-metric math, so event_type=NULL on pre-v87 brains is invisible
- Pre-v87 brains land event_type=NULL via the migration ALTER

Exactly mirrors the precedent set by facts.claim_metric / claim_value / claim_unit / claim_period exemptions (v67 typed-claim columns) which are exempted for the same structural reason: column-only migration, no forward-reference index, no downstream filter breaks on old brains.

Adding facts.event_type to COLUMN_EXEMPTIONS with a brief rationale comment matching the existing v0.35.6 entry shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… v87+v88)

Master shipped v0.39.1.0 (#1248 schema packs: bring your own shape) which claimed migration slots v87 + v88:
- v87: takes_kind_drop_check
- v88: eval_candidates_schema_pack_per_source

The v0.40.2.0 trajectory-routing wave's `facts_event_type_column` migration renumbers v87 → v89. All references updated in:
- src/core/migrate.ts: migration entry now v89, renumber comment notes the full v81→v82→v86→v87→v89 history across four master merges.
- src/core/engine.ts, src/core/pglite-engine.ts, src/core/postgres-engine.ts: inline comments bumped to v89.
- test/migrate.test.ts: my describe blocks (11 structural + 4 round-trip cases) bumped to v89. LATEST_VERSION assertion bumped to >= 89.
- test/schema-bootstrap-coverage.test.ts: COLUMN_EXEMPTIONS entry comment bumped to v89. Master's v0.39.1.0 also added eval_candidates.schema_pack_per_source to the exemption list; both kept (file has no conflicts after stitching).
- CLAUDE.md: v0.40.2.0 entry mentions v89. Master's v0.39 references to v86 (page_links_view_alias) preserved intact.
- CHANGELOG.md: my v0.40.2.0 entry's Substrate header bumped to v89.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
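
For orientation, the column-only migration and the structural checks listed for it in the test audit (one column add, IF NOT EXISTS guard, nullable, no index, no other table) could be expressed roughly as follows. The `MigrationSketch` shape and the `structuralProblems` helper are hypothetical; the actual entry format in src/core/migrate.ts and the assertions in test/migrate.test.ts are not shown in this PR.

```typescript
// Hypothetical entry shape; the real MIGRATIONS entries may carry more fields.
interface MigrationSketch {
  version: number;
  name: string;
  sql: string;
}

const factsEventTypeColumn: MigrationSketch = {
  version: 89, // final slot after the v81→v82→v86→v87→v89 renumbering
  name: "facts_event_type_column",
  sql: "ALTER TABLE facts ADD COLUMN IF NOT EXISTS event_type TEXT;",
};

// Returns a list of violated structural expectations (empty means clean).
function structuralProblems(m: MigrationSketch): string[] {
  const problems: string[] = [];
  const adds = m.sql.match(/ADD\s+COLUMN/gi) ?? [];
  if (adds.length !== 1) problems.push("expected exactly one ADD COLUMN");
  if (!/ADD\s+COLUMN\s+IF\s+NOT\s+EXISTS\s+event_type\b/i.test(m.sql)) {
    problems.push("missing IF NOT EXISTS guard on event_type");
  }
  if (/\bNOT\s+NULL\b|\bDEFAULT\b/i.test(m.sql)) {
    problems.push("column must stay nullable with no DEFAULT");
  }
  if (/\bCREATE\s+(UNIQUE\s+)?INDEX\b/i.test(m.sql)) {
    problems.push("must not create an index");
  }
  const tables = [...m.sql.matchAll(/ALTER\s+TABLE\s+(\w+)/gi)].map((x) => x[1].toLowerCase());
  if (tables.some((t) => t !== "facts")) problems.push("must not touch any other table");
  return problems;
}
```

String-level checks like these pin the blast radius of the migration text itself; they complement, rather than replace, the PGLite round-trip tests that confirm the column actually exists and accepts NULL.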

…nflict, CHANGELOG stitched)

master shipped agent-voice (v0.40.0.0), capture-fix wave (v0.39.3.0), and autopilot per-source fan-out (v0.39.2.0) while v0.40.2.0 trajectory-routing was in review. No migration version collision: v89 facts_event_type_column slots cleanly past master's v88 eval_candidates_schema_pack_per_source.

VERSION/package.json kept at 0.40.2.0. CHANGELOG.md stitched with v0.40.2.0 on top, then master's three new entries in date order, then existing entries from v0.39.0.0 down.

Post-merge audit: VERSION/package.json/top-CHANGELOG all agree on 0.40.2.0. typecheck clean. 243 wave-impacted tests pass. llms-full.txt regenerated for the merged CLAUDE.md.
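The post-merge audit amounts to checking that three sources report the same version string. A minimal sketch of such a check, operating on file contents passed in as strings: the function names are hypothetical, and the assumption that the top CHANGELOG entry is the first heading line containing a four-part version is a guess at the file's layout, not something the PR states.

```typescript
// Return the version in the first heading line that contains a four-part
// version such as 0.40.2.0. Assumes changelog entries are markdown headings.
function topChangelogVersion(changelog: string): string | null {
  for (const line of changelog.split("\n")) {
    if (!line.startsWith("#")) continue;
    const match = /(\d+\.\d+\.\d+\.\d+)/.exec(line);
    if (match) return match[1];
  }
  return null;
}

interface VersionAudit {
  ok: boolean;
  versionFile: string;
  packageJson: string | null;
  changelog: string | null;
}

// Compare the VERSION file, package.json's "version" field, and the top
// CHANGELOG entry; ok is true only when all three are identical.
function auditVersions(
  versionFileText: string,
  packageJsonText: string,
  changelogText: string,
): VersionAudit {
  const versionFile = versionFileText.trim();
  const parsed = JSON.parse(packageJsonText) as { version?: unknown };
  const packageJson =
    typeof parsed.version === "string" ? parsed.version : null;
  const changelog = topChangelogVersion(changelogText);
  const ok = versionFile === packageJson && versionFile === changelog;
  return { ok, versionFile, packageJson, changelog };
}
```

Taking file contents as arguments keeps the check free of filesystem access, so it can be exercised against fixture strings before being wired to the real files.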

master shipped v0.40.1.0 Track D (eval infrastructure: hermetic qrels gate + nightly cross-modal probe + --by-type/--by-type-floor on longmemeval). Conflicts in CHANGELOG.md, CLAUDE.md, VERSION, package.json, llms-full.txt, and src/commands/eval-longmemeval.ts.

- VERSION/package.json kept at 0.40.2.0.
- CHANGELOG.md stitched with v0.40.2.0 on top, then v0.40.1.0, then existing entries.
- CLAUDE.md merged: kept both master's new entries (eval-replay-gate + nightly-quality-probe) AND v0.40.2.0 trajectory entries, and merged both v0.40.1.0 Track D + v0.40.2.0 extensions onto the same eval-longmemeval annotation.
- src/commands/eval-longmemeval.ts: combined both ParsedArgs additions (noTrajectory + byType + byTypeFloor), both CLI parser branches, both help-text blocks, and both terminal-output branches (extractor cache hit-rate + by-type summary emission).

Post-merge audit: VERSION/package.json/top-CHANGELOG all agree on 0.40.2.0. typecheck clean. 65 longmemeval-impacted tests pass (--by-type + trajectory routing co-exist). llms-full.txt regenerated.

* upstream/master: (22 commits)
  v0.41.4.0 wave: local providers + cross-platform stdin + gateway-routed dream judge (6 community PRs) (garrytan#1377)
  v0.41.3.0 fix(security/mcp): OAuth CORS lockdown + pre-register without DCR + validator surface (garrytan#1403)
  v0.41.2.0 feat: lens packs + epistemology unification — atoms + concepts as first-class units, calibration profile widening, gstack-learnings bridge (garrytan#1364)
  v0.41.1.0 feat: eval-loop wave — gbrain bench publish + gbrain eval gate close the LOOP (garrytan#1352)
  v0.41.0.0 feat(minions): fleet you supervise (4 field bugs + cathedral) (garrytan#1367)
  v0.40.10.0 feat: content sanity defense — junk-pattern throw + oversize-skip-embed (garrytan#1351)
  v0.40.9.0 feat(chunker): .sql indexing via tree-sitter + code-def on SQL DDL (garrytan#1173) (garrytan#1350)
  v0.40.8.1 docs: README rewrite + personal-brain + company-brain tutorials (garrytan#1345)
  v0.40.8.0 test: e2e + unit gap coverage + master flake root-cause fixes (garrytan#1313)
  v0.40.6.1 docs(todos): file v0.41 wave commitments + 7 verified-missing items (garrytan#1333)
  v0.40.7.0 Schema Cathedral v3 — agent-on-ramp + production rebuild of PR garrytan#1321 (garrytan#1327)
  v0.40.6.0 feat(sync): parallel sync --all + per-source lock invariant + sources status dashboard (productionized from PR garrytan#1314) (garrytan#1324)
  v0.40.5.0 Federated Sync v2 — parallel source sync + push triggers + per-source health (garrytan#1322)
  v0.40.4.0 feat(search): selective graph signals + per-stage attribution + audit-writer unification (garrytan#1300)
  v0.40.3.0 feat: contextual retrieval + cache invalidation gate + 4 deferred-item closures (garrytan#1323)
  v0.40.2.0 feat: trajectory routing for temporal + knowledge_update (gbrain think + LongMemEval) (garrytan#1296)
  v0.40.1.0 Track D — eval infrastructure (catch retrieval regressions, prove answer-quality wins) (garrytan#1298)
  v0.40.0.0 feat: agent-voice (Mars + Venus) + copy-into-host-repo skillpack paradigm (garrytan#1128)
  v0.39.3.0: productionize the v0.38 ingestion cathedral (smoke-test fix wave from PR garrytan#1299) (garrytan#1308)
  v0.39.2.0 feat(autopilot): per-source fan-out + cycle lock primitive + phase taxonomy (garrytan#1295)
  ...

garrytan and others added 5 commits May 22, 2026 07:58

garrytan changed the title ~~v0.40.0.0 feat: trajectory routing for temporal + knowledge_update (gbrain think + LongMemEval)~~ v0.40.2.0 feat: trajectory routing for temporal + knowledge_update (gbrain think + LongMemEval) May 22, 2026

garrytan and others added 2 commits May 22, 2026 09:17

garrytan force-pushed the garrytan/v0.40.2.0-trajectory-routing branch from 588d720 to 1f5fc17 Compare May 22, 2026 16:18

garrytan and others added 7 commits May 22, 2026 09:28

garrytan merged commit a19ee8b into master May 23, 2026
8 checks passed

garrytan merged 14 commits into master from garrytan/v0.40.2.0-trajectory-routing




garrytan commented May 22, 2026 (edited)