v0.35.7.0 feat: temporal trajectory + founder scorecard (Phases 2-4)#1131

Merged
garrytan merged 5 commits into master from garrytan/las-vegas-v2
May 18, 2026

@garrytan garrytan commented May 17, 2026

Summary

Phases 2-4 of the temporal contradiction probe RFC. v0.35.3.1 taught the probe to see dates; this wave turns dates into a proper time-series substrate.

  • Typed metric claims in the ## Facts fence: claim_metric / claim_value / claim_unit / claim_period columns. Fence widens 10→14 cells only when a row carries typed data (no diff churn on existing fences). 15-entry seed map normalizes MRR / Monthly Recurring Revenue / mrr → mrr.
  • gbrain eval trajectory <entity> — chronological metric history with regressions auto-flagged inline. --metric mrr, --since, --until, --json all honored. Regression threshold 10% default, env override GBRAIN_TRAJECTORY_REGRESSION_THRESHOLD.
  • gbrain founder scorecard <entity> — four-signal rollup (claim_accuracy / consistency / growth_trajectory / red_flags). schema_version: 1, additive-only.
  • find_trajectory MCP op — read scope, not localOnly. Routes through sourceScopeOpts(ctx) for federated isolation. Visibility-filters remote callers (AND visibility='world') mirroring recall's posture.
  • Cycle correctness fixes:
    • consolidate semantic upsert on (page_id, claim, since_date) — fixes pre-existing F4 duplicate-takes bug after extract_facts cleared consolidated_at
    • extract_facts batch-embeds via gateway.embed() before insert — fixes cycle-inserted facts arriving with NULL embeddings
    • extractFactsFromFenceText threads pages.effective_date as valid_from fallback — fixes import-date contamination on existing fences
  • Schema migration v67 (facts_typed_claim_columns) — 4 nullable columns + partial index, metadata-only on both engines.
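
The regression flagging described for gbrain eval trajectory above can be sketched as a pure function. This is a minimal sketch, not the code in trajectory.ts: the MetricPoint and Regression shapes are hypothetical, the threshold is assumed to be expressed as a fraction (0.1 for 10%), and the env map is passed in explicitly so the example stays self-contained.

```typescript
// Hypothetical shapes for illustration; the repo's real types may differ.
interface MetricPoint {
  metric: string;
  value: number;
  validFrom: string; // YYYY-MM-DD, assumed already in chronological order
}

interface Regression {
  metric: string;
  from: MetricPoint;
  to: MetricPoint;
  dropFraction: number;
}

const DEFAULT_THRESHOLD = 0.1; // 10% default named in the PR

// Assumption: the env override holds a fraction such as "0.25".
// Missing, non-numeric, or non-positive values fall back to the default.
function resolveThreshold(env: Record<string, string | undefined>): number {
  const raw = env["GBRAIN_TRAJECTORY_REGRESSION_THRESHOLD"];
  const parsed = raw === undefined ? NaN : Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_THRESHOLD;
}

// Walk consecutive values per metric and emit a regression when the newer
// value drops at least `threshold` below the older one.
function detectRegressions(
  points: MetricPoint[],
  threshold: number = DEFAULT_THRESHOLD,
): Regression[] {
  const lastByMetric = new Map<string, MetricPoint>();
  const regressions: Regression[] = [];
  for (const point of points) {
    const prev = lastByMetric.get(point.metric);
    // Skip non-positive baselines so the relative drop is well defined.
    if (prev !== undefined && prev.value > 0) {
      const dropFraction = (prev.value - point.value) / prev.value;
      if (dropFraction >= threshold) {
        regressions.push({ metric: point.metric, from: prev, to: point, dropFraction });
      }
    }
    lastByMetric.set(point.metric, point);
  }
  return regressions;
}
```

Tracking the previous point per metric keeps interleaved metrics (mrr, arr, and so on) from being compared against each other.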

Test Coverage

265 wave-specific tests across 13 files. Full suite: 6799 pass, 0 fail.

COVERAGE (wave-specific):
[+] facts-fence parser + renderer (round-trip, widening, normalize)       — 17 tests
[+] extract-from-fence valid_from precedence (3 branches)                  — covered
[+] consolidate semantic upsert (R4a + R4b/R7 idempotency)                 — 4 tests
[+] BrainEngine.findTrajectory (Postgres + PGLite parity)                  — 18 tests
[+] trajectory.ts pure functions (regressions, drift_score)                — covered
[+] find_trajectory MCP op (R5 schema, R6 visibility, source scoping)      — 9 tests
[+] gbrain eval trajectory CLI                                              — 7 tests
[+] gbrain founder scorecard CLI (4 rollup fields, G2 empty rollup)        — 9 tests
[+] R1 + R8 grep guard: probe never writes valid_until                     — 4 tests
[+] migration v67 shape + materialization + backward compat                — 6 tests
[+] schema-bootstrap-coverage exemptions for new typed-claim columns       — covered

COVERAGE: 100% of wave-introduced paths tested.
Pre-existing tests: 6799 pass, 0 fail (no regressions).

Pre-Landing Review

CEO + Eng + Codex outside-voice rounds completed during plan phase. Three rounds caught:

  • Codex F1: migration v66 collision (master had embed_stale_partial_index) — renumbered to v67
  • Codex F2: Haiku prompt belongs in facts/extract.ts (MCP write path), NOT extract-facts.ts (cycle phase is fence-parse only)
  • Codex F3: insertFacts doesn't compute embeddings — cycle now batch-embeds via gateway.embed() before insert
  • Codex F4: pre-existing duplicate-takes bug — semantic upsert on (page_id, claim, since_date) fixes it
  • Codex F5+F6: missed 4th caller facts/fence-write.ts:216 + no Page object there → pageEffectiveDate is OPTIONAL
  • Codex F7 (security): find_trajectory needed visibility filter for remote callers — D-CDX-1 mirrors recall
  • Codex F8: ParsedFact needed typed-field extension for markdown system-of-record
  • Codex F9: findTrajectory needs dual scalar + federated sourceId support
  • Eng A1: valid_from defaulting bug — threads pages.effective_date through the cycle
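
The valid_from fix from Eng A1 comes down to a three-step precedence: the fence row's own date, then the page's effective date, then undefined so the engine defaults to now. A minimal sketch follows; resolveValidFrom and pickDate are hypothetical helper names, and treating blank strings as missing is an assumption.

```typescript
// Hypothetical helper: treat null, undefined, and blank strings as "no date".
function pickDate(value?: string | null): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Precedence: fence-row validFrom > pageEffectiveDate > undefined.
// Returning undefined leaves the default (now) to the engine.
function resolveValidFrom(
  rowValidFrom?: string | null,
  pageEffectiveDate?: string | null,
): string | undefined {
  return pickDate(rowValidFrom) ?? pickDate(pageEffectiveDate);
}
```

Keeping pageEffectiveDate optional matches the F5+F6 finding that one caller has no Page object to read it from.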

All 17 decisions locked (D1-D10 + D-ENG-1..4 + D-CDX-1..7). Plan file: ~/.claude/plans/system-instruction-you-are-working-curious-jellyfish.md.

Plan Completion

All 11 tasks DONE:

  • T1: Migration v67 + PGLite parity ✓
  • T2: ParsedFact + Haiku prompt + valid_from + NewFact + normalization ✓
  • T3: Consolidate semantic upsert + valid_until + embedding-on-insert ✓
  • T4: BrainEngine.findTrajectory (Postgres + PGLite) ✓
  • T5: find_trajectory MCP op ✓
  • T6: gbrain eval trajectory CLI ✓
  • T7: gbrain founder scorecard CLI ✓
  • T8: IRON-RULE regressions (R1, R3, R4a, R4b, R5, R6, R7, R8) + edge cases ✓
  • T9+T10+T11: VERSION bump + CHANGELOG + CLAUDE.md + migration note ✓

Bisect-friendly commits

  1. feat(facts): typed-claim substrate + cycle correctness fixes (wave 1/3) — schema migration v67 + ParsedFact + Haiku + cycle phases + tests
  2. feat(trajectory): find_trajectory MCP op + eval/founder CLIs (wave 2/3) — engine method + trajectory.ts + MCP op + 2 CLIs + 5 test files
  3. chore: v0.35.6.0 — CHANGELOG + VERSION + docs + migration note — release docs only

Each commit type-checks and bun run verify passes independently.

Test plan

  • bun run verify clean (typecheck + 7 lint gates + admin build)
  • bun run test clean (6799 pass, 0 fail in the parallel suite)
  • Wave-specific tests: 265 pass across 13 files
  • Cycle re-run on fixture brain produces zero new takes (R7 pin)
  • Visibility filter on remote=true returns world-only (R6 pin)
  • Probe path never writes valid_until (R1+R8 grep guard)

Documentation

Post-ship sync (commit 0cafbd54) — /document-release synced every reader-facing surface to v0.35.7.0:

  • README.md: gbrain eval trajectory added to the EVAL command table; new TEMPORAL section covers gbrain founder scorecard + the GBRAIN_TRAJECTORY_REGRESSION_THRESHOLD env override; v0.35.7 "What's new" paragraph added under the v0.28.8 LongMemEval blurb so the wave lands above the fold.
  • AGENTS.md — new "Track a founder/company over time" bullet under Common tasks pointing agents at gbrain eval trajectory, gbrain founder scorecard, and the find_trajectory MCP op.
  • docs/contradictions.md — "Temporal axis follow-on (v0.35.3.1 + v0.35.7)" subsection appended to See also. Cross-links the trajectory substrate, names the auto-supersession.ts:4 invariant preserved by both the verdict enum (probe side) and consolidate's valid_until writeback (cycle side).
  • CLAUDE.md — fixed stale (v0.35.4) tag on the trajectory Key Files entry to (v0.35.7) (rebumped twice during the merge wave).
  • skills/migrations/v0.35.7.md → v0.35.7.0.md — renamed for consistency with the v0.35.0.0.md / v0.14.0.md naming convention.
  • llms-full.txt — regenerated.

Coverage map (Diataxis)

| Entity | Reference | How-to | Tutorial | Explanation |
|---|---|---|---|---|
| gbrain eval trajectory CLI | ✅ README + AGENTS | ✅ CHANGELOG | | |
| gbrain founder scorecard CLI | ✅ README + AGENTS | ✅ CHANGELOG | | |
| find_trajectory MCP op | ✅ CLAUDE.md + AGENTS + contradictions.md | | | |
| Typed-claim fence columns | ✅ skills/migrations/v0.35.7.0.md + CHANGELOG | ✅ migration note (syntax example) | | |
| Migration v67 | ✅ CLAUDE.md + CHANGELOG | | | |

Tutorial / explanation gaps are deliberately deferred — the migration note's fence-syntax example covers the "first typed claim" walkthrough sufficiently. ARCHITECTURE diagrams not drifted (the trajectory work extends existing facts/takes infrastructure; no new component boxes to add).

CHANGELOG voice sell-test (Diataxis 3/3 across all 7 subsections)

Every "What you can now do" subsection answers: what changed (reference) + why it matters (explanation) + how to use it (how-to). No commit-message-shaped prose; everything leads with what the user can now do.

🤖 Generated with Claude Code

garrytan and others added 4 commits May 17, 2026 15:52
… wave 1/3)

Schema (migration v67):
- Add four optional typed-claim columns to facts: claim_metric TEXT,
  claim_value DOUBLE PRECISION, claim_unit TEXT, claim_period TEXT
- Partial index facts_typed_claim_idx ON (entity_slug, claim_metric, valid_from)
  WHERE claim_metric IS NOT NULL
- All nullable, metadata-only on both engines
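
For orientation, the schema change above could be expressed as a list of SQL statements held in a TypeScript constant. This is a sketch only: the column names, types, index name, and index predicate come from this commit message, while the constant name and the IF NOT EXISTS guards are assumptions about how a migration might be made re-runnable.

```typescript
// Sketch of migration v67. Statement order and IF NOT EXISTS guards are
// assumptions; names and types are taken from the commit message.
const MIGRATION_V67_STATEMENTS: readonly string[] = [
  "ALTER TABLE facts ADD COLUMN IF NOT EXISTS claim_metric TEXT",
  "ALTER TABLE facts ADD COLUMN IF NOT EXISTS claim_value DOUBLE PRECISION",
  "ALTER TABLE facts ADD COLUMN IF NOT EXISTS claim_unit TEXT",
  "ALTER TABLE facts ADD COLUMN IF NOT EXISTS claim_period TEXT",
  // Partial index: only rows that carry a typed metric are indexed.
  [
    "CREATE INDEX IF NOT EXISTS facts_typed_claim_idx",
    "ON facts (entity_slug, claim_metric, valid_from)",
    "WHERE claim_metric IS NOT NULL",
  ].join(" "),
];
```

Because every new column is nullable with no default, adding them does not rewrite existing rows, which is what makes the change metadata-only.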

Fence layer:
- ParsedFact (facts-fence.ts) gains optional claimMetric/Value/Unit/Period
- Parser tolerates both 10-cell (legacy) and 14-cell (widened) rows
- Renderer emits 14 cells iff any row has typed data; otherwise stays
  10-cell so existing fences don't widen on unrelated edits
- Numeric value cell tolerates comma thousand separators (50,000 -> 50000)
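
Two of the fence rules above are easy to illustrate in isolation: the comma-tolerant numeric cell and the "widen only when needed" renderer decision. The sketch below uses hypothetical helper names (parseClaimValue, fenceCellCount) and an illustrative TypedClaim shape; it is not the parser in facts-fence.ts.

```typescript
// Illustrative subset of the typed fields that ParsedFact gains.
interface TypedClaim {
  claimMetric?: string;
  claimValue?: number;
  claimUnit?: string;
  claimPeriod?: string;
}

// Parse a numeric cell, tolerating comma thousand separators ("50,000").
// Returns undefined for empty or non-numeric cells (assumed behavior).
function parseClaimValue(cell: string): number | undefined {
  const cleaned = cell.trim().replace(/,/g, "");
  if (cleaned === "") return undefined;
  const parsed = Number(cleaned);
  return Number.isFinite(parsed) ? parsed : undefined;
}

// A row counts as typed if any of the four typed fields is present.
function hasTypedData(row: TypedClaim): boolean {
  return (
    row.claimMetric !== undefined ||
    row.claimValue !== undefined ||
    row.claimUnit !== undefined ||
    row.claimPeriod !== undefined
  );
}

// Render 14 cells only if at least one row carries typed data; otherwise
// stay at the legacy 10 so unrelated edits do not widen existing fences.
function fenceCellCount(rows: TypedClaim[]): 10 | 14 {
  return rows.some(hasTypedData) ? 14 : 10;
}
```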

Extract pipeline (D-CDX-2, D-ENG-1):
- src/core/facts/extract.ts (the actual Haiku call site, NOT extract-facts.ts
  cycle phase) extends its system prompt to emit typed fields for metric-shaped
  claims
- extractFactsFromFenceText gains optional pageEffectiveDate. Precedence:
  fence-row validFrom > pageEffectiveDate > undefined (engine defaults to now)
- normalizeMetricLabel: 15-entry seed map for common founder metrics (mrr,
  arr, runway, headcount, team_size, cac, ltv, gross_margin, burn_rate, cash,
  users, mau, dau, churn_rate, revenue); unknown labels lowercase + space->_
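
A minimal sketch of the label normalization described above, assuming a simple alias lookup ahead of the lowercase + space-to-underscore fallback. The 15 canonical names are the ones listed in this commit message; the alias table is hypothetical, and only the "Monthly Recurring Revenue" mapping is named in the PR.

```typescript
// The 15 canonical metric names listed in the commit message.
const CANONICAL_METRICS: readonly string[] = [
  "mrr", "arr", "runway", "headcount", "team_size",
  "cac", "ltv", "gross_margin", "burn_rate", "cash",
  "users", "mau", "dau", "churn_rate", "revenue",
];

// Hypothetical alias table keyed by the already-lowercased, underscored label.
// A Map avoids accidental hits on Object.prototype keys such as "constructor".
const METRIC_ALIASES = new Map<string, string>([
  ["monthly_recurring_revenue", "mrr"],
]);

// Lowercase, collapse whitespace runs to "_", then apply any known alias.
function normalizeMetricLabel(label: string): string {
  const key = label.trim().toLowerCase().replace(/\s+/g, "_");
  return METRIC_ALIASES.get(key) ?? key;
}

// True when a normalized label is one of the seeded canonical metrics.
function isSeedMetric(label: string): boolean {
  return CANONICAL_METRICS.indexOf(normalizeMetricLabel(label)) !== -1;
}
```

Unknown labels pass through in normalized form, so "Net Dollar Retention" becomes net_dollar_retention rather than being rejected.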

Engine extensions:
- NewFact + insertFact + insertFacts in both engines accept the four typed
  columns (all nullable)
- Cycle phase extract-facts.ts threads page.effective_date through AND
  batch-embeds via gateway.embed() before insertFacts (D-CDX-3 fix for
  cycle-inserted facts arriving with embedding=NULL)

Consolidate fix (D-CDX-4 — Codex F4):
- Replace MAX(row_num)+1 INSERT with semantic upsert on (page_id, claim,
  since_date). Re-running the full cycle on stable input produces zero new
  takes — fixes the pre-existing duplicate-takes bug after extract_facts
  wipes consolidated_at
- Chronological valid_until writeback per cluster: sort by (valid_from ASC,
  id ASC), walk pairs, set older.valid_until = newer.valid_from
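
Both consolidate changes can be sketched as small pure functions: an upsert keyed on (page_id, claim, since_date), and the chronological valid_until walk. The FactRow and Take shapes, the in-memory Map store, and the function names are hypothetical stand-ins for the real SQL path.

```typescript
// Hypothetical row shapes; dates are YYYY-MM-DD strings, which sort lexically.
interface FactRow {
  id: number;
  validFrom: string;
  validUntil: string | null;
}

interface Take {
  pageId: number;
  claim: string;
  sinceDate: string;
}

// Semantic upsert: the same (pageId, claim, sinceDate) always lands on one key,
// so re-running the cycle on stable input adds nothing.
function upsertTake(store: Map<string, Take>, take: Take): void {
  store.set(JSON.stringify([take.pageId, take.claim, take.sinceDate]), take);
}

// Sort by (validFrom ASC, id ASC), then close each older row at the start of
// the next newer one. The newest row keeps whatever validUntil it already had.
function writebackValidUntil(cluster: FactRow[]): FactRow[] {
  const sorted = cluster.slice().sort((a, b) => {
    if (a.validFrom < b.validFrom) return -1;
    if (a.validFrom > b.validFrom) return 1;
    return a.id - b.id; // same-day tiebreaker
  });
  return sorted.map((row, i) => {
    const newer = sorted[i + 1];
    return newer ? { ...row, validUntil: newer.validFrom } : { ...row };
  });
}
```

Running writebackValidUntil on its own output returns the same rows, which is the idempotency property the R4b/R7 tests pin.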

Tests:
- test/migrate.test.ts +6 cases for v67 shape + materialization + nullable
  backward compat
- test/facts-fence-typed.test.ts (new, 17 cases): parser+renderer round-trip,
  normalization seed map coverage, valid_from precedence three-branch
- test/consolidate-valid-until.test.ts (new, 4 cases): chronological
  writeback (R4a), same-day id tiebreaker, cycle re-run zero duplicates
  (R4b/R7), valid_until idempotency
- test/schema-bootstrap-coverage.test.ts: add four typed-claim columns to
  COLUMN_EXEMPTIONS (migration co-defines the partial index, no forward
  reference to bootstrap)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… wave 2/3)

Engine method (D-CDX-1, D-CDX-6):
- BrainEngine.findTrajectory(opts) on both Postgres and PGLite
- TrajectoryOpts: scalar sourceId fast path + sourceIds federated array
  (mirrors v0.34.1.0 search* dual pattern)
- opts.remote: when true, SQL adds AND visibility='world' so OAuth read
  clients see only world-visibility facts (mirrors recall's posture —
  closes the F7 privacy regression Codex caught in plan review)
- Single SQL query, ORDER BY valid_from ASC, id ASC for deterministic
  output (R3 pin). Returns TrajectoryPoint[] including raw embedding so
  the caller can compute drift without a second round-trip
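
The scoping and visibility rules above can be pictured as a small query builder. This is a sketch under assumptions: the source_id column, the entitySlug and metric option names, and the SELECT * projection are hypothetical; the visibility predicate, the scalar/array source split, and the ORDER BY come from this commit message.

```typescript
// Hypothetical options shape; sourceId, sourceIds, and remote are named in the
// commit message, the other fields are illustrative.
interface TrajectoryOpts {
  entitySlug: string;
  metric?: string;
  sourceId?: string;    // scalar fast path
  sourceIds?: string[]; // federated array
  remote?: boolean;     // true for remote (OAuth) callers
}

function buildTrajectoryQuery(opts: TrajectoryOpts): { sql: string; params: unknown[] } {
  const params: unknown[] = [opts.entitySlug];
  const where: string[] = ["entity_slug = $1", "claim_metric IS NOT NULL"];

  if (opts.metric !== undefined) {
    params.push(opts.metric);
    where.push(`claim_metric = $${params.length}`);
  }
  // Assumed column name source_id; scalar takes precedence over the array.
  if (opts.sourceId !== undefined) {
    params.push(opts.sourceId);
    where.push(`source_id = $${params.length}`);
  } else if (opts.sourceIds !== undefined && opts.sourceIds.length > 0) {
    params.push(opts.sourceIds);
    where.push(`source_id = ANY($${params.length})`);
  }
  // Remote callers only ever see world-visible facts.
  if (opts.remote === true) {
    where.push("visibility = 'world'");
  }

  const sql =
    `SELECT * FROM facts WHERE ${where.join(" AND ")} ` +
    "ORDER BY valid_from ASC, id ASC";
  return { sql, params };
}
```

Keeping the visibility predicate inside the single query, rather than filtering afterwards, means a remote caller can never receive a private row.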

Pure function library (src/core/trajectory.ts, new):
- detectRegressions(points, threshold): walks consecutive (metric, value)
  pairs per metric; emits when newer drops >= threshold below older.
  10% default, override via GBRAIN_TRAJECTORY_REGRESSION_THRESHOLD
- computeDriftScore(points): 1 - mean(cosine(emb[i], emb[i-1])) over
  embedded points; clamped [0,1]; null when <3 embedded points (D-ENG-3
  graceful degradation)
- computeTrajectoryStats(points): composed shape returning both
- TRAJECTORY_SCHEMA_VERSION = 1 — additive-only across releases (R5)
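
The drift formula above, 1 - mean(cosine(emb[i], emb[i-1])) clamped to [0,1] and null below three embedded points, can be sketched directly. The function signature (a list of nullable Float32Array embeddings) and the zero-vector handling are assumptions, not the signature in trajectory.ts.

```typescript
// Cosine similarity of two vectors; returns 0 when either has zero norm
// (an assumed convention to avoid dividing by zero).
function cosine(a: Float32Array, b: Float32Array): number {
  const n = Math.min(a.length, b.length);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Drift = 1 - mean similarity between consecutive embedded points.
// Points without an embedding are skipped; fewer than 3 embedded points
// yields null rather than a misleading score.
function computeDriftScore(embeddings: Array<Float32Array | null>): number | null {
  const embedded = embeddings.filter((e): e is Float32Array => e !== null);
  if (embedded.length < 3) return null;
  let sum = 0;
  for (let i = 1; i < embedded.length; i++) {
    sum += cosine(embedded[i], embedded[i - 1]);
  }
  const drift = 1 - sum / (embedded.length - 1);
  return Math.min(1, Math.max(0, drift)); // clamp to [0, 1]
}
```

The clamp matters because cosine similarity can be negative, which would otherwise push the score above 1.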

MCP op (src/core/operations.ts):
- find_trajectory: scope read, NOT localOnly. Routes through
  sourceScopeOpts(ctx) for federated isolation AND threads ctx.remote
  for visibility filtering. Strips raw Float32Array embeddings from the
  wire shape; converts valid_from to YYYY-MM-DD string
- Registered in operations array after find_experts
- FIND_TRAJECTORY_DESCRIPTION in operations-descriptions.ts
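
The wire-shape conversion above (drop the raw embedding, render valid_from as YYYY-MM-DD) might look like the following. The TrajectoryPoint and WirePoint field names and the points key in the envelope are hypothetical; schema_version: 1 is the value named in this PR.

```typescript
// Hypothetical engine-side point, including the raw embedding.
interface TrajectoryPoint {
  id: number;
  metric: string;
  value: number;
  validFrom: Date;
  embedding: Float32Array | null;
}

// Hypothetical wire-side point: no embedding, date as a plain string.
interface WirePoint {
  id: number;
  metric: string;
  value: number;
  valid_from: string;
}

// Copy fields explicitly so the embedding can never leak into the response.
function toWirePoint(point: TrajectoryPoint): WirePoint {
  return {
    id: point.id,
    metric: point.metric,
    value: point.value,
    // toISOString is always UTC, so the first 10 chars are YYYY-MM-DD.
    valid_from: point.validFrom.toISOString().slice(0, 10),
  };
}

function toEnvelope(points: TrajectoryPoint[]): { schema_version: 1; points: WirePoint[] } {
  return { schema_version: 1, points: points.map(toWirePoint) };
}
```

Building the wire object field by field, instead of spreading and deleting, is a deliberate choice here: new engine-side fields stay private until someone adds them on purpose.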

CLIs:
- gbrain eval trajectory <entity> [--metric M] [--since D] [--until D]
  [--limit N] [--json] — chronological human view with [REGRESSION] inline
  annotation; thin-client routing via callRemoteTool(find_trajectory).
  Dispatched in src/commands/eval.ts sub-subcommand block
- gbrain founder scorecard <entity> [--since D] [--until D] [--json] —
  pure aggregation over Phase 2's substrate. Four signals:
  claim_accuracy (over resolved takes), consistency, growth_trajectory,
  red_flags. computeFounderScorecard exported for tests.
  Registered as top-level command in cli.ts; added to CLI_ONLY set
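
As a rough illustration of the scorecard's empty-input behavior, the sketch below covers two of the four signals. Everything here is an assumption except the signal and red-flag names: the input shape, returning null for an empty claim_accuracy, the 0.5 drift cutoff, and the rule for each flag are hypothetical and may not match computeFounderScorecard.

```typescript
// Red-flag kinds named in the test description for this wave.
type RedFlag = "regression" | "narrative_drift" | "missed_prediction";

// Hypothetical inputs, already aggregated from the trajectory substrate.
interface ScorecardInputs {
  resolvedTakes: Array<{ correct: boolean }>;
  regressionCount: number;
  driftScore: number | null;
}

const DRIFT_FLAG_CUTOFF = 0.5; // hypothetical cutoff, not from the PR

// Share of resolved takes that turned out correct. Returns null (never NaN)
// when there is nothing resolved yet.
function claimAccuracy(resolved: Array<{ correct: boolean }>): number | null {
  if (resolved.length === 0) return null;
  const correct = resolved.filter((take) => take.correct).length;
  return correct / resolved.length;
}

// Collect flags in a fixed order so output is deterministic.
function redFlags(inputs: ScorecardInputs): RedFlag[] {
  const flags: RedFlag[] = [];
  if (inputs.regressionCount > 0) flags.push("regression");
  if (inputs.driftScore !== null && inputs.driftScore > DRIFT_FLAG_CUTOFF) {
    flags.push("narrative_drift");
  }
  if (inputs.resolvedTakes.some((take) => !take.correct)) {
    flags.push("missed_prediction");
  }
  return flags;
}
```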

Tests (45 cases across 5 files):
- test/engine-find-trajectory.test.ts: 18 cases — chronological order,
  source scoping (scalar + federated), visibility filter on remote=true,
  metric + since/until filters, regression detection at threshold
  boundaries, drift score with various embedding states
- test/operations-find-trajectory.test.ts: 9 cases — op registration,
  param validation, JSON envelope shape, R5 schema_version: 1,
  embedding stripped from wire, R6 visibility filter, source scoping
- test/eval-trajectory.test.ts: 7 cases — arg parsing, --help,
  --json envelope, regression annotation, --metric filter, empty entity
- test/founder-scorecard.test.ts: 9 cases — empty inputs no-NaN (G2),
  claim_accuracy math, consistency math, growth_trajectory math,
  red_flags fire for regression / narrative_drift / missed_prediction
- test/eval-contradictions/no-valid-until-write.test.ts: 4 cases —
  R1 (probe never writes valid_until under eval-contradictions/) +
  R8 (only allow-listed files write valid_until anywhere in src/)
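
The R1 + R8 grep guard can be approximated by scanning source text for SQL that assigns valid_until and comparing the offenders against an allow-list. This is a sketch, not the test in the repo: the regex is a deliberately conservative heuristic, the in-memory file map stands in for a directory walk, and all file paths in the usage note are hypothetical.

```typescript
// Heuristic: a SET clause followed (before any ";") by "valid_until =".
// Conservative by design: it can also match a WHERE comparison that follows
// a SET in the same statement.
const VALID_UNTIL_WRITE = /\bSET\b[^;]*\bvalid_until\s*=/i;

// Return the sorted paths that appear to write valid_until but are not
// on the allow-list. `files` maps path -> file contents.
function findValidUntilWriters(
  files: Map<string, string>,
  allowList: Set<string>,
): string[] {
  const offenders: string[] = [];
  files.forEach((content, path) => {
    if (VALID_UNTIL_WRITE.test(content) && !allowList.has(path)) {
      offenders.push(path);
    }
  });
  return offenders.sort();
}
```

A test built on this would fail whenever the returned list is non-empty, for example if a file under a hypothetical src/core/eval-contradictions/ directory started issuing UPDATE ... SET valid_until = ....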

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Bumps to v0.35.6.0 (next-minor after master's v0.35.5.1 — typed-claim
substrate + trajectory + founder scorecard is a new user-facing
feature surface, not a fix).

- VERSION + package.json synced
- CHANGELOG.md release-summary block in the wave-style voice, lead with
  what the user can now DO. Sections: typed metric claims in the fence,
  chronological metric trajectories, founder scorecard, MCP
  find_trajectory op, cycle re-run idempotency fix, embedding-on-insert
  fix, valid_from precedence fix. To-take-advantage-of block with
  verification + opt-in fence syntax example
- CLAUDE.md Key Files entry consolidating the wave across
  eval-trajectory.ts + founder-scorecard.ts + trajectory.ts. Names every
  D-ENG / D-CDX decision and the Codex outside-voice F-numbers
- skills/migrations/v0.35.6.md agent-readable migration note. Includes
  fence-syntax example for typed-claim rows so downstream agents start
  emitting them. Iron-rule contracts called out (R1 + R8 + R7 + visibility)
- llms-full.txt regenerated to reflect the new CLAUDE.md entry

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Master shipped v0.35.6.0 (PR #1129 — floor-ratio gate for metadata boost
stages) between this branch's first push and now, colliding with our
v0.35.6.0 slot. Rebump to v0.35.7.0:

- VERSION + package.json: 0.35.6.0 → 0.35.7.0
- CHANGELOG: my temporal-trajectory + founder scorecard entry stays at
  the top, header rewritten to [0.35.7.0]; master's [0.35.6.0]
  floor-ratio entry preserved below it; internal references in my body
  ("v0.35.6 fixes" / "v0.35.6 batches" / "To take advantage of v0.35.6.0")
  rewritten to v0.35.7
- skills/migrations/v0.35.6.md renamed → v0.35.7.md, frontmatter +
  heading rewritten
- llms-full.txt regenerated
- bun.lock fresh (no dependency drift, just pin sync)

Verified: bun run typecheck clean, 356 wave + search + bootstrap tests
pass on the merged tree, no regression introduced by the merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@garrytan garrytan changed the title from "v0.35.6.0 feat: temporal trajectory + founder scorecard (Phases 2-4)" to "v0.35.7.0 feat: temporal trajectory + founder scorecard (Phases 2-4)" May 17, 2026

- README.md: add `gbrain eval trajectory` to EVAL section, add new
  TEMPORAL block covering `gbrain founder scorecard` + the
  GBRAIN_TRAJECTORY_REGRESSION_THRESHOLD env override; add v0.35.7
  "What's new" paragraph below the v0.28.8 LongMemEval blurb
- AGENTS.md: new bullet under Common tasks teaching agents to reach for
  `gbrain eval trajectory` / `gbrain founder scorecard` / the
  `find_trajectory` MCP op when asked to evaluate a founder/company
  over time
- docs/contradictions.md: append "Temporal axis follow-on (v0.35.3.1 +
  v0.35.7)" subsection under See also, cross-linking the trajectory
  substrate and naming the auto-supersession.ts:4 invariant preserved
  by both the verdict enum (probe side) and consolidate's valid_until
  writeback (cycle side)
- CLAUDE.md: fix stale (v0.35.4) tag on the trajectory entry to
  (v0.35.7) — version got rebumped twice during the merge wave
- skills/migrations/v0.35.7.md renamed to v0.35.7.0.md for consistency
  with the v0.35.0.0.md / v0.14.0.md / etc naming convention
- llms-full.txt regenerated to reflect the CLAUDE.md edit

Coverage map (Diataxis):
  /eval trajectory CLI       ✅ ref (README, AGENTS) ✅ how-to (CHANGELOG) ❌ tutorial
  /founder scorecard CLI     ✅ ref (README, AGENTS) ✅ how-to (CHANGELOG) ❌ tutorial
  find_trajectory MCP op     ✅ ref (CLAUDE.md, AGENTS, contradictions.md)
  typed-claim fence cols     ✅ ref (skills/migrations/v0.35.7.0.md, CHANGELOG)
  Migration v67              ✅ ref (CLAUDE.md, CHANGELOG)

No tutorial / explanation gaps worth filling in this PR — the migration
note's fence-syntax example already covers the "first typed claim"
walkthrough. ARCHITECTURE diagrams not drifted (the trajectory work
extends existing facts/takes infrastructure; no new component boxes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@garrytan garrytan merged commit 1dadd9e into master May 18, 2026
7 checks passed
brandonlipman added a commit to brandonlipman/gbrain that referenced this pull request May 29, 2026
* upstream/master:
  v0.37.0.0 feat(skillpack): registry cathedral — third-party publish + install + 10/10 quality bar (garrytan#1208)
  v0.36.6.0 feat: cross-modal search wave (text↔image + unified column + LLM intent) (garrytan#1165)
  v0.36.5.0 feat: secure DATABASE_URL access for shell jobs (inherit: ["database_url"]) (garrytan#1192)
  v0.36.4.0 feat: brain-health-100 — autonomous remediation via doctor --remediate + Minions (garrytan#1193)
  fix(docs): comprehensive drift audit — contradictions, broken links, stale refs (garrytan#1201)
  v0.36.3.0 feat: dynamic embedding column selection for search (garrytan#1164)
  v0.36.2.0 feat: ZeroEntropy as default + zero-based README rewrite (garrytan#1136)
  v0.36.1.1 fix-wave: community PR triage + 28 atomic fixes (garrytan#1182)
  v0.36.1.0 Hindsight calibration wave: brain learns how you tend to be wrong (garrytan#1139)
  v0.36.0.0 feat(skillpack): scaffold + reference + harvest (retire managed-block install) (garrytan#1130)
  v0.35.8.0 feat(cycle): phantom-page redirect inside extract_facts (garrytan#1138)
  v0.35.7.0 feat: temporal trajectory + founder scorecard (Phases 2-4) (garrytan#1131)
  v0.35.6.0 feat(search): floor-ratio gate for metadata boost stages (closes garrytan#1091) (garrytan#1129)
  v0.35.5.1 fix(doctor): stop counting clean supervisor exits as crashes (garrytan#1108)
  v0.35.5.0 fix wave: bootstrap + orphans + think MCP + worktree + walker (garrytan#1111)
  v0.35.4.0 fix(doctor,entities): supervisor crash classification + bare-name resolver + 58x perf + stub guard observability (garrytan#1085)
  v0.35.3.1 feat(eval): temporal-aware contradiction probe + verdict enum (garrytan#1052)
  v0.35.3.0 fix wave: extract_facts items + git --no-recurse-submodules placement (garrytan#1053)

# Conflicts:
#	src/core/postgres-engine.ts
#	test/schema-bootstrap-coverage.test.ts