v0.42.27.0 feat(idea-lineage): first-class idea_lineage op + feature eval + skill hardening by garrytan · Pull Request #1940 · garrytan/gbrain

garrytan · 2026-06-07T14:01:11Z

Summary

Builds on community PR #1830 (the original idea-lineage thinking skill by @davidNbreslauer) and takes it from a markdown workflow to a first-class, tested capability:

Hardened + deepened the skill: graph/timeline-aware evidence gathering (get_backlinks, traverse_graph depth-2, get_timeline), a concrete high/medium/low confidence rubric + degraded-evidence note, and a hardened routing boundary (adversarial fixtures separating it from concept-synthesis and trajectory queries; new low-collision "changed my mind about" trigger).
First-class idea_lineage op (src/core/operations.ts): resolve a free-text idea to its best concept/page anchor, then gather dated evidence (matches, related concepts via backlinks + graph, timeline anchors, takes, optional entity trajectory, idea-scoped cached contradictions). Handler-orchestrated over existing engine primitives (no new engine method), with a two-phase resolve→gather flow, scope:'read' + localOnly, and an in-handler ctx.remote reject. CLI (gbrain idea-lineage <idea>) + MCP tool auto-generated from the contract. Returns candidate anchors + disambiguation_needed and a degraded flag.
Feature-recovery eval: a synthetic-corpus op test asserting lineage recovery, a cross-engine determinism case, and gbrain eval idea-lineage <idea> reporting evidence coverage (new lineage_evidence_coverage glossary metric), persisting to .gbrain-evals/ (gitignored).
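The glossary definition of lineage_evidence_coverage is not reproduced in this description. As a rough illustration only, the sketch below assumes it is the share of evidence buckets that returned at least one item. The bucket names, the `lineageEvidenceCoverage` and `toJsonlLine` helpers, and the record fields are all hypothetical.

```typescript
// Hypothetical evidence buckets, keyed by gather channel name.
type EvidenceBuckets = Record<string, unknown[]>;

// ASSUMED definition: fraction of buckets that hold at least one item.
// The real metric is defined in docs/eval/METRIC_GLOSSARY.md and may differ.
function lineageEvidenceCoverage(buckets: EvidenceBuckets): number {
  const all = Object.values(buckets);
  if (all.length === 0) return 0;
  const filled = all.filter((items) => items.length > 0).length;
  return filled / all.length;
}

// One JSON object per line, the usual shape for a .jsonl results file.
// Field names here are illustrative, not the eval's real schema.
function toJsonlLine(idea: string, buckets: EvidenceBuckets): string {
  return JSON.stringify({
    idea,
    lineage_evidence_coverage: lineageEvidenceCoverage(buckets),
  });
}
```

A coverage number like this keeps the eval deterministic because it counts evidence rather than judging narrative quality.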

The op is local-only by design: it composes read primitives whose source/visibility filtering isn't uniform yet, so v1 is scoped to local/trusted callers (see TODOS.md for the federated/remote follow-up). Classification (reversal vs abandoned branch) stays agent-side; the op returns evidence, not narrative, so it stays deterministic and engine-parity-testable.
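The resolve→gather flow and the local-only guard described above can be sketched roughly as follows. This is a minimal illustration, not the handler in src/core/operations.ts: the `Primitives` interface, the result shape beyond the `disambiguation_needed` and `degraded` names, and the 0.1 score-gap heuristic are assumptions.

```typescript
// Hypothetical shapes for illustration only.
interface Anchor { slug: string; score: number }
interface Ctx { remote: boolean }
interface LineageResult {
  anchor: Anchor | null;
  candidates: Anchor[];
  disambiguation_needed: boolean;
  degraded: boolean;
  evidence: Record<string, string[]>;
}

// Stand-ins for existing engine read primitives (assumed, not the real API).
interface Primitives {
  search(idea: string): Anchor[];
  backlinks(slug: string): string[];
  timeline(slug: string): string[];
}

function ideaLineage(ctx: Ctx, idea: string, engine: Primitives): LineageResult {
  // v1 is local-only: remote callers are rejected inside the handler.
  if (ctx.remote) throw new Error("idea_lineage is local-only");

  // Phase 1 (resolve): rank candidate anchors, best first.
  const candidates = [...engine.search(idea)].sort((a, b) => b.score - a.score);
  const [first, second] = candidates;
  const anchor = first ?? null;
  // ASSUMED heuristic: ask for disambiguation when the top two scores are close.
  const disambiguation_needed =
    first !== undefined && second !== undefined && first.score - second.score < 0.1;

  // Phase 2 (gather): collect evidence around the chosen anchor.
  const evidence: Record<string, string[]> = anchor
    ? { backlinks: engine.backlinks(anchor.slug), timeline: engine.timeline(anchor.slug) }
    : {};

  // Degraded when nothing resolved or every evidence bucket came back empty.
  const degraded =
    anchor === null || Object.values(evidence).every((items) => items.length === 0);

  return { anchor, candidates, disambiguation_needed, degraded, evidence };
}
```

The function returns evidence buckets and flags only. Deciding whether the history is a reversal or an abandoned branch is left to the calling agent, which matches the split described above.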

Design review

Went through /plan-eng-review + a Codex outside-voice plan review (4 cross-model tensions decided: handler-orchestration over an engine method, keep the op thin, local-only v1, deterministic parity assertions) and a Codex adversarial diff review (takes source-scoping + eval source-resolution fixed; the GBRAIN_HOME-preload isolation was replaced with per-test isolation after it was found to leak into subprocess tests).

Testing

idea-lineage op test (synthetic-corpus recovery + contract) 10/0; metric-glossary 18/0; engine-parity (incl. new idea_lineage case) 14/0 against real Postgres + PGLite; skills-conformance/resolver/skillpack 371/0; routing-eval 116/0 (0 false positives); bun run verify 30/30; typecheck clean.
Test isolation fix: the suite's "no API key configured" assertions now isolate from a developer's real ~/.gbrain/config.json (they previously passed in CI but failed locally on a configured brain). Verified the 6 affected files 85/0 under the real config.

Supersedes

Supersedes #1830 (cross-repo fork PR — these improvements build on that work; original skill credited in the CHANGELOG).

🤖 Generated with Claude Code

Documentation

Docs synced for this release:

CHANGELOG.md — v0.42.27.0 entry (sell-test: what / why / how, with copy-paste commands).
docs/eval/METRIC_GLOSSARY.md — regenerated with the new lineage_evidence_coverage metric (CI freshness guard green).
TODOS.md — filed the federated/remote idea_lineage follow-up (deferred from v1's local-only scope).
skills/RESOLVER.md + skills/manifest.json + operations-descriptions.ts — idea-lineage skill row + idea_lineage op description (the canonical reference surfaces).

Documentation debt (pre-existing, not from this PR)

⚠️ CLAUDE.md says "GBrain ships 29 skills" but the manifest now holds 51 — long-standing drift from prior releases that didn't update the prose count. Out of scope here (touching CLAUDE.md triggers an llms-bundle regen); worth a standalone cleanup. Reference-quadrant fix.

Add graph/timeline tools (get_backlinks, traverse_graph depth-2, get_timeline) to the skill, a concrete high/medium/low confidence rubric + degraded-evidence note, and a new low-collision "changed my mind about" trigger. Expand routing-eval fixtures with adversarial paraphrases + trajectory/query negatives, plus a protective concept-synthesis boundary case. RESOLVER + llms-full updated.

Resolve a free-text idea to its best concept/page anchor, then gather dated evidence (matches, backlinks + depth-2 graph, timeline, takes, optional entity trajectory, cached contradictions). Handler-orchestrated over existing engine primitives (no new engine method); resolve→gather two-phase; scope:'read' + localOnly with a ctx.remote reject (sidesteps the federated-scope/visibility gaps in getBacklinks/getTimeline). Embeddings stripped at the wire boundary. Extract the contradiction slug-filter into a shared contradiction-filter helper reused by find_contradictions (DRY). Pin IDEA_LINEAGE_DESCRIPTION.
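Stripping embeddings at the wire boundary could look something like the sketch below. The `MatchRow` shape and the `stripEmbedding` name are hypothetical; the point is only that the vector is dropped just before a row is serialized for the caller.

```typescript
// Hypothetical row shape: an engine row may carry a large embedding vector.
interface MatchRow {
  slug: string;
  title: string;
  embedding?: number[];
}

// The wire shape is the row without its embedding.
type WireMatch = Omit<MatchRow, "embedding">;

// Drop the embedding and keep every other field unchanged.
function stripEmbedding(row: MatchRow): WireMatch {
  const { embedding: _embedding, ...rest } = row;
  return rest;
}

// Convenience wrapper for a whole result list.
function toWire(rows: MatchRow[]): WireMatch[] {
  return rows.map(stripEmbedding);
}
```

Doing this once at the boundary, rather than in each gather channel, keeps internal code free to use the vectors while guaranteeing the response never includes them.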

…eage Synthetic-corpus op test asserting lineage recovery (resolution, disambiguation, idea-scoped contradictions, embedding-stripped wire shape, remote reject, empty→degraded). Cross-engine parity case asserting deterministic evidence (top-result + non-vector set-equal). New `gbrain eval idea-lineage <idea>` CLI reporting evidence coverage, persisting to .gbrain-evals/idea-lineage-results.jsonl (explicit persistence; gitignored). Add lineage_evidence_coverage glossary metric + render group; regenerate METRIC_GLOSSARY.md.
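One reading of the "top-result + non-vector set-equal" parity assertion is sketched here: both engines must agree on the first result and on the set of returned slugs, while ordering below the top is allowed to differ. The `EvidenceRow` shape and `evidenceParity` helper are hypothetical and are not taken from the engine-parity test.

```typescript
// Hypothetical minimal row: only the slug matters for the comparison.
interface EvidenceRow { slug: string }

// True when both result lists share the same top result and the same set of
// slugs. Order below the top is ignored, since tie-breaking may differ by engine.
function evidenceParity(a: EvidenceRow[], b: EvidenceRow[]): boolean {
  if (a.length === 0 || b.length === 0) return a.length === b.length;
  if (a[0].slug !== b[0].slug) return false;
  const setA = new Set(a.map((row) => row.slug));
  const setB = new Set(b.map((row) => row.slug));
  return setA.size === setB.size && Array.from(setA).every((slug) => setB.has(slug));
}
```

Comparing sets instead of full orderings avoids flaky failures when two engines break ranking ties differently, while still catching missing or extra evidence.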

Point GBRAIN_HOME at a throwaway temp dir in the shared preload so the suite never reads the developer's real ~/.gbrain/config.json. Without this, tests that assert "no API key configured" behavior (think degradation, hasAnthropicKey, probeLlmAvailability, ZE-key health, dream synthesize) pass in CI but fail on any machine with a configured brain, because loadConfig() resolves the key from the config file even after the env var is deleted. Makes local runs match CI.

…skill

…solution Address adversarial-review findings:
- Replace the GBRAIN_HOME-in-preload approach with per-test suppressAnthropicKey (file-level beforeAll/afterAll) on the in-process no-key tests. The preload leaked GBRAIN_HOME into HOME-isolated subprocess tests (skillpack-check, doctor-home-dir, init-migrate-only, …) and broke their child-process isolation. Per-test isolation touches no subprocess test.
- idea_lineage: source-scope the takes gather (was unscoped while every other evidence bucket was scoped).
- gbrain eval idea-lineage: resolve the source via the canonical resolveSourceId chain (--source / GBRAIN_SOURCE / .gbrain-source) instead of hardcoding default.
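The scoped save/restore pattern behind per-test isolation can be illustrated with a small helper. This is not the repository's suppressAnthropicKey: `suppressKey` and the `Env` type are hypothetical, and the sketch covers only an env-like record. The real helper also has to keep loadConfig() from picking the key up from the config file, which is not shown here.

```typescript
// Hypothetical env-like record (process.env has this shape in Node and Bun).
type Env = Record<string, string | undefined>;

// Hide `name` from `env` and return a function that restores the prior state.
function suppressKey(env: Env, name: string): () => void {
  const had = Object.prototype.hasOwnProperty.call(env, name);
  const saved = env[name];
  delete env[name];
  return () => {
    // Restore exactly what was there before: the old value, or absence.
    if (had) env[name] = saved;
    else delete env[name];
  };
}
```

In a test file the suppress call would typically run in a file-level beforeAll and the returned restore function in afterAll, so the change never outlives the file and never reaches subprocess tests that manage their own HOME.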

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…Ids[] getBacklinks, getTimeline, searchTakes, and searchTakesVector now accept the federated sourceIds[] array (array path wins over scalar; neither = no filter), mirroring findTrajectory's source predicate. searchTakes/searchTakesVector gain real source_id isolation (previously holder-allow-list only — holder-scope is not a source boundary). getTimeline's 8-case branch collapses to one composed query. Both engines move in lockstep; engine-parity asserts cross-source EXCLUSION for each method.
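The precedence rule stated above (array path wins over scalar; neither means no filter) can be sketched as a small predicate builder. The `SourceScope` and `Predicate` shapes, the `sourcePredicate` name, and the SQL fragments are assumptions for illustration and are not copied from either engine.

```typescript
// Hypothetical scope options: a scalar source, a federated array, or neither.
interface SourceScope {
  sourceId?: string;
  sourceIds?: string[];
}

// A SQL fragment plus its positional parameters (illustrative shape).
interface Predicate {
  sql: string;
  params: unknown[];
}

function sourcePredicate(scope: SourceScope, column = "source_id"): Predicate {
  // Array path wins, even when a scalar is also present.
  if (scope.sourceIds !== undefined) {
    return { sql: `${column} = ANY($1)`, params: [scope.sourceIds] };
  }
  // Scalar path: a single-source equality filter.
  if (scope.sourceId !== undefined) {
    return { sql: `${column} = $1`, params: [scope.sourceId] };
  }
  // Neither given: no source filter at all.
  return { sql: "TRUE", params: [] };
}
```

In this sketch an empty `sourceIds` array still takes the array path, so the query matches no rows instead of silently falling back to the scalar or to no filter. Whether the engines make the same choice is not stated in the commit message.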

Drop localOnly + the runtime ctx.remote reject; thread one validated sourceScopeOpts scope to all five gather channels. findTrajectory now threads remote=ctx.remote===true (world-only facts for remote — fixes a hardcoded remote:false private-fact leak). Contradictions (global, unscoped trend) are omitted for remote callers. p.source is validated against ctx.auth.allowedSources for remote callers (closes a cross-source IDOR). Phase-2 gather uses Promise.allSettled with a partial/errors flag; schema_version 1->2. HTTP/OAuth MCP only — not added to the subagent allow-list (deferred). Remote-safety unit tests + description updated; TODOS follow-ups filed.
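Two of the changes above lend themselves to a short sketch: checking a requested source against the caller's allowed sources, and gathering with Promise.allSettled so that one failing channel produces a partial result instead of a failed call. The `validateSource` and `gatherAll` names, the channel map, and the `GatherResult` shape are hypothetical; only the partial/errors idea and the allowed-sources check come from the commit message.

```typescript
interface GatherResult {
  evidence: Record<string, unknown[]>;
  partial: boolean;
  errors: string[];
}

// Remote callers may only name a source they are allowed to read.
// Local callers are trusted in this sketch.
function validateSource(source: string, remote: boolean, allowedSources: string[]): string {
  if (remote && !allowedSources.includes(source)) {
    throw new Error(`source not permitted: ${source}`);
  }
  return source;
}

// Run every gather channel concurrently and keep whatever succeeded.
async function gatherAll(
  channels: Record<string, () => Promise<unknown[]>>,
): Promise<GatherResult> {
  const entries = Object.entries(channels);
  // The async wrapper turns a synchronous throw into a rejection as well.
  const settled = await Promise.allSettled(entries.map(async ([, run]) => run()));
  const evidence: Record<string, unknown[]> = {};
  const errors: string[] = [];
  settled.forEach((outcome, i) => {
    const name = entries[i][0];
    if (outcome.status === "fulfilled") evidence[name] = outcome.value;
    else errors.push(`${name}: ${String(outcome.reason)}`);
  });
  return { evidence, partial: errors.length > 0, errors };
}
```

Validating the source before any channel runs means every gather channel receives the same already-checked scope, which is the property the "one validated sourceScopeOpts" wording points at.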

…skill
# Conflicts:
# CHANGELOG.md
# TODOS.md
# VERSION
# package.json
# test/helpers/no-anthropic-key.ts
# test/think-pipeline.serial.test.ts

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

davidNbreslauer and others added 8 commits June 3, 2026 14:51

feat(skills): add idea-lineage

0564c5b

Merge remote-tracking branch 'origin/master' into codex/idea-lineage-…

0d0a1a6

…skill

chore: bump version and changelog (v0.42.27.0)

e5e68ab

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

garrytan mentioned this pull request Jun 7, 2026

v0.42.30.0 feat(operations): make idea_lineage MCP/agent-callable (lift local-only) #1830

Merged

3 tasks

garrytan and others added 4 commits June 7, 2026 08:37

Merge remote-tracking branch 'origin/master' into codex/idea-lineage-…

d9b14c0

…skill
# Conflicts:
# CHANGELOG.md
# TODOS.md
# VERSION
# package.json
# test/helpers/no-anthropic-key.ts
# test/think-pipeline.serial.test.ts

chore: bump version and changelog (v0.42.30.0)

eb0f5db

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

garrytan wants to merge 12 commits into master from codex/idea-lineage-skill

2 participants
