v0.42.6.0 feat(enrich): gbrain enrich --thin — brain-internal grounded synthesis for stub pages (#1700) by garrytan · Pull Request #1757 · garrytan/gbrain

garrytan · 2026-06-02T02:43:21Z

What this ships

gbrain enrich --thin develops stub (thin) people/company pages at scale by consolidating what the brain already knows about an entity into one cited page — no web lookups. gbrain's own model tooling only sees brain-internal context (search / get_page / facts / backlinks), so enrich does brain-internal grounded synthesis: deterministically retrieve everything scattered about an entity (meetings, other pages, deals, facts, raw notes), then one grounded gateway.chat call per page writes a real dossier with [Source: slug] citations. When the brain knows too little, it skips instead of fabricating. Web research stays the agent-driven enrich SKILL's job.
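The retrieve, ground, then synthesize-or-skip flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Snippet` and `Plan` shapes, `planEnrichment`, and the `MIN_SOURCES` threshold are all hypothetical names chosen for the example, and the real skip criterion may be different.

```typescript
// Hypothetical shapes; none of these names come from the gbrain codebase.
interface Snippet {
  slug: string; // page the text was retrieved from
  text: string;
}

type Plan =
  | { kind: "skip"; reason: string }
  | { kind: "synthesize"; prompt: string };

// Assumed threshold: how many distinct source pages count as "enough".
const MIN_SOURCES = 2;

function planEnrichment(entity: string, snippets: Snippet[]): Plan {
  // Drop empty retrievals before counting evidence.
  const usable = snippets.filter((s) => s.text.trim().length > 0);
  const distinct = new Set(usable.map((s) => s.slug));

  // Too little grounded material: skip rather than let the model invent a dossier.
  if (distinct.size < MIN_SOURCES) {
    return { kind: "skip", reason: `only ${distinct.size} source(s) for ${entity}` };
  }

  // Label every snippet with its slug so the model can cite it.
  const context = usable
    .map((s) => `[Source: ${s.slug}]\n${s.text.trim()}`)
    .join("\n\n");

  const prompt =
    `Write a dossier for ${entity} using only the context below. ` +
    `Cite each claim as [Source: slug].\n\n<context>\n${context}\n</context>`;

  return { kind: "synthesize", prompt };
}
```

The point of the sketch is the ordering: the skip decision is made deterministically from what was retrieved, before any model call is paid for.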

gbrain enrich --thin --dry-run --json                       # preview + cost estimate, no spend
gbrain enrich --thin --limit 3 --max-usd 0.50 --model anthropic:claude-haiku-4-5

Resumable (--resume), budget-capped (--max-usd; best-effort under --workers>1, pin --workers 1 for a hard ceiling), source-scoped (--source), backgroundable (--background). Enriched pages are stamped with enriched_at/enriched_by so a recency guard skips them on the next run. An opt-in autopilot phase, enrich_thin (default OFF), trickles a few pages per source per cycle under per-source and brain-wide cost/walltime caps.
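A recency guard of the kind described above might look like the sketch below. Only the `enriched_at` and `enriched_by` field names come from this PR; the `PageMeta` shape, the helper names, and the 30-day default window are assumptions made for illustration.

```typescript
// Hypothetical metadata shape; only enriched_at/enriched_by are named in the PR.
interface PageMeta {
  slug: string;
  enriched_at?: string; // ISO-8601 timestamp
  enriched_by?: string; // model id that wrote the page
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Assumed window: the actual recency window is not stated in this PR text.
function isRecentlyEnriched(page: PageMeta, now: Date, windowDays = 30): boolean {
  if (!page.enriched_at) return false;
  const stamped = Date.parse(page.enriched_at);
  // An unparseable stamp is treated as "never enriched" so the page is retried.
  if (Number.isNaN(stamped)) return false;
  return now.getTime() - stamped < windowDays * DAY_MS;
}

// Returns a copy of the page metadata carrying the enrichment stamp.
function stamp(page: PageMeta, model: string, now: Date): PageMeta {
  return { ...page, enriched_at: now.toISOString(), enriched_by: model };
}
```

In the PR the guard is applied during candidate selection in SQL; the in-memory version here only shows the comparison being made.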

Commits (whole branch vs master)

feat(engine) — listEnrichCandidates source-aware candidate selection (pg + pglite parity): thin-filter + per-page source-correct inbound count + enriched_at recency guard + whitelisted ORDER BY + LIMIT, lightweight projection (no bodies).
feat(enrich) — gbrain enrich --thin core: runEnrichCore/enrichOne, retrieve→ground→synthesize→put_page write-through, op-checkpoint resume, BudgetTracker cap, per-page advisory lock, CLI + thin-client refuse + Minion handler. Includes the codex-review fixes (below).
feat(cycle) — opt-in enrich_thin autopilot phase with enforced per-source cost cap.
merge origin/master (0.42.2.0) — resolved 4 conflicts; cycle phase count is now 22 (both enrich_thin and skillopt landed; both sides had independently claimed 21).
chore — VERSION/CHANGELOG/CLAUDE.md/llms → 0.42.6.0.
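As a rough illustration of the BudgetTracker cap mentioned in the feat(enrich) commit, the sketch below checks spend against a cap after the fact, which is one way to notice an overshoot when concurrent workers make the cap best-effort. The class is a hypothetical stand-in: only the names `BudgetTracker`, `cap`, and `totalSpent` appear in this PR, and `record` and `isBudgetExhausted` are invented for the example.

```typescript
// Hypothetical stand-in for the real tracker; not the gbrain implementation.
class BudgetTracker {
  private readonly capUsd: number;
  private spentUsd = 0;

  constructor(capUsd: number) {
    this.capUsd = capUsd;
  }

  // Read-only view of the configured ceiling.
  get cap(): number {
    return this.capUsd;
  }

  get totalSpent(): number {
    return this.spentUsd;
  }

  // Assumed helper: accumulate the cost of one completed call.
  record(costUsd: number): void {
    this.spentUsd += costUsd;
  }
}

// Post-hoc check: true once recorded spend has passed the cap,
// regardless of whether any individual call raised an error.
function isBudgetExhausted(tracker: BudgetTracker): boolean {
  return tracker.totalSpent > tracker.cap;
}
```

Because the check reads only accumulated state, it still reports an overage when the call that crossed the cap did not surface an exception.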

Codex review (implementation) — FAIL→addressed

A fresh /codex review of the implementation (the prior codex pass was plan-stage only) found 4 P1 + 2 P2, all verified against the code. Folded in:

P1 injection escape — sanitizeContext neutralizes the <context>…</context> data-envelope delimiters so an untrusted retrieved note can't close the envelope and inject instructions (mirrors the existing </trajectory> convention).
P1 background idempotency — multi-source --background key now carries the run fingerprint (backgroundIdempotencyKey); a re-run with different --model/--limit/--force enqueues new work instead of returning a stale completed job.
P1 budget honesty — budget_exhausted is flagged post-hoc when tracker.totalSpent > cap even if the gateway swallowed the final-call throw (new read-only BudgetTracker.cap getter; no gateway.ts change).
P2 checkpoint flush — body() flushes the checkpoint on BudgetExhausted before it propagates, so resume doesn't re-charge completed pages.
P2 per-source cap — cycle phase enforces min(per-source, brain-wide remaining) instead of leaving the per-source cap parsed-but-ignored.
Accepted (D5): in-flight gateway.chat isn't cancelled on budget abort (documented best-effort overshoot ~1 call/worker; a true fix needs a shared runSlidingPool API change). Code comment added.
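The injection-escape fix in the first P1 item can be pictured with a sketch like this one. It is illustrative only: the function shares the name `sanitizeContext` with the PR, but the body, the HTML-entity escaping strategy, and the `wrapContext` helper are assumptions, and the real implementation may neutralize the delimiters differently.

```typescript
// Illustrative only: the real sanitizeContext may neutralize delimiters differently.
function sanitizeContext(untrusted: string): string {
  // Match <context ...> and </context> in any letter case, tolerating stray
  // whitespace, and rewrite the angle brackets so they no longer form a tag.
  return untrusted.replace(
    /<\s*(\/?)\s*context\b([^>]*)>/gi,
    "&lt;$1context$2&gt;",
  );
}

// Hypothetical helper: wrap sanitized data in the envelope the prompt uses.
function wrapContext(untrusted: string): string {
  return `<context>\n${sanitizeContext(untrusted)}\n</context>`;
}
```

After sanitizing, the only literal `<context>` and `</context>` tags in the prompt are the ones the caller wrote, so retrieved text cannot close the envelope early.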

Verification

bun run typecheck clean; bun run verify 29/29.
Full unit suite is green except for 7 pre-existing env-coupled tests that assert keyless-fallback behavior. The 4 files containing them pass 34/34 in a clean keyless environment (CI-equivalent), and none touch the enrich diff.
enrich e2e (enrich-pglite): 15/15 across 3 runs; the previously flaky budget test now carries the repo-standard 30s timeout. listEnrichCandidates pg↔pglite parity confirmed on real Postgres.

Reviews: ENG cleared (plan), CODEX FAIL absorbed (5 fixes + 1 accepted). All fixtures use placeholder names.

🤖 Generated with Claude Code

…1700) One SQL query per engine: thin-filter + per-page source-correct inbound-link count (to_page_id = p.id, mentions excluded) + enriched_at recency guard + whitelisted ORDER BY (ENRICH_ORDER_SQL) + LIMIT, returning a lightweight projection (no page bodies). EnrichCandidate/EnrichCandidatesOpts types. pg + pglite parity, pinned by engine-parity.test.ts.
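The whitelisted ORDER BY mentioned above follows a common pattern: ORDER BY expressions cannot be passed as bind parameters, so the caller's choice is mapped through a fixed table of known-safe fragments. The sketch below assumes hypothetical order keys, a `pages` table, and column names; the actual contents of ENRICH_ORDER_SQL are not shown in this PR text.

```typescript
// Hypothetical keys and SQL fragments, for illustration only.
const ORDER_SQL = {
  slug_asc: "p.slug ASC",
  oldest_first: "p.updated_at ASC, p.slug ASC",
} as const;

type OrderKey = keyof typeof ORDER_SQL;

// Own-property check so inherited names like "constructor" are rejected.
function isOrderKey(value: string): value is OrderKey {
  return Object.prototype.hasOwnProperty.call(ORDER_SQL, value);
}

function buildCandidateQuery(
  order: string,
  limit: number,
): { text: string; values: number[] } {
  if (!isOrderKey(order)) {
    throw new Error(`unsupported order: ${order}`);
  }
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new Error("limit must be a positive integer");
  }
  // Only a whitelisted fragment is spliced into the SQL text;
  // the limit travels as an ordinary bind parameter.
  const text = `SELECT p.id, p.slug FROM pages p ORDER BY ${ORDER_SQL[order]} LIMIT $1`;
  return { text, values: [limit] };
}
```

Selecting only `id` and `slug` mirrors the lightweight projection idea: candidate selection never needs page bodies.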

…1700) Develops stub pages at scale by consolidating scattered brain knowledge (search + backlinks + facts + raw_data) into one grounded gateway.chat call per page. Resumable (op-checkpoint), budget-capped (best-effort under --workers), per-page advisory lock, put_page write-through. CLI + thin-client refuse + Minion handler. Includes codex-review fixes: sanitizeContext neutralizes the <context> envelope delimiters (P1 injection escape); background fan-out idempotency key carries the run fingerprint (P1); post-hoc budget-overage flag via new BudgetTracker.cap getter (P1); checkpoint flush on budget exhaustion (P2). Accepts the documented best-effort in-flight-cancel limitation (D5) with an explicit code note.
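To illustrate why the idempotency key needs to carry the run fingerprint, here is a minimal sketch. The PR names `backgroundIdempotencyKey` and the options --model, --limit, and --force; the `RunOptions` shape, the helper names, and the key format below are assumptions, and a real implementation would likely hash the fingerprint rather than embed it verbatim.

```typescript
// Hypothetical option set, based on the flags the PR says affect the key.
interface RunOptions {
  source: string;
  model: string;
  limit: number;
  force: boolean;
}

// Canonical, order-stable rendering of the options that change what a run does.
function runFingerprint(opts: RunOptions): string {
  return [
    `model=${encodeURIComponent(opts.model)}`,
    `limit=${opts.limit}`,
    `force=${opts.force ? 1 : 0}`,
  ].join("&");
}

// Assumed key format: one key per (source, fingerprint) pair.
function idempotencyKeyFor(opts: RunOptions): string {
  return `enrich-thin:${encodeURIComponent(opts.source)}:${runFingerprint(opts)}`;
}
```

With a key built only from the source, a second run with a different model or limit would match the first run's completed job; including the fingerprint makes it distinct work.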

Default-OFF trickle around runEnrichCore: develops a few thin pages per source per tick so the brain compounds over time. Per-source cap enforced as min(per-source, brain-wide remaining) with brain-wide total + walltime caps (P2 fix: per-source max_cost_usd was parsed but never enforced). Wired into CyclePhase / ALL_PHASES / PHASE_SCOPE / NEEDS_LOCK + dispatch.
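The min(per-source, brain-wide remaining) rule can be worked through with a small sketch. The function names, the `SourceRun` shape, and the use of integer cents (to sidestep floating-point drift in the example) are assumptions, not the cycle phase's actual code.

```typescript
// Hypothetical record of one source's turn within a cycle.
interface SourceRun {
  source: string;
  capCents: number;
  spentCents: number;
}

// Effective cap for the next source: never more than the per-source cap,
// and never more than what is left of the brain-wide budget.
function effectiveCapCents(
  perSourceCents: number,
  brainWideCents: number,
  spentCents: number,
): number {
  const remaining = Math.max(0, brainWideCents - spentCents);
  return Math.min(perSourceCents, remaining);
}

// `spend` stands in for running enrichment on one source under a cap
// and reporting what it actually cost.
function runCycle(
  sources: string[],
  perSourceCents: number,
  brainWideCents: number,
  spend: (source: string, capCents: number) => number,
): SourceRun[] {
  const runs: SourceRun[] = [];
  let total = 0;
  for (const source of sources) {
    const capCents = effectiveCapCents(perSourceCents, brainWideCents, total);
    if (capCents === 0) break; // brain-wide budget is used up
    const spentCents = spend(source, capCents);
    total += spentCents;
    runs.push({ source, capCents, spentCents });
  }
  return runs;
}
```

For example, with a 50-cent per-source cap and a 120-cent brain-wide cap, four sources that each spend their full allowance get caps of 50, 50, and 20, and the fourth source is not run.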

…n-batch # Conflicts: # src/cli.ts # src/core/cycle.ts # test/core/cycle.serial.test.ts # test/phase-scope-coverage.test.ts

gbrain enrich --thin batch enrichment (#1700) + codex-review fixes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…n-batch # Conflicts: # CHANGELOG.md # VERSION # package.json

* upstream/master:
  v0.42.8.0 feat: content-quality gate on sync — quarantine junk + flag boilerplate (garrytan#1699) (garrytan#1756)
  v0.42.7.0 feat(extract): link/timeline extraction freshness watermark — gbrain extract --stale + doctor lag check (garrytan#1696) (garrytan#1755)
  v0.42.6.0 feat(enrich): gbrain enrich --thin — brain-internal grounded synthesis for stub pages (garrytan#1700) (garrytan#1757)
  v0.42.5.0 fix(minions): RSS watchdog opacity + pooler-reap self-heal + silent lens backlog + cycle lint DB-disconnect (garrytan#1678) (garrytan#1735)
  v0.42.4.0 fix: think --model fails loud — slash-form ids + never persist empty synthesis (garrytan#1698) (garrytan#1736)
  v0.42.3.0 feat(search): autocut — score-discontinuity result-sizing (garrytan#1663 wave 1) (garrytan#1682)
  v0.42.2.0 feat: gbrain connect — one-command Claude Code onboarding from a bearer token (garrytan#1683)

garrytan and others added 6 commits June 1, 2026 19:10

Merge remote-tracking branch 'origin/master' into garrytan/enrich-thi…

102e407

chore: bump version and changelog (v0.42.6.0)

e2ccf40

Merge remote-tracking branch 'origin/master' into garrytan/enrich-thi…

3e89286

garrytan merged commit 662a6e2 into master Jun 2, 2026
21 checks passed

garrytan merged 6 commits into master from garrytan/enrich-thin-batch
