v0.42.6.0 feat(enrich): gbrain enrich --thin — brain-internal grounded synthesis for stub pages (#1700)#1757
Merged
Merged
Conversation
…1700) One SQL query per engine: thin-filter + per-page source-correct inbound-link count (to_page_id = p.id, mentions excluded) + enriched_at recency guard + whitelisted ORDER BY (ENRICH_ORDER_SQL) + LIMIT, returning a lightweight projection (no page bodies). EnrichCandidate/EnrichCandidatesOpts types. pg + pglite parity, pinned by engine-parity.test.ts.
…1700) Develops stub pages at scale by consolidating scattered brain knowledge (search + backlinks + facts + raw_data) into one grounded gateway.chat call per page. Resumable (op-checkpoint), budget-capped (best-effort under --workers), per-page advisory lock, put_page write-through. CLI + thin-client refuse + Minion handler. Includes codex-review fixes: sanitizeContext neutralizes the <context> envelope delimiters (P1 injection escape); background fan-out idempotency key carries the run fingerprint (P1); post-hoc budget-overage flag via new BudgetTracker.cap getter (P1); checkpoint flush on budget exhaustion (P2). Accepts the documented best-effort in-flight-cancel limitation (D5) with an explicit code note.
Default-OFF trickle around runEnrichCore: develops a few thin pages per source per tick so the brain compounds over time. Per-source cap enforced as min(per-source, brain-wide remaining) with brain-wide total + walltime caps (P2 fix: per-source max_cost_usd was parsed but never enforced). Wired into CyclePhase / ALL_PHASES / PHASE_SCOPE / NEEDS_LOCK + dispatch.
…n-batch # Conflicts: # src/cli.ts # src/core/cycle.ts # test/core/cycle.serial.test.ts # test/phase-scope-coverage.test.ts
gbrain enrich --thin batch enrichment (#1700) + codex-review fixes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…n-batch # Conflicts: # CHANGELOG.md # VERSION # package.json
mgunnin
added a commit
to mgunnin/gbrain
that referenced
this pull request
Jun 3, 2026
* upstream/master: v0.42.8.0 feat: content-quality gate on sync — quarantine junk + flag boilerplate (garrytan#1699) (garrytan#1756) v0.42.7.0 feat(extract): link/timeline extraction freshness watermark — gbrain extract --stale + doctor lag check (garrytan#1696) (garrytan#1755) v0.42.6.0 feat(enrich): gbrain enrich --thin — brain-internal grounded synthesis for stub pages (garrytan#1700) (garrytan#1757) v0.42.5.0 fix(minions): RSS watchdog opacity + pooler-reap self-heal + silent lens backlog + cycle lint DB-disconnect (garrytan#1678) (garrytan#1735) v0.42.4.0 fix: think --model fails loud — slash-form ids + never persist empty synthesis (garrytan#1698) (garrytan#1736) v0.42.3.0 feat(search): autocut — score-discontinuity result-sizing (garrytan#1663 wave 1) (garrytan#1682) v0.42.2.0 feat: gbrain connect — one-command Claude Code onboarding from a bearer token (garrytan#1683)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this ships
gbrain enrich --thindevelops stub (thin) people/company pages at scale by consolidating what the brain already knows about an entity into one cited page — no web lookups. gbrain's own model tooling only sees brain-internal context (search / get_page / facts / backlinks), so enrich does brain-internal grounded synthesis: deterministically retrieve everything scattered about an entity (meetings, other pages, deals, facts, raw notes), then one groundedgateway.chatcall per page writes a real dossier with[Source: slug]citations. When the brain knows too little, it skips instead of fabricating. Web research stays the agent-drivenenrichSKILL's job.gbrain enrich --thin --dry-run --json # preview + cost estimate, no spend gbrain enrich --thin --limit 3 --max-usd 0.50 --model anthropic:claude-haiku-4-5Resumable (
--resume), budget-capped (--max-usd; best-effort under--workers>1, pin--workers 1for a hard ceiling), source-scoped (--source), backgroundable (--background). Enriched pages stampenriched_at/enriched_byso a recency guard skips them next run. Opt-in autopilot phaseenrich_thin(default OFF) trickles a few pages/source/cycle with per-source + brain-wide cost/walltime caps.Commits (whole branch vs master)
feat(engine)—listEnrichCandidatessource-aware candidate selection (pg + pglite parity): thin-filter + per-page source-correct inbound count +enriched_atrecency guard + whitelisted ORDER BY + LIMIT, lightweight projection (no bodies).feat(enrich)—gbrain enrich --thincore:runEnrichCore/enrichOne, retrieve→ground→synthesize→put_pagewrite-through, op-checkpoint resume, BudgetTracker cap, per-page advisory lock, CLI + thin-client refuse + Minion handler. Includes the codex-review fixes (below).feat(cycle)— opt-inenrich_thinautopilot phase with enforced per-source cost cap.origin/master(0.42.2.0) — resolved 4 conflicts; cycle phase count is now 22 (bothenrich_thinandskilloptlanded; both sides had independently claimed 21).chore— VERSION/CHANGELOG/CLAUDE.md/llms → 0.42.6.0.Codex review (implementation) — FAIL→addressed
A fresh
/codex reviewof the implementation (the prior codex pass was plan-stage only) found 4 P1 + 2 P2, all verified against the code. Folded in:sanitizeContextneutralizes the<context>…</context>data-envelope delimiters so an untrusted retrieved note can't close the envelope and inject instructions (mirrors the existing</trajectory>convention).--backgroundkey now carries the run fingerprint (backgroundIdempotencyKey); a re-run with different--model/--limit/--forceenqueues new work instead of returning a stale completed job.budget_exhaustedis flagged post-hoc whentracker.totalSpent > capeven if the gateway swallowed the final-call throw (new read-onlyBudgetTracker.capgetter; nogateway.tschange).body()flushes the checkpoint onBudgetExhaustedbefore it propagates, so resume doesn't re-charge completed pages.min(per-source, brain-wide remaining)instead of leaving the per-source cap parsed-but-ignored.gateway.chatisn't cancelled on budget abort (documented best-effort overshoot ~1 call/worker; a true fix needs a sharedrunSlidingPoolAPI change). Code comment added.Verification
bun run typecheckclean;bun run verify29/29.enrich-pglite15/15 × 3 runs (the prior flaky budget test now carries the repo-standard 30s timeout);listEnrichCandidatespg↔pglite parity confirmed on real Postgres.Reviews: ENG cleared (plan), CODEX FAIL absorbed (5 fixes + 1 accepted). All fixtures use placeholder names.
🤖 Generated with Claude Code