v0.37.8.0 feat: voyage-code-3 discoverability + reindex-code cost-preview fix by garrytan · Pull Request #1267 · garrytan/gbrain

garrytan · 2026-05-21T15:22:57Z

Summary

For agents indexing source code with gbrain, the right embedding model is now obvious, and the brain tells you so out loud. Voyage publishes voyage-code-3, a code-tuned embedding model with head-to-head numbers above Voyage's general-purpose flagships on code retrieval. The model has been registered in gbrain since before v0.33, but nothing in the discovery path said "use this for code." This PR closes that gap on four surfaces (decision tree, Topology 3 doc, setup skill, runtime nudge) and bundles a small cost-preview correctness fix so the nudge lands cleanly.

Discoverability:

docs/integrations/embedding-providers.md — new "Code-heavy brain" branch in the Decision tree + dedicated voyage-code-3 paragraph in the Voyage section
docs/architecture/topologies.md — Topology 3 (per-worktree code brain) gets a "Recommended embedding model" subsection with gbrain init --pglite --embedding-model voyage:voyage-code-3 --embedding-dimensions 1024 one-shot
skills/setup/SKILL.md — Topology 3 picker option one-liner pointing at voyage-code-3
src/commands/reindex-code.ts — new shouldNudgeCodeModel(bareModelName) pure helper + integration in runReindexCode so dry-run AND execute both surface a stderr nudge when the configured embedding model isn't code-tuned. Suppress with GBRAIN_NO_CODE_MODEL_NUDGE=1, --no-embed, or --json.
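
A minimal sketch of how the nudge decision described above might look, assuming the helper compares a lowercased bare model name against a small allow-list of code-tuned models. Only `shouldNudgeCodeModel` and the three suppression switches come from this PR. The `CODE_TUNED_MODELS` set, the `NudgeOptions` shape, and `shouldEmitNudge` are hypothetical names for illustration, and the actual code in src/commands/reindex-code.ts may differ.

```typescript
// Hypothetical allow-list; this PR only names voyage-code-3 as code-tuned.
const CODE_TUNED_MODELS = new Set<string>(["voyage-code-3"]);

// Pure helper: true when the configured (bare) model name is not code-tuned.
// An empty name is treated as "nothing configured", so no nudge fires.
function shouldNudgeCodeModel(bareModelName: string): boolean {
  const name = bareModelName.trim().toLowerCase();
  if (name === "") return false;
  return !CODE_TUNED_MODELS.has(name);
}

// Hypothetical option bag mirroring the suppression switches listed above.
interface NudgeOptions {
  json: boolean; // --json
  noEmbed: boolean; // --no-embed
  env: Record<string, string | undefined>; // e.g. process.env
}

// Combines the suppression switches with the pure helper.
function shouldEmitNudge(bareModelName: string, opts: NudgeOptions): boolean {
  if (opts.json || opts.noEmbed) return false;
  if (opts.env["GBRAIN_NO_CODE_MODEL_NUDGE"] === "1") return false;
  return shouldNudgeCodeModel(bareModelName);
}
```

Keeping the model check pure and the flag handling separate makes both halves easy to test in isolation; a caller would write the nudge text to stderr only when `shouldEmitNudge` returns true.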

Bundled correctness fix:

src/commands/reindex-code.ts — cost-preview model field swapped from the v0.13-era EMBEDDING_MODEL back-compat constant (hardcoded text-embedding-3-large) to getEmbeddingModelName() at all five usage sites. Pre-fix, the preview would have printed a directly-contradictory model name right next to the new nudge.
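
To illustrate the class of bug fixed here, a minimal sketch of a cost preview whose model field is resolved at call time rather than read from a frozen constant. Only the names `EMBEDDING_MODEL` and `getEmbeddingModelName` come from the PR. The `CostPreview` shape, both builder functions, and the getter being passed in as a parameter are assumptions made so the example is self-contained; this is not gbrain's actual API.

```typescript
// Back-compat constant as described in the PR: always the same string,
// regardless of what the user has configured.
const EMBEDDING_MODEL = "text-embedding-3-large";

// Hypothetical preview shape for illustration.
interface CostPreview {
  model: string;
  chunks: number;
}

// Pre-fix pattern: the preview reports the constant, not the configured model.
function buildCostPreviewStale(chunks: number): CostPreview {
  return { model: EMBEDDING_MODEL, chunks };
}

// Post-fix pattern: ask for the configured model name at call time.
// The getter stands in for getEmbeddingModelName(); it is injected here
// only so the sketch runs without a configured gateway.
function buildCostPreview(
  chunks: number,
  getEmbeddingModelName: () => string,
): CostPreview {
  return { model: getEmbeddingModelName(), chunks };
}
```

With voyage-code-3 configured, the stale builder still reports text-embedding-3-large while the call-time builder reports voyage-code-3, which is the contradiction the fix removes.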

Test Coverage

[+] src/commands/reindex-code.ts
  ├── shouldNudgeCodeModel(bare) pure fn               [★★★ TESTED] 6 cases (3-large, 3-small, voyage-4-large fire; voyage-code-3, Voyage-Code-3, empty don't)
  ├── runReindexCode nudge integration                 [★★★ TESTED] 5 CLI cases (stderr placement, --json, --no-embed, env var, already-optimal)
  └── runReindexCode model field (regression)          [★★★ TESTED] 3 cases (voyage-code-3, text-embedding-3-small, voyage-4-large round-trip)
[+] src/core/ai/recipes/voyage.ts (regression-pin)
  └── voyage-code-3 in models[], dims set, dimsProviderOptions SDK shape  [★★★ TESTED] 4 cases

COVERAGE: 18/18 paths tested (100%)
QUALITY: ★★★:14 ★★:0 ★:0

New test files:

test/ai/voyage-code-3-recipe.test.ts (recipe regression — 4 cases)
test/reindex-code-nudge.serial.test.ts (pure helper + CLI integration — 11 cases)
test/reindex-code-model-source.serial.test.ts (IRON-RULE cost-preview regression — 3 cases)

Existing test/reindex-code.test.ts updated to configure the gateway in beforeAll (required now that the cost-preview field reads from the gateway).

Pre-Landing Review

Ran /plan-eng-review against the plan before implementation. 6 findings surfaced + accepted, 0 unresolved, 0 critical gaps. Codex outside-voice caught 3 real correctness bugs in the original nudge helper (bare vs qualified model IDs, dry-run path miss, --no-embed flag naming) plus a Topology 3 config set + init ordering trap — all absorbed into the plan and the implementation.
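
The bare-versus-qualified model ID bug is easy to reproduce in miniature. The sketch below assumes qualified IDs take the `provider:model` form seen in `voyage:voyage-code-3` elsewhere in this PR; `toBareModelName` is a hypothetical helper name, not necessarily what gbrain uses.

```typescript
// Strip an optional "provider:" prefix so comparisons run on the bare name.
// Assumes the provider segment itself contains no colon.
function toBareModelName(modelId: string): string {
  const idx = modelId.indexOf(":");
  return idx === -1 ? modelId : modelId.slice(idx + 1);
}

const qualified: string = "voyage:voyage-code-3";

// Comparing the qualified ID directly against a bare name never matches,
// so a check written this way would treat voyage-code-3 as not code-tuned.
const naiveMatch = qualified === "voyage-code-3";

// Normalizing to the bare name first finds the match.
const normalizedMatch = toBareModelName(qualified) === "voyage-code-3";
```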

Plan + review report: ~/.claude/plans/system-instruction-you-are-working-cozy-muffin.md.

Test Results

bun run verify (13 pre-checks + typecheck): clean.
bun run test (parallel unit loop): 8145 pass / 2 fail.

Both failures are pre-existing master regressions, not from this PR's touch surface:

test/minions.test.ts > MinionWorker self-health-check — passes 169/169 in isolation, shard-leak from another file.
test/doctor-report-remote.test.ts > healthy status — fails with health_score: 50 (expects >= 70). Introduced by master's v0.37.3.0 (#1215, skill_brain_first doctor check), which tanks the score on fresh PGLite test brains. Filed in TODOS.md as P0 for the #1215 maintainer.

My new + modified tests all pass: 23 cases across 4 files, 0 failures.

Plan Completion

All 10 implementation tasks from ~/.claude/plans/system-instruction-you-are-working-cozy-muffin.md shipped:

✅ T1 — EMBEDDING_MODEL constant → getEmbeddingModelName() (5 call sites)
✅ T2 — shouldNudgeCodeModel() helper + integration in runReindexCode
✅ T3 — embedding-providers.md decision tree branch + Voyage paragraph
✅ T4 — Topology 3 "Recommended embedding model" subsection
✅ T5 — setup/SKILL.md one-liner
✅ T6 — Three test files (recipe + nudge + cost-preview regression)
✅ T7 — CLAUDE.md annotations
✅ T8 — VERSION + package.json + CHANGELOG ELI10-lead entry
✅ T9 — bun run build:llms regenerated
✅ T10 — Tests + verify gate green

Adversarial Review

Run during /plan-eng-review via Codex outside-voice (gpt-5.5, model_reasoning_effort=high). 13 items surfaced, 8 absorbed into plan, 5 dismissed as process / accepted-context / minor. The 3 correctness bugs Codex caught in the original nudge helper would have shipped silently broken without the outside voice. No new adversarial pass needed at /ship time — diff is small (541+/12-, well below 200-line Codex structured threshold) and no code has changed since the plan-eng-review pass other than the merge from master.

Version bump

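Workspace-aware queue collision: my branch baseline was at master 9a4ae096 (v0.37.2.0) and originally claimed v0.37.3.0. Between then and /ship, master shipped v0.37.3.0 (#1215), v0.37.4.0 (#1228), and v0.37.5.0 (#1229), so the branch was rebumped to v0.37.6.0 during merge resolution. Master later shipped v0.37.6.0 itself (#1246, OpenRouter), and the branch was rebumped again to v0.37.8.0, the version it merged under.

Three-line audit at the time of the v0.37.6.0 rebump: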

VERSION: 0.37.6.0
package.json: 0.37.6.0
CHANGELOG top: ## [0.37.6.0] - 2026-05-20
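
The three-line audit is simple enough to automate. A minimal sketch, assuming VERSION holds a bare version string, package.json has a top-level `version` field, and the first `## [x.y.z.w]` heading in CHANGELOG.md is the newest entry. `auditVersions` and the `VersionAudit` shape are hypothetical, not part of gbrain.

```typescript
// Result of comparing the three places a version is recorded.
interface VersionAudit {
  version: string;
  packageJson: string;
  changelogTop: string | null;
  inSync: boolean;
}

// Takes the raw text of VERSION, package.json, and CHANGELOG.md.
function auditVersions(
  versionFile: string,
  packageJsonText: string,
  changelogText: string,
): VersionAudit {
  const version = versionFile.trim();
  const packageJson = String(JSON.parse(packageJsonText).version ?? "");
  // First heading of the form "## [0.37.6.0] - 2026-05-20".
  const match = changelogText.match(/^## \[([^\]]+)\]/m);
  const changelogTop = match?.[1] ?? null;
  return {
    version,
    packageJson,
    changelogTop,
    inSync: version === packageJson && version === changelogTop,
  };
}
```

A caller would read the three files from disk and fail the ship step when `inSync` is false, which catches a merge that resolved one of the files to the wrong side.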

TODOS

Filed 1 new P0 follow-up: doctor-report-remote.test.ts:65 regression caused by master's #1215. Owner: maintainer of skill_brain_first doctor check. Not in this PR's touch surface.

Documentation

All docs updated as part of the implementation. /document-release ran as a Step 18 subagent and confirmed: README/AGENTS/CLAUDE/CHANGELOG/topologies/embedding-providers/setup-SKILL/VERSION/llms.txt are all in sync. No additional doc work needed.

Test plan

bun run typecheck clean
bun run verify (13 pre-checks + typecheck) clean
bun test test/ai/voyage-code-3-recipe.test.ts — 4/4 pass
bun test test/reindex-code-nudge.serial.test.ts — 11/11 pass
bun test test/reindex-code-model-source.serial.test.ts — 3/3 pass
bun test test/reindex-code.test.ts — 5/5 pass (updated to configure gateway)
Manual smoke test: gbrain reindex --code --dry-run --yes with text-embedding-3-large configured → nudge fires on stderr, cost preview shows the actually-configured model
Manual smoke test: gbrain reindex --code --dry-run --yes with voyage:voyage-code-3 configured → no nudge (already optimal)
bun run build:llms matches CLAUDE.md content (CI shard 1 won't fail)

🤖 Generated with Claude Code

…view fix
For agents indexing source code with gbrain, the right embedding model is now obvious — and the brain tells you so out loud. Decision tree + Topology 3 doc + Topology 3 setup-skill pointer + runtime stderr nudge from `gbrain reindex --code` against non-code-tuned models. Same diff fixes the stale hardcoded `text-embedding-3-large` in the cost preview that would have made the nudge land badly.
Tests: 3 new files (recipe regression, nudge logic + CLI integration, cost-preview IRON-RULE regression).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Conflicts: # CHANGELOG.md # VERSION # package.json

…follow-up)
Noticed during /ship of v0.37.6.0. The skill_brain_first check added in v0.37.3.0 (#1215) appears to tank the doctor health score on fresh PGLite test brains, causing test/doctor-report-remote.test.ts:65 to fail with health_score: 50 (expects >=70). Pre-existing on master; not in this branch's touch surface.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Resolved version queue collision: master shipped v0.37.6.0 (#1246 OpenRouter) under the same slot this branch had claimed. Rebumped to v0.37.8.0 per user direction. CHANGELOG keeps both entries (v0.37.8.0 voyage-code-3 on top, v0.37.6.0 OpenRouter below); TODOS keeps both follow-up sections; my body text updated to reference v0.37.8.0 instead of v0.37.6.0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Resolved: VERSION + package.json kept at v0.37.8.0 (our claim is higher than master's new v0.37.7.0, no rebump needed). CHANGELOG keeps both entries (v0.37.8.0 voyage-code-3 on top, v0.37.7.0 federated-brains below). TODOS keeps both follow-up sections.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* upstream/master:
  v0.38.2.0 fix(doctor): bounded frontmatter scan + partial-state surfacing (supersedes garrytan#1287) (garrytan#1297)
  v0.38.1.0 feat(agents): provider-agnostic subagent loop + remote MCP dispatch + budget meter (garrytan#1289)
  v0.38.0.0 ingestion cathedral — gbrain capture + write-through + IngestionSource contract (garrytan#1275)
  v0.37.11.0: fresh-install PGLite embedding setup fix wave (garrytan#1286)
  v0.37.10.0 feat(init): env-detection + interactive picker + preflight invariants (garrytan#1278)
  v0.37.9.0 fix(frontmatter): canonical-style normalization for tag arrays (garrytan#1252)
  v0.37.8.0 feat: voyage-code-3 discoverability + reindex-code cost-preview fix (garrytan#1267)
  v0.37.7.0 fix wave: federated brains + autopilot safety + OAuth confidential clients (garrytan#1253)
  v0.37.6.0 feat(ai): OpenRouter recipe + generic default_headers seam (cherry-pick garrytan#1210) (garrytan#1246)
  v0.37.5.0 fix(markdown): YAML-aware NESTED_QUOTES validator (stops flagging valid YAML) (garrytan#1229)
  feat: pgGraph-inspired CI scaffolding wave (v0.37.4.0) (garrytan#1228)
  v0.37.3.0 feat: skill_brain_first doctor check + auto-fix + declarative opt-out (supersedes garrytan#1206) (garrytan#1215)
  v0.37.2.0: takes_resolution_consistency CHECK accepts 'unresolvable' (garrytan#1211)
  v0.37.1.0 feat: brainstorm + lsd — bisociation idea generator grounded in your own brain (garrytan#1214)
  v0.37.0.0 feat(skillpack): registry cathedral — third-party publish + install + 10/10 quality bar (garrytan#1208)
  v0.36.6.0 feat: cross-modal search wave (text↔image + unified column + LLM intent) (garrytan#1165)

garrytan and others added 4 commits May 21, 2026 08:08

Merge remote-tracking branch 'origin/master' into garrytan/kolkata-v3

9e24f4e

# Conflicts:
#   CHANGELOG.md
#   VERSION
#   package.json

garrytan changed the title ~~v0.37.6.0 feat: voyage-code-3 discoverability + reindex-code cost-preview fix~~ v0.37.8.0 feat: voyage-code-3 discoverability + reindex-code cost-preview fix May 21, 2026

garrytan merged commit 54a0629 into master May 21, 2026
8 checks passed

garrytan merged 5 commits into master from garrytan/kolkata-v3
