v0.40.8.0 fix: local embeddings as a first-class provider by kohai-ut · Pull Request #1329 · garrytan/gbrain

kohai-ut · 2026-05-23T20:40:56Z

Summary

Makes local embedding providers (Ollama, llama-server) behave exactly like hosted ones across every place gbrain inspects your setup. Three independent gaps, one root cause (a hosted-only key check copied into two producers):

doctor/autopilot remediation planner (fix: recipe-aware embedding-provider check): gbrain doctor --remediation-plan reported a blocked "missing embedding API key" for a brain on ollama: / llama-server:, contradicting gbrain doctor --json's 100%-coverage health. A new shared embeddingProviderConfigured() helper (recipe-aware: empty auth_env.required ⇒ no key needed; hosted ⇒ checks its OWN required key) replaces the hosted-only prefix ladder in doctor.ts and the parallel copy in autopilot.ts. The RecommendationContext.hasEmbeddingApiKey field is renamed embeddingProviderConfigured and the blocker reason broadened to "embedding provider not configured".
budget tracker (fix(budget): price local embed providers at $0): a --max-cost-bounded embed/reindex job for a local provider TX2 hard-failed with no_pricing. New FREE_LOCAL_EMBED_PROVIDERS = {ollama, llama-server} (sibling to the existing FREE_LOCAL_RERANK_PROVIDERS) returns $0 on a lookupEmbeddingPrice miss. lmstudio (no recipe) and litellm (can proxy paid) intentionally excluded.
models doctor (feat(models): embedding reachability probe): a down/misconfigured local embed server was invisible until first embed. New probeEmbeddingReachability() mirrors the reranker probe — a 1-input embed with a 5s abort timeout, new 'embedding_reachability' touchpoint, gated on the zero-network config probe returning ok first.

Intended behavior change (hosted providers)

The remediation planner now judges each hosted provider by its OWN required key. Pre-fix, every non-OpenAI/non-ZE provider fell back to "any OpenAI/ZE key present", so a Voyage brain looked configured if an unrelated OpenAI key existed. Now a Voyage brain is judged by VOYAGE_API_KEY. Strictly more correct, but a behavior change for Voyage/Google brains that relied on the old fallback. Hosted brains with no key still block, as before.
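The recipe-aware rule described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `RECIPES` table, the `Env` type, the example model names, and the `provider:model` parsing are assumptions made for the example. Only the helper name, the `auth_env.required` field, and the `VOYAGE_API_KEY` variable come from the PR text.

```typescript
// Hypothetical recipe shape: only the field named in the PR text is modeled.
interface Recipe {
  auth_env: { required: string[] };
}

// Illustrative recipe table (assumption); gbrain's real recipes may differ.
const RECIPES = new Map<string, Recipe>([
  ["ollama", { auth_env: { required: [] } }],
  ["llama-server", { auth_env: { required: [] } }],
  ["openai", { auth_env: { required: ["OPENAI_API_KEY"] } }],
  ["voyage", { auth_env: { required: ["VOYAGE_API_KEY"] } }],
]);

type Env = Record<string, string | undefined>;

function embeddingProviderConfigured(model: string | undefined, env: Env): boolean {
  if (!model) return false; // empty or undefined model id
  const sep = model.indexOf(":");
  if (sep <= 0) return false; // malformed id: no "provider:" prefix
  const recipe = RECIPES.get(model.slice(0, sep));
  if (!recipe) return false; // unknown provider
  // An empty `required` list (local providers) passes vacuously;
  // hosted providers must have every one of their OWN keys set.
  return recipe.auth_env.required.every((name) => Boolean(env[name]));
}
```

Under these assumptions, `embeddingProviderConfigured("voyage:example-model", { OPENAI_API_KEY: "sk-x" })` returns `false`, which is the behavior change called out above: an unrelated OpenAI key no longer makes a Voyage brain look configured.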

Test Coverage

test/brain-score-recommendations.test.ts: 6 new helper cases (empty/undefined → false, local → true regardless of keys, hosted iff key resolves, the Voyage behavior change, unknown provider → false, malformed model id → false) + renamed consumer assertions.
test/core/budget/budget-tracker.test.ts: local embed → $0 (no TX2); regression unknown-hosted embed still TX2 hard-fails; regression known-hosted (openai) still trips a real cost gate.
test/models-doctor-embed.test.ts (new): pins the three reachability-probe invariants (uses embed not embedQuery, distinct touchpoint member, gated on config-probe-ok).
test/v0_37_gap_fill.serial.test.ts: source-grep updated to the new helper names.

64/64 tests pass across the 4 affected files. bun run verify and bun run check:all green. (The full parallel PGLite suite is intentionally left to CI — it OOMs the 16GB dev box; targeted-file runs + verify + check:all are the local gate.)

Pre-Landing Review

Clean — 0 findings. No SQL/LLM trust-boundary surface; the only added runtime cost is 4 sequential engine.getConfig awaits per autopilot tick (negligible). No frontend files (design review N/A), no prompt files (evals N/A).

Plan Completion

All planned items shipped (gaps A/B/C) plus the codex outside-voice refinements from the plan-eng-review (recipe-aware helper ≠ isAvailable, parseModelId try/catch, stated hosted behavior change + tests, autopilot model resolution, HOSTED_EMBED_KEY_CONFIG sync-closure map, embed not embedQuery, distinct touchpoint member, gating on config-probe-ok).

TODOS

Added a follow-up to unify FREE_LOCAL_EMBED_PROVIDERS + FREE_LOCAL_RERANK_PROVIDERS into one FREE_LOCAL_PROVIDERS keyed by kind, and to evaluate recipe-cost-driven resolution, once the rerank (#1326) and embed sides both land.

Test plan

bun run verify (typecheck + prechecks)
bun run check:all
Targeted: bun test test/brain-score-recommendations.test.ts test/core/budget/budget-tracker.test.ts test/models-doctor-embed.test.ts test/v0_37_gap_fill.serial.test.ts → 64 pass
Full CI suite (runs on PR)

🤖 Generated with Claude Code

doctor --remediation-plan and autopilot both judged the embedding provider with a hosted-only key check, so a brain on ollama: or llama-server: was reported "blocked" on a missing API key it never needed, contradicting doctor --json's 100%-coverage health. Extract a shared embeddingProviderConfigured() helper into brain-score-recommendations.ts: empty auth_env.required (local providers) is configured with no key; hosted providers check their OWN required key. Both producers (doctor, autopilot) call it, killing the DRY violation that caused the bug. Hosted brains with a missing key still block.

A --max-cost-bounded embed/reindex job configured for ollama: or llama-server: TX2 hard-failed with no_pricing because lookupEmbeddingPrice has no entry for local models. Add FREE_LOCAL_EMBED_PROVIDERS (sibling to FREE_LOCAL_RERANK_PROVIDERS) so a pricing miss on a local-inference provider returns $0 instead of null. lmstudio/litellm intentionally excluded.
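A rough sketch of the pricing fallback, assuming a `provider:model` id format. The price table, its placeholder value, and the wrapper name `embeddingPriceOrFree` are hypothetical. `FREE_LOCAL_EMBED_PROVIDERS` and `lookupEmbeddingPrice` are the names used in the commit message, but their bodies here are stand-ins.

```typescript
// Providers that run inference locally and are priced at $0 on a lookup miss.
const FREE_LOCAL_EMBED_PROVIDERS = new Set<string>(["ollama", "llama-server"]);

// Hypothetical price table; the entry and its value are placeholders.
const PRICES = new Map<string, number>([["openai:example-model", 0.5]]);

// Stand-in for the real lookup: returns null when no price is known.
function lookupEmbeddingPrice(model: string): number | null {
  return PRICES.get(model) ?? null;
}

// Hypothetical wrapper showing where the $0 fallback would sit.
function embeddingPriceOrFree(model: string): number | null {
  const price = lookupEmbeddingPrice(model);
  if (price !== null) return price; // known hosted model keeps its real price
  const sep = model.indexOf(":");
  const provider = sep > 0 ? model.slice(0, sep) : "";
  // Only allow-listed local providers are free; anything else stays
  // unpriced, so a cost-bounded job still fails closed.
  return FREE_LOCAL_EMBED_PROVIDERS.has(provider) ? 0 : null;
}
```

Keeping `lmstudio` and `litellm` off the allow-list means a miss for either still returns `null`, so a proxy that may forward to a paid backend is never silently treated as free.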

A down/misconfigured local embed server was invisible until first embed. Add probeEmbeddingReachability() (mirrors the reranker probe): a 1-input embed with a 5s abort timeout, classified via classifyError, under a new 'embedding_reachability' touchpoint, gated on the zero-network config probe returning ok first.
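The probe's shape might look something like the sketch below. The `EmbedFn` and `ProbeResult` types, the injected `embed` parameter, and returning `null` when the gate is closed are assumptions for illustration. The real probe classifies failures via `classifyError`, which is not reproduced here; the sketch just records the error message.

```typescript
// Hypothetical embed signature: takes inputs plus an AbortSignal to honor.
type EmbedFn = (inputs: string[], signal: AbortSignal) => Promise<number[][]>;

interface ProbeResult {
  touchpoint: "embedding_reachability";
  ok: boolean;
  error?: string;
}

async function probeEmbeddingReachability(
  embed: EmbedFn,
  configProbeOk: boolean,
  timeoutMs = 5000,
): Promise<ProbeResult | null> {
  // Gate: skip the network call unless the zero-network config probe passed.
  if (!configProbeOk) return null;

  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // A single-input embed is enough to prove the server answers.
    await embed(["ping"], controller.signal);
    return { touchpoint: "embedding_reachability", ok: true };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { touchpoint: "embedding_reachability", ok: false, error: message };
  } finally {
    clearTimeout(timer); // avoid a dangling timer after a fast response
  }
}
```

The timeout only takes effect if the embed implementation honors the `AbortSignal` it is given; an implementation that ignores it would need a `Promise.race` against the timer instead.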

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

codex review caught a false positive: HOSTED_EMBED_KEY_CONFIG mapped VOYAGE_API_KEY/GOOGLE_GENERATIVE_AI_API_KEY to config fields, but buildGatewayConfig only threads openai/anthropic/zeroentropy config keys into the gateway env. A Voyage/Google brain with the key only in config.json would be judged "configured" and dispatch an embed.stale job that then fails auth at the gateway. Drop those two from the map so the producer closures resolve them by env var only, matching what the gateway can actually use. Pinned by a regression test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
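To illustrate the fix in the commit message above: a key counts as resolved from config.json only when the gateway would also receive it from there. In this sketch the config field names, the `ZEROENTROPY_API_KEY` variable name, and the `keyResolves` helper are all hypothetical. Only the map name and the rule that Voyage and Google keys resolve by env var alone come from the commit message.

```typescript
type Env = Record<string, string | undefined>;
type Config = Record<string, string | undefined>;

// Env var -> config.json field, listed ONLY for keys the gateway threads
// from config. Field names (and the ZeroEntropy env var name) are assumed.
const HOSTED_EMBED_KEY_CONFIG = new Map<string, string>([
  ["OPENAI_API_KEY", "openai_api_key"],
  ["ZEROENTROPY_API_KEY", "zeroentropy_api_key"],
]);

// Hypothetical helper: does this required key resolve to a usable value?
function keyResolves(envVar: string, env: Env, config: Config): boolean {
  if (env[envVar]) return true; // the environment always counts
  const field = HOSTED_EMBED_KEY_CONFIG.get(envVar);
  // No mapping (e.g. VOYAGE_API_KEY) means config.json is ignored, because
  // the gateway would never see a key stored there.
  return field !== undefined && Boolean(config[field]);
}
```

With this shape, a Voyage key present only in config.json no longer marks the brain as configured, so no embed.stale job is dispatched that would fail auth at the gateway.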

garrytan · 2026-06-08T03:01:56Z

Thanks for this contribution — and apologies for the slow triage. We did a full pass over the entire PR backlog. gbrain has moved fast, and the maintainer's larger "cathedral" rewrites have superseded a big share of community PRs: the AI gateway + recipes + user_provided_models system replaced almost all individual provider PRs; #1805 fixed the whole Postgres module-singleton class; #1542 unified the type taxonomy; #1657 the retrieval path; #1802 the doctor; and so on.

We're closing this one in that cleanup — either the fix already landed on master, it duplicates another PR or merged change, or it's outside the current merge bar. Where a closed PR carried a genuinely valuable idea, we've recorded it in docs/designs/COMMUNITY_IDEAS.md so nothing good is lost (a few may graduate into TODOs).

Please don't read the close as a judgment of the work — thank you for contributing. If you believe the underlying issue is still live on the latest master, reopen with a quick note and we'll take another look. 🙏

kohai-ut and others added 5 commits May 23, 2026 14:39

chore: bump version and changelog (v0.40.8.0)

3dd1260

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

This was referenced May 24, 2026

v0.41.4.0 wave: local providers + cross-platform stdin + gateway-routed dream judge (6 community PRs) #1377

Merged

fix(cli): use fd 0 instead of '/dev/stdin' for cross-platform stdin reads #1325

Closed

garrytan closed this Jun 8, 2026

kohai-ut wants to merge 5 commits into garrytan:master from kohai-ut:feat/local-embeddings
