v0.33.1.1 fix: Voyage output_dimension + flexible-dim guard + OOM-cap rethrow #962
Merged
Voyage's hosted /embeddings endpoint accepts `output_dimension` only for
the seven flexible-dim models (voyage-4-large, voyage-4, voyage-4-lite,
voyage-3-large, voyage-3.5, voyage-3.5-lite, voyage-code-3). voyage-4-nano
is an open-weight variant Voyage lists separately as fixed 1024-dim — the
hosted API rejects the parameter for it.
The recipe docstring previously claimed "all v4 variants" have flexible
dims, which is what led to nano being added to the allowlist in the first
place. Tighten the comment to name the hosted trio explicitly and call out
nano-as-open-weight.
Convert the test case at test/ai/gateway.test.ts from a positive assertion
(voyage-4-nano returns { dimensions: 512 }) to a negative regression pin
(voyage-4-nano returns undefined), so a future contributor can't silently
re-add nano without breaking this test.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
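The allowlist change described in this commit can be pictured with a minimal sketch. It assumes the allowlist is a plain set of model names and that the resolver returns `undefined` for anything outside it; `voyageDimsOptions` is a hypothetical helper name, and only `VOYAGE_OUTPUT_DIMENSION_MODELS` and the return shape are taken from this PR.

```typescript
// Sketch only: the real resolver lives in src/core/ai/dims.ts and may differ.
// The seven hosted flexible-dim models named in the commit message.
const VOYAGE_OUTPUT_DIMENSION_MODELS: ReadonlySet<string> = new Set([
  "voyage-4-large",
  "voyage-4",
  "voyage-4-lite",
  "voyage-3-large",
  "voyage-3.5",
  "voyage-3.5-lite",
  "voyage-code-3",
]);

type DimsOptions = { openaiCompatible: { dimensions: number } } | undefined;

// Hypothetical helper: returns provider options only for allowlisted models.
// voyage-4-nano is deliberately absent from the set, so it yields undefined.
function voyageDimsOptions(model: string, dimensions: number): DimsOptions {
  if (!VOYAGE_OUTPUT_DIMENSION_MODELS.has(model)) return undefined;
  return { openaiCompatible: { dimensions } };
}
```

Under these assumptions, `voyageDimsOptions("voyage-4-nano", 512)` returns `undefined`, which is the behavior the negative regression pin locks in.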
…P3 follow-ups)

Two follow-ups from Codex's adversarial review of PR #962, both Voyage-adjacent correctness fixes that the original PR scope had filed as TODOs.

1. gateway.ts:619 Voyage OOM cap was theatrical
-------------------------------------------------

voyageCompatFetch's inbound response rewriter is wrapped in a try/catch that falls back to the original response on parse failure. That is correct for "Voyage returned JSON I can't reshape, let the SDK handle it." But the per-embedding Layer 2 OOM cap at line 619 threw a bare `new Error(...)`, which the same catch silently swallowed. Net result: an oversized base64 response (Layer 1 skipped because no Content-Length header) returned through to the AI SDK and could OOM the worker on JSON.parse.

Fix: introduce `VoyageResponseTooLargeError`, throw it at both cap sites (Content-Length Layer 1 at line 595 and per-embedding Layer 2 at line 619), and rethrow it from the inbound try/catch via `if (err instanceof VoyageResponseTooLargeError) throw err`. Pre-existing fall-back-on-parse-error behavior for other thrown errors is preserved.

Regression-pinned by 2 new behavioral tests (mock fetch returns oversized Content-Length / oversized base64; embed() throws with the expected message) and a structural assertion in test/voyage-response-cap.test.ts that the `instanceof VoyageResponseTooLargeError ⇒ throw` line stays put.

2. Voyage flexible-dim runtime validation + doctor check
-------------------------------------------------------

A brain configured for a Voyage flexible-dim model (voyage-4-large, voyage-3-large, voyage-3.5, voyage-3.5-lite, voyage-4, voyage-4-lite, voyage-code-3) without an explicit `embedding_dimensions` would fall back to DEFAULT_EMBEDDING_DIMENSIONS=1536, an OpenAI default that Voyage rejects. Voyage's only accepted values are {256, 512, 1024, 2048}. Before the fix, the failure surfaced as an HTTP 400 from Voyage that often got misclassified as a transient network error.

Fix:
- `dims.ts` exports `VOYAGE_VALID_OUTPUT_DIMS` and `isValidVoyageOutputDim`.
- `dimsProviderOptions` throws `AIConfigError` with a paste-ready fix command (`gbrain config set embedding_dimensions ...`) when a Voyage flexible-dim model is configured with an invalid dim value.
- `gbrain models doctor` gets a new `embedding_config` probe that runs first (zero tokens) and surfaces the misconfiguration before any chat/expansion probes spend a single token. New probe status `config` + optional `fix` hint rendered in human output.

Regression-pinned by 6 new unit tests covering the AIConfigError throw, exact valid-values set, the bypass path for fixed-dim Voyage models, and the fix-hint contents.
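The runtime validation in item 2 might look roughly like the sketch below. The exported names `VOYAGE_VALID_OUTPUT_DIMS`, `isValidVoyageOutputDim`, and `AIConfigError` come from this commit message; their bodies, the `assertVoyageOutputDim` helper, and the `<dim>` placeholder in the fix hint are assumptions, since the exact command text is elided in the source.

```typescript
// Sketch only: stand-ins for the exports described above, not the real dims.ts.
const VOYAGE_VALID_OUTPUT_DIMS = [256, 512, 1024, 2048] as const;

function isValidVoyageOutputDim(dim: number): boolean {
  return (VOYAGE_VALID_OUTPUT_DIMS as readonly number[]).includes(dim);
}

// Local stand-in for the project's AIConfigError; the `fix` field is assumed.
class AIConfigError extends Error {
  readonly fix: string;
  constructor(message: string, fix: string) {
    super(message);
    this.name = "AIConfigError";
    this.fix = fix;
    // Keeps instanceof working if the code is compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical helper: call only for models already known to be flexible-dim.
function assertVoyageOutputDim(model: string, dim: number): void {
  if (isValidVoyageOutputDim(dim)) return;
  const valid = VOYAGE_VALID_OUTPUT_DIMS.join(", ");
  throw new AIConfigError(
    `${model} rejects embedding_dimensions=${dim}; valid values: ${valid}`,
    // Placeholder argument: the source shows this command with "...".
    "gbrain config set embedding_dimensions <dim>",
  );
}
```

With this shape, the unset-config case (which resolves to 1536) fails locally with an actionable message instead of reaching Voyage and coming back as an HTTP 400.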
…-dimension-wire-fix
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
brandonlipman added a commit to brandonlipman/gbrain that referenced this pull request on May 29, 2026
* upstream/master:
  v0.35.1.0: embedder shootout prereqs (pricing + gateway export + --resume-from) (garrytan#1055)
  v0.35.0.0 feat: ZeroEntropy zembed-1 + zerank-2 reranker (garrytan#1008)
  v0.34.4.0 fix(embed): cursor-paginated --stale hardening wave (D2/D3/D4/D6/D7/D8 + regression test) (garrytan#991)
  v0.34.3.0 fix: supervisor treats code=0 watchdog exits as crashes (garrytan#1003)
  v0.34.2.0 fix(import): path-based checkpoint resume — kills parallel-drop + failed-file-skip + sort-flip bugs (garrytan#988)
  v0.34.1.0 fix(mcp): MCP fix wave — source-isolation P0 + PKCE DCR + federated_read + 3 more (garrytan#996)
  v0.34.0.0 feat: Cathedral III — recursive code intelligence + Leiden clusters + eval gate (garrytan#994)
  v0.33.3.0 feat(v0.33.3): code intelligence MCP foundation (v0.34 W0a-c + W3) (garrytan#934)
  v0.33.2.1 docs: fork-PR workflow for garrytan-agents (garrytan#992)
  fix(sync): raise maxBuffer to 100 MiB to prevent silent ENOBUFS crash (garrytan#982)
  v0.33.2.0 feat(search-lite): token budget + semantic query cache + intent weighting (garrytan#897)
  v0.33.1.1 fix: Voyage output_dimension + flexible-dim guard + OOM-cap rethrow (garrytan#962)
Re-opens #866 by @100yenadmin from electricsheephq/eva-brain on upstream so the voyage-4-nano fixup and two Codex-caught follow-ups can land in the same PR (the GitHub `maintainerCanModify` flag doesn't grant push access across forks for upstream pushes; both HTTPS and SSH return 403).

Summary

Voyage core fix (Eva's commit ee90709e, preserved verbatim)

Fixes Voyage 2048-dim embeddings. `src/core/ai/dims.ts` now returns `{ openaiCompatible: { dimensions: N } }` (SDK-supported) instead of `output_dimension` (the Voyage wire key, which the AI SDK silently drops). The existing `voyageCompatFetch` at `src/core/ai/gateway.ts:541` translates `dimensions → output_dimension` before the HTTP body is sent. A new wire-level test stubs `globalThis.fetch` and asserts that the actual outbound body contains `output_dimension: 2048` + `encoding_format: base64`.

voyage-4-nano fixup (commit 8b02e238)

Voyage's `voyage-4-nano` is an open-weight variant listed separately as fixed 1024-dim; it doesn't accept `output_dimension` on the hosted API. Caught by adversarial Codex review during PR evaluation, confirmed against Voyage's docs.

- Removes `voyage-4-nano` from `VOYAGE_OUTPUT_DIMENSION_MODELS`.
- Converts the test case to a negative regression pin (voyage-4-nano returns undefined).

Codex P3 follow-ups (commit d5dc5fad)

- `gateway.ts:619` OOM-cap was theatrical. The per-embedding Layer 2 base64 cap was throwing a generic `Error` that the surrounding `catch {}` silently swallowed, then returning the original (oversized) response into the AI SDK's JSON parser. Now wrapped in a tagged `VoyageResponseTooLargeError` and rethrown via an `instanceof` check. Parse-error fall-back preserved for non-OOM errors. Pinned by 2 behavioral tests + a structural assertion.
- Voyage flexible-dim runtime validation. Brains misconfigured for Voyage flexible-dim models with `embedding_dimensions` outside `{256, 512, 1024, 2048}` (most common: leaving it unset → default 1536) now throw `AIConfigError` with a paste-ready fix at the embed boundary instead of waiting for Voyage's opaque HTTP 400.
- `gbrain models doctor` embedding_config probe. New zero-token check runs first, before any chat/expansion probes, surfacing the misconfiguration via `gbrain doctor` at config-test time. New `config` status + `fix` hint rendered in both human and JSON output.

Ship bookkeeping (commit 82ab5507)

VERSION 0.33.1.0 → 0.33.1.1, CHANGELOG entry.
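The OOM-cap rethrow from the Codex P3 follow-ups can be illustrated with a small sketch. It assumes the rewriter works on a JSON string with a `data[].embedding` base64 field and a character cap passed in as a parameter; `rewriteBody` and `rewriteOrFallBack` are hypothetical names, and only `VoyageResponseTooLargeError` and the `instanceof` rethrow line are taken from this PR.

```typescript
// Sketch only: not the code in src/core/ai/gateway.ts.
class VoyageResponseTooLargeError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "VoyageResponseTooLargeError";
    // Keeps instanceof working if the code is compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical rewriter: enforces a per-embedding cap, then re-serializes.
function rewriteBody(raw: string, maxBase64Chars: number): string {
  const parsed = JSON.parse(raw) as { data: { embedding: string }[] };
  for (const item of parsed.data) {
    if (item.embedding.length > maxBase64Chars) {
      throw new VoyageResponseTooLargeError(
        `embedding of ${item.embedding.length} base64 chars exceeds cap ${maxBase64Chars}`,
      );
    }
  }
  return JSON.stringify(parsed);
}

// The catch keeps the fall-back for parse/shape errors but lets the
// tagged cap error escape instead of swallowing it.
function rewriteOrFallBack(raw: string, maxBase64Chars: number): string {
  try {
    return rewriteBody(raw, maxBase64Chars);
  } catch (err) {
    if (err instanceof VoyageResponseTooLargeError) throw err;
    return raw; // unparseable or unexpected shape: hand the original back
  }
}
```

The design point is that a single catch serves two purposes, so the cap error needs its own type to be distinguishable from an ordinary parse failure.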
Test Coverage

Pre-Landing Review

Adversarial Codex review ran twice during PR evaluation:

- `voyage-4-nano` membership P2 (resolved in 8b02e238).
- `gateway.ts:619` OOM-cap P3 and the `embedding_dimensions=undefined → 1536` edge case (both resolved in d5dc5fad).

No critical findings remain. `bun run verify` (all 9 checks + `tsc --noEmit`) clean.

Plan Completion

All planned items DONE. Plan file: `~/.claude/plans/this-is-an-outside-jolly-nebula.md`.

Documentation

- `src/core/ai/dims.ts` (the Voyage flexible-dim resolver + `VOYAGE_VALID_OUTPUT_DIMS` allowlist + runtime validator). v0.33.1.1 waves appended to `src/core/ai/gateway.ts` (`VoyageResponseTooLargeError` rethrow), `src/core/ai/recipes/voyage.ts` (tightened docstring naming the seven hosted flexible-dim models, nano explicitly excluded), and `src/commands/models.ts` (new zero-token `embedding_config` doctor probe + `'config'` probe status + `fix?` field).
- New `Key commands added in v0.33.1.1 (Voyage 2048-dim correctness wave)` section documenting the four user-facing changes.
- `gbrain models doctor` help-text line extended to mention the new embedding_config probe (catches Voyage flexible-dim misconfigs before first embed).
- `bun run build:llms` (`test/build-llms.test.ts` gates this on every CLAUDE.md edit).

Why this PR replaces #866

PR #866 was opened on Eva's fork. `maintainerCanModify` is set but doesn't grant cross-fork push from this account (both HTTPS and SSH return 403). Landing Eva's correct core fix + 3 Codex-caught follow-ups atomically required cherry-picking her commit verbatim onto upstream and stacking the follow-ups on top. Her authorship is preserved on commit ee90709e.

#866 closes with attribution. Thanks @100yenadmin.
Test plan

- `bun test test/ai/gateway.test.ts`: 34/34 pass
- `bun run test` (parallel fast loop, 4 shards × 2 concurrency): 6039 pass / 0 fail (post-master-merge)
- `bun run verify`: 9 checks + `tsc --noEmit` clean

🤖 Generated with Claude Code
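For readers who want to see what the wire-level check in the gateway tests is guarding, here is a minimal sketch of the `dimensions → output_dimension` body translation. It assumes the outbound body is a JSON string; `translateVoyageRequestBody` is a hypothetical name, and the real `voyageCompatFetch` may do more than this single key rename.

```typescript
// Sketch only: a pure function standing in for the outbound half of the
// fetch wrapper, so the rename can be checked without a network stub.
function translateVoyageRequestBody(body: string): string {
  const parsed: unknown = JSON.parse(body);
  if (
    typeof parsed !== "object" ||
    parsed === null ||
    Array.isArray(parsed) ||
    !("dimensions" in parsed)
  ) {
    return body; // nothing to translate
  }
  // Drop the SDK-side key and re-emit it under Voyage's wire name.
  const { dimensions, ...rest } = parsed as Record<string, unknown>;
  return JSON.stringify({ ...rest, output_dimension: dimensions });
}

// Example request body as the AI SDK might serialize it (values illustrative).
const outbound = translateVoyageRequestBody(
  JSON.stringify({
    model: "voyage-4-large",
    input: ["hello"],
    dimensions: 2048,
    encoding_format: "base64",
  }),
);
```

After translation, `outbound` carries `output_dimension: 2048` alongside the untouched `encoding_format`, and the `dimensions` key is gone.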