v0.33.1.1 fix: Voyage output_dimension + flexible-dim guard + OOM-cap rethrow #962

Merged
garrytan merged 6 commits into master from codex/voyage-output-dimension-wire-fix
May 13, 2026

@garrytan (Owner) commented May 13, 2026

Re-opens #866 by @100yenadmin from electricsheephq/eva-brain on upstream so the voyage-4-nano fixup and two Codex-caught follow-ups can land in the same PR (the GitHub maintainerCanModify flag doesn't grant push access across forks for upstream pushes — both HTTPS and SSH return 403).

Summary

Voyage core fix (Eva's commit ee90709e, preserved verbatim)

Fixes Voyage 2048-dim embeddings. src/core/ai/dims.ts now returns { openaiCompatible: { dimensions: N } } (SDK-supported) instead of output_dimension (Voyage wire key the AI SDK silently drops). Existing voyageCompatFetch at src/core/ai/gateway.ts:541 translates dimensions → output_dimension before the HTTP body is sent. New wire-level test stubs globalThis.fetch and asserts the actual outbound body contains output_dimension: 2048 + encoding_format: base64.

flowchart LR
  A["embedding_dimensions=2048"] --> B["dimsProviderOptions"]
  B --> C["AI SDK openai-compatible"]
  C --> D["voyageCompatFetch"]
  D --> E["Voyage /embeddings"]
  B -->|"SDK key"| C
  C -->|"dimensions"| D
  D -->|"output_dimension"| E
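
The translation step in the flow above can be sketched as a small fetch wrapper. This is a minimal illustration, not the actual `voyageCompatFetch` in src/core/ai/gateway.ts: the names `rewriteVoyageBody`, `withVoyageCompat`, and `FetchLike` are hypothetical, and the real function also handles the inbound response, which is omitted here.

```typescript
// Hypothetical sketch; not the real voyageCompatFetch implementation.
type FetchLike = (
  url: string,
  init?: { method?: string; body?: string },
) => Promise<unknown>;

// Rewrites the SDK-facing `dimensions` key to Voyage's `output_dimension` wire key.
function rewriteVoyageBody(body: string): string {
  const parsed = JSON.parse(body) as Record<string, unknown>;
  if ("dimensions" in parsed) {
    parsed.output_dimension = parsed.dimensions;
    delete parsed.dimensions;
  }
  return JSON.stringify(parsed);
}

// Wraps a fetch-like function so each outbound JSON body is translated first.
function withVoyageCompat(inner: FetchLike): FetchLike {
  return (url, init) => {
    if (init?.body === undefined) return inner(url, init);
    return inner(url, { ...init, body: rewriteVoyageBody(init.body) });
  };
}
```

A wire-level test in the same spirit would pass a stub as `inner`, record the body it receives, and assert on `output_dimension` rather than on the options object handed to the SDK.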

voyage-4-nano fixup (commit 8b02e238)

Voyage's voyage-4-nano is an open-weight variant listed separately as fixed 1024-dim — it doesn't accept output_dimension on the hosted API. Caught by adversarial Codex review during PR evaluation, confirmed against Voyage's docs.

  • Drops voyage-4-nano from VOYAGE_OUTPUT_DIMENSION_MODELS.
  • Tightens the voyage recipe docstring so the "all v4 variants are flexible" claim doesn't lead a future contributor to re-add it.
  • Flips the test case to a negative regression pin (voyage-4-nano returns undefined).
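
As a rough sketch of what the allowlist change means in practice: the model names below are the seven hosted flexible-dim models named in this PR, but the function `dimsOptionsFor` and its shape are assumptions for illustration, not the real `dimsProviderOptions` signature.

```typescript
// Hypothetical sketch; the set contents come from the PR text, the rest is assumed.
const VOYAGE_OUTPUT_DIMENSION_MODELS: ReadonlySet<string> = new Set([
  "voyage-4-large",
  "voyage-4",
  "voyage-4-lite",
  "voyage-3-large",
  "voyage-3.5",
  "voyage-3.5-lite",
  "voyage-code-3",
  // voyage-4-nano is deliberately absent: it is listed as fixed 1024-dim.
]);

type DimsOptions = { openaiCompatible: { dimensions: number } } | undefined;

// Returns SDK provider options only for models that accept a custom dimension.
function dimsOptionsFor(modelId: string, dims: number): DimsOptions {
  if (!VOYAGE_OUTPUT_DIMENSION_MODELS.has(modelId)) return undefined;
  return { openaiCompatible: { dimensions: dims } };
}
```

The negative regression pin then amounts to asserting that `dimsOptionsFor("voyage-4-nano", 512)` is `undefined`, so re-adding nano to the set fails the test.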

Codex P3 follow-ups (commit d5dc5fad)

  1. gateway.ts:619 OOM-cap was theatrical. The per-embedding Layer 2 base64 cap was throwing a generic Error that the surrounding catch {} silently swallowed, then returning the original (oversized) response into the AI SDK's JSON parser. Now wrapped in tagged VoyageResponseTooLargeError and rethrown via instanceof check. Parse-error fall-back preserved for non-OOM errors. Pinned by 2 behavioral tests + a structural assertion.

  2. Voyage flexible-dim runtime validation. Brains misconfigured for Voyage flexible-dim models with embedding_dimensions outside {256, 512, 1024, 2048} (most common: leaving it unset → default 1536) now throw AIConfigError with a paste-ready fix at the embed boundary instead of waiting for Voyage's opaque HTTP 400.

  3. gbrain models doctor embedding_config probe. New zero-token check runs first, before any chat/expansion probes, surfacing the misconfiguration via gbrain doctor at config-test time. New config status + fix hint rendered in both human and JSON output.
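
The rethrow pattern from item 1 can be sketched as follows. This is an illustration under assumptions: `reshapeResponse`, the `MAX_BASE64_CHARS` value, and the response shape are hypothetical; only the `VoyageResponseTooLargeError` name and the `instanceof` rethrow come from this PR.

```typescript
// Tagged error so the catch block can tell "too large" apart from "could not parse".
class VoyageResponseTooLargeError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "VoyageResponseTooLargeError";
  }
}

// Hypothetical cap, kept tiny so the behavior is easy to exercise.
const MAX_BASE64_CHARS = 16;

// Hypothetical stand-in for the inbound response rewriter.
function reshapeResponse(raw: string): string {
  try {
    const parsed = JSON.parse(raw) as { data: { embedding: string }[] };
    for (const item of parsed.data) {
      if (item.embedding.length > MAX_BASE64_CHARS) {
        throw new VoyageResponseTooLargeError(
          `embedding of ${item.embedding.length} chars exceeds cap of ${MAX_BASE64_CHARS}`,
        );
      }
    }
    return JSON.stringify(parsed);
  } catch (err) {
    // The cap must escape the fallback; everything else keeps the old behavior.
    if (err instanceof VoyageResponseTooLargeError) throw err;
    return raw;
  }
}
```

Without the `instanceof` line, the oversized case would take the same path as a parse failure and return `raw`, which is the swallowed-cap behavior described above.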

Ship bookkeeping (commit 82ab5507)

VERSION 0.33.1.0 → 0.33.1.1, CHANGELOG entry.

Test Coverage

src/core/ai/dims.ts :: dimsProviderOptions(impl='openai-compatible', modelId, dims)
  ├── Voyage flexible-dim model
  │     ├── [★★★ TESTED] dims valid → { dimensions }       — gateway.test.ts:161-167
  │     ├── [★★★ TESTED] dims invalid → AIConfigError       — gateway.test.ts (new)
  │     └── wire effect: dimensions → output_dimension
  │           └── [★★★ TESTED] body.output_dimension === 2048 — gateway.test.ts:178-218
  ├── voyage-4-nano (open-weight, fixed-dim)
  │     └── [★★★ TESTED] returns undefined                  — gateway.test.ts (regression pin)
  └── voyage-3-lite (non-flexible, bypass validator)
        └── [★★★ TESTED] returns undefined                  — gateway.test.ts (new)

src/core/ai/gateway.ts :: voyageCompatFetch (inbound response rewriter)
  ├── Layer 1 Content-Length over cap
  │     └── [★★★ TESTED] VoyageResponseTooLargeError propagates — gateway.test.ts (new)
  └── Layer 2 oversized base64 (chunked encoding)
        └── [★★★ TESTED] VoyageResponseTooLargeError propagates — gateway.test.ts (new)

src/commands/models.ts :: probeEmbeddingConfig (new)
  └── [INFORMATIONAL] zero-token Voyage flexible-dim validator, surfaced via `gbrain models doctor`

COVERAGE: 8/8 changed code paths tested  |  QUALITY: tier ★★★ (behavior + wire shape + regression pin)

Pre-Landing Review

Adversarial Codex review ran twice during PR evaluation:

  • First pass caught the voyage-4-nano membership P2 (resolved in 8b02e238).
  • Second pass on the proposed plan caught the gateway.ts:619 OOM-cap P3 and the embedding_dimensions=undefined → 1536 edge case (both resolved in d5dc5fad).
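
The `embedding_dimensions=undefined → 1536` edge case can be illustrated with a small validator. The names `VOYAGE_VALID_OUTPUT_DIMS`, `isValidVoyageOutputDim`, `AIConfigError`, and `DEFAULT_EMBEDDING_DIMENSIONS` appear in this PR, but the bodies, the `resolveVoyageDims` helper, and the error wording below are assumptions made for the sketch.

```typescript
// Assumed minimal shape; the real AIConfigError may carry more fields.
class AIConfigError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "AIConfigError";
  }
}

const VOYAGE_VALID_OUTPUT_DIMS = [256, 512, 1024, 2048] as const;
const DEFAULT_EMBEDDING_DIMENSIONS = 1536; // OpenAI-style default named in the PR

function isValidVoyageOutputDim(dims: number): boolean {
  return VOYAGE_VALID_OUTPUT_DIMS.some((valid) => valid === dims);
}

// Hypothetical helper: resolves the configured value and fails fast at the
// embed boundary instead of waiting for an HTTP 400 from the provider.
function resolveVoyageDims(configured: number | undefined): number {
  const dims = configured ?? DEFAULT_EMBEDDING_DIMENSIONS;
  if (!isValidVoyageOutputDim(dims)) {
    throw new AIConfigError(
      `embedding_dimensions=${dims} is not valid for Voyage flexible-dim models; ` +
        `expected one of ${VOYAGE_VALID_OUTPUT_DIMS.join(", ")} ` +
        `(gbrain config set embedding_dimensions <value>)`,
    );
  }
  return dims;
}
```

Leaving the setting unset resolves to 1536 and throws, which is the misconfiguration both the runtime guard and the doctor probe are meant to surface.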

No critical findings remain. bun run verify (all 9 checks + tsc --noEmit) clean.

Plan Completion

All planned items DONE. Plan file: ~/.claude/plans/this-is-an-outside-jolly-nebula.md.

Documentation

  • CLAUDE.md: New entry for src/core/ai/dims.ts (the Voyage flexible-dim resolver + VOYAGE_VALID_OUTPUT_DIMS allowlist + runtime validator). v0.33.1.1 waves appended to src/core/ai/gateway.ts (VoyageResponseTooLargeError rethrow), src/core/ai/recipes/voyage.ts (tightened docstring naming the seven hosted flexible-dim models, nano explicitly excluded), and src/commands/models.ts (new zero-token embedding_config doctor probe + 'config' probe status + fix? field). New section "Key commands added in v0.33.1.1 (Voyage 2048-dim correctness wave)" documenting the four user-facing changes.
  • README.md: gbrain models doctor help-text line extended to mention the new embedding_config probe (catches Voyage flexible-dim misconfigs before first embed).
  • llms-full.txt: Regenerated via bun run build:llms (test/build-llms.test.ts gates this on every CLAUDE.md edit).
  • CHANGELOG.md: Comprehensive v0.33.1.1 entry already in place — not modified.
  • TODOS.md: Already current — not modified.

Why this PR replaces #866

PR #866 was opened on Eva's fork. maintainerCanModify is set but doesn't grant cross-fork push from this account (both HTTPS and SSH return 403). Landing Eva's correct core fix + 3 Codex-caught follow-ups atomically required cherry-picking her commit verbatim onto upstream and stacking the follow-ups on top. Her authorship is preserved on commit ee90709e.

#866 closes with attribution. Thanks @100yenadmin.

Test plan

  • bun test test/ai/gateway.test.ts — 34/34 pass
  • bun run test (parallel fast loop, 4 shards × 2 concurrency) — 6039 pass / 0 fail (post-master-merge)
  • bun run verify — 9 checks + tsc --noEmit clean
  • Adversarial Codex review — twice, no remaining findings

🤖 Generated with Claude Code

Eva and others added 2 commits May 13, 2026 02:42
Voyage's hosted /embeddings endpoint accepts `output_dimension` only for
the seven flexible-dim models (voyage-4-large, voyage-4, voyage-4-lite,
voyage-3-large, voyage-3.5, voyage-3.5-lite, voyage-code-3). voyage-4-nano
is an open-weight variant Voyage lists separately as fixed 1024-dim — the
hosted API rejects the parameter for it.

The recipe docstring previously claimed "all v4 variants" have flexible
dims, which is what led to nano being added to the allowlist in the first
place. Tighten the comment to name the hosted trio explicitly and call out
nano-as-open-weight.

Convert the test case at test/ai/gateway.test.ts from a positive assertion
(voyage-4-nano returns { dimensions: 512 }) to a negative regression pin
(voyage-4-nano returns undefined), so a future contributor can't silently
re-add nano without breaking this test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan and others added 3 commits May 13, 2026 08:07
…P3 follow-ups)

Two follow-ups from Codex's adversarial review of PR #962, both Voyage-adjacent
correctness fixes that the original PR scope had filed as TODOs.

1. gateway.ts:619 Voyage OOM cap was theatrical
-------------------------------------------------
voyageCompatFetch's inbound response rewriter is wrapped in a try/catch that
falls back to the original response on parse failure — correct for "Voyage
returned JSON I can't reshape, let the SDK handle it." But the per-embedding
Layer 2 OOM cap at line 619 threw a bare `new Error(...)`, which the same
catch silently swallowed. Net result: an oversized base64 response (Layer 1
skipped because no Content-Length header) returned through to the AI SDK and
could OOM the worker on JSON.parse.

Fix: introduce `VoyageResponseTooLargeError`, throw it at both cap sites
(Content-Length Layer 1 at line 595 and per-embedding Layer 2 at line 619),
and rethrow it from the inbound try/catch via `if (err instanceof
VoyageResponseTooLargeError) throw err`. Pre-existing fall-back-on-parse-error
behavior for other thrown errors is preserved.

Regression-pinned by 2 new behavioral tests (mock fetch returns oversized
Content-Length / oversized base64; embed() throws with the expected message)
and a structural assertion in test/voyage-response-cap.test.ts that the
`instanceof VoyageResponseTooLargeError ⇒ throw` line stays put.

2. Voyage flexible-dim runtime validation + doctor check
-------------------------------------------------------
A brain configured for a Voyage flexible-dim model (voyage-4-large,
voyage-3-large, voyage-3.5, voyage-3.5-lite, voyage-4, voyage-4-lite,
voyage-code-3) without an explicit `embedding_dimensions` would fall back to
DEFAULT_EMBEDDING_DIMENSIONS=1536 — an OpenAI default that Voyage rejects.
Voyage's only accepted values are {256, 512, 1024, 2048}. Pre-fix the failure
surfaced as an HTTP 400 from Voyage that often got misclassified as a
transient network error.

Fix:
- `dims.ts` exports `VOYAGE_VALID_OUTPUT_DIMS` and `isValidVoyageOutputDim`.
- `dimsProviderOptions` throws `AIConfigError` with a paste-ready fix command
  (`gbrain config set embedding_dimensions ...`) when a Voyage flexible-dim
  model is configured with an invalid dim value.
- `gbrain models doctor` gets a new `embedding_config` probe that runs first
  (zero tokens) and surfaces the misconfiguration before any chat/expansion
  probes spend a single token. New probe status `config` + optional `fix`
  hint rendered in human output.

Regression-pinned by 6 new unit tests covering the AIConfigError throw,
exact valid-values set, the bypass path for fixed-dim Voyage models, and
the fix-hint contents.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@garrytan changed the title from "fix: send Voyage output_dimension on embedding requests" to "v0.33.1.1 fix: Voyage output_dimension + flexible-dim guard + OOM-cap rethrow" on May 13, 2026
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@garrytan garrytan merged commit 182a144 into master May 13, 2026
7 checks passed
brandonlipman added a commit to brandonlipman/gbrain that referenced this pull request May 29, 2026
* upstream/master:
  v0.35.1.0: embedder shootout prereqs (pricing + gateway export + --resume-from) (garrytan#1055)
  v0.35.0.0 feat: ZeroEntropy zembed-1 + zerank-2 reranker (garrytan#1008)
  v0.34.4.0 fix(embed): cursor-paginated --stale hardening wave (D2/D3/D4/D6/D7/D8 + regression test) (garrytan#991)
  v0.34.3.0 fix: supervisor treats code=0 watchdog exits as crashes (garrytan#1003)
  v0.34.2.0 fix(import): path-based checkpoint resume — kills parallel-drop + failed-file-skip + sort-flip bugs (garrytan#988)
  v0.34.1.0 fix(mcp): MCP fix wave — source-isolation P0 + PKCE DCR + federated_read + 3 more (garrytan#996)
  v0.34.0.0 feat: Cathedral III — recursive code intelligence + Leiden clusters + eval gate (garrytan#994)
  v0.33.3.0 feat(v0.33.3): code intelligence MCP foundation (v0.34 W0a-c + W3) (garrytan#934)
  v0.33.2.1 docs: fork-PR workflow for garrytan-agents (garrytan#992)
  fix(sync): raise maxBuffer to 100 MiB to prevent silent ENOBUFS crash (garrytan#982)
  v0.33.2.0 feat(search-lite): token budget + semantic query cache + intent weighting (garrytan#897)
  v0.33.1.1 fix: Voyage output_dimension + flexible-dim guard + OOM-cap rethrow (garrytan#962)