fix(gateway): thread asymmetric input_type to ZeroEntropy via request header — AI-SDK adapter drops providerOptions.openaiCompatible.input_type#2083

Closed
victorfteha wants to merge 1 commit into garrytan:master from victorfteha:fix/ze-embed-input-type

Conversation

@victorfteha

Problem

embedQuery() exists and threads inputType: 'query' into providerOptions.openaiCompatible — but the AI-SDK's openai-compatible adapter silently drops unrecognized providerOptions fields. dimensions survives (known param); input_type doesn't. Every embed request therefore reached the ZeroEntropy compat shim without an input_type, and the shim's correct-for-ingest default ('document') applied to queries too.

Net effect: the entire vector arm performs asymmetric retrieval with symmetric (document-typed) query vectors. Nothing errors — rankings are just quietly wrong, especially for natural-question phrasings against note-style passages (exactly what asymmetric encoding exists to fix).
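
To illustrate the failure mode, here is a minimal sketch of an allowlist-style request builder. It is a hypothetical stand-in and not the AI-SDK's actual adapter code; `buildEmbeddingBody` and `KNOWN_FIELDS` are names invented for this example, and the only behavior assumed is the one described above (known fields forwarded, unknown fields dropped without an error).

```typescript
// Hypothetical stand-in for an adapter that forwards only fields it recognizes.
// This is NOT the AI-SDK's real implementation; it only mimics the described behavior.
const KNOWN_FIELDS: readonly string[] = ["dimensions"];

function buildEmbeddingBody(
  model: string,
  input: string[],
  providerOptions: Record<string, unknown>,
): Record<string, unknown> {
  const body: Record<string, unknown> = { model, input };
  for (const key of KNOWN_FIELDS) {
    if (providerOptions[key] !== undefined) body[key] = providerOptions[key];
  }
  return body; // anything not on the allowlist never reaches the wire
}

const wireBody = buildEmbeddingBody("zembed-1", ["where did I put the notes?"], {
  dimensions: 1280,
  input_type: "query", // silently dropped: not in KNOWN_FIELDS
});
```

In this sketch `wireBody` carries `dimensions` but no `input_type`, and nothing throws, which is why a downstream default of 'document' would go unnoticed.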

How this was found (evidence chain)

On a real ~2.3k-page / ~9.7k-chunk corpus (zembed-1 @ 1280, Postgres engine):

  1. Hybrid search consistently missed targets for natural-question queries that pure-vector SQL with a properly query-typed vector ranked #1.
  2. A manually document-typed query embedding reproduced production's broken rankings byte-identically (cosine sims equal to 16 decimal places) — production was provably searching with document-typed query vectors.
  3. gbrain's own embedQuery() output matched that document-typed vector, not the query-typed one.
  4. Post-fix: 8/8 previously-missing natural-question probes rank their target #1 on the same corpus. (An unrelated dims experiment, 1280 vs native 2560, measured a flat wash, which was the tell: dims should have moved something. The model was never the problem.)
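
The comparison in steps 2 and 3 can be sketched as a cosine-similarity check: embed the same query once per input type, then see which candidate the observed vector matches. This is an illustrative sketch only; `cosine` and `closestCandidate` are hypothetical helpers, and the 3-d vectors are toy values, not data from the corpus.

```typescript
// Cosine similarity between two equal-length vectors.
// Returns NaN if either vector is all zeros.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  return dot / Math.sqrt(normA * normB);
}

// Returns the label of the candidate most similar to the observed vector.
function closestCandidate(
  observed: number[],
  candidates: Record<string, number[]>,
): string {
  let best = "";
  let bestSim = -Infinity;
  for (const [label, vec] of Object.entries(candidates)) {
    const sim = cosine(observed, vec);
    if (sim > bestSim) {
      best = label;
      bestSim = sim;
    }
  }
  return best;
}

// Toy vectors, invented for illustration only.
const verdict = closestCandidate([0.6, 0.8, 0.0], {
  "query-typed": [1.0, 0.0, 0.0],
  "document-typed": [0.6, 0.8, 0.0],
});
```

With these toy inputs the observed vector matches the document-typed candidate, mirroring the diagnosis above.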

Worth noting: test/asymmetric-encoding-contract.test.ts stayed green through all of this — it captures providerOptions at the __setEmbedTransportForTests seam, upstream of the adapter that drops the field. The contract it pins is real but doesn't reach the wire.

Fix

  • embedSubBatch() threads the input type as an x-gbrain-input-type request header — headers survive the adapter, travel with the request (race-free), and need no module state.
  • zeroEntropyCompatFetch consumes + strips the header during its existing body rewrite and writes input_type into the request body.
  • No API change. The Voyage path is unaffected (it builds its request body directly rather than going through the openai-compatible adapter — input_type already reaches the wire there).
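
The header-threading approach in the bullets above can be sketched roughly as follows. This is a simplified illustration and not the code in this PR: only the header name `x-gbrain-input-type` and the 'document' default come from the description, while `WireInit`, `TerminalFetch`, `moveInputTypeIntoBody`, and `withInputTypeRewrite` are hypothetical names, and the real `zeroEntropyCompatFetch` also performs other rewrites that are omitted here.

```typescript
const INPUT_TYPE_HEADER = "x-gbrain-input-type"; // header name taken from the PR text

// Simplified request shape for this sketch (assumption: JSON string body).
type WireInit = { method: string; headers: Record<string, string>; body: string };
type TerminalFetch = (url: string, init: WireInit) => Promise<unknown>;

// Hypothetical helper: strip the threading header and write its value into the JSON body.
function moveInputTypeIntoBody(init: WireInit): WireInit {
  const headers: Record<string, string> = {};
  let inputType = "document"; // default when the header is absent (correct for ingest)
  for (const [name, value] of Object.entries(init.headers)) {
    if (name.toLowerCase() === INPUT_TYPE_HEADER) {
      if (value === "query" || value === "document") inputType = value;
      continue; // consumed: do not forward to the provider
    }
    headers[name] = value;
  }
  const payload = JSON.parse(init.body) as Record<string, unknown>;
  payload.input_type = inputType;
  return { method: init.method, headers, body: JSON.stringify(payload) };
}

// Hypothetical wrapper: applies the rewrite before handing off to the real transport.
function withInputTypeRewrite(terminal: TerminalFetch): TerminalFetch {
  return (url, init) => terminal(url, moveInputTypeIntoBody(init));
}

const rewritten = moveInputTypeIntoBody({
  method: "POST",
  headers: { "content-type": "application/json", [INPUT_TYPE_HEADER]: "query" },
  body: JSON.stringify({ model: "zembed-1", input: ["where did I put the notes?"] }),
});
```

Because the value rides on each request's own headers, concurrent query and document embeds cannot observe each other's input type, which is the race-free property the fix relies on.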

Test

New test/ze-input-type-wire.test.ts drives embed()/embedQuery() through the real adapter via a new __setZeFetchForTests terminal-fetch seam (mirrors the existing __set*ForTests conventions) and asserts the final wire body:

  • embedQuery(...) → body input_type: 'query', threading header stripped, URL rewrite ran
  • embed([...]) → body input_type: 'document'
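
A wire-level assertion of this kind might look like the sketch below. It is not the contents of test/ze-input-type-wire.test.ts: `CapturedRequest` and `checkWireRequest` are hypothetical, the signature of `__setZeFetchForTests` is not assumed, the URL is a placeholder, and the URL-rewrite check is left out because its details are not given here.

```typescript
// Shape of what a terminal-fetch seam might capture (hypothetical).
type CapturedRequest = { url: string; headers: Record<string, string>; body: string };

// Returns a list of problems; an empty list means the wire request looks right.
function checkWireRequest(
  req: CapturedRequest,
  expected: "query" | "document",
): string[] {
  const problems: string[] = [];
  const body = JSON.parse(req.body) as { input_type?: unknown };
  if (body.input_type !== expected) {
    problems.push(`input_type is ${String(body.input_type)}, expected ${expected}`);
  }
  const leaked = Object.keys(req.headers).some(
    (name) => name.toLowerCase() === "x-gbrain-input-type",
  );
  if (leaked) problems.push("threading header was not stripped");
  return problems;
}

// A query embed as it would have looked before the fix: typed as a document.
const preFix = checkWireRequest(
  {
    url: "https://example.invalid/embeddings", // placeholder URL
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ input: ["q"], input_type: "document" }),
  },
  "query",
);

// The same embed after the fix.
const postFix = checkWireRequest(
  {
    url: "https://example.invalid/embeddings", // placeholder URL
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ input: ["q"], input_type: "query" }),
  },
  "query",
);
```

Checking the captured body at the terminal fetch, rather than the options passed into the adapter, is what lets a test of this shape catch a field that is dropped in between.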

Verified to fail without the fix (the query-side assertion sees 'document'). bun run verify: 30/30 green. Full unit suite: no new failures vs master — a set of provider-env-sensitive serial tests (reranker/cross-modal/think-gateway) fails identically on a clean master checkout in my environment (host injects ANTHROPIC_* env vars), so I'm expecting CI's clean env to be green.

Operational note for existing installs

Any cached query-embedding artifacts produced while the bug was live are document-typed — after upgrading, clear the query cache and restart long-running gbrain serve processes so they pick up the fixed code.

Possibly related: #1469 reports the hybrid vector path returning identical results regardless of input (ollama). Different symptom — a fixed/ignored query vector vs. a wrong-typed one — so this PR does not claim to fix it, but it's the same subsystem and worth re-testing against this change.

… header

providerOptions.openaiCompatible.input_type is silently dropped by the
AI-SDK adapter (unrecognized field; dimensions survives because it's a
known param). Every embed request — including query-side embedQuery() —
reached the ZE compat shim without input_type and got the shim's
correct-for-ingest 'document' default. Net effect: the entire vector arm
performed asymmetric retrieval with symmetric (document-typed) query
vectors. Nothing errored; rankings were just quietly wrong.

Fix: embedSubBatch() threads the input type as an 'x-gbrain-input-type'
request header (headers survive the adapter); zeroEntropyCompatFetch
consumes + strips it during its existing body rewrite. Race-free, no
module state, no API change. The Voyage path is unaffected (it builds
its request body directly, not via the openai-compatible adapter).

Adds test/ze-input-type-wire.test.ts — drives embed()/embedQuery()
through the REAL adapter via a new __setZeFetchForTests terminal-fetch
seam and asserts the final wire body. The existing
asymmetric-encoding-contract test captures providerOptions UPSTREAM of
the adapter, so it stayed green throughout this bug; the new test fails
without the fix (verified).
garrytan added a commit that referenced this pull request Jun 12, 2026
…SDK (supersedes #1400)

Reviewed: gateway.ts AsyncLocalStorage + search/mode.ts; disjoint from authored
fixes. Chosen over #2083 (whole openai-compatible recipe class vs ZeroEntropy-only).
Credit @billy-armstrong for the original #1400 diagnosis.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@garrytan

Owner

Superseded by #2033 (merged into the v0.42.41.0 triage wave, PR #2128), which fixes the same asymmetric input_type bug across the whole openai-compatible recipe class rather than ZeroEntropy-only. Thanks for the fix here — closing as superseded.

@garrytan garrytan closed this Jun 12, 2026