bug: reject partial or mixed-dimension embedding responses before indexing #925

@100yenadmin

TL;DR

GBrain should reject embedding responses unless the provider returns exactly one vector for each input text and every vector matches the configured schema dimensions. Today the gateway only checks the first returned vector's dimensions, so a partial or mixed-dimension provider response can silently misalign chunks, skip rows, or write unsafe embeddings.

flowchart LR
  A["texts[0..N]"] --> B["embedMany"]
  B --> C["provider response"]
  C --> D{"count == N?"}
  D -->|"no"| E["fail before indexing"]
  D -->|"yes"| F{"every vector dim == schema?"}
  F -->|"no"| E
  F -->|"yes"| G["write chunk embeddings"]

Why This Matters

The embedding table is a database contract: chunk row i must receive the vector for input text i, and that vector must match content_chunks.embedding vector(N). If a provider, proxy, shim, or degraded endpoint returns fewer vectors than requested, or any vector of the wrong width, GBrain should fail closed before indexing.

This is provider-neutral. It protects OpenAI, Voyage, Gemini, LiteLLM, local servers, and any future OpenAI-compatible provider.

Root Cause

The current gateway checks only the first returned embedding's dimensions:

const first = result.embeddings?.[0];
if (first && Array.isArray(first) && first.length !== expectedDims) { ... }

That misses two important cases:

  1. partial response: 2 input texts, but only 1 returned vector
  2. mixed response: the first vector has the correct width, but a later vector does not
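
To make the two gaps concrete, here is a minimal sketch that reproduces the first-vector-only check against both kinds of response. The helper name firstVectorCheckPasses and the sample vectors are hypothetical, made up for illustration; only the condition inside the helper mirrors the snippet quoted above.

```typescript
// Hypothetical helper: returns true when the first-vector-only check
// (same condition as the gateway snippet above) lets the response through.
function firstVectorCheckPasses(embeddings: number[][], expectedDims: number): boolean {
  const first = embeddings?.[0];
  return !(first && Array.isArray(first) && first.length !== expectedDims);
}

const expectedDims = 3; // illustrative schema width, not GBrain's real value
const texts = ["alpha", "beta"];

// Case 1: partial response. Two inputs, one vector.
const partial: number[][] = [[0.1, 0.2, 0.3]];

// Case 2: mixed response. First vector is 3 wide, second is only 2 wide.
const mixed: number[][] = [[0.1, 0.2, 0.3], [0.4, 0.5]];

// Both malformed responses pass the existing check.
console.log(firstVectorCheckPasses(partial, expectedDims)); // true
console.log(firstVectorCheckPasses(mixed, expectedDims)); // true

// A count comparison would have caught case 1.
console.log(partial.length === texts.length); // false
```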

Acceptance Criteria

  • If result.embeddings.length !== texts.length, throw AIConfigError before returning vectors.
  • If any returned vector has the wrong dimension, throw AIConfigError before returning vectors.
  • The error message should make the operator action clear: retry/check provider health for partial responses; migrate/change model for dimension mismatch.
  • Add tests for partial response and later-vector dimension mismatch.
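
The first three criteria could be satisfied by a check along these lines. This is a sketch under stated assumptions, not the actual gateway code: the function name assertEmbeddingsMatchInputs is hypothetical, the AIConfigError class here is a local stand-in for the one in the codebase (whose real constructor may differ), and the message wording is only a suggestion.

```typescript
// Local stand-in for the project's AIConfigError; the real class may differ.
class AIConfigError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "AIConfigError";
  }
}

// Hypothetical validator: throws unless there is exactly one vector per
// input text and every vector has the configured width.
function assertEmbeddingsMatchInputs(
  texts: readonly string[],
  embeddings: unknown,
  expectedDims: number,
): number[][] {
  // Count check: catches partial (or missing) responses.
  if (!Array.isArray(embeddings) || embeddings.length !== texts.length) {
    const got = Array.isArray(embeddings) ? embeddings.length : 0;
    throw new AIConfigError(
      `Embedding provider returned ${got} vectors for ${texts.length} inputs. ` +
        `Retry, or check provider health.`,
    );
  }
  // Width check on every vector, not just the first.
  for (let i = 0; i < embeddings.length; i++) {
    const vec: unknown = embeddings[i];
    if (!Array.isArray(vec) || vec.length !== expectedDims) {
      const got = Array.isArray(vec) ? String(vec.length) : "a non-array value";
      throw new AIConfigError(
        `Embedding at index ${i} has dimension ${got}, expected ${expectedDims}. ` +
          `Migrate the schema or change the embedding model.`,
      );
    }
  }
  return embeddings as number[][];
}
```

Checking the count before the widths keeps the two operator actions separate: a count mismatch points at provider health, while a width mismatch points at the model or schema configuration.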

Agent Implementation Notes

Keep this PR small:

  • touch src/core/ai/gateway.ts
  • add focused tests in test/ai/gateway.test.ts
  • do not change provider recipes, batching, schema, or retry behavior

The fix belongs immediately after embedMany() returns and before recordSubBatchSuccess(recipe).
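
The ordering matters because a malformed response should not count as a successful sub-batch. The sketch below illustrates that placement with stubs. Everything in it is an assumption for illustration: the Recipe shape, the embedSubBatch wrapper, and the stubbed embedMany and recordSubBatchSuccess are hypothetical and do not reflect the real signatures in src/core/ai/gateway.ts.

```typescript
type Recipe = { id: string }; // hypothetical shape

// Stub: counts recorded successes so the ordering is observable.
let recordedSuccesses = 0;
function recordSubBatchSuccess(_recipe: Recipe): void {
  recordedSuccesses += 1;
}

// Stub provider call that drops the last vector to simulate a partial response.
async function embedMany(texts: string[]): Promise<{ embeddings?: number[][] }> {
  return { embeddings: texts.slice(0, -1).map(() => [0.1, 0.2, 0.3]) };
}

// Hypothetical wrapper showing where validation sits in the flow.
async function embedSubBatch(
  recipe: Recipe,
  texts: string[],
  expectedDims: number,
): Promise<number[][]> {
  const result = await embedMany(texts);
  const vectors = result.embeddings ?? [];

  // Validate immediately after embedMany() returns.
  const aligned =
    vectors.length === texts.length &&
    vectors.every((v) => Array.isArray(v) && v.length === expectedDims);
  if (!aligned) {
    // The real fix would throw AIConfigError with an operator-facing message.
    throw new Error("embedding response failed validation");
  }

  // Reached only for a complete, correctly sized response.
  recordSubBatchSuccess(recipe);
  return vectors;
}
```

With the stubbed partial response, embedSubBatch rejects and recordedSuccesses stays at 0, so nothing downstream treats the sub-batch as indexed.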
