embedding_client.py: OpenAI path doesn't send dimensions, silently ignoring EMBEDDING_VECTOR_DIMENSIONS #601

@0xAlcibiades

Summary

In src/embedding_client.py, the Gemini transport forwards output_dimensionality=self.vector_dimensions to the provider, but the OpenAI transport never passes the corresponding dimensions parameter on embeddings.create(...). As a result, each response is checked against EMBEDDING_VECTOR_DIMENSIONS by _validate_embedding_dimensions, but that size is never actually requested from the provider. Any OpenAI-compatible server whose model has a native embedding size different from the configured value therefore fails validation, even when the model fully supports MRL (Matryoshka) truncation.

Where (on main)

Three OpenAI call sites in src/embedding_client.py:

  1. _EmbeddingClient.embed (single query)
  2. _EmbeddingClient.simple_batch_embed
  3. _EmbeddingClient._process_batch

Each invokes self.client.embeddings.create(model=..., input=...) with no dimensions= argument. The Gemini branch alongside each does pass config={"output_dimensionality": self.vector_dimensions}.

Related work

Motivating example

Running Qwen3-Embedding-4B (2560-dim native, MRL-trained) locally via oMLX with honcho's pgvector mode at EMBEDDING_VECTOR_DIMENSIONS=1536:

  1. Honcho calls self.client.embeddings.create(model="qwen3-embedding-4b", input=[query]) — no dimensions.
  2. oMLX's OpenAI-compatible /v1/embeddings endpoint returns 2560 floats (the model's native size).
  3. _validate_embedding_dimensions raises: Embedding dimension mismatch ... Expected 1536, got 2560.

oMLX's endpoint already honors the OpenAI dimensions request parameter end-to-end (slices + L2 renormalizes server-side — correct for MRL-trained models), so if honcho passed dimensions=1536, it would receive a valid, correctly-renormalized 1536-dim vector and pgvector ingestion would work. The same applies to any MRL-capable model served by an OpenAI-compatible backend (Nomic v1.5, mxbai-embed-large-v1, OpenAI's own text-embedding-3-small/-large, etc.).
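For readers unfamiliar with MRL truncation, the "slice + L2 renormalize" step described above can be sketched in a few lines. This is an illustration only, not oMLX's or honcho's actual code; the function name truncate_and_renormalize is hypothetical.

```python
import math


def truncate_and_renormalize(vector: list[float], dimensions: int) -> list[float]:
    """Keep the first `dimensions` components and rescale to unit L2 norm.

    Hypothetical helper: mirrors what an MRL-aware server is described as
    doing when a client sends `dimensions`.
    """
    if not 0 < dimensions <= len(vector):
        raise ValueError(
            f"dimensions must be in 1..{len(vector)}, got {dimensions}"
        )
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return head
    return [x / norm for x in head]
```

Truncation of this kind is only meaningful for models trained with a Matryoshka objective; for other models the leading components carry no special guarantee.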

Related server-side work on oMLX: jundot/omlx#901 (adding a per-model default so even clients that don't send dimensions get the right size). That is a complementary fix on the server side; honcho is the right place to fix it on the client side regardless of which backend it talks to.

Proposal

Pass dimensions=self.vector_dimensions on the three OpenAI embeddings.create calls, mirroring the existing Gemini behavior.

Backward-compatibility note

OpenAI rejects the dimensions parameter for pre-text-embedding-3 models (notably text-embedding-ada-002 — the API returns a 400 if dimensions is supplied). To avoid regressing ada-002 users, the cleanest option is a config toggle:

EMBEDDING.SEND_DIMENSIONS: bool = True  # opt-out for ada-002 or providers that reject the field

An alternative would be to condition on model-name heuristics (e.g., skip when the model starts with text-embedding-ada), but that gets fragile against OpenAI-compatible servers using custom model names. A config flag is clearer.
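A minimal sketch of how the flag-gated variant might look at each call site. The helper build_embedding_kwargs and its send_dimensions argument are hypothetical names for illustration; only the model, input, and dimensions keyword arguments of embeddings.create come from the OpenAI Python SDK.

```python
from typing import Any


def build_embedding_kwargs(
    model: str,
    inputs: list[str],
    vector_dimensions: int,
    send_dimensions: bool = True,
) -> dict[str, Any]:
    """Assemble keyword arguments for an OpenAI-style embeddings.create call.

    Hypothetical helper: `send_dimensions` stands in for the proposed
    SEND_DIMENSIONS setting.
    """
    kwargs: dict[str, Any] = {"model": model, "input": inputs}
    if send_dimensions:
        # Request the configured size explicitly instead of relying on the
        # provider's native output size.
        kwargs["dimensions"] = vector_dimensions
    return kwargs
```

Each of the three call sites could then become something like self.client.embeddings.create(**build_embedding_kwargs(...)), so the opt-out for ada-002 lives in one place rather than being repeated per call.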

Scope

Happy to open a PR from a fork with the three-call-site change plus the SEND_DIMENSIONS flag and a docs update (.env.template, docs/v3/contributing/configuration.mdx) if the direction above works for you. Let me know if you'd prefer a different approach (e.g., always-send vs. flag-gated, or model-name heuristic) before I put the PR together.
