feat(embeddings): make embedding model and vector dimensions configurable #512

Closed
Gabrielgvl wants to merge 3 commits into plastic-labs:main from Gabrielgvl:fix/configurable-embedding-model-and-dimensions

Gabrielgvl commented Apr 6, 2026

Summary

This PR removes hardcoded embedding assumptions (provider wiring, model name, and 1536 dimensions) so self-hosted deployments can use OpenAI-compatible embedding providers (e.g. Ollama) without source overrides.

Root Cause

Current code hardcodes OpenAI-style defaults in multiple places:

  • src/embedding_client.py
    • openrouter model hardcoded to openai/text-embedding-3-small
    • openai model hardcoded to text-embedding-3-small
    • Gemini requests hardcode output_dimensionality=1536
    • no explicit custom path for embedding provider even though non-embedding LLM config uses custom
  • src/models.py
    • Document.embedding and MessageEmbedding.embedding use Vector(1536) directly

This forces downstream operators to mount patched embedding_client.py, models.py, and config.py into the container when using embedding models with different dimensions (e.g. qwen3-embedding:4b -> 2560 dims).

Changes

  1. Add configurable embedding model + provider compatibility in settings
     • src/config.py
       • LLMSettings.EMBEDDING_PROVIDER now accepts custom (keeps openrouter alias)
       • add LLMSettings.EMBEDDING_MODEL: str | None = None
  2. Align embedding provider handling with LLM configuration pattern
     • src/embedding_client.py
       • Support custom embedding provider using:
         • LLM_OPENAI_COMPATIBLE_API_KEY
         • LLM_OPENAI_COMPATIBLE_BASE_URL
       • Keep openrouter as compatible alias (default base URL fallback)
       • Validate unsupported embedding provider values with explicit error
       • Use EMBEDDING_MODEL override for gemini/openrouter/custom/openai
  3. Remove hardcoded Gemini dimensionality
     • Stop forcing output_dimensionality=1536 in all Gemini embedding calls
     • Let provider/model defaults apply
  4. Make ORM vector dimensions follow config
     • src/models.py
       • Vector(settings.VECTOR_STORE.DIMENSIONS) for:
         • MessageEmbedding.embedding
         • Document.embedding
  5. Add tests (including regression coverage)
     • tests/test_embedding_configuration.py
       • custom provider path (key/base URL/model wiring)
       • custom requires base URL (error case)
       • openrouter alias behavior
       • openai behavior
       • invalid provider rejection
       • regression: gemini embed calls no longer pass config.output_dimensionality
       • wrapper singleton resolves custom provider API key path
       • regression: ORM embedding dims follow settings.VECTOR_STORE.DIMENSIONS
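
The provider handling listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `resolve_embedding_provider` and `ResolvedEmbeddingConfig` are hypothetical names, the Gemini default model is an assumed placeholder, and requiring a model for `custom` is a choice made for this sketch. The OpenAI and OpenRouter default models are the ones quoted under Root Cause.

```python
from __future__ import annotations

from dataclasses import dataclass

# Default model per provider. The openai/openrouter values are quoted in the
# PR description; the gemini value is an assumption for illustration only.
DEFAULT_MODELS = {
    "openai": "text-embedding-3-small",
    "openrouter": "openai/text-embedding-3-small",
    "gemini": "gemini-embedding-001",  # assumed, not stated in the PR
}
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


@dataclass(frozen=True)
class ResolvedEmbeddingConfig:
    provider: str
    model: str
    base_url: str | None


def resolve_embedding_provider(
    provider: str,
    model_override: str | None = None,
    base_url: str | None = None,
) -> ResolvedEmbeddingConfig:
    """Hypothetical helper: choose the model and base URL for a provider."""
    if provider == "custom":
        # The PR tests that custom without a base URL is an error.
        if not base_url:
            raise ValueError("custom embedding provider requires a base URL")
        # Sketch-only assumption: custom has no sensible default model.
        if not model_override:
            raise ValueError("custom embedding provider requires a model")
        return ResolvedEmbeddingConfig(provider, model_override, base_url)

    if provider not in DEFAULT_MODELS:
        raise ValueError(f"unsupported embedding provider: {provider!r}")

    # EMBEDDING_MODEL, when set, overrides the per-provider default.
    model = model_override or DEFAULT_MODELS[provider]
    if provider == "openrouter":
        # openrouter falls back to its public base URL when none is given.
        base_url = base_url or OPENROUTER_BASE_URL
    return ResolvedEmbeddingConfig(provider, model, base_url)
```

Rejecting unknown provider names up front, rather than falling through to a default client, matches the "explicit error" behavior described in the list.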

Why this is safe

  • Backward-compatible defaults remain for existing users.
  • Existing 1536 deployments continue to work unchanged.
  • Self-hosted operators can now align the following without code mounts/overrides:
    • LLM_EMBEDDING_PROVIDER=custom
    • LLM_EMBEDDING_MODEL=<provider model>
    • VECTOR_STORE_DIMENSIONS=<matching dim>
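
As a rough illustration of how these three variables could line up at startup, the sketch below reads them from an environment mapping. `load_embedding_env` and `EmbeddingEnv` are hypothetical names, the `openai` provider fallback is an assumption, and the 1536 default mirrors the previously hardcoded dimension. The project's real settings classes may load these values differently.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class EmbeddingEnv:
    provider: str
    model: str | None
    dimensions: int


def load_embedding_env(env: Mapping[str, str] | None = None) -> EmbeddingEnv:
    """Hypothetical loader for the three operator-facing variables."""
    env = os.environ if env is None else env

    raw_dims = env.get("VECTOR_STORE_DIMENSIONS", "1536")
    try:
        dimensions = int(raw_dims)
    except ValueError:
        raise ValueError(
            f"VECTOR_STORE_DIMENSIONS must be an integer, got {raw_dims!r}"
        ) from None
    if dimensions <= 0:
        raise ValueError("VECTOR_STORE_DIMENSIONS must be positive")

    return EmbeddingEnv(
        # Assumed fallback provider; the PR only says defaults stay compatible.
        provider=env.get("LLM_EMBEDDING_PROVIDER", "openai"),
        # An unset or empty model means "use the provider's default model".
        model=env.get("LLM_EMBEDDING_MODEL") or None,
        dimensions=dimensions,
    )
```

With the Ollama example from the summary, the mapping would carry `LLM_EMBEDDING_PROVIDER=custom`, `LLM_EMBEDDING_MODEL=qwen3-embedding:4b`, and `VECTOR_STORE_DIMENSIONS=2560`.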

Validation

  • Ruff:
    • uv run ruff check src/config.py src/embedding_client.py src/models.py tests/test_embedding_configuration.py
    • uv run ruff format --check src/config.py src/embedding_client.py src/models.py tests/test_embedding_configuration.py
  • Targeted tests (isolated from repo conftest):
    • uv run pytest --noconftest tests/test_embedding_configuration.py -q
    • Result: 8 passed

Operator Impact

This should allow removing runtime code overrides like the following from docker-compose in self-hosted setups:

  • mounted embedding_client.py
  • mounted models.py
  • mounted config.py

adavyas commented Apr 7, 2026

Contributor

Great idea. However, this would require more changes. The schema in src/models.py would have to change, which would require an Alembic migration. User-configured dimensions would also likely have to be validated, and this likely wouldn't work for users who already have derived facts stored at the original vector dimensions.
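
The compatibility concern can be made concrete with a small check. This is a minimal sketch, assuming the deployment can report the dimension of the vectors it has already stored; `check_dimension_change` is a hypothetical helper, not something in this PR or the repository.

```python
from __future__ import annotations


def check_dimension_change(configured: int, stored: int | None) -> str:
    """Compare a configured vector dimension with already-stored vectors.

    `stored` is None when no embeddings have been written yet.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(configured, bool) or not isinstance(configured, int):
        raise ValueError("configured dimension must be an integer")
    if configured <= 0:
        raise ValueError("configured dimension must be positive")

    if stored is None or stored == configured:
        return "ok"
    # Vectors of one dimension cannot be compared with vectors of another,
    # so the column must be migrated and existing content re-embedded.
    return "migration-and-reembed-required"
```

For example, a deployment with 1536-dimension rows that switches to a 2560-dimension model would get `"migration-and-reembed-required"`, while a fresh install would get `"ok"`.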

ajspig commented Apr 7, 2026

Contributor

Thanks @Gabrielgvl! Configurable embedding model and vector dimensions is exactly what PR #459 delivers as a core part of its refactor. The new nested EMBEDDING config structure lets you set model, dimensions (VECTOR_DIMENSIONS), base_url, and transport — all replacing the old flat env vars and hardcoded values.

Closing this as a duplicate, but thank you for putting in the work. I'm also closing your related issue #513 — the openai transport with base_url override covers the "custom" OpenAI-compatible provider use case you described there.

ajspig closed this Apr 7, 2026