feat(embeddings): make embedding model and vector dimensions configurable #512
Conversation
Great idea. However, this would require more changes. The schema in src/models.py would have to change, which would require an Alembic migration. User-configured dimensions would also likely have to be validated, and this likely wouldn't work for users who already have pre-derived facts stored with the original vector dimensions.
Thanks @Gabrielgvl! Configurable embedding model and vector dimensions is exactly what PR #459 delivers as a core part of its refactor. The new nested Closing this as a duplicate, but thank you for putting in the work. I'm also closing your related issue #513 - the
Summary
This PR removes hardcoded embedding assumptions (provider wiring, model name, and 1536 dimensions) so self-hosted deployments can use OpenAI-compatible embedding providers (e.g. Ollama) without source overrides.
Root Cause
Current code hardcodes OpenAI-style defaults in multiple places:
- `src/embedding_client.py`
  - `openai/text-embedding-3-small`
  - `text-embedding-3-small`
  - `output_dimensionality=1536`
  - no `custom` path for embedding provider even though non-embedding LLM config uses `custom`
- `src/models.py`
  - `Document.embedding` and `MessageEmbedding.embedding` use `Vector(1536)` directly

This forces downstream operators to mount patched `embedding_client.py`, `models.py`, and `config.py` into the container when using embedding models with different dimensions (e.g. `qwen3-embedding:4b` -> 2560 dims).
Changes
- `src/config.py`
  - `LLMSettings.EMBEDDING_PROVIDER` now accepts `custom` (keeps `openrouter` alias)
  - `LLMSettings.EMBEDDING_MODEL: str | None = None`
- `src/embedding_client.py`
  - `custom` embedding provider using:
    - `LLM_OPENAI_COMPATIBLE_API_KEY`
    - `LLM_OPENAI_COMPATIBLE_BASE_URL`
  - `openrouter` as compatible alias (default base URL fallback)
  - `EMBEDDING_MODEL` override for gemini/openrouter/custom/openai
  - removes the hardcoded `output_dimensionality=1536` in all Gemini embedding calls
- `src/models.py`
  - `Vector(settings.VECTOR_STORE.DIMENSIONS)` for:
    - `MessageEmbedding.embedding`
    - `Document.embedding`
- `tests/test_embedding_configuration.py`
  - `config.output_dimensionality`
  - `settings.VECTOR_STORE.DIMENSIONS`
Why this is safe
Operators can set:
- `LLM_EMBEDDING_PROVIDER=custom`
- `LLM_EMBEDDING_MODEL=<provider model>`
- `VECTOR_STORE_DIMENSIONS=<matching dim>`

without code mounts/overrides.
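As a rough illustration of how the three variables above could be read and sanity-checked at startup, here is a minimal stdlib-only sketch. It does not reproduce the PR's `LLMSettings` class: `EmbeddingConfig`, `load_embedding_config`, the `openai` fallback provider, and the rule that `custom` requires an explicit model are all assumptions made for the example. Only the environment variable names and the previously hardcoded values come from the description.

```python
from __future__ import annotations

from collections.abc import Mapping
from dataclasses import dataclass

# Providers named in the PR description (the exact accepted set is an assumption).
KNOWN_PROVIDERS = {"openai", "gemini", "openrouter", "custom"}

# Previously hardcoded values quoted in the PR description.
DEFAULT_MODELS = {
    "openai": "text-embedding-3-small",
    "openrouter": "openai/text-embedding-3-small",
}
DEFAULT_DIMENSIONS = 1536


@dataclass(frozen=True)
class EmbeddingConfig:
    """Hypothetical container for the resolved embedding settings."""

    provider: str
    model: str | None
    dimensions: int


def load_embedding_config(env: Mapping[str, str]) -> EmbeddingConfig:
    """Resolve provider, model and dimensions from an environment mapping."""
    # Assumption: fall back to "openai" when no provider is configured.
    provider = env.get("LLM_EMBEDDING_PROVIDER", "openai").strip().lower()
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown embedding provider: {provider!r}")

    # An explicit model wins; otherwise use the old hardcoded default, if any.
    model = env.get("LLM_EMBEDDING_MODEL") or DEFAULT_MODELS.get(provider)
    # Assumption: a custom endpoint has no sensible default model.
    if provider == "custom" and not model:
        raise ValueError("LLM_EMBEDDING_MODEL is required for the custom provider")

    raw_dims = env.get("VECTOR_STORE_DIMENSIONS", str(DEFAULT_DIMENSIONS))
    try:
        dimensions = int(raw_dims)
    except ValueError as exc:
        raise ValueError(f"VECTOR_STORE_DIMENSIONS must be an integer, got {raw_dims!r}") from exc
    if dimensions <= 0:
        raise ValueError("VECTOR_STORE_DIMENSIONS must be positive")

    return EmbeddingConfig(provider=provider, model=model, dimensions=dimensions)
```

Passing the mapping in explicitly (for example `load_embedding_config(os.environ)`) keeps the function easy to test. With an empty mapping, the sketch falls back to the old `text-embedding-3-small` / 1536 pairing.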
Validation
- `uv run ruff check src/config.py src/embedding_client.py src/models.py tests/test_embedding_configuration.py`
- `uv run ruff format --check src/config.py src/embedding_client.py src/models.py tests/test_embedding_configuration.py`
- `uv run pytest --noconftest tests/test_embedding_configuration.py -q`
  - 8 passed
Operator Impact
This should allow removing runtime code overrides like:
- `embedding_client.py`
- `models.py`
- `config.py`

from docker-compose in self-hosted setups.
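Once the overrides are gone, `VECTOR_STORE_DIMENSIONS` is the value an operator has to keep in sync with the embedding model, which is also the validation concern raised in review. A hedged sketch of a guard that could run against one probe embedding at startup follows. `check_embedding_dimensions` and `EmbeddingDimensionError` are hypothetical names, not part of this PR.

```python
from collections.abc import Sequence


class EmbeddingDimensionError(ValueError):
    """Raised when a returned vector does not match the configured size (hypothetical)."""


def check_embedding_dimensions(embedding: Sequence[float], expected: int) -> None:
    """Fail fast if the provider returns vectors of an unexpected length."""
    actual = len(embedding)
    if actual != expected:
        raise EmbeddingDimensionError(
            f"embedding has {actual} dimensions but VECTOR_STORE_DIMENSIONS is {expected}"
        )


# Example: a 2560-dim vector (the size quoted for qwen3-embedding:4b) passes
# when the configured dimension is also 2560.
check_embedding_dimensions([0.0] * 2560, expected=2560)
```

A check like this only catches a model/config mismatch. It does not address rows already stored at the old dimension, which would still need a schema migration.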