feat(embeddings): make embedding model and vector dimensions configurable by Gabrielgvl · Pull Request #512 · plastic-labs/honcho

Gabrielgvl · 2026-04-06T23:46:04Z

Summary

This PR removes hardcoded embedding assumptions (provider wiring, model name, and 1536 dimensions) so self-hosted deployments can use OpenAI-compatible embedding providers (e.g. Ollama) without source overrides.

Root Cause

Current code hardcodes OpenAI-style defaults in multiple places:

src/embedding_client.py
- openrouter model hardcoded to openai/text-embedding-3-small
- openai model hardcoded to text-embedding-3-small
- Gemini requests hardcode output_dimensionality=1536
- no explicit custom path for embedding provider even though non-embedding LLM config uses custom
src/models.py
- Document.embedding and MessageEmbedding.embedding use Vector(1536) directly

This forces downstream operators to mount patched embedding_client.py, models.py, and config.py into the container when using embedding models with different dimensions (e.g. qwen3-embedding:4b -> 2560 dims).

Changes

Add configurable embedding model + provider compatibility in settings

src/config.py
- LLMSettings.EMBEDDING_PROVIDER now accepts custom (keeps openrouter alias)
- add LLMSettings.EMBEDDING_MODEL: str | None = None
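As a rough illustration of the settings change, the sketch below loads the three values from the environment. It uses a plain dataclass rather than the project's actual settings classes, and the names `EmbeddingSettings` and `load_embedding_settings`, the `openai` default provider, and placing the dimensions on the same object are assumptions made for the example, not code from this PR.

```python
import os
from dataclasses import dataclass
from typing import Literal, Mapping, Optional, get_args

# Provider values named in the PR description.
EmbeddingProvider = Literal["openai", "gemini", "openrouter", "custom"]


@dataclass(frozen=True)
class EmbeddingSettings:
    """Hypothetical stand-in for the embedding-related settings fields."""

    provider: str = "openai"      # assumed default, not stated in the PR
    model: Optional[str] = None   # None means "use the provider's default model"
    dimensions: int = 1536        # mirrors VECTOR_STORE_DIMENSIONS


def load_embedding_settings(env: Mapping[str, str] = os.environ) -> EmbeddingSettings:
    """Read the embedding settings from environment-style key/value pairs."""
    provider = env.get("LLM_EMBEDDING_PROVIDER", "openai")
    if provider not in get_args(EmbeddingProvider):
        raise ValueError(f"Unsupported embedding provider: {provider!r}")
    # An empty string is treated the same as an unset variable.
    model = env.get("LLM_EMBEDDING_MODEL") or None
    dimensions = int(env.get("VECTOR_STORE_DIMENSIONS", "1536"))
    return EmbeddingSettings(provider=provider, model=model, dimensions=dimensions)
```

With no variables set, this sketch returns the 1536-dimension defaults, which matches the backward-compatibility goal described in this PR.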

Align embedding provider handling with LLM configuration pattern

src/embedding_client.py
- Support custom embedding provider using:
  - LLM_OPENAI_COMPATIBLE_API_KEY
  - LLM_OPENAI_COMPATIBLE_BASE_URL
- Keep openrouter as compatible alias (default base URL fallback)
- Validate unsupported embedding provider values with explicit error
- Use EMBEDDING_MODEL override for gemini/openrouter/custom/openai
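The provider handling listed above could look roughly like the following. This is a minimal sketch covering only the OpenAI-compatible paths (openai, openrouter, custom); the Gemini branch is left out. The function name, the `ResolvedEmbeddingClient` type, and the choice to require an explicit model for `custom` are assumptions for illustration and may differ from the PR's actual code.

```python
from dataclasses import dataclass
from typing import Optional

# Defaults named in the PR description for the previously hardcoded models.
DEFAULT_MODELS = {
    "openai": "text-embedding-3-small",
    "openrouter": "openai/text-embedding-3-small",
}
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


@dataclass(frozen=True)
class ResolvedEmbeddingClient:
    """Hypothetical result type: what an OpenAI-compatible client needs."""

    api_key: Optional[str]
    base_url: Optional[str]  # None means "use the SDK's default endpoint"
    model: str


def resolve_openai_compatible(
    provider: str,
    *,
    model_override: Optional[str] = None,
    openai_api_key: Optional[str] = None,
    compatible_api_key: Optional[str] = None,
    compatible_base_url: Optional[str] = None,
) -> ResolvedEmbeddingClient:
    """Map a provider name to key, base URL, and model (Gemini not covered)."""
    if provider == "openai":
        model = model_override or DEFAULT_MODELS["openai"]
        return ResolvedEmbeddingClient(openai_api_key, None, model)
    if provider == "openrouter":
        # Alias of the compatible path, with a default base URL fallback.
        base_url = compatible_base_url or OPENROUTER_BASE_URL
        model = model_override or DEFAULT_MODELS["openrouter"]
        return ResolvedEmbeddingClient(compatible_api_key, base_url, model)
    if provider == "custom":
        if not compatible_base_url:
            raise ValueError(
                "LLM_OPENAI_COMPATIBLE_BASE_URL is required for the custom provider"
            )
        if not model_override:
            # Assumption of this sketch: no sensible default exists for custom.
            raise ValueError("LLM_EMBEDDING_MODEL is required for the custom provider")
        return ResolvedEmbeddingClient(compatible_api_key, compatible_base_url, model_override)
    raise ValueError(f"Unsupported embedding provider: {provider!r}")
```

For an Ollama-style deployment, a call such as `resolve_openai_compatible("custom", model_override="qwen3-embedding:4b", compatible_base_url="http://localhost:11434/v1")` would return the custom endpoint and model unchanged; the URL here is only an example value.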

Remove hardcoded Gemini dimensionality

- Stop forcing output_dimensionality=1536 in all Gemini embedding calls
- Let provider/model defaults apply

Make ORM vector dimensions follow config

src/models.py
- Vector(settings.VECTOR_STORE.DIMENSIONS) for:
  - MessageEmbedding.embedding
  - Document.embedding

Add tests (including regression coverage)

tests/test_embedding_configuration.py
- custom provider path (key/base URL/model wiring)
- custom requires base URL (error case)
- openrouter alias behavior
- openai behavior
- invalid provider rejection
- regression: gemini embed calls no longer pass config.output_dimensionality
- wrapper singleton resolves custom provider API key path
- regression: ORM embedding dims follow settings.VECTOR_STORE.DIMENSIONS
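The Gemini regression case in the list above might be tested along these lines. This sketch uses `unittest.mock` in place of a real client; `embed_with_gemini` is a hypothetical wrapper, the model name is a placeholder, and the `client.models.embed_content(...)` call shape is assumed rather than taken from the PR's test file.

```python
from unittest.mock import MagicMock


def embed_with_gemini(client, model: str, text: str) -> list[float]:
    """Hypothetical wrapper that leaves output dimensionality to the model."""
    # No `config` argument is passed, so no output_dimensionality is forced.
    response = client.models.embed_content(model=model, contents=text)
    return list(response.embeddings[0].values)


def test_gemini_embed_does_not_force_dimensions() -> None:
    # Stand-in client whose embed_content returns one fake embedding.
    client = MagicMock()
    client.models.embed_content.return_value.embeddings = [
        MagicMock(values=[0.1, 0.2, 0.3])
    ]

    result = embed_with_gemini(client, "placeholder-embedding-model", "hello")

    # Inspect the keyword arguments of the recorded call.
    _, kwargs = client.models.embed_content.call_args
    config = kwargs.get("config")
    assert config is None or getattr(config, "output_dimensionality", None) is None
    assert result == [0.1, 0.2, 0.3]
```

The assertion accepts either no `config` at all or a `config` without `output_dimensionality`, so the test keeps passing if other config options are added later.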

Why this is safe

- Backward-compatible defaults remain for existing users.
- Existing 1536-dimension deployments continue to work unchanged.
- Self-hosted operators can now align the following settings without code mounts or overrides:
  - LLM_EMBEDDING_PROVIDER=custom
  - LLM_EMBEDDING_MODEL=<provider model>
  - VECTOR_STORE_DIMENSIONS=<matching dim>

Validation

Ruff:
- uv run ruff check src/config.py src/embedding_client.py src/models.py tests/test_embedding_configuration.py
- uv run ruff format --check src/config.py src/embedding_client.py src/models.py tests/test_embedding_configuration.py
Targeted tests (isolated from repo conftest):
- uv run pytest --noconftest tests/test_embedding_configuration.py -q
- Result: 8 passed

Operator Impact

This should allow removing runtime code overrides from docker-compose in self-hosted setups, such as:

- mounted embedding_client.py
- mounted models.py
- mounted config.py

adavyas · 2026-04-07T16:40:57Z

great idea. however, this would require more changes. schema would have to be changed in src/models.py which would require an alembic migration. user configured dimensions would also likely have to be validated, and this likely wouldn't work with users that already have pre-derived facts in the originally listed vector dimensions.
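The validation concern raised in this comment could be sketched as a pair of checks: one at startup, comparing the configured dimensions with whatever the existing column already uses, and one per embedding. Both function names and the idea of passing in the existing column size are hypothetical; how that size would be read from the database (and the migration itself) is not shown.

```python
from typing import Optional, Sequence


def validate_dimensions(configured: int, existing_column_dims: Optional[int] = None) -> int:
    """Reject dimension settings that cannot match the stored vectors."""
    if configured <= 0:
        raise ValueError(f"Vector dimensions must be positive, got {configured}")
    if existing_column_dims is not None and existing_column_dims != configured:
        # Vectors already stored at another size cannot be reused as-is.
        raise ValueError(
            f"Configured dimensions ({configured}) differ from the existing "
            f"column ({existing_column_dims}); a migration and re-embedding "
            "would be needed"
        )
    return configured


def check_embedding(vector: Sequence[float], configured: int) -> None:
    """Fail fast when a provider returns a vector of an unexpected length."""
    if len(vector) != configured:
        raise ValueError(
            f"Embedding has {len(vector)} dimensions, expected {configured}"
        )
```

Under these assumptions, a fresh deployment configured for 2560 dimensions passes, while pointing the same setting at a database created with 1536-dimension columns fails before any writes happen.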

ajspig · 2026-04-07T20:25:06Z

Thanks @Gabrielgvl! Configurable embedding model and vector dimensions is exactly what PR #459 delivers as a core part of its refactor. The new nested EMBEDDING config structure lets you set model, dimensions (VECTOR_DIMENSIONS), base_url, and transport — all replacing the old flat env vars and hardcoded values.

Closing this as a duplicate, but thank you for putting in the work. I'm also closing your related issue #513 — the openai transport with base_url override covers the "custom" OpenAI-compatible provider use case you described there.

Gabriel Vinhaes added 3 commits April 6, 2026 20:45

feat(embeddings): make embedding model and vector dimensions configurable

b693511

test(embeddings): cover custom provider and regression scenarios

e6a3b85

chore(tests): fix fixture typing for basedpyright

f625123

This was referenced Apr 7, 2026

Embedding provider parity: support 'custom' (OpenAI-compatible) in LLM_EMBEDDING_PROVIDER #513

Closed

Honcho memory plugin: honcho_search should query assistant-observed user context NousResearch/hermes-agent#5667

Closed

ajspig closed this Apr 7, 2026

Gabrielgvl wants to merge 3 commits into plastic-labs:main from Gabrielgvl:fix/configurable-embedding-model-and-dimensions
