Bug: VECTOR_STORE_DIMENSIONS setting has no effect
The config option VECTOR_STORE_DIMENSIONS (default: 1536) exists in src/config.py but is never used in src/models.py. Both MessageEmbedding.embedding and Document.embedding are hardcoded to Vector(1536), which means changing the setting has no effect on the actual database column or SQLAlchemy model.
Affected files:
src/models.py line 281: mapped_column(Vector(1536), nullable=True)
src/models.py line 389: mapped_column(Vector(1536), nullable=True)
Suggested fix:
Import settings in src/models.py (from src.config import settings) and replace the hardcoded values:
# Before
embedding: MappedColumn[Any] = mapped_column(Vector(1536), nullable=True)
# After
embedding: MappedColumn[Any] = mapped_column(Vector(settings.VECTOR_STORE.DIMENSIONS), nullable=True)
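As a rough illustration of why the configured value and the column width need to agree, the sketch below reads the dimension from the environment and rejects embeddings of any other length. It is standalone Python without SQLAlchemy or pgvector. `configured_dimensions` and `check_embedding` are hypothetical helpers, not functions from the codebase, and reading the bare `VECTOR_STORE_DIMENSIONS` variable directly is an assumption standing in for the project's settings object.

```python
from __future__ import annotations

import os
from collections.abc import Mapping, Sequence

DEFAULT_DIMENSIONS = 1536  # default stated in this issue


def configured_dimensions(env: Mapping[str, str] | None = None) -> int:
    """Return the vector width from VECTOR_STORE_DIMENSIONS, or the default."""
    source = os.environ if env is None else env
    raw = source.get("VECTOR_STORE_DIMENSIONS")
    if raw is None or raw.strip() == "":
        return DEFAULT_DIMENSIONS
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError("VECTOR_STORE_DIMENSIONS must be a positive integer")
    return value


def check_embedding(vector: Sequence[float], dimensions: int) -> None:
    """Raise if an embedding does not match the configured column width."""
    if len(vector) != dimensions:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, expected {dimensions}"
        )
```

With the columns hardcoded to Vector(1536), a 1024-dimension model would fail a check like this no matter what the setting says. An existing database column also keeps its width until a migration changes it, so the model fix alone likely only covers freshly created tables.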
Feature Request: Separate custom embedding provider endpoint
Currently, LLM_EMBEDDING_PROVIDER=openrouter reuses LLM_OPENAI_COMPATIBLE_BASE_URL for both LLM inference and embeddings. This makes it impossible to use a self-hosted embedding service independently from the LLM inference endpoint.
Use case: Running a local embedding model (e.g. served by llama.cpp or Ollama) on a separate server while using a different endpoint for LLM inference.
Suggested addition to LLMSettings in src/config.py:
EMBEDDING_PROVIDER: Literal["openai", "gemini", "openrouter", "custom"] = "openai"
EMBEDDING_COMPATIBLE_BASE_URL: str | None = None
EMBEDDING_COMPATIBLE_API_KEY: str | None = None
EMBEDDING_MODEL: str | None = None
And a new branch in src/embedding_client.py:
elif self.provider == "custom":
    if api_key is None:
        api_key = settings.LLM.EMBEDDING_COMPATIBLE_API_KEY
    if not api_key:
        raise ValueError("LLM_EMBEDDING_COMPATIBLE_API_KEY is required")
    base_url = settings.LLM.EMBEDDING_COMPATIBLE_BASE_URL
    if not base_url:
        raise ValueError("LLM_EMBEDDING_COMPATIBLE_BASE_URL is required")
    self.client = AsyncOpenAI(api_key=api_key, base_url=base_url)
    self.model = settings.LLM.EMBEDDING_MODEL or "text-embedding-3-small"
    self.max_embedding_tokens = settings.MAX_EMBEDDING_TOKENS
    self.max_batch_size = 2048
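The resolution logic of that branch can also be sketched on its own, which makes it easy to see that the embedding endpoint no longer depends on LLM_OPENAI_COMPATIBLE_BASE_URL. This is a minimal standalone sketch, not the project's code: `EmbeddingEndpoint` and `resolve_custom_endpoint` are hypothetical names, and a plain mapping stands in for the settings object.

```python
from __future__ import annotations

from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingEndpoint:
    """Hypothetical container for the values the client would be built from."""

    base_url: str
    api_key: str
    model: str


def resolve_custom_endpoint(
    env: Mapping[str, str], api_key: str | None = None
) -> EmbeddingEndpoint:
    """Mirror the proposed "custom" branch using only the embedding-specific keys."""
    # An explicitly passed key wins; otherwise fall back to the environment.
    if api_key is None:
        api_key = env.get("LLM_EMBEDDING_COMPATIBLE_API_KEY")
    if not api_key:
        raise ValueError("LLM_EMBEDDING_COMPATIBLE_API_KEY is required")
    base_url = env.get("LLM_EMBEDDING_COMPATIBLE_BASE_URL")
    if not base_url:
        raise ValueError("LLM_EMBEDDING_COMPATIBLE_BASE_URL is required")
    # Same fallback model name as in the proposed branch.
    model = env.get("LLM_EMBEDDING_MODEL") or "text-embedding-3-small"
    return EmbeddingEndpoint(base_url=base_url, api_key=api_key, model=model)
```

Because the function never reads LLM_OPENAI_COMPATIBLE_BASE_URL, the inference endpoint can point anywhere without affecting embeddings.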
Example .env usage:
LLM_EMBEDDING_PROVIDER=custom
LLM_EMBEDDING_COMPATIBLE_BASE_URL=http://my-embedding-server:8080/v1
LLM_EMBEDDING_COMPATIBLE_API_KEY=my-key
LLM_EMBEDDING_MODEL=my-embed-model
VECTOR_STORE_DIMENSIONS=1024
This allows full decoupling of embedding and inference providers, which is especially useful for self-hosted deployments.