Bug: VECTOR_STORE_DIMENSIONS not respected in models + Feature: separate custom embedding provider

## Bug: `VECTOR_STORE_DIMENSIONS` setting has no effect

The config option `VECTOR_STORE_DIMENSIONS` (default: 1536) is defined in `src/config.py` but never read in `src/models.py`. Both `MessageEmbedding.embedding` and `Document.embedding` are hardcoded to `Vector(1536)`, so changing the setting affects neither the SQLAlchemy model nor the database column created from it.

**Affected files:**
- `src/models.py` line 281: `mapped_column(Vector(1536), nullable=True)`
- `src/models.py` line 389: `mapped_column(Vector(1536), nullable=True)`

**Suggested fix:**

Add `from src.config import settings` to `src/models.py` and replace the hardcoded value in both column definitions:

```python
# Before
embedding: MappedColumn[Any] = mapped_column(Vector(1536), nullable=True)

# After
embedding: MappedColumn[Any] = mapped_column(Vector(settings.VECTOR_STORE.DIMENSIONS), nullable=True)
```
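To illustrate the intended behavior without the project's dependencies, here is a minimal stdlib-only sketch. `VectorStoreSettings` and `embedding_column_type` are hypothetical stand-ins, not the actual classes in `src/config.py` or `src/models.py`; the only things taken from the issue are the environment variable name and the default of 1536.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class VectorStoreSettings:
    """Hypothetical stand-in for the VECTOR_STORE settings group."""

    DIMENSIONS: int = 1536

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "VectorStoreSettings":
        # Read from the given mapping, or from the process environment.
        env = os.environ if env is None else env
        raw = env.get("VECTOR_STORE_DIMENSIONS")
        if raw is None:
            return cls()
        dims = int(raw)
        if dims <= 0:
            raise ValueError("VECTOR_STORE_DIMENSIONS must be a positive integer")
        return cls(DIMENSIONS=dims)


def embedding_column_type(settings: VectorStoreSettings) -> str:
    """Render the column type a model would declare for the embedding."""
    return f"vector({settings.DIMENSIONS})"


if __name__ == "__main__":
    print(embedding_column_type(VectorStoreSettings.from_env({})))
    print(embedding_column_type(VectorStoreSettings.from_env({"VECTOR_STORE_DIMENSIONS": "1024"})))
```

Note that this only covers the model definition. A database whose columns were already created as `vector(1536)` would presumably still need a migration before a different dimension can be stored.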

---

## Feature Request: Separate custom embedding provider endpoint

Currently, `LLM_EMBEDDING_PROVIDER=openrouter` reuses `LLM_OPENAI_COMPATIBLE_BASE_URL` for both LLM inference and embeddings. This makes it impossible to point embeddings at a self-hosted service while LLM inference uses a different endpoint.

**Use case:** Running a local embedding model (e.g. served by llama.cpp or Ollama) on a separate server while using a different endpoint for LLM inference.

**Suggested addition to `LLMSettings` in `src/config.py`:**

```python
EMBEDDING_PROVIDER: Literal["openai", "gemini", "openrouter", "custom"] = "openai"
EMBEDDING_COMPATIBLE_BASE_URL: str | None = None
EMBEDDING_COMPATIBLE_API_KEY: str | None = None
EMBEDDING_MODEL: str | None = None
```

**And a new branch in `src/embedding_client.py`:**

```python
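elif self.provider == "custom":
    # Fall back to the dedicated embedding key when none is passed in.
    if api_key is None:
        api_key = settings.LLM.EMBEDDING_COMPATIBLE_API_KEY
    if not api_key:
        raise ValueError("LLM_EMBEDDING_COMPATIBLE_API_KEY is required")
    # The custom provider has no default endpoint, so the URL is mandatory.
    base_url = settings.LLM.EMBEDDING_COMPATIBLE_BASE_URL
    if not base_url:
        raise ValueError("LLM_EMBEDDING_COMPATIBLE_BASE_URL is required")
    # Any OpenAI-compatible server can be reached through the OpenAI client.
    self.client = AsyncOpenAI(api_key=api_key, base_url=base_url)
    self.model = settings.LLM.EMBEDDING_MODEL or "text-embedding-3-small"
    self.max_embedding_tokens = settings.MAX_EMBEDDING_TOKENS
    # Upper bound on inputs per request; self-hosted servers may need less.
    self.max_batch_size = 2048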
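The validation in that branch can also be exercised in isolation. The sketch below is a stdlib-only approximation: `EmbeddingSettings`, `ResolvedEndpoint`, and `resolve_custom_endpoint` are hypothetical names invented for illustration, and only the field names, error messages, and fallback model mirror the proposal above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EmbeddingSettings:
    """Hypothetical subset of LLMSettings with the proposed fields."""

    EMBEDDING_COMPATIBLE_BASE_URL: Optional[str] = None
    EMBEDDING_COMPATIBLE_API_KEY: Optional[str] = None
    EMBEDDING_MODEL: Optional[str] = None


@dataclass(frozen=True)
class ResolvedEndpoint:
    """Values that would be handed to the OpenAI-compatible client."""

    base_url: str
    api_key: str
    model: str


def resolve_custom_endpoint(
    cfg: EmbeddingSettings, api_key: Optional[str] = None
) -> ResolvedEndpoint:
    # An explicitly passed key wins; otherwise use the configured one.
    if api_key is None:
        api_key = cfg.EMBEDDING_COMPATIBLE_API_KEY
    if not api_key:
        raise ValueError("LLM_EMBEDDING_COMPATIBLE_API_KEY is required")
    if not cfg.EMBEDDING_COMPATIBLE_BASE_URL:
        raise ValueError("LLM_EMBEDDING_COMPATIBLE_BASE_URL is required")
    return ResolvedEndpoint(
        base_url=cfg.EMBEDDING_COMPATIBLE_BASE_URL,
        api_key=api_key,
        # Same fallback model as in the proposed branch.
        model=cfg.EMBEDDING_MODEL or "text-embedding-3-small",
    )
```

Keeping the resolution in a pure function like this is one possible design; it would let the missing-URL and missing-key cases be unit-tested without constructing a network client.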

**Example `.env` usage:**

```env
LLM_EMBEDDING_PROVIDER=custom
LLM_EMBEDDING_COMPATIBLE_BASE_URL=http://my-embedding-server:8080/v1
LLM_EMBEDDING_COMPATIBLE_API_KEY=my-key
LLM_EMBEDDING_MODEL=my-embed-model
VECTOR_STORE_DIMENSIONS=1024
```

This allows full decoupling of embedding and inference providers, which is especially useful for self-hosted deployments.
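Since the example `.env` pairs a custom model with `VECTOR_STORE_DIMENSIONS=1024`, the two requests in this issue interact: the model's output size has to match the configured dimension. One way to surface a mismatch early is a small guard like the following. `ensure_dimensions` is a hypothetical helper and not part of the codebase.

```python
from typing import Sequence


def ensure_dimensions(vector: Sequence[float], expected: int) -> Sequence[float]:
    """Return the vector unchanged, or raise if its length is unexpected."""
    if len(vector) != expected:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, "
            f"but VECTOR_STORE_DIMENSIONS is {expected}"
        )
    return vector
```

Calling such a check on the first embedding returned by the provider would report a misconfiguration with a clear message instead of leaving it to fail at insert time.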

Bug: VECTOR_STORE_DIMENSIONS not respected in models + Feature: separate custom embedding provider #564
