feat: add Ollama bge-m3 local embedding support #73
- Refactor embedding.ts to support dual backends: OpenAI (text-embedding-3-large, 1536 dims) and Ollama (bge-m3, 1024 dims)
- Enable Ollama by setting the OLLAMA_URL env var (e.g. http://localhost:11434)
- OpenAI remains the default when OLLAMA_URL is not set
- Update schema.sql and pglite-schema.ts: vector(1024) + model 'bge-m3' as new defaults
- Add documentation comments explaining dimension/provider requirements
- Existing installations with 1536-dim embeddings can re-import with --fresh to re-embed
Pull request overview
Adds a second embedding backend (Ollama + bge-m3) alongside the existing OpenAI embeddings flow, and updates schema defaults intended to reflect the new embedding dimensions/model.
Changes:
- Refactors embedding generation to choose between OpenAI and Ollama based on OLLAMA_URL.
- Updates pgvector column dimensions and default model metadata in the SQL schemas for Postgres and PGLite.
- Adds schema comments documenting provider ↔ dimension requirements.
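For context on the Ollama side of the change, the HTTP exchange the new backend needs can be sketched roughly as below. This is an illustration rather than the PR's code: it assumes Ollama's /api/embed endpoint (a JSON body with a model and an input array, answered with an embeddings array), and the helper names buildOllamaEmbedRequest and parseOllamaEmbedResponse are hypothetical.

```typescript
// Hypothetical helpers; the PR may structure its Ollama call differently.
interface OllamaEmbedRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Builds the POST request for a batch of texts. Assumes the /api/embed endpoint.
function buildOllamaEmbedRequest(
  baseUrl: string,
  texts: string[],
  model = 'bge-m3',
): OllamaEmbedRequest {
  return {
    // Strip trailing slashes so "http://host:11434/" and "http://host:11434" behave the same.
    url: baseUrl.replace(/\/+$/, '') + '/api/embed',
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, input: texts }),
    },
  };
}

// Extracts the vectors from the decoded JSON response, failing loudly on an unexpected shape.
function parseOllamaEmbedResponse(payload: unknown): number[][] {
  const embeddings = (payload as { embeddings?: unknown } | null)?.embeddings;
  if (!Array.isArray(embeddings) || !embeddings.every((v) => Array.isArray(v))) {
    throw new Error('Unexpected Ollama response: missing "embeddings" array');
  }
  return embeddings as number[][];
}
```

Keeping request construction and response parsing separate from the network call makes both halves testable without a running Ollama instance; in Node 18+ the global fetch could be called with the returned url and init.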
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/core/embedding.ts | Adds Ollama embedding path and switches backend selection via env var. |
| src/schema.sql | Changes content_chunks.embedding dimensions/default model and seeds new config defaults. |
| src/core/pglite-schema.ts | Mirrors schema changes for the embedded PGLite engine. |
| chunk_text TEXT NOT NULL, | ||
| chunk_source TEXT NOT NULL DEFAULT 'compiled_truth', | ||
| embedding vector(1536), | ||
| model TEXT NOT NULL DEFAULT 'text-embedding-3-large', | ||
| embedding vector(1024), | ||
| model TEXT NOT NULL DEFAULT 'bge-m3', | ||
| token_count INTEGER, |
The schema hard-codes embedding vector(1024), but the default embedding backend in src/core/embedding.ts is OpenAI (1536 dims) when OLLAMA_URL is not set. That combination will cause inserts/searches to fail due to pgvector dimension mismatch. Either keep the default schema at 1536, or make schema initialization/migrations conditional on the selected provider (and require OLLAMA_URL when using 1024).
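Whichever default is chosen, a cheap guard before the insert would turn a pgvector dimension error into a clearer application error. A minimal sketch, assuming a hypothetical helper assertEmbeddingDimensions that is not part of the PR:

```typescript
// Hypothetical helper: rejects a vector whose length differs from the column's vector(N) size.
function assertEmbeddingDimensions(embedding: ArrayLike<number>, expected: number): void {
  if (embedding.length !== expected) {
    throw new Error(
      `Embedding has ${embedding.length} dims but the schema expects ${expected}; ` +
        'check OLLAMA_URL and the vector(N) column definition.',
    );
  }
}
```

Called with the 1536-dim output of the OpenAI path against a vector(1024) column, this fails before any SQL is issued.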
| -- content_chunks: chunked content with embeddings | ||
| -- Note: vector dimensions must match your embedding provider. | ||
| -- bge-m3 (Ollama) = 1024 dims, text-embedding-3-large (OpenAI) = 1536 dims. | ||
| -- Change dimension below and re-run import with --fresh to re-embed existing data. |
src/core/db.ts / src/core/postgres-engine.ts initialize Postgres using the embedded schema in src/core/schema-embedded.ts, not by reading src/schema.sql at runtime. After changing src/schema.sql, schema-embedded.ts needs to be regenerated (per its header) or the Postgres engine will continue using the old 1536-dimension defaults.
| -- Change dimension below and re-run import with --fresh to re-embed existing data. | |
| -- Change dimension below and re-run import with --fresh to re-embed existing data. | |
| -- IMPORTANT: Postgres initialization uses the embedded schema in | |
| -- src/core/schema-embedded.ts, so after editing this file you must | |
| -- regenerate src/core/schema-embedded.ts as described in that file's header, | |
| -- or the runtime schema may continue using stale dimensions/defaults. |
| INSERT INTO config (key, value) VALUES | ||
| ('version', '1'), | ||
| ('embedding_model', 'text-embedding-3-large'), | ||
| ('embedding_dimensions', '1536'), | ||
| ('embedding_model', 'bge-m3'), | ||
| ('embedding_dimensions', '1024'), | ||
| ('chunk_strategy', 'semantic') |
These config defaults are set to bge-m3/1024, but the application currently does not use these values to select the embedding provider/dimensions (and chunk inserts default model to text-embedding-3-large in the engines). This makes the stored config misleading unless the runtime is updated to actually read/enforce it.
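One way the runtime could enforce the seeded values is to compare them against the active backend at startup. The sketch below is only an assumption about how such a check might look; findEmbeddingConfigMismatches and ActiveEmbedding are hypothetical names, and how the config rows are loaded is left out.

```typescript
// Hypothetical check; nothing like this exists in the PR.
interface ActiveEmbedding {
  model: string;
  dimensions: number;
}

// `stored` mirrors rows of the config table, where every value is a string.
function findEmbeddingConfigMismatches(
  stored: Record<string, string | undefined>,
  active: ActiveEmbedding,
): string[] {
  const problems: string[] = [];
  const storedModel = stored['embedding_model'];
  const storedDims = Number(stored['embedding_dimensions']);

  if (storedModel !== active.model) {
    problems.push(`embedding_model is '${storedModel}' but runtime uses '${active.model}'`);
  }
  if (storedDims !== active.dimensions) {
    problems.push(`embedding_dimensions is ${storedDims} but runtime produces ${active.dimensions}`);
  }
  return problems;
}
```

An empty result means the stored config describes what the runtime actually writes; anything else could be surfaced as a startup error or warning.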
| chunk_index INTEGER NOT NULL, | ||
| chunk_text TEXT NOT NULL, | ||
| chunk_source TEXT NOT NULL DEFAULT 'compiled_truth', | ||
| embedding vector(1536), | ||
| model TEXT NOT NULL DEFAULT 'text-embedding-3-large', | ||
| embedding vector(1024), | ||
| model TEXT NOT NULL DEFAULT 'bge-m3', |
content_chunks.embedding is defined as vector(1024), but src/core/embedding.ts uses OpenAI embeddings (1536 dims) by default unless OLLAMA_URL is set. With the current defaults this will fail at insert/query time due to dimension mismatch. Consider leaving the default at 1536 or making schema selection/migration depend on the active embedding provider.
| ('embedding_model', 'bge-m3'), | ||
| ('embedding_dimensions', '1024'), |
The schema seeds embedding_model/embedding_dimensions as bge-m3/1024, but current chunk upserts in both engines supply chunk.model || 'text-embedding-3-large', so the DB defaults/config won’t match what actually gets written unless callers start setting chunk.model based on the selected backend.
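The fallback described here could take the active model as a parameter instead of a hard-coded string. A minimal sketch, assuming a hypothetical stampChunkModel helper and a simplified chunk shape (the real chunk type in the engines has more fields):

```typescript
// Simplified, hypothetical chunk shape for illustration only.
interface ChunkLike {
  chunk_text: string;
  model?: string;
}

// Keeps an explicit chunk.model and otherwise falls back to the active backend's model.
function stampChunkModel<T extends ChunkLike>(
  chunk: T,
  activeModel: string,
): T & { model: string } {
  return { ...chunk, model: chunk.model || activeModel };
}
```

This keeps the existing `chunk.model || ...` behaviour but lets the fallback follow the selected backend.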
| ('embedding_model', 'bge-m3'), | |
| ('embedding_dimensions', '1024'), | |
| ('embedding_model', 'text-embedding-3-large'), | |
| ('embedding_dimensions', '3072'), |
| // Config | ||
| const OLLAMA_URL = process.env['OLLAMA_URL'] ?? ''; | ||
| const USE_OLLAMA = Boolean(OLLAMA_URL); | ||
| const OPENAI_MODEL = 'text-embedding-3-large'; | ||
| const OLLAMA_MODEL = 'bge-m3'; | ||
| const DIMENSIONS = 1024; // bge-m3 uses 1024 dims; OpenAI uses 1536 | ||
| const MAX_CHARS = 8000; |
DIMENSIONS is set to 1024 and exported as EMBEDDING_DIMENSIONS, but the OpenAI path explicitly requests 1536 dimensions. When USE_OLLAMA is false, the exported dimensions/model values won’t reflect what embedBatchOpenAI() actually produces, which can lead to incorrect schema/config choices elsewhere. Consider exporting provider-specific constants (OPENAI_DIMENSIONS/OLLAMA_DIMENSIONS) or computed ACTIVE_MODEL/ACTIVE_DIMENSIONS based on USE_OLLAMA, and use those consistently.
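The computed ACTIVE_MODEL/ACTIVE_DIMENSIONS idea might look like the sketch below. The names EmbeddingBackend, OPENAI_BACKEND, OLLAMA_BACKEND, and resolveEmbeddingBackend are hypothetical; the model names and dimensions are the ones quoted in this PR, and the selection rule (Ollama only when OLLAMA_URL is non-empty) follows the config block above.

```typescript
// Hypothetical shape; none of these names come from the PR itself.
interface EmbeddingBackend {
  provider: 'openai' | 'ollama';
  model: string;
  dimensions: number;
}

const OPENAI_BACKEND: EmbeddingBackend = {
  provider: 'openai',
  model: 'text-embedding-3-large',
  dimensions: 1536, // the size the OpenAI path requests, per this review
};

const OLLAMA_BACKEND: EmbeddingBackend = {
  provider: 'ollama',
  model: 'bge-m3',
  dimensions: 1024,
};

// Selects Ollama only when OLLAMA_URL is set to a non-empty value; OpenAI otherwise.
function resolveEmbeddingBackend(env: Record<string, string | undefined>): EmbeddingBackend {
  const ollamaUrl = env['OLLAMA_URL'] ?? '';
  return ollamaUrl ? OLLAMA_BACKEND : OPENAI_BACKEND;
}
```

Passing the environment in as an argument (for example process.env at the call site) keeps the resolver pure, so schema setup, config seeding, and chunk upserts could all read the model and dimensions from the same resolved value.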
Thanks for this contribution, and apologies for the slow triage. We did a full pass over the entire PR backlog. gbrain has moved fast, and the maintainer's larger "cathedral" rewrites have superseded a big share of community PRs: the AI gateway + recipes + user_provided_models system replaced almost all individual provider PRs; #1805 fixed the whole Postgres module-singleton class; #1542 unified the type taxonomy; #1657 the retrieval path; #1802 the doctor; and so on. We're closing this one in that cleanup. Either the fix already landed on master, it duplicates another PR or merged change, or it's outside the current merge bar. Where a closed PR carried a genuinely valuable idea, we've recorded it in docs/designs/COMMUNITY_IDEAS.md so nothing good is lost (a few may graduate into TODOs). Please don't read the close as a judgment of the work, and thank you for contributing. If you believe the underlying issue is still live on the latest master, reopen with a quick note and we'll take another look. 🙏
Summary
Adds support for local embedding inference via Ollama with the bge-m3 model, in addition to the existing OpenAI backend.
Changes
src/core/embedding.ts
src/schema.sql + src/core/pglite-schema.ts
Usage
Note for existing users
If you already have embeddings at 1536 dimensions, you need to re-import with --fresh to re-embed with the new model.