
feat: add Ollama bge-m3 local embedding support #73

Closed
bravohenry wants to merge 1 commit into garrytan:master from bravohenry:ollama-embedding

Conversation

@bravohenry

Summary

Adds support for local embedding inference via Ollama with the bge-m3 model, in addition to the existing OpenAI backend.

Changes

src/core/embedding.ts

  • Refactor embedding backend into two paths: OpenAI (text-embedding-3-large, 1536 dims) and Ollama (bge-m3, 1024 dims)
  • Enable Ollama by setting the OLLAMA_URL env var (e.g. http://localhost:11434)
  • OpenAI remains the default when OLLAMA_URL is not set
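
The PR's Ollama code path is not shown on this page, so the following is only a sketch of how such a path might build its HTTP call. It assumes Ollama's /api/embed route, which takes { model, input } and returns { embeddings }; the PR itself may use a different route. The helper names buildOllamaEmbedRequest and parseOllamaEmbedResponse are hypothetical.

```typescript
// Hypothetical helpers, not taken from the PR. Assumes Ollama's /api/embed
// route: request { model, input: string[] }, response { embeddings: number[][] }.
interface EmbedRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

function buildOllamaEmbedRequest(
  baseUrl: string,
  texts: string[],
  model = 'bge-m3',
): EmbedRequest {
  // Strip trailing slashes so "http://localhost:11434/" and
  // "http://localhost:11434" resolve to the same endpoint.
  const root = baseUrl.replace(/\/+$/, '');
  return {
    url: `${root}/api/embed`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, input: texts }),
    },
  };
}

function parseOllamaEmbedResponse(json: unknown, expectedDims = 1024): number[][] {
  const embeddings = (json as { embeddings?: unknown } | null)?.embeddings;
  if (!Array.isArray(embeddings)) {
    throw new Error('Ollama response has no "embeddings" array');
  }
  for (const vector of embeddings) {
    // Reject vectors whose length differs from the expected dimension count.
    if (!Array.isArray(vector) || vector.length !== expectedDims) {
      throw new Error(`Expected ${expectedDims}-dimensional embeddings`);
    }
  }
  return embeddings as number[][];
}
```

The returned url and init can be passed straight to fetch(url, init), and the parsed JSON body to parseOllamaEmbedResponse. Keeping the network call out of the helpers makes them testable without a running Ollama server.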

src/schema.sql + src/core/pglite-schema.ts

  • Change defaults to vector(1024) + model 'bge-m3' (was vector(1536) + 'text-embedding-3-large')
  • Add comments documenting provider/dimension requirements

Usage

Note for existing users

If you already have embeddings at 1536 dimensions, you need to re-import with --fresh to re-embed with the new model.

- Refactor embedding.ts to support dual backends: OpenAI (text-embedding-3-large, 1536 dims) and Ollama (bge-m3, 1024 dims)
- Enable Ollama by setting OLLAMA_URL env var (e.g. http://localhost:11434)
- OpenAI remains the default when OLLAMA_URL is not set
- Update schema.sql and pglite-schema.ts: vector(1024) + model 'bge-m3' as new defaults
- Add documentation comments explaining dimension/provider requirements
- Existing installations with 1536-dim embeddings can re-import with --fresh to re-embed
Copilot AI review requested due to automatic review settings April 12, 2026 17:08

Copilot AI left a comment

Pull request overview

Adds a second embedding backend (Ollama + bge-m3) alongside the existing OpenAI embeddings flow, and updates schema defaults intended to reflect the new embedding dimensions/model.

Changes:

  • Refactors embedding generation to choose between OpenAI and Ollama based on OLLAMA_URL.
  • Updates pgvector column dimensions and default model metadata in the SQL schemas for Postgres and PGLite.
  • Adds schema comments documenting provider ↔ dimension requirements.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File | Description
src/core/embedding.ts | Adds Ollama embedding path and switches backend selection via env var.
src/schema.sql | Changes content_chunks.embedding dimensions/default model and seeds new config defaults.
src/core/pglite-schema.ts | Mirrors schema changes for the embedded PGLite engine.

Comment thread src/schema.sql
Comment on lines 36 to 40
  chunk_text TEXT NOT NULL,
  chunk_source TEXT NOT NULL DEFAULT 'compiled_truth',
- embedding vector(1536),
- model TEXT NOT NULL DEFAULT 'text-embedding-3-large',
+ embedding vector(1024),
+ model TEXT NOT NULL DEFAULT 'bge-m3',
  token_count INTEGER,

Copilot AI Apr 12, 2026

The schema hard-codes embedding vector(1024), but the default embedding backend in src/core/embedding.ts is OpenAI (1536 dims) when OLLAMA_URL is not set. That combination will cause inserts/searches to fail due to pgvector dimension mismatch. Either keep the default schema at 1536, or make schema initialization/migrations conditional on the selected provider (and require OLLAMA_URL when using 1024).
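
To illustrate the mismatch described above, here is a minimal sketch of a guard that could run before an embedding is written to a vector(N) column. The function names are hypothetical and do not appear in the PR. The only outside fact relied on is that pgvector accepts a bracketed, comma-separated text literal.

```typescript
// Hypothetical guard; names are illustrative and not from the PR.
function assertEmbeddingDimensions(
  embedding: readonly number[],
  columnDims: number,
): void {
  if (embedding.length !== columnDims) {
    throw new Error(
      `Embedding has ${embedding.length} dimensions but the column is vector(${columnDims})`,
    );
  }
}

// pgvector accepts a bracketed, comma-separated text literal such as '[0.1,0.2]'.
function toVectorLiteral(embedding: readonly number[], columnDims: number): string {
  assertEmbeddingDimensions(embedding, columnDims);
  return `[${embedding.join(',')}]`;
}
```

With a vector(1024) column and the OpenAI default (1536 dims), this guard would throw in application code with a readable message instead of surfacing as a database error at insert time.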

Comment thread src/schema.sql
-- content_chunks: chunked content with embeddings
-- Note: vector dimensions must match your embedding provider.
-- bge-m3 (Ollama) = 1024 dims, text-embedding-3-large (OpenAI) = 1536 dims.
-- Change dimension below and re-run import with --fresh to re-embed existing data.

Copilot AI Apr 12, 2026

src/core/db.ts / src/core/postgres-engine.ts initialize Postgres using the embedded schema in src/core/schema-embedded.ts, not by reading src/schema.sql at runtime. After changing src/schema.sql, schema-embedded.ts needs to be regenerated (per its header) or the Postgres engine will continue using the old 1536-dimension defaults.

Suggested change
- -- Change dimension below and re-run import with --fresh to re-embed existing data.
+ -- Change dimension below and re-run import with --fresh to re-embed existing data.
+ -- IMPORTANT: Postgres initialization uses the embedded schema in
+ -- src/core/schema-embedded.ts, so after editing this file you must
+ -- regenerate src/core/schema-embedded.ts as described in that file's header,
+ -- or the runtime schema may continue using stale dimensions/defaults.

Comment thread src/schema.sql
Comment on lines 141 to 145
  INSERT INTO config (key, value) VALUES
  ('version', '1'),
- ('embedding_model', 'text-embedding-3-large'),
- ('embedding_dimensions', '1536'),
+ ('embedding_model', 'bge-m3'),
+ ('embedding_dimensions', '1024'),
  ('chunk_strategy', 'semantic')

Copilot AI Apr 12, 2026

These config defaults are set to bge-m3/1024, but the application currently does not use these values to select the embedding provider/dimensions (and chunk inserts default model to text-embedding-3-large in the engines). This makes the stored config misleading unless the runtime is updated to actually read/enforce it.
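
One way the runtime could "read/enforce" the seeded values is a startup check that compares the config rows with the active backend. The sketch below assumes the rows have already been loaded into a Map keyed by the config table's key column. The function name and the ActiveEmbedding type are hypothetical.

```typescript
// Hypothetical startup check; not part of the PR.
interface ActiveEmbedding {
  model: string;
  dimensions: number;
}

function findConfigMismatches(
  stored: ReadonlyMap<string, string>,
  active: ActiveEmbedding,
): string[] {
  const problems: string[] = [];
  const storedModel = stored.get('embedding_model');
  const storedDims = stored.get('embedding_dimensions');
  if (storedModel !== active.model) {
    problems.push(`embedding_model is ${storedModel ?? 'unset'}, runtime uses ${active.model}`);
  }
  // Config values are stored as text, so convert before comparing.
  // A missing row converts to NaN and is reported as a mismatch.
  if (Number(storedDims) !== active.dimensions) {
    problems.push(
      `embedding_dimensions is ${storedDims ?? 'unset'}, runtime uses ${active.dimensions}`,
    );
  }
  return problems;
}
```

An empty result means the stored config agrees with the runtime. A non-empty one could be logged or turned into a hard failure, depending on how strict the project wants to be.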

Comment thread src/core/pglite-schema.ts
Comment on lines 50 to +54
  chunk_index INTEGER NOT NULL,
  chunk_text TEXT NOT NULL,
  chunk_source TEXT NOT NULL DEFAULT 'compiled_truth',
- embedding vector(1536),
- model TEXT NOT NULL DEFAULT 'text-embedding-3-large',
+ embedding vector(1024),
+ model TEXT NOT NULL DEFAULT 'bge-m3',

Copilot AI Apr 12, 2026

content_chunks.embedding is defined as vector(1024), but src/core/embedding.ts uses OpenAI embeddings (1536 dims) by default unless OLLAMA_URL is set. With the current defaults this will fail at insert/query time due to dimension mismatch. Consider leaving the default at 1536 or making schema selection/migration depend on the active embedding provider.

Comment thread src/core/pglite-schema.ts
Comment on lines +159 to +160
+ ('embedding_model', 'bge-m3'),
+ ('embedding_dimensions', '1024'),

Copilot AI Apr 12, 2026

The schema seeds embedding_model/embedding_dimensions as bge-m3/1024, but current chunk upserts in both engines supply chunk.model || 'text-embedding-3-large', so the DB defaults/config won’t match what actually gets written unless callers start setting chunk.model based on the selected backend.

Suggested change
- ('embedding_model', 'bge-m3'),
- ('embedding_dimensions', '1024'),
+ ('embedding_model', 'text-embedding-3-large'),
+ ('embedding_dimensions', '3072'),
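
As an alternative to changing the seeded defaults, callers could set chunk.model from the selected backend, as the comment above mentions. A minimal sketch follows, assuming a simplified chunk shape. ChunkInput and withActiveModel are hypothetical names, and the real chunk type in the engines has more fields.

```typescript
// Hypothetical, simplified chunk shape; the real type has more fields.
interface ChunkInput {
  chunk_text: string;
  model?: string;
}

// Fill in the model from the active backend instead of a hard-coded
// 'text-embedding-3-large' fallback at upsert time.
function withActiveModel(
  chunks: readonly ChunkInput[],
  activeModel: string,
): Required<ChunkInput>[] {
  return chunks.map((chunk) => ({
    chunk_text: chunk.chunk_text,
    model: chunk.model || activeModel,
  }));
}
```

Chunks that already carry a model keep it. Only unset (or empty) values pick up the active backend's model name.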

Comment thread src/core/embedding.ts
Comment on lines +17 to 23
// Config
const OLLAMA_URL = process.env['OLLAMA_URL'] ?? '';
const USE_OLLAMA = Boolean(OLLAMA_URL);
const OPENAI_MODEL = 'text-embedding-3-large';
const OLLAMA_MODEL = 'bge-m3';
const DIMENSIONS = 1024; // bge-m3 uses 1024 dims; OpenAI uses 1536
const MAX_CHARS = 8000;

Copilot AI Apr 12, 2026

DIMENSIONS is set to 1024 and exported as EMBEDDING_DIMENSIONS, but the OpenAI path explicitly requests 1536 dimensions. When USE_OLLAMA is false, the exported dimensions/model values won’t reflect what embedBatchOpenAI() actually produces, which can lead to incorrect schema/config choices elsewhere. Consider exporting provider-specific constants (OPENAI_DIMENSIONS/OLLAMA_DIMENSIONS) or computed ACTIVE_MODEL/ACTIVE_DIMENSIONS based on USE_OLLAMA, and use those consistently.
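
The suggestion above could look roughly like the sketch below: provider-specific constants plus a single resolver that derives the active model and dimensions from OLLAMA_URL. The names OPENAI_EMBEDDING, OLLAMA_EMBEDDING and resolveEmbeddingConfig are hypothetical. The model names and dimension counts are the ones quoted in this PR.

```typescript
// Hypothetical constants and resolver; values are those quoted in the PR.
const OPENAI_EMBEDDING = {
  provider: 'openai',
  model: 'text-embedding-3-large',
  dimensions: 1536,
} as const;

const OLLAMA_EMBEDDING = {
  provider: 'ollama',
  model: 'bge-m3',
  dimensions: 1024,
} as const;

type EmbeddingConfig = typeof OPENAI_EMBEDDING | typeof OLLAMA_EMBEDDING;

// Takes the environment as a parameter so the choice is explicit and testable.
function resolveEmbeddingConfig(
  env: Record<string, string | undefined>,
): EmbeddingConfig {
  // An unset or empty OLLAMA_URL keeps OpenAI as the default backend.
  return env['OLLAMA_URL'] ? OLLAMA_EMBEDDING : OPENAI_EMBEDDING;
}
```

In a Node process this would be called once as resolveEmbeddingConfig(process.env), and the result's model and dimensions exported in place of the fixed DIMENSIONS constant, so schema, config and chunk writes all read the same values.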

@garrytan

garrytan commented Jun 8, 2026

Owner

Thanks for this contribution — and apologies for the slow triage. We did a full pass over the entire PR backlog. gbrain has moved fast, and the maintainer's larger "cathedral" rewrites have superseded a big share of community PRs: the AI gateway + recipes + user_provided_models system replaced almost all individual provider PRs; #1805 fixed the whole Postgres module-singleton class; #1542 unified the type taxonomy; #1657 the retrieval path; #1802 the doctor; and so on.

We're closing this one in that cleanup — either the fix already landed on master, it duplicates another PR or merged change, or it's outside the current merge bar. Where a closed PR carried a genuinely valuable idea, we've recorded it in docs/designs/COMMUNITY_IDEAS.md so nothing good is lost (a few may graduate into TODOs).

Please don't read the close as a judgment of the work — thank you for contributing. If you believe the underlying issue is still live on the latest master, reopen with a quick note and we'll take another look. 🙏

@garrytan garrytan closed this Jun 8, 2026

3 participants