feat: Add optional Ollama support for embeddings by niallobrien · Pull Request #100 · garrytan/gbrain

niallobrien · 2026-04-13T14:12:35Z

I wanted Gbrain to optionally allow for local embeddings via Ollama. Work completed by Codex 5.4 and verified working.

Summary

- use provider-native embedding dimensions by default, including OpenAI native dimensions unless an override is configured
- select the correct storage and index type for the active embedding size and reconcile schema automatically when provider, model, or dimensions change
- clear stale embeddings and require re-embedding after dimension-changing migrations instead of allowing mixed or invalid vectors
- surface embedding provider, model, dimension, and reset status in doctor and health output
- resolve embedding metadata in one place, including provider, model, effective dimensions, and whether dimensions were explicitly overridden
- update docs and tests for Ollama nomic-embed-text, OpenAI defaults, and migration flows
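
To make the "resolve embedding metadata in one place" item concrete, here is a minimal sketch of what such a resolver might look like. The type and function names (`EmbeddingMetadata`, `resolveEmbeddingMetadata`, `requiresReembedding`) are hypothetical and are not taken from the PR's code; only the fields mirror the list above.

```typescript
// Hypothetical shapes for illustration; the PR's real types may differ.
type Provider = 'openai' | 'ollama';

interface EmbeddingMetadata {
  provider: Provider;
  model: string;
  dimensions: number;            // effective dimensions actually stored
  dimensionsOverridden: boolean; // true when the user configured a size
}

// Resolve everything in one place: an explicit override wins,
// otherwise the provider-native size is used.
function resolveEmbeddingMetadata(
  provider: Provider,
  model: string,
  nativeDimensions: number,
  overrideDimensions?: number,
): EmbeddingMetadata {
  return {
    provider,
    model,
    dimensions: overrideDimensions ?? nativeDimensions,
    dimensionsOverridden: overrideDimensions !== undefined,
  };
}

// Vectors of different sizes cannot share one column, so a change in
// effective dimensions means stored embeddings must be cleared and rebuilt.
function requiresReembedding(
  previous: EmbeddingMetadata,
  next: EmbeddingMetadata,
): boolean {
  return previous.dimensions !== next.dimensions;
}
```

Keeping the override flag next to the effective size lets health output report both what is stored and why.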

Testing

I'm running Hermes and Gbrain within Docker, so all testing was conducted within this context.

docker compose exec -T hermes sh -lc 'cd /data/repos/gbrain && bun test test/embedding-ollama.test.ts test/pglite-engine.test.ts test/doctor.test.ts'
docker compose exec -T hermes sh -lc 'cd /data/repos/gbrain && HOME=/data/hermes/gbrain-home bun run src/cli.ts doctor --json'

Notes

live Hermes-mounted GBrain was migrated to schema v6 and reports ollama/nomic-embed-text at 768d

Add EmbeddingProvider interface with two implementations:
- OpenAIEmbeddingProvider: text-embedding-3-large, 1536d (default, unchanged)
- OllamaEmbeddingProvider: local embeddings via /api/embed (default: nomic-embed-text, 768d)

Provider selection via:
1. GBRAIN_EMBEDDING_PROVIDER env var ('openai' | 'ollama')
2. embedding_provider in ~/.gbrain/config.json
3. Auto-detect: OpenAI if API key set, else Ollama

New init flags: --ollama, --openai, --embedding-model <name>

Backward-compatible: embed() / embedBatch() still work as before.
Tracks model name per chunk in DB for provenance.

9 files changed, +827/-86 lines
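
The provider selection order in the commit message can be sketched as a small pure function. This is an illustration only: `selectProvider` and `GbrainConfig` are hypothetical names, and `OPENAI_API_KEY` is an assumed variable name for the "API key set" check, since the commit message does not name it.

```typescript
type Provider = 'openai' | 'ollama';

// Hypothetical view of ~/.gbrain/config.json; only the relevant key is shown.
interface GbrainConfig {
  embedding_provider?: string;
}

function isProvider(value: string | undefined): value is Provider {
  return value === 'openai' || value === 'ollama';
}

// Precedence: env var, then config file, then auto-detect.
// Unrecognised values are skipped rather than treated as errors.
function selectProvider(
  env: Record<string, string | undefined>,
  config: GbrainConfig,
): Provider {
  const fromEnv = env.GBRAIN_EMBEDDING_PROVIDER;
  if (isProvider(fromEnv)) return fromEnv;

  const fromConfig = config.embedding_provider;
  if (isProvider(fromConfig)) return fromConfig;

  // Assumption: the OpenAI key lives in OPENAI_API_KEY.
  return env.OPENAI_API_KEY ? 'openai' : 'ollama';
}
```

Passing `env` in as an argument (for example `process.env`) keeps the function testable without touching real environment state.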

SanthoshMReddy · 2026-04-14T07:45:59Z

Trialed this PR end-to-end on a 128-page Obsidian-backed vault (macOS, PGLite engine, local Ollama). Confirming the Ollama integration works — gbrain init --ollama --embedding-model nomic-embed-text + gbrain embed --stale cleanly embedded 253 chunks at 768d.

Some observations from real-world use that might be worth folding in:

1. Extend `OLLAMA_MODEL_DIMENSIONS` with a few more common models

const OLLAMA_MODEL_DIMENSIONS: Record<string, number> = {
  'nomic-embed-text': 768,
  'mxbai-embed-large': 1024,
  'snowflake-arctic-embed': 1024,
  'snowflake-arctic-embed2': 1024,   // top open-source English retrieval, 568M
  'bge-m3': 1024,                    // multilingual, 8k context
  'embeddinggemma': 768,             // Google's recent small-model leader
  'all-minilm': 384,
};

When I ran gbrain init --ollama --embedding-model snowflake-arctic-embed2 on a clean brain, the absence of the entry made it fall through to the || 768 default, which then failed on insert since the model actually returns 1024d. Adding it to the table avoids users needing the embedding_dimensions override for these popular models.
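
One way to avoid the silent `|| 768` fallthrough described above is to fail at lookup time instead of at insert time. The sketch below is an assumption about how that could look, not the PR's code; `resolveOllamaDimensions` and `KNOWN_DIMENSIONS` are hypothetical names, and the table is an illustrative subset.

```typescript
// Illustrative subset of the table above; not the PR's actual constant.
const KNOWN_DIMENSIONS: Record<string, number> = {
  'nomic-embed-text': 768,
  'snowflake-arctic-embed2': 1024,
};

function resolveOllamaDimensions(
  model: string,
  override?: number,
  table: Record<string, number> = KNOWN_DIMENSIONS,
): number {
  // An explicit embedding_dimensions override always wins.
  if (override !== undefined) return override;

  // Only the ":latest" tag is stripped. Other tags can select a different
  // model size (and therefore a different dimension), so they are kept.
  const name = model.endsWith(':latest')
    ? model.slice(0, -':latest'.length)
    : model;

  if (Object.prototype.hasOwnProperty.call(table, name)) {
    return table[name];
  }

  // Fail early with an actionable message instead of guessing a size.
  throw new Error(
    `Unknown embedding dimensions for Ollama model "${model}"; ` +
      'set embedding_dimensions explicitly.',
  );
}
```

With this shape, an unlisted model stops `init` with a clear message rather than producing a schema that rejects the first insert.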

2. Dynamic dimension probe via `/api/show`

For anything not in the hardcoded table, POST http://localhost:11434/api/show with {"name":"<model>"} returns embedding_length in the response. Example:

"Model":
  architecture        nomic-bert
  parameters          137M
  embedding length    768
  ...

Could probe this at init time if embedding_dimensions isn't in config, eliminating the hardcoded table entirely. Happy to open a separate PR for this if you're interested.
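
A rough sketch of such a probe follows. It assumes the JSON response carries a `model_info` object whose keys are prefixed by architecture (for example `nomic-bert.embedding_length`); that shape should be verified against the Ollama version in use. The function names and the `FetchLike` type are hypothetical. The request body uses the `name` field as described above; some Ollama versions document this field as `model`.

```typescript
// Assumed response shape; only the part this sketch reads is typed.
interface ShowResponse {
  model_info?: Record<string, unknown>;
}

// Find the first "<architecture>.embedding_length" entry, if any.
function extractEmbeddingLength(show: ShowResponse): number | undefined {
  for (const [key, value] of Object.entries(show.model_info ?? {})) {
    if (key.endsWith('.embedding_length') && typeof value === 'number') {
      return value;
    }
  }
  return undefined;
}

// Minimal fetch signature so the probe does not depend on a global fetch.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function probeEmbeddingLength(
  model: string,
  fetchImpl: FetchLike,
  baseUrl = 'http://localhost:11434',
): Promise<number | undefined> {
  const response = await fetchImpl(`${baseUrl}/api/show`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ name: model }),
  });
  if (!response.ok) {
    throw new Error(`Ollama /api/show returned HTTP ${response.status}`);
  }
  return extractEmbeddingLength((await response.json()) as ShowResponse);
}
```

In Bun or a recent Node, the global `fetch` can be passed as `fetchImpl`. An `undefined` result would be the signal to fall back to the table or to require `embedding_dimensions`.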

3. `halfvec` for models >4000d (qwen3-embedding:8b, etc.)

pgvector's default vector column + HNSW index caps at 4000 dimensions. qwen3-embedding:8b (4096d) is a popular, very strong open-source model that exceeds this. Your codebase already has EmbeddingStorageType = 'vector' | 'halfvec' — exposing a --halfvec init flag (or auto-selecting halfvec when embedding_dimensions > 4000) would unlock this.

Encountered this when trying qwen3-embedding:8b for a vault trial. Settled on snowflake-arctic-embed2 which fits comfortably in the vector+HNSW default.

Not blocking this PR — Ollama integration as designed is already a huge win. Just notes from actually using it. Thanks for the work!

niallobrien · 2026-04-14T08:15:43Z

Awesome stuff, many thanks for the extensive review. I'll address the issues you highlighted later today and update this PR when ready.

niallobrien · 2026-04-14T08:37:06Z

2. Dynamic dimension probe via /api/show

Nice idea. I agree probing POST /api/show for embedding_length is a better long-term fix than expanding the hardcoded table. I’d prefer to land that as a separate PR so this one stays scoped to fixing the immediate mismatch for common models, but I’d definitely be interested in the follow-up.

niallobrien · 2026-04-14T08:40:59Z

3. halfvec for models >4000d (qwen3-embedding:8b, etc.)

I think this is worth a follow-up, but the main missing piece may be dimension detection rather than halfvec support itself. The code already switches to halfvec for larger embeddings; the problem is that Ollama models not in the known-dimensions path can fall back to the wrong size at init. If we probe /api/show and get the real embedding_length, models like qwen3-embedding:8b should be able to use halfvec automatically. This could be handled in the separate PR we discussed above, wdyt?
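
For reference, automatic storage selection from a known dimension could be sketched as below. This is not the code in the PR. `selectStorage` and `IndexLimits` are hypothetical names, and the default limits are an assumption taken from pgvector's README for HNSW indexes (2,000 dimensions for `vector`, 4,000 for `halfvec`); they should be checked against the installed pgvector version and against the figures quoted earlier in this thread.

```typescript
type EmbeddingStorageType = 'vector' | 'halfvec';

// Maximum dimensions an HNSW index accepts per column type.
interface IndexLimits {
  vector: number;
  halfvec: number;
}

// Assumption: values as listed in pgvector's README at the time of writing.
const ASSUMED_HNSW_LIMITS: IndexLimits = { vector: 2000, halfvec: 4000 };

interface StorageChoice {
  storage: EmbeddingStorageType;
  indexable: boolean; // false when no HNSW index fits this size
}

function selectStorage(
  dimensions: number,
  limits: IndexLimits = ASSUMED_HNSW_LIMITS,
): StorageChoice {
  if (dimensions <= limits.vector) {
    return { storage: 'vector', indexable: true };
  }
  if (dimensions <= limits.halfvec) {
    return { storage: 'halfvec', indexable: true };
  }
  // Larger vectors can still be stored as halfvec, but under these
  // assumed limits they would need an unindexed (exact) search.
  return { storage: 'halfvec', indexable: false };
}
```

Taking the limits as a parameter keeps the decision testable and makes the assumed thresholds easy to correct once the real dimension is known from the probe.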

garrytan · 2026-06-08T02:56:51Z

Thanks for this contribution — and apologies for the slow triage. We did a full pass over the entire PR backlog. gbrain has moved fast, and the maintainer's larger "cathedral" rewrites have superseded a big share of community PRs: the AI gateway + recipes + user_provided_models system replaced almost all individual provider PRs; #1805 fixed the whole Postgres module-singleton class; #1542 unified the type taxonomy; #1657 the retrieval path; #1802 the doctor; and so on.

We're closing this one in that cleanup — either the fix already landed on master, it duplicates another PR or merged change, or it's outside the current merge bar. Where a closed PR carried a genuinely valuable idea, we've recorded it in docs/designs/COMMUNITY_IDEAS.md so nothing good is lost (a few may graduate into TODOs).

Please don't read the close as a judgment of the work — thank you for contributing. If you believe the underlying issue is still live on the latest master, reopen with a quick note and we'll take another look. 🙏

niallobrien and others added 2 commits April 13, 2026 10:30

feat: align embedding dimensions with provider defaults

d6aa773

niallobrien changed the title ~~feat: align embedding dimensions with provider defaults~~ feat: Add optional Ollama support for embeddings Apr 13, 2026

Merge origin/master into feat/ollama-embeddings

5b5ca43

niallobrien marked this pull request as ready for review April 13, 2026 14:26

garrytan mentioned this pull request May 10, 2026

v0.32.0 feat: 5 new embedding recipes + discoverability pass (closes 17-PR cluster) #810

Merged

8 tasks

garrytan closed this Jun 8, 2026

niallobrien wants to merge 3 commits into garrytan:master from niallobrien:feat/ollama-embeddings
