feat: Add optional Ollama support for embeddings #100
Add EmbeddingProvider interface with two implementations:
- OpenAIEmbeddingProvider: text-embedding-3-large, 1536d (default, unchanged)
- OllamaEmbeddingProvider: local embeddings via /api/embed (default: nomic-embed-text, 768d)
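A minimal sketch of what such an interface and its Ollama implementation could look like. The interface shape, the `FetchLike` type, and the `parseEmbedResponse` helper are illustrative assumptions, not the PR's actual code; only the `/api/embed` endpoint, the `nomic-embed-text` default, and the 768d size come from the description above.

```typescript
// Hypothetical interface shape; the PR's actual signatures may differ.
interface EmbeddingProvider {
  readonly name: "openai" | "ollama";
  readonly model: string;
  readonly dimensions: number;
  embedBatch(texts: string[]): Promise<number[][]>;
}

// Minimal fetch signature so a fake can be injected in tests (assumption).
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Check that /api/embed returned one vector of the expected size per input.
function parseEmbedResponse(body: unknown, count: number, dim: number): number[][] {
  const embeddings = (body as { embeddings?: unknown } | null)?.embeddings;
  if (!Array.isArray(embeddings) || embeddings.length !== count) {
    throw new Error(`expected ${count} embeddings in response`);
  }
  for (const vector of embeddings) {
    if (!Array.isArray(vector) || vector.length !== dim) {
      throw new Error(`expected ${dim}-dimensional vectors`);
    }
  }
  return embeddings as number[][];
}

class OllamaEmbeddingProvider implements EmbeddingProvider {
  readonly name = "ollama" as const;
  readonly model: string;
  readonly dimensions: number;
  private readonly baseUrl: string;
  private readonly fetchImpl: FetchLike;

  constructor(
    model = "nomic-embed-text",
    dimensions = 768,
    baseUrl = "http://localhost:11434", // Ollama's default listen address
    fetchImpl: FetchLike = (url, init) => fetch(url, init),
  ) {
    this.model = model;
    this.dimensions = dimensions;
    this.baseUrl = baseUrl;
    this.fetchImpl = fetchImpl;
  }

  async embedBatch(texts: string[]): Promise<number[][]> {
    if (texts.length === 0) return [];
    // Ollama's /api/embed accepts a string or an array of strings as `input`.
    const res = await this.fetchImpl(`${this.baseUrl}/api/embed`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: this.model, input: texts }),
    });
    if (!res.ok) throw new Error(`Ollama /api/embed failed with HTTP ${res.status}`);
    return parseEmbedResponse(await res.json(), texts.length, this.dimensions);
  }
}
```

Validating the vector length at the provider boundary is one way to surface a model/dimension mismatch before anything is written to the database.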
Provider selection via:
1. GBRAIN_EMBEDDING_PROVIDER env var ('openai' | 'ollama')
2. embedding_provider in ~/.gbrain/config.json
3. Auto-detect: OpenAI if API key set, else Ollama
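The precedence above can be sketched as a pure function. This is an illustration under assumptions, not the PR's code: `resolveEmbeddingProvider` and `GbrainConfig` are hypothetical names, `OPENAI_API_KEY` is assumed to be the key that auto-detection checks, and rejecting unknown values with an error (rather than falling through) is a design choice made here.

```typescript
type ProviderName = "openai" | "ollama";

// Subset of ~/.gbrain/config.json relevant to provider selection (assumed shape).
interface GbrainConfig {
  embedding_provider?: string;
}

function isProviderName(value: unknown): value is ProviderName {
  return value === "openai" || value === "ollama";
}

function resolveEmbeddingProvider(
  env: Record<string, string | undefined>,
  config: GbrainConfig,
): ProviderName {
  // 1. An explicit environment variable wins.
  const fromEnv = env["GBRAIN_EMBEDDING_PROVIDER"];
  if (fromEnv !== undefined && fromEnv !== "") {
    if (!isProviderName(fromEnv)) {
      throw new Error(`GBRAIN_EMBEDDING_PROVIDER must be 'openai' or 'ollama', got '${fromEnv}'`);
    }
    return fromEnv;
  }
  // 2. Then the config file setting.
  const fromConfig = config.embedding_provider;
  if (fromConfig !== undefined) {
    if (!isProviderName(fromConfig)) {
      throw new Error(`embedding_provider must be 'openai' or 'ollama', got '${fromConfig}'`);
    }
    return fromConfig;
  }
  // 3. Auto-detect: OpenAI if an API key is set, else Ollama.
  return env["OPENAI_API_KEY"] ? "openai" : "ollama";
}
```

Taking `env` and `config` as arguments, rather than reading `process.env` and the file directly, keeps the precedence rules easy to unit test.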
New init flags: --ollama, --openai, --embedding-model <name>
Backward-compatible: embed() / embedBatch() still work as before.
Tracks model name per chunk in DB for provenance.
9 files changed, +827/-86 lines
Trialed this PR end-to-end on a 128-page Obsidian-backed vault (macOS, PGLite engine, local Ollama). Confirming the Ollama integration works — Some observations from real-world use that might be worth folding in: 1. Extend
Awesome stuff, many thanks for the extensive review. I'll address the issues you highlighted later today and update this PR when ready.
Nice idea. I agree probing POST /api/show for embedding_length is a better long-term fix than expanding the hardcoded table. I’d prefer to land that as a separate PR so this one stays scoped to fixing the immediate mismatch for common models, but I’d definitely be interested in the follow-up.
I think this is worth a follow-up, but the main missing piece may be dimension detection rather than halfvec support itself. The code already switches to halfvec for larger embeddings; the problem is that Ollama models not in the known-dimensions path can fall back to the wrong size at init. If we probe /api/show and get the real embedding_length, models like qwen3-embedding:8b should be able to use halfvec automatically. This could be handled in the separate PR we discussed above, wdyt?
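A rough sketch of the probing idea discussed in the two comments above, not code from this PR. It assumes that Ollama's `/api/show` response carries a `model_info` object whose embedding size sits under an architecture-prefixed key such as `nomic-bert.embedding_length`. The function names and the 2000-dimension switch point are hypothetical; the threshold is chosen here because pgvector cannot index `vector` columns above 2,000 dimensions, and gbrain's real threshold may differ.

```typescript
// Subset of the POST /api/show response used here (assumed shape).
interface ShowResponse {
  model_info?: Record<string, unknown>;
}

// Find the "<architecture>.embedding_length" entry, if the model reports one.
function embeddingLengthFromShow(show: ShowResponse): number | undefined {
  for (const [key, value] of Object.entries(show.model_info ?? {})) {
    if (
      key.endsWith(".embedding_length") &&
      typeof value === "number" &&
      Number.isInteger(value) &&
      value > 0
    ) {
      return value;
    }
  }
  return undefined;
}

// Assumed switch point, based on pgvector's index limit for `vector` columns.
const VECTOR_INDEX_MAX_DIMS = 2000;

function columnTypeFor(dimensions: number): "vector" | "halfvec" {
  return dimensions > VECTOR_INDEX_MAX_DIMS ? "halfvec" : "vector";
}

// Ask a running Ollama server for the model's real embedding size.
async function probeEmbeddingLength(
  model: string,
  baseUrl = "http://localhost:11434",
): Promise<number | undefined> {
  const res = await fetch(`${baseUrl}/api/show`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model }),
  });
  if (!res.ok) return undefined; // caller can fall back to the hardcoded table
  return embeddingLengthFromShow((await res.json()) as ShowResponse);
}
```

Returning `undefined` instead of throwing lets the init path keep the existing known-dimensions table as a fallback when the server is unreachable or the model does not report a size.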
Thanks for this contribution, and apologies for the slow triage. We did a full pass over the entire PR backlog. gbrain has moved fast, and the maintainer's larger "cathedral" rewrites have superseded a big share of community PRs: the AI gateway + recipes + user_provided_models system replaced almost all individual provider PRs; #1805 fixed the whole Postgres module-singleton class; #1542 unified the type taxonomy; #1657 the retrieval path; #1802 the doctor; and so on. We're closing this one in that cleanup: either the fix already landed on master, it duplicates another PR or merged change, or it's outside the current merge bar. Where a closed PR carried a genuinely valuable idea, we've recorded it in docs/designs/COMMUNITY_IDEAS.md so nothing good is lost (a few may graduate into TODOs). Please don't read the close as a judgment of the work; thank you for contributing. If you believe the underlying issue is still live on the latest master, reopen with a quick note and we'll take another look. 🙏
I wanted Gbrain to optionally allow for local embeddings via Ollama. Work completed by Codex 5.4 and verified working.
Summary
- doctor and health output
- nomic-embed-text, OpenAI defaults, and migration flows
Testing
I'm running Hermes and Gbrain within Docker, so all testing was conducted within this context.
docker compose exec -T hermes sh -lc 'cd /data/repos/gbrain && bun test test/embedding-ollama.test.ts test/pglite-engine.test.ts test/doctor.test.ts'
docker compose exec -T hermes sh -lc 'cd /data/repos/gbrain && HOME=/data/hermes/gbrain-home bun run src/cli.ts doctor --json'
Notes
- ollama/nomic-embed-text at 768d