fix: support self-hosted embedding providers (bge-m3, Ollama, etc.) #420
Three changes to enable self-hosted embedding backends without the OpenAI API:
1. embedding.ts: Replace the OpenAI SDK with raw fetch to avoid SDK v4.104.0 silently truncating embeddings to 256 dimensions. Configurable via the EMBEDDING_BASE_URL, EMBEDDING_MODEL, and EMBEDDING_DIMENSIONS env vars.
2. embed.ts: Add chunk rebuild logic in embedAll(): when content_chunks is empty (e.g. after TRUNCATE), rebuild from page.compiled_truth and timeline before embedding, matching the existing embedPage() behavior.
3. hybrid.ts: Only skip vector search when EMBEDDING_PROVIDER=keyword, not when OPENAI_API_KEY is unset. Self-hosted providers work fine without an OpenAI key.
Thanks for this contribution, and apologies for the slow triage. We did a full pass over the entire PR backlog. gbrain has moved fast, and the maintainer's larger "cathedral" rewrites have superseded a big share of community PRs: the AI gateway + recipes + user_provided_models system replaced almost all individual provider PRs; #1805 fixed the whole Postgres module-singleton class; #1542 unified the type taxonomy; #1657 the retrieval path; #1802 the doctor; and so on. We're closing this one in that cleanup: either the fix already landed on master, it duplicates another PR or merged change, or it's outside the current merge bar. Where a closed PR carried a genuinely valuable idea, we've recorded it in docs/designs/COMMUNITY_IDEAS.md so nothing good is lost (a few may graduate into TODOs). Please don't read the close as a judgment of the work. Thank you for contributing. If you believe the underlying issue is still live on the latest master, reopen with a quick note and we'll take another look. 🙏
Summary
Three fixes to enable self-hosted embedding backends (bge-m3, Ollama, local servers) without requiring an OpenAI API key.
1. Replace OpenAI SDK with raw fetch (embedding.ts)
The OpenAI SDK v4.104.0 silently truncates all embedding responses to 256 dimensions regardless of the actual dimensionality returned by the API. This makes it impossible to use 1024-dim models like bge-m3.
Fix: Replace openai.embeddings.create() with a raw fetch() call, bypassing SDK response processing entirely. All configuration is now via env vars:
- EMBEDDING_BASE_URL (default: http://localhost:9999/v1)
- EMBEDDING_MODEL (default: bge-m3)
- EMBEDDING_DIMENSIONS (default: 1024)

2. Rebuild chunks in embedAll() (embed.ts)
embedAll() (the --all path) skipped pages with zero chunks in the DB, unlike embedPage() (the single-page path), which rebuilds chunks from page content. After a TRUNCATE content_chunks, running gbrain embed --all would embed 0 chunks across all pages.
Fix: Added the same chunk rebuild logic from embedPage() into embedOnePage(): if getChunks() returns empty, chunk from page.compiled_truth and page.timeline, write to DB, then embed.

3. Fix vector search gate (hybrid.ts)
Vector search was gated on !process.env.OPENAI_API_KEY, which prevented self-hosted providers from using vector search at all.
Fix: Only skip vector search when EMBEDDING_PROVIDER=keyword (explicit opt-out).

Test plan
- curl http://localhost:9999/v1/embeddings returns 1024-dim vectors
- gbrain embed --slug <page> correctly embeds with 1024 dims
- gbrain embed --all rebuilds chunks from page content when DB is empty
- gbrain query "測試" returns vector search results without OPENAI_API_KEY
- gbrain doctor shows 100% embedding coverage, health score 85/100
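
For reviewers who want to see the shape of fix 1, here is a minimal sketch of a raw-fetch embedding call. It is not the code from this PR. It assumes the backend exposes an OpenAI-compatible POST /embeddings route that returns { data: [{ index, embedding }] }. Only the three env var names and their defaults come from the description above; loadEmbeddingConfig, embedTexts, and FetchLike are hypothetical names chosen for illustration.

```typescript
// Sketch only: helper names are hypothetical; env var names and defaults are from the PR text.
type Env = Record<string, string | undefined>;

interface EmbeddingConfig {
  baseUrl: string;
  model: string;
  dimensions: number;
}

// Minimal shape of fetch that this sketch relies on, so a fake can be injected in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

function loadEmbeddingConfig(env: Env): EmbeddingConfig {
  return {
    // Strip trailing slashes so "/embeddings" can be appended safely.
    baseUrl: (env.EMBEDDING_BASE_URL ?? "http://localhost:9999/v1").replace(/\/+$/, ""),
    model: env.EMBEDDING_MODEL ?? "bge-m3",
    dimensions: Number(env.EMBEDDING_DIMENSIONS ?? "1024"),
  };
}

async function embedTexts(
  texts: string[],
  config: EmbeddingConfig,
  fetchImpl: FetchLike = (globalThis as any).fetch, // global fetch on Node 18+ (assumed runtime)
): Promise<number[][]> {
  const res = await fetchImpl(`${config.baseUrl}/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: config.model, input: texts }),
  });
  if (!res.ok) throw new Error(`embedding request failed: HTTP ${res.status}`);

  // Assumed OpenAI-compatible response shape.
  const payload = (await res.json()) as { data: { index: number; embedding: number[] }[] };
  const vectors = [...payload.data]
    .sort((a, b) => a.index - b.index) // restore input order
    .map((d) => d.embedding);

  // Fail loudly on a dimension mismatch instead of storing truncated vectors.
  for (const v of vectors) {
    if (v.length !== config.dimensions) {
      throw new Error(`expected ${config.dimensions} dims, got ${v.length}`);
    }
  }
  return vectors;
}
```

A caller would typically pass process.env to loadEmbeddingConfig and leave fetchImpl at its default. Checking the vector length against EMBEDDING_DIMENSIONS is a design choice of this sketch: it turns the silent truncation described above into an explicit error.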
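
The rebuild-when-empty behavior from fix 2 can be sketched in isolation as follows. Everything here is a hypothetical stand-in rather than gbrain's real API: the Page, Chunk, and ChunkStore types, the ensureChunks name, and the naive paragraph splitter are assumptions, and the real getChunks() is presumably asynchronous and backed by Postgres. Only the flow (return existing chunks, otherwise chunk compiled_truth and timeline, persist, then hand off for embedding) follows the description above.

```typescript
// Sketch only: hypothetical types, not gbrain's actual schema or engine interface.
interface Page {
  slug: string;
  compiled_truth: string;
  timeline: string;
}

interface Chunk {
  slug: string;
  index: number;
  text: string;
}

interface ChunkStore {
  getChunks(slug: string): Chunk[];
  saveChunks(slug: string, chunks: Chunk[]): void;
}

// Naive splitter on blank lines, standing in for whatever chunker the project uses.
function chunkText(text: string): string[] {
  return text
    .split(/\n{2,}/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Return stored chunks, or rebuild them from page content when none exist
// (for example after TRUNCATE content_chunks).
function ensureChunks(page: Page, store: ChunkStore): Chunk[] {
  const existing = store.getChunks(page.slug);
  if (existing.length > 0) return existing;

  const rebuilt = [...chunkText(page.compiled_truth), ...chunkText(page.timeline)].map(
    (text, index) => ({ slug: page.slug, index, text }),
  );
  if (rebuilt.length > 0) store.saveChunks(page.slug, rebuilt);
  return rebuilt;
}
```

In this sketch both the single-page path and the --all path would call ensureChunks before embedding, so the two paths cannot drift apart again.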
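
The gate change in fix 3 comes down to which variable is consulted. The sketch below contrasts the old and new predicates. The function names and the EnvVars type are hypothetical, and the env is passed in as a plain object (in practice process.env) so the behavior is easy to check.

```typescript
// Sketch only: function names are hypothetical; the variable names come from the PR text.
type EnvVars = Record<string, string | undefined>;

// Old gate as described: vector search was skipped whenever no OpenAI key was set.
function skipVectorSearchOld(env: EnvVars): boolean {
  return !env.OPENAI_API_KEY;
}

// New gate: skip only on the explicit keyword-only opt-out.
function skipVectorSearch(env: EnvVars): boolean {
  return env.EMBEDDING_PROVIDER === "keyword";
}

// A self-hosted setup with no OpenAI key.
const selfHosted: EnvVars = { EMBEDDING_BASE_URL: "http://localhost:9999/v1" };
const oldResult = skipVectorSearchOld(selfHosted); // true: vector search wrongly disabled
const newResult = skipVectorSearch(selfHosted); // false: vector search stays on
```

With the new predicate, an unset OPENAI_API_KEY no longer affects retrieval, and keyword-only search has to be requested explicitly.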