
fix: support self-hosted embedding providers (bge-m3, Ollama, etc.) #420

Closed
SHENMINKUN-DAZENDIEO-SEANEO wants to merge 1 commit into garrytan:master from SHENMINKUN-DAZENDIEO-SEANEO:fix/embedding-self-hosted

Conversation

@SHENMINKUN-DAZENDIEO-SEANEO


Summary

Three fixes to enable self-hosted embedding backends (bge-m3, Ollama, local servers) without requiring an OpenAI API key.

1. Replace OpenAI SDK with raw fetch (embedding.ts)

The OpenAI SDK v4.104.0 silently truncates all embedding responses to 256 dimensions regardless of the actual dimensionality returned by the API. This makes it impossible to use 1024-dim models like bge-m3.

Fix: Replace openai.embeddings.create() with a raw fetch() call, bypassing SDK response processing entirely. All configuration is now via env vars:

  • EMBEDDING_BASE_URL (default: http://localhost:9999/v1)
  • EMBEDDING_MODEL (default: bge-m3)
  • EMBEDDING_DIMENSIONS (default: 1024)
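
As a rough illustration of the raw-fetch approach, the sketch below reads the three env vars with the defaults listed above and posts to an OpenAI-compatible /embeddings endpoint. It is not the code from this PR: readEmbeddingConfig, embedTexts, and the FetchLike type are hypothetical names, and the response shape (a data array of { index, embedding } objects) is assumed to follow the OpenAI embeddings format.

```typescript
// Hypothetical sketch, not the PR's embedding.ts.
// Env var names and defaults come from the PR description; everything else is assumed.
interface EmbeddingConfig {
  baseUrl: string;
  model: string;
  dimensions: number;
}

function readEmbeddingConfig(env: Record<string, string | undefined>): EmbeddingConfig {
  return {
    baseUrl: env.EMBEDDING_BASE_URL ?? "http://localhost:9999/v1",
    model: env.EMBEDDING_MODEL ?? "bge-m3",
    dimensions: Number(env.EMBEDDING_DIMENSIONS ?? "1024"),
  };
}

// Minimal subset of the fetch signature, so a stub can be injected in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function embedTexts(
  texts: string[],
  config: EmbeddingConfig,
  fetchImpl: FetchLike
): Promise<number[][]> {
  const res = await fetchImpl(`${config.baseUrl}/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: config.model, input: texts }),
  });
  if (!res.ok) {
    throw new Error(`Embedding request failed: HTTP ${res.status}`);
  }

  // Assumed OpenAI-style response: { data: [{ index, embedding }, ...] }.
  const payload = (await res.json()) as {
    data: { index: number; embedding: number[] }[];
  };

  // Restore input order, then fail loudly on a dimension mismatch
  // instead of storing vectors of the wrong size.
  const vectors = [...payload.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
  for (const vector of vectors) {
    if (vector.length !== config.dimensions) {
      throw new Error(
        `Expected ${config.dimensions} dimensions, got ${vector.length}`
      );
    }
  }
  return vectors;
}
```

In a runtime with a global fetch, a call would look like embedTexts(["hello"], readEmbeddingConfig(process.env), fetch). The explicit length check is a design choice of this sketch: it turns a silent dimension mismatch into an immediate error.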

2. Rebuild chunks in embedAll() (embed.ts)

embedAll() (the --all path) skipped pages with zero chunks in the DB, unlike embedPage() (single-page path) which rebuilds chunks from page content. After a TRUNCATE content_chunks, running gbrain embed --all would embed 0 chunks across all pages.

Fix: Added the same chunk rebuild logic from embedPage() into embedOnePage(): if getChunks() returns empty, chunk from page.compiled_truth and page.timeline, write to DB, then embed.
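
The rebuild-on-empty behavior can be sketched as follows. This is a simplified illustration rather than the PR's embed.ts: the Page, Chunk, and ChunkStore shapes, the ensureChunks helper, and the paragraph-based chunkText splitter are all hypothetical stand-ins. Only getChunks(), page.compiled_truth, and page.timeline are names taken from the description above.

```typescript
// Hypothetical sketch of "rebuild chunks when none exist" before embedding.
interface Page {
  slug: string;
  compiled_truth: string;
  timeline: string;
}

interface Chunk {
  slug: string;
  text: string;
}

// Assumed storage interface; the real project talks to a database.
interface ChunkStore {
  getChunks(slug: string): Chunk[];
  saveChunks(slug: string, chunks: Chunk[]): void;
}

// Stand-in chunker: one chunk per non-empty paragraph.
function chunkText(text: string): string[] {
  const chunks: string[] = [];
  for (const paragraph of text.split(/\n\s*\n/)) {
    const trimmed = paragraph.trim();
    if (trimmed.length > 0) chunks.push(trimmed);
  }
  return chunks;
}

// Return existing chunks, or rebuild them from page content and persist them.
function ensureChunks(page: Page, store: ChunkStore): Chunk[] {
  const existing = store.getChunks(page.slug);
  if (existing.length > 0) return existing;

  const rebuilt: Chunk[] = [];
  for (const source of [page.compiled_truth, page.timeline]) {
    for (const text of chunkText(source)) {
      rebuilt.push({ slug: page.slug, text });
    }
  }
  store.saveChunks(page.slug, rebuilt);
  return rebuilt;
}
```

With a helper like this on the --all path, a page whose chunks were wiped would be re-chunked before embedding instead of being skipped.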

3. Fix vector search gate (hybrid.ts)

Vector search was gated on !process.env.OPENAI_API_KEY, which prevented self-hosted providers from using vector search at all.

Fix: Only skip vector search when EMBEDDING_PROVIDER=keyword (explicit opt-out).
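
The difference between the two gates can be shown in a few lines. Both functions below are hypothetical reconstructions from the description above, not the code in hybrid.ts. The env object is passed in explicitly so the logic can be exercised without touching process.env.

```typescript
type Env = Record<string, string | undefined>;

// Old behavior as described: vector search runs only when an OpenAI key is set.
function legacyShouldRunVectorSearch(env: Env): boolean {
  return Boolean(env.OPENAI_API_KEY);
}

// New behavior as described: vector search runs unless the user explicitly
// opts out with EMBEDDING_PROVIDER=keyword.
function shouldRunVectorSearch(env: Env): boolean {
  return env.EMBEDDING_PROVIDER !== "keyword";
}

// A self-hosted setup with no OpenAI key:
const selfHosted: Env = { EMBEDDING_BASE_URL: "http://localhost:9999/v1" };
console.log(legacyShouldRunVectorSearch(selfHosted)); // false
console.log(shouldRunVectorSearch(selfHosted)); // true
```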

Test plan

  • curl http://localhost:9999/v1/embeddings returns 1024-dim vectors
  • gbrain embed --slug <page> correctly embeds with 1024 dims
  • gbrain embed --all rebuilds chunks from page content when DB is empty
  • gbrain query "測試" returns vector search results without OPENAI_API_KEY
  • gbrain doctor shows 100% embedding coverage, health score 85/100

Three changes to enable self-hosted embedding backends without OpenAI API:

1. embedding.ts: Replace OpenAI SDK with raw fetch to avoid SDK v4.104.0
   silently truncating embeddings to 256 dimensions. Configurable via
   EMBEDDING_BASE_URL, EMBEDDING_MODEL, EMBEDDING_DIMENSIONS env vars.

2. embed.ts: Add chunk rebuild logic in embedAll() — when content_chunks
   is empty (e.g. after TRUNCATE), rebuild from page.compiled_truth and
   timeline before embedding, matching the existing embedPage() behavior.

3. hybrid.ts: Only skip vector search when EMBEDDING_PROVIDER=keyword,
   not when OPENAI_API_KEY is unset — self-hosted providers work fine
   without an OpenAI key.
@garrytan

garrytan commented Jun 8, 2026

Owner

Thanks for this contribution — and apologies for the slow triage. We did a full pass over the entire PR backlog. gbrain has moved fast, and the maintainer's larger "cathedral" rewrites have superseded a big share of community PRs: the AI gateway + recipes + user_provided_models system replaced almost all individual provider PRs; #1805 fixed the whole Postgres module-singleton class; #1542 unified the type taxonomy; #1657 the retrieval path; #1802 the doctor; and so on.

We're closing this one in that cleanup — either the fix already landed on master, it duplicates another PR or merged change, or it's outside the current merge bar. Where a closed PR carried a genuinely valuable idea, we've recorded it in docs/designs/COMMUNITY_IDEAS.md so nothing good is lost (a few may graduate into TODOs).

Please don't read the close as a judgment of the work — thank you for contributing. If you believe the underlying issue is still live on the latest master, reopen with a quick note and we'll take another look. 🙏

garrytan closed this Jun 8, 2026