support custom OpenAI-compatible embedding servers and other models by vazir · Pull Request #516 · garrytan/gbrain

vazir · 2026-04-29T12:37:23Z

Currently embedding.ts hardcodes text-embedding-3-large at 1536 dims.
This patch makes the model and dimensions overridable via env vars, plus
adds a workaround for self-hosted endpoints that don't accept OpenAI's
dimensions request param.

The OpenAI SDK already reads OPENAI_BASE_URL and OPENAI_API_KEY from env,
so pointing gbrain at a self-hosted server was already half-possible.
The remaining gaps were the hardcoded model + dims.

New env vars added by this patch (all optional):

GBRAIN_EMBEDDING_MODEL — override the model name
GBRAIN_EMBEDDING_DIMENSIONS — override the target dim (default 1536)
GBRAIN_EMBEDDING_OMIT_DIMENSIONS — don't send the dimensions param

Existing env vars used by the OpenAI SDK (already supported, listed for
completeness so a self-hosted user has the full set in one place):

OPENAI_BASE_URL — point at your self-hosted endpoint
OPENAI_API_KEY — required non-empty by the SDK; any string works if your server has no auth
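
For readers who want to see how these variables could fit together, here is a minimal sketch, not the code from this patch. The names `resolveEmbeddingConfig`, `buildEmbeddingRequest`, `EmbeddingConfig`, and `EmbeddingRequest` are hypothetical, and the accepted truthy values for GBRAIN_EMBEDDING_OMIT_DIMENSIONS ("1" or "true") are an assumption, since the PR text does not say how the flag is parsed. The defaults mirror the description above (text-embedding-3-large, 1536 dims).

```typescript
// Hypothetical sketch: resolve embedding settings from an env-like map.
// Not the patch's actual code; helper and type names are illustrative only.
interface EmbeddingConfig {
  model: string;
  dimensions: number;
  omitDimensions: boolean;
}

interface EmbeddingRequest {
  model: string;
  input: string[];
  dimensions?: number;
}

type Env = Record<string, string | undefined>;

function resolveEmbeddingConfig(env: Env): EmbeddingConfig {
  const rawDims = env["GBRAIN_EMBEDDING_DIMENSIONS"];
  // Fall back to the documented default of 1536 when unset or empty.
  const dimensions =
    rawDims === undefined || rawDims === "" ? 1536 : Number(rawDims);
  if (!Number.isInteger(dimensions) || dimensions <= 0) {
    throw new Error(`Invalid GBRAIN_EMBEDDING_DIMENSIONS: ${rawDims}`);
  }
  // Assumption: "1" or "true" (case-insensitive) enables the flag.
  const flag = (env["GBRAIN_EMBEDDING_OMIT_DIMENSIONS"] ?? "").toLowerCase();
  return {
    model: env["GBRAIN_EMBEDDING_MODEL"] || "text-embedding-3-large",
    dimensions,
    omitDimensions: flag === "1" || flag === "true",
  };
}

// Build the embeddings request body, leaving out `dimensions` when asked to.
function buildEmbeddingRequest(
  cfg: EmbeddingConfig,
  input: string[],
): EmbeddingRequest {
  const req: EmbeddingRequest = { model: cfg.model, input };
  if (!cfg.omitDimensions) {
    req.dimensions = cfg.dimensions;
  }
  return req;
}
```

In a Node process this would be called as `resolveEmbeddingConfig(process.env)`. Passing the env map as an argument keeps the sketch easy to test without touching global state.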

If the server returns more dims than configured (e.g. a Matryoshka-trained
model where the param is omitted or ignored), we keep the first N dims, where
N is the configured dimension, and L2-renormalize the result. That keeps
cosine retrieval quality on MRL (Matryoshka Representation Learning) models.
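
The truncation step described above can be sketched roughly as follows. This is an illustration under assumptions, not the function from the patch: the name `truncateAndRenormalize` is hypothetical, and throwing when the server returns fewer dims than configured is a choice made for this sketch, since the PR text does not cover that case.

```typescript
// Hypothetical sketch: keep the first `targetDims` components of an
// embedding and rescale the result to unit L2 norm.
function truncateAndRenormalize(
  embedding: number[],
  targetDims: number,
): number[] {
  if (embedding.length < targetDims) {
    // Assumption: too-short vectors are treated as an error in this sketch.
    throw new Error(
      `Expected at least ${targetDims} dims, got ${embedding.length}`,
    );
  }
  if (embedding.length === targetDims) {
    return embedding; // Nothing to truncate.
  }
  const prefix = embedding.slice(0, targetDims);
  const norm = Math.sqrt(prefix.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) {
    return prefix; // Avoid dividing by zero on an all-zero prefix.
  }
  return prefix.map((x) => x / norm);
}

// The prefix [3, 4] has L2 norm 5, so the result is [0.6, 0.8].
const example = truncateAndRenormalize([3, 4, 12], 2);
console.log(example);
```

Renormalizing matters because the sliced prefix no longer has unit length, so any code path that assumes unit vectors (for example, using a dot product as a stand-in for cosine similarity) would otherwise score truncated vectors inconsistently.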

Tested with vLLM 0.20 + Qwen/Qwen3-Embedding-4B (native 2560-dim)
truncated to 1536. Full re-index of 4098 chunks across 736 pages,
gbrain doctor reports 100% coverage. Default behavior unchanged for
users on OpenAI.


… truncation

Make embedding.ts backend-agnostic so gbrain can run against any OpenAI-compatible embedding endpoint (vLLM, sentence-transformers, etc.), not just OpenAI's text-embedding-3-large@1536.

Three new optional env vars (defaults preserve current behavior):

- GBRAIN_EMBEDDING_MODEL - model name override
- GBRAIN_EMBEDDING_DIMENSIONS - schema dimension override
- GBRAIN_EMBEDDING_OMIT_DIMENSIONS - skip the OpenAI dimensions request param for servers that reject it (e.g. non-Matryoshka models)

Adds client-side dimension truncation + L2 renormalization for Matryoshka-trained models that return more dims than configured. Slicing the first N dims of an MRL embedding preserves cosine retrieval quality (Qwen3-Embedding, OpenAI text-embedding-3, etc.).

Tested against vLLM 0.20 + Qwen/Qwen3-Embedding-4B (native 2560-dim, truncated to 1536 to match existing schema): 4098 chunks across 736 pages, 100% coverage in gbrain doctor.

garrytan · 2026-06-08T02:58:40Z

Thanks for this contribution — and apologies for the slow triage. We did a full pass over the entire PR backlog. gbrain has moved fast, and the maintainer's larger "cathedral" rewrites have superseded a big share of community PRs: the AI gateway + recipes + user_provided_models system replaced almost all individual provider PRs; #1805 fixed the whole Postgres module-singleton class; #1542 unified the type taxonomy; #1657 the retrieval path; #1802 the doctor; and so on.

We're closing this one in that cleanup — either the fix already landed on master, it duplicates another PR or merged change, or it's outside the current merge bar. Where a closed PR carried a genuinely valuable idea, we've recorded it in docs/designs/COMMUNITY_IDEAS.md so nothing good is lost (a few may graduate into TODOs).

Please don't read the close as a judgment of the work — thank you for contributing. If you believe the underlying issue is still live on the latest master, reopen with a quick note and we'll take another look. 🙏

garrytan mentioned this pull request May 10, 2026

v0.32.0 feat: 5 new embedding recipes + discoverability pass (closes 17-PR cluster) #810

Merged

8 tasks

garrytan closed this Jun 8, 2026

vazir wants to merge 1 commit into garrytan:master from vazir:feat/env-overridable-embedding-model
