feat: support alternative embedding and LLM providers via env vars #99
Closed
YIING99 wants to merge 1 commit into
GBrain currently hardcodes OpenAI text-embedding-3-large for embeddings and Claude Haiku for query expansion. This makes it unusable in regions where these APIs are inaccessible (e.g. mainland China) or for users who prefer self-hosted / alternative models.

This PR makes all model providers configurable through environment variables, with zero-change defaults for existing users:

**Embedding (embedding.ts + pglite-schema.ts):**
- EMBEDDING_MODEL: model name (default: text-embedding-3-large)
- EMBEDDING_DIMENSIONS: vector dimensions (default: 1536)
- OPENAI_BASE_URL: API endpoint for OpenAI-compatible providers

**Query Expansion (expansion.ts):**
- EXPANSION_PROVIDER: 'anthropic' (default when ANTHROPIC_API_KEY set) or 'openai'
- EXPANSION_MODEL: model name (auto-selected per provider)
- Falls back to OpenAI-compatible API when no Anthropic key is available

**CJK word count fix:**
- Chinese/Japanese/Korean queries were silently skipped by expandQuery() because space-based word count treated "向量搜索优化" as 1 word
- Now detects CJK characters and counts non-whitespace characters instead

Tested with DashScope (text-embedding-v3, qwen-plus) and original OpenAI/Anthropic providers. Fully backward compatible.
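The CJK word count fix can be illustrated with a minimal sketch. This is not the PR's actual code: the helper name `countQueryUnits` and the Unicode ranges chosen for CJK detection are assumptions made for illustration. It counts whitespace-separated words for space-delimited text and non-whitespace characters once any CJK character is present.

```typescript
// Hypothetical helper (not from the PR): counts "units" in a query so that
// CJK text is not collapsed into a single word by whitespace splitting.

// Assumed ranges: Hiragana + Katakana, CJK Extension A, CJK Unified
// Ideographs, and Hangul syllables. The PR may use a different set.
const CJK_PATTERN = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af]/;

function countQueryUnits(query: string): number {
  const trimmed = query.trim();
  if (trimmed.length === 0) {
    return 0;
  }
  if (CJK_PATTERN.test(trimmed)) {
    // CJK text has no spaces between words, so count every
    // non-whitespace character instead.
    return Array.from(trimmed.replace(/\s+/g, "")).length;
  }
  // Space-delimited text: count whitespace-separated words.
  return trimmed.split(/\s+/).length;
}
```

With this sketch, `countQueryUnits("向量搜索优化")` yields 6 rather than 1, so a minimum-length check in front of query expansion would no longer skip the query.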
just came to upvote this. i was going to write a PR for the exact same thing. looks good.
Closing this in favor of #213 (@aloysiusmartis), which is a proper superset of what this PR tried to do.
#213 is the right abstraction. My env-var approach was a local-first workaround; the provider interface is the principled fix. Thanks @aloysiusmartis: landing #213 unblocks a lot of us running on alternative stacks (DashScope, Gemini, etc.). cc @jeremy-windsor: ^ this is the one to watch.
## Summary

GBrain currently hardcodes OpenAI `text-embedding-3-large` for embeddings and Claude Haiku for query expansion. This makes it unusable in regions where these APIs are inaccessible (e.g. mainland China) or for users who prefer self-hosted / alternative models.

This PR makes all model providers configurable through environment variables, with zero-change defaults for existing users.
## Changes

### Embedding (`embedding.ts` + `pglite-schema.ts`)

| Variable | Default | Example |
|---|---|---|
| `EMBEDDING_MODEL` | `text-embedding-3-large` | `text-embedding-v3` (DashScope) |
| `EMBEDDING_DIMENSIONS` | `1536` | `1024` |
| `OPENAI_BASE_URL` | `https://api.openai.com/v1` | `https://dashscope.aliyuncs.com/compatible-mode/v1` |

The PGLite schema's `vector()` dimension and default model name now follow these env vars.

### Query Expansion (`expansion.ts`)

| Variable | Default | Example |
|---|---|---|
| `EXPANSION_PROVIDER` | `anthropic` (if `ANTHROPIC_API_KEY` set) / `openai` | `openai` |
| `EXPANSION_MODEL` | auto-selected per provider | `qwen-plus` |

- When no `ANTHROPIC_API_KEY` is available, automatically falls back to OpenAI-compatible API
- Uses `response_format: { type: 'json_object' }` for structured output from OpenAI-compatible models

## Backward Compatibility
## Tested With

- `text-embedding-3-large` + Anthropic Claude Haiku (original behavior)
- `text-embedding-v3` (1024 dims) + `qwen-plus` (Chinese provider)

## Files Changed

- `src/core/embedding.ts`: configurable model & dimensions (7 lines)
- `src/core/pglite-schema.ts`: dynamic vector dimensions in schema (6 lines)
- `src/core/search/expansion.ts`: dual-provider support + CJK fix (50 lines)

Total: 3 files, ~74 lines changed
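To make the env-var behavior described in this PR concrete, here is a minimal sketch of how the defaults and the provider fallback might be resolved. It is not the code from the PR: the function names (`resolveEmbeddingConfig`, `resolveExpansionProvider`, `vectorColumnType`) and the `Env` type are hypothetical, and the handling of invalid `EMBEDDING_DIMENSIONS` values is an assumption. Only the variable names and default values come from the description above.

```typescript
// Hypothetical sketch of env-var resolution; names and structure are
// illustrative assumptions, not the PR's implementation.

type Env = Record<string, string | undefined>;

interface EmbeddingConfig {
  model: string;
  dimensions: number;
  baseURL: string;
}

// Resolve embedding settings, falling back to the documented defaults.
function resolveEmbeddingConfig(env: Env): EmbeddingConfig {
  const parsed = Number.parseInt(env["EMBEDDING_DIMENSIONS"] ?? "", 10);
  return {
    model: env["EMBEDDING_MODEL"] || "text-embedding-3-large",
    // Assumption: unset, non-numeric, or non-positive values fall back to 1536.
    dimensions: Number.isInteger(parsed) && parsed > 0 ? parsed : 1536,
    baseURL: env["OPENAI_BASE_URL"] || "https://api.openai.com/v1",
  };
}

// Pick the query-expansion provider: an explicit 'openai' wins; otherwise
// use Anthropic only when a key is present, else fall back to the
// OpenAI-compatible API.
function resolveExpansionProvider(env: Env): "anthropic" | "openai" {
  if (env["EXPANSION_PROVIDER"]?.toLowerCase() === "openai") {
    return "openai";
  }
  return env["ANTHROPIC_API_KEY"] ? "anthropic" : "openai";
}

// Build the column type the schema would use for the embedding vector.
function vectorColumnType(dimensions: number): string {
  return `vector(${dimensions})`;
}
```

In a Node.js process these would typically be called with `process.env`. With no variables set, the embedding config resolves to the original OpenAI defaults, which is how the zero-change behavior for existing users would be preserved under this sketch.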