feat: support alternative embedding and LLM providers via env vars#99

Closed
YIING99 wants to merge 1 commit into garrytan:master from YIING99:feat/configurable-providers

YIING99 commented Apr 13, 2026

Contributor

Summary

GBrain currently hardcodes OpenAI text-embedding-3-large for embeddings and Claude Haiku for query expansion. This makes it unusable in regions where these APIs are inaccessible (e.g. mainland China) or for users who prefer self-hosted / alternative models.

This PR makes all model providers configurable through environment variables, with zero-change defaults for existing users.

Changes

Embedding (embedding.ts + pglite-schema.ts)

| Env Var | Default | Example |
| --- | --- | --- |
| EMBEDDING_MODEL | text-embedding-3-large | text-embedding-v3 (DashScope) |
| EMBEDDING_DIMENSIONS | 1536 | 1024 |
| OPENAI_BASE_URL | https://api.openai.com/v1 | https://dashscope.aliyuncs.com/compatible-mode/v1 |

The PGLite schema's vector() dimension and default model name now follow these env vars.
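
As a rough illustration of the embedding settings above, here is a minimal sketch of how the three variables could be resolved and fed into the schema's column type. The names `resolveEmbeddingConfig`, `EmbeddingConfig`, and `vectorColumnType` are hypothetical and are not taken from the PR's diff; only the env var names and defaults come from the table.

```typescript
// Hypothetical sketch: the actual code in embedding.ts / pglite-schema.ts is not shown in this PR page.
interface EmbeddingConfig {
  model: string;
  dimensions: number;
  baseURL: string;
}

// Takes the environment as a plain object so it can be called with process.env or a test fixture.
function resolveEmbeddingConfig(env: Record<string, string | undefined>): EmbeddingConfig {
  const rawDims = env.EMBEDDING_DIMENSIONS;
  // Unset or empty falls back to the documented default of 1536.
  const dimensions = rawDims === undefined || rawDims === "" ? 1536 : Number(rawDims);
  if (!Number.isInteger(dimensions) || dimensions <= 0) {
    throw new Error(`EMBEDDING_DIMENSIONS must be a positive integer, got "${rawDims}"`);
  }
  return {
    model: env.EMBEDDING_MODEL || "text-embedding-3-large",
    dimensions,
    baseURL: env.OPENAI_BASE_URL || "https://api.openai.com/v1",
  };
}

// Builds the column type string that a schema definition could interpolate, e.g. "vector(1536)".
function vectorColumnType(config: EmbeddingConfig): string {
  return `vector(${config.dimensions})`;
}
```

With no variables set this yields the existing defaults; setting `EMBEDDING_DIMENSIONS=1024` would yield `vector(1024)`. A pgvector column only accepts vectors of its declared dimension, so a database created under one setting would presumably need its embeddings regenerated after a change.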

Query Expansion (expansion.ts)

| Env Var | Default | Example |
| --- | --- | --- |
| EXPANSION_PROVIDER | anthropic (if ANTHROPIC_API_KEY set) / openai | openai |
| EXPANSION_MODEL | auto per provider | qwen-plus |
  • When no ANTHROPIC_API_KEY is available, automatically falls back to an OpenAI-compatible API
  • Uses response_format: { type: 'json_object' } for structured output from OpenAI-compatible models
  • Also includes the CJK word count fix (see #98, "fix: CJK word count in query expansion")
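
The provider selection described above could look roughly like the following. This is a sketch under assumptions, not the PR's code: `resolveExpansionConfig` and `ExpansionConfig` are hypothetical names, and the default model strings are placeholders because the PR only says the model is chosen "auto per provider" without naming the defaults.

```typescript
// Hypothetical sketch of the selection rule; expansion.ts itself is not shown in this PR page.
type ExpansionProviderName = "anthropic" | "openai";

interface ExpansionConfig {
  provider: ExpansionProviderName;
  model: string;
}

// Placeholder values: the real per-provider default model names are not stated in the PR.
const DEFAULT_EXPANSION_MODELS: Record<ExpansionProviderName, string> = {
  anthropic: "<anthropic-default-model>",
  openai: "<openai-default-model>",
};

function resolveExpansionConfig(env: Record<string, string | undefined>): ExpansionConfig {
  const requested = env.EXPANSION_PROVIDER;
  let provider: ExpansionProviderName;
  if (requested === "anthropic" || requested === "openai") {
    // An explicit setting always wins.
    provider = requested;
  } else if (requested) {
    throw new Error(`Unsupported EXPANSION_PROVIDER "${requested}"`);
  } else {
    // No explicit setting: prefer Anthropic when a key exists, otherwise fall back to OpenAI-compatible.
    provider = env.ANTHROPIC_API_KEY ? "anthropic" : "openai";
  }
  return { provider, model: env.EXPANSION_MODEL || DEFAULT_EXPANSION_MODELS[provider] };
}
```

Rejecting an unknown provider name, rather than silently falling back, is a choice made for this sketch; the PR does not say how it handles that case.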

Backward Compatibility

  • No env vars set = exact same behavior as before
  • All defaults match current hardcoded values
  • No new dependencies added (OpenAI SDK already in the project)

Tested With

  • ✅ OpenAI text-embedding-3-large + Anthropic Claude Haiku (original behavior)
  • ✅ DashScope text-embedding-v3 (1024 dims) + qwen-plus (Chinese provider)

Files Changed

  • src/core/embedding.ts — configurable model & dimensions (7 lines)
  • src/core/pglite-schema.ts — dynamic vector dimensions in schema (6 lines)
  • src/core/search/expansion.ts — dual-provider support + CJK fix (50 lines)

Total: 3 files, ~74 lines changed

GBrain currently hardcodes OpenAI text-embedding-3-large for embeddings
and Claude Haiku for query expansion. This makes it unusable in regions
where these APIs are inaccessible (e.g. mainland China) or for users
who prefer self-hosted / alternative models.

This PR makes all model providers configurable through environment
variables, with zero-change defaults for existing users:

**Embedding (embedding.ts + pglite-schema.ts):**
- EMBEDDING_MODEL — model name (default: text-embedding-3-large)
- EMBEDDING_DIMENSIONS — vector dimensions (default: 1536)
- OPENAI_BASE_URL — API endpoint for OpenAI-compatible providers

**Query Expansion (expansion.ts):**
- EXPANSION_PROVIDER — 'anthropic' (default when ANTHROPIC_API_KEY set) or 'openai'
- EXPANSION_MODEL — model name (auto-selected per provider)
- Falls back to OpenAI-compatible API when no Anthropic key is available

**CJK word count fix:**
- Chinese/Japanese/Korean queries were silently skipped by expandQuery()
  because space-based word count treated "向量搜索优化" as 1 word
- Now detects CJK characters and counts non-whitespace characters instead

Tested with DashScope (text-embedding-v3, qwen-plus) and original
OpenAI/Anthropic providers. Fully backward compatible.
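
To illustrate the CJK fix described in the commit message, here is a minimal sketch of a word counter that switches to character counting when CJK text is present. The function name `countQueryWords` and the exact character ranges are assumptions made for this example; the PR does not show how `expandQuery()` detects CJK.

```typescript
// Hypothetical sketch; the ranges below are an assumption, not the PR's actual pattern.
// Covers Hiragana/Katakana, CJK Unified Ideographs (plus Extension A), and Hangul syllables.
const CJK_PATTERN = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af]/;

function countQueryWords(query: string): number {
  const trimmed = query.trim();
  if (trimmed === "") return 0;
  if (CJK_PATTERN.test(trimmed)) {
    // CJK text has no spaces between words, so count non-whitespace characters
    // (UTF-16 code units, which matches character count for the ranges above).
    return trimmed.replace(/\s+/g, "").length;
  }
  // Otherwise count whitespace-separated tokens.
  return trimmed.split(/\s+/).length;
}
```

Under this sketch, `countQueryWords("向量搜索优化")` returns 6 instead of the 1 that a space-based split would give, so a minimum-word threshold no longer skips the query.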

jeremy-windsor commented

just came to upvote this. i was going to write a PR for the exact same thing. looks good.

YIING99 commented Apr 19, 2026

Contributor Author

Closing this in favor of #213 (@aloysiusmartis), which is a proper superset of what this PR tried to do:

  • Full EmbeddingProvider / ExpansionProvider interfaces + factory pattern
  • OpenAI and Gemini implementations (not just env-var swap)
  • New gbrain init --provider and gbrain migrate --provider commands
  • Fixes a silent bug I missed: put_page hardcoded OPENAI_API_KEY, so non-OpenAI users were getting zero embeddings on every import
  • 52 new unit tests
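
For readers unfamiliar with the pattern, a provider interface with a factory might look roughly like this. It is a generic sketch only: #213's actual `EmbeddingProvider` definition is not shown in this thread, so every member, the registry, and the `stub` provider below are assumptions made for illustration.

```typescript
// Hypothetical shape; not copied from #213.
interface EmbeddingProvider {
  readonly name: string;
  readonly dimensions: number;
  embed(texts: string[]): Promise<number[][]>;
}

type EmbeddingProviderFactory = () => EmbeddingProvider;

// Registry mapping a provider name to a function that constructs it.
const embeddingProviders = new Map<string, EmbeddingProviderFactory>();

function registerEmbeddingProvider(name: string, factory: EmbeddingProviderFactory): void {
  embeddingProviders.set(name, factory);
}

function createEmbeddingProvider(name: string): EmbeddingProvider {
  const factory = embeddingProviders.get(name);
  if (!factory) {
    const known = Array.from(embeddingProviders.keys()).join(", ");
    throw new Error(`Unknown embedding provider "${name}" (known: ${known})`);
  }
  return factory();
}

// Demonstration-only provider: returns zero vectors and makes no network calls.
registerEmbeddingProvider("stub", () => ({
  name: "stub",
  dimensions: 4,
  embed: async (texts: string[]) => texts.map(() => [0, 0, 0, 0]),
}));
```

Compared with swapping env vars, callers in this arrangement depend only on the interface, so a code path cannot quietly assume one vendor's API key.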

#213 is the right abstraction. My env-var approach was a local-first workaround; the provider interface is the principled fix. Thanks @aloysiusmartis — landing #213 unblocks a lot of us running on alternative stacks (DashScope, Gemini, etc.).

cc @jeremy-windsor — ^ this is the one to watch.

YIING99 closed this Apr 19, 2026
YIING99 deleted the feat/configurable-providers branch April 19, 2026 10:29