feat: pluggable embedding + expansion providers (OpenAI + Gemini)#213

Closed
aloysiusmartis wants to merge 12 commits into garrytan:master from aloysiusmartis:feat/pluggable-providers

Conversation

@aloysiusmartis

Summary

Adds two provider abstractions that let GBrain run on Gemini instead of OpenAI/Anthropic, or mix and match providers. Both follow the same factory pattern.

Embedding providers (GBRAIN_EMBEDDING_PROVIDER)

  • EmbeddingProvider interface + factory (getActiveProvider) + isEmbeddingAvailable()
  • OpenAIEmbedder — text-embedding-3-large, 1536 dims (extracted from embedding.ts, unchanged behavior)
  • GeminiEmbedder — gemini-embedding-001, 1–3072 dims (Matryoshka truncation)
  • gbrain init --provider gemini [--dimensions N] — new brains with Gemini schema
  • gbrain migrate --provider gemini [--dry-run] — switch existing brains with ALTER TABLE + re-embed
  • Provider config persists to ~/.gbrain/config.json and propagates to env at startup
  • Critical fix: put_page now calls isEmbeddingAvailable() instead of hardcoding OPENAI_API_KEY — non-OpenAI users were silently getting no embeddings on every import
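
For readers who have not opened the diff, the factory and availability check described above can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: only the names `EmbeddingProvider`, `getActiveProvider`, `isEmbeddingAvailable` and the env var names come from the description. The `env` parameter, the `isAvailable` method and the object shapes are assumptions made so the example is self-contained.

```typescript
// Hypothetical shapes for illustration; the real interface lives in
// src/core/embedding-provider.ts and may differ.
type Env = Record<string, string | undefined>;

interface EmbeddingProvider {
  readonly name: "openai" | "gemini";
  readonly model: string;
  readonly dimensions: number;
  // Assumed helper: reports whether this provider's API key is present.
  isAvailable(env: Env): boolean;
}

const openaiProvider: EmbeddingProvider = {
  name: "openai",
  model: "text-embedding-3-large",
  dimensions: 1536,
  isAvailable: (env) => Boolean(env.OPENAI_API_KEY),
};

const geminiProvider: EmbeddingProvider = {
  name: "gemini",
  model: "gemini-embedding-001",
  dimensions: 768,
  // Either key name is accepted for Gemini.
  isAvailable: (env) => Boolean(env.GOOGLE_API_KEY || env.GEMINI_API_KEY),
};

// Unset or unknown values fall back to OpenAI, the documented default.
function getActiveProvider(env: Env): EmbeddingProvider {
  return env.GBRAIN_EMBEDDING_PROVIDER === "gemini" ? geminiProvider : openaiProvider;
}

// Callers ask "can the active provider embed?" instead of checking one vendor's key.
function isEmbeddingAvailable(env: Env): boolean {
  return getActiveProvider(env).isAvailable(env);
}
```

The point of the put_page fix is visible here: a Gemini-only environment has no `OPENAI_API_KEY`, so a hardcoded check on that key reports "no embeddings" even though the active provider is fully configured.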

Expansion providers (GBRAIN_EXPANSION_PROVIDER)

  • ExpansionProvider interface + factory (getActiveExpansionProvider) + isExpansionAvailable()
  • AnthropicExpander — claude-haiku-4-5 with tool_use (default)
  • GeminiExpander — gemini-1.5-flash with function calling
  • Shared security constants (EXPANSION_SYSTEM_PROMPT, EXPANSION_TOOL_NAME, EXPANSION_PARAM_NAME) — prompt drift between providers is a compiler error, not a runtime surprise
  • expansion.ts delegates to the active provider; callHaikuForExpansion removed
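
As an illustration of why shared constants turn prompt drift into a compile-time problem, here is a rough sketch. The three constant names come from this PR; their values, the request shapes (simplified approximations of Anthropic tool definitions and Gemini function declarations) and the `extractQueries` helper are assumptions for the example only.

```typescript
// Constant names are from the PR; the values are placeholders.
const EXPANSION_SYSTEM_PROMPT = "Rewrite the search query into alternative phrasings.";
const EXPANSION_TOOL_NAME = "expand_query";
const EXPANSION_PARAM_NAME = "queries";

// Simplified Anthropic-style request fragment (tool_use).
function anthropicRequestSketch() {
  return {
    system: EXPANSION_SYSTEM_PROMPT,
    tools: [
      {
        name: EXPANSION_TOOL_NAME,
        input_schema: {
          type: "object",
          properties: { [EXPANSION_PARAM_NAME]: { type: "array", items: { type: "string" } } },
          required: [EXPANSION_PARAM_NAME],
        },
      },
    ],
  };
}

// Simplified Gemini-style request fragment (function calling).
function geminiRequestSketch() {
  return {
    systemInstruction: { parts: [{ text: EXPANSION_SYSTEM_PROMPT }] },
    tools: [
      {
        functionDeclarations: [
          {
            name: EXPANSION_TOOL_NAME,
            parameters: {
              type: "OBJECT",
              properties: { [EXPANSION_PARAM_NAME]: { type: "ARRAY", items: { type: "STRING" } } },
              required: [EXPANSION_PARAM_NAME],
            },
          },
        ],
      },
    ],
  };
}

// Hypothetical helper: both providers read results from the same parameter name.
function extractQueries(args: Record<string, unknown>): string[] {
  const value = args[EXPANSION_PARAM_NAME];
  return Array.isArray(value) ? value.filter((q): q is string => typeof q === "string") : [];
}
```

If a constant is renamed or removed, both providers stop compiling at once, so a provider cannot quietly keep an older copy of the prompt or tool name.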

What stays the same: all existing callers, the embed() / embedBatch() / getEmbeddingModel() / getEmbeddingDimensions() public API, and the sanitizeQueryForPrompt / sanitizeExpansionOutput security boundary.

Full Google stack: set GBRAIN_EMBEDDING_PROVIDER=gemini + GBRAIN_EXPANSION_PROVIDER=gemini + GOOGLE_API_KEY and the entire search pipeline runs on Gemini. OpenAI and Anthropic keys become optional.

New files

  • src/core/embedding-provider.ts — interface, factory, isEmbeddingAvailable, resetActiveProvider
  • src/core/providers/openai-embedder.ts — OpenAI impl
  • src/core/providers/gemini-embedder.ts — Gemini impl
  • src/core/providers/retry-utils.ts — shared exponentialDelay + sleep (extracted from both embedders)
  • src/commands/migrate-provider.ts — gbrain migrate --provider command
  • src/core/expansion-provider.ts — interface, factory, isExpansionAvailable, shared security constants
  • src/core/providers/anthropic-expander.ts — Anthropic impl
  • src/core/providers/gemini-expander.ts — Gemini impl

Changed upstream files

  • src/core/embedding.ts — delegates to getActiveProvider(); lazy getEmbeddingModel() / getEmbeddingDimensions() functions replace module-level consts (ordering fix)
  • src/core/search/hybrid.ts — isEmbeddingAvailable() replaces !OPENAI_API_KEY
  • src/core/search/expansion.ts — delegates to getActiveExpansionProvider().expand()
  • src/core/pglite-schema.ts — getPGLiteSchema(dims, model) export for provider-aware schema
  • src/core/pglite-engine.ts — initSchema() uses active provider dims/model
  • src/core/postgres-engine.ts — initSchema() string-replaces vector dim/model
  • src/core/config.ts — GBrainConfig adds embedding_provider? + embedding_dimensions?; loadConfig() propagates to env
  • src/commands/init.ts — --provider openai|gemini [--dimensions N] flag
  • src/cli.ts — routes migrate --provider to runMigrateProvider
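
The ordering fix in embedding.ts is easier to see in isolation. The sketch below is an assumption-laden stand-in, not the PR's code: `env` replaces `process.env`, and `loadConfig` here only mimics the described behavior of copying the persisted provider into the environment when the variable is unset.

```typescript
type Env = Record<string, string | undefined>;
const env: Env = {}; // stand-in for process.env

// Eager: evaluated once at module load, before any config has been applied.
const EAGER_PROVIDER = env.GBRAIN_EMBEDDING_PROVIDER ?? "openai";

// Lazy: evaluated on every call, so it sees values that are set later.
function getProviderName(): string {
  return env.GBRAIN_EMBEDDING_PROVIDER ?? "openai";
}

// Simplified stand-in for loadConfig(): propagate the persisted choice
// to env, but never override a value the user set explicitly.
function loadConfig(persisted: { embedding_provider?: string }): void {
  if (persisted.embedding_provider && env.GBRAIN_EMBEDDING_PROVIDER === undefined) {
    env.GBRAIN_EMBEDDING_PROVIDER = persisted.embedding_provider;
  }
}

loadConfig({ embedding_provider: "gemini" });
console.log(EAGER_PROVIDER);    // "openai" (stale: captured before loadConfig ran)
console.log(getProviderName()); // "gemini"
```

A module-level const captures whatever the environment held at import time, which is why a brain initialized with Gemini could still report the OpenAI model until the consts became functions.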

Tests

  • test/embedding-provider.test.ts — 22 unit tests
  • test/pglite-schema-provider.test.ts — 6 tests for getPGLiteSchema() substitutions
  • test/config-embedding-provider.test.ts — 4 tests for env-var propagation
  • test/migrate-provider-args.test.ts — 8 tests for provider instantiation and dimsChange logic
  • test/expansion-provider.test.ts — 12 unit tests

All 1492 unit tests pass. No live API calls in test suite — Gemini and Anthropic paths both tested via mock key injection.

Test plan

  • bun test — 1492 pass, 0 fail
  • GBRAIN_EMBEDDING_PROVIDER=openai gbrain init — unchanged behavior
  • GBRAIN_EMBEDDING_PROVIDER=gemini gbrain init — creates vector(768) schema
  • gbrain migrate --provider gemini --dry-run — preview without touching data
  • GBRAIN_EXPANSION_PROVIDER=gemini gbrain search "..." — routes to GeminiExpander

🤖 Generated with Claude Code

aloysiusmartis and others added 12 commits April 18, 2026 23:29
Adds a provider-agnostic embedding abstraction so gbrain can use
Gemini text-embedding-004 (768 dims) as a drop-in alternative to
OpenAI text-embedding-3-large (1536 dims).

New files (additive, zero upstream conflict):
- src/core/embedding-provider.ts — EmbeddingProvider interface, factory,
  isEmbeddingAvailable() replacing hardcoded OPENAI_API_KEY checks
- src/core/providers/openai-embedder.ts — extracted from embedding.ts
- src/core/providers/gemini-embedder.ts — Gemini text-embedding-004
- test/embedding-provider.test.ts — 16 tests, all passing

Upstream files touched minimally (marked // FORK:):
- src/core/embedding.ts — delegates to active provider, same public API
- src/core/search/hybrid.ts — isEmbeddingAvailable() replaces OPENAI_API_KEY check

Config: GBRAIN_EMBEDDING_PROVIDER=openai|gemini (default: openai)
        GBRAIN_EMBEDDING_DIMENSIONS=1-768 (Gemini only, default 768)
        GOOGLE_API_KEY or GEMINI_API_KEY for Gemini

Phase 2 (schema migration + gbrain init --provider) tracked in COORDINATION.md.

Co-authored-by: Al's bot <aloysiusmartis@users.noreply.github.com>
…, migrate command

Schema templating:
- pglite-schema.ts: getPGLiteSchema(dims, model) generates schema with correct
  vector dimensions for any provider (additive export, PGLITE_SCHEMA_SQL preserved)
- pglite-engine.ts: initSchema() uses getPGLiteSchema() from active provider
- postgres-engine.ts: initSchema() string-replaces vector dim/model in SCHEMA_SQL
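
A rough sketch of what dimension-aware schema templating can look like. The table, columns and function name below are hypothetical and much smaller than the real schema; the default arguments mirror the OpenAI defaults mentioned in this PR.

```typescript
// Hypothetical, heavily simplified schema generator; the real
// getPGLiteSchema(dims, model) covers more tables and substitution targets.
function getSchemaSketch(dims: number = 1536, model: string = "text-embedding-3-large"): string {
  // dims is interpolated into DDL, so reject anything that is not a positive integer.
  if (!Number.isInteger(dims) || dims < 1) {
    throw new Error(`invalid vector dimensions: ${dims}`);
  }
  // The model name lands inside a SQL string literal; double any single quotes.
  const safeModel = model.replace(/'/g, "''");
  return [
    "CREATE TABLE IF NOT EXISTS chunks (",
    "  id SERIAL PRIMARY KEY,",
    "  body TEXT NOT NULL,",
    `  embedding vector(${dims}),`,
    `  embedding_model TEXT NOT NULL DEFAULT '${safeModel}'`,
    ");",
  ].join("\n");
}

console.log(getSchemaSketch(768, "gemini-embedding-001"));
```

Because the vector width is baked into the column type, every place that creates the column has to agree on the same number, which is why the schema is generated from the active provider rather than from a fixed string.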

Config persistence:
- config.ts: GBrainConfig gains embedding_provider + embedding_dimensions fields;
  loadConfig() propagates them to env vars so getActiveProvider() stays consistent
  across sessions without re-setting env manually

Init-time provider selection:
- gbrain init --provider gemini [--dimensions N]
  Creates brain with vector(768) schema and persists choice to config.json

Provider migration:
- gbrain migrate --provider gemini [--dimensions N] [--dry-run]
  Full migration: ALTER vector column + re-embed all chunks + update config table
  Handles both PGLite and Postgres engines. Safe to resume if interrupted.

Tests:
- 3 live Gemini API tests added (auto-skip when GOOGLE_API_KEY absent)
- Set GOOGLE_API_KEY in .env.local to run them (Bun auto-loads .env.local)
- .env.local.example added to repo as setup guide

Total: 19 embedding-provider tests, 2616 unit tests — 0 fail

Co-authored-by: Al's bot <aloysiusmartis@users.noreply.github.com>
- Model: text-embedding-004 → gemini-embedding-001 (the available stable model)
  gemini-embedding-001 supports 1–3072 dims via outputDimensionality (Matryoshka)
- Default dimensions remain 768; max raised from 768 to 3072
- Added 1536-dim mode (OpenAI-compat: switch providers without ALTER TABLE)
- Live API tests: 3 calls verified in 751ms total — batch + single + dim truncation
- .env.test.local: API keys only (no provider override) so schema stays at 1536
  for all existing tests; live Gemini tests instantiate GeminiEmbedder directly
- .env.local.example: scrubbed real keys, added NEVER put real keys warning
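
For context on the Matryoshka note above: the idea is that a prefix of the full embedding is itself a usable, lower-dimensional embedding. In this PR the truncation is requested from the API via outputDimensionality; the client-side sketch below only illustrates the concept, and the function name and the re-normalization step are assumptions rather than anything GeminiEmbedder does.

```typescript
// Illustrative only: keep the first `dims` components, then rescale to unit
// length so cosine and dot-product similarity stay comparable.
function truncateEmbedding(vector: number[], dims: number): number[] {
  if (!Number.isInteger(dims) || dims < 1 || dims > vector.length) {
    throw new RangeError(`dims must be an integer between 1 and ${vector.length}`);
  }
  const head = vector.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // An all-zero prefix cannot be normalized; return it unchanged.
  return norm === 0 ? head : head.map((x) => x / norm);
}

// A 1536-dim prefix of a 3072-dim vector has the same width as the OpenAI schema.
const full = Array.from({ length: 3072 }, (_, i) => Math.sin(i + 1));
console.log(truncateEmbedding(full, 1536).length); // 1536
```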

All 2609 unit tests pass. Live Gemini API verified.

Co-authored-by: Al's bot <aloysiusmartis@users.noreply.github.com>
Co-authored-by: Al's bot <aloysiusmartis@users.noreply.github.com>
- pglite-schema-provider.test.ts: 6 tests for getPGLiteSchema() pure function
  (all 3 substitution targets, custom dims, OpenAI-compat mode, default args)
- config-embedding-provider.test.ts: 4 tests for env-var propagation logic
  (propagates when unset, does not override when already set, for both
  embedding_provider and embedding_dimensions)
- migrate-provider-args.test.ts: 8 tests for provider instantiation and
  dimsChange logic (same/different dims, API key guard, no-retry on config error)
- embedding-provider.test.ts: +1 test for unknown provider fallback to OpenAI

Co-authored-by: Al's bot <aloysiusmartis@users.noreply.github.com>
- migrate-provider.ts: validate --dimensions is integer 1-3072, reject NaN
  (was: parseInt of non-numeric string produces NaN which silently passes
  GeminiEmbedder constructor and gets interpolated into ALTER TABLE SQL)
- migrate-provider.ts: remove duplicate ?? 768 default (GeminiEmbedder
  constructor already defaults to DEFAULT_DIMS=768)
- embedding-provider.test.ts: use it.skipIf pattern for live API tests
  instead of silent early-return (tests were green when not running)
- embedding-provider.test.ts: add resetActiveProvider() to isEmbeddingAvailable
  beforeEach/afterEach for proper test isolation
- embedding-provider.test.ts: add boundary dimension tests (1 and 3072)
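
The NaN problem described above is worth spelling out, since range checks alone do not catch it. The following is a minimal sketch of the kind of validation involved; `parseDimensions` is a hypothetical name and the real check in migrate-provider.ts may be structured differently.

```typescript
// NaN compares false against everything, so a plain range check lets it through.
console.log(Number.NaN < 1, Number.NaN > 3072); // false false

// Hypothetical validator for a --dimensions flag value.
function parseDimensions(raw: string | undefined, max: number = 3072): number | undefined {
  if (raw === undefined) return undefined; // flag omitted: let the provider default apply
  // Digits only: rejects "abc", "12.5", "-3", "" and also "12abc",
  // which parseInt would silently read as 12.
  if (!/^\d+$/.test(raw)) {
    throw new Error(`--dimensions must be an integer between 1 and ${max}, got "${raw}"`);
  }
  const value = Number(raw);
  if (value < 1 || value > max) {
    throw new Error(`--dimensions must be an integer between 1 and ${max}, got "${raw}"`);
  }
  return value;
}

console.log(parseDimensions("768")); // 768
```

Validating before the value reaches any SQL string matters here because the number is interpolated into ALTER TABLE rather than passed as a bound parameter.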

Co-authored-by: Al's bot <aloysiusmartis@users.noreply.github.com>
1. embedding.ts: change EMBEDDING_MODEL/EMBEDDING_DIMENSIONS from eager
   module-level consts to lazy functions getEmbeddingModel/getEmbeddingDimensions.
   Consts were evaluated at import time, before loadConfig() could propagate
   the persisted provider choice to env — caused silent ordering bugs.

2. migrate-provider.ts: add remote=false guard parameter; throws if called
   with remote=true. Command does destructive DDL and mutates process.env,
   both unsafe in MCP/remote context.

3. providers/retry-utils.ts: extract exponentialDelay() and sleep() shared
   helpers from openai-embedder.ts and gemini-embedder.ts into a single file.
   Both had identical implementations — one change point now instead of two.
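
A sketch of what such shared helpers usually look like. The function names `exponentialDelay` and `sleep` come from this commit; the signatures, the base and cap values, and the `withRetry` wrapper are assumptions added to show how the two fit together. No jitter is applied in this version.

```typescript
// Delay doubles with each attempt and is capped at maxMs.
function exponentialDelay(attempt: number, baseMs: number = 500, maxMs: number = 30_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Promise-based sleep for use with await.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical wrapper: retry fn up to maxAttempts times, backing off between tries.
async function withRetry<T>(fn: () => Promise<T>, maxAttempts: number = 4): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // No point sleeping after the final failed attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(exponentialDelay(attempt));
      }
    }
  }
  throw lastError;
}

console.log([0, 1, 2, 3].map((n) => exponentialDelay(n))); // [ 500, 1000, 2000, 4000 ]
```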

Co-authored-by: Al's bot <aloysiusmartis@users.noreply.github.com>
Critical:
- operations.ts: replace hardcoded OPENAI_API_KEY check with isEmbeddingAvailable()
  in put_page handler. Gemini users had all embeddings silently skipped on every
  put_page call (noEmbed was always true when OPENAI_API_KEY is unset).

High:
- gemini-embedder.ts: add Number.isInteger() to constructor guard to catch NaN.
  NaN fails both < 1 and > 3072 comparisons, silently creating an invalid provider.

Low:
- init.ts: use !== undefined instead of truthiness check for embeddingDimensions.
  Prevents a falsy value such as dim=0 from being silently skipped when setting the env var.

Co-authored-by: Al's bot <aloysiusmartis@users.noreply.github.com>
Mirrors the GBRAIN_EMBEDDING_PROVIDER abstraction for query expansion.
Extracted the hardcoded Anthropic/Haiku call from expansion.ts into a
provider interface with two implementations: AnthropicExpander (default,
same behavior) and GeminiExpander (gemini-1.5-flash via function calling).

New files:
- src/core/expansion-provider.ts — ExpansionProvider interface + factory
- src/core/providers/anthropic-expander.ts — Claude Haiku impl (extracted)
- src/core/providers/gemini-expander.ts — Gemini Flash impl
- test/expansion-provider.test.ts — 10 unit tests

Config:
  GBRAIN_EXPANSION_PROVIDER=anthropic|gemini (default: anthropic)
  GOOGLE_API_KEY or GEMINI_API_KEY required when using gemini

Updated docs: INSTALL_FOR_AGENTS.md, README.md, .env.testing.example
All 2648 unit tests pass.

Co-authored-by: Al's bot <aloysiusmartis@users.noreply.github.com>
GeminiEmbedder.embed() and embedBatch() now run via injected mock client —
no GOOGLE_API_KEY required. AnthropicExpander missing-key test isolates
env var deletion to prevent parallel test file race conditions.

Co-authored-by: Al's bot <aloysiusmartis@users.noreply.github.com>
…AGENTS

Step 2 previously buried Gemini as a paragraph footnote after the OpenAI
block. Agents would parse "Ask the user for these" and present only
OPENAI_API_KEY + ANTHROPIC_API_KEY. Restructured as Option A / Option B
so the agent surfaces both upfront and asks the user to choose.

Co-authored-by: Al's bot <aloysiusmartis@users.noreply.github.com>
…ander

Pre-upstream-contribution cleanup:
- Remove // FORK: comments from embedding.ts, hybrid.ts
- Remove FORK: comment from anthropic-expander.ts docblock
- Add explicit ANTHROPIC_API_KEY check before SDK call so tests
  get a clear error message instead of SDK's generic auth error

Co-authored-by: Al's bot <aloysiusmartis@users.noreply.github.com>
@jeremy-windsor

Thank you for this. Adding a comment to say this would be a great feature.

@aloysiusmartis aloysiusmartis deleted the feat/pluggable-providers branch April 20, 2026 19:46