feat(embedding): pluggable EmbeddingProvider layer (OpenAI + Ollama) #172
nbzy1995 wants to merge 5 commits
…lama implementations
The embedding service was a monolithic OpenAI-specific module. This extracts a provider interface so new backends (Ollama, vLLM, LiteLLM, Voyage) slot in without touching callers.
Changes:
- Add src/core/embedding/provider.ts: EmbeddingProvider interface + ProviderConfig type
- Add src/core/embedding/providers/openai.ts: OpenAIProvider with Matryoshka dim param gated to text-embedding-3 family
- Add src/core/embedding/providers/ollama.ts: OllamaProvider over /v1/embeddings, infers dim from known model registry, normalizes errors for retry
- Add src/core/embedding/factory.ts: createProvider(config) + resolveConfig that merges explicit config > EMBEDDING_* env vars > defaults
- Add src/core/embedding/service.ts: provider-agnostic batching, retry, truncation
- Add src/core/embedding/index.ts: public surface
- Keep src/core/embedding.ts as a thin re-export shim so existing imports work unchanged
- Add test/embedding/provider.test.ts: 15 tests covering both providers, factory, env resolution
Default behavior is preserved: no flags, no env vars → OpenAI text-embedding-3-large at 1536 dimensions. The full existing test suite (861 tests) passes without changes.
The schema still hardcodes vector(1536); provider-driven schema templating lands in the follow-up commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
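For reviewers, the resolution order (explicit config > EMBEDDING_* env vars > defaults) and the Matryoshka gating described in this commit could look roughly like the sketch below. It is not the PR's code: the field names in `ProviderConfig`, the `Env` type, and the `openAIRequestBody` helper are assumptions made for illustration, and the environment is passed in as a plain object so the function stays pure.

```typescript
// Hypothetical config shape; the real ProviderConfig in provider.ts may differ.
interface ProviderConfig {
  provider: "openai" | "ollama";
  model: string;
  dimensions: number;
  baseUrl: string | undefined;
}

// Environment passed in explicitly (assumption) instead of reading process.env.
type Env = Record<string, string | undefined>;

const DEFAULTS: ProviderConfig = {
  provider: "openai",
  model: "text-embedding-3-large",
  dimensions: 1536,
  baseUrl: undefined,
};

// Merge order: explicit config > EMBEDDING_* env vars > defaults.
function resolveConfig(explicit: Partial<ProviderConfig>, env: Env): ProviderConfig {
  const parsed = Number(env["EMBEDDING_DIMENSIONS"]);
  // Ignore missing or malformed values rather than propagating NaN.
  const envDims = Number.isInteger(parsed) && parsed > 0 ? parsed : undefined;
  return {
    provider: explicit.provider ?? DEFAULTS.provider,
    model: explicit.model ?? env["EMBEDDING_MODEL"] ?? DEFAULTS.model,
    dimensions: explicit.dimensions ?? envDims ?? DEFAULTS.dimensions,
    baseUrl: explicit.baseUrl ?? env["EMBEDDING_BASE_URL"] ?? DEFAULTS.baseUrl,
  };
}

// Only the text-embedding-3 family receives the `dimensions` request parameter.
function openAIRequestBody(cfg: ProviderConfig, input: string[]) {
  const body: { model: string; input: string[]; dimensions?: number } = {
    model: cfg.model,
    input,
  };
  if (cfg.model.startsWith("text-embedding-3")) body.dimensions = cfg.dimensions;
  return body;
}
```

With no explicit config and an empty environment, `resolveConfig({}, {})` yields the OpenAI defaults, which is the behavior-preserving path this commit promises.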
Until now the PGLite and Postgres schemas hardcoded vector(1536) and
text-embedding-3-large — the result of the v0.6 env-var shim stopping at
embedding.ts without reaching the schema layer. This patch finishes the
abstraction: a brain's embedding dim and default model are chosen at
init time from the resolved EmbeddingProvider, templated into the schema,
and persisted to ~/.gbrain/config.json.
Changes:
- Convert PGLITE_SCHEMA_SQL const to pgliteSchema({dimensions, defaultModel})
function; keep the const as a backward-compat alias that evaluates defaults.
- Same shape for postgresSchema in src/core/schema-embedded.ts; SCHEMA_SQL
alias preserved.
- Engine.initSchema() now takes optional opts (same shape), passes through
to the schema function. Default behavior unchanged when called with no args.
- Add embedding: {provider, model, dimensions, base_url} field to GBrainConfig.
- init.ts: parse --provider / --model / --dimensions / --base-url; resolve
via createProvider() (validates + infers Ollama dims); dim-mismatch guard
refuses re-init against an existing brain with different dimensions; pass
opts to initSchema; persist the chosen provider to config.
- cli.ts: --version also prints active provider when a config is loadable.
- test/schema-templating.test.ts — 11 new unit tests covering default fallback,
partial opts, Postgres dollar-quote preservation, and const-alias parity.
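The const-to-function conversion in the list above can be sketched as follows. The table and column names are invented for illustration and are not the real gbrain schema; only the `{dimensions, defaultModel}` option shape, the defaults, and the backward-compat alias come from this commit message.

```typescript
interface SchemaOpts {
  dimensions?: number;
  defaultModel?: string;
}

const DEFAULT_DIMENSIONS = 1536;
const DEFAULT_MODEL = "text-embedding-3-large";

// Illustrative schema only: the `chunks` table and its columns are hypothetical.
function pgliteSchema(opts: SchemaOpts = {}): string {
  const dimensions = opts.dimensions ?? DEFAULT_DIMENSIONS;
  const defaultModel = opts.defaultModel ?? DEFAULT_MODEL;
  if (!Number.isInteger(dimensions) || dimensions <= 0) {
    throw new Error(`invalid embedding dimensions: ${dimensions}`);
  }
  // Escape single quotes so the model name is safe inside a SQL string literal.
  const model = defaultModel.replace(/'/g, "''");
  return [
    "CREATE TABLE IF NOT EXISTS chunks (",
    "  id SERIAL PRIMARY KEY,",
    "  content TEXT NOT NULL,",
    `  embedding vector(${dimensions}),`,
    `  embedding_model TEXT NOT NULL DEFAULT '${model}'`,
    ");",
  ].join("\n");
}

// Backward-compat alias: evaluates the function once with default options.
const PGLITE_SCHEMA_SQL = pgliteSchema();
```

Because the alias is just the function called with no arguments, existing imports of the const see the same SQL as before, while `pgliteSchema({ dimensions: 768 })` produces a `vector(768)` column.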
Example usage:
gbrain init --provider=ollama --model=nomic-embed-text # 768d brain
gbrain init --provider=openai # 1536d brain (default)
gbrain init --provider=openai --dimensions=3072 # full text-embedding-3-large
gbrain init # defaults (openai 1536d)
All 861 existing tests still pass; 11 new schema tests added (872 total).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
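The dim-mismatch guard mentioned in this commit could take roughly the following form. This is a sketch under assumptions: it compares against the `embedding` block persisted in the config, and the `PersistedEmbedding` type and `assertDimensionsMatch` name are hypothetical. The actual guard in init.ts may read the existing dimensions from somewhere else.

```typescript
// Hypothetical shape of the persisted `embedding` config block.
interface PersistedEmbedding {
  provider: string;
  model: string;
  dimensions: number;
}

// Refuse to re-init an existing brain with a different vector width.
function assertDimensionsMatch(
  existing: PersistedEmbedding | undefined,
  requested: PersistedEmbedding,
): void {
  if (existing === undefined) return; // fresh brain: nothing to conflict with
  if (existing.dimensions !== requested.dimensions) {
    throw new Error(
      `brain was initialized with ${existing.dimensions}d vectors ` +
        `(${existing.provider}/${existing.model}); refusing re-init at ` +
        `${requested.dimensions}d`,
    );
  }
}
```

Failing loudly at init keeps a `vector(1536)` column from ever receiving 768-dimensional embeddings.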
`gbrain init --provider=ollama --model=nomic-embed-text --base-url=http://...` was silently falling through to defaults because parseEmbeddingFlags only handled `--flag value` (space-separated) form. Supporting both forms is standard CLI behavior. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
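A minimal sketch of a parser that accepts both forms is shown below. It reuses the `parseEmbeddingFlags` name from this commit, but the body, the return type, and the rule that a following `--something` token is never consumed as a value are assumptions rather than the PR's implementation.

```typescript
const EMBEDDING_FLAGS = ["provider", "model", "dimensions", "base-url"] as const;
type EmbeddingFlag = (typeof EMBEDDING_FLAGS)[number];

// Accepts `--flag value` and `--flag=value`; unknown arguments are ignored.
function parseEmbeddingFlags(argv: string[]): Partial<Record<EmbeddingFlag, string>> {
  const out: Partial<Record<EmbeddingFlag, string>> = {};
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === undefined || !arg.startsWith("--")) continue;
    // Split on the first "=" only, so values such as URLs may contain "=".
    const eq = arg.indexOf("=");
    const name = eq === -1 ? arg.slice(2) : arg.slice(2, eq);
    const flag = EMBEDDING_FLAGS.find((f) => f === name);
    if (flag === undefined) continue;
    if (eq !== -1) {
      out[flag] = arg.slice(eq + 1);
    } else {
      const next = argv[i + 1];
      // Assumption: a token starting with "--" is another flag, not a value.
      if (next !== undefined && !next.startsWith("--")) {
        out[flag] = next;
        i++; // skip the consumed value
      }
    }
  }
  return out;
}
```

Mixed invocations such as `--provider=ollama --model nomic-embed-text` then resolve both flags instead of silently falling through to defaults.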
connectEngine() only loaded the database config. Commands that trigger embedding (embed, import, query, search) fell back to the service's default provider (OpenAI) regardless of what the brain was initialized with, causing 401s when the brain was configured for Ollama. Now connectEngine reads config.embedding, builds the matching provider, and installs it via setProvider before any command runs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
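The hydration step described here might be sketched as below. Everything in it is a stand-in: `GBrainConfigSketch`, `ProviderSketch`, `hydrateProvider`, and the module-level `activeProvider` variable are hypothetical, and the real code builds a full provider via the factory before calling `setProvider`.

```typescript
// Hypothetical, trimmed-down shapes for illustration.
interface EmbeddingConfig {
  provider: "openai" | "ollama";
  model: string;
  dimensions: number;
}
interface GBrainConfigSketch {
  embedding?: EmbeddingConfig;
}
interface ProviderSketch {
  name: string;
  model: string;
  dimensions: number;
}

// Service-level default, used when nothing has been installed yet.
let activeProvider: ProviderSketch = {
  name: "openai",
  model: "text-embedding-3-large",
  dimensions: 1536,
};

function setProvider(p: ProviderSketch): void {
  activeProvider = p;
}

function getProvider(): ProviderSketch {
  return activeProvider;
}

// Run during engine connection, before any command that embeds text.
function hydrateProvider(config: GBrainConfigSketch): void {
  if (config.embedding === undefined) return; // older config: keep the default
  const { provider, model, dimensions } = config.embedding;
  setProvider({ name: provider, model, dimensions });
}
```

Configs written before the `embedding` field existed leave the default untouched, so older brains keep their current behavior.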
Thanks for this contribution, and apologies for the slow triage. We did a full pass over the entire PR backlog. gbrain has moved fast, and the maintainer's larger "cathedral" rewrites have superseded a big share of community PRs: the AI gateway + recipes + user_provided_models system replaced almost all individual provider PRs; #1805 fixed the whole Postgres module-singleton class; #1542 unified the type taxonomy; #1657 the retrieval path; #1802 the doctor; and so on.
We're closing this one in that cleanup: either the fix already landed on master, it duplicates another PR or merged change, or it's outside the current merge bar. Where a closed PR carried a genuinely valuable idea, we've recorded it in docs/designs/COMMUNITY_IDEAS.md so nothing good is lost (a few may graduate into TODOs).
Please don't read the close as a judgment of the work; thank you for contributing. If you believe the underlying issue is still live on the latest master, reopen with a quick note and we'll take another look. 🙏
Summary
Promotes the env-var shim in `src/core/embedding.ts` to a proper `EmbeddingProvider` interface so embedding backends slot in without touching call sites or the schema. Ships `OpenAIProvider` (default, identical behavior to before) and `OllamaProvider` (any model on a local Ollama daemon, e.g. `nomic-embed-text @ 768d`). Templates the `vector(N)` schema dim by the resolved provider so `gbrain init --provider=ollama --model=nomic-embed-text` actually creates a `vector(768)` brain.
Why
Today `embedding.ts` reads `EMBEDDING_MODEL` / `EMBEDDING_DIMENSIONS` / `EMBEDDING_BASE_URL` env vars and gates the Matryoshka `dimensions` param via `model.startsWith('text-embedding-3')`. The schema, however, hardcodes `vector(1536)` and `'text-embedding-3-large'`, so setting those env vars produces a brain whose vector column doesn't match what the embedder writes: a silent footgun.
This patch finishes the abstraction: provider quirks (Matryoshka, error normalization, dim inference) live in concrete provider classes; schema dim flows from the resolved provider at `gbrain init` time and is persisted to `~/.gbrain/config.json`. After init, `connectEngine()` hydrates the persisted provider before any command runs, so `embed`, `query`, `import`, `serve` all use the brain's frozen choice; env-var changes can't silently corrupt vectors mid-life.
Scope (4 commits, ~870 LOC, all behavior-preserving by default)
- 8b206b8: `EmbeddingProvider` interface, `OpenAIProvider`, `OllamaProvider`, factory, service. `embedding.ts` becomes a backward-compat shim. 15 new provider tests.
- 0abd57c: `pglite-schema.ts` and `schema-embedded.ts` templated by `(dimensions, defaultModel)`. `engine.initSchema(opts?)` plumbs them through. `init.ts` parses `--provider` / `--model` / `--dimensions` / `--base-url`, persists `embedding: {...}` to config, refuses re-init on dim mismatch. `cli.ts --version` prints the active provider. 11 new schema-templating tests.
- 7081e12: `parseEmbeddingFlags` accepts both `--flag value` and `--flag=value`.
- 41469e1: `cli.ts` `connectEngine()` hydrates the provider from `config.embedding` before any command, so embed/query don't fall back to OpenAI defaults when the brain was inited for Ollama.
Tests
- `test/embedding/provider.test.ts` and `test/schema-templating.test.ts`
- `gbrain init` with no flags → identical SQL as before, `OpenAIProvider` instantiated, all existing tests unchanged
- `gbrain query` returns 1.0 cosine scores on topic-aligned questions
Backward compatibility
- `embedding.ts` still exports `embed` and `embedBatch` with unchanged signatures (re-exports from the new `embedding/service.ts`). No call sites need editing.
- `PGLITE_SCHEMA_SQL` and `SCHEMA_SQL` const aliases preserved (evaluate the schema function with default opts → identical SQL).
- `engine.initSchema()` with no args defaults to `(1536, 'text-embedding-3-large')`, so existing test harnesses keep working.
- `EMBEDDING_*` env vars still honored as the resolution fallback when no CLI flag and no persisted config.
Example usage
Notes for maintainer
- `gbrain config show` renders nested objects as `[object Object]`. This predates the PR but is newly visible because `embedding` is the first nested config field.
- `nomic-embed-text`'s native context is ~2K tokens, but the chunker can produce larger chunks. Worth gating chunker max size by `provider.maxInputChars` or the model's context window.
- Can be split into two PRs (`provider-layer` first, `schema-templating` second) if that's easier to review.
🤖 Generated with Claude Code