Make embedding vector dimension configurable + add LM Studio recipe by vincedk-alt · Pull Request #1051 · garrytan/gbrain

vincedk-alt · 2026-05-15T22:00:31Z

Summary

Two patches for non-OpenAI embedding workflows:

vector(N) schema parameterization: schema.sql hardcoded vector(1536), the dimension of OpenAI's text-embedding-3-small. It now reads embedding_dimensions from config (default 1536 for backward compatibility), so users running local models such as nomic-embed-text-v1.5 (768) or mxbai-embed-large (1024) don't have to fork.
LM Studio recipe: a sibling of the ollama recipe, added at recipes/lmstudio.ts, that points at http://localhost:1234/v1 by default. It cleanly separates LM Studio integration from the misleading "ollama at non-Ollama port" workaround pattern.
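To make the first patch concrete, here is a minimal sketch of how a configured dimension could be resolved and substituted into the column definition. Only the embedding_dimensions key and the 1536 default come from this PR; the EmbeddingConfig interface and both function names are hypothetical and are not the code in the diff.

```typescript
// Hypothetical config shape: only `embedding_dimensions` is named in the PR.
interface EmbeddingConfig {
  embedding_dimensions?: number;
}

// Default stated in the PR: 1536, matching text-embedding-3-small.
const DEFAULT_EMBEDDING_DIMENSIONS = 1536;

// Resolve the dimension, falling back to the default when the key is absent.
// The value is interpolated into DDL, so anything other than a positive
// integer is rejected.
function resolveEmbeddingDimensions(config: EmbeddingConfig): number {
  const dims = config.embedding_dimensions ?? DEFAULT_EMBEDDING_DIMENSIONS;
  if (!Number.isInteger(dims) || dims <= 0) {
    throw new Error(`embedding_dimensions must be a positive integer, got ${dims}`);
  }
  return dims;
}

// Hypothetical helper: render the column definition that was previously
// hardcoded as vector(1536).
function renderEmbeddingColumn(config: EmbeddingConfig): string {
  return `embedding vector(${resolveEmbeddingDimensions(config)})`;
}
```

With an empty config this renders the same vector(1536) column as before, which is the property that lets existing databases keep working without migration.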

Why

The operator runs gbrain on a personal fork to support a hybrid local+cloud setup (LM Studio MLX for embeddings, Anthropic Haiku for chat), so every upstream release requires a manual rebase. These patches are generally useful: anyone running gbrain with a non-OpenAI embedding provider needs the dimension parameterization.

Verified by operator

Running in production on operator's machine for ~10 days with the vector(768) schema patch
92 brain pages embedded at 768 dim using nomic-embed-text-v1.5 via LM Studio MLX
gbrain doctor passes (45/100 brain_score is unrelated — content sparsity, not schema)
Anthropic Haiku continues to work as the chat/expansion model alongside LM Studio embeddings

Test plan (not personally re-verified before opening this PR — operator's prior test run is the basis)

The original patches were validated in the operator's earlier work session (2026-05-14). The PR-opener (an agentic helper) is surfacing the patches upstream rather than re-running the test suite. Maintainer may wish to re-validate via:

Aged-brain test: existing vector(1536) databases continue to work with no migration needed
Fresh-init test with embedding_dimensions=768 in config produces a working PGLite database
WAL-stress: 100 sequential gbrain put operations under the new schema path
LM Studio recipe smoke test: gbrain put then gbrain search round-trip
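For a maintainer re-validating the LM Studio path, one way to check dimension alignment outside gbrain is sketched below. It assumes LM Studio serves the OpenAI-compatible embeddings format (a POST to /embeddings under the base URL, returning data[].embedding), which is what the recipe's openai-compat tier implies. The function names are hypothetical and this is not part of the PR's test suite.

```typescript
// Response shape of an OpenAI-compatible embeddings endpoint (only the
// fields this check needs).
interface EmbeddingsResponse {
  data: { embedding: number[] }[];
}

// Build the request for an OpenAI-compatible /embeddings call.
// Trailing slashes on the base URL are stripped so both
// "http://localhost:1234/v1" and "http://localhost:1234/v1/" work.
function buildEmbeddingsRequest(baseUrl: string, model: string, input: string[]) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/embeddings`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, input }),
    },
  };
}

// Throw if any returned vector does not match the configured dimension,
// which is the mismatch a vector(N) column would reject on insert.
function assertEmbeddingDimensions(response: EmbeddingsResponse, expected: number): void {
  response.data.forEach((item, index) => {
    if (item.embedding.length !== expected) {
      throw new Error(
        `embedding ${index} has ${item.embedding.length} dims, expected ${expected}`
      );
    }
  });
}
```

The request object can be passed to fetch(url, init), and the parsed JSON body handed to assertEmbeddingDimensions with the value of embedding_dimensions from config.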

Branch is 4 commits behind master at time of PR opening. Happy to rebase or iterate on the patch shape if a different abstraction is preferred.

Update 2026-05-16: This PR's branch now also carries commit 5d75545 (gitignore the gbrain db_only auto-managed directories media/x/, media/articles/, meetings/transcripts/). Strictly additive; doesn't conflict with the schema parameterization above.

Separately, see #1060 (closes #203) — the same operator's branch surfaced a config-first init bug while exercising the v0.35 → v0.35.0.1 rebuild path. That fix is independent of this PR; both can land in any order.

…io :1234)

The ollama recipe template at the same port was misleading on inspection. LM Studio is not Ollama; the config now reads lmstudio:<model> truthfully.

Recipe mirrors ollama.ts shape: openai-compat tier, no auth required, no static batch cap, default 768 dims for nomic-embed-text-v1.5.

Verified end-to-end: gbrain doctor shows embedding_provider: lmstudio:text-embedding-nomic-embed-text-v1.5 OK 172ms, 768 dims, DB aligned. Operator config update separately at ~/.gbrain/config.json.
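As an illustration of the recipe properties listed in this commit message, the sketch below shows one possible shape. The EmbeddingRecipe interface, its field names, and parseProviderSpec are hypothetical; the actual structure of ollama.ts and recipes/lmstudio.ts is not shown on this page. The values (base URL, tier, no auth, no batch cap, 768 dims) are the ones stated in the PR.

```typescript
// Hypothetical recipe shape; field names are illustrative only.
interface EmbeddingRecipe {
  id: string;
  tier: "openai-compat";
  baseUrl: string;
  requiresAuth: boolean;
  maxBatchSize: number | null; // null = no static batch cap
  defaultDimensions: number;
}

// Values taken from the PR description and commit message.
const lmstudioRecipe: EmbeddingRecipe = {
  id: "lmstudio",
  tier: "openai-compat",
  baseUrl: "http://localhost:1234/v1",
  requiresAuth: false,
  maxBatchSize: null,
  defaultDimensions: 768, // nomic-embed-text-v1.5
};

// Split a provider spec of the form "<provider>:<model>" at the first colon,
// so model names that themselves contain colons are preserved.
function parseProviderSpec(spec: string): { provider: string; model: string } {
  const idx = spec.indexOf(":");
  if (idx <= 0 || idx === spec.length - 1) {
    throw new Error(`expected "<provider>:<model>", got "${spec}"`);
  }
  return { provider: spec.slice(0, idx), model: spec.slice(idx + 1) };
}
```

Under this sketch, the embedding_provider value reported by gbrain doctor above would parse to provider lmstudio and model text-embedding-nomic-embed-text-v1.5.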

These dirs (media/x/, media/articles/, meetings/transcripts/) are populated at runtime by gbrain sync/ingest for db_only payloads and should never enter version control. Captured locally during the v0.33 -> v0.35 PGLite rebuild on 2026-05-15.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vincedk-alt added 2 commits May 15, 2026 18:00

vince fork: embedding column vector(1536) → vector(768) for local nomic-embed-text-v1.5

12947cc

vincedk-alt force-pushed the vince-schema-768-dim branch from 66993f1 to 1c55e9b Compare May 15, 2026 23:28

vincedk-alt wants to merge 3 commits into garrytan:master from vincedk-alt:vince-schema-768-dim
