fix(init): seed resolveAIOptions from loadConfig() (closes #203) by vincedk-alt · Pull Request #1060 · garrytan/gbrain

vincedk-alt · 2026-05-16T05:43:00Z

Summary

Re-init no longer ignores the embedding settings users already wrote down in ~/.gbrain/config.json. gbrain init --pglite (no flags) on a brain whose persisted config declares a non-default embedding_model / embedding_dimensions / provider_base_urls now respects those values instead of silently falling back to OpenAI 1536. Closes #203.

Why

resolveAIOptions in src/commands/init.ts only consumed CLI flags. It never read loadConfig(), so aiOpts came out empty when no flags were passed. That tripped the if (embedding_model || chat_model) gate further down in initPGLite (line 386), so configureGateway was never called, _config stayed null, and initSchema()'s catch-block fell back to literal 1536 in pglite-engine.ts:212.

Net result on every re-init without flags: the user's persisted choice was silently overridden by the OpenAI default. v0.28.5's embedding-dim-check catches the resulting column mismatch on the next embed, but only AFTER the wrong-shape brain is already on disk and the user has spent another minute figuring out why their --embedding-model lmstudio:... setup keeps producing vector(1536) columns.

What changed

One function (resolveAIOptions) in src/commands/init.ts. Seeds out from loadConfig() at the top, before the existing flag-handling code. Flags still win on a per-field basis. New resolution chain:

~/.gbrain/config.json via loadConfig() — NEW
--model SHORTHAND recipe lookup (sets model + dims)
--embedding-model VERBOSE (overrides shorthand)
--embedding-dimensions N (overrides recipe-derived dim)
--expansion-model / --chat-model (additive)

Gateway defaults still apply downstream when nothing in the chain populates a field — cold-start behavior unchanged. claw-test's fresh-install scenario keeps working.

Two edge cases caught during review and fixed in this PR:

Model swap clears seeded dim. If config has lmstudio:nomic 768 and the user reruns init with --embedding-model openai:text-embedding-3-large (no --embedding-dimensions), the seeded 768 is cleared so the recipe-default derivation re-fires for the new provider. Without this, the user would silently get OpenAI configured at 768-dim.
provider_base_urls rides along. Custom endpoint overrides (e.g. provider_base_urls.lmstudio pointing at http://localhost:1234/v1) are preserved through the seed + saveConfig cycle. Without this, re-init dropped the user's endpoint override and the gateway fell back to recipe defaults on next call.

Tests

test/init-config-first.test.ts — 7 cases, hermetic against the runner env via an isolateEnv() helper that unsets DATABASE_URL, GBRAIN_DATABASE_URL, and every GBRAIN_EMBEDDING_* / GBRAIN_*_MODEL / OPENAI_API_KEY / ANTHROPIC_API_KEY. Uses withEnv() from the canonical test helpers; no mock.module, no *.serial.test.ts quarantine needed. The 7 cases pin:

Config-only, no flags → returns config values
Flag-only, no config → existing flag behavior preserved
Both flags and config → flags override config per-field, config fills the gaps
Empty (cold-start, no config, no flags) → empty object (existing behavior preserved)
--embedding-model switches provider away from config → seeded dim cleared, recipe-default re-derives
provider_base_urls preserved through the seed cycle
Env-var-only (KNOWN LIMITATION, tracked at init: GBRAIN_EMBEDDING_MODEL/DIMENSIONS env vars ignored when no config.json AND no DATABASE_URL #1058)

Case 7 documents a separate bug class not closed by this PR: when there's no config.json AND no DATABASE_URL, loadConfig() early-returns null at config.ts:135, so the env-var merge below that early-return never fires. Setting GBRAIN_EMBEDDING_DIMENSIONS=768 env-only on a truly cold install still produces a 1536-dim schema. Filed at #1058 for separate fix.

Deliberate-failure run: DATABASE_URL=... GBRAIN_EMBEDDING_MODEL=... GBRAIN_EMBEDDING_DIMENSIONS=999 bun test test/init-config-first.test.ts → 7 pass, 0 fail. The isolateEnv() is adversarially proven, not just theoretical.

Verification

bun run typecheck — clean
bun run verify — all 13 pre-checks pass (check:privacy, check:test-isolation with 410 non-serial files scanned, etc.)
bun test test/init-config-first.test.ts — 7 pass, 0 fail
bun run test — full fast loop ran with 0 fails (two shards exceeded the 600s per-shard timeout cap; per gbrain CLAUDE.md that's the documented "infrastructure problem" classification, not test failures; all 4 shards' completed work reports 0 fail; new test is in shard 2 which completed in 1857 pass / 0 fail / 0 skip / rc=0)
End-to-end smoke against a tempdir GBRAIN_HOME: column came up at vector(768) (verified via direct pg_attribute format_type probe) on a config-first init with no flags

Reviewer notes

Two adversarial reviews ran against this branch before opening the PR:

A devil's-advocate critic loop (4 rounds, quieted with 4 findings raised + resolved — env-var-only path called out as KNOWN LIMITATION + filed as init: GBRAIN_EMBEDDING_MODEL/DIMENSIONS env vars ignored when no config.json AND no DATABASE_URL #1058, the bun -e smoke-test syntax error caught and re-run with ground truth, the rebase pre-check added before any force-push, and a test-isolation hole closed by isolateEnv()).
codex review --base upstream/master 3 rounds: round 1 surfaced the model-swap dim-stale bug (fixed), round 2 surfaced provider_base_urls field-loss (fixed), round 3 clean.

Both reviewers were against the same branch tip after each fix; their findings did not overlap, which is the cross-model-review value-add CLAUDE.md Rule 5b describes.

🤖 Generated with Claude Code

Re-init no longer requires users to re-pass embedding flags they already wrote down in ~/.gbrain/config.json. resolveAIOptions now reads loadConfig() first, then applies CLI flag overrides on a per-field basis. Resolution chain (each step overrides the previous per-field): 1. ~/.gbrain/config.json via loadConfig() -- NEW 2. --model SHORTHAND (recipe lookup; sets both model + dims) 3. --embedding-model VERBOSE (overrides shorthand) 4. --embedding-dimensions N (overrides recipe-derived dims) 5. --expansion-model / --chat-model (additive) Gateway defaults still apply downstream for fields nothing in the chain populates -- cold-start behavior unchanged. claw-test fresh-install scenario is unaffected. Known limitation (deferred to follow-up issue): if a user sets GBRAIN_EMBEDDING_MODEL + GBRAIN_EMBEDDING_DIMENSIONS env vars but has NO existing config.json AND no DATABASE_URL, loadConfig() returns null at src/core/config.ts:135. Env-only callers still hit OpenAI/1536 gateway defaults on cold-start. Pinned by case 5 in the new test. Tests: test/init-config-first.test.ts -- 5 cases pinning the resolution chain (config-only, flag-only, both-override, empty cold-start, env-only known-limitation). Uses withEnv() + tempdir GBRAIN_HOME, no mock.module, no env mutation outside helper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…garrytan#203 fully) PR garrytan#1060 closed the flat-shape case (config has `embedding_model` / `embedding_dimensions` but init ignored them). This commit closes the ORIGINAL reproducer in jamebobob's issue body: the v0.10.x NESTED shape `{embedding: {provider, model, dimensions, base_url}}`. v0.27 (PR garrytan#257) flattened the embedding config from nested to top-level fields. No migration was written. Users on v0.10.x configs silently fall through to gateway defaults (OpenAI 1536) on every command because no code path reads the nested shape. PR garrytan#1060's seed step only matched flat fields, so it didn't help users still carrying the original shape. This commit adds a back-compat read in `loadConfig()` that maps the nested shape to flat fields IN MEMORY before the env-merge step: - `embedding.{provider, model}` → `embedding_model: "provider:model"` - `embedding.dimensions` → `embedding_dimensions` - `embedding.{base_url, provider}` → `provider_base_urls[provider] = base_url` Flat fields win when both shapes are present (case 7). The next `saveConfig` serializes only declared flat fields, so the nested object drops out on save. Tests: 2 new cases in test/init-config-first.test.ts (case 6: jamebobob reproducer, case 7: flat-wins-over-nested). Suite now 9 pass / 0 fail. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vincedk-alt · 2026-05-16T06:50:08Z

Follow-up commit: v0.10.x nested-shape coverage (close #203 fully)

The original commit closed the case for users who already have flat-field configs (embedding_model / embedding_dimensions / provider_base_urls at the top level). After an audit of the issue body, that left a gap: jamebobob's actual repro is the v0.10.x nested shape ({embedding: {provider, model, dimensions, base_url}}).

PR #257 flattened the config in v0.27 but no migration was written. Anyone carrying a config.json from before that release silently falls through to gateway defaults (OpenAI 1536) on every command because no code path reads the nested shape.

New commit 1c6fdb4 adds a back-compat read in loadConfig() that maps the nested shape to flat fields in memory before the env-merge step:

embedding.{provider, model} → embedding_model: "provider:model"
embedding.dimensions → embedding_dimensions
embedding.{provider, base_url} → provider_base_urls[provider] = base_url

The next saveConfig serializes only declared flat fields, so the nested object drops out on save. Flat fields win when both shapes are present (defensive against partial manual edits).

Two new test cases in test/init-config-first.test.ts:

case 6: jamebobob's exact repro from the issue body → maps to ollama:bge-m3 / 1024 / base_url override
case 7: flat-wins-over-nested when both present

Full suite now 9 pass / 0 fail. Typecheck clean. bun run verify clean.

This closes #203 for both shapes — users on the v0.10.x nested config AND users who've already manually re-saved to the flat shape.

vincedk-alt mentioned this pull request May 16, 2026

init: --provider flags required even when config.json has persisted embedding settings (DX) #203

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(init): seed resolveAIOptions from loadConfig() (closes #203)#1060

fix(init): seed resolveAIOptions from loadConfig() (closes #203)#1060
vincedk-alt wants to merge 2 commits into
garrytan:masterfrom
vincedk-alt:fix/init-config-first

vincedk-alt commented May 16, 2026

Uh oh!

vincedk-alt commented May 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

vincedk-alt commented May 16, 2026

Summary

Why

What changed

Tests

Verification

Reviewer notes

Uh oh!

vincedk-alt commented May 16, 2026

Follow-up commit: v0.10.x nested-shape coverage (close #203 fully)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant