v0.37.10.0 feat(init): env-detection + interactive picker + preflight invariants by garrytan · Pull Request #1278 · garrytan/gbrain

garrytan · 2026-05-21T19:31:36Z

Summary

Fixes the v0.36 bug class where a fresh gbrain init --pglite silently produced a broken brain when the ZeroEntropy default didn't match the user's actual API keys.

The headline story. A user on WSL ran bun install && gbrain init --pglite && gbrain import …, hit expected 1536 dimensions, not 1280, and recovered with five commands including rm -rf ~/.gbrain and three config keys that gbrain silently ignored (embedding.provider, embedding.model, embedding.dimensions). This PR closes every defect that bug surfaced.

What ships:

D1+D2 env-detection auto-pick + interactive picker (src/commands/init.ts, src/commands/init-provider-picker.ts). gbrain init --pglite now peeks at process.env, auto-picks when one provider is ready, fires a picker when multiple are ready. Local-only providers (Ollama, llama-server) excluded from auto-pick to prevent silent routing to daemons that may not be running.
D3 fail-loud non-TTY no-key path + D13 typo detection — CI/Docker installs with zero provider keys exit 1 with a paste-ready setup hint. OPENAPI_API_KEY surfaces "did you mean OPENAI_API_KEY?" via Levenshtein.
D5 atomic config persistence — init always writes the resolved provider/model/dim tuple to ~/.gbrain/config.json so subsequent runs are deterministic across releases. Pre-fix it only persisted when flags were passed.
D7 subagent-Anthropic caveat surfaces at three places (picker UI, post-init stderr, doctor subagent_provider check) when chat_model is non-Anthropic and ANTHROPIC_API_KEY is missing.
D6 strict config set rejects unknown keys with Levenshtein suggestion, --force escape hatch with stderr WARN.
D9 --no-embedding opt-in for deferred-setup mode; embed/import callsites refuse cleanly when the sentinel is set.
D11 preflight resolveSchemaEmbeddingDim runs BEFORE engine.initSchema() — invalid dim refuses without writing anything to disk. D12 extends to multimodal.
Doctor extensions — embedding_provider check now catches the v0.36 silent-default repair case with empty-brain vs non-empty branching (drop-and-re-init vs retrieval-upgrade). Never recommends rm -rf. subagent_provider check (v0.31.12) extended per D7.
Empty-brain brain_score = 100/100 — a fresh brain with no pages now scores full marks (vacuous truth: no coverage problem to penalize). Pre-fix it returned 0/100, which was structurally surprising on first gbrain doctor run.
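The D11 contract (validate the dimension first, touch the database second) can be sketched roughly as below. Only the names `resolveSchemaEmbeddingDim` and `PGVECTOR_COLUMN_MAX_DIMS` come from this PR; the `RecipeSketch` shape, the allow-list handling, and `initWithPreflight` are hypothetical stand-ins for illustration, not the repo's actual code.

```typescript
// Sketch only: shapes and signatures here are assumptions, not gbrain's real API.
const PGVECTOR_COLUMN_MAX_DIMS = 16000; // value quoted in this PR

interface RecipeSketch {
  id: string;                        // "<provider>:<model>"
  defaultDims: number;               // dimension the model emits by default
  allowedDims?: readonly number[];   // extra dims the model can be asked for
}

// Pure function: returns the dim to put in the schema, or throws.
function resolveSchemaEmbeddingDim(recipe: RecipeSketch, requested?: number): number {
  const dim = requested ?? recipe.defaultDims;
  if (!Number.isInteger(dim) || dim <= 0 || dim > PGVECTOR_COLUMN_MAX_DIMS) {
    throw new Error(`invalid embedding dimension ${dim} for ${recipe.id}`);
  }
  const allowed = [recipe.defaultDims, ...(recipe.allowedDims ?? [])];
  if (!allowed.includes(dim)) {
    throw new Error(`${recipe.id} cannot produce ${dim} dimensions; allowed: ${allowed.join(", ")}`);
  }
  return dim;
}

// Hypothetical init wrapper: the schema callback only runs after the preflight passes.
function initWithPreflight(
  recipe: RecipeSketch,
  requested: number | undefined,
  initSchema: (dim: number) => void,
): number {
  const dim = resolveSchemaEmbeddingDim(recipe, requested); // throws before any write
  initSchema(dim);
  return dim;
}
```

Because the check is a pure function that runs before the schema callback, a refused dimension leaves nothing on disk to clean up.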

Decision trail. 14 architectural decisions captured (D1-D14) including 4 codex outside-voice cross-model tensions resolved. Plan persisted at ~/.claude/plans/system-instruction-you-are-working-enumerated-mccarthy.md.

Test Coverage

NEW / MODIFIED CODE PATHS — 26 paths total, 100% covered (target: 80%)

[+] src/core/levenshtein.ts                          [★★★ 18 cases]
[+] src/core/embedding-dim-check.ts (extended)       [★★★ 23 cases]
[+] src/commands/providers.ts (extracted helper)     [★★★ 10 cases]
[+] src/commands/init-provider-picker.ts             [★★★ 7 cases]
[+] src/commands/init.ts:resolveAIOptions            [★★★ 21 cases]
[+] src/commands/init.ts:initPGLite (preflight)      [★★★ E2E 14 cases]
[+] src/commands/config.ts (strict set)              [★★★ 19 cases]
[+] src/commands/embed.ts (refuse on deferred)       [★★  unit]
[+] src/commands/import.ts (refuse on deferred)      [★★  unit + E2E]
[+] src/commands/doctor.ts (T9 + T10 extensions)     [★★  unit]
[+] src/commands/reindex-multimodal.ts (preflight)   [★★  unit]
[+] src/core/{pglite,postgres}-engine.ts (empty=100) [★★★ updated breakdown test]

COVERAGE: 100% (target: 80%)  |  Tests: before=580 → after=659 (+79 unit + 14 E2E)
QUALITY: ★★★ across all new paths
REGRESSION RULE: 3 cases (bug-reporter's three no-op config keys) — IRON RULE applied

Pre-Landing Review

Cleared via /plan-eng-review end-to-end:

11 architectural decisions locked (D1-D11)
15 codex outside-voice findings (7 baked into plan, 4 raised as tensions D9-D11, 4 deferred as TODOs)
0 critical gaps remaining
PR Quality Score: 9.5/10

Plan Completion

All 15 implementation tasks (T1-T15) DONE. Plan items NOT DONE: 0. UNVERIFIABLE: 0.

TODOS

Closed: v0.32.x "interactive provider chooser in gbrain init" — SUPERSEDED by src/commands/init-provider-picker.ts. Closed: P0 doctorReportRemote brain_score test flake (resolved by empty-brain-100 fix).
Filed: v0.37+ dedicated migration for v0.36 broken installs (telemetry-gated); namespaced ext fields for config set; runtime config-key audit; value-level Levenshtein on config set.

Documentation

Doc updates for v0.37.10.0:

README.md — Troubleshooting section with one-paragraph repair hint and links to provider docs.
docs/integrations/embedding-providers.md — TL;DR table refreshed; new "Init resolves your provider from env keys" section; "If first import fails" troubleshooting block.
docs/operations/headless-install.md (NEW) — Docker/CI sequencing guide covering both build-time-key and runtime-key (--no-embedding) patterns.
CLAUDE.md — Annotated both engine entries with the v0.37.10.0 empty-brain 100/100 contract.
TODOS.md — closed v0.32 picker entry as SUPERSEDED; closed P0 doctor-report test; added 4 follow-up entries.
llms.txt + llms-full.txt — regenerated per CLAUDE.md auto-derived rule.

Test plan

bun run verify clean (typecheck + 6 pre-checks)
bun run test — 8342 pass / 0 fail (was 1 pre-existing flake, now fixed)
T12 E2E suite (test/e2e/init-fresh-pglite.test.ts) — 14/14 pass against subprocess-driven happy-path, fail-loud, D6 regression, D9 deferred-setup, D11 preflight, explicit-flag-wins paths
Trio agreement: VERSION=0.37.10.0, package.json=0.37.10.0, CHANGELOG top=[0.37.10.0]
Bisectable history: 9 feature commits + 1 merge commit + 1 brain_score fix
Bug-reporter's exact repro now succeeds end-to-end with no manual config

🤖 Generated with Claude Code

Foundation for v0.37.10.0 env-detection wave. Two pure modules:

- src/core/levenshtein.ts: editDistance(a,b) + suggestNearest(input, candidates, maxDistance). Used by config-set "did you mean" suggestions and env-var typo detection at init.
- src/core/embedding-dim-check.ts: resolveSchemaEmbeddingDim() + resolveSchemaMultimodalDim() pure functions. Validate resolved dim against recipe default_dims + per-provider Matryoshka allow-lists (OpenAI text-3, Voyage flexible-dim, ZeroEntropy zembed-1) BEFORE any DB write. Plus EmbeddingDisabledError + assertEmbeddingEnabled() runtime guard for the deferred-setup path (D9). New PGVECTOR_COLUMN_MAX_DIMS=16000 exported.

Tests: 41 unit cases across both modules.
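A minimal sketch of the two Levenshtein helpers named above. The function names and parameter order follow the commit message; the bodies, the return type of `suggestNearest`, and the default `maxDistance` of 2 are assumptions, not the contents of src/core/levenshtein.ts.

```typescript
// Sketch only: names from the commit message, implementation assumed.
function editDistance(a: string, b: string): number {
  // Two-row dynamic programming: prev[j] is the distance between
  // the first i-1 chars of a and the first j chars of b.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // delete a[i-1]
        curr[j - 1] + 1,    // insert b[j-1]
        prev[j - 1] + cost, // substitute (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the closest candidate within maxDistance edits, or null if none qualifies.
function suggestNearest(
  input: string,
  candidates: readonly string[],
  maxDistance = 2, // assumed default
): string | null {
  let best: string | null = null;
  let bestDistance = maxDistance + 1;
  for (const candidate of candidates) {
    const distance = editDistance(input, candidate);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

// The typo from this PR is one deletion away from the real key.
const typoHint = suggestNearest("OPENAPI_API_KEY", ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]);
```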

Two changes prepping the env-detection wave:

- providers.ts: extract formatRecipeTable() helper from runList(). Picker reuses it so UI can't drift from `gbrain providers list`. Also adds the codex finding #10 warn-line to `providers test` when the tested model differs from the configured default ("Note: tested X in isolation; gbrain's configured embedding is Y — this test does NOT verify your brain's active path."). envReady() takes an explicit env arg for testing.
- init-provider-picker.ts (NEW): interactive picker mirroring init-mode-picker.ts. Filters candidate recipes to env-ready ones (codex finding #3), prompts via readLineSafe, exports printSubagentAnthropicCaveat() for shared use from initPGLite/initPostgres.

Tests: 17 unit cases (10 providers + 7 picker).
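Passing the environment explicitly is what makes the readiness check testable without mutating `process.env`. A rough sketch of that idea follows; `envReady` is named in the commit message, but the `PickerRecipeSketch` shape, the `requiredEnv` field, and `pickerCandidates` are hypothetical.

```typescript
// Sketch only: the recipe shape and helper names below are assumptions.
type Env = Readonly<Record<string, string | undefined>>;

interface PickerRecipeSketch {
  id: string;                       // "<provider>:<model>"
  requiredEnv: readonly string[];   // env vars that must be set and non-blank
}

// A recipe is ready when every required variable is present and non-empty.
function envReady(recipe: PickerRecipeSketch, env: Env): boolean {
  return recipe.requiredEnv.every((name) => (env[name] ?? "").trim() !== "");
}

// The picker only offers recipes the current environment can actually drive.
function pickerCandidates(
  recipes: readonly PickerRecipeSketch[],
  env: Env,
): PickerRecipeSketch[] {
  return recipes.filter((recipe) => envReady(recipe, env));
}
```

In production code the caller would pass `process.env`; in tests a plain object literal is enough.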

Two changes for the v0.37.10.0 wave:

- src/core/config.ts: add embedding_disabled?:boolean to GBrainConfig (D9 deferred-setup sentinel, mutually exclusive with embedding_model). Export KNOWN_CONFIG_KEYS (60+ canonical keys, file-plane + DB-plane) and KNOWN_CONFIG_KEY_PREFIXES (search., models., dream., cycle., etc.) for validation use.
- src/commands/config.ts: D6 strict-default unknown-key rejection. Unknown key + no --force → exit 1 with Levenshtein suggestion against KNOWN_CONFIG_KEYS. Prefix matches accepted without --force. --force escape hatch accepts arbitrary keys with stderr WARN. Closes the silent-no-op class the bug reporter hit (embedding.provider, embedding.model, embedding.dimensions all exit 1 with right suggestion).

Tests: 19 unit cases pinning the bug-reporter regression + gate logic.
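The gate order described above (known key, then known prefix, then `--force`, then reject with a suggestion) might look roughly like this. The key and prefix lists are a tiny illustrative subset of names that appear in this PR, and `checkConfigKey`, `SetDecision`, and the injected `suggest` callback are hypothetical, not the code in src/commands/config.ts.

```typescript
// Sketch only: a four-key subset stands in for the real 60+ key list.
const KNOWN_KEYS_SKETCH: readonly string[] = [
  "embedding_model",
  "embedding_dimensions",
  "expansion_model",
  "chat_model",
];
const KNOWN_PREFIXES_SKETCH: readonly string[] = ["search.", "models.", "dream.", "cycle."];

type SetDecision =
  | { ok: true; warning?: string }   // proceed, optionally with a stderr WARN
  | { ok: false; error: string };    // caller would print this and exit 1

function checkConfigKey(
  key: string,
  force: boolean,
  suggest: (key: string) => string | null, // e.g. a Levenshtein nearest-match helper
): SetDecision {
  if (KNOWN_KEYS_SKETCH.includes(key)) return { ok: true };
  if (KNOWN_PREFIXES_SKETCH.some((prefix) => key.startsWith(prefix))) return { ok: true };
  if (force) {
    return { ok: true, warning: `WARN: setting unknown config key "${key}" because --force was passed` };
  }
  const hint = suggest(key);
  const error = hint
    ? `Unknown config key "${key}". Did you mean "${hint}"?`
    : `Unknown config key "${key}".`;
  return { ok: false, error };
}
```

Rejecting by default is the point of D6: a mistyped key now fails at the command line instead of being stored and ignored.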

…no-embedding

Core of the v0.37.10.0 wave (D1-D7, D9-D11). Closes the bug where a fresh `gbrain init --pglite` silently produced a broken brain when no provider key matched the v0.36 default.

resolveAIOptions rewritten with per-touchpoint env detection:

- Explicit flag → shorthand → env auto-pick (group by provider id, codex #2)
- Picker fires when multiple providers env-ready (D1+D2 hybrid)
- Non-TTY zero-key exits 1 with paste-ready setup hint (D3) + Levenshtein typo detection for OPENAPI_API_KEY → OPENAI_API_KEY (D13)
- All three touchpoints covered (embedding + expansion + chat, D4)
- Local-only providers (Ollama/llama-server) excluded from auto-pick; picking Ollama silently when user has OPENAI_API_KEY set was wrong UX

initPGLite + initPostgres:

- Drop conditional configureGateway gate → always call before initSchema
- Preflight resolveSchemaEmbeddingDim() BEFORE engine.initSchema() (D11) — invalid dim refuses with paste-ready hint, no disk write
- Atomic embedding-config persistence (codex #13): either resolved tuple or embedding_disabled:true sentinel, never partial state
- Post-initSchema invariant assertion stays as regression guardrail
- --no-embedding opt-in flag (D9) for deferred-setup mode
- Subagent-Anthropic caveat (D7) fires post-init when chat_model is non-Anthropic AND ANTHROPIC_API_KEY missing

Exported groupReadyByProvider() + findEnvKeyTypos() for unit testing.

Tests: 21 unit cases covering provider grouping + typo detection edge cases.
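The resolution order for one touchpoint can be sketched as a small decision function. `groupReadyByProvider` is named in the commit message, but its signature here is a guess, and `ReadyRecipe`, `Decision`, and `decideEmbeddingModel` are hypothetical. The sketch does not model the shorthand step or TTY detection: the source only specifies the zero-key outcome for non-TTY runs, so "fail-loud" below stands for that case.

```typescript
// Sketch only: types and the decision function are assumptions for illustration.
interface ReadyRecipe {
  id: string;         // "<provider>:<model>", already known to be env-ready
  localOnly: boolean; // true for daemons such as Ollama or llama-server
}

type Decision =
  | { kind: "explicit"; model: string }
  | { kind: "auto"; model: string }
  | { kind: "picker"; providers: string[] }
  | { kind: "fail-loud" };

// Several recipes can share a provider, so readiness is counted per provider id.
function groupReadyByProvider(ready: readonly ReadyRecipe[]): Map<string, ReadyRecipe[]> {
  const groups = new Map<string, ReadyRecipe[]>();
  for (const recipe of ready) {
    const provider = recipe.id.split(":")[0];
    const bucket = groups.get(provider) ?? [];
    bucket.push(recipe);
    groups.set(provider, bucket);
  }
  return groups;
}

function decideEmbeddingModel(
  explicitModel: string | undefined,
  ready: readonly ReadyRecipe[],
): Decision {
  // 1. An explicit flag always wins.
  if (explicitModel !== undefined) return { kind: "explicit", model: explicitModel };
  // 2. Local-only providers never take part in auto-pick.
  const groups = groupReadyByProvider(ready.filter((recipe) => !recipe.localOnly));
  const providers = Array.from(groups.keys());
  if (providers.length === 0) return { kind: "fail-loud" };
  if (providers.length === 1) {
    const recipes = groups.get(providers[0]) ?? [];
    return { kind: "auto", model: recipes[0].id }; // first recipe as default is assumed
  }
  // 3. More than one ready provider: ask the user.
  return { kind: "picker", providers };
}
```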

… is active

T7 of the v0.37.10.0 wave. Both runEmbedCore and runImport now call assertEmbeddingEnabled(loadConfig()) at entry. When the brain was init'd with --no-embedding (config has embedding_disabled:true), they exit 1 with a paste-ready hint:

    gbrain config set embedding_model <provider>:<model>
    gbrain config set embedding_dimensions <N>
    gbrain init --force --embedding-model <provider>:<model>

`gbrain import --no-embed` flag still works (chunks land without vectors), so users can still ingest in deferred-setup mode and backfill embeddings later with `gbrain embed --stale`.
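A minimal sketch of the entry guard, assuming a reduced config shape. `EmbeddingDisabledError`, `assertEmbeddingEnabled`, and the three hint commands come from this PR; `ConfigSketch`, `runImportSketch`, and the way `--no-embed` skips the guard are assumptions about how the pieces fit together.

```typescript
// Sketch only: a reduced config shape, not the real GBrainConfig.
interface ConfigSketch {
  embedding_model?: string;
  embedding_dimensions?: number;
  embedding_disabled?: boolean; // D9 sentinel written by `gbrain init --no-embedding`
}

class EmbeddingDisabledError extends Error {
  constructor() {
    super(
      [
        "Embedding is disabled for this brain. To enable it:",
        "  gbrain config set embedding_model <provider>:<model>",
        "  gbrain config set embedding_dimensions <N>",
        "  gbrain init --force --embedding-model <provider>:<model>",
      ].join("\n"),
    );
    this.name = "EmbeddingDisabledError";
  }
}

// Throws when the deferred-setup sentinel is present; otherwise a no-op.
function assertEmbeddingEnabled(config: ConfigSketch | null): void {
  if (config?.embedding_disabled === true) throw new EmbeddingDisabledError();
}

// Hypothetical import entry point: the guard is skipped when the caller
// opts out of embedding, so chunks can still land without vectors.
function runImportSketch(config: ConfigSketch | null, noEmbed: boolean): string {
  if (!noEmbed) assertEmbeddingEnabled(config);
  return noEmbed ? "chunks stored without vectors" : "chunks stored and embedded";
}
```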

…t extension

Two doctor check extensions for v0.37.10.0:

T9 — embedding_provider check extended for the v0.36 silent-default repair case. When config is empty AND schema column dim differs from the gateway-resolved default, surface the mismatch with empty-brain vs non-empty-brain repair branching (codex finding #7 nuance):

- Empty brain (0 embedded chunks) → `gbrain init --force --pglite --embedding-model <id> --embedding-dimensions <N>` (drop and re-init)
- Non-empty brain → `gbrain retrieval-upgrade --to <id> --reindex`

Gated on totalChunks > 0 so pristine empty brains aren't pre-warned. Never recommend rm -rf ~/.gbrain.

T10 — subagent_provider check (v0.31.12) extended per D7. When chat_model is non-Anthropic AND ANTHROPIC_API_KEY is missing, warn that subagent features (gbrain dream, gbrain agent run, gbrain autopilot) will fail at job submission. Chat alone (gbrain think) still works.
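The T9 branching reduces to a few guards followed by a two-way choice. The sketch below mirrors the conditions in the commit message; the `DimMismatchInput` shape and the function name `embeddingRepairHint` are hypothetical, and only the two command templates are taken from the source.

```typescript
// Sketch only: the input shape is an assumption about what the doctor check can see.
interface DimMismatchInput {
  configIsEmpty: boolean;     // no embedding settings persisted
  schemaDim: number;          // width of the existing vector column
  resolvedDefaultDim: number; // what the gateway would resolve today
  totalChunks: number;
  embeddedChunks: number;
  targetModel: string;        // "<provider>:<model>" to recommend
  targetDim: number;
}

// Returns a paste-ready repair command, or null when there is nothing to flag.
function embeddingRepairHint(input: DimMismatchInput): string | null {
  if (!input.configIsEmpty) return null;
  if (input.schemaDim === input.resolvedDefaultDim) return null;
  if (input.totalChunks === 0) return null; // pristine brain: do not pre-warn
  if (input.embeddedChunks === 0) {
    // Nothing embedded yet, so dropping and re-initialising loses no vectors.
    return `gbrain init --force --pglite --embedding-model ${input.targetModel} --embedding-dimensions ${input.targetDim}`;
  }
  // Existing vectors must be migrated rather than dropped.
  return `gbrain retrieval-upgrade --to ${input.targetModel} --reindex`;
}
```

Neither branch ever suggests deleting ~/.gbrain, which matches the "never recommend rm -rf" rule stated above.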

… init

T11 — reindex-multimodal.ts: hook resolveSchemaMultimodalDim() preflight BEFORE the reindex sweep. Mirrors the text-side contract from initPGLite — if the configured multimodal model can't produce a dim matching the schema column, fail loud here with a `gbrain config set` hint rather than mid-reindex with a vector(N) INSERT error.

T12 — test/e2e/init-fresh-pglite.test.ts (NEW, 14 cases): subprocess-driven E2E verification of the bug-reporter's repro scenarios:

- Happy path: OPENAI_API_KEY set → auto-pick OpenAI, persists config
- D3 non-TTY fail-loud (with and without env-key typos)
- D6 regression: bug-reporter's three no-op config keys all exit 1 with Levenshtein suggestions
- D9 deferred-setup mode + gbrain import refusal (and --no-embed bypass)
- D11 preflight refuses BEFORE any disk write
- Explicit --embedding-model wins over env detection

Each test uses its own throw-away GBRAIN_HOME for hermetic runs.

T13/T14 docs sync for v0.37.10.0:

- docs/integrations/embedding-providers.md: TL;DR table refreshed to reflect ZE as v0.36 default; added "Init resolves your provider from env keys" section explaining the auto-pick → picker → fail-loud chain; added "If first import fails" troubleshooting block pointing at gbrain doctor instead of `rm -rf ~/.gbrain`.
- docs/operations/headless-install.md (NEW): Docker/CI sequencing guide. Two acceptable patterns — provider key at build time (Pattern 1) or --no-embedding opt-in + runtime config (Pattern 2). Codex finding #11.
- README.md: Troubleshooting section with one-paragraph repair hint and links to embedding-providers.md + headless-install.md.
- TODOS.md: closed v0.32.x "interactive provider chooser" entry as SUPERSEDED by this wave. Added four follow-up entries (dedicated v0.36 broken-install migration, namespaced ext fields, runtime config-key audit, value-level Levenshtein on config set).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

# Conflicts:
#	CHANGELOG.md
#	VERSION
#	docs/integrations/embedding-providers.md
#	package.json

…te test

Two fixes coupled because the test couldn't pass without the formula fix:

src/core/pglite-engine.ts + src/core/postgres-engine.ts — empty brain (pageCount === 0) now gets FULL marks (100/100), not 0/100. Semantically an empty brain has no coverage problem to penalize — there's nothing to embed, nothing to link, nothing to orphan. Vacuous truth applies. The pre-fix "empty = 0" caused fresh-init brains to score as critically unhealthy on `gbrain doctor`, which was a structural surprise to users who'd just run init successfully. Same fix on both engines.

test/brain-score-breakdown.test.ts — updated the "empty brain" assertion to match the new contract (was: 0/0/0/0/0/0; is: 100/35/25/15/15/10).

test/doctor-report-remote.test.ts → renamed to .serial.test.ts and made hermetic. The pre-fix test pulled audit data from the host ~/.gbrain (reranker_health, sync_failures, etc.), which made the assertion non-deterministic depending on whoever ran the suite. Now isolates GBRAIN_HOME to a tempdir via beforeAll/afterAll; env mutation requires serial-quarantine per scripts/check-test-isolation.sh R1.

Closes the master-state flake that was failing on every `bun run test` run regardless of my branch contents.
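The empty-brain contract can be illustrated with a short-circuit on `pageCount`. The 35/25/15/15/10 weights are the breakdown quoted in this commit; what each component measures and how a non-empty brain is scored are not stated here, so the ratio-based formula below is purely an assumption, as is the name `brainScoreSketch`.

```typescript
// Sketch only: the weights come from the commit message, the rest is assumed.
const SCORE_WEIGHTS: readonly number[] = [35, 25, 15, 15, 10];

interface BrainScoreSketch {
  total: number;
  parts: number[];
}

// ratios[i] is a 0..1 health ratio for component i (hypothetical inputs).
function brainScoreSketch(pageCount: number, ratios: readonly number[]): BrainScoreSketch {
  const parts = SCORE_WEIGHTS.map((weight, i) => {
    // Vacuous truth: with no pages there is nothing to penalize, so full marks.
    if (pageCount === 0) return weight;
    const ratio = Math.min(1, Math.max(0, ratios[i] ?? 0));
    return Math.round(weight * ratio);
  });
  const total = parts.reduce((sum, part) => sum + part, 0);
  return { total, parts };
}

// A fresh brain scores 100 with the full per-component breakdown.
const freshBrain = brainScoreSketch(0, []);
```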

- CLAUDE.md: annotate src/core/pglite-engine.ts + src/core/postgres-engine.ts entries with v0.37.10.0 empty-brain 100/100 contract. Vacuous truth: an empty brain has no coverage to penalize, so getBrainScore returns full marks (35/25/15/15/10 breakdown) when pageCount === 0. Pre-fix 0/100 was structurally surprising on fresh init and caused the v0.37.8.0 doctor-report-remote.test.ts flake.
- TODOS.md: mark P0 doctor-report-remote.test.ts:65 TODO completed (resolved by commit 9aa571f's empty-brain-100/100 fix; test renamed to .serial.test.ts and made hermetic per scripts/check-test-isolation.sh R1).
- llms-full.txt: regenerated from updated CLAUDE.md per CLAUDE.md "Auto-derived files" rule.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Two coupled fixes for the v0.37.10.0 wave's interaction with CI's Tier-1 mechanical E2E suite (which runs without any embedding-provider env var).

src/commands/init.ts — Honor D5 properly at resolveAIOptions entry. Pre-fix the env-detection branch fired on EVERY init regardless of persisted config. A non-TTY re-init with no env keys exited 1 (D3 fail-loud) even when ~/.gbrain/config.json already had embedding_model set from a prior successful init. Now resolveAIOptions reads loadConfig() first and seeds out.embedding_model / embedding_dimensions / expansion_model / chat_model from the file plane BEFORE running env detection. Also honors embedding_disabled (D9 sentinel) on re-init so deferred-setup brains don't re-trigger fail-loud.

test/e2e/mechanical.test.ts:722 — Setup Journey's first init runs against a fresh DB with no persisted config. Pass --embedding-model explicitly (openai:text-embedding-3-large) so the preflight resolves offline. After this init writes config, subsequent inits in the file (RLS self-heal v24, RLS event-trigger probes, etc.) honor the persisted config via the D5 fix above.

Verified locally: full test/e2e/mechanical.test.ts → 78 pass / 0 fail.
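The re-init fix amounts to seeding from the persisted file before any env detection runs. A rough sketch follows, with hypothetical names (`PersistedSketch`, `SeedResult`, `seedFromPersisted`). That an explicit flag overrides the persisted model and clears the sentinel is an assumption; the source only states that explicit flags win over env detection.

```typescript
// Sketch only: a reduced view of what ~/.gbrain/config.json might hold.
interface PersistedSketch {
  embedding_model?: string;
  embedding_dimensions?: number;
  expansion_model?: string;
  chat_model?: string;
  embedding_disabled?: boolean;
}

interface SeedResult {
  options: PersistedSketch;
  needsEnvDetection: boolean; // false when the file plane already settles embedding
}

function seedFromPersisted(
  persisted: PersistedSketch | null,
  explicitEmbeddingModel?: string,
): SeedResult {
  // Start from whatever a previous init wrote.
  const options: PersistedSketch = { ...(persisted ?? {}) };
  if (explicitEmbeddingModel !== undefined) {
    // Assumed precedence: an explicit flag replaces the persisted model,
    // drops the stale dimension, and clears the deferred-setup sentinel.
    options.embedding_model = explicitEmbeddingModel;
    delete options.embedding_dimensions;
    delete options.embedding_disabled;
  }
  const settled =
    options.embedding_model !== undefined || options.embedding_disabled === true;
  return { options, needsEnvDetection: !settled };
}
```

With this ordering, a non-TTY re-init with no provider keys no longer reaches the fail-loud branch when a prior init already recorded a model or the `embedding_disabled` sentinel.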

Conflicts resolved:

- VERSION → 0.38.1.0 (higher semver wins; master bumped 0.37.9.0 → 0.37.10.0)
- package.json → 0.38.1.0 (trio agreement)
- CHANGELOG.md → my v0.38.1.0 entry stays on top; master's new v0.37.10.0 entry preserved directly below

Master's v0.37.10.0 brings the init env-detection + interactive picker + preflight invariants wave (#1278). No collisions with v0.38 ingestion substrate.

bun install + bun run typecheck → clean.

* upstream/master:
  v0.38.2.0 fix(doctor): bounded frontmatter scan + partial-state surfacing (supersedes garrytan#1287) (garrytan#1297)
  v0.38.1.0 feat(agents): provider-agnostic subagent loop + remote MCP dispatch + budget meter (garrytan#1289)
  v0.38.0.0 ingestion cathedral — gbrain capture + write-through + IngestionSource contract (garrytan#1275)
  v0.37.11.0: fresh-install PGLite embedding setup fix wave (garrytan#1286)
  v0.37.10.0 feat(init): env-detection + interactive picker + preflight invariants (garrytan#1278)
  v0.37.9.0 fix(frontmatter): canonical-style normalization for tag arrays (garrytan#1252)
  v0.37.8.0 feat: voyage-code-3 discoverability + reindex-code cost-preview fix (garrytan#1267)
  v0.37.7.0 fix wave: federated brains + autopilot safety + OAuth confidential clients (garrytan#1253)
  v0.37.6.0 feat(ai): OpenRouter recipe + generic default_headers seam (cherry-pick garrytan#1210) (garrytan#1246)
  v0.37.5.0 fix(markdown): YAML-aware NESTED_QUOTES validator (stops flagging valid YAML) (garrytan#1229)
  feat: pgGraph-inspired CI scaffolding wave (v0.37.4.0) (garrytan#1228)
  v0.37.3.0 feat: skill_brain_first doctor check + auto-fix + declarative opt-out (supersedes garrytan#1206) (garrytan#1215)
  v0.37.2.0: takes_resolution_consistency CHECK accepts 'unresolvable' (garrytan#1211)
  v0.37.1.0 feat: brainstorm + lsd — bisociation idea generator grounded in your own brain (garrytan#1214)
  v0.37.0.0 feat(skillpack): registry cathedral — third-party publish + install + 10/10 quality bar (garrytan#1208)
  v0.36.6.0 feat: cross-modal search wave (text↔image + unified column + LLM intent) (garrytan#1165)

garrytan and others added 13 commits May 21, 2026 11:19

chore: bump version and changelog (v0.37.10.0)

c13ab12

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Merge remote-tracking branch 'origin/master' into garrytan/milan-v2

8d71ef8

# Conflicts:
#	CHANGELOG.md
#	VERSION
#	docs/integrations/embedding-providers.md
#	package.json

garrytan merged commit a55de71 into master May 22, 2026
8 checks passed

alexzhu0 mentioned this pull request May 23, 2026

Add local Ollama embedding setup for OpenAI-compatible endpoint #112

Closed

jennyruan mentioned this pull request May 27, 2026

fix(heavy-tests): unbreak frontmatter_scan_wallclock (Heavy Tests red since 2026-05-23) #1577

Closed

1 task

felipeyanez mentioned this pull request Jun 2, 2026

fix: allow user-provided embedding dimensions #1766

Closed

5 tasks

garrytan merged 13 commits into master from garrytan/milan-v2
