v0.37.10.0 feat(init): env-detection + interactive picker + preflight invariants #1278
Merged
Foundation for v0.37.10.0 env-detection wave. Two pure modules:

- src/core/levenshtein.ts: editDistance(a,b) + suggestNearest(input, candidates, maxDistance). Used by config-set "did you mean" suggestions and env-var typo detection at init.
- src/core/embedding-dim-check.ts: resolveSchemaEmbeddingDim() + resolveSchemaMultimodalDim() pure functions. Validate resolved dim against recipe default_dims + per-provider Matryoshka allow-lists (OpenAI text-3, Voyage flexible-dim, ZeroEntropy zembed-1) BEFORE any DB write. Plus EmbeddingDisabledError + assertEmbeddingEnabled() runtime guard for the deferred-setup path (D9). New PGVECTOR_COLUMN_MAX_DIMS=16000 exported.

Tests: 41 unit cases across both modules.
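As a rough illustration of the first module, the following is a minimal sketch of an edit-distance function and a nearest-candidate helper. The names and parameter order follow the commit message; the bodies are an assumption (a textbook two-row Levenshtein implementation), not the code in src/core/levenshtein.ts. Tie-breaking (first candidate wins) is also an assumption.

```typescript
// Sketch only: a standard two-row dynamic-programming Levenshtein distance.
function editDistance(a: string, b: string): number {
  if (a === b) return 0;
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;

  // prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost, // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the closest candidate within maxDistance, or null when none qualifies.
// Assumption: on ties, the earliest candidate in the list wins.
function suggestNearest(
  input: string,
  candidates: readonly string[],
  maxDistance: number,
): string | null {
  let best: string | null = null;
  let bestDistance = maxDistance + 1;
  for (const candidate of candidates) {
    const distance = editDistance(input, candidate);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}
```

Under this sketch, `suggestNearest("OPENAPI_API_KEY", ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"], 2)` returns `"OPENAI_API_KEY"`, since the two strings differ by a single deleted character.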
Two changes prepping the env-detection wave:

- providers.ts: extract formatRecipeTable() helper from runList(). Picker reuses it so UI can't drift from `gbrain providers list`. Also adds the codex finding #10 warn-line to `providers test` when the tested model differs from the configured default ("Note: tested X in isolation; gbrain's configured embedding is Y — this test does NOT verify your brain's active path."). envReady() takes an explicit env arg for testing.
- init-provider-picker.ts (NEW): interactive picker mirroring init-mode-picker.ts. Filters candidate recipes to env-ready ones (codex finding #3), prompts via readLineSafe, exports printSubagentAnthropicCaveat() for shared use from initPGLite/initPostgres.

Tests: 17 unit cases (10 providers + 7 picker).
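The value of passing env explicitly is that readiness checks become pure functions that tests can drive with a plain object. A minimal sketch, assuming a hypothetical `Recipe` shape with a `requiredEnv` list and an "all keys must be non-empty" rule; the real recipe type and readiness rule in providers.ts may differ, and `filterEnvReady` and `EXAMPLE_API_KEY` are illustrative names only.

```typescript
// Hypothetical recipe shape for illustration; not the real type from providers.ts.
interface Recipe {
  id: string;
  requiredEnv: string[]; // env var names that must be set for this recipe
}

type Env = Record<string, string | undefined>;

// Assumption: a recipe is ready when every required variable is set and non-blank.
function envReady(recipe: Recipe, env: Env): boolean {
  return recipe.requiredEnv.every((key) => (env[key] ?? "").trim() !== "");
}

// The picker would only offer recipes that pass the readiness check.
function filterEnvReady(recipes: readonly Recipe[], env: Env): Recipe[] {
  return recipes.filter((recipe) => envReady(recipe, env));
}

// Usage: tests pass a literal object instead of mutating the real environment.
const exampleRecipes: Recipe[] = [
  { id: "openai", requiredEnv: ["OPENAI_API_KEY"] },
  { id: "example", requiredEnv: ["EXAMPLE_API_KEY"] }, // made-up provider
];
const readyIds = filterEnvReady(exampleRecipes, { OPENAI_API_KEY: "sk-test" }).map((r) => r.id);
```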
Two changes for the v0.37.10.0 wave:

- src/core/config.ts: add embedding_disabled?:boolean to GBrainConfig (D9 deferred-setup sentinel, mutually exclusive with embedding_model). Export KNOWN_CONFIG_KEYS (60+ canonical keys, file-plane + DB-plane) and KNOWN_CONFIG_KEY_PREFIXES (search., models., dream., cycle., etc.) for validation use.
- src/commands/config.ts: D6 strict-default unknown-key rejection. Unknown key + no --force → exit 1 with Levenshtein suggestion against KNOWN_CONFIG_KEYS. Prefix matches accepted without --force. --force escape hatch accepts arbitrary keys with stderr WARN. Closes the silent-no-op class the bug reporter hit (embedding.provider, embedding.model, embedding.dimensions all exit 1 with right suggestion).

Tests: 19 unit cases pinning the bug-reporter regression + gate logic.
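The D6 gate described above can be sketched as a pure decision function. This is an illustration under assumptions: `checkConfigKey`, `KeyCheck`, and the message wording are hypothetical, and the suggestion lookup is injected as a callback rather than tied to any particular Levenshtein implementation.

```typescript
type KeyCheck =
  | { ok: true; warning?: string }
  | { ok: false; exitCode: 1; message: string };

interface KeyCheckOptions {
  knownKeys: readonly string[];
  knownPrefixes: readonly string[];
  force: boolean;
  suggest: (key: string) => string | null; // e.g. a nearest-key lookup
}

// Order of checks follows the commit message: exact key, then prefix, then --force, else reject.
function checkConfigKey(key: string, opts: KeyCheckOptions): KeyCheck {
  if (opts.knownKeys.includes(key)) return { ok: true };
  if (opts.knownPrefixes.some((prefix) => key.startsWith(prefix))) return { ok: true };
  if (opts.force) {
    // Escape hatch: accept the key but surface a warning (stderr in the real CLI).
    return { ok: true, warning: `WARN: unknown config key "${key}" accepted via --force` };
  }
  const suggestion = opts.suggest(key);
  const hint = suggestion ? ` Did you mean "${suggestion}"?` : "";
  return { ok: false, exitCode: 1, message: `Unknown config key "${key}".${hint}` };
}
```

Returning a result object instead of exiting directly keeps the gate unit-testable; the command wrapper can then map `exitCode` and `warning` onto the process.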
…no-embedding

Core of the v0.37.10.0 wave (D1-D7, D9-D11). Closes the bug where a fresh `gbrain init --pglite` silently produced a broken brain when no provider key matched the v0.36 default.

resolveAIOptions rewritten with per-touchpoint env detection:

- Explicit flag → shorthand → env auto-pick (group by provider id, codex #2)
- Picker fires when multiple providers env-ready (D1+D2 hybrid)
- Non-TTY zero-key exits 1 with paste-ready setup hint (D3) + Levenshtein typo detection for OPENAPI_API_KEY → OPENAI_API_KEY (D13)
- All three touchpoints covered (embedding + expansion + chat, D4)
- Local-only providers (Ollama/llama-server) excluded from auto-pick; picking Ollama silently when user has OPENAI_API_KEY set was wrong UX

initPGLite + initPostgres:

- Drop conditional configureGateway gate → always call before initSchema
- Preflight resolveSchemaEmbeddingDim() BEFORE engine.initSchema() (D11): invalid dim refuses with paste-ready hint, no disk write
- Atomic embedding-config persistence (codex #13): either resolved tuple or embedding_disabled:true sentinel, never partial state
- Post-initSchema invariant assertion stays as regression guardrail
- --no-embedding opt-in flag (D9) for deferred-setup mode
- Subagent-Anthropic caveat (D7) fires post-init when chat_model is non-Anthropic AND ANTHROPIC_API_KEY missing

Exported groupReadyByProvider() + findEnvKeyTypos() for unit testing.

Tests: 21 unit cases covering provider grouping + typo detection edge cases.
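The grouping step behind env auto-pick can be sketched as follows. The function name `groupReadyByProvider` comes from the commit message, but its signature, the `ModelRecipe` shape, and the `decideProvider` helper are assumptions made for illustration. The sketch covers only the one-ready / many-ready / none-ready split; explicit flags, shorthands, and TTY handling are left out.

```typescript
// Hypothetical recipe shape; the real one in the codebase may differ.
interface ModelRecipe {
  providerId: string;
  model: string;
  envKey: string | null; // null for providers that need no API key
  localOnly: boolean; // e.g. Ollama / llama-server
}

type Env = Record<string, string | undefined>;

// Group env-ready recipes by provider so two models from one provider count as one choice.
function groupReadyByProvider(
  recipes: readonly ModelRecipe[],
  env: Env,
): Map<string, ModelRecipe[]> {
  const groups = new Map<string, ModelRecipe[]>();
  for (const recipe of recipes) {
    // Local-only providers are never auto-picked, per the commit message.
    if (recipe.localOnly || recipe.envKey === null) continue;
    if (!env[recipe.envKey]) continue;
    const bucket = groups.get(recipe.providerId) ?? [];
    bucket.push(recipe);
    groups.set(recipe.providerId, bucket);
  }
  return groups;
}

type Decision =
  | { kind: "auto-pick"; providerId: string }
  | { kind: "picker"; providerIds: string[] }
  | { kind: "none" }; // non-TTY runs exit 1 here, per D3

function decideProvider(groups: Map<string, ModelRecipe[]>): Decision {
  const providerIds = Array.from(groups.keys());
  if (providerIds.length === 1) return { kind: "auto-pick", providerId: providerIds[0] };
  if (providerIds.length > 1) return { kind: "picker", providerIds };
  return { kind: "none" };
}
```

Grouping before counting matters: without it, a single provider key that unlocks two models would look like "multiple providers ready" and trigger the picker unnecessarily.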
… is active

T7 of the v0.37.10.0 wave. Both runEmbedCore and runImport now call assertEmbeddingEnabled(loadConfig()) at entry. When the brain was init'd with --no-embedding (config has embedding_disabled:true), they exit 1 with a paste-ready hint:

  gbrain config set embedding_model <provider>:<model>
  gbrain config set embedding_dimensions <N>
  gbrain init --force --embedding-model <provider>:<model>

`gbrain import --no-embed` flag still works (chunks land without vectors), so users can still ingest in deferred-setup mode and backfill embeddings later with `gbrain embed --stale`.
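A minimal sketch of such an entry guard, assuming a trimmed-down config shape. `EmbeddingDisabledError` and `assertEmbeddingEnabled` are named in the commits, but the bodies and the exact message text below are illustrative assumptions; only the three hint commands are taken from the commit message.

```typescript
// Reduced config shape for illustration; the real GBrainConfig has many more fields.
interface EmbeddingConfig {
  embedding_model?: string;
  embedding_disabled?: boolean; // D9 deferred-setup sentinel
}

class EmbeddingDisabledError extends Error {
  constructor() {
    super(
      [
        "Embedding is disabled for this brain. To enable it:",
        "  gbrain config set embedding_model <provider>:<model>",
        "  gbrain config set embedding_dimensions <N>",
        "  gbrain init --force --embedding-model <provider>:<model>",
      ].join("\n"),
    );
    this.name = "EmbeddingDisabledError";
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Call at the top of embed/import entry points; throws only when the sentinel is set.
function assertEmbeddingEnabled(config: EmbeddingConfig): void {
  if (config.embedding_disabled === true) {
    throw new EmbeddingDisabledError();
  }
}
```

A command wrapper would catch this error type, print the message, and exit 1; any path that skips embedding (such as the `--no-embed` import mentioned above) would simply not call the guard.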
…t extension

Two doctor check extensions for v0.37.10.0:

T9: embedding_provider check extended for the v0.36 silent-default repair case. When config is empty AND schema column dim differs from the gateway-resolved default, surface the mismatch with empty-brain vs non-empty-brain repair branching (codex finding #7 nuance):

- Empty brain (0 embedded chunks) → `gbrain init --force --pglite --embedding-model <id> --embedding-dimensions <N>` (drop and re-init)
- Non-empty brain → `gbrain retrieval-upgrade --to <id> --reindex`

Gated on totalChunks > 0 so pristine empty brains aren't pre-warned. Never recommend rm -rf ~/.gbrain.

T10: subagent_provider check (v0.31.12) extended per D7. When chat_model is non-Anthropic AND ANTHROPIC_API_KEY is missing, warn that subagent features (gbrain dream, gbrain agent run, gbrain autopilot) will fail at job submission. Chat alone (gbrain think) still works.
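The T9 repair branching might look roughly like the sketch below. It reflects one reading of the commit message: no warning when there are no chunks at all, drop-and-re-init when chunks exist but none are embedded, and `retrieval-upgrade` otherwise. The `BrainStats` shape and the `repairHint` function are hypothetical; only the two command templates come from the text.

```typescript
// Hypothetical inputs; the real doctor check reads these from the engine.
interface BrainStats {
  totalChunks: number;
  embeddedChunks: number;
}

interface TargetModel {
  id: string; // <provider>:<model>
  dims: number;
}

// Returns the suggested repair command, or null when no warning should be shown.
function repairHint(stats: BrainStats, target: TargetModel): string | null {
  // Pristine brains (no chunks at all) are not pre-warned.
  if (stats.totalChunks === 0) return null;
  if (stats.embeddedChunks === 0) {
    // Nothing embedded yet: dropping and re-initializing loses no vectors.
    return `gbrain init --force --pglite --embedding-model ${target.id} --embedding-dimensions ${target.dims}`;
  }
  // Existing vectors: migrate and reindex rather than dropping data.
  return `gbrain retrieval-upgrade --to ${target.id} --reindex`;
}
```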
… init

T11, reindex-multimodal.ts: hook resolveSchemaMultimodalDim() preflight BEFORE the reindex sweep. Mirrors the text-side contract from initPGLite: if the configured multimodal model can't produce a dim matching the schema column, fail loud here with a `gbrain config set` hint rather than mid-reindex with a vector(N) INSERT error.

T12, test/e2e/init-fresh-pglite.test.ts (NEW, 14 cases): subprocess-driven E2E verification of the bug-reporter's repro scenarios:

- Happy path: OPENAI_API_KEY set → auto-pick OpenAI, persists config
- D3 non-TTY fail-loud (with and without env-key typos)
- D6 regression: bug-reporter's three no-op config keys all exit 1 with Levenshtein suggestions
- D9 deferred-setup mode + gbrain import refusal (and --no-embed bypass)
- D11 preflight refuses BEFORE any disk write
- Explicit --embedding-model wins over env detection

Each test uses its own throw-away GBRAIN_HOME for hermetic runs.
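The preflight idea (validate the dimension as a pure function before touching the database) can be sketched like this. `PGVECTOR_COLUMN_MAX_DIMS` and its value come from an earlier commit in this PR; `preflightDim`, `DimRecipe`, and the example numbers are hypothetical and do not describe any real provider's allow-list.

```typescript
const PGVECTOR_COLUMN_MAX_DIMS = 16000;

// Hypothetical recipe fields: a default dimension plus an optional allow-list
// of alternative (e.g. Matryoshka-style truncated) dimensions.
interface DimRecipe {
  defaultDims: number;
  allowedDims?: readonly number[];
}

type DimResult = { ok: true; dims: number } | { ok: false; reason: string };

// Pure check: no I/O, so it can run before any schema or disk write.
function preflightDim(requested: number | undefined, recipe: DimRecipe): DimResult {
  const dims = requested ?? recipe.defaultDims;
  if (!Number.isInteger(dims) || dims <= 0) {
    return { ok: false, reason: `dimension must be a positive integer, got ${dims}` };
  }
  if (dims > PGVECTOR_COLUMN_MAX_DIMS) {
    return { ok: false, reason: `dimension ${dims} exceeds column max ${PGVECTOR_COLUMN_MAX_DIMS}` };
  }
  const allowed = recipe.allowedDims ?? [];
  if (dims !== recipe.defaultDims && !allowed.includes(dims)) {
    return { ok: false, reason: `model cannot produce ${dims} dimensions` };
  }
  return { ok: true, dims };
}
```

Because the function returns a result instead of throwing mid-write, the caller can print a hint and exit before the schema column is ever created or a reindex sweep begins.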
T13/T14 docs sync for v0.37.10.0:

- docs/integrations/embedding-providers.md: TL;DR table refreshed to reflect ZE as v0.36 default; added "Init resolves your provider from env keys" section explaining the auto-pick → picker → fail-loud chain; added "If first import fails" troubleshooting block pointing at gbrain doctor instead of `rm -rf ~/.gbrain`.
- docs/operations/headless-install.md (NEW): Docker/CI sequencing guide. Two acceptable patterns: provider key at build time (Pattern 1) or --no-embedding opt-in + runtime config (Pattern 2). Codex finding #11.
- README.md: Troubleshooting section with one-paragraph repair hint and links to embedding-providers.md + headless-install.md.
- TODOS.md: closed v0.32.x "interactive provider chooser" entry as SUPERSEDED by this wave. Added four follow-up entries (dedicated v0.36 broken-install migration, namespaced ext fields, runtime config-key audit, value-level Levenshtein on config set).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
# Conflicts:
#	CHANGELOG.md
#	VERSION
#	docs/integrations/embedding-providers.md
#	package.json
…te test

Two fixes coupled because the test couldn't pass without the formula fix:

src/core/pglite-engine.ts + src/core/postgres-engine.ts: empty brain (pageCount === 0) now gets FULL marks (100/100), not 0/100. Semantically an empty brain has no coverage problem to penalize: there's nothing to embed, nothing to link, nothing to orphan. Vacuous truth applies. The pre-fix "empty = 0" caused fresh-init brains to score as critically unhealthy on `gbrain doctor`, which was a structural surprise to users who'd just run init successfully. Same fix on both engines.

test/brain-score-breakdown.test.ts: updated the "empty brain" assertion to match the new contract (was: 0/0/0/0/0/0; is: 100/35/25/15/15/10).

test/doctor-report-remote.test.ts → renamed to .serial.test.ts and made hermetic. The pre-fix test pulled audit data from the host ~/.gbrain (reranker_health, sync_failures, etc.), which made the assertion non-deterministic depending on whoever ran the suite. Now isolates GBRAIN_HOME to a tempdir via beforeAll/afterAll; env mutation requires serial-quarantine per scripts/check-test-isolation.sh R1.

Closes the master-state flake that was failing on every `bun run test` run regardless of my branch contents.
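The empty-brain contract can be illustrated with a small scoring sketch. The five weights (35/25/15/15/10) come from the commit message; everything else is an assumption: the real getBrainScore computes its components from engine queries, whereas this sketch takes hypothetical per-component coverage ratios in [0, 1].

```typescript
// Component weights from the commit message; they sum to 100.
const SCORE_WEIGHTS: readonly number[] = [35, 25, 15, 15, 10];

interface BrainScore {
  total: number;
  parts: number[];
}

function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// ratios[i] is a hypothetical coverage ratio for component i.
function brainScore(pageCount: number, ratios: readonly number[]): BrainScore {
  // Vacuous truth: with no pages there is nothing to penalize, so every component gets full marks.
  const effective = pageCount === 0 ? SCORE_WEIGHTS.map(() => 1) : ratios;
  const parts = SCORE_WEIGHTS.map((weight, i) => Math.round(weight * clamp01(effective[i] ?? 0)));
  const total = parts.reduce((sum, part) => sum + part, 0);
  return { total, parts };
}
```

With this shape, `brainScore(0, [])` yields a total of 100 with parts 35/25/15/15/10, matching the updated "empty brain" assertion described above.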
- CLAUDE.md: annotate src/core/pglite-engine.ts + src/core/postgres-engine.ts entries with v0.37.10.0 empty-brain 100/100 contract. Vacuous truth: an empty brain has no coverage to penalize, so getBrainScore returns full marks (35/25/15/15/10 breakdown) when pageCount === 0. Pre-fix 0/100 was structurally surprising on fresh init and caused the v0.37.8.0 doctor-report-remote.test.ts flake.
- TODOS.md: mark P0 doctor-report-remote.test.ts:65 TODO completed (resolved by commit 9aa571f's empty-brain-100/100 fix; test renamed to .serial.test.ts and made hermetic per scripts/check-test-isolation.sh R1).
- llms-full.txt: regenerated from updated CLAUDE.md per CLAUDE.md "Auto-derived files" rule.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two coupled fixes for the v0.37.10.0 wave's interaction with CI's Tier-1 mechanical E2E suite (which runs without any embedding-provider env var).

src/commands/init.ts: honor D5 properly at resolveAIOptions entry. Pre-fix, the env-detection branch fired on EVERY init regardless of persisted config. A non-TTY re-init with no env keys exited 1 (D3 fail-loud) even when ~/.gbrain/config.json already had embedding_model set from a prior successful init. Now resolveAIOptions reads loadConfig() first and seeds out.embedding_model / embedding_dimensions / expansion_model / chat_model from the file plane BEFORE running env detection. Also honors embedding_disabled (D9 sentinel) on re-init so deferred-setup brains don't re-trigger fail-loud.

test/e2e/mechanical.test.ts:722: Setup Journey's first init runs against a fresh DB with no persisted config. Pass --embedding-model explicitly (openai:text-embedding-3-large) so the preflight resolves offline. After this init writes config, subsequent inits in the file (RLS self-heal v24, RLS event-trigger probes, etc.) honor the persisted config via the D5 fix above.

Verified locally: full test/e2e/mechanical.test.ts → 78 pass / 0 fail.
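The seeding step can be sketched as a pure function that runs before env detection. This is an illustration under assumptions: `seedFromPersisted` and the `needsEmbeddingDetection` flag are hypothetical names, the sketch ignores command-line flags entirely, and the real resolveAIOptions handles the expansion and chat touchpoints with their own detection.

```typescript
// Field names follow the commit message; the real GBrainConfig is larger.
interface PersistedConfig {
  embedding_model?: string;
  embedding_dimensions?: number;
  expansion_model?: string;
  chat_model?: string;
  embedding_disabled?: boolean;
}

interface Seeded {
  out: PersistedConfig;
  needsEmbeddingDetection: boolean;
}

// Seed options from the persisted file plane; env detection only runs for what is still missing.
function seedFromPersisted(persisted: PersistedConfig | null): Seeded {
  const out: PersistedConfig = {};
  if (persisted?.expansion_model !== undefined) out.expansion_model = persisted.expansion_model;
  if (persisted?.chat_model !== undefined) out.chat_model = persisted.chat_model;

  if (persisted?.embedding_disabled === true) {
    // Deferred-setup brain: keep the sentinel and skip detection so re-init does not fail loud.
    out.embedding_disabled = true;
    return { out, needsEmbeddingDetection: false };
  }
  if (persisted?.embedding_model !== undefined) {
    out.embedding_model = persisted.embedding_model;
    if (persisted.embedding_dimensions !== undefined) {
      out.embedding_dimensions = persisted.embedding_dimensions;
    }
  }
  return { out, needsEmbeddingDetection: out.embedding_model === undefined };
}
```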
garrytan added a commit that referenced this pull request May 22, 2026
Conflicts resolved:

- VERSION → 0.38.1.0 (higher semver wins; master bumped 0.37.9.0 → 0.37.10.0)
- package.json → 0.38.1.0 (trio agreement)
- CHANGELOG.md → my v0.38.1.0 entry stays on top; master's new v0.37.10.0 entry preserved directly below

Master's v0.37.10.0 brings the init env-detection + interactive picker + preflight invariants wave (#1278). No collisions with v0.38 ingestion substrate.

bun install + bun run typecheck → clean.
mgunnin added a commit to mgunnin/gbrain that referenced this pull request May 28, 2026
* upstream/master:
  v0.38.2.0 fix(doctor): bounded frontmatter scan + partial-state surfacing (supersedes garrytan#1287) (garrytan#1297)
  v0.38.1.0 feat(agents): provider-agnostic subagent loop + remote MCP dispatch + budget meter (garrytan#1289)
  v0.38.0.0 ingestion cathedral — gbrain capture + write-through + IngestionSource contract (garrytan#1275)
  v0.37.11.0: fresh-install PGLite embedding setup fix wave (garrytan#1286)
  v0.37.10.0 feat(init): env-detection + interactive picker + preflight invariants (garrytan#1278)
  v0.37.9.0 fix(frontmatter): canonical-style normalization for tag arrays (garrytan#1252)
  v0.37.8.0 feat: voyage-code-3 discoverability + reindex-code cost-preview fix (garrytan#1267)
  v0.37.7.0 fix wave: federated brains + autopilot safety + OAuth confidential clients (garrytan#1253)
  v0.37.6.0 feat(ai): OpenRouter recipe + generic default_headers seam (cherry-pick garrytan#1210) (garrytan#1246)
  v0.37.5.0 fix(markdown): YAML-aware NESTED_QUOTES validator (stops flagging valid YAML) (garrytan#1229)
  feat: pgGraph-inspired CI scaffolding wave (v0.37.4.0) (garrytan#1228)
  v0.37.3.0 feat: skill_brain_first doctor check + auto-fix + declarative opt-out (supersedes garrytan#1206) (garrytan#1215)
  v0.37.2.0: takes_resolution_consistency CHECK accepts 'unresolvable' (garrytan#1211)
  v0.37.1.0 feat: brainstorm + lsd — bisociation idea generator grounded in your own brain (garrytan#1214)
  v0.37.0.0 feat(skillpack): registry cathedral — third-party publish + install + 10/10 quality bar (garrytan#1208)
  v0.36.6.0 feat: cross-modal search wave (text↔image + unified column + LLM intent) (garrytan#1165)
Summary
Fixes the v0.36 bug class where a fresh `gbrain init --pglite` silently produced a broken brain when the ZeroEntropy default didn't match the user's actual API keys.

The headline story. A user on WSL ran `bun install && gbrain init --pglite && gbrain import …`, hit `expected 1536 dimensions, not 1280`, and recovered with five commands including `rm -rf ~/.gbrain` and three config keys that gbrain silently ignored (`embedding.provider`, `embedding.model`, `embedding.dimensions`). This PR closes every defect that bug surfaced.

What ships:

- `src/commands/init.ts`, `src/commands/init-provider-picker.ts`). `gbrain init --pglite` now peeks at `process.env`, auto-picks when one provider is ready, fires a picker when multiple are ready. Local-only providers (Ollama, llama-server) excluded from auto-pick to prevent silent routing to daemons that may not be running.
- `OPENAPI_API_KEY` surfaces "did you mean `OPENAI_API_KEY`?" via Levenshtein.
- `~/.gbrain/config.json` so subsequent runs are deterministic across releases. Pre-fix it only persisted when flags were passed.
- `subagent_provider` check) when `chat_model` is non-Anthropic and `ANTHROPIC_API_KEY` is missing.
- `config set` rejects unknown keys with Levenshtein suggestion, `--force` escape hatch with stderr WARN.
- `--no-embedding` opt-in for deferred-setup mode; `embed`/`import` callsites refuse cleanly when the sentinel is set.
- `resolveSchemaEmbeddingDim` runs BEFORE `engine.initSchema()`: invalid dim refuses without writing anything to disk. D12 extends to multimodal.
- `embedding_provider` check now catches the v0.36 silent-default repair case with empty-brain vs non-empty branching (drop-and-re-init vs `retrieval-upgrade`). Never recommends `rm -rf`.
- `subagent_provider` check (v0.31.12) extended per D7.
- `brain_score` = 100/100: a fresh brain with no pages now scores full marks (vacuous truth: no coverage problem to penalize). Pre-fix it returned 0/100, which was structurally surprising on first `gbrain doctor` run.

Decision trail. 14 architectural decisions captured (D1-D14) including 4 codex outside-voice cross-model tensions resolved. Plan persisted at `~/.claude/plans/system-instruction-you-are-working-enumerated-mccarthy.md`.

Test Coverage
Pre-Landing Review
Cleared via `/plan-eng-review` end-to-end:

Plan Completion
All 15 implementation tasks (T1-T15) DONE. Plan items NOT DONE: 0. UNVERIFIABLE: 0.
TODOS
- `gbrain init`": SUPERSEDED by `src/commands/init-provider-picker.ts`. Closed: P0 `doctorReportRemote brain_score` test flake (resolved by empty-brain-100 fix).
- `config set`; runtime config-key audit; value-level Levenshtein on `config set`.

Documentation
Doc updates for v0.37.10.0:
- `--no-embedding`) patterns.

Test plan

- `bun run verify` clean (typecheck + 6 pre-checks)
- `bun run test`: 8342 pass / 0 fail (was 1 pre-existing flake, now fixed)
- `test/e2e/init-fresh-pglite.test.ts`): 14/14 pass against subprocess-driven happy-path, fail-loud, D6 regression, D9 deferred-setup, D11 preflight, explicit-flag-wins paths

🤖 Generated with Claude Code