v0.42.14.0 fix(zero-config): code-* readiness signal + init embedding-key validation + lock self-heal (#1780) #1804
Merged
…1780 Gap 1) New src/core/code-graph-readiness.ts: resolveCodeReadiness() returns a typed status (not_built | indexing | ready | unknown) + ready boolean so callers can tell "graph not built / still indexing" apart from "genuinely no match" when count===0. EXISTS-based (cheap), chunk-grain, resolver-version-matching pending predicate, fail-open. Wired into the 4 CLI envelopes (+ human hint) and the 4 MCP op handlers. def/refs are 2-state brain-wide; callers/callees 3-state scoped.
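The resolution order described above can be sketched roughly as follows. This is a hypothetical illustration, not the code in `src/core/code-graph-readiness.ts`: `ReadinessProbes`, `hasCodeChunks`, `hasPendingChunks`, and `resolveStatusSketch` are invented names, each probe stands in for one cheap EXISTS query, and the mapping of "no chunks" to `not_built` is an assumption. How `ready` is set for `unknown` is not stated in this PR, so the sketch returns only the status.

```typescript
// Hypothetical sketch, not the code in src/core/code-graph-readiness.ts.
type ReadinessStatus = "not_built" | "indexing" | "ready" | "unknown";

// Assumed probe shape: each method stands in for one cheap EXISTS query.
interface ReadinessProbes {
  hasCodeChunks(): boolean;    // does any code chunk exist at all?
  hasPendingChunks(): boolean; // is any chunk still waiting on edge extraction?
}

function resolveStatusSketch(count: number, probes: ReadinessProbes): ReadinessStatus {
  // A non-empty result already proves the graph is usable, so no probe runs.
  if (count > 0) return "ready";
  try {
    if (!probes.hasCodeChunks()) return "not_built"; // assumption: nothing indexed yet
    if (probes.hasPendingChunks()) return "indexing";
    return "ready"; // count === 0 here means a genuine "no match"
  } catch {
    // Fail open: a failing probe must not turn the caller's query into an error.
    return "unknown";
  }
}
```

Running the probes only when `count === 0` keeps the common path free of extra queries, which matches the "short-circuits to ready with no query" behavior in the description.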
…Gap 3) tryAcquireDbLock now reclaims a held, not-TTL-expired lock when the same-host holder is provably dead (process.kill ESRCH) past a 60s grace, via guarded DELETE + one normal-upsert retry returning the normal handle. New shared injectable classifyHolderLiveness/isHolderDeadLocally (EPERM treated as ALIVE — never steals a live lock). runBreakLock's safe path consumes the shared predicate, fixing its prior EPERM-as-dead bug. Cross-host stays TTL-only.
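A minimal sketch of the liveness rule above, assuming Node's `process.kill(pid, 0)` semantics (signal 0 delivers nothing and only checks that the process exists). The names `classifyLivenessSketch`, `canTakeOverSketch`, `LockRow`, and its fields are hypothetical and do not mirror the real `classifyHolderLiveness` signature; measuring the 60s grace from the lock's acquisition time is also an assumption.

```typescript
// Hypothetical sketch, not the shipped classifyHolderLiveness.
type Liveness = "alive" | "dead";

// Injectable stand-in for process.kill so tests never signal real processes.
type KillFn = (pid: number, signal: number) => unknown;

function classifyLivenessSketch(
  pid: number,
  kill: KillFn = (p, s) => process.kill(p, s),
): Liveness {
  try {
    kill(pid, 0); // signal 0: existence/permission check only
    return "alive";
  } catch (err) {
    const code = (err as { code?: string }).code;
    // ESRCH: no such process, so the holder is provably dead.
    // EPERM or anything else: the process may exist, so treat it as alive.
    return code === "ESRCH" ? "dead" : "alive";
  }
}

// Assumed lock-row shape, for illustration only.
interface LockRow {
  holderHost: string;
  holderPid: number;
  acquiredAtMs: number;
}

const GRACE_MS = 60_000;

function canTakeOverSketch(row: LockRow, localHost: string, nowMs: number, kill?: KillFn): boolean {
  if (row.holderHost !== localHost) return false;        // cross-host stays TTL-only
  if (nowMs - row.acquiredAtMs < GRACE_MS) return false; // still inside the grace window
  return classifyLivenessSketch(row.holderPid, kill) === "dead";
}
```

Treating every error other than ESRCH as "alive" is the conservative default: the worst case is waiting for the TTL, never stealing a live lock.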
New src/core/init-embed-check.ts: config-only diagnoseEmbedding (missing key, all providers) + best-effort 1-token live test-embed (invalid/expired key, 5s timeout, never blocks). Loud warning to stderr, init still exits 0; skipped by --no-embedding / --skip-embed-check / GBRAIN_INIT_SKIP_EMBED_CHECK=1. Builds the effective env (process.env + file-plane keys + --key) via buildGatewayConfig, extracted to src/core/ai/build-gateway-config.ts (cli.ts re-exports) so the check sees the same keys + provider base URLs as runtime. embedding_check added to --json.
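The skip rules and the config-only missing-key check might look roughly like the sketch below. Everything here is illustrative: `InitFlags`, `effectiveEnvSketch`, and `diagnoseMissingKeySketch` are hypothetical names, the file-plane keys are assumed to be already normalized to env-var names, the precedence (process.env, then file-plane, then `--key`) is an assumption, and `EXAMPLE_API_KEY` in the usage note is a placeholder rather than a real provider variable. Only the two flags, the `GBRAIN_INIT_SKIP_EMBED_CHECK=1` switch, and the `ok` field come from this PR.

```typescript
// Hypothetical sketch, not the code in src/core/init-embed-check.ts.
interface InitFlags {
  noEmbedding?: boolean;    // --no-embedding
  skipEmbedCheck?: boolean; // --skip-embed-check
  key?: string;             // --key
}

type Env = Record<string, string | undefined>;

function shouldSkipEmbedCheck(flags: InitFlags, env: Env): boolean {
  return Boolean(flags.noEmbedding)
    || Boolean(flags.skipEmbedCheck)
    || env.GBRAIN_INIT_SKIP_EMBED_CHECK === "1";
}

// Assumption: later sources win (process.env < file-plane keys < --key).
function effectiveEnvSketch(processEnv: Env, fileKeys: Env, keyVar: string, cliKey?: string): Env {
  const merged: Env = { ...processEnv, ...fileKeys };
  if (cliKey) merged[keyVar] = cliKey;
  return merged;
}

interface EmbeddingCheckSketch {
  ok: boolean;
  reason?: string;
}

// Config-only check: no network call, just "is a non-empty key configured?".
function diagnoseMissingKeySketch(env: Env, keyVar: string): EmbeddingCheckSketch {
  const value = env[keyVar];
  return value !== undefined && value.trim() !== ""
    ? { ok: true }
    : { ok: false, reason: `missing ${keyVar}` };
}
```

For example, `diagnoseMissingKeySketch(effectiveEnvSketch({}, { EXAMPLE_API_KEY: "k" }, "EXAMPLE_API_KEY"), "EXAMPLE_API_KEY")` reports `ok: true`, which illustrates why merging the file-plane keys first avoids a false warning for users who keep their key in config rather than in the shell environment.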
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CLAUDE.md Key Files: add src/core/code-graph-readiness.ts, init-embed-check.ts, ai/build-gateway-config.ts, and the db-lock auto-takeover + code-* readiness field behaviors. Regenerate llms-full.txt. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…1780 # Conflicts: # CHANGELOG.md # CLAUDE.md # VERSION # llms-full.txt # package.json
mgunnin added a commit to mgunnin/gbrain that referenced this pull request on Jun 3, 2026:

* upstream/master:
  * v0.42.23.0 feat(jobs): --nice scheduling-priority flag for jobs work/supervisor (garrytan#1815) (garrytan#1820)
  * v0.42.22.0 fix(minions): supervisor progress watchdog + worker DB self-defense - alive-but-wedged worker self-heals (garrytan#1801) (garrytan#1824)
  * v0.42.21.0 fix(postgres): module-singleton ownership - canonical landing for the dream-cycle "connect() has not been called" class (garrytan#1404/garrytan#1471/garrytan#1619) (garrytan#1805)
  * v0.42.20.0 fix: reliability wave - PGLite capture lock-pin + Postgres reconnect race + search embed-hang (garrytan#1762 garrytan#1745 garrytan#1775) (garrytan#1810)
  * v0.42.19.0 fix(skillopt): close the last gap in the AI SDK v6 tool-loop fix (write-capture mapper + regression test) (garrytan#1809)
  * v0.42.18.0 fix: sync orphan-pileup watchdog (garrytan#1633) + links-lag µs stamp (garrytan#1768) (garrytan#1807)
  * v0.42.17.0 fix(sync): resumable incremental sync - killed mid-import no longer loses progress (garrytan#1794) (garrytan#1808)
  * v0.42.16.0 feat(doctor): brain health as a solved problem - cause-ranked doctor + OOM-loop line + auto-drain + pool-reap (garrytan#1685) (garrytan#1802)
  * v0.42.15.0 fix: decouple CLI primary output from process.stdout.isTTY (garrytan#1784) (garrytan#1806)
  * v0.42.14.0 fix(zero-config): code-* readiness signal + init embedding-key validation + lock self-heal (garrytan#1780) (garrytan#1804)
  * v0.42.13.0 fix(search): archive/ content findable by default, demoted not hard-excluded (garrytan#1777) (garrytan#1797)
  * v0.42.12.0 feat: self-upgrading gbrain - invocation-riding update check + opt-in auto-upgrade (garrytan#1798)
  * v0.42.11.0 feat(skillopt): held-out eval gate, honest receipts, ENFORCE + ablation opts (garrytan#1759)
  * v0.42.10.0 feat(extract): opt-in global-basename wikilink resolution (closes garrytan#972) (garrytan#1388)
Closes #1780.
Three zero-config gaps that made a brain silently fail for an unattended/agent consumer. Two real, one optional (all included).
Gap 1: code-* readiness signal (was: silent empty result)

`code-def`/`code-refs`/`code-callers`/`code-callees` returned `count: 0` in three indistinguishable cases: graph not built, source never synced, or genuinely no match. An agent reading `count: 0` couldn't tell "wait and retry" from "trust this." Now every CLI envelope and the four MCP ops carry a typed `status` (`not_built | indexing | ready | unknown`) + `ready` boolean. `count: 0` + `ready: true` → genuinely none. `ready: false` → ask later.

- `src/core/code-graph-readiness.ts`: EXISTS-based (cheap; no `page_kind` index needed, pending probe rides the partial `idx_content_chunks_edges_backfill`), chunk-grain, fail-open on DB error. `count > 0` short-circuits to `ready` with no query.
- Resolver-version-matching pending predicate (`edges_backfilled_at IS NULL OR < EDGE_EXTRACTOR_VERSION_TS`) so a resolver-version bump can't falsely report `ready`.

Gap 2: `gbrain init` validates the embedding key (was: silent `embedded=0` at first sync)

Init persisted `--embedding-model` without checking the key; first sync imported pages but embedded zero, and search came back empty. Now init runs a free config-only `diagnoseEmbedding` (missing key, all providers) + a best-effort 1-token live test-embed (invalid/expired key, 5s timeout). Loud warning to stderr, init still exits 0.

- `src/core/init-embed-check.ts`. Builds the effective env (process.env + file-plane `*_api_key` + `--key`) via `buildGatewayConfig` so it sees the same keys + provider base URLs as runtime (no false warning for config.json-keyed users).
- `buildGatewayConfig` extracted to `src/core/ai/build-gateway-config.ts` (cli.ts re-exports).
- `--skip-embed-check` flag / `GBRAIN_INIT_SKIP_EMBED_CHECK=1`; `embedding_check` in `--json`. `--no-embedding` skips the whole check.

Gap 3: automatic same-host dead-pid cycle-lock takeover (was: TTL-only, ~30min wait)

A crashed lock holder blocked new cycles for the full TTL. `tryAcquireDbLock` now reclaims a held, not-TTL-expired lock when the same-host holder is provably dead (`process.kill` ESRCH) past a 60s grace, via a guarded DELETE + one normal-upsert retry returning the standard handle. Cross-host stays TTL-only. The shared `classifyHolderLiveness` predicate (EPERM treated as alive; never steal a live lock) is reused by `gbrain sync --break-lock`, fixing its prior EPERM-as-dead bug.

Tests

- `test/code-graph-readiness.test.ts` (11), `test/db-lock-auto-takeover.test.ts` (11), `test/init-embed-check.test.ts` (9, hermetic via the gateway embed-transport seam + `withEnv`). Readiness-envelope cases added to `test/e2e/code-intel-mcp-ops-pglite.test.ts`. 3 critical-regression tests: file-plane no-false-warn, readiness-query-throw → `unknown`, EPERM-as-alive.
- `bun run verify` green (29/29: typecheck, privacy, jsonb, wasm, resolver, all guards). 134-test post-merge re-verify across the affected + new suites, 0 fail. Targeted Postgres E2E: 127 pass (1 skip, 1 pre-existing stale failure unrelated to this diff: `sync-lock-recovery › --break-lock + --all` asserts a refusal string absent in both master and this branch).
- (… `GBRAIN_HOME`): init with no key → loud warning + `embedding_check.ok=false`; `code-callers foo` on a fresh brain → `status: "not_built", ready: false`.

No schema migration. `gbrain upgrade` handles everything.

🤖 Generated with Claude Code
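On the consumer side, an agent could act on the readiness fields along these lines. This is a sketch under stated assumptions: `CodeEnvelopeSketch` models only the three fields named in this PR (`count`, `status`, `ready`), and the `Verdict` labels and `interpretEnvelope` are hypothetical.

```typescript
// Hypothetical consumer-side sketch; the real envelope carries more fields.
interface CodeEnvelopeSketch {
  count: number;
  status: "not_built" | "indexing" | "ready" | "unknown";
  ready: boolean;
}

type Verdict = "use_results" | "no_match" | "retry_later";

function interpretEnvelope(envelope: CodeEnvelopeSketch): Verdict {
  // Any hit is usable as-is.
  if (envelope.count > 0) return "use_results";
  // count === 0: trust the empty result only when the graph is ready.
  return envelope.ready ? "no_match" : "retry_later";
}

// The fresh-brain case from the test notes: nothing built yet, so ask again later.
const freshBrain: CodeEnvelopeSketch = { count: 0, status: "not_built", ready: false };
console.log(interpretEnvelope(freshBrain)); // prints "retry_later"
```

Branching on `ready` rather than on each `status` value keeps the consumer correct if more statuses are added later; `status` is still useful for logging why a retry is needed.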
Documentation
- CLAUDE.md Key Files: added `src/core/code-graph-readiness.ts`, `src/core/init-embed-check.ts`, `src/core/ai/build-gateway-config.ts`; noted the `tryAcquireDbLock` auto-takeover + `classifyHolderLiveness` on the `db-lock.ts` entry and the `status`/`ready` readiness fields on the `code-def`/`code-refs` entry.
- llms-full.txt regenerated (`bun run build:llms`; freshness test green).
- … `init-embed-check.ts:liveTestEmbed` with `models.ts:probeEmbeddingReachability` onto a shared embed-probe core.

Coverage: all shipped surface (code-* `status`/`ready` fields, `gbrain init --skip-embed-check` + `embedding_check` JSON, the three new modules, db-lock auto-takeover) is documented in CLAUDE.md + CHANGELOG. README is product-level and needs no change. No architecture-diagram drift.