fix(autopilot): scope lock file to GBRAIN_HOME #1227
Closed
rafaelreis-r wants to merge 7 commits into
Adds src/core/fts-language.ts with getFtsLanguage(), a centralized helper that reads the GBRAIN_FTS_LANGUAGE env var (default 'english'). Refactors postgres-engine.ts and pglite-engine.ts to use the helper in their search queries, replacing four hardcoded 'english' literals across searchKeyword and searchKeywordChunks.

Why: non-English brains lose stemming and stop-word removal because the tokenizer is wired to English regardless of content language. A user storing Portuguese pages currently gets dramatically worse keyword search than an equivalent English brain.

This PR fixes the *query side* of the problem with zero behavior change for the default case. The trigger functions in schema.sql/schema-embedded.ts/pglite-schema.ts still hardcode 'english' for the write side; that's covered in a follow-up PR (recreate triggers idempotently from a migration).

Validation:
- VALID_CONFIG_NAME regex (/^[a-z][a-z0-9_]*$/) blocks SQL injection since Postgres tsvector functions don't accept parameterized config names, so the value must be interpolated into the query string.
- Invalid values fall back to 'english' with a one-time warning.
- Cached after first read; tests reset via resetFtsLanguageCache().

Tests: 14 unit tests covering defaults, cache, validation rules, and SQL-injection guard.

Backward-compatible: 100%. Default behavior is identical when the env var is unset.

(cherry picked from commit 43ffe13)
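The helper described in this commit could look roughly like the sketch below. Only the names getFtsLanguage, resetFtsLanguageCache, VALID_CONFIG_NAME and GBRAIN_FTS_LANGUAGE come from the commit message; the env parameter (passed in rather than read from process.env), the warning text, and the keywordPredicate helper with its websearch_to_tsquery call are assumptions made for illustration, not the contents of src/core/fts-language.ts.

```typescript
// Illustrative sketch only; the real helper may be shaped differently.
const VALID_CONFIG_NAME = /^[a-z][a-z0-9_]*$/;
const DEFAULT_LANGUAGE = "english";

type Env = Record<string, string | undefined>;

let cached: string | undefined;
let warned = false;

function getFtsLanguage(env: Env): string {
  if (cached !== undefined) return cached; // cached after first read
  const raw = env["GBRAIN_FTS_LANGUAGE"];
  let lang = DEFAULT_LANGUAGE;
  if (raw !== undefined && raw !== "") {
    if (VALID_CONFIG_NAME.test(raw)) {
      lang = raw;
    } else if (!warned) {
      // One-time warning; the invalid value is never interpolated into SQL.
      console.warn("Invalid GBRAIN_FTS_LANGUAGE; falling back to 'english'");
      warned = true;
    }
  }
  cached = lang;
  return lang;
}

// Tests call this between cases so each case re-reads the environment.
function resetFtsLanguageCache(): void {
  cached = undefined;
}

// Hypothetical query fragment: the validated name is interpolated,
// while the user's search text stays a bound parameter ($1).
function keywordPredicate(env: Env): string {
  const lang = getFtsLanguage(env);
  return `search_vector @@ websearch_to_tsquery('${lang}', $1)`;
}
```

Because the regex only admits lowercase identifiers, a value containing quotes or semicolons never reaches the query string; it is replaced by the default before interpolation.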
…language

Builds on PR garrytan#1 (GBRAIN_FTS_LANGUAGE env var) by extending configurability to the *write side*: the trigger functions that populate pages.search_vector and content_chunks.search_vector now use the language from getFtsLanguage() instead of hardcoded 'english'.

Implementation: schema migration v33 (handler-based, not static SQL). The handler reads getFtsLanguage() at apply time and issues CREATE OR REPLACE FUNCTION for the two trigger functions, atomically swapping their bodies. The triggers themselves don't need recreation because they reference the function by name.

Backfill: when the configured language differs from 'english', v33 also re-tokenizes existing rows under the new tokenizer (UPDATE-to-self on pages, direct UPDATE on content_chunks). Skipped for 'english' to avoid wasted I/O when defaults are kept.

Validation strategy: the language string flows through getFtsLanguage(), which enforces /^[a-z][a-z0-9_]*$/ before interpolation, so SQL injection is structurally impossible. Tests include a deliberate injection attempt ('english\'; DROP TABLE pages; --') that verifies the fallback to 'english' kicks in and no DROP TABLE appears in any emitted SQL.

Validated against a real Postgres brain (2782 pages, 4372 chunks):
- apply-migrations succeeds with GBRAIN_FTS_LANGUAGE=pt_br
- search 'operações' (with diacritics) returns hits using pt_br stemmer
- re-running migrate is idempotent (CREATE OR REPLACE)
- re-running with same env is a no-op (version stays 33)

Tests: 7 unit tests covering registration, handler shape, default-vs-non-default backfill behavior, and SQL injection guard. Combined with PR garrytan#1's helper tests (14): 21/21 pass.

Limitation: changing GBRAIN_FTS_LANGUAGE *after* v33 has been applied requires resetting config.version to 32 to re-apply (documented in README). PR garrytan#3 in this series introduces 'gbrain reindex --search-vector' to recreate-and-backfill on demand without the version-stamp dance.

Backward-compatible: 100%. Default GBRAIN_FTS_LANGUAGE='english' produces identical trigger output to the pre-v33 schema.

(cherry picked from commit d73b7e1)
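As a rough illustration of the handler-based approach, the sketch below builds the statements such a migration might emit. Every SQL identifier in it (the trigger function names and the title, body and chunk_text columns) is hypothetical; the commit message does not name them, and the assumption that an UPDATE-to-self on pages re-fires the trigger depends on how the real trigger is declared.

```typescript
// Illustrative sketch only; identifiers are placeholders, not the real schema.
const VALID_CONFIG_NAME = /^[a-z][a-z0-9_]*$/;

function safeLanguage(raw: string | undefined): string {
  return raw !== undefined && VALID_CONFIG_NAME.test(raw) ? raw : "english";
}

function triggerFunctionSql(name: string, lang: string, expr: string): string {
  return [
    `CREATE OR REPLACE FUNCTION ${name}() RETURNS trigger AS $fn$`,
    "BEGIN",
    `  NEW.search_vector := to_tsvector('${lang}', ${expr});`,
    "  RETURN NEW;",
    "END;",
    "$fn$ LANGUAGE plpgsql;",
  ].join("\n");
}

function buildV33Statements(rawLanguage: string | undefined): string[] {
  const lang = safeLanguage(rawLanguage);
  const statements = [
    triggerFunctionSql(
      "update_page_search_vector", // hypothetical name
      lang,
      "coalesce(NEW.title, '') || ' ' || coalesce(NEW.body, '')",
    ),
    triggerFunctionSql(
      "update_chunk_search_vector", // hypothetical name
      lang,
      "coalesce(NEW.chunk_text, '')",
    ),
  ];
  if (lang !== "english") {
    // Backfill only when the tokenizer actually changed.
    statements.push("UPDATE pages SET title = title;"); // UPDATE-to-self
    statements.push(
      `UPDATE content_chunks SET search_vector = to_tsvector('${lang}', coalesce(chunk_text, ''));`,
    );
  }
  return statements;
}
```

With the default language the builder returns only the two CREATE OR REPLACE statements, which mirrors the "skipped for 'english'" backfill rule above.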
Completes the GBRAIN_FTS_LANGUAGE story (PRs garrytan#1, garrytan#2 in this series) by giving users an explicit way to recreate FTS trigger functions and backfill existing rows after changing the language env var.

Why: schema migration v33 (PR garrytan#2) stamps the trigger functions with GBRAIN_FTS_LANGUAGE on first apply and then the migrations runner considers v33 'done'. Users who later change the env var would need to manually reset config.version to re-trigger v33, which is fragile and undocumented. This CLI command is the documented escape hatch: explicit, gated, idempotent.

Behavior:
- Reads GBRAIN_FTS_LANGUAGE via the same getFtsLanguage() helper as the engines and v33 migration, so all three sources of truth stay in lockstep.
- --dry-run shows row counts (pages + chunks affected) without touching the DB.
- --yes / -y skips interactive prompt; required in non-TTY contexts.
- --json emits a structured result envelope (status, language, counts, durationMs) for scripting.
- Trigger recreate is atomic via CREATE OR REPLACE FUNCTION, so the two writes are individually atomic; backfill is two UPDATEs (pages UPDATE-to-self re-fires the trigger; content_chunks gets a direct vector compute).

Validated against a real Postgres brain (2782 pages, 4372 chunks):
- --dry-run reports correct counts, exits 0 without writes
- --yes completes in ~7-8s, search 'operações' continues to work afterward
- --json output parses cleanly

Tests: 6 unit tests covering --dry-run shortcuts, default vs non-default language behavior, SQL injection guard (same as PRs garrytan#1/garrytan#2), and edge cases (empty inventory, durationMs presence). With PRs garrytan#1+garrytan#2: 27/27 unit tests pass.

Trade-offs considered:
- Could persist language in the config table instead of relying on env var. Decided against: env var is the established pattern in GBrain (GBRAIN_EMBED_MODEL, GBRAIN_BRAIN_ID, GBRAIN_DATABASE_URL etc.) and adding a config table row creates ambiguity about which wins (env vs DB). Single source of truth via env is simpler.
- Could auto-detect language drift (compare configured vs trigger body in pg_proc) and warn at startup. Out of scope for this PR; file as a follow-up if there's demand.

Backward-compatible: command is additive. Default behavior of the brain (with no language env var set) is unchanged.

(cherry picked from commit adf11ec)
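The flag gating described for the reindex command can be summarized as a small decision function. This is a sketch under assumptions: the action names, the option and envelope type shapes, and the status strings are invented for illustration; only the flags and the envelope field names (status, language, counts, durationMs) come from the commit message.

```typescript
// Illustrative sketch only; type and value names are hypothetical.
interface ReindexOptions {
  dryRun: boolean; // --dry-run
  yes: boolean;    // --yes / -y
  isTTY: boolean;  // whether stdin is an interactive terminal
}

type ReindexAction = "report_only" | "run" | "prompt" | "refuse";

function decideReindexAction(opts: ReindexOptions): ReindexAction {
  if (opts.dryRun) return "report_only"; // counts only, no writes
  if (opts.yes) return "run";            // confirmation skipped
  // Without --yes, an interactive terminal can prompt; a non-TTY cannot.
  return opts.isTTY ? "prompt" : "refuse";
}

interface ReindexEnvelope {
  status: string;
  language: string;
  counts: { pages: number; chunks: number };
  durationMs: number;
}

function toJsonEnvelope(
  status: string,
  language: string,
  pages: number,
  chunks: number,
  startedAtMs: number,
  finishedAtMs: number,
): string {
  const envelope: ReindexEnvelope = {
    status,
    language,
    counts: { pages, chunks },
    durationMs: finishedAtMs - startedAtMs,
  };
  return JSON.stringify(envelope);
}
```

Keeping the decision separate from the database work makes the "required in non-TTY contexts" rule easy to unit test without a connection.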
… detection, evals, E2E, filing rules, USAGE docs (cherry picked from commit 396aba4)
The lock file path was hardcoded to $HOME/.gbrain/autopilot.lock, ignoring GBRAIN_HOME. When two brains share a host (e.g. main brain and a side brain), only the first autopilot to acquire the lock runs; the second sees a fresh lock (<10min) and silently exits with code 0. Under launchd KeepAlive=true + ThrottleInterval=5s, the second autopilot enters a respawn loop that produces no work but is invisible because the exit is clean. We observed 46,388 silent failures in 3 days on a dual-brain setup before tracing it to this lock. Fix: derive the lock path from GBRAIN_HOME when set, falling back to $HOME/.gbrain. Each brain now gets its own lock and they coexist.
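A minimal sketch of the fix, assuming GBRAIN_HOME names the directory that holds the lock directly, as the description above implies. The function name autopilotLockPath and the env parameter are hypothetical, and the actual change presumably builds the path with a path-joining helper rather than string templates.

```typescript
// Illustrative sketch only; not the code from the PR.
type Env = Record<string, string | undefined>;

function autopilotLockPath(env: Env): string {
  const override = env["GBRAIN_HOME"];
  const base =
    override !== undefined && override !== ""
      ? override                         // per-brain directory
      : `${env["HOME"] ?? ""}/.gbrain`;  // single-brain default
  // Trim trailing slashes so the join never produces "//".
  return `${base.replace(/\/+$/, "")}/autopilot.lock`;
}
```

Two autopilots started with different GBRAIN_HOME values then resolve to different lock files, so neither sees the other's fresh lock and exits.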
Owner
Closing in favor of #1253 (v0.37.7.0 fix wave) which re-implements the same fix against current master via The implementation lives at |
garrytan added a commit that referenced this pull request on May 21, 2026
…dential clients (#1253) * fix(reindex-frontmatter): connect engine before query (#1225) `createEngine()` from src/core/engine-factory.ts only constructs the engine; callers MUST call connect() before any executeRaw. The reindex-frontmatter CLI was constructing the engine and going straight to countAffected, which crashed on PGLite with "PGLite not connected. Call connect() first." even on --dry-run. Fix follows the existing-command pattern (src/commands/auth.ts, src/commands/backfill.ts, src/commands/integrity.ts all do the same): pass toEngineConfig(cfg) into both createEngine() AND engine.connect(), then engine.initSchema() (idempotent on a current schema, ~1ms cost). Pre-fix verification: codex outside-voice CF5 flagged the related "can't import connectEngine from cli.ts" misdirection in the original fix plan. This implementation uses the canonical sibling pattern instead. Regression test pinned at test/reindex-frontmatter-connect.test.ts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: bump VERSION to 0.37.7.0 + stub CHANGELOG v0.37.5.0 claimed by #1229 (warsaw-v4); v0.37.6.0 by #1246 (OpenRouter recipe). v0.37.7.0 is the next free slot for this fix wave. CHANGELOG entry stubbed in user-facing voice per CLAUDE.md "CHANGELOG voice + release-summary format" — ELI10 lead-first, real fix details below. The "## To take advantage of v0.37.7.0" block follows the v0.13+ self-repair pattern from CLAUDE.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(subagent): short-circuit terminal-on-resume (#1151) Bug: when the worker resumed a subagent job whose persisted last message was an assistant turn with text-only content (no tool_use blocks), the replay reconciler at subagent.ts:241-247 had no branch for that case. The main loop then called messages.create against a conversation ending in assistant role, which Sonnet 4.6+ rejects with HTTP 400 "This model does not support assistant message prefill." 
3 retries later → dead-letter, despite all the job's work having committed in earlier turns. @zscgeek's bug report pinned this exactly: dream-cycle Otter corpus runs hit ~7% dead-letter rate, every dead job's last subagent_messages row was a text-only synthesis summary listing slugs that already existed in `pages`. Their proposed fix mirrors this implementation. Fix: add an else branch to the assistant-tail check that mirrors the live-loop terminal logic at subagent.ts:440-447 — reconstruct finalText from the persisted text blocks, return stop_reason='end_turn' immediately. No LLM call, no schema change. Two new regression cases: - text-only terminal on resume returns immediately with zero messages.create calls - tool-use replay path unchanged (existing behavior preserved) Codex outside-voice (CF13) initially flagged this fix as mis-targeted, claiming subagent.ts already handled the case. /investigate run revealed the live-loop terminal at :440-447 was covered but the REPLAY-path terminal at :241-247 was missing — both branches need symmetric handling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(autopilot): scope lockfile to GBRAIN_HOME (#1226) The autopilot lockfile was hardcoded at `~/.gbrain/autopilot.lock` (via `process.env.HOME`), bypassing GBRAIN_HOME. Two brains pointed at different GBRAIN_HOME directories still wrote to the same global lockfile; one would silently take over the other on each restart. Fix: route through `gbrainPath('autopilot.lock')` from src/core/config.ts (imported aliased as gbrainHomePath since the local `gbrainPath` var in installAutopilot references the CLI binary path). The mkdirSync(`~/.gbrain`) call also routes through the helper so the directory is created in the right place too. Co-authored with @rafaelreis-r — same fix shape as PR #1227, re-implemented against current master per the wave's "re-implement, credit, close" workflow. 
Tests cover: one GBRAIN_HOME → one canonical lock; two GBRAIN_HOME values → two distinct locks; default fall-through still works. Co-Authored-By: rafaelreis-r <noreply@github.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(graph-query): foreign-edge footer + --include-foreign (#1153) The graph-query CLI silently dropped edges to pages in other sources on federated brains. Users had no signal those edges existed unless they read the source code. Fix: - New --include-foreign flag (off by default, preserves the existing scoping contract; on = explicit cross-source traversal). - After every traversal, count edges from rootSlug whose target page lives in a different source. When count > 0 AND user didn't opt in, emit a stderr footer: `(N edge(s) to foreign-source pages hidden; pass --include-foreign to include them)` - The "no edges found" path also runs the count + footer so users discover foreign edges even when scoped traversal returned nothing. - Thin-client path skips the count (engine query not available); future T1 work threads source resolution through MCP for that path. - Single quotation correctness in count SQL: page_links table is `links` (not `page_links`); JOIN both endpoints to pages and compare source_id, NULL-safe via `IS NOT NULL` guards on both sides. - Fail-open on missing source_id column for pre-v0.18 brains: return 0 (no foreign edges to report) instead of throwing. 4 new test cases: footer fires on scoped query with foreign edge, --include-foreign suppresses footer, zero-foreign no-footer case, pluralization regression guard. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(sources): `gbrain sources current` + tier attribution (#1222) Federated-brain users running destructive ops (extract, import, purge) need a way to verify which source they're targeting BEFORE the op runs. Pre-fix, the only way was to grep config files or run the op with --dry-run and inspect output. 
New command: gbrain sources current # human output gbrain sources current --json # machine-readable gbrain sources current --source X # show what an explicit --source # X would resolve to (validates # X exists in the sources table) Output names BOTH the resolved source id AND which tier of the 6-tier resolution chain won (flag / env / dotfile / local_path / brain_default / seed_default), plus a `detail` line naming the winning signal (e.g. "GBRAIN_SOURCE=dept-x" or ".gbrain-source" or "/work/gstack/src"). Implementation: - New `resolveSourceWithTier()` in source-resolver.ts as an additive variant of `resolveSourceId()`. Walks the same 6 steps in the same order; just returns `{ source_id, tier, detail? }` instead of bare string. Existing `resolveSourceId()` unchanged — all callers continue working. - New `SOURCE_TIER_NAMES` const + `SourceTier` type export so the CLI, doctor (Tier 5 follow-up), and future MCP consumers share one vocabulary instead of inlining strings. - Help text updated; `current` subcommand registered in dispatcher. 11 new tests pin the 6-tier ladder + priority semantics. Existing 19 source-resolver tests still pass (regression preserved). Per codex CF3 (the existing src/core/source-resolver.ts was missed in the original plan). Re-uses the existing helper instead of inventing a duplicate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(extract): --source-id scopes extraction to one brain source (#1204) Federated brain users running `gbrain extract` had no way to scope extraction to one source. The DB path walks all sources together via listAllPageRefs(), which is correct for cross-source resolution but sometimes the user wants to extract per-source explicitly (e.g. re-running extract on a specific source after a manual import). The pre-existing `--source` flag is the data-source axis (fs|db) and can't be repurposed. 
New flag `--source-id <id>` joins it on the brain-source-id axis: gbrain extract all --source db --source-id alpha -> walks only alpha-source pages; extracts links + timeline from those, into the alpha source Important: the resolver maps (allSlugs + slugToSources) stay built from the FULL listAllPageRefs result, not the scoped subset. This ensures qualified cross-source wikilinks like `[[other-src:slug]]` still resolve correctly even when the extract walk is scoped — the filter is on which pages we extract FROM, not what we can resolve TO. Threaded through both `extractLinksFromDB` and `extractTimelineFromDB` with backward-compat: callers passing no opts get the old behavior. 4 new test cases pin: walks-all-without-flag baseline, alpha-only-when-scoped-to-alpha, beta-only-when-scoped-to-beta, empty-set-on-unknown-source. Note: #1204's wider "silent 0 links" report on federated brains has additional facets beyond this flag (resolver path edge cases on overlapping slugs). The scoped-walk fix gives users an explicit workaround AND closes the per-source extraction gap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(todos): file v0.37.7.0 follow-ups (#1173, #1204, T5N) Three items deferred from v0.37.7.0: 1. #1173 .sql indexing — verify-first gate found tree-sitter-sql.wasm missing from src/assets/wasm/grammars/. Dedicated wave needed: vendor the wasm, add .sql to walker filter, address slug-shape collision with #1172. 2. #1204 deeper investigation — wave added --source-id flag as workaround. Underlying silent-zero-links bug on unscoped federated extracts needs its own /investigate pass against a cross-source-duplicate-slug fixture. 3. Tier 5N doctor sweep for dead-lettered subagent jobs matching the #1151 fingerprint. Deferred to v0.37.8+ behind the islamabad doctor.ts conflict resolution. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(sync): walker skips git submodule directories (#1169) Sync walker descended into git submodules and indexed their markdown content as if it belonged to the parent brain. Users with submodules in their brain repo saw foreign content in their pages table. Fix: pruneDir gains an optional `parentDir` arg. When set, the helper stats `<parentDir>/<name>/.git` and skips the directory if `.git` exists as a FILE (gitfile pointer — the canonical submodule shape). Directories containing `.git` as a DIRECTORY (a real nested repo, not a submodule) are descended into; the inner `.git` dir itself is then dot-prefix-excluded. Callers updated to pass parentDir: - src/commands/extract.ts walkMarkdownFiles - src/core/cycle/transcript-discovery.ts walker Back-compat preserved: existing pruneDir(name) callers without parentDir get the pre-v0.37.7.0 behavior unchanged. Companion `.gitignore`-respect feature from PR #1159 (@jetsetterfl) NOT in this wave — it would require adding the `ignore` npm package as a dep, which the plan's "no new deps in this PR" gate excludes. Filed as follow-up TODO for a dedicated wave. 5 new test cases pin the submodule shape + back-compat + nested-repo ambiguity. Existing extract-fs / extract-db tests unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(brain-routing): document 6-tier source resolution chain (#1222) The convention skill didn't have a tier-by-tier reference for how gbrain resolves the active source. Users running federated brains had to read the source code to know which signal wins. Added: - Canonical 6-tier table (flag → env → dotfile → local_path → brain_default → seed_default) matching src/core/source-resolver.ts. - Pointer to `gbrain sources current` (new in v0.37.7.0) as the verification command. 
- The CLI-layer trust boundary note: operations.ts handlers don't read env/dotfile (preserves v0.34.1.0 source-isolation work for MCP callers). - Per-command flag map: --source, --source-id (extract), and --include-foreign (graph-query). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(import): --source-id flag routes pages to a brain source (#1167) `gbrain import --source dept-x ./pages` silently fell back to the default source because the CLI parser never consumed --source. PR #707's design intent excluded the flag explicitly; users had no signal their pages were going to the wrong place. #1167 + #1222 filed the regression. Fix: parse `--source-id <id>` (matching v0.37.7.0 extract.ts T2's naming convention — --source-id stays out of conflict with future axes that may want --source). When set, the flag value wins over any programmatic opts.sourceId; back-compat preserved for callers that pass sourceId via opts only. Also threaded into the positional-dir arg parser's flagValues set so `--source-id <value> <dir>` doesn't treat <value> as the dir. Note on related surfaces: - `gbrain query "X" --source_id dept-x` already routed correctly via the operations.ts query op (added in v0.34) — no fix needed. - `gbrain extract --source-id <id>` shipped in T2. - `gbrain sync --source <id>` already worked (pre-existing). - `gbrain sources current` (shipped in T4) is the verification tool — run it before destructive ops to confirm routing. Closes the silent-fallback for the import path. Co-authored with @tyad67-netizen (#1168), @hnshah (#1124, #1120), whose patches informed the shape; re-implemented against current master per the wave's "re-implement, credit, close" workflow. 3 new test cases pin: default-without-flag, --source-id-routes-correctly, flag-value-not-treated-as-dirArg. 
Co-Authored-By: tyad67-netizen <noreply@github.com> Co-Authored-By: hnshah <noreply@github.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(autopilot): reconnect classifier + launchd ThrottleInterval (#1162) Pre-fix: when database_url was unset/malformed, the DB-health-check reconnect loop logged `config.database_url undefined` forever because the catch swallowed every error type uniformly. launchd's KeepAlive=true respawned immediately on any exit, so even when the process did exit, it came right back into the same bad state. @colin477 reported the daemon-thrash pattern. Two-part fix: 1. In-process error classifier — `classifyReconnectError(err)`: - `unrecoverable` (database_url missing/empty/malformed, auth failure, no-brain-configured): exit immediately with a clear stderr line. Pattern-matched against postgres / config-loader error shapes. Tests pin the matcher against the #1162 fingerprint exactly. - `recoverable` (network blip, pool saturated, connection refused on a port coming up, Supabase 503): retry. Up to GBRAIN_AUTOPILOT_MAX_RECONNECT_FAILS (default 30 = ~5min) before finally giving up with `max_reconnect_fails_exceeded`. - Counter resets on every successful health probe or reconnect. 2. launchd plist gains `ThrottleInterval=60`. Combined with the in-process exit, launchd waits 60s before relaunching instead of immediate respawn. Pure-function `generateLaunchdPlist()` exported for tests. 16 new test cases: - 11 classifier cases (database_url shapes, malformed URL, auth, role-does-not-exist with quoted name, network blip, pool saturated, 503, non-Error inputs, case-insensitivity) - 5 plist generator cases (ThrottleInterval=60, KeepAlive preserved, wrapper path, XML escaping, StandardErrorPath). Pre-existing autopilot-lock-path tests unchanged — both fixes land cleanly side-by-side. 
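The two-part reconnect fix described above can be sketched as a classifier plus a failure counter. The name classifyReconnectError and the default limit of 30 come from the commit message; the regex list, the ReconnectTracker class, and the choice to give up once the count reaches the limit are assumptions for illustration, and the real matcher may use different patterns.

```typescript
// Illustrative sketch only; patterns and class are hypothetical.
type ReconnectClass = "unrecoverable" | "recoverable";

const UNRECOVERABLE_PATTERNS: RegExp[] = [
  /database_url/i,                    // missing, empty, or malformed config
  /invalid url/i,
  /password authentication failed/i, // auth failure
  /role ".*" does not exist/i,
  /no brain configured/i,
];

function classifyReconnectError(err: unknown): ReconnectClass {
  // Accept non-Error inputs by stringifying them.
  const message = err instanceof Error ? err.message : String(err);
  return UNRECOVERABLE_PATTERNS.some((p) => p.test(message))
    ? "unrecoverable"
    : "recoverable";
}

class ReconnectTracker {
  private fails = 0;
  private readonly maxFails: number;

  constructor(maxFails: number = 30) {
    this.maxFails = maxFails;
  }

  // Called on every successful health probe or reconnect.
  onSuccess(): void {
    this.fails = 0;
  }

  // Returns "exit" when the daemon should stop, "retry" otherwise.
  onFailure(err: unknown): "exit" | "retry" {
    if (classifyReconnectError(err) === "unrecoverable") return "exit";
    this.fails += 1;
    return this.fails >= this.maxFails ? "exit" : "retry";
  }
}
```

Anything the patterns do not recognize is treated as recoverable, so an unfamiliar transient error is retried rather than ending the daemon.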
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(oauth): confidential clients via custom /token middleware (#1166) v0.34.1.0 (#909) fixed PUBLIC PKCE clients (client_secret=undefined) by normalizing NULL → undefined in getClient. Confidential clients regressed: the MCP SDK's clientAuth middleware does plaintext `client.client_secret !== presented_secret` compare, but gbrain stores SHA-256 hashes, so the SDK's compare always failed for authorization_code and refresh_token grants on confidential clients. Result: /token returned `invalid_client` for every confidential exchange. Fix shape per locked-decision-5: custom /token middleware BEFORE the SDK's authRouter, similar to the pre-existing client_credentials handler. The middleware: 1. Detects confidential auth via `client_secret` in body (client_secret_post) OR `Authorization: Basic` header (client_secret_basic per RFC 6749 §2.3.1). 2. Falls through to the SDK when neither is present (public PKCE path stays canonical, preserves v0.34.1.0 behavior). 3. Calls new `verifyConfidentialClientSecret(clientId, presented)` on the provider which does SHA-256 hash compare ourselves (same shape as exchangeClientCredentials' existing hash check). 4. On verification success, calls existing `exchangeAuthorizationCode` / `exchangeRefreshToken` directly with the validated client. 5. RFC 6749 §5.2 error semantics: 401 invalid_client for auth failures, 400 invalid_grant for code/token problems. Per CLAUDE.md "GBRAIN:RLS_EXEMPT" annotation contract: this surface sits in front of the SDK's clientAuth and doesn't depend on the SDK's plaintext compare working — the SDK's middleware never fires for confidential paths the new middleware claims. 7 new test cases pin: correct-secret-returns-client, wrong-secret opaque rejection, non-existent client, public-client refuses the confidential path, case-sensitivity, soft-deleted revocation, verify-then-exchange-refresh round-trip with second-use rejection. 
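Step 1 and step 2 of the middleware (detect confidential auth, otherwise fall through) can be sketched as a pure function. The function name, the result shape, and the "malformed" case are assumptions for illustration; the SHA-256 comparison done by verifyConfidentialClientSecret is deliberately left out, and the decoding is simplified relative to RFC 6749 §2.3.1 (it does not treat "+" as a space).

```typescript
// Illustrative sketch only; not the middleware from the commit.
type ClientAuth =
  | { kind: "public" }      // no secret presented: defer to the SDK
  | { kind: "malformed" }   // secret presented but unusable
  | { kind: "confidential"; clientId: string; clientSecret: string };

function extractClientAuth(
  body: Record<string, string | undefined>,
  authorization?: string,
): ClientAuth {
  // client_secret_post: credentials in the form body.
  const postedSecret = body["client_secret"];
  const postedId = body["client_id"];
  if (postedSecret !== undefined) {
    return postedId !== undefined
      ? { kind: "confidential", clientId: postedId, clientSecret: postedSecret }
      : { kind: "malformed" };
  }
  // client_secret_basic: base64("id:secret") in the Authorization header.
  const match = /^Basic\s+(\S+)$/i.exec(authorization ?? "");
  if (match === null) return { kind: "public" };
  try {
    const decoded = atob(match[1] ?? "");
    const sep = decoded.indexOf(":");
    if (sep < 0) return { kind: "malformed" };
    return {
      kind: "confidential",
      clientId: decodeURIComponent(decoded.slice(0, sep)),
      clientSecret: decodeURIComponent(decoded.slice(sep + 1)),
    };
  } catch {
    return { kind: "malformed" }; // bad base64 or bad percent-encoding
  }
}

// Status codes named in the commit message for RFC 6749 §5.2 errors.
function tokenErrorStatus(error: "invalid_client" | "invalid_grant"): number {
  return error === "invalid_client" ? 401 : 400;
}
```

A "public" result corresponds to the fall-through path, so PKCE clients keep going through the SDK unchanged.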
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(doctor): 3 new checks — source routing + oauth + autopilot lock (T12/T13/T14) Three v0.37.7.0 doctor checks landing in one atomic commit (single file, shared merge-conflict surface with garrytan/islamabad-v3 per locked decision 1): 1. source_routing_health (T12 / #1167): Sample non-default sources for pages; warn when a registered source has zero pages (silent-collapse-to-default fingerprint). D5 lock: total-sample cap of 200 pages across all sources, with per-source cap = min(50, ceil(200/N)) so a 20-source CEO brain pays 200 selects, not 1000. Fix hint paste-ready to `gbrain sources current --json` for verification. 2. oauth_confidential_client_health (T13 / #1166): Probe every oauth_clients row. Confidential clients (auth_method != 'none') must have a non-NULL client_secret_hash; if any row claims confidential auth but stores NULL hash, that's the pre-v0.37.7.0 regression. Public clients (auth_method='none') correctly keep NULL hash per v0.34.1.0 #909. Fix hint: `gbrain auth revoke-client + register-client` OR `gbrain upgrade`. Pre-OAuth schemas (missing oauth_clients table) skip gracefully. 3. autopilot_lock_scope (T14 / #1226): Detect stale ~/.gbrain/autopilot.lock outside the current GBRAIN_HOME. Codex CF11: dangerous to paste-ready `rm` without verifying the owning PID isn't a live process. Hint reads the PID file and gives the user a `ps -p <pid>` check before any delete — matches sshd-style stale-lock recovery hints. 9 new test cases pin the canonical paths. Pre-existing 80+ doctor checks unchanged. Expected to conflict with garrytan/islamabad-v3 at merge time. The 3 new check functions live in their own block far from the islamabad skill_brain_first check; the conflict surface should be limited to the `checks.push(...)` call site near the end of runDoctor's DB-checks phase (~10 lines). 
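The sampling cap in the source_routing_health check is simple arithmetic: per-source cap = min(50, ceil(200/N)). A minimal sketch follows, with hypothetical constant and function names and an assumed guard for a non-positive source count.

```typescript
// Illustrative sketch only; names are hypothetical.
const TOTAL_SAMPLE_CAP = 200;
const PER_SOURCE_MAX = 50;

function perSourceSampleCap(sourceCount: number): number {
  // Guard (an assumption): nothing to sample without at least one source.
  if (!Number.isInteger(sourceCount) || sourceCount <= 0) return 0;
  return Math.min(PER_SOURCE_MAX, Math.ceil(TOTAL_SAMPLE_CAP / sourceCount));
}
```

For 20 sources this gives ceil(200/20) = 10 pages each, 200 in total, which matches the figure quoted in the commit message; with 4 or fewer sources the per-source limit of 50 applies instead.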
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(test): withEnv wrapper in source-resolver-with-tier (test-isolation lint) The new source-resolver-with-tier.test.ts from T4 mutated process.env.GBRAIN_SOURCE directly in two cases, which violates scripts/check-test-isolation.sh R1 (env mutations leak across parallel-loaded test files in the same shard process). Fix: wrap both mutation sites in withEnv() from test/helpers/with-env.ts, which saves+restores via try/finally per the canonical pattern in CLAUDE.md. Pure refactor — all 11 cases still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: update project documentation for v0.37.7.0 CHANGELOG.md — populated the "What landed" stub with the 18-commit brisbane wave (source-id flag threading, sources current subcommand, graph-query foreign-edge footer, autopilot lockfile scope + reconnect classifier + launchd ThrottleInterval, OAuth confidential client middleware, reindex-frontmatter connect fix, subagent terminal-on-resume fix, sync walker submodule skip, 3 new doctor checks, brain-routing.md convention skill). Voice: ELI10 lead, capability table, paste-ready verification, "what's safe to know" + "what we caught" sections. CLAUDE.md — extended Key Files annotations for the v0.37.7.0 changes: import/extract --source-id flags, sources current subcommand, graph-query --include-foreign, resolveSourceWithTier() additive helper, autopilot classifyReconnectError + generateLaunchdPlist exports, OAuth confidential client middleware, pruneDir submodule detection, subagent terminal short-circuit, 3 new doctor checks. Pinned by their test files. llms-full.txt — regenerated via `bun run build:llms` (CI guard at test/build-llms.test.ts will fail otherwise). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: rafaelreis-r <noreply@github.com>
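The save-and-restore pattern behind the withEnv fix can be sketched as follows. This is a simplified synchronous version that takes the environment object as an explicit argument; the signature of the real helper in test/helpers/with-env.ts is not shown in the commit message, so everything beyond the name withEnv and the try/finally shape is an assumption.

```typescript
// Illustrative sketch only; the real helper's signature may differ.
type Env = Record<string, string | undefined>;

function withEnv<T>(env: Env, overrides: Env, fn: () => T): T {
  const keys = Object.keys(overrides);
  // Remember which keys existed and what they held.
  const saved = keys.map((key) => ({
    key,
    existed: key in env,
    value: env[key],
  }));
  for (const key of keys) {
    const value = overrides[key];
    if (value === undefined) delete env[key];
    else env[key] = value;
  }
  try {
    return fn();
  } finally {
    // Restore even when fn throws, so nothing leaks into other tests.
    for (const entry of saved) {
      if (entry.existed) env[entry.key] = entry.value;
      else delete env[entry.key];
    }
  }
}
```

A test would wrap its body in the callback, for example withEnv(env, { GBRAIN_SOURCE: "dept-x" }, () => ...), rather than assigning to the environment directly.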
Fixes #1226.
Problem
gbrain autopilot hardcodes the lock file to $HOME/.gbrain/autopilot.lock, ignoring GBRAIN_HOME. When multiple brains share a host, only one autopilot ever runs; the others silently exit on every launchd/systemd respawn, producing tens of thousands of invisible failures.
See #1226 for the full story (46,388 silent failures observed in ~3 days on our dual-brain setup).
Fix
Two-line change: derive the lock path from GBRAIN_HOME when set, fall back to $HOME/.gbrain for single-brain installs.
The same fallback is applied to the mkdirSync call so the directory creation is also scoped correctly.
Verification
bun run typecheck clean.
autopilot.lock files under their respective GBRAIN_HOME dirs.
Compatibility