feat(sync): opt-in --respect-gitignore for the full-import walker (#1073) #1159
jetsetterfl wants to merge 1 commit into
…rrytan#1073)

collectSyncableFiles only skipped dot-dirs / node_modules / ops, so the full / first / `--full` import walked gitignored build output (dist/, out/, coverage/, __pycache__/, ...) and admitted every file in there as "code" (CODE_EXTENSIONS covers .json/.yaml/.toml/.html/.css). On repos with large gitignored trees this bloats the DB, raises embedding cost, pollutes search with stale fixtures, and can wedge the chunker on a pathological file: exactly the stall reported in garrytan#1073.

The incremental sync path is already git-based and excludes these; this makes the full-import walker consistent with it, opt-in.

- `collectSyncableFiles` gains `respectGitignore` (CollectOpts). When set and the root is a git work tree, `git ls-files -o -i --exclude-standard --directory` builds the ignored set; entries are pruned at *descent* time so a huge gitignored dir is never recursed into (addresses the walk-stall, not just emit-time filtering). Empty set / non-git root / no git => zero overhead, legacy behavior.
- Default OFF. Preserves behavior for dotfile/secret brains that deliberately keep content out of git but want it brain-searchable (the "why not just default-on" case in garrytan#1073).
- Wired through: `gbrain sync --respect-gitignore` (and `--no-respect-gitignore` to override an enabled config), `sync.respect_gitignore` config knob (flag wins), `gbrain import --respect-gitignore`, and the `sync --all` cost preview so the estimate matches what will actually be walked.
- Tests: default still admits gitignored output (legacy preserved), opt-in prunes dist/ + coverage/ while keeping src/, non-git root falls back gracefully without throwing.

`--ignore-from FILE` and `.gbrainignore` (garrytan#449) are intentionally left as follow-ups; this lands the easy win for repos that already express the intent in .gitignore.
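The descent-time pruning described above can be sketched roughly as follows. This is a minimal illustration, not the code in the diff: `parseIgnoredOutput`, `loadIgnoredSet`, and `walk` are hypothetical names, and only the `git ls-files` arguments are taken from the description. The walker here also omits the existing dot-dir / node_modules / ops skips for brevity.

```typescript
import { execFileSync } from "node:child_process";
import { readdirSync } from "node:fs";
import { join, relative } from "node:path";

// Turn `git ls-files` output into a set of root-relative paths.
// With --directory, a fully ignored directory is listed once with a trailing "/".
// Assumption: paths contain no characters that git would quote (no -z handling here).
function parseIgnoredOutput(stdout: string): Set<string> {
  const ignored = new Set<string>();
  for (const line of stdout.split("\n")) {
    const entry = line.trim().replace(/\/$/, "");
    if (entry) ignored.add(entry);
  }
  return ignored;
}

// Hypothetical helper: an empty set means "no git / not a work tree / nothing ignored",
// so the caller keeps the legacy walk unchanged.
function loadIgnoredSet(root: string): Set<string> {
  try {
    const out = execFileSync(
      "git",
      ["-C", root, "ls-files", "-o", "-i", "--exclude-standard", "--directory"],
      { encoding: "utf8", stdio: ["ignore", "pipe", "ignore"] },
    );
    return parseIgnoredOutput(out);
  } catch {
    return new Set<string>();
  }
}

// Recursive walk that checks the ignored set BEFORE descending,
// so an ignored directory is never read at all.
function walk(root: string, dir: string, ignored: Set<string>, out: string[] = []): string[] {
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const full = join(dir, entry.name);
    const rel = relative(root, full).split("\\").join("/");
    if (ignored.has(rel)) continue; // pruned at descent time
    if (entry.isDirectory()) walk(root, full, ignored, out);
    else out.push(rel);
  }
  return out;
}
```

A caller would do something like `walk(root, root, loadIgnoredSet(root))`; because the set is empty outside a git work tree, the same call degrades to a plain walk.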
…dential clients (#1253) * fix(reindex-frontmatter): connect engine before query (#1225) `createEngine()` from src/core/engine-factory.ts only constructs the engine; callers MUST call connect() before any executeRaw. The reindex-frontmatter CLI was constructing the engine and going straight to countAffected, which crashed on PGLite with "PGLite not connected. Call connect() first." even on --dry-run. Fix follows the existing-command pattern (src/commands/auth.ts, src/commands/backfill.ts, src/commands/integrity.ts all do the same): pass toEngineConfig(cfg) into both createEngine() AND engine.connect(), then engine.initSchema() (idempotent on a current schema, ~1ms cost). Pre-fix verification: codex outside-voice CF5 flagged the related "can't import connectEngine from cli.ts" misdirection in the original fix plan. This implementation uses the canonical sibling pattern instead. Regression test pinned at test/reindex-frontmatter-connect.test.ts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: bump VERSION to 0.37.7.0 + stub CHANGELOG v0.37.5.0 claimed by #1229 (warsaw-v4); v0.37.6.0 by #1246 (OpenRouter recipe). v0.37.7.0 is the next free slot for this fix wave. CHANGELOG entry stubbed in user-facing voice per CLAUDE.md "CHANGELOG voice + release-summary format" — ELI10 lead-first, real fix details below. The "## To take advantage of v0.37.7.0" block follows the v0.13+ self-repair pattern from CLAUDE.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(subagent): short-circuit terminal-on-resume (#1151) Bug: when the worker resumed a subagent job whose persisted last message was an assistant turn with text-only content (no tool_use blocks), the replay reconciler at subagent.ts:241-247 had no branch for that case. The main loop then called messages.create against a conversation ending in assistant role, which Sonnet 4.6+ rejects with HTTP 400 "This model does not support assistant message prefill." 
3 retries later → dead-letter, despite all the job's work having committed in earlier turns. @zscgeek's bug report pinned this exactly: dream-cycle Otter corpus runs hit ~7% dead-letter rate, every dead job's last subagent_messages row was a text-only synthesis summary listing slugs that already existed in `pages`. Their proposed fix mirrors this implementation. Fix: add an else branch to the assistant-tail check that mirrors the live-loop terminal logic at subagent.ts:440-447 — reconstruct finalText from the persisted text blocks, return stop_reason='end_turn' immediately. No LLM call, no schema change. Two new regression cases: - text-only terminal on resume returns immediately with zero messages.create calls - tool-use replay path unchanged (existing behavior preserved) Codex outside-voice (CF13) initially flagged this fix as mis-targeted, claiming subagent.ts already handled the case. /investigate run revealed the live-loop terminal at :440-447 was covered but the REPLAY-path terminal at :241-247 was missing — both branches need symmetric handling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(autopilot): scope lockfile to GBRAIN_HOME (#1226) The autopilot lockfile was hardcoded at `~/.gbrain/autopilot.lock` (via `process.env.HOME`), bypassing GBRAIN_HOME. Two brains pointed at different GBRAIN_HOME directories still wrote to the same global lockfile; one would silently take over the other on each restart. Fix: route through `gbrainPath('autopilot.lock')` from src/core/config.ts (imported aliased as gbrainHomePath since the local `gbrainPath` var in installAutopilot references the CLI binary path). The mkdirSync(`~/.gbrain`) call also routes through the helper so the directory is created in the right place too. Co-authored with @rafaelreis-r — same fix shape as PR #1227, re-implemented against current master per the wave's "re-implement, credit, close" workflow. 
Tests cover: one GBRAIN_HOME → one canonical lock; two GBRAIN_HOME values → two distinct locks; default fall-through still works. Co-Authored-By: rafaelreis-r <noreply@github.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(graph-query): foreign-edge footer + --include-foreign (#1153) The graph-query CLI silently dropped edges to pages in other sources on federated brains. Users had no signal those edges existed unless they read the source code. Fix: - New --include-foreign flag (off by default, preserves the existing scoping contract; on = explicit cross-source traversal). - After every traversal, count edges from rootSlug whose target page lives in a different source. When count > 0 AND user didn't opt in, emit a stderr footer: `(N edge(s) to foreign-source pages hidden; pass --include-foreign to include them)` - The "no edges found" path also runs the count + footer so users discover foreign edges even when scoped traversal returned nothing. - Thin-client path skips the count (engine query not available); future T1 work threads source resolution through MCP for that path. - Single quotation correctness in count SQL: page_links table is `links` (not `page_links`); JOIN both endpoints to pages and compare source_id, NULL-safe via `IS NOT NULL` guards on both sides. - Fail-open on missing source_id column for pre-v0.18 brains: return 0 (no foreign edges to report) instead of throwing. 4 new test cases: footer fires on scoped query with foreign edge, --include-foreign suppresses footer, zero-foreign no-footer case, pluralization regression guard. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(sources): `gbrain sources current` + tier attribution (#1222) Federated-brain users running destructive ops (extract, import, purge) need a way to verify which source they're targeting BEFORE the op runs. Pre-fix, the only way was to grep config files or run the op with --dry-run and inspect output. 
New command:

    gbrain sources current              # human output
    gbrain sources current --json       # machine-readable
    gbrain sources current --source X   # show what an explicit --source
                                        # X would resolve to (validates
                                        # X exists in the sources table)

Output names BOTH the resolved source id AND which tier of the 6-tier resolution chain won (flag / env / dotfile / local_path / brain_default / seed_default), plus a `detail` line naming the winning signal (e.g. "GBRAIN_SOURCE=dept-x" or ".gbrain-source" or "/work/gstack/src").

Implementation:
- New `resolveSourceWithTier()` in source-resolver.ts as an additive variant of `resolveSourceId()`. Walks the same 6 steps in the same order; just returns `{ source_id, tier, detail? }` instead of bare string. Existing `resolveSourceId()` unchanged: all callers continue working.
- New `SOURCE_TIER_NAMES` const + `SourceTier` type export so the CLI, doctor (Tier 5 follow-up), and future MCP consumers share one vocabulary instead of inlining strings.
- Help text updated; `current` subcommand registered in dispatcher.

11 new tests pin the 6-tier ladder + priority semantics. Existing 19 source-resolver tests still pass (regression preserved).

Per codex CF3 (the existing src/core/source-resolver.ts was missed in the original plan). Re-uses the existing helper instead of inventing a duplicate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(extract): --source-id scopes extraction to one brain source (#1204)

Federated brain users running `gbrain extract` had no way to scope extraction to one source. The DB path walks all sources together via listAllPageRefs(), which is correct for cross-source resolution but sometimes the user wants to extract per-source explicitly (e.g. re-running extract on a specific source after a manual import). The pre-existing `--source` flag is the data-source axis (fs|db) and can't be repurposed.
New flag `--source-id <id>` joins it on the brain-source-id axis: gbrain extract all --source db --source-id alpha -> walks only alpha-source pages; extracts links + timeline from those, into the alpha source Important: the resolver maps (allSlugs + slugToSources) stay built from the FULL listAllPageRefs result, not the scoped subset. This ensures qualified cross-source wikilinks like `[[other-src:slug]]` still resolve correctly even when the extract walk is scoped — the filter is on which pages we extract FROM, not what we can resolve TO. Threaded through both `extractLinksFromDB` and `extractTimelineFromDB` with backward-compat: callers passing no opts get the old behavior. 4 new test cases pin: walks-all-without-flag baseline, alpha-only-when-scoped-to-alpha, beta-only-when-scoped-to-beta, empty-set-on-unknown-source. Note: #1204's wider "silent 0 links" report on federated brains has additional facets beyond this flag (resolver path edge cases on overlapping slugs). The scoped-walk fix gives users an explicit workaround AND closes the per-source extraction gap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(todos): file v0.37.7.0 follow-ups (#1173, #1204, T5N) Three items deferred from v0.37.7.0: 1. #1173 .sql indexing — verify-first gate found tree-sitter-sql.wasm missing from src/assets/wasm/grammars/. Dedicated wave needed: vendor the wasm, add .sql to walker filter, address slug-shape collision with #1172. 2. #1204 deeper investigation — wave added --source-id flag as workaround. Underlying silent-zero-links bug on unscoped federated extracts needs its own /investigate pass against a cross-source-duplicate-slug fixture. 3. Tier 5N doctor sweep for dead-lettered subagent jobs matching the #1151 fingerprint. Deferred to v0.37.8+ behind the islamabad doctor.ts conflict resolution. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(sync): walker skips git submodule directories (#1169) Sync walker descended into git submodules and indexed their markdown content as if it belonged to the parent brain. Users with submodules in their brain repo saw foreign content in their pages table. Fix: pruneDir gains an optional `parentDir` arg. When set, the helper stats `<parentDir>/<name>/.git` and skips the directory if `.git` exists as a FILE (gitfile pointer — the canonical submodule shape). Directories containing `.git` as a DIRECTORY (a real nested repo, not a submodule) are descended into; the inner `.git` dir itself is then dot-prefix-excluded. Callers updated to pass parentDir: - src/commands/extract.ts walkMarkdownFiles - src/core/cycle/transcript-discovery.ts walker Back-compat preserved: existing pruneDir(name) callers without parentDir get the pre-v0.37.7.0 behavior unchanged. Companion `.gitignore`-respect feature from PR #1159 (@jetsetterfl) NOT in this wave — it would require adding the `ignore` npm package as a dep, which the plan's "no new deps in this PR" gate excludes. Filed as follow-up TODO for a dedicated wave. 5 new test cases pin the submodule shape + back-compat + nested-repo ambiguity. Existing extract-fs / extract-db tests unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(brain-routing): document 6-tier source resolution chain (#1222) The convention skill didn't have a tier-by-tier reference for how gbrain resolves the active source. Users running federated brains had to read the source code to know which signal wins. Added: - Canonical 6-tier table (flag → env → dotfile → local_path → brain_default → seed_default) matching src/core/source-resolver.ts. - Pointer to `gbrain sources current` (new in v0.37.7.0) as the verification command. 
- The CLI-layer trust boundary note: operations.ts handlers don't read env/dotfile (preserves v0.34.1.0 source-isolation work for MCP callers). - Per-command flag map: --source, --source-id (extract), and --include-foreign (graph-query). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(import): --source-id flag routes pages to a brain source (#1167) `gbrain import --source dept-x ./pages` silently fell back to the default source because the CLI parser never consumed --source. PR #707's design intent excluded the flag explicitly; users had no signal their pages were going to the wrong place. #1167 + #1222 filed the regression. Fix: parse `--source-id <id>` (matching v0.37.7.0 extract.ts T2's naming convention — --source-id stays out of conflict with future axes that may want --source). When set, the flag value wins over any programmatic opts.sourceId; back-compat preserved for callers that pass sourceId via opts only. Also threaded into the positional-dir arg parser's flagValues set so `--source-id <value> <dir>` doesn't treat <value> as the dir. Note on related surfaces: - `gbrain query "X" --source_id dept-x` already routed correctly via the operations.ts query op (added in v0.34) — no fix needed. - `gbrain extract --source-id <id>` shipped in T2. - `gbrain sync --source <id>` already worked (pre-existing). - `gbrain sources current` (shipped in T4) is the verification tool — run it before destructive ops to confirm routing. Closes the silent-fallback for the import path. Co-authored with @tyad67-netizen (#1168), @hnshah (#1124, #1120), whose patches informed the shape; re-implemented against current master per the wave's "re-implement, credit, close" workflow. 3 new test cases pin: default-without-flag, --source-id-routes-correctly, flag-value-not-treated-as-dirArg. 
Co-Authored-By: tyad67-netizen <noreply@github.com>
Co-Authored-By: hnshah <noreply@github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(autopilot): reconnect classifier + launchd ThrottleInterval (#1162)

Pre-fix: when database_url was unset/malformed, the DB-health-check reconnect loop logged `config.database_url undefined` forever because the catch swallowed every error type uniformly. launchd's KeepAlive=true respawned immediately on any exit, so even when the process did exit, it came right back into the same bad state. @colin477 reported the daemon-thrash pattern.

Two-part fix:

1. In-process error classifier, `classifyReconnectError(err)`:
   - `unrecoverable` (database_url missing/empty/malformed, auth failure, no-brain-configured): exit immediately with a clear stderr line. Pattern-matched against postgres / config-loader error shapes. Tests pin the matcher against the #1162 fingerprint exactly.
   - `recoverable` (network blip, pool saturated, connection refused on a port coming up, Supabase 503): retry. Up to GBRAIN_AUTOPILOT_MAX_RECONNECT_FAILS (default 30 = ~5min) before finally giving up with `max_reconnect_fails_exceeded`.
   - Counter resets on every successful health probe or reconnect.
2. launchd plist gains `ThrottleInterval=60`. Combined with the in-process exit, launchd waits 60s before relaunching instead of immediate respawn. Pure-function `generateLaunchdPlist()` exported for tests.

16 new test cases:
- 11 classifier cases (database_url shapes, malformed URL, auth, role-does-not-exist with quoted name, network blip, pool saturated, 503, non-Error inputs, case-insensitivity)
- 5 plist generator cases (ThrottleInterval=60, KeepAlive preserved, wrapper path, XML escaping, StandardErrorPath).

Pre-existing autopilot-lock-path tests unchanged; both fixes land cleanly side-by-side.
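As a rough illustration of the two-way classification and the failure budget described in this commit, here is a minimal sketch. The regular expressions are illustrative guesses, not the patterns the real `classifyReconnectError` uses, and `createReconnectBudget` is a hypothetical helper introduced only for this example.

```typescript
type ReconnectClass = "unrecoverable" | "recoverable";

// Illustrative patterns only (assumption): the commit message names the categories,
// not the exact strings the real matcher looks for.
const UNRECOVERABLE_PATTERNS: RegExp[] = [
  /database_url.*(undefined|missing|empty)/i,
  /invalid url/i,
  /password authentication failed/i,
  /role ".*" does not exist/i,
];

// Anything that does not look like a configuration or auth problem is treated as transient.
function classifyReconnectError(err: unknown): ReconnectClass {
  const message = err instanceof Error ? err.message : String(err);
  return UNRECOVERABLE_PATTERNS.some((re) => re.test(message))
    ? "unrecoverable"
    : "recoverable";
}

// Hypothetical budget tracker: counts consecutive recoverable failures and
// resets on success. Returns an exit reason, or null to keep retrying.
function createReconnectBudget(maxFails = 30) {
  let fails = 0;
  return {
    onSuccess(): void {
      fails = 0;
    },
    onFailure(err: unknown): string | null {
      if (classifyReconnectError(err) === "unrecoverable") return "unrecoverable";
      fails += 1;
      return fails >= maxFails ? "max_reconnect_fails_exceeded" : null;
    },
  };
}
```

The point of splitting the classifier from the counter is that the classifier stays a pure function that is easy to pin with table-driven tests, while the loop only has to act on the returned reason.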
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(oauth): confidential clients via custom /token middleware (#1166)

v0.34.1.0 (#909) fixed PUBLIC PKCE clients (client_secret=undefined) by normalizing NULL → undefined in getClient. Confidential clients regressed: the MCP SDK's clientAuth middleware does plaintext `client.client_secret !== presented_secret` compare, but gbrain stores SHA-256 hashes, so the SDK's compare always failed for authorization_code and refresh_token grants on confidential clients. Result: /token returned `invalid_client` for every confidential exchange.

Fix shape per locked-decision-5: custom /token middleware BEFORE the SDK's authRouter, similar to the pre-existing client_credentials handler. The middleware:

1. Detects confidential auth via `client_secret` in body (client_secret_post) OR `Authorization: Basic` header (client_secret_basic per RFC 6749 §2.3.1).
2. Falls through to the SDK when neither is present (public PKCE path stays canonical, preserves v0.34.1.0 behavior).
3. Calls new `verifyConfidentialClientSecret(clientId, presented)` on the provider which does SHA-256 hash compare ourselves (same shape as exchangeClientCredentials' existing hash check).
4. On verification success, calls existing `exchangeAuthorizationCode` / `exchangeRefreshToken` directly with the validated client.
5. RFC 6749 §5.2 error semantics: 401 invalid_client for auth failures, 400 invalid_grant for code/token problems.

Per CLAUDE.md "GBRAIN:RLS_EXEMPT" annotation contract: this surface sits in front of the SDK's clientAuth and doesn't depend on the SDK's plaintext compare working; the SDK's middleware never fires for confidential paths the new middleware claims.

7 new test cases pin: correct-secret-returns-client, wrong-secret opaque rejection, non-existent client, public-client refuses the confidential path, case-sensitivity, soft-deleted revocation, verify-then-exchange-refresh round-trip with second-use rejection.
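The hash-compare step and the `Authorization: Basic` detection can be sketched as below. This is an illustration under stated assumptions, not the provider's code: it assumes the stored value is a hex-encoded SHA-256 digest and uses a constant-time compare, neither of which the commit message confirms. `secretMatches` and `parseBasicAuth` are hypothetical names.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

function sha256Hex(value: string): string {
  return createHash("sha256").update(value, "utf8").digest("hex");
}

// Compare a presented secret against a stored hash (assumed hex-encoded).
// A null hash means a public client, which must not pass the confidential path.
function secretMatches(presented: string, storedHashHex: string | null): boolean {
  if (!storedHashHex) return false;
  const a = Buffer.from(sha256Hex(presented), "hex");
  const b = Buffer.from(storedHashHex, "hex");
  return a.length === b.length && timingSafeEqual(a, b);
}

// Extract client credentials from an `Authorization: Basic` header.
// RFC 6749 §2.3.1 additionally form-urlencodes both values before base64;
// that decoding step is omitted in this sketch.
function parseBasicAuth(
  header: string | undefined,
): { clientId: string; secret: string } | null {
  if (!header || !/^basic /i.test(header)) return null;
  const decoded = Buffer.from(header.slice(6).trim(), "base64").toString("utf8");
  const idx = decoded.indexOf(":");
  if (idx < 0) return null;
  return { clientId: decoded.slice(0, idx), secret: decoded.slice(idx + 1) };
}
```

A middleware built on these would return 401 `invalid_client` when `secretMatches` is false and fall through to the SDK when neither a body secret nor a Basic header is present.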
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(doctor): 3 new checks: source routing + oauth + autopilot lock (T12/T13/T14)

Three v0.37.7.0 doctor checks landing in one atomic commit (single file, shared merge-conflict surface with garrytan/islamabad-v3 per locked decision 1):

1. source_routing_health (T12 / #1167): Sample non-default sources for pages; warn when a registered source has zero pages (silent-collapse-to-default fingerprint). D5 lock: total-sample cap of 200 pages across all sources, with per-source cap = min(50, ceil(200/N)) so a 20-source CEO brain pays 200 selects, not 1000. Fix hint paste-ready to `gbrain sources current --json` for verification.
2. oauth_confidential_client_health (T13 / #1166): Probe every oauth_clients row. Confidential clients (auth_method != 'none') must have a non-NULL client_secret_hash; if any row claims confidential auth but stores NULL hash, that's the pre-v0.37.7.0 regression. Public clients (auth_method='none') correctly keep NULL hash per v0.34.1.0 #909. Fix hint: `gbrain auth revoke-client + register-client` OR `gbrain upgrade`. Pre-OAuth schemas (missing oauth_clients table) skip gracefully.
3. autopilot_lock_scope (T14 / #1226): Detect stale ~/.gbrain/autopilot.lock outside the current GBRAIN_HOME. Codex CF11: dangerous to paste-ready `rm` without verifying the owning PID isn't a live process. Hint reads the PID file and gives the user a `ps -p <pid>` check before any delete; matches sshd-style stale-lock recovery hints.

9 new test cases pin the canonical paths. Pre-existing 80+ doctor checks unchanged.

Expected to conflict with garrytan/islamabad-v3 at merge time. The 3 new check functions live in their own block far from the islamabad skill_brain_first check; the conflict surface should be limited to the `checks.push(...)` call site near the end of runDoctor's DB-checks phase (~10 lines).
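The per-source cap in check 1 is simple arithmetic; a small sketch makes the 20-source figure concrete. `perSourceCap` is a hypothetical name, and the constants are the ones quoted in the commit message.

```typescript
const TOTAL_SAMPLE_CAP = 200; // total pages sampled across all sources
const PER_SOURCE_MAX = 50;    // ceiling for any single source

// per-source cap = min(50, ceil(200 / N)), where N is the number of non-default sources.
function perSourceCap(sourceCount: number): number {
  if (sourceCount <= 0) return 0;
  return Math.min(PER_SOURCE_MAX, Math.ceil(TOTAL_SAMPLE_CAP / sourceCount));
}
```

For N = 20 this gives min(50, 10) = 10 pages per source, so 20 × 10 = 200 selects, compared with 20 × 50 = 1000 under a flat per-source cap of 50.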
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(test): withEnv wrapper in source-resolver-with-tier (test-isolation lint) The new source-resolver-with-tier.test.ts from T4 mutated process.env.GBRAIN_SOURCE directly in two cases, which violates scripts/check-test-isolation.sh R1 (env mutations leak across parallel-loaded test files in the same shard process). Fix: wrap both mutation sites in withEnv() from test/helpers/with-env.ts, which saves+restores via try/finally per the canonical pattern in CLAUDE.md. Pure refactor — all 11 cases still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: update project documentation for v0.37.7.0 CHANGELOG.md — populated the "What landed" stub with the 18-commit brisbane wave (source-id flag threading, sources current subcommand, graph-query foreign-edge footer, autopilot lockfile scope + reconnect classifier + launchd ThrottleInterval, OAuth confidential client middleware, reindex-frontmatter connect fix, subagent terminal-on-resume fix, sync walker submodule skip, 3 new doctor checks, brain-routing.md convention skill). Voice: ELI10 lead, capability table, paste-ready verification, "what's safe to know" + "what we caught" sections. CLAUDE.md — extended Key Files annotations for the v0.37.7.0 changes: import/extract --source-id flags, sources current subcommand, graph-query --include-foreign, resolveSourceWithTier() additive helper, autopilot classifyReconnectError + generateLaunchdPlist exports, OAuth confidential client middleware, pruneDir submodule detection, subagent terminal short-circuit, 3 new doctor checks. Pinned by their test files. llms-full.txt — regenerated via `bun run build:llms` (CI guard at test/build-llms.test.ts will fail otherwise). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: rafaelreis-r <noreply@github.com>
Re: the "requires
Just want to flag for the record that this PR as it stands adds no npm dependencies. The diff touches only
execFileSync('git', ['-C', root, 'ls-files', '-o', '-i', '--exclude-standard', '--directory'], …)

Two paths, your call:
Path A: keep the
Path B: if you'd rather not depend on the

// Parse .gitignore (+ .git/info/exclude) into ordered rules.
type Rule = { re: RegExp; negate: boolean; dirOnly: boolean };
function loadRules(root: string): Rule[] {
const rules: Rule[] = [];
for (const line of readIgnoreLines(root)) { // skip '' and '#...'
let pat = line.trim();
if (!pat || pat.startsWith('#')) continue;
const negate = pat.startsWith('!'); if (negate) pat = pat.slice(1);
const dirOnly = pat.endsWith('/'); if (dirOnly) pat = pat.replace(/\/$/, '');
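// A separator at the start OR in the middle anchors the pattern to the root (gitignore rule).
const anchored = pat.includes('/'); if (pat.startsWith('/')) pat = pat.slice(1);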
rules.push({ re: globToRegExp(pat, anchored), negate, dirOnly });
}
return rules;
}
// gitignore glob -> RegExp: ** = any depth, * = within a segment, ? = one char.
// Unanchored patterns match at any path depth; anchored match from root.
function globToRegExp(pat: string, anchored: boolean): RegExp {
  // Single pass: chained .replace() calls would re-process the '*' and '?'
  // characters inside the regex fragments emitted for '**/' and '**'.
  const body = pat.replace(/\*\*\/|\*\*|\*|\?|[.+^${}()|[\]\\]/g, (m) => {
    if (m === '**/') return '(?:.*/)?'; // **/ -> any number of dirs
    if (m === '**') return '.*';        // ** -> anything
    if (m === '*') return '[^/]*';      // * -> within one segment
    if (m === '?') return '[^/]';       // ? -> one char
    return '\\' + m;                    // escape regex metachars
  });
  const prefix = anchored ? '^' : '(?:^|/)';
  return new RegExp(`${prefix}${body}(?:/.*)?$`); // dir matches its children
}
// Last matching rule wins (gitignore precedence); negation re-includes.
function isIgnored(relPath: string, isDir: boolean, rules: Rule[]): boolean {
let ignored = false;
for (const r of rules) {
if (r.dirOnly && !isDir) continue;
if (r.re.test(relPath)) ignored = !r.negate;
}
return ignored;
}

Then in the walker, prune at descent time exactly like the current PR:

const rel = relative(root, full);
if (isIgnored(rel, isDirEntry, rules)) continue;

That's ~40 lines, no dependency, and gives the same directory-pruning win for both git and non-git roots, at the cost of owning a (small) slice of gitignore semantics ourselves. It won't be 100% spec-complete the way
Happy to push either shape: Path A is already in the diff and is the lower-risk merge; Path B if you want to drop the

This comment was drafted with the help of Claude Code.
collectSyncableFiles (the full-sync / dry-run enumerator) reimplemented its own directory skip list inline (node_modules || ops), bypassing the canonical pruneDir gate and ignoring .gitignore entirely. On a Laravel/PHP repo this descended into vendor/ (~50k Composer files), storage/, and public/build/, trying to import 52k dependency/build files and flooding the index with library internals (a 35-min sync that never finished, killed by the watchdog at 3%).

- collectSyncableFiles now enumerates via `git ls-files --cached --others --exclude-standard` when dir is a git work tree, so the walk honors .gitignore (tracked + untracked-not-ignored). Falls back to the FS walk for non-git dirs. EroLab: 52164 -> 1028 files.
- The FS fallback now prunes through the canonical pruneDir() instead of a drifted inline list, so the two skip lists can't diverge again.
- PRUNE_DIR_NAMES gains vendor/dist/build (dependency + build-output trees).

Addresses #1483 (.gbrainignore), #1159 (--respect-gitignore), and the maintainer's #1942 vendor/dist/build prune.

Walker regression suites (sync-walker-symlink, brain-writer-walk-prune, sync, sync-walker-submodule) green: 90 pass.
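A minimal sketch of the git-based enumeration with an FS-walk fallback signal, under stated assumptions: the `-z` flag, the `maxBuffer` value, and the helper names `parseLsFiles` / `listViaGit` are choices made for this example, and whether the real code also applies the prune list to git's output is not stated in the commit message. The `PRUNE_DIR_NAMES` set here holds only the names mentioned above.

```typescript
import { execFileSync } from "node:child_process";

// Only the names mentioned in the commit message; the real set may contain more.
const PRUNE_DIR_NAMES = new Set(["node_modules", "vendor", "dist", "build"]);

// True when any directory segment of a root-relative path is on the prune list.
function hasPrunedSegment(relPath: string): boolean {
  return relPath.split("/").slice(0, -1).some((seg) => PRUNE_DIR_NAMES.has(seg));
}

// Parse NUL-separated `git ls-files -z` output and drop pruned trees.
// Assumption: pruning is applied to tracked files as well.
function parseLsFiles(stdout: string): string[] {
  return stdout.split("\0").filter((p) => p.length > 0 && !hasPrunedSegment(p));
}

// Returns null when dir is not a git work tree (or git is unavailable),
// which tells the caller to fall back to the filesystem walk.
function listViaGit(dir: string): string[] | null {
  try {
    const out = execFileSync(
      "git",
      ["-C", dir, "ls-files", "--cached", "--others", "--exclude-standard", "-z"],
      { encoding: "utf8", maxBuffer: 256 * 1024 * 1024, stdio: ["ignore", "pipe", "ignore"] },
    );
    return parseLsFiles(out);
  } catch {
    return null;
  }
}
```

Using `-z` avoids git's path quoting for unusual file names, and returning `null` rather than an empty array keeps "not a repo" distinguishable from "repo with no syncable files".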
…unity PRs (#2128) * fix(oauth): default omitted authorize scope to client's full grant When a client omits `scope` on /authorize, the authorize() grant computed `(params.scopes || []).filter(...)` → the empty set. That empty grant was written to oauth_codes and propagated into the access AND refresh tokens, so every request failed `insufficient_scope` even though the client was registered with e.g. `read write`. Because refresh inherits the stored grant, it never self-healed — reconnecting just minted another empty-scoped token. Some MCP connectors (observed with Claude Desktop) omit `scope` on /authorize, so they hit this on every connection. Fix: when no scope is requested, default to the client's full registered scope (RFC 6749 §3.3 permits a server default). This mirrors exchangeClientCredentials, which already does `requestedScope ? ... : allowedScopes`. The result is still clamped to the allowed set, so an explicit over-broad request cannot escalate. Adds test/oauth-authorize-scope-default.test.ts covering: omitted/empty → inherits full grant; explicit subset honored; clamp preserved (over-broad and disallowed-only requests cannot escalate or trigger inheritance). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(sync): skip Python venv/ in the code walker collectSyncableFiles (first-sync walker) and the incremental PRUNE_DIR_NAMES set skipped node_modules but not Python venv/. On a Python repo the walker descended into venv/ (thousands of files); the resulting slug collisions crashed putPage's INSERT ... ON CONFLICT ... RETURNING with "undefined is not an object (evaluating 'row.deleted_at')". Add `venv` alongside node_modules in both the import.ts inline skip and PRUNE_DIR_NAMES. venv is the Python equivalent of node_modules. 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(gateway): carry asymmetric input_type across the AI SDK to the wire body (#1400)

dimsProviderOptions() threads input_type ('query' | 'document') into providerOptions.openaiCompatible for asymmetric models (ZE zembed-1, Voyage v3+), but the AI SDK's openai-compatible adapter validates providerOptions against a fixed schema and silently drops the field before building the HTTP body. Every embedQuery() was therefore encoded document-side: the ZE shim's hard default fired ('document'), Voyage and local openai-compat servers got no input_type at all, and asymmetric retrieval silently collapsed toward surface-token overlap, while the providerOptions-level contract test stayed green.

Fix: an AsyncLocalStorage (same pattern as __budgetStore) populated in embedSubBatch() only when providerOptions actually threads an input_type, read at body-rewrite time by the fetch shims:

- zeroEntropyCompatFetch: recovers the threaded value; document default preserved for ingest paths.
- voyageCompatFetch: opt-in like the dims.ts Voyage branch; inject only when threaded, and the field stays off the wire otherwise.
- NEW openAICompatAsymmetricFetch: fallthrough default for every other openai-compatible recipe (llama-server, litellm, ollama, ...), the canonical local/proxy paths for asymmetric models. Strict pass-through when nothing was threaded, so symmetric deployments see zero wire change; recipes with their own compat fetch (azure) keep it via the compat.fetch ?? precedence.

KNOBS_HASH_VERSION bumped 10→11: cached query_cache rows were keyed on document-side query vectors; pre-fix rows must not be served to post-fix lookups (same convention as the v=3 embedding-provider bump). One-time global cold-miss on upgrade; refills within cache.ttl_seconds.
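The AsyncLocalStorage mechanism can be illustrated with a small, self-contained sketch. It shows only the strict pass-through shape described for the generic openai-compatible path; `inputTypeStore`, `withInputType`, `makeAsymmetricFetch`, and the simplified `FetchLike` type are hypothetical names for this example, not the gateway's actual exports.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

type InputType = "query" | "document";

// Simplified stand-in for the fetch signature, to keep the sketch free of DOM types.
type FetchLike = (url: string, init?: { method?: string; body?: string }) => Promise<unknown>;

// Populated by the caller only when an input_type was actually threaded.
const inputTypeStore = new AsyncLocalStorage<{ inputType: InputType }>();

// Body rewrite: inject input_type when the store is set, otherwise leave the body untouched.
function withInputType(body: string): string {
  const store = inputTypeStore.getStore();
  if (!store) return body; // strict pass-through for symmetric deployments
  const parsed = JSON.parse(body) as Record<string, unknown>;
  return JSON.stringify({ ...parsed, input_type: store.inputType });
}

// Wrap a fetch implementation so the rewrite happens at the wire boundary,
// after any SDK-level schema validation has already run.
function makeAsymmetricFetch(baseFetch: FetchLike): FetchLike {
  return (url, init) => {
    const body = typeof init?.body === "string" ? withInputType(init.body) : init?.body;
    return baseFetch(url, { ...init, body });
  };
}
```

A caller would wrap the embedding request in `inputTypeStore.run({ inputType: "query" }, () => ...)`; because the store follows the async call chain, the value survives the SDK layer that would otherwise drop it.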
Tests: test/embed-input-type-wire.test.ts runs the REAL SDK transport with a mocked global fetch and asserts on the outbound body — the only layer where this regression is observable. Covers ZE hosted, llama-server, litellm, ollama (query + document sides) and pins the pass-through for non-asymmetric models and Voyage's opt-in shape. 4 of the original 7 assertions fail on master, proving the pin. One structural pin in test/ai/zeroentropy-compat-fetch.test.ts updated to the new line shape (same semantic); KEY_FILES.md gateway.ts entry updated to the new truth. Supersedes #1400 (closed unmerged) — same ALS mechanism, extended to Voyage + all openai-compatible recipes. Credit to @billy-armstrong for the original diagnosis. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(sync): honor .gitignore in code walk; prune vendor/dist/build collectSyncableFiles (the full-sync / dry-run enumerator) reimplemented its own directory skip list inline (node_modules || ops), bypassing the canonical pruneDir gate and ignoring .gitignore entirely. On a Laravel/PHP repo this descended into vendor/ (~50k Composer files), storage/, and public/build/, trying to import 52k dependency/build files and flooding the index with library internals (a 35-min sync that never finished, killed by the watchdog at 3%). - collectSyncableFiles now enumerates via `git ls-files --cached --others --exclude-standard` when dir is a git work tree, so the walk honors .gitignore (tracked + untracked-not-ignored). Falls back to the FS walk for non-git dirs. EroLab: 52164 -> 1028 files. - The FS fallback now prunes through the canonical pruneDir() instead of a drifted inline list, so the two skip lists can't diverge again. - PRUNE_DIR_NAMES gains vendor/dist/build (dependency + build-output trees). Addresses #1483 (.gbrainignore), #1159 (--respect-gitignore), and the maintainer's #1942 vendor/dist/build prune. 
Walker regression suites (sync-walker-symlink, brain-writer-walk-prune, sync, sync-walker-submodule) green: 90 pass.

* fix(config): ignore DATABASE_URL auto-loaded from cwd .env (#427)

Bun merges .env files from the process cwd into process.env before any user code runs. loadConfig() prefers env DATABASE_URL over ~/.gbrain/config.json, so any gbrain invocation from inside a web-app checkout silently retargets the brain at that app's database: reads go to the wrong DB and apply-migrations can write gbrain's schema into a production app database (#427).

effectiveEnvDatabaseUrl() re-parses the .env files Bun auto-loads from cwd and treats a DATABASE_URL whose value matches one of them as file-origin: ignored, with a one-time stderr notice. GBRAIN_DATABASE_URL and genuinely exported DATABASE_URLs are honored unchanged, so the operator escape hatch and the e2e suite's env-provided URL keep working. Applied at loadConfig, getDbUrlSource (doctor parity), init --non-interactive, and migrate --to.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(cli): arm the disconnect hard-deadline at teardown entry, not before the op body

The 10s force-exit timer in the shared-op dispatch was armed BEFORE the try block, so any op whose handler ran past 10s wall-clock was killed mid-flight with process.exit(0) and zero stdout. On a slow Postgres pooler (6-10s per fresh connection) a healthy `gbrain search` was force-exited every time, producing an empty 'success' indistinguishable from no results. The v0.42.20.0 exitCode honor can't help: a mid-op kill fires before any error path sets exitCode.

Move the arming into the finally (teardown entry), matching the fall-through owner-disconnect site later in main(): the timer still bounds a hung drain/disconnect (the C13 contract) but can no longer kill a slow-but-progressing op. Verified on a transaction-pooler Supabase brain: search went from 0 bytes/exit 0 at 10s to real results at ~21s.
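
A minimal sketch of the timer placement that the fix(cli) commit above describes, assuming a heavily simplified dispatcher. `runWithTeardownDeadline`, `forceExit`, and the parameter names are hypothetical stand-ins rather than gbrain's actual code, and the sketch leaves out the unref and daemon guard that the real dispatch is said to keep.

```typescript
// Hypothetical dispatcher: the hard deadline bounds teardown only, never the op body.
async function runWithTeardownDeadline<T>(
  op: () => Promise<T>,            // the op handler; may legitimately run for a long time
  teardown: () => Promise<void>,   // drain + disconnect
  deadlineMs: number,              // how long teardown may take before the force-exit
  forceExit: () => void,           // stand-in for the process-level hard exit
): Promise<T> {
  try {
    // No timer is armed here, so a slow-but-progressing op is never killed mid-flight.
    return await op();
  } finally {
    // Arm the deadline at teardown entry: it can only fire if teardown itself hangs.
    const timer = setTimeout(forceExit, deadlineMs);
    try {
      await teardown();
    } finally {
      clearTimeout(timer);
    }
  }
}
```

With the timer armed before the `try`, an op that outlives `deadlineMs` would trigger `forceExit` even though nothing is hung; armed inside the `finally`, only a stuck teardown can trigger it.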
* fix(import): stamp source_id on extracted call-graph edges importCodeFile built CodeEdgeInput rows without source_id, so every edge landed NULL. getCallersOf/getCalleesOf filter `AND source_id = <scoped>` whenever a worktree pin or --source is in play — NULL never matches, so scoped call-graph queries silently returned 0 rows on multi-source brains even though the edges existed (2,122 edges, 26 targeting the probed symbol, count 0 returned). One-line fix: carry the sourceId already in scope into the edge input. Existing NULL rows backfill with: UPDATE code_edges_symbol e SET source_id = p.source_id FROM content_chunks c JOIN pages p ON p.id = c.page_id WHERE c.id = e.from_chunk_id AND e.source_id IS NULL; (same for code_edges_chunk). Verified: code-callers returns 21 callers where it returned 0. * docs(migrations): NULL embeddings BEFORE the column-type alter The Postgres recipe ordered ALTER COLUMN TYPE vector(N) before the UPDATE that clears stale embeddings. pgvector refuses to cast existing vectors across dimensions ('expected 1024 dimensions, not 1536'), so the recipe as written aborts the transaction on any brain that has embeddings — which is every brain doing this migration. Swap the steps: NULLs cast fine. * fix: honor legacy token source grants in oauth * fix(cli): bound read-scope op handlers at 180s wallclock (pre-landing review) With the hard-deadline timer correctly scoped to teardown, a genuinely wedged read handler (hung pooler connection mid-query) would hang the CLI forever — the #1633 zombie class the old pre-try timer accidentally bounded at 10s. Reads now get a generous withTimeout (180s default, far above any healthy slow-pooler run; --timeout=Ns overrides; exit 124 with the teardown finally still draining + disconnecting). Writes/admin stay unbounded: a long import/embed must never be killed by a default. * fix(import): stamp unscoped edges 'default', matching the pages-table default Review catch: 'sourceId ?? 
null' fixed the scoped path but left the unscoped one (reindex --code without --source, importCodeFile callers without opts.sourceId) stranding edges at NULL while their pages land under the schema default (pages.source_id DEFAULT 'default') — so getCallersOf(sym, { sourceId: 'default' }) missed them. Same bug, other door. Fallback is now 'default'. * fix(core): runtime dim-migration recipe NULLs embeddings before the alter Review catch: the doc fix corrected docs/embedding-migrations.md, but embeddingMismatchMessage still PRINTED the broken order — ALTER before UPDATE ... SET embedding = NULL — and linked to the now-contradicting doc. pgvector refuses to cast existing vectors across dimensions, so the printed recipe aborted on any brain that has embeddings. Swap the steps and say why inline. * feat(migrate): v116 — backfill NULL edge source_id + index from_symbol_qualified 1. Backfill: edges written before the stamping fix sit at source_id=NULL and stay invisible to scoped call-graph queries until repaired. Derive each edge's source from its own from_chunk's page (pages.source_id is NOT NULL DEFAULT 'default'). Same SQL verified live on a 2,122-edge production brain. 2. Indexes: getCalleesOf filters both edge tables on from_symbol_qualified, which had no index — every callee lookup was a seq scan, amplified per-BFS-node by the recursive code walk. With NULL edges repaired, scoped walks actually expand, so the latent cost becomes real. Mirrored into src/schema.sql; schema-embedded.ts regenerated. * docs(migrations): align the rationale list with the corrected recipe order The 'Why we don't do this automatically' list still said alter-then-wipe; reorder to wipe-then-alter and replace the fragile 'step 3' numeric cross-reference with a name-based one. 
* test: regression coverage for edge source_id stamping, timer placement, recipe order

- import-code-edges-source-id: scoped import stamps edges + scoped getCallersOf/getCalleesOf match (verified failing pre-fix), plus the unscoped-import case asserting 'default' stamping.
- cli-force-exit-teardown-arming: structural pin. The hard-deadline timer arms inside the finally (teardown entry), never before the op body; daemon guard, unref, clearTimeout intact.
- embedding-dim-check: recipe order pinned. UPDATE precedes ALTER so the printed SQL can't drift from docs/embedding-migrations.md again.

* fix(cli): hard-exit after teardown on wallclock timeout; bound makeContext too

Adversarial review, two findings on the new timeout path:

1. On timeout the finally drained, disconnected, then CLEARED the hard-deadline timer, removing the only backstop while the abandoned handler (withTimeout races, it does not cancel) can hold ref'd sockets/SDK timers that keep Bun's loop alive: 'timed out' printed, process immortal. That is the zombie class this branch exists to kill, resurrected through its own fix. The finally now exits explicitly after teardown completes on the timeout path.
2. makeContext does DB I/O (resolveSourceId) for EVERY op and sat outside any bound, so a pooler wedge at context build hung reads, writes, and admin alike. It now shares the same wallclock bound.

* fix(import): normalize edge source once: closes the '' door and the unscoped chunk fan-out

Adversarial review: txOpts used truthiness while the edge stamp used nullish. sourceId:'' put pages under 'default' but stamped edges '', FK-violating against sources(id) and silently dropping the file's whole call graph in the best-effort catch. The unscoped getChunks could also fan out to same-slug chunks from another source. One normalized edgeSourceId (sourceId || 'default') now drives both the chunk lookup and the stamp.
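
The difference between the nullish and truthiness fallbacks in the fix(import) commit above can be illustrated with a small sketch. Only the expression `sourceId || 'default'` comes from the commit text; the helper names `stampNullish` and `normalizeEdgeSourceId` are made up for illustration.

```typescript
type MaybeSource = string | null | undefined;

// Nullish fallback: only null/undefined become 'default'; an empty string passes
// through, so edges could be stamped '' while their pages land under 'default'.
function stampNullish(sourceId: MaybeSource): string {
  return sourceId ?? "default";
}

// Truthiness fallback: '', null, and undefined all collapse to 'default',
// so a single normalized value can drive both the chunk lookup and the stamp.
function normalizeEdgeSourceId(sourceId: MaybeSource): string {
  return sourceId || "default";
}

const samples: MaybeSource[] = ["repo-a", "", null, undefined];
for (const input of samples) {
  console.log(String(input), "nullish:", stampNullish(input), "normalized:", normalizeEdgeSourceId(input));
}
```

The two helpers disagree only on the empty string, which is exactly the input the commit says produced mismatched page and edge sources.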
* fix(engine): default edge source_id to 'default' at the insert layer (both engines) Adversarial review: addCodeEdges still wrote e.source_id ?? null, so any future caller that forgets the field reintroduces invisible NULL edges the day after the v116 backfill runs. A NULL source_id is invisible to every scoped call-graph query; default to the schema-default source the way the pages table does. Applied to both engines (parity). * fix(core): facts alter recipe NULLs embeddings before cross-dimension alters Adversarial review: buildFactsAlterRecipe shipped the same defect class this branch fixes for content_chunks 350 lines up — a cross-dimension ALTER ... USING cast that pgvector refuses while rows hold old-width vectors. Dimension changes now wipe first (the facts pipeline re-embeds on next write); same-dim type swaps (halfvec <-> vector) keep the lossless cast and PRESERVE data. Both behaviors pinned by tests. * v0.42.39.0 chore: version bump + CHANGELOG + TODOS Marks the v0.42.20.0 'decouple the op-dispatch force-exit timer' follow-up complete — this branch ships exactly that decoupling. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(postgres-engine): atomic JSONB merge in updateSourceConfig — eliminate lost-update race ## Problem `updateSourceConfig` used a read-then-write pattern: read the current `config` row, normalize it in JavaScript, then write the merged result back with `SET config = <normalized> || <patch>`. Under concurrent callers (two background autopilot/cycle paths patching different keys simultaneously), both callers can read the same stale row. The later `SET config = ...` then clobbers the earlier patch, silently dropping whatever keys the first caller wrote. Reproduced at 21/25 lost-update events under real Postgres with parallel callers. ## Fix Fold the normalization and merge into a single atomic `UPDATE … SET config = CASE … END || patch` statement. 
Because the `SET` expression evaluates against the row-locked latest version of `config`, there is no snapshot window between the read and the write. Concurrent callers now converge correctly (50/50 clean in reproduction test). The `CASE` also normalizes historical bad JSONB shapes inline: - `object` — used as-is - `string` — double-encoded config; inner text parsed with the SQL `IS JSON` guard (Postgres 16+) so unparseable strings fall back to `{}` instead of raising `invalid input syntax for type json` - `array` — array of patch objects aggregated into a flat object via `jsonb_object_agg` - anything else — falls back to `{}` `pglite-engine.updateSourceConfig` already used an atomic `||` merge; this change brings postgres-engine to parity. ## Test Added two assertions to `test/list-all-sources.test.ts`: 1. JSONB string holding non-JSON text normalizes to `{}` (no cast throw) 2. JSONB string holding double-encoded valid JSON is parsed then merged * fix(doctor): five correctness fixes — stale locks, content sanity, graph coverage, exit code, gateway guard ## 1. Stale lock break hints cover gbrain-cycle: keys The doctor stale-lock report only recognized `gbrain-sync:` lock prefixes; everything else fell back to `gbrain sync --break-lock`, which is wrong for dream/autopilot cycle locks. A `gbrain-cycle:<source>` or `gbrain-cycle` lock now suggests `gbrain dream --break-lock [--source <name>]`, and unknown lock shapes fall back to `gbrain doctor` instead of a misleading sync command. ## 2. content_sanity_audit_recent counts reject and quarantine as hard failures v0.42 renamed the hard disposition path: rejected pages emit a `reject` event and quarantined junk pages emit `quarantine`; `hard_block` is now only the pre-v0.42 legacy alias. The status check only counted `hard_block`, so fresh `reject` / `quarantine` events from the new path cleared as `ok` whenever fewer than 10 events existed. 
The check now sums all three for the hard count, and `soft_block + flag` for the soft count. ## 3. graph_coverage excludes test fixture entity pages from the denominator Brains seeded with code sources (e.g. a sync of the gbrain repo itself) could accumulate test fixture pages typed as `entity` / `person`. Including these in the entity-count denominator diluted coverage and produced spurious warnings ("Entity link coverage 0%, timeline 0%") on knowledge-only brains with no real entity pages. The check now queries a per-entity stats CTE that excludes `tools/gbrain/test/*` slugs and the `templates/new-person` stub, with an additional guard for the all-fixture case (`eligibleEntityCount = 0`). ## 4. process.exitCode instead of process.exit at doctor main exit point `process.exit(hasFail ? 1 : 0)` was a hard kill that prevented cleanup handlers (Bun unload events, open DB connections) from running. Using `process.exitCode = hasFail ? 1 : 0` defers the actual termination until the end of the event loop, allowing cleanup to complete. ## 5. checkSubagentCapability exported for test seams + gateway loop guard The function was private, making it untestable in isolation. It is now exported. Additionally, users running gbrain with a non-Anthropic chat model via `agent.use_gateway_loop=true` no longer receive a spurious warning that `ANTHROPIC_API_KEY` is missing — subagents route via the gateway loop in that configuration and do not need the key directly. ## Tests Doctor test suite: 77 pass, 0 fail (no regressions). * fix(engine): deleteFactsForPage excludeSourcePrefixes (#1928) + reconnect() parity (#2034) Engine-layer API for two cycle/availability fixes that share these files: - deleteFactsForPage gains optional excludeSourcePrefixes so the fence reconcile can protect non-fence facts (e.g. cli: conversation facts). - reconnect(ctx?) 
is now a first-class BrainEngine method on both engines (PostgresEngine already had it; PGLite gains config capture + reconnect) so callers stop using disconnect()+bare connect(). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(cycle): stop extract_facts from wiping conversation facts (#1928) The fence reconcile delete-then-reinsert wiped cli:-origin facts (no fence to recreate them); a failed-sync full walk turned it brain-wide (1829 rows, 0 reinserted, status ok). Now: exclude cli: rows from the wipe, do NOT inherit the failed-sync->full-walk fallback for this destructive phase, and warn on net-negative reconcile. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(autopilot,supervisor): reconnect() instead of disconnect()+bare connect() (#2034) The autopilot health-probe recovery called connect() with no args after disconnect(), losing the startup config (database_url undefined -> FATAL restart-loop on every DB blip) and opening a null-pool window. Both call sites now use engine.reconnect(), which restores the captured config. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(write-through): mirror to the assigned source's local_path, never the global repo (#2018) put_page write-through resolved the disk target from the global sync.repo_path, so a default-source page (local_path NULL) got written into an unrelated federated source's working tree. Now it uses the assigned source's own local_path; NULL local_path skips (no leak); the global path is used only as a sole-source fallback. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(pglite-lock): heartbeat + steal-grace so live holders are never stolen (#2058) A live holder's lock was force-removed after 5min age alone, letting a second process share the single-writer data dir -> WAL corruption. The lock now heartbeats while held; a holder is reaped only when its PID is dead OR its heartbeat went stale past the steal grace. 
Pairs PID liveness with heartbeat age to also defeat PID reuse. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(migrate,doctor): self-heal idx_timeline_dedup drift (#2038) A migration renumbered during a merge (v102) could be recorded-as-applied without its DDL running, leaving the 3-column index so every timeline write failed the 4-column ON CONFLICT. runMigrations now always runs a shape-keyed drift repair (dedupe-then-rebuild) even when no migration is pending, and doctor surfaces the drift. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(timeline): un-silence the swallowed batch catch; pin Date-batch round-trip (#2057) The meetings extractor's bare catch {} hid a brain-wide timeline-write failure (0 entries, no error). It now counts + surfaces batch errors. Adds a Date-bearing batch regression test proving the #1861 jsonb_to_recordset refactor already fixed the original ::text[] cast failure. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * chore: bump version and changelog (v0.42.41.0) Triage fix wave: 6 authored critical fixes (#1928 facts wipe, #2018 write-through leak, #2034 reconnect loop, #2058 WAL lock, #2038 timeline migration drift, #2057 timeline silent-empty) + community PRs #2064 #2052 #2020 #2033 #2074 #2075 #2009 #2072 #2073. TODOS: deferred #1994 #1963 #2050. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix: address adversarial review findings (#1928, #2058, #2038, #2057) Codex as-built review of the authored fixes surfaced 4 real issues: - #2058: add a pid+acquired_at ownership token. A stale holder reaped + replaced past the grace must NOT let its resumed heartbeat refresh, nor releaseLock remove, the NEW owner's lock (re-opened the concurrent-writer hole). Heartbeat and release now verify the on-disk lock is still ours. + regression test. 
- #1928: the destructive-full-walk guard keyed off phases.includes('sync'), which wrongly suppressed a legitimate full reconcile when sync was SKIPPED (no engine / no brainDir). Key off a syncAttempted flag set only when sync actually ran. - #2038: dedupe keeps MIN(id) not MIN(ctid) — deterministic and consistent with the existing v-migration lower-id rule. - #2057: the extract CLI caller now surfaces batch_errors (stderr + exit 1) instead of printing a clean success over failed inserts. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * docs(key-files): sync reference to v0.42.41.0 triage-wave behavior Update KEY_FILES.md to current-state truth for the shipped fixes (no release-history clauses, per the reference-doc discipline): - write-through.ts (#2018): resolves the disk target from the assigned source's own local_path; sole-source falls back to sync.repo_path, multi-source skips with source_has_no_local_path rather than leak. - engine.ts (#2034): reconnect() is now a REQUIRED lifecycle method on both engines; config-restoring, never disconnect()+bare connect(). - migrate.ts (#2073): document v116 edge source_id backfill + callee index, and the always-run (version-counter-blind) timeline dedup self-heal. - new entry for timeline-dedup-repair.ts (#2038) + the timeline_dedup_index doctor check. - new entry for pglite-lock.ts (#2058): heartbeat + steal-grace (GBRAIN_PGLITE_LOCK_STEAL_GRACE_SECONDS) so a live holder is never stolen. - extract-facts.ts (#1928): cli:-fact protection, no failed-sync full-walk inheritance, net_fact_deletion warn floor. bun run build:llms re-run (KEY_FILES is link-only so bundles unchanged); freshness + current-state guards green. 
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(write-through): preserve nested multi-source layout; narrow #2018 leak guard The first #2018 fix skipped any no-local_path source on a multi-source brain, which broke the legitimate nested layout (a source without its own tree nests under the host repo at .sources/<id>/ — pinned by put-page-write-through.test). Narrow the guard: a no-local_path source nests under sync.repo_path as before; only SKIP when sync.repo_path is literally another source's own local_path (the actual leak — writing there pollutes that sibling's repo). Caught by the sharded suite. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * test: satisfy test-isolation guard for the new lock/reconnect tests CI `verify` flagged 3 intra-process isolation violations in the tests added this wave (the parallel runner shares one process per shard): - pglite-lock.test.ts: the GBRAIN_PGLITE_LOCK_STEAL_GRACE_SECONDS mutation now goes through withEnv() instead of a raw process.env write (R1). - pglite-reconnect: renamed to *.serial.test.ts — it creates per-test engines to exercise the connect/reconnect lifecycle, which doesn't fit the shared beforeAll-engine model (R3/R4). verify is now 30/30; both files green. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(pglite): reconnect() is a no-op for in-memory engines (#2034) CI serial-tests + test(5) caught two in-branch regressions from the #2034 PGLite reconnect(): - worker/queue claim-error recovery + their renewLock e2e test assume PGLite reconnect is absent/no-op (queue.ts documents it). Making it a real disconnect+reopen wiped an in-memory engine's state mid-job. reconnect() now no-ops for in-memory (no database_path) — file-backed still re-opens the dir (state persists on disk). Restores the documented worker assumption. 
- connection-resilience 'Supervisor still has the 3-strikes-then-reconnect path' pinned the removed unsafe-cast text; updated to assert the direct this.engine.reconnect() call. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * test: quarantine embed-input-type-wire to serial lane (CI test(5) leak) #2033's embed-input-type-wire.test.ts configures a 1280-dim embedding gateway; the active dimension survived into engine-find-trajectory when CI's 10-way hash-disjoint sharding co-located them (this branch's added files reshuffled the assignment), failing 7 trajectory tests with 'expected 1280 dimensions, not 1536'. resetGateway() in afterEach clears the gateway but the dimension still leaked. It mutates global gateway/embedding state, so it belongs in the serial lane (own bun process, true isolation) by the repo's own definition. Root-caused by reproducing the exact failing pair locally. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: Austin Arnett <austin@sdsconsultinggroup.org> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: Dave MacDonald <djmacdonald@ucdavis.edu> Co-authored-by: pabloglzg <186649799+pabloglzg@users.noreply.github.com> Co-authored-by: Alex P. <12667893+aphaiboon@users.noreply.github.com> Co-authored-by: Garry Tan <bo.m.liu@gmail.com> Co-authored-by: jbarol <barol.j@gmail.com> Co-authored-by: maxpetrusenkoagent <max.petrusenko.agent@gmail.com> Co-authored-by: PAI <pai@scaffolde.ai>
Closes #1073.
Problem
`collectSyncableFiles` (`src/commands/import.ts`) only skips dot-dirs / `node_modules` / `ops`. The full / first / `--full` import therefore walks gitignored build output (`dist/`, `out/`, `coverage/`, `__pycache__/`, …) and admits every file in there as "code", because `CODE_EXTENSIONS` covers `.json` / `.yaml` / `.toml` / `.html` / `.css`. On repos with large gitignored trees this bloats the DB, raises embedding cost, pollutes `code-def` / semantic search with stale fixtures/bundles, and can wedge the chunker on a pathological file (e.g. a single-line giant JSON). This is the stall reported in #1073. The incremental sync path is already git-based (`git ls-files --exclude-standard`, `sync.ts`) and excludes these; only the full-import walker diverges.
Change (opt-in, default off)
`collectSyncableFiles` gains `respectGitignore` in `CollectOpts`. When set and the root is a git work tree, `git ls-files -o -i --exclude-standard --directory` builds the ignored set; entries are pruned at descent time so a fully-ignored directory is never recursed into, which addresses the walk-stall, not just emit-time filtering. Non-git root / git unavailable → empty set → zero overhead, exact legacy behavior.
- `gbrain sync --respect-gitignore` (and `--no-respect-gitignore` to override an enabled config)
- `sync.respect_gitignore` config knob (CLI flag wins)
- `gbrain import --respect-gitignore`
- `sync --all` cost preview so the estimate matches what will actually be walked
- `gbrain --help`
Mechanism note
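
As a rough illustration of descent-time pruning (not the PR's actual implementation), the sketch below walks a hypothetical in-memory tree instead of the real filesystem. `Tree`, `parseIgnored`, and `collect` are invented names, and the trailing-slash handling assumes that `git ls-files ... --directory` lists a fully ignored directory once, as `dist/`.

```typescript
// Hypothetical in-memory tree so the sketch runs without disk or git access:
// null marks a file, a nested object marks a directory.
type Tree = { [name: string]: Tree | null };

interface WalkResult {
  files: string[];     // relative paths admitted by the walk
  descended: string[]; // directories the walker actually recursed into
}

// Parse the ignored-set listing. Assumption: a fully ignored directory is
// printed once, with a trailing slash (e.g. "dist/"), which is stripped here.
function parseIgnored(stdout: string): Set<string> {
  const ignored = new Set<string>();
  for (const line of stdout.split("\n")) {
    const entry = line.trim().replace(/\/$/, "");
    if (entry !== "") ignored.add(entry);
  }
  return ignored;
}

function collect(
  tree: Tree,
  ignored: Set<string>,
  prefix = "",
  result: WalkResult = { files: [], descended: [] },
): WalkResult {
  for (const [name, child] of Object.entries(tree)) {
    const rel = prefix === "" ? name : `${prefix}/${name}`;
    // Prune before recursing: an ignored directory is skipped as a whole,
    // so none of its (possibly tens of thousands of) entries are visited.
    if (ignored.has(rel)) continue;
    if (child === null) {
      result.files.push(rel);
    } else {
      result.descended.push(rel);
      collect(child, ignored, rel, result);
    }
  }
  return result;
}

const repo: Tree = {
  src: { "app.ts": null },
  dist: { "bundle.js": null },
  coverage: { "lcov.info": null },
};

// Opt-in: dist/ and coverage/ are never entered.
console.log(collect(repo, parseIgnored("coverage/\ndist/\n")));
// Empty ignored set (flag off, non-git root, or git unavailable): legacy walk.
console.log(collect(repo, new Set<string>()));
```

An emit-time allowlist would still enumerate every entry under `dist/` before discarding it; checking the ignored set before recursing is what keeps the walk itself small.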
#1073 suggested `git ls-files --cached --others --exclude-standard` (an allowlist gated at emit time). I used the ignored set with `--directory` instead so whole gitignored trees are pruned before recursing: the issue's core pain is the walker stalling on tens of thousands of files, which directory-level pruning avoids and emit-time allowlisting does not. All existing walker hardenings (symlink / inode-cycle / max-depth) are untouched. Happy to switch to the allowlist shape if you'd rather.
Tests
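`test/import-walker.test.ts`:
- default still admits `dist/bundle.js` (legacy behavior preserved)
- opt-in prunes `dist/` + `coverage/` while keeping `src/app.ts`
- non-git root falls back gracefully without throwing
`bun run typecheck` clean; `bun test test/import-walker.test.ts` → 7 pass / 0 fail.
Follow-ups (intentionally out of scope)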
`--ignore-from FILE` and `.gbrainignore` / `sync.exclude` (#449). This PR lands the easy win for repos that already express the intent in `.gitignore`.
🤖 Generated with Claude Code