v0.22.13 feat: parallel sync — bounded concurrent imports#490
Merged
gbrain sync --concurrency N (alias --workers N) parallelizes the import
phase using per-worker Postgres engine instances with an atomic queue
index (same proven pattern as gbrain import --workers N).
Auto-concurrency: when a sync touches >100 files and the user didn't
explicitly set --concurrency, defaults to 4 workers. Small incremental
syncs (<50 files) stay serial. Full syncs auto-detect Postgres and
default to 4 workers.
Minion sync handler defaults to concurrency=4, configurable via job
params: {"concurrency": 8}.
Delete and rename phases remain serial (order-dependent, fast).
PGLite falls back to serial automatically (single-connection engine).
Changes:
- src/commands/sync.ts: SyncOpts.concurrency, parallel import loop in
performSync incremental path, --workers passthrough in performFullSync
- src/commands/jobs.ts: sync handler accepts concurrency param (default 4)
- CHANGELOG.md: v0.23.0 parallel sync entry
All 37 existing sync tests pass. Typecheck clean.
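The "atomic queue index" pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/commands/sync.ts: `runBounded` and its parameters are hypothetical names, and the real loop hands each worker its own Postgres engine rather than a bare worker id.

```typescript
/**
 * Run `task` over `items` with at most `workers` tasks in flight.
 * Hypothetical helper for illustration; names are not from the PR.
 */
async function runBounded<T>(
  items: readonly T[],
  workers: number,
  task: (item: T, workerId: number) => Promise<void>,
): Promise<{ failed: T[] }> {
  let next = 0; // shared queue index
  const failed: T[] = [];

  const loop = async (workerId: number): Promise<void> => {
    while (true) {
      // Claim an index before any await. JavaScript runs this statement
      // without interleaving, so no two workers receive the same item.
      const i = next++;
      if (i >= items.length) return;
      try {
        await task(items[i], workerId);
      } catch {
        failed.push(items[i]); // record the failure, keep draining the queue
      }
    }
  };

  // Never start more workers than there are items, and always at least one.
  const n = Math.max(1, Math.min(workers, items.length));
  await Promise.all(Array.from({ length: n }, (_, id) => loop(id)));
  return { failed };
}
```

Collecting failures instead of rejecting on the first error lets the caller decide afterwards whether the sync bookmark may advance.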
src/core/sync-concurrency.ts: single source of truth for autoConcurrency() + parseWorkers() + shouldRunParallel() + constants. Replaces three drifted call-site policies (performSync, performFullSync, jobs handler).
src/core/db-lock.ts: generic tryAcquireDbLock(engine, lockId, ttlMinutes) over the existing gbrain_cycle_locks table. Parameterized lock id so performSync (gbrain-sync) can nest cleanly under cycle.ts (gbrain-cycle) without deadlock.
test/sync-concurrency.test.ts: 17 cases covering PGLite-forces-serial, explicit override clamping, auto-path threshold, and parseWorkers validation (rejects 0, negatives, NaN, decimals, trailing chars).
No consumers yet; subsequent commits wire sync.ts, import.ts, and jobs.ts to use these helpers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
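A rough sketch of what the shared policy helpers might look like, based only on the behavior listed in this commit message. The function names come from the PR, but the signatures, the constant names, and the clamp value of 16 are assumptions made for illustration; the real module may differ.

```typescript
// Assumed constants; the actual values live in src/core/sync-concurrency.ts.
const AUTO_THRESHOLD_FILES = 100; // auto-parallelize above this many files
const DEFAULT_WORKERS = 4;
const MAX_WORKERS = 16; // hypothetical clamp for explicit overrides

type EngineKind = 'postgres' | 'pglite';

/** Strictly parse a --workers value. Throws on anything but a positive integer. */
function parseWorkers(raw: string): number {
  const s = raw.trim();
  // Anchored regex rejects "0", "-3", "foo", "1.5", and trailing chars like "4x".
  if (!/^[1-9]\d*$/.test(s)) {
    throw new Error(`--workers must be a positive integer, got "${raw}"`);
  }
  return Number(s);
}

/** Decide the worker count for a sync (assumed signature). */
function autoConcurrency(kind: EngineKind, fileCount: number, explicit?: number): number {
  if (kind === 'pglite') return 1; // single-connection engine: always serial
  if (explicit !== undefined) {
    // Explicit opt-in bypasses the file-count floor but is still clamped.
    return Math.min(Math.max(1, explicit), MAX_WORKERS);
  }
  return fileCount > AUTO_THRESHOLD_FILES ? DEFAULT_WORKERS : 1;
}
```

Under these assumptions, `autoConcurrency('postgres', 10, parseWorkers('8'))` yields 8 workers even for a small diff, while the same call with `'pglite'` stays serial.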
CODEX-2: wrap performSync body in a gbrain-sync DB lock so two concurrent syncs (manual + autopilot, two terminals, two Conductor workspaces) cannot both read last_commit, both write it unconditionally, and let the last writer win. cycle.ts continues to hold gbrain-cycle for its broader scope; the two ids nest cleanly.
CODEX-3: capture git HEAD at sync entry, re-rev-parse after the import phase, refuse to advance last_commit if HEAD drifted (someone ran git checkout / git pull mid-sync). Vanished files now go into failedFiles instead of silent-skip: same gating mechanism, no more bookmark advance past unimported work.
A1: replace both PGLite detection sites with engine.kind === 'pglite'. The constructor.name sniff is gone (breaks under bundling) and so is the inconsistent config?.engine string check.
A2: connect worker engines serially into an array, run inside try/finally so disconnect always fires, even on partial connect failure, OOM, or mid-import abort. Prior Promise.all(...disconnect) leaked the 8 worker connections on any panic path.
Q1: explicit --workers / opts.concurrency now bypasses the >50-file floor. User opt-in beats the auto-path safety net.
Q3: drop the config!.database_url! non-null assertions; fall back to serial when database_url is unset instead of crashing on TypeError.
Q4: worker-count banner moves from console.log to console.error so stdout stays clean for --json output.
test/sync-parallel.test.ts: 7 cases over PGLite covering the bookmark gate under concurrency request, the head-drift gate, vanished-file failure capture, PGLite-stays-serial, and the writer-lock contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
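The bookmark gate from CODEX-3 reduces to a small decision: advance last_commit only if HEAD is unchanged and nothing failed. The sketch below is illustrative only; `SyncOutcome` and `bookmarkDecision` are hypothetical names, and in the real code the two HEAD values would come from running git rev-parse HEAD at sync entry and again after the import phase.

```typescript
// Hypothetical shape for illustration; not the PR's actual types.
interface SyncOutcome {
  headAtEntry: string;     // git HEAD captured when the sync started
  headAfterImport: string; // git HEAD re-read after the import phase
  failedFiles: string[];   // includes files that vanished mid-sync
}

/** Decide whether last_commit may advance after a sync. */
function bookmarkDecision(o: SyncOutcome): { advance: boolean; reason: string } {
  if (o.headAtEntry !== o.headAfterImport) {
    // Someone ran git checkout / git pull mid-sync: the diff we imported
    // no longer describes the working tree, so keep the old bookmark.
    return { advance: false, reason: 'HEAD moved during sync' };
  }
  if (o.failedFiles.length > 0) {
    // Advancing here would skip unimported work on the next sync.
    return { advance: false, reason: `${o.failedFiles.length} file(s) failed to import` };
  }
  return { advance: true, reason: 'ok' };
}
```

Routing vanished files through the same `failedFiles` list means a single check covers both failure modes.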
…Workers
A1: replace the config?.engine === 'pglite' string sniff with engine.kind === 'pglite' to match sync.ts and the v0.13.1 contract.
A2: wrap worker engine creation + the parallel loop in try/finally so disconnects always fire, the same pattern as sync.ts. Worker engines now push onto an array as they connect (rather than Promise.all) so the finally block can clean up partial-connect state.
Q2: route --workers parsing through the shared parseWorkers() helper. parseInt-with-no-validation is gone: '0', '-3', 'foo', '1.5' now exit with a clear error message instead of silently falling through.
Q3: drop the config!.database_url! non-null assertion; fall back to serial when database_url is unset.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
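The A2 cleanup pattern (connect serially into an array, disconnect in finally) might look roughly like this. It is a sketch under assumptions: `WorkerEngine` and `withWorkerEngines` are hypothetical names, and the real engines are gbrain's Postgres engine instances rather than this minimal interface.

```typescript
// Minimal stand-in for a per-worker database engine (assumed interface).
interface WorkerEngine {
  connect(): Promise<void>;
  disconnect(): Promise<void>;
}

/** Create `count` engines, run `body`, and always disconnect what was connected. */
async function withWorkerEngines<R>(
  count: number,
  create: () => WorkerEngine,
  body: (engines: WorkerEngine[]) => Promise<R>,
): Promise<R> {
  const engines: WorkerEngine[] = [];
  try {
    for (let i = 0; i < count; i++) {
      const engine = create();
      await engine.connect();
      engines.push(engine); // tracked as soon as it is connected
    }
    return await body(engines);
  } finally {
    // Runs on success, on a failed connect partway through, and on a
    // mid-import throw. allSettled keeps one failed disconnect from
    // masking the rest.
    await Promise.allSettled(engines.map((e) => e.disconnect()));
  }
}
```

Connecting serially is what makes partial cleanup possible: when the third of four connects fails, the array already holds exactly the two engines that need closing.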
CODEX-1: resolve sourceId at handler entry by looking up sources.local_path. Mirrors cycle.ts:480's autopilot-cycle fix (PR #475). Without this, every Minion sync job on a multi-source brain reads global config.sync.last_commit instead of the per-source anchor, which on a regularly-GC'd repo can drop out of git history and trigger 30-min full reimports every cycle. The handler accepts an optional sourceId job param for callers that want to override; falls back to the resolveSourceForDir lookup when absent.
CODEX-4: replace the hardcoded concurrency=4 default with the shared autoConcurrency policy. Behavior is now consistent between CLI sync, the Minion handler, and the autopilot cycle's sync phase. Jobs that request a specific concurrency via job.data.concurrency still win.
noEmbed default stays at true: embed is a separate job (submit gbrain embed --stale, OR rely on the autopilot cycle's embed phase). The doc comment makes that contract explicit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DATABASE_URL-gated E2E coverage that PGLite-only tests can't reach:
T2 (happy path): 60 files imported at concurrency=4, all 60 pages land in the DB, with a pg_stat_activity probe before/after to confirm worker engines (4 × 2 connections) actually disconnected.
P4 (benchmark): 120-file fixture, serial vs concurrency=4 timing. Emits a single-line `SYNC_PARALLEL_BENCH 120 files | serial=Xms | parallel(4)=Yms | speedup=Zx` so the CHANGELOG can quote a real number instead of an unbacked '~4×' claim. Asserts parallel <= serial * 1.5 to allow for noisy CI but fail genuine regressions.
Skips gracefully when DATABASE_URL is unset (consistent with the rest of test/e2e/).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
VERSION + package.json + bun.lock: 0.22.5/0.22.6 → 0.22.10. Repo had existing drift between VERSION and package.json on master; this commit brings them back in sync at the bumped value.
CHANGELOG.md: v0.22.10 entry replaces the unfinished v0.23.0 stub from PR #490's original commit. Voice-rule clean (no em dashes, no AI vocabulary), real benchmark numbers from the new E2E test (serial=289ms parallel(4)=221ms speedup=1.31x), additive worker-pool note (A3), 'To take advantage of v0.22.10' self-repair block per CLAUDE.md convention.
TODOS.md: A4 follow-up filed to plumb resolved database_url through SyncOpts so performSync / performFullSync / import.ts don't each call loadConfig() separately. Deferred to a future patch; not on the v0.22.10 critical path.
Patch (not minor) framing held even though new CLI surface lands here; release-notes prose names the behavior change explicitly so users know to read them.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CLAUDE.md:
- New "Key files" entries for src/core/sync-concurrency.ts and src/core/db-lock.ts (both v0.22.10).
- New "Key files" entry for src/commands/sync.ts (covers the lock, head-drift gate, engine.kind discriminator, vanished-file failure capture, parallel branch wiring).
- Updated src/commands/jobs.ts entry with v0.22.10 sourceId resolution + autoConcurrency policy + noEmbed contract.
- Added test/sync-concurrency.test.ts and test/sync-parallel.test.ts to the unit-test list with case counts.
- Added test/e2e/sync-parallel.test.ts to the E2E section with the SYNC_PARALLEL_BENCH grep marker for CHANGELOG quoting.
- Added "Key commands added in v0.22.10" section: gbrain sync --workers, gbrain import --workers (parseWorkers validation).
README.md: added --workers flag to the IMPORT section's gbrain sync and gbrain import lines, with the >100-file auto-parallelize note.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts: # CHANGELOG.md # CLAUDE.md # TODOS.md # VERSION # package.json
# Conflicts: # CHANGELOG.md # CLAUDE.md # VERSION # package.json
VERSION 0.22.10 → 0.22.13. Master moved to 0.22.8 plus claimed slots 0.22.9-0.22.12 in sibling workspaces; 0.22.13 is the next free slot for this PR's parallel-sync hardening work.
Updated all v0.22.10 references in CHANGELOG.md (release header + self-repair block), TODOS.md (D-PR490-1 follow-up tag), CLAUDE.md (Key files entries + tests + commands subsection), and the inline v0.22.10 markers in src/core/sync-concurrency.ts, src/core/db-lock.ts, src/commands/sync.ts, src/commands/import.ts, src/commands/jobs.ts, test/sync-parallel.test.ts, test/e2e/sync-parallel.test.ts.
No behavioral change. CHANGELOG header rewrite, content unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI's build-llms generator test failed because llms-full.txt was stale relative to the README + CLAUDE.md updates this PR added (--workers flag in the IMPORT section, sync-concurrency.ts/db-lock.ts/sync.ts entries in the Key files section). Per CLAUDE.md: "Run `bun run build:llms` after adding a new doc."
The test test/build-llms.test.ts:67 verifies committed bundles match generator output; now they do again. llms.txt was already in sync (no curated config additions); only llms-full.txt needed the regen.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts: # CHANGELOG.md # TODOS.md # VERSION # package.json
# Conflicts: # CHANGELOG.md # CLAUDE.md # VERSION # llms-full.txt # package.json
# Conflicts: # CHANGELOG.md # CLAUDE.md # VERSION # llms-full.txt # package.json # src/commands/sync.ts
Problem
`gbrain sync` on a large brain (7,200+ pages) takes 25+ minutes because imports are serial. Meanwhile, `gbrain import --workers N` already has a proven parallel pattern.
Solution
Thread the same parallel pattern through sync:
- `gbrain sync --concurrency N` (alias `--workers N`) parallelizes the import phase
- Minion sync handler defaults to `autoConcurrency()` (was hardcoded 4); override per job: `gbrain jobs submit sync --params '{"concurrency":8}'`
Each worker gets its own Postgres pool (2 connections). Total connections during the parallel phase = `workers * 2 + caller pool` (e.g. 4 workers × 2 + 10-conn caller pool = 18, well under PgBouncer's `max_client_conn` default of 100).
Update (v0.22.13 hardening: 6 atomic commits on top of #489)
After /plan-eng-review surfaced 14 issues + Codex outside-voice surfaced 4 more (3 critical), the PR now also ships:
- Writer lock (CODEX-2): `gbrain-sync` row in `gbrain_cycle_locks` with 30-min TTL. Prevents two concurrent syncs from both writing `last_commit` and letting the last writer win. New helper `src/core/db-lock.ts`.
- Head-drift gate (CODEX-3): re-runs `git rev-parse HEAD` after the import phase; refuses to advance `last_commit` if HEAD moved (someone ran `git checkout` / `git pull` mid-sync). Vanished files now record a `failedFiles` entry instead of silent-skip; the silent-skip-then-advance pathology that survived prior hardening passes is dead.
- Minion sync handler (CODEX-1): resolves `sourceId` via `sources.local_path` lookup, mirroring `cycle.ts:480`'s autopilot fix from PR #475 (fix: pass sourceId in cycle sync phase to prevent full reimport). Prevents the 30-min full-reimport-every-cycle behavior on multi-source brains.
- Shared concurrency policy: `src/core/sync-concurrency.ts` with `autoConcurrency()` + `parseWorkers()`. Replaces three drifted call-site policies in `performSync`, `performFullSync`, and the jobs handler.
- Worker cleanup (A2): try/finally around worker engines in `sync.ts` and `import.ts`. Prior `Promise.all(...disconnect)` ran outside any try/finally, leaking 8 connections on panic.
- PGLite detection (A1): `engine.kind === 'pglite'` (the v0.13.1 discriminator). The `engine.constructor.name` sniff is gone.
- `--workers` validation (Q2): `--workers 0`, `--workers -3`, `--workers foo`, `--workers 1.5` now exit with a clear error. Prior parseInt-with-no-validation silently fell through to auto-concurrency (4 workers), the opposite of what was typed.
- `--workers` honored on small diffs (Q1): drop the >50-file floor when the user opted in.
- Worker-count banner (Q4): `console.log` → `console.error` so `gbrain sync --json` stdout stays clean.
Versioning: held at v0.22.13 (patch) per user call, not the v0.23.0 (minor) Codex argued for. The CHANGELOG entry names the behavior changes loudly so users know to read the release notes.
Tests
- `bun test --timeout 30000`
- `test/sync-concurrency.test.ts` (17 cases) + `test/sync-parallel.test.ts` (7 cases, PGLite-routed, covers bookmark gate + head-drift + writer-lock contract)
- `test/e2e/sync-parallel.test.ts`: DATABASE_URL-gated, 60-file happy path with `pg_stat_activity` leak probe + 120-file benchmark (`SYNC_PARALLEL_BENCH 120 files | serial=289ms | parallel(4)=221ms | speedup=1.31x`). Real numbers now in the CHANGELOG.
Files
- `src/commands/sync.ts`: `SyncOpts.concurrency`, parallel import, auto-concurrency, writer lock, head-drift gate, vanished-file capture, engine.kind, try/finally workers, parseWorkers CLI, banner→stderr
- `src/commands/import.ts`: engine.kind, try/finally workers, parseWorkers
- `src/commands/jobs.ts`: sync handler resolves sourceId, autoConcurrency, noEmbed contract documented
- `src/core/sync-concurrency.ts` (new): autoConcurrency + parseWorkers + constants
- `src/core/db-lock.ts` (new): generic `tryAcquireDbLock(engine, lockId)` over `gbrain_cycle_locks`
- `test/sync-concurrency.test.ts` (new), `test/sync-parallel.test.ts` (new), `test/e2e/sync-parallel.test.ts` (new)
- `CHANGELOG.md`: v0.22.13 entry (renamed from v0.23.0)
- `VERSION` 0.22.5 → 0.22.13, `package.json` 0.22.6 → 0.22.13, `bun.lock` refreshed
- `TODOS.md`: A4 follow-up (plumb `database_url` through `SyncOpts`)
- `CLAUDE.md` + `README.md`: doc updates for the new modules, tests, and CLI flags
Documentation
Doc updates in this PR (commit 36c750b):
- CLAUDE.md: new "Key files" entries for `src/core/sync-concurrency.ts` (v0.22.13) and `src/core/db-lock.ts` (v0.22.13); added a new entry for `src/commands/sync.ts` (covers the lock, head-drift gate, engine.kind, vanished-file capture, parallel branch); updated the `src/commands/jobs.ts` entry with v0.22.13 sourceId resolution + autoConcurrency + noEmbed contract; added `test/sync-concurrency.test.ts`, `test/sync-parallel.test.ts`, and `test/e2e/sync-parallel.test.ts` to the test list with case counts and the `SYNC_PARALLEL_BENCH` grep marker; added a "Key commands added in v0.22.13" subsection.
- README.md: added the `--workers N` flag to the IMPORT section's `gbrain sync` and `gbrain import` lines, with the >100-file auto-parallelize note.
- TODOS.md: A4 follow-up to plumb `database_url` through `SyncOpts` so `loadConfig()` isn't re-read three times per sync.