Summary
When gbrain dream runs on Postgres, resolveLintContentSanity (src/commands/lint.ts:298-339) creates a fresh PostgresEngine to probe DB-plane lint config, then disconnects it in finally. That disconnect kills the module-level db.ts connection singleton that the cycle's main engine is still using. Every subsequent DB-touching phase then throws No database connection: connect() has not been called, and lock.release() throws CONNECTION_ENDED from postgres.js, stranding the row in gbrain_cycle_locks.
Mechanism (root cause confirmed via stack-trace instrumentation)
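1. connectEngine() creates PostgresEngine instance #1 (the cycle engine).
2. #1.connect() falls through to the else branch in src/core/postgres-engine.ts:175 (no poolSize passed) and calls db.connect(config), which creates the module-level sql singleton. Sets #1._connectionStyle = 'module'.
3. The cycle runs lint as its first phase; runLintCore calls resolveLintContentSanity().
4. resolveLintContentSanity calls createEngine(...) → PostgresEngine instance #2 to probe lifted config.
5. #2.connect({}) also falls through to the else branch. db.connect() sees sql already set and returns early. Sets #2._connectionStyle = 'module'.
6. lint.ts:319 calls engine.disconnect() on #2 in finally.
7. #2.disconnect() (postgres-engine.ts:192-211) sees _connectionStyle === 'module' and calls db.disconnect() → nulls the shared sql singleton.
8. Every subsequent DB-touching phase on #1 throws No database connection: connect() has not been called. runCycle's finally then throws CONNECTION_ENDED on lock.release().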
The comment at postgres-engine.ts:94-100 explicitly anticipates this category of bug, but the existing _connectionStyle: 'instance' | 'module' | null flag doesn't distinguish a singleton owner from a singleton borrower — both call db.disconnect().
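The failure mode described above can be reproduced in isolation without Postgres. The following is a minimal sketch, not the gbrain code: the `db` object and `Engine` class are hypothetical stand-ins for db.ts and PostgresEngine, and everything is synchronous for brevity, whereas the real calls are async.

```typescript
// Hypothetical stand-in for the module-level singleton in db.ts.
type Sql = { query: (text: string) => string };
let sql: Sql | null = null;

const db = {
  connect(): void {
    if (sql) return; // already connected: return early
    sql = { query: (text) => `ok: ${text}` };
  },
  disconnect(): void {
    sql = null; // drops the shared connection for every engine
  },
  getConnection(): Sql {
    if (!sql) throw new Error("No database connection: connect() has not been called");
    return sql;
  },
};

// Hypothetical stand-in for PostgresEngine on the module-singleton path.
class Engine {
  private connectionStyle: "module" | null = null;

  connect(): void {
    db.connect();
    this.connectionStyle = "module";
  }

  disconnect(): void {
    if (this.connectionStyle === "module") {
      db.disconnect(); // owner and borrower both reach this line
      this.connectionStyle = null;
    }
  }

  query(text: string): string {
    return db.getConnection().query(text);
  }
}

const cycleEngine = new Engine(); // plays the role of instance #1
cycleEngine.connect();

const probeEngine = new Engine(); // plays the role of instance #2
probeEngine.connect();
probeEngine.disconnect(); // clobbers the singleton that #1 still relies on

let failure = "";
try {
  cycleEngine.query("SELECT 1");
} catch (err) {
  failure = (err as Error).message;
}
console.log(failure); // prints "No database connection: connect() has not been called"
```

The probe engine never created the connection, yet its disconnect() is indistinguishable from the owner's, which is the gap the `_connectionStyle` flag leaves open.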
Repro
psql gbrain -c "DELETE FROM gbrain_cycle_locks;"
gbrain dream --dir ~/path/to/brain
psql gbrain -c "SELECT id, holder_pid FROM gbrain_cycle_locks;"
Expect: 1.4s run, every DB phase reports No database connection: connect() has not been called, stranded row in gbrain_cycle_locks.
(In a vanilla 0.41.2.0 install this currently surfaces differently because of a separate Bun WebCrypto issue that masks the post-extract_facts phases — see linked Bun issue draft. Switching src/core/schema-pack/manifest-v1.ts:274 and closure.ts:175 to sync node:crypto exposes the singleton-clobber bug cleanly.)
Suggested fix
Track singleton ownership in PostgresEngine. Only the instance whose connect() call actually created the singleton may disconnect it. Borrowers just clear their own marker:
// postgres-engine.ts: add field
private _ownsModuleSingleton: boolean = false;

// in connect(), else branch (the no-poolSize / module-singleton path):
} else {
  // Determine ownership BEFORE delegating to db.connect.
  let alreadyConnected = false;
  try {
    db.getConnection();
    alreadyConnected = true;
  } catch { /* sql null = we'll be the owner */ }
  await db.connect(config);
  this._connectionStyle = 'module';
  this._ownsModuleSingleton = !alreadyConnected;
  // ... existing connectionManager wiring
}

// in disconnect():
if (this._connectionStyle === 'module') {
  if (this._ownsModuleSingleton) {
    await db.disconnect();
    this._ownsModuleSingleton = false;
  }
  this._connectionStyle = null;
}
This fix is minimal, backward-compatible, and addresses the root cause rather than symptoms. The CLI's connectEngine() engine becomes the owner; any helper that creates a probe engine (lint, doctor, future probes) becomes a borrower automatically.
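To illustrate the owner/borrower behaviour outside the real codebase, here is a self-contained sketch of the same idea. The `db` object, `OwnedEngine` class, and `singletonOpen` helper are hypothetical stand-ins, and the calls are synchronous for brevity; only the ownership logic mirrors the suggested patch.

```typescript
// Hypothetical stand-in for the module-level singleton in db.ts.
type Sql = { query: (text: string) => string };
let sql: Sql | null = null;

const db = {
  connect(): void {
    if (sql) return;
    sql = { query: (text) => `ok: ${text}` };
  },
  disconnect(): void {
    sql = null;
  },
  getConnection(): Sql {
    if (!sql) throw new Error("No database connection: connect() has not been called");
    return sql;
  },
};

// Helper used only for the checks below (not part of the suggested patch).
function singletonOpen(): boolean {
  return sql !== null;
}

class OwnedEngine {
  private connectionStyle: "module" | null = null;
  private ownsModuleSingleton = false;

  connect(): void {
    // Determine ownership before delegating to db.connect().
    let alreadyConnected = false;
    try {
      db.getConnection();
      alreadyConnected = true;
    } catch {
      /* no singleton yet: this instance becomes the owner */
    }
    db.connect();
    this.connectionStyle = "module";
    this.ownsModuleSingleton = !alreadyConnected;
  }

  disconnect(): void {
    if (this.connectionStyle !== "module") return;
    if (this.ownsModuleSingleton) {
      db.disconnect(); // only the creator tears the singleton down
      this.ownsModuleSingleton = false;
    }
    this.connectionStyle = null; // borrowers just clear their own marker
  }

  query(text: string): string {
    return db.getConnection().query(text);
  }
}

const owner = new OwnedEngine();
owner.connect();

const borrower = new OwnedEngine();
borrower.connect();
borrower.disconnect(); // leaves the shared connection alone

const resultAfterBorrower = owner.query("SELECT 1");
const openAfterBorrower = singletonOpen();

owner.disconnect(); // the owner closes the singleton
const openAfterOwner = singletonOpen();
```

One consequence of this design, under the assumptions of the sketch: if the owner disconnects while a borrower is still in use, the borrower loses its connection too, so it suits short-lived probes that finish before the owner does.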
An alternative — refactor resolveLintContentSanity to accept an engine from the caller and skip its own probe — is also defensible but spreads the API contract through more call sites. The ownership flag contains the bug to the place it was introduced.
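For comparison, the alternative could look roughly like the sketch below. Every name here (`ProbeEngine`, `resolveContentSanitySketch`, `FakeEngine`, the `lint.content_sanity` key) is hypothetical and chosen for illustration; the real helper is async and its signature may differ.

```typescript
// Hypothetical minimal engine surface needed by the lint probe.
interface ProbeEngine {
  getConfig(key: string): string | null;
  disconnect(): void;
}

// Uses the caller's engine when one is supplied; otherwise creates a probe
// and disconnects only the engine it created itself.
function resolveContentSanitySketch(
  createProbe: () => ProbeEngine,
  callerEngine?: ProbeEngine,
): string | null {
  const borrowed = callerEngine !== undefined;
  const engine = callerEngine ?? createProbe();
  try {
    return engine.getConfig("lint.content_sanity"); // hypothetical config key
  } finally {
    if (!borrowed) engine.disconnect();
  }
}

// Fake engine that counts disconnect() calls, for demonstration only.
class FakeEngine implements ProbeEngine {
  disconnects = 0;
  getConfig(_key: string): string | null {
    return "enabled";
  }
  disconnect(): void {
    this.disconnects += 1;
  }
}

// Cycle path: the caller passes its own engine, which stays connected.
const cycle = new FakeEngine();
resolveContentSanitySketch(() => new FakeEngine(), cycle);
const borrowedDisconnects = cycle.disconnects;

// Standalone path: the helper creates and closes its own probe.
const probe = new FakeEngine();
resolveContentSanitySketch(() => probe);
const ownedDisconnects = probe.disconnects;
```

This shows the trade-off noted above: every caller that already holds an engine has to thread it through, whereas the ownership flag needs no call-site changes.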
Verification
After the fix:
Dream completes all 19 phases doing real DB work (sync adds pages, extract_facts reconciles, embed runs, etc.)
Lock row released cleanly each run (zero rows in gbrain_cycle_locks)
Per-phase report shows ✓ on every DB phase, not ✗ ... connect() has not been called
Tested locally on a 12,500-page brain over multiple runs. Full diagnostic write-up + stack traces available if useful.
Environment
gbrain 0.41.2.0
Bun 1.3.14
Postgres on macOS (Apple Silicon, localhost)
Related
Companion issue: silent lock.release() swallow (made this bug invisible for an unknown time).