Skip to content

fix(postgres-engine): only the singleton owner may disconnect it (#1471)#1651

Closed
nullhex-io wants to merge 2 commits into
garrytan:masterfrom
nullhex-io:fix/db-singleton-ownership
Closed

fix(postgres-engine): only the singleton owner may disconnect it (#1471)#1651
nullhex-io wants to merge 2 commits into
garrytan:masterfrom
nullhex-io:fix/db-singleton-ownership

Conversation

@nullhex-io

Copy link
Copy Markdown

Problem

Fixes #1471 (root cause). Should also resolve the symptom issues that stem from the same module-singleton-null-mid-cycle defect: #1404, #1413, #1491, #1619.

A PostgresEngine connecting via the module-singleton path (no poolSize) is tagged _connectionStyle = 'module' whether it CREATED the shared db.ts sql singleton (the owner - the CLI cycle engine) or merely found it already connected (a borrower - the probe engine in resolveLintContentSanity, doctor, future probes). The old disconnect() called db.disconnect() for BOTH, so a short-lived borrower's finally-block teardown nulled the singleton the owner cycle engine was still using. Every later phase then threw No database connection: connect() has not been called, and runCycle's finally stranded a row in gbrain_cycle_locks.

_connectionStyle alone can't separate owner from borrower - both are 'module'. This implements the ownership-flag fix proposed in #1471.

Fix

  • db.ts: add isConnected() so an engine can tell, at connect() time, whether the singleton already exists.
  • postgres-engine.ts: add _ownsModuleSingleton. connect() samples ownership before delegating to db.connect(); disconnect() calls db.disconnect() only when this engine owns the singleton. A borrower clears its own marker without touching the shared connection.

Scope:

  • Instance-pool engines (poolSize set) are unaffected - they never take the module branch.
  • reconnect() stays correct: an owner's disconnect+reconnect re-acquires ownership (the singleton is null right after its own db.disconnect()), and a borrower re-borrows the owner's (possibly refreshed) singleton.

Tests

  • New test/postgres-engine-singleton-ownership.test.ts: source-level guardrails (DB-free, matching the existing postgres-engine.test.ts convention, since runtime-mocking postgres.js's tagged-template interface is painful under bun ESM). Asserts db.isConnected() is sampled before db.connect(), and that the only db.disconnect() in disconnect() is guarded by _ownsModuleSingleton.
  • bun run typecheck: clean.
  • Regression (59 tests green): postgres-engine, db-disconnect-audit, connection-manager.serial, autopilot-reconnect-classifier, worker-shutdown-disconnect.

The live gbrain dream cycle is the e2e exercise of this path; the ownership logic above is what keeps the owner's singleton alive across a borrower probe's teardown.

…rytan#1471)

A PostgresEngine connecting via the module-singleton path (no poolSize) got
_connectionStyle='module' whether it CREATED the shared db.ts sql singleton
(owner, e.g. the CLI cycle engine) or merely found it already connected
(borrower, e.g. the probe engine in resolveLintContentSanity / doctor).
disconnect() then called db.disconnect() for both, so a short-lived borrower's
teardown nulled the singleton the owner was still using - every later cycle
phase threw "No database connection: connect() has not been called" and the
cycle lock stranded in gbrain_cycle_locks.

Track ownership: db.isConnected() lets connect() record whether THIS engine
created the singleton; disconnect() only calls db.disconnect() when it owns it.
Borrowers clear their own marker without touching the shared connection.
Instance-pool engines (poolSize set) are unaffected. reconnect() stays correct:
an owner disconnect+reconnect re-acquires ownership; a borrower re-borrows.
…nnect ownership

Follow-up doc nits on the garrytan#1471 fix:
- reconnect() doc no longer implies module-singleton engines lack saved config
  (both paths set _savedConfig; both reconnect, re-sampling ownership).
- Note that concurrent module-path connects both claiming ownership is benign
  (db.disconnect() is idempotent once sql is null).
@nullhex-io

Copy link
Copy Markdown
Author

#1805 looks like the canonical landing for this bug class and credits this approach - appreciated. Happy to close this in favor of it; leaving it open for now since #1805 still shows conflicts against master. Say the word if you'd rather I rebase this one instead.

@garrytan

garrytan commented Jun 8, 2026

Copy link
Copy Markdown
Owner

Thanks for this! The Postgres module-singleton / "connect() has not been called" ownership bug was fixed canonically in #1805 (atomic owner-only-disconnect), with follow-ups in #1608 and #1572 — all merged to master. Closing as already-resolved. If you still reproduce after upgrading, please reopen, and thank you for the contribution.

@garrytan garrytan closed this Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

resolveLintContentSanity disconnects shared module-level db singleton, killing the cycle's main engine connection

2 participants