
fix(engine): module-singleton ownership guard - stop stray sub-engines nulling the shared pool (#1570) #1667

Closed
joelwp wants to merge 1 commit into garrytan:master from joelwp:pr/module-singleton-ownership-1570


joelwp commented May 30, 2026

Contributor

Summary

A second PostgresEngine connected in module mode (no poolSize) shares the process-global db.ts singleton instead of owning a private pool. Its disconnect() unconditionally calls db.disconnect(), tearing down the singleton out from under the engine that established it. Any concurrent code that calls db.getConnection() in that window throws "No database connection: connect() has not been called".

This is the root cause behind #1570 for non-retry-wrapped callers.

Deterministic repro (dream cycle)

The lint phase's resolveLintContentSanity() creates a throwaway engine via engine.connect({}) (module mode) purely to read 4 config keys, then calls disconnect() on it. On a Supabase session pooler (frequent connection blips) this nulls the main cycle engine's singleton mid-run, so sync/synthesize crash with "connect() has not been called" and 0 pages are written. Phases wrapped in withRetry+reconnect (e.g. extract) self-heal; unwrapped phases die.

Captured live with per-instance tracing:

connect    id=1 module      # main cycle engine
connect    id=2 module      # lint's resolveLintContentSanity probe
disconnect id=2 module      # -> db.disconnect() nulls the SHARED singleton
... main engine's next getConnection() throws ...
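
The trace above can be mimicked with a minimal sketch. Everything in it is a hypothetical stand-in (sharedConn, dbConnect, dbDisconnect, dbGetConnection, UnguardedEngine, reproduce); it only assumes that the pre-fix disconnect() tears down the shared connection unconditionally, as described in the Summary, and is not the project's actual code.

```typescript
// Hypothetical stand-in for the process-global db.ts singleton.
let sharedConn: { id: number } | null = null;

function dbConnect(): void {
  // Reuse the existing connection if one is already established.
  if (sharedConn === null) sharedConn = { id: 1 };
}

function dbDisconnect(): void {
  sharedConn = null;
}

function dbGetConnection(): { id: number } {
  if (sharedConn === null) {
    throw new Error("No database connection: connect() has not been called");
  }
  return sharedConn;
}

// Assumed pre-fix behavior: every module-mode engine shares the singleton,
// and disconnect() always tears it down, whoever established it.
class UnguardedEngine {
  connect(): void {
    dbConnect();
  }
  disconnect(): void {
    dbDisconnect();
  }
}

// Replays the traced sequence and returns the outcome of the main
// engine's next connection read.
function reproduce(): string {
  const main = new UnguardedEngine(); // id=1, main cycle engine
  const probe = new UnguardedEngine(); // id=2, lint's config probe
  main.connect();
  probe.connect();
  probe.disconnect(); // nulls the shared singleton
  try {
    dbGetConnection();
    return "ok";
  } catch (err) {
    return (err as Error).message;
  }
}
```

In this sketch reproduce() returns the error message rather than "ok", because the probe's disconnect() has already cleared the connection the main engine depends on.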

v0.41.27's withRetry self-heal only covers retry-wrapped paths, so #1570 persisted for everyone else whenever a stray module-mode sub-engine is torn down.
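
To illustrate why only retry-wrapped paths recover, here is a simplified, synchronous sketch of a retry-then-reconnect wrapper. The project's real withRetry is not shown in this PR and is presumably asynchronous; withRetrySketch, live, and query are hypothetical names used only for illustration.

```typescript
// Hypothetical, simplified wrapper: run op(), and on failure call
// reconnect() before trying again, up to maxAttempts attempts in total.
function withRetrySketch<T>(
  op: () => T,
  reconnect: () => void,
  maxAttempts: number = 2
): T {
  let lastError: unknown = new Error("withRetrySketch: maxAttempts must be >= 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return op();
    } catch (err) {
      lastError = err;
      // Only reconnect if another attempt will follow.
      if (attempt < maxAttempts) reconnect();
    }
  }
  throw lastError;
}

// Hypothetical connection state and query, standing in for a phase's DB read.
let live = false;
const query = (): string => {
  if (!live) throw new Error("connect() has not been called");
  return "row";
};
```

Under these assumptions, withRetrySketch(query, () => { live = true; }) succeeds on its second attempt, while a bare query() call against a torn-down connection simply throws, which matches the split between extract and sync/synthesize described above.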

Fix

Track _ownsModuleSingleton, which is true only for the engine whose connect() actually established the singleton (detected via a newly added db.isConnected()), and gate db.disconnect() on it. A sharing engine's disconnect() now leaves the global intact. Instance pools (poolSize set) are unchanged. This also restores reconnect()'s own documented "no-ops in module-singleton mode" intent, which never fired because connect() always sets _savedConfig.
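
A minimal sketch of the ownership guard, assuming a module-level singleton that exposes isConnected(), connect(), getConnection(), and disconnect(). The pool object and GuardedEngine class are hypothetical stand-ins rather than the actual PostgresEngine or db.ts code; only the flag name and the isConnected() check come from the description above.

```typescript
type PoolConn = { id: number };

// Hypothetical stand-in for the shared module singleton.
let poolConn: PoolConn | null = null;

const pool = {
  isConnected: (): boolean => poolConn !== null,
  connect: (): void => {
    if (poolConn === null) poolConn = { id: 1 };
  },
  getConnection: (): PoolConn => {
    if (poolConn === null) {
      throw new Error("No database connection: connect() has not been called");
    }
    return poolConn;
  },
  disconnect: (): void => {
    poolConn = null;
  },
};

class GuardedEngine {
  // True only for the engine whose connect() established the singleton.
  private _ownsModuleSingleton = false;

  connect(): void {
    const alreadyConnected = pool.isConnected();
    pool.connect();
    this._ownsModuleSingleton = !alreadyConnected;
  }

  disconnect(): void {
    // A borrowing engine leaves the shared connection intact.
    if (!this._ownsModuleSingleton) return;
    pool.disconnect();
    this._ownsModuleSingleton = false;
  }
}

// Usage: the probe borrows the singleton, so its disconnect() is a no-op.
const mainEngine = new GuardedEngine();
const probeEngine = new GuardedEngine();
mainEngine.connect();
probeEngine.connect();
probeEngine.disconnect();
pool.getConnection(); // does not throw: the owner has not disconnected
mainEngine.disconnect(); // the owner tears the singleton down
```

Recording ownership at connect() time keeps the decision local to each engine, so the shared module does not need reference counting; the trade-off in this sketch is that borrowers must not outlive the owner.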

Test

Ran the full dream cycle repeatedly against a Supabase pooler: zero "connect() has not been called" errors (previously they occurred on every run), sync/synthesize green, 19 synth + 8 pattern pages written. Instance-pool worker paths unaffected.

…s nulling the shared pool (garrytan#1570)

A second PostgresEngine connected in module mode (poolSize unset) SHARES the
process-global db.ts singleton rather than owning a private pool. Its
disconnect() unconditionally calls db.disconnect(), tearing the singleton down
out from under the engine that established it. Any concurrent code path reading
db.getConnection() in that window throws "connect() has not been called".

Reproduced deterministically in the dream cycle: the lint phase's
resolveLintContentSanity() does engine.connect({}) (module mode) + disconnect()
to read 4 config keys; on a Supabase session pooler (frequent connection blips)
this nulls the main cycle engine's singleton mid-run, so sync/synthesize crash
(0 pages). Phases wrapped in withRetry+reconnect (extract) recover; others die.
v0.41.27's withRetry self-heal only covers retry-wrapped paths, so garrytan#1570
persisted for sync/synthesize.

Fix: track _ownsModuleSingleton (true only for the engine whose connect()
actually established the singleton, via new db.isConnected()) and gate
db.disconnect() on it. A sharing engine's disconnect() now leaves the global
intact. Instance pools (poolSize set) are unchanged. This restores reconnect()'s
own documented "no-ops in module-singleton mode" intent.

joelwp commented Jun 4, 2026

Contributor Author

Closing — superseded by the canonical landing in #1805 (merged as v0.42.21.0), which generalizes this ownership-guard approach (also covers the doctor borrower, the reconnect-promise race, and adds a CLI tripwire). Thanks for folding in the ownership-flag approach from here. 🙏

joelwp closed this Jun 4, 2026