fix(engine): module-singleton ownership guard — stop stray sub-engines nulling the shared pool (#1570)#1667
Closed
joelwp wants to merge 1 commit into
Closed
Conversation
…s nulling the shared pool (garrytan#1570) A second PostgresEngine connected in module mode (poolSize unset) SHARES the process-global db.ts singleton rather than owning a private pool. Its disconnect() unconditionally calls db.disconnect(), tearing the singleton down out from under the engine that established it. Any concurrent code path reading db.getConnection() in that window throws "connect() has not been called". Reproduced deterministically in the dream cycle: the lint phase's resolveLintContentSanity() does engine.connect({}) (module mode) + disconnect() to read 4 config keys; on a Supabase session pooler (frequent connection blips) this nulls the main cycle engine's singleton mid-run, so sync/synthesize crash (0 pages). Phases wrapped in withRetry+reconnect (extract) recover; others die. v0.41.27's withRetry self-heal only covers retry-wrapped paths, so garrytan#1570 persisted for sync/synthesize. Fix: track _ownsModuleSingleton (true only for the engine whose connect() actually established the singleton, via new db.isConnected()) and gate db.disconnect() on it. A sharing engine's disconnect() now leaves the global intact. Instance pools (poolSize set) are unchanged. This restores reconnect()'s own documented "no-ops in module-singleton mode" intent.
Contributor
Author
|
Closing — superseded by the canonical landing in #1805 (merged as v0.42.21.0), which generalizes this ownership-guard approach (also covers the doctor borrower, the reconnect-promise race, and adds a CLI tripwire). Thanks for folding in the ownership-flag approach from here. 🙏 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
A second
PostgresEngineconnected in module mode (nopoolSize) shares the process-globaldb.tssingleton instead of owning a private pool. Itsdisconnect()unconditionally callsdb.disconnect(), tearing down the singleton out from under the engine that established it. Any concurrent code readingdb.getConnection()in that window throwsNo database connection: connect() has not been called.This is the root cause behind #1570 for non-retry-wrapped callers.
Deterministic repro (dream cycle)
The
lintphase'sresolveLintContentSanity()creates a throwaway engine viaengine.connect({})(module mode) purely to read 4 config keys, thendisconnect()s it. On a Supabase session pooler (frequent connection blips) this nulls the main cycle engine's singleton mid-run, sosync/synthesizecrash withconnect() has not been called→ 0 pages written. Phases wrapped inwithRetry+reconnect(e.g.extract) self-heal; un-wrapped phases die.Captured live with per-instance tracing:
v0.41.27's
withRetryself-heal only covers retry-wrapped paths, so #1570 persisted for everyone else whenever a stray module-mode sub-engine is torn down.Fix
Track
_ownsModuleSingleton— true only for the engine whoseconnect()actually established the singleton (via newdb.isConnected()) — and gatedb.disconnect()on it. A sharing engine'sdisconnect()now leaves the global intact. Instance pools (poolSizeset) are unchanged. This also restoresreconnect()'s own documented "no-ops in module-singleton mode" intent, which never fired becauseconnect()always sets_savedConfig.Test
Ran the full dream cycle repeatedly against a Supabase pooler: 0
connect() has not been callederrors (previously every run),sync/synthesizegreen, 19 synth + 8 pattern pages written. Instance-pool worker paths unaffected.