fix(discord): clear stale heartbeat timers in SafeGatewayPlugin.connect()#65087
Conversation
Greptile SummaryAdds a Confidence Score: 5/5Safe to merge — targeted defensive fix with correct logic and appropriate test coverage for the reported crash scenario. No P0 or P1 issues found. The override correctly clears stale timers on both the normal and early-return code paths without introducing any new races. The mock faithfully represents the fields needed to exercise the override, and the test structure follows repo conventions. No files require special attention. Reviews (1): Last reviewed commit: "test(discord): assert super.connect() de..." | Re-trigger Greptile |
9a951fa to
ff03a1e
Compare
…ct()
The @buape/carbon@0.15.0 heartbeat setup has a race where stopHeartbeat()
runs before heartbeatInterval is assigned, leaving a stale setInterval with
a closed reconnectCallback. When the stale interval fires ~41s later it
throws an uncaught exception that bypasses the EventEmitter error path and
crashes the gateway process via process.on('uncaughtException').
Add a connect() override in SafeGatewayPlugin that unconditionally clears
both heartbeatInterval and firstHeartbeatTimeout before calling super. The
parent's connect() only calls stopHeartbeat() when isConnecting=false; when
isConnecting=true it returns early without clearing — this override fills
that gap.
Fixes openclaw#65009. Related: openclaw#64011, openclaw#63387, openclaw#62038.
The connect() override added in the heartbeat fix shifted the two pre-existing fetch() callsites from lines 370/436 to 387/453.
ff03a1e to
360d38e
Compare
…ct() (openclaw#65087) * fix(discord): clear stale heartbeat timers in SafeGatewayPlugin.connect() The @buape/carbon@0.15.0 heartbeat setup has a race where stopHeartbeat() runs before heartbeatInterval is assigned, leaving a stale setInterval with a closed reconnectCallback. When the stale interval fires ~41s later it throws an uncaught exception that bypasses the EventEmitter error path and crashes the gateway process via process.on('uncaughtException'). Add a connect() override in SafeGatewayPlugin that unconditionally clears both heartbeatInterval and firstHeartbeatTimeout before calling super. The parent's connect() only calls stopHeartbeat() when isConnecting=false; when isConnecting=true it returns early without clearing — this override fills that gap. Fixes openclaw#65009. Related: openclaw#64011, openclaw#63387, openclaw#62038. * test(discord): assert super.connect() delegation in SafeGatewayPlugin tests * fix(ci): update raw-fetch allowlist line numbers for gateway-plugin.ts The connect() override added in the heartbeat fix shifted the two pre-existing fetch() callsites from lines 370/436 to 387/453. * docs(changelog): add discord heartbeat crash note * test(cli): align plugin registry load-context mock --------- Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
…ct() (openclaw#65087) * fix(discord): clear stale heartbeat timers in SafeGatewayPlugin.connect() The @buape/carbon@0.15.0 heartbeat setup has a race where stopHeartbeat() runs before heartbeatInterval is assigned, leaving a stale setInterval with a closed reconnectCallback. When the stale interval fires ~41s later it throws an uncaught exception that bypasses the EventEmitter error path and crashes the gateway process via process.on('uncaughtException'). Add a connect() override in SafeGatewayPlugin that unconditionally clears both heartbeatInterval and firstHeartbeatTimeout before calling super. The parent's connect() only calls stopHeartbeat() when isConnecting=false; when isConnecting=true it returns early without clearing — this override fills that gap. Fixes openclaw#65009. Related: openclaw#64011, openclaw#63387, openclaw#62038. * test(discord): assert super.connect() delegation in SafeGatewayPlugin tests * fix(ci): update raw-fetch allowlist line numbers for gateway-plugin.ts The connect() override added in the heartbeat fix shifted the two pre-existing fetch() callsites from lines 370/436 to 387/453. * docs(changelog): add discord heartbeat crash note * test(cli): align plugin registry load-context mock --------- Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
…ct() (openclaw#65087) * fix(discord): clear stale heartbeat timers in SafeGatewayPlugin.connect() The @buape/carbon@0.15.0 heartbeat setup has a race where stopHeartbeat() runs before heartbeatInterval is assigned, leaving a stale setInterval with a closed reconnectCallback. When the stale interval fires ~41s later it throws an uncaught exception that bypasses the EventEmitter error path and crashes the gateway process via process.on('uncaughtException'). Add a connect() override in SafeGatewayPlugin that unconditionally clears both heartbeatInterval and firstHeartbeatTimeout before calling super. The parent's connect() only calls stopHeartbeat() when isConnecting=false; when isConnecting=true it returns early without clearing — this override fills that gap. Fixes openclaw#65009. Related: openclaw#64011, openclaw#63387, openclaw#62038. * test(discord): assert super.connect() delegation in SafeGatewayPlugin tests * fix(ci): update raw-fetch allowlist line numbers for gateway-plugin.ts The connect() override added in the heartbeat fix shifted the two pre-existing fetch() callsites from lines 370/436 to 387/453. * docs(changelog): add discord heartbeat crash note * test(cli): align plugin registry load-context mock --------- Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
What does this PR do?
Adds a
connect()override toSafeGatewayPluginthat clears stale heartbeat timers before delegating to the parent, preventing an intermittent uncaught exception that crashes the Discord gateway process and drops in-flight replies.Root Cause
@buape/carbon@0.15.0has a race in its heartbeat initialisation:When
sendHeartbeatdetects a zombie connection it callsstopHeartbeat(), which clearsheartbeatInterval— but the interval has not been assigned yet. ThesetIntervalon the next line then creates a timer whose closure holdsclosed=true. When it fires ~41 seconds later,reconnectCallback(closed=true)throws inside asetIntervalcallback. Node.js routes this toprocess.on('uncaughtException'), bypassing theEventEmitter.on('error')path the gateway supervisor monitors. systemd restarts the service and in-flight replies fail.Solution Applied
Override
connect()inSafeGatewayPluginto unconditionally clear bothheartbeatIntervalandfirstHeartbeatTimeoutbefore callingsuper.connect():The parent's
connect()only callsstopHeartbeat()whenisConnecting=false. WhenisConnecting=trueit returns early — leaving any stale timer alive. This override runs before that early-return check, ensuring stale timers are always cleared on reconnect.Bottleneck Solved
setIntervalfiring with a closedreconnectCallback@buape/carbon@0.15.0without requiring a version bumpTesting
Two new unit tests added to
extensions/discord/src/monitor/gateway-plugin.test.tsverifying thatheartbeatIntervalandfirstHeartbeatTimeoutare cleared whenconnect()is called whileisConnecting=true.Fixes #65009