Summary
scheduleReconnect() in src/gateway/client.ts retries indefinitely with exponential backoff capped at 30s, but has no maximum retry count. If the gateway is unreachable, the node client will loop forever, accumulating WebSocket listeners and zombie processes.
Root Cause
private scheduleReconnect() {
  if (this.closed) return;
  const delay = this.backoffMs;
  this.backoffMs = Math.min(this.backoffMs * 2, 30_000);
  setTimeout(() => this.start(), delay).unref(); // always retries, no limit
}
Impact
- Infinite loop when gateway is permanently unreachable
- Each reconnect attempt creates a new WebSocket + event listeners
- Observed: 272 zombie processes in Railway container after 266-day uptime
- Mac node CPU sustained at 16-18% during idle (expected <1%)
Suggested Fix
private retryCount = 0;
private readonly maxRetries = 10;

private scheduleReconnect() {
  if (this.closed) return;
  if (this.retryCount >= this.maxRetries) {
    this.opts.onConnectError?.(new Error("Max reconnection attempts reached"));
    return;
  }
  this.retryCount++;
  const delay = this.backoffMs;
  this.backoffMs = Math.min(this.backoffMs * 2, 30_000);
  setTimeout(() => this.start(), delay).unref();
}

// Reset on successful connect:
private onConnect() {
  this.retryCount = 0;
  // ... rest of connect logic
}
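To make the bounded schedule easy to check in isolation, the same policy can be sketched as a pure function. This is only an illustration: BackoffState, initialBackoff, nextBackoff, fullSchedule, and the 1s starting delay are assumptions that do not appear in src/gateway/client.ts. Only the 30s cap and the limit of 10 come from this issue.

```typescript
// Hypothetical helper, not part of src/gateway/client.ts.
interface BackoffState {
  attempt: number; // reconnect attempts made since the last successful connect
  delayMs: number; // delay to use for the next attempt
}

const INITIAL_DELAY_MS = 1_000; // assumed starting backoff; the issue does not state it
const MAX_DELAY_MS = 30_000;    // cap taken from the issue
const MAX_RETRIES = 10;         // limit proposed in the issue

function initialBackoff(): BackoffState {
  return { attempt: 0, delayMs: INITIAL_DELAY_MS };
}

// Returns the delay to wait plus the next state, or null once the limit is reached.
function nextBackoff(
  state: BackoffState,
): { waitMs: number; next: BackoffState } | null {
  if (state.attempt >= MAX_RETRIES) return null;
  return {
    waitMs: state.delayMs,
    next: {
      attempt: state.attempt + 1,
      delayMs: Math.min(state.delayMs * 2, MAX_DELAY_MS),
    },
  };
}

// Collects every delay used when the gateway never comes back.
function fullSchedule(): number[] {
  const delays: number[] = [];
  let state = initialBackoff();
  for (let step = nextBackoff(state); step !== null; step = nextBackoff(state)) {
    delays.push(step.waitMs);
    state = step.next;
  }
  return delays;
}
```

Under these assumed values the client gives up after 10 attempts and roughly three minutes of cumulative waiting. Calling initialBackoff() on a successful connect would reset both the counter and the delay in one step.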
Note
The LaunchAgent on macOS already handles process restart (30s interval), so the node client itself does not need to retry indefinitely. It can fail fast and let the OS-level supervisor handle the restart.
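One way the fail-fast behavior could be wired up is an onConnectError handler that exits with a non-zero code once the retry limit is reached. The sketch below is an assumption about that wiring, not existing code: makeFailFastHandler, Exit, and EXIT_GATEWAY_UNREACHABLE are hypothetical names, and the exit and log functions are injected so the handler can be exercised without terminating a real process.

```typescript
// Hypothetical wiring; only onConnectError is named in this issue.
type Exit = (code: number) => void;

const EXIT_GATEWAY_UNREACHABLE = 1; // any non-zero code works; the value is an assumption

function makeFailFastHandler(
  exit: Exit,
  log: (msg: string) => void,
): (err: Error) => void {
  let exited = false; // guard so a repeated error callback exits only once
  return (err: Error) => {
    if (exited) return;
    exited = true;
    log(`gateway unreachable, exiting for supervisor restart: ${err.message}`);
    exit(EXIT_GATEWAY_UNREACHABLE);
  };
}
```

In the node client this would presumably be passed as onConnectError, with `(code) => process.exit(code)` as the exit function and `console.error` as the logger.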
Environment