fix(whatsapp): catch connection-phase errors in reconnect loop#14484
fix(whatsapp): catch connection-phase errors in reconnect loop#14484onthway wants to merge 5 commits intoopenclaw:mainfrom
Conversation
Additional Comments (1)
Prompt To Fix With AIThis is a comment left during a code review.
Path: src/web/inbound/monitor.ts
Line: 40:43
Comment:
**Leaked socket on connect failure**
`monitorWebInbox()` creates `sock` and then `await waitForWaConnection(sock)` (`src/web/inbound/monitor.ts:40-43`). If `waitForWaConnection` rejects (e.g. TLS/handshake errors), the function throws without closing `sock.ws`, and after this PR those errors will be retried in a loop. That can accumulate dangling connections/FDs during repeated failures; consider ensuring the socket is closed in a `catch`/`finally` path when connection setup fails.
How can I resolve this? If you propose a fix, please make it concise. |
monitorWebInbox() was called without try-catch inside the while(true) reconnect loop. If the connection attempt itself failed (e.g. ECONNRESET during TLS handshake to web.whatsapp.com), the exception propagated out of the loop and killed the channel permanently — even with maxAttempts:0 (unlimited retries). Wrap the listener creation in try-catch so connection-phase errors are treated as retryable disconnects with the same backoff/maxAttempts logic as post-connection closes. Co-Authored-By: onthway <hruicn@gmail.com>
c4bd407 to
3fce4bd
Compare
bfc1ccb to
f92900f
Compare
This comment was marked as spam.
This comment was marked as spam.
|
Thanks @nikolasdehor for the thorough comparison and for flagging the test from #9727! I've addressed both points:
Should be ready for review now. cc @tyler6204 @gumadeiras @sebslight |
nikolasdehor
left a comment
There was a problem hiding this comment.
Thorough re-review after the latest commits (3fce4bd, 85c0b18, 7accb75, 0590b23). Everything looks good — approving.
Socket leak fix (src/web/inbound/monitor.ts) — sock.ws?.close() in the catch path after waitForWaConnection rejects is exactly the right fix. The inner try/catch around the close call is good defensive coding since the socket may be partially initialized. This directly addresses the Greptile feedback and prevents FD accumulation during repeated connection failures.
Reconnect loop fix (src/web/auto-reply/monitor.ts) — The try-catch around listenerFactory invocation correctly routes connection-phase errors (ECONNRESET, ENOTFOUND, etc.) into the existing backoff/retry path. The catch block properly:
- Increments
reconnectAttemptsand updates status - Respects
maxAttemptslimit - Computes backoff via
computeBackoff - Handles abort signal during sleep
- Uses
formatDurationPrecise(not the oldformatDurationMs)
Test (reconnects-after-initial-connection-failure) — Good port from #9727. The mock correctly throws on first attempt, succeeds on second, and verifies both the retry and the backoff sleep. The sendMessage/sendPoll/sendReaction/sendComposingTo stubs (commit 7accb75) properly match the expected return type.
All CI checks passing. This fixes the root cause of #13371 — connection-phase exceptions no longer crash the reconnect loop. Ship it!
|
Closing as duplicate of #9727. If this is incorrect, please contact us. |
Summary
monitorWebInbox()inside thewhile(true)reconnect loop was not wrapped in try-catchECONNRESETduring TLS handshake toweb.whatsapp.com), the exception propagated out of the loop, permanently killing the WhatsApp channel — even withmaxAttempts: 0(unlimited retries)Root cause
The reconnect loop handles two phases differently:
listener.onClose).catch()→ retryablemonitorWebInbox())Real-world failure sequence observed:
monitorWebInbox()→waitForWaConnection()→ TLS handshakeECONNRESETonweb.whatsapp.com:443→ promise rejectswhile(true)→ channel exits permanentlyTest plan
pnpm buildpassespnpm testpasses (6492 passed, 4 pre-existing failures unrelated)🤖 Generated with Claude Code
Greptile Overview
Greptile Summary
This change wraps WhatsApp Web listener creation inside the reconnect
while (true)loop with atry/catch, so connection-phase failures (e.g. errors thrown bymonitorWebInbox()/waitForWaConnection()) flow through the same backoff/retry path as post-connection closes.The logic fits the existing reconnect design (backoff via
computeBackoff, status emission viatuning.statusSink, and disconnect events viaenqueueSystemEvent).Confidence Score: 3/5
monitorWebInboxallocates a socket before awaiting connection and does not close it on connection-phase rejection; with the new retry behavior that can accumulate resources in real-world flaky-network scenarios.(2/5) Greptile learns from your feedback when you react with thumbs up/down!