Description
Bug Report: WhatsApp gateway fails to reconnect after DNS resolution failure
Summary
When the WhatsApp gateway encounters a DNS resolution failure (ENOTFOUND), it exits the channel and does not attempt automatic reconnection, even after DNS is restored. The gateway process continues running but the WhatsApp listener remains dead until manual intervention (gateway restart or relink).
Environment
- Clawdbot version: 2026.1.24-3
- Node version: v25.4.0
- OS: macOS Darwin 22.6.0 (x64)
- Channel: WhatsApp
Steps to Reproduce
- Have WhatsApp gateway connected and working
- DNS resolution fails for web.whatsapp.com (e.g., due to DNS server issues)
- Gateway logs the DNS error and exits the WhatsApp channel
- DNS is restored and working
- Observe that WhatsApp does not automatically reconnect
Expected Behavior
After a DNS resolution failure, the gateway should:
- Retry DNS resolution with exponential backoff
- Automatically reconnect the WhatsApp listener once DNS is available again
- Not require manual restart or relink
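A minimal sketch of the retry behavior described above, not taken from the Clawdbot source. The names connectWithBackoff, backoffDelay and BackoffOptions are hypothetical, and the sketch retries on any error for brevity; a real implementation would presumably stop on non-recoverable failures such as a logged-out session.

```typescript
// Hypothetical helper: none of these names come from the Clawdbot codebase.
type Sleep = (ms: number) => Promise<void>;

interface BackoffOptions {
  baseMs: number;      // delay before the first retry
  maxMs: number;       // cap on any single delay
  maxAttempts: number; // pass Infinity to keep retrying until the connection returns
  sleep?: Sleep;       // injectable so tests do not have to wait
}

const defaultSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Delay before retry number `attempt` (1-based): baseMs * 2^(attempt - 1), capped at maxMs.
function backoffDelay(attempt: number, baseMs: number, maxMs: number): number {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

// Calls `connect` until it succeeds or maxAttempts is reached,
// sleeping with exponential backoff between attempts.
async function connectWithBackoff<T>(
  connect: () => Promise<T>,
  opts: BackoffOptions,
): Promise<T> {
  const sleep = opts.sleep ?? defaultSleep;
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      if (attempt === opts.maxAttempts) break; // out of attempts, give up
      await sleep(backoffDelay(attempt, opts.baseMs, opts.maxMs));
    }
  }
  throw lastError;
}
```

With baseMs set to 2000 and maxMs to 60000, the delays would be 2 s, 4 s, 8 s and so on, capped at 60 s. With maxAttempts set to Infinity, a DNS outage of several hours would still end in a reconnect once resolution works again.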
Actual Behavior
The WhatsApp channel exits permanently after a DNS failure. The gateway process continues running, but:
- WhatsApp listener is marked as inactive
- No automatic reconnection attempts are made
- Manual clawdbot gateway restart is required to restore connectivity
- Attempting to relink via UI/tool times out ("Timed out waiting for WhatsApp QR") even though whatsapp_login reports "WhatsApp is already linked"
Timeline from Logs
2026-01-26T02:19:40.197Z [whatsapp] Web connection closed (status 408). Retry 1/12 in 2.06s…
2026-01-26T02:19:45.865Z [whatsapp] Listening for personal WhatsApp inbound messages. ← reconnected OK
2026-01-26T04:20:32.556Z [whatsapp] Web connection closed (status 408). Retry 1/12 in 2.12s…
2026-01-26T04:20:34.866Z [whatsapp] [default] channel exited: {
"error": {
"data": {
"errno": -3008,
"code": "ENOTFOUND",
"syscall": "getaddrinfo",
"hostname": "web.whatsapp.com"
},
"output": {
"statusCode": 408,
"payload": {
"error": "Request Time-out",
"message": "WebSocket Error (getaddrinfo ENOTFOUND web.whatsapp.com)"
}
}
}
}
After 04:20:34, no further reconnection attempts were logged. The gateway continued running (PID unchanged) but WhatsApp remained dead.
At 07:20, DNS was confirmed working:
$ nslookup web.whatsapp.com 8.8.8.8
Server: 8.8.8.8
Address: 8.8.8.8#53
web.whatsapp.com canonical name = mmx-ds.cdn.whatsapp.net.
Name: mmx-ds.cdn.whatsapp.net
Address: 57.144.239.32
Yet at 07:20, attempts to send messages failed:
[tools] message failed: Error: No active WhatsApp Web listener (account: default).
Start the gateway, then link WhatsApp with: clawdbot channels login --channel whatsapp --account default.
Attempts to generate a new QR code also failed:
Failed to get QR: Error: Timed out waiting for WhatsApp QR
Resolution
Only a full clawdbot gateway restart restored WhatsApp connectivity:
2026-01-26T11:15:46.170Z [gateway] signal SIGTERM received
2026-01-26T11:15:57.919Z [gateway] listening on ws://0.0.0.0:18789 (PID 13538)
2026-01-26T11:15:58.020Z [whatsapp] [default] starting provider (+447580380000)
2026-01-26T11:15:59.454Z [whatsapp] Listening for personal WhatsApp inbound messages.
Analysis
The retry logic handles status 408 (timeout) correctly — it retries and reconnects. However, when a retry itself fails due to DNS (ENOTFOUND), the channel exits completely rather than continuing to retry.
The WhatsApp provider enters a dead state where:
- It's not listening for messages
- It won't respond to relink requests (times out)
- The only recovery is a full gateway restart
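One way the reconnect path could tell "keep retrying" errors apart from fatal ones is to classify by error code. The sketch below is an assumption about how that might look, not the gateway's actual code: isTransientNetworkError and extractErrorCode are hypothetical names, and the list of codes is my own choice of standard Node.js system error codes. It reads the code from err.data.code, which is where the channel-exit payload in the log above carries it.

```typescript
// Node.js system error codes that usually clear up without intervention.
const TRANSIENT_CODES = new Set<string>([
  "ENOTFOUND",   // getaddrinfo could not resolve the hostname
  "EAI_AGAIN",   // temporary DNS failure
  "ECONNRESET",  // connection reset by peer
  "ETIMEDOUT",   // connection attempt timed out
  "ENETUNREACH", // no route to the network
]);

// Reads a Node-style error code from err.code or, failing that, from err.data.code
// (the shape seen in the logged "channel exited" payload).
function extractErrorCode(err: unknown): string | undefined {
  if (typeof err !== "object" || err === null) return undefined;
  const e = err as { code?: unknown; data?: { code?: unknown } | null };
  if (typeof e.code === "string") return e.code;
  const nested = e.data?.code;
  if (typeof nested === "string") return nested;
  return undefined;
}

// True when the error looks like a temporary network problem worth retrying.
function isTransientNetworkError(err: unknown): boolean {
  const code = extractErrorCode(err);
  return code !== undefined && TRANSIENT_CODES.has(code);
}
```

With a check like this, the ENOTFOUND payload from 04:20:34 would be classified as transient and could feed back into the retry loop instead of ending the channel.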
Suggested Fix
- DNS failures during reconnection should not exit the channel — they should trigger continued retries with backoff
- Consider a "channel health check" that can restart dead channels without requiring a full gateway restart
- The whatsapp_login tool should be able to force-restart a dead channel, not just generate a QR for an unlinked account
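To illustrate the health-check idea, here is a rough sketch under stated assumptions. The Channel interface, restartDeadChannels and startHealthCheck are all hypothetical; I have not checked how the gateway actually models its providers.

```typescript
// Hypothetical channel interface; the real provider API may look different.
interface Channel {
  name: string;
  isListening(): boolean;
  restart(): Promise<void>;
}

// One health-check pass: try to restart every channel that is not listening.
// Resolves with the names of the channels whose restart succeeded.
async function restartDeadChannels(channels: Channel[]): Promise<string[]> {
  const restarted: string[] = [];
  for (const channel of channels) {
    if (channel.isListening()) continue;
    try {
      await channel.restart();
      restarted.push(channel.name);
    } catch {
      // Leave the channel for the next pass; the cause may be temporary (e.g. DNS still down).
    }
  }
  return restarted;
}

// Runs the check every intervalMs and returns a function that stops it.
// The `running` flag prevents overlapping passes when a restart is slow.
function startHealthCheck(channels: Channel[], intervalMs: number): () => void {
  let running = false;
  const timer = setInterval(() => {
    if (running) return;
    running = true;
    void restartDeadChannels(channels).finally(() => {
      running = false;
    });
  }, intervalMs);
  return () => clearInterval(timer);
}
```

A periodic pass like this would have picked up the dead WhatsApp listener some time after DNS came back, without a SIGTERM to the whole gateway. The same restart() entry point could also back a force-restart option on whatsapp_login.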