Summary
When the WhatsApp Web connection drops with status 499, OpenClaw enters a reconnect loop that retries every ~60 seconds indefinitely without exponential backoff. This creates a storm of reconnect attempts, log spam, and unnecessary resource consumption.
Observed Behavior
The reconnect cycle works like this:
1. WhatsApp heartbeat detects no messages for 30+ minutes
2. Forces reconnect → connection closed with status 499
3. Schedules retry 1/12 in ~2 seconds → reconnects successfully
4. 60 seconds later, heartbeat fires again → still no messages since the original timeout threshold
5. Forces another reconnect → goto step 2
The cycle repeats because the heartbeat uses the original lastInboundAt timestamp (which never gets updated since no actual messages arrive), so every new connection immediately triggers a new timeout detection.
This went on for hours (2:04 PM to 8:48 PM on April 2, 2026) generating hundreds of reconnect cycles.
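The cycle above can be modeled in a few lines. This is a minimal sketch, not OpenClaw's actual code: `HeartbeatState`, `shouldForceReconnect`, and `countForcedReconnects` are hypothetical names, and the only values taken from the report are the ~60-second check interval and the 30-minute threshold. It shows that once the threshold is crossed and `lastInboundAt` stays put, every subsequent heartbeat check forces a reconnect.

```typescript
// Hypothetical model of the heartbeat check described in this issue.
// All names here are illustrative assumptions, not OpenClaw internals.
const HEARTBEAT_INTERVAL_MS = 60_000;   // heartbeat runs roughly once a minute
const INBOUND_TIMEOUT_MS = 30 * 60_000; // "no messages" threshold (30 minutes)

interface HeartbeatState {
  lastInboundAt: number; // epoch ms of the last inbound message
}

// True when the connection is considered stale and a reconnect is forced.
function shouldForceReconnect(state: HeartbeatState, now: number): boolean {
  return now - state.lastInboundAt > INBOUND_TIMEOUT_MS;
}

// Simulates a quiet period: no messages arrive, and a successful reconnect
// leaves lastInboundAt untouched (the behavior reported above).
function countForcedReconnects(
  state: HeartbeatState,
  startMs: number,
  durationMs: number,
): number {
  let forced = 0;
  for (let now = startMs; now <= startMs + durationMs; now += HEARTBEAT_INTERVAL_MS) {
    if (shouldForceReconnect(state, now)) {
      forced += 1; // reconnect succeeds, but the baseline is not moved
    }
  }
  return forced;
}

// Last message at t=0; start checking at minute 31 and run for 10 minutes.
const quietState: HeartbeatState = { lastInboundAt: 0 };
const forcedCount = countForcedReconnects(quietState, 31 * 60_000, 10 * 60_000);
console.log(forcedCount); // 11: one forced reconnect per heartbeat check
```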
Log Evidence
14:04:58 No messages received in 39m - restarting connection
14:06:01 No messages received in 40m - restarting connection
14:07:05 No messages received in 41m - restarting connection
...
14:36:45 No messages received in 71m - restarting connection
...
20:41:06 No messages received in 30m - restarting connection
20:42:10 No messages received in 31m - restarting connection
20:43:14 No messages received in 32m - restarting connection
...continued until gateway was killed by auto-update at 20:48
Each cycle also triggers the false creds.json corruption restore (see #60625).
Root Cause Analysis
Two issues compound:
- No backoff between heartbeat-driven reconnects: After a successful reconnect, the heartbeat should reset its "time since last message" counter or at minimum apply exponential backoff before forcing another reconnect.
- lastInboundAt is not reset on reconnect: The heartbeat keeps comparing against the original last-message timestamp. Since no new messages arrive (likely because it's nighttime and nobody is messaging), every 60-second heartbeat check immediately exceeds the 30-minute threshold and forces yet another reconnect.
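One possible way to address the second point is to measure silence from whichever is later: the last inbound message or the last successful connect. The sketch below assumes a hypothetical `lastConnectedAt` field alongside `lastInboundAt`; whether OpenClaw tracks such a timestamp is not stated in this report.

```typescript
// Sketch of a silence check that uses the reconnect time as a new baseline.
// ConnectionClock and lastConnectedAt are assumptions made for illustration.
const STALE_AFTER_MS = 30 * 60_000; // 30-minute threshold from the report

interface ConnectionClock {
  lastInboundAt: number;   // epoch ms of the last inbound message
  lastConnectedAt: number; // epoch ms of the most recent successful (re)connect
}

// Silence is measured from the later of the two events, so a fresh
// connection always gets a full timeout window before it is judged stale.
function silenceBaseline(clock: ConnectionClock): number {
  return Math.max(clock.lastInboundAt, clock.lastConnectedAt);
}

function isStale(clock: ConnectionClock, now: number): boolean {
  return now - silenceBaseline(clock) > STALE_AFTER_MS;
}

// Last message 40 minutes ago, reconnected 1 minute ago: not stale yet.
const nowMs = 40 * 60_000;
const clock: ConnectionClock = { lastInboundAt: 0, lastConnectedAt: 39 * 60_000 };
console.log(isStale(clock, nowMs)); // false
```

With this baseline, the 60-second heartbeat that follows a reconnect no longer sees a 30+ minute gap, which breaks the immediate re-trigger described above.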
Expected Behavior
- After a reconnect, the heartbeat timer should reset (using the reconnect time as the new baseline)
- If multiple consecutive reconnects fail to receive messages, apply exponential backoff (e.g., 30min → 1h → 2h → 4h cap)
- After N consecutive failed reconnect cycles (e.g., 5-10), stop attempting and log an error suggesting manual intervention
- The 91MB error log generated by this loop should not be possible in normal operation
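The backoff and give-up behavior requested above could look roughly like this. It is a sketch under stated assumptions: the 30min → 1h → 2h → 4h schedule comes from the list above, the limit of 5 silent cycles is one value from the suggested 5-10 range, and `nextTimeoutMs` and `SilentCycleTracker` are hypothetical names rather than existing OpenClaw APIs.

```typescript
// Sketch of exponential backoff for heartbeat-driven reconnects.
// Names and the cycle limit are illustrative assumptions.
const BASE_TIMEOUT_MS = 30 * 60_000;     // first silence window: 30 minutes
const MAX_TIMEOUT_MS = 4 * 60 * 60_000;  // cap: 4 hours
const MAX_SILENT_CYCLES = 5;             // assumed limit (the issue suggests 5-10)

// Returns the silence window to wait before the next forced reconnect,
// or null once the limit is reached and reconnecting should stop.
function nextTimeoutMs(silentCycles: number): number | null {
  if (silentCycles >= MAX_SILENT_CYCLES) {
    return null;
  }
  return Math.min(BASE_TIMEOUT_MS * 2 ** silentCycles, MAX_TIMEOUT_MS);
}

// Tracks consecutive reconnects that produced no inbound messages.
class SilentCycleTracker {
  private silentCycles = 0;

  // Call when a forced reconnect completed but no message has arrived since.
  recordSilentReconnect(): number | null {
    this.silentCycles += 1;
    return nextTimeoutMs(this.silentCycles);
  }

  // Call on any inbound message: the connection is demonstrably alive.
  recordInbound(): void {
    this.silentCycles = 0;
  }

  currentTimeoutMs(): number | null {
    return nextTimeoutMs(this.silentCycles);
  }
}

const tracker = new SilentCycleTracker();
const minutes = (ms: number | null) => (ms === null ? "stop" : `${ms / 60_000}min`);
const schedule: string[] = [minutes(tracker.currentTimeoutMs())];
for (let i = 0; i < 5; i++) {
  schedule.push(minutes(tracker.recordSilentReconnect()));
}
console.log(schedule.join(" -> ")); // 30min -> 60min -> 120min -> 240min -> 240min -> stop
```

When `null` is returned, the caller would log a single error suggesting manual intervention instead of scheduling another reconnect, which also bounds the amount of log output the loop can produce.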
Impact
- Generated 91MB of error log in one night
- Constant WhatsApp reconnection churn
- Combined with the update that followed, left the gateway down for ~21 hours
Environment
- OpenClaw: 2026.4.2 (observed on 2026.3.31 before update)
- OS: macOS 25.3.0 (ARM64)
- Node: v25.6.1
- WhatsApp account type: Personal (linked device)
Related