
[Bug]: WhatsApp event loop blocked (eventLoopDelayMaxMs=12088ms) on first inbound message — 2026.5.3-1 Windows #77443

@zaff70

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

After upgrading to 2026.5.3-1 on Windows 11, the WhatsApp plugin blocks the Node.js event loop (eventLoopDelayMaxMs=12088ms, eventLoopUtilization=0.999) immediately after processing the first inbound message, permanently stalling all agent model calls.

Steps to reproduce

  1. Install OpenClaw 2026.5.3-1 on Windows 11 (npm install -g openclaw@2026.5.3-1)
  2. Install the WhatsApp plugin manually. Running openclaw plugins install @openclaw/whatsapp --force fails because of an npm bug; the workaround is: cd C:\Users\<user>\.openclaw\npm && npm install @openclaw/whatsapp@2026.5.3 --omit=dev --ignore-scripts --no-audit --no-fund --legacy-peer-deps
  3. Enable plugin in config: openclaw config patch with {"plugins": {"entries": {"whatsapp": {"enabled": true}}}}
  4. Start gateway: gateway-start.cmd
  5. Confirm WhatsApp connects and shows Listening for personal WhatsApp inbound messages
  6. Send any inbound WhatsApp message to the linked number
  7. Observe: gateway processes the message and sends ack reaction, then event loop blocks permanently

Expected behavior

WhatsApp should process inbound messages and continue operating normally without blocking the event loop, as it did in OpenClaw 2026.4.23 (a979721) where the same phone number, credentials, and configuration handled messages stably with no event loop degradation.

Actual behavior

After the first inbound message is processed, the gateway logs the following liveness warning:

liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu interval=34s eventLoopDelayP99Ms=12088 eventLoopDelayMaxMs=12088 eventLoopUtilization=0.999 cpuCoreRatio=1.006 active=2 waiting=0 queued=0

The WhatsApp channel then exits with status 428 (Connection Terminated) and attempts auto-restart. All subsequent agent model calls stall indefinitely. Rolling back to 2026.4.23 immediately restores normal operation with the same credentials and configuration.
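
For context on the metric: an event loop delay figure such as eventLoopDelayMaxMs usually describes how much later than scheduled a timer callback actually ran. How OpenClaw computes it is not shown in this report, so the following is only an illustrative timer-drift sketch. LagTracker and startLagProbe are hypothetical names, not OpenClaw APIs. Node.js also ships perf_hooks.monitorEventLoopDelay() for the same kind of measurement.

```typescript
// Illustrative sketch only: LagTracker and startLagProbe are hypothetical
// names and are not part of OpenClaw.
class LagTracker {
  private samples: number[] = [];

  // Lag is how much later than scheduled the timer callback actually ran.
  record(scheduledAtMs: number, firedAtMs: number): void {
    this.samples.push(Math.max(0, firedAtMs - scheduledAtMs));
  }

  maxMs(): number {
    return this.samples.reduce((m, s) => Math.max(m, s), 0);
  }

  // Nearest-rank percentile, with p in (0, 100].
  percentileMs(p: number): number {
    if (this.samples.length === 0) return 0;
    const sorted = this.samples.slice().sort((a, b) => a - b);
    const rank = Math.ceil((p / 100) * sorted.length);
    const index = Math.min(sorted.length, Math.max(1, rank)) - 1;
    return sorted[index] ?? 0;
  }
}

// Samples timer drift every intervalMs; returns a function that stops the probe.
function startLagProbe(tracker: LagTracker, intervalMs: number = 100): () => void {
  let expected = Date.now() + intervalMs;
  const timer = setInterval(() => {
    const now = Date.now();
    tracker.record(expected, now);
    expected = now + intervalMs;
  }, intervalMs);
  return () => clearInterval(timer);
}

// Usage (not started here, so the snippet exits immediately):
//   const tracker = new LagTracker();
//   const stop = startLagProbe(tracker);
//   ... later: stop(); console.log(tracker.maxMs(), tracker.percentileMs(99));
```

With only a handful of samples in an interval, the nearest-rank P99 equals the maximum, which would be consistent with the identical eventLoopDelayP99Ms and eventLoopDelayMaxMs values in the warning above.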

OpenClaw version

OpenClaw 2026.5.3-1 (2eae30e)

Operating system

Windows 11 Home

Install method

npm global install (npm install -g openclaw@2026.5.3-1)

Model

anthropic/claude-sonnet-4-6 (primary), openai/gpt-5 (fallback)

Provider / routing chain

Direct Anthropic API (api_key mode) — no proxy or router

Additional provider/model setup details

Two agents configured: main (primary: anthropic/claude-sonnet-4-6, fallback: openai/gpt-5) and cron (primary: openai/gpt-4o-mini, fallback: anthropic/claude-haiku-4-5). Active Memory plugin enabled with openai/gpt-4o-mini. All models use api_key auth profiles.

Logs, screenshots, and evidence

19:06:15 [gateway] ready (8 plugins: acpx, active-memory, browser, device-pair, memory-core, phone-control, talk-voice, whatsapp; 84.4s)
19:16:25 [whatsapp] [default] starting provider (+4795001666)
19:16:36 [whatsapp] Listening for personal WhatsApp inbound messages.

18:58:14 [web-inbound] inbound message from=+4746931740 to=+4795001666 body="Hei"
18:58:14 [web-auto-reply] sending ack reaction emoji=👍
18:58:23 [whatsapp/outbound] Sent reaction "👍" -> message ACEDC3F68EF997DA7ECE2D0458DF9504

18:59:07 [web-reconnect] connection closed status=408 reconnectAttempts=1
18:59:07 [channels/whatsapp] WhatsApp Web connection closed (status 408). Retry 1/12 in 2.39s…
18:59:51 [channels/whatsapp] [default] channel exited: statusCode=428 "Connection Terminated"
18:59:51 [channels/whatsapp] [default] auto-restart attempt 1/10 in 5s

19:00:14 [diagnostic] liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu 
  interval=34s eventLoopDelayP99Ms=12088 eventLoopDelayMaxMs=12088 
  eventLoopUtilization=0.999 cpuCoreRatio=1.006 active=2 waiting=0 queued=0
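
To compare runs across versions (for example 8497 ms on 2026.5.2 against 12088 ms on 2026.5.3-1), the warning can be turned into structured data. This is a minimal sketch that assumes the key=value layout seen in the log above and expects the wrapped warning joined into a single line. parseLivenessWarning is a hypothetical helper, not an OpenClaw API.

```typescript
// Hypothetical helper: parses one "liveness warning" log line into fields.
// Assumes the space-separated key=value layout shown in the log excerpt.
interface LivenessWarning {
  reasons: string[];
  metrics: Record<string, number>;
}

function parseLivenessWarning(line: string): LivenessWarning | null {
  const marker = "liveness warning:";
  const start = line.indexOf(marker);
  if (start === -1) return null;

  const result: LivenessWarning = { reasons: [], metrics: {} };
  const body = line.slice(start + marker.length).trim();
  for (const token of body.split(/\s+/)) {
    const eq = token.indexOf("=");
    if (eq <= 0) continue; // skip anything that is not key=value
    const key = token.slice(0, eq);
    const value = token.slice(eq + 1);
    if (key === "reasons") {
      result.reasons = value.split(",").filter(Boolean);
    } else {
      // parseFloat tolerates a unit suffix, so "34s" is read as 34.
      const n = parseFloat(value);
      if (!isNaN(n)) result.metrics[key] = n;
    }
  }
  return result;
}
```

Calling it on the joined warning line yields the three reasons plus numeric fields such as eventLoopDelayMaxMs and eventLoopUtilization; lines without the marker return null.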

Additional context:

OS: Windows 11 (10.0.26200 x64)
Node: v22.22.2
WhatsApp plugin: @openclaw/whatsapp 2026.5.3
Same behavior observed on first upgrade attempt to 2026.5.2 (eventLoopDelayMaxMs=8497ms) — rolled back. Now reproduced again on 2026.5.3-1 with higher delay (12088ms).
Rollback to 2026.4.23 (a979721) immediately restores stable operation with identical credentials, config, and WhatsApp session.
Note: openclaw plugins install @openclaw/whatsapp fails on Windows with npm error Cannot read properties of undefined (reading 'spec') — likely a separate bug related to npm v10.9.7 bundled in OpenClaw conflicting with the plugin's peerDependencies structure.

Impact and severity

Affected users: Any OpenClaw user on Windows running WhatsApp as a primary channel who upgrades to 2026.5.3-1
Severity: Blocks workflow completely — WhatsApp becomes non-functional immediately after the first inbound message
Frequency: 100% reproducible — occurs on every gateway start upon receiving the first inbound WhatsApp message
Consequence: All inbound WhatsApp messages go unanswered, all cron agent model calls stall indefinitely, and the only recovery is a full rollback to 2026.4.23. This is the third consecutive release (4.29, 5.2, 5.3-1) in which WhatsApp on Windows has broken on upgrade, suggesting a systemic issue with the Windows event loop and the WhatsApp plugin's Baileys runtime on this platform.

Additional information

Last known good: 2026.4.23 (a979721) — stable, WhatsApp fully functional on same machine/credentials
First known bad: 2026.4.29 (first upgrade attempt) — WhatsApp 408 timeout loop; 2026.5.2 — event loop blocked (8497ms); 2026.5.3-1 — event loop blocked (12088ms, worse than 5.2)
Plugin install bug: openclaw plugins install @openclaw/whatsapp fails on Windows with npm error Cannot read properties of undefined (reading 'spec') in @npmcli/arborist reify.js:1404. Workaround used: manual npm install with --legacy-peer-deps in ~/.openclaw/npm/, followed by manual edits to plugins/installs.json to correct source paths from .ts to dist/*.js. This suggests the plugin installer on Windows is broken independently of the event loop bug.
Event loop pattern: The block occurs during agent model call prep (session-resource-loader: 10616ms) immediately after the first inbound message triggers an embedded agent run, suggesting the Baileys WebSocket reconnect and the agent startup compete for the event loop on Windows in a way that does not occur on 2026.4.23.
Gateway startup time: 2026.5.3-1 starts in ~8-10 seconds vs ~84 seconds for 2026.4.23 (which installs bundled runtime deps on first start). The faster startup in 5.x may be related to the different plugin loading path that exposes the event loop issue.
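
To illustrate the event loop pattern described above: if a long synchronous preparation step runs in one piece, timers and socket callbacks cannot fire until it finishes. The sketch below shows the general technique of time-slicing work so the loop gets a turn between slices. It is not OpenClaw code and not a proposed fix, since the root cause is unconfirmed; runSlice and drainCooperatively are hypothetical names.

```typescript
// Hypothetical sketch of time-sliced work. Not OpenClaw code.
type Task = () => void;

// Runs tasks until the queue is empty or the time budget is spent.
// Always runs at least one task so the queue makes progress.
// Returns the number of tasks executed in this slice.
function runSlice(queue: Task[], budgetMs: number, now: () => number = Date.now): number {
  const deadline = now() + budgetMs;
  let ran = 0;
  while (queue.length > 0) {
    const task = queue.shift() as Task;
    task();
    ran++;
    if (now() >= deadline) break;
  }
  return ran;
}

// Drains the queue across many event loop turns so that timers and socket
// callbacks (for example WebSocket keepalives) can run between slices.
function drainCooperatively(queue: Task[], budgetMs: number, onDone: () => void): void {
  runSlice(queue, budgetMs);
  if (queue.length === 0) {
    onDone();
  } else {
    setTimeout(() => drainCooperatively(queue, budgetMs, onDone), 0);
  }
}
```

The injectable now parameter exists only to make the slice logic testable with a fake clock; a single task that itself takes seconds would still block the loop, so slicing only helps when the work can be divided.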

Labels: bug, regression
