Summary
After upgrading OpenClaw from 2026.4.24 to newer releases (2026.4.29 and later 2026.5.2), the gateway exhibits severe performance regressions affecting both control-plane responsiveness and overall system stability.
Symptoms include:
- CPU pinned near 100% (Node process)
- control-plane RPC calls becoming extremely slow or timing out
- UI/WebSocket polling effectively unusable
- intermittent fetch timeouts to external APIs (despite system-level connectivity being fine)
- significant improvement after reverting both binary version and config state
This strongly suggests a regression introduced after 2026.4.24, potentially involving interaction with config/state written by newer versions.
Environment
- Host: Linux x64
- OpenClaw versions tested:
  - 2026.4.24 (stable)
  - 2026.4.26
  - 2026.4.29
  - 2026.5.2
- Node versions:
  - v22.22.2
  - v24.15.0
- Gateway mode:
  - systemd user service
  - LAN binding
- Main model route:
- Features:
  - Telegram enabled
  - Bonjour disabled (in most tests)
  - memory search disabled (in later tests)
Symptoms on newer versions (2026.4.29+)
- gateway process consumes ~95–100% CPU
- control-plane RPC latency becomes extreme:
  - node.list: up to ~85–109s
  - agents.list: ~67s
  - sessions.list: ~58s
  - cron.list / cron.status: up to ~144s
  - models.list: several seconds, sometimes ~18–29s
- /health and root HTTP endpoints may time out while WS/RPC still partially function
- Telegram provider operations timing out:
  - getMe
  - setMyCommands
  - deleteWebhook
- internal fetch timeouts, while system-level curl works
Representative logs:
liveness warning: eventLoopUtilization=1
eventLoopDelayMaxMs in tens of seconds
fetch timeout after ... operation=fetchWithTimeout
CommandLaneTaskTimeoutError
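
For anyone trying to confirm these liveness values independently of the gateway's own logging, here is a minimal sketch using Node's built-in perf_hooks module. The names sampleEventLoop and isSaturated, and the thresholds, are hypothetical and not taken from OpenClaw. It only shows one way to measure event-loop utilization and maximum loop delay over a time window.

```typescript
import { monitorEventLoopDelay, performance } from "node:perf_hooks";

interface LoopSample {
  utilization: number; // 0..1, share of the window the loop spent busy
  delayMaxMs: number; // worst observed loop delay in milliseconds
}

// Sample the event loop for `windowMs` and resolve with the measurements.
function sampleEventLoop(windowMs: number): Promise<LoopSample> {
  const histogram = monitorEventLoopDelay({ resolution: 20 });
  histogram.enable();
  const baseline = performance.eventLoopUtilization();
  return new Promise((resolve) => {
    setTimeout(() => {
      histogram.disable();
      // Passing the baseline yields the delta for this window only.
      const delta = performance.eventLoopUtilization(baseline);
      resolve({
        utilization: delta.utilization,
        delayMaxMs: histogram.max / 1e6, // histogram values are in nanoseconds
      });
    }, windowMs);
  });
}

// Hypothetical thresholds (assumptions, not OpenClaw's actual liveness rules).
function isSaturated(
  sample: LoopSample,
  maxUtilization = 0.9,
  maxDelayMs = 1000,
): boolean {
  return sample.utilization >= maxUtilization || sample.delayMaxMs >= maxDelayMs;
}
```

Usage would look like `sampleEventLoop(5000).then((s) => console.log(s, isSaturated(s)))`. Note that on a fully blocked loop the sampling timer itself fires late, so the window can run much longer than requested.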
Key Observations
1. System networking is healthy
- DNS resolution: OK
- TLS handshake: OK
- Direct curl to Telegram API: OK
Indicates the issue is internal (event loop / runtime), not network-level.
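
One plausible mechanism behind this, sketched below under assumptions: in-process timers and socket callbacks all wait on the same event loop, so a CPU-bound handler can delay both a fetch's response handling and its timeout bookkeeping, while curl runs in a separate process and is unaffected. The helpers blockFor and measureTimerLateness are hypothetical and only simulate the effect. They do not reproduce OpenClaw's fetchWithTimeout.

```typescript
// Busy-wait to simulate a CPU-bound handler that never yields to the loop.
function blockFor(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin
  }
}

// Arm a timer, then block the loop. Resolves with how many milliseconds
// past its deadline the timer callback actually ran.
function measureTimerLateness(timerMs: number, blockMs: number): Promise<number> {
  return new Promise((resolve) => {
    const armedAt = Date.now();
    setTimeout(() => resolve(Date.now() - armedAt - timerMs), timerMs);
    // The executor runs synchronously, so this blocks right after arming.
    blockFor(blockMs);
  });
}
```

Calling `measureTimerLateness(10, 500).then((late) => console.log(late))` should report a lateness close to the blocking time rather than near zero. That is the same shape as the tens-of-seconds loop delay seen in the logs.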
2. Config/state from newer versions degrades older versions
- After running newer releases, even older binaries perform worse
- Warning observed when loading config:
  - "Config was last written by a newer OpenClaw..."
Suggests possible config/state migration side effects.
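
The warning implies some comparison between the running binary's version and the version recorded in the config. A minimal sketch of such a check for the dotted year.month.patch scheme used here follows. It assumes nothing about how OpenClaw actually stores or compares versions, and compareVersions and wasWrittenByNewer are hypothetical names.

```typescript
// Compare dotted numeric versions such as "2026.4.24" and "2026.5.2".
// Returns -1 if a < b, 1 if a > b, 0 if equal.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  const len = Math.max(pa.length, pb.length);
  for (let i = 0; i < len; i++) {
    // Missing components are treated as 0.
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// True when the config was written by a release newer than the running binary.
function wasWrittenByNewer(configVersion: string, runningVersion: string): boolean {
  return compareVersions(configVersion, runningVersion) > 0;
}
```

Components are compared numerically because plain string comparison would order "2026.4.9" after "2026.4.24".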
3. Startup shows heavy control-plane work
Notable time spent in:
- plugins.bootstrap
- http.bound
- post-attach
Implies overhead beyond normal channel initialization.
4. Polling pattern is normal, but handler cost is abnormal
UI calls appear standard (node.list, cron.status, etc.), but execution cost becomes extremely high in newer versions.
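
To compare handler cost across versions, each polled call could be wrapped in a timer and the slowest ones ranked. The sketch below is generic: timeCall accepts any async function, and how a control-plane RPC is actually invoked is left to the caller, since the client API is not shown in this report. All names are hypothetical.

```typescript
import { performance } from "node:perf_hooks";

interface CallTiming {
  name: string;
  elapsedMs: number;
  ok: boolean; // false if the call rejected (for example on timeout)
}

// Time one async call. Failures are recorded rather than rethrown.
async function timeCall(
  name: string,
  call: () => Promise<unknown>,
): Promise<CallTiming> {
  const startedAt = performance.now();
  try {
    await call();
    return { name, elapsedMs: performance.now() - startedAt, ok: true };
  } catch {
    return { name, elapsedMs: performance.now() - startedAt, ok: false };
  }
}

// Return the `limit` slowest calls, slowest first, without mutating the input.
function slowest(timings: CallTiming[], limit = 3): CallTiming[] {
  return [...timings].sort((a, b) => b.elapsedMs - a.elapsedMs).slice(0, limit);
}
```

Running the same set of calls against 2026.4.24 and a newer release and diffing the slowest() output would show which handlers regressed most.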
Background / Cron Observations
- cron.status / cron.list sometimes normal, sometimes extremely slow
- memory/dreaming job historically takes tens of seconds
- heartbeat activity present in logs
However:
- disabling heartbeat did not fully fix issue
- disabling memory dreaming did not fully fix issue
Conclusion: contributing factors, but not root cause.
What was tried
- disable bonjour
- disable telegram
- disable memory-wiki
- disable kilocode
- disable acpx
- disable browser
- disable memory search
- Node 22 vs Node 24
- clean reinstall
- config cleanup
- archive session transcripts
- trim MEMORY.md
- disable heartbeat
- disable memory-core dreaming
Result: partial improvements, but regression persists on newer versions.
What resolved the issue
- downgrade to 2026.4.24
- restore older-compatible config backup
- reapply minimal config:
  - Telegram enabled
  - Bonjour disabled
  - memory search disabled
  - memory-core dreaming disabled
Result:
- CPU dropped significantly
- control-plane stabilized
- polling overhead reduced
Suspected regression areas
Likely candidates:
- control-plane RPC handlers
- node.list / presence pipeline
- cron collectors (cron.status, cron.list)
- model/provider listing path
- internal fetch (undici) under event-loop pressure
- config/state migration logic
Request
Please investigate regressions introduced after 2026.4.24, especially in:
- control-plane polling handlers
- cron/status collectors
- node presence (node.list)
- model/provider listing
- config migration / effective config behavior
- fetch timeouts under high event-loop utilization
Additional sanitized logs can be provided if needed.