[Bug]: Gateway becomes CPU-saturated and agents stop progressing mid-work #74404

@najef1979-code

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

The issues started after upgrading to 2026.4.23 and continued in later versions, most recently 2026.4.25.
Each version got worse, with higher CPU usage, slower startup, and slower agent responses.

OpenClaw Gateway becomes CPU-saturated and agents stop progressing mid-work until I send another message. During the bad state, simple HTTP health endpoints can respond, but Gateway WS/RPC commands time out or close during connect.

Environment

  • OpenClaw: 2026.4.26 (be8c246)
  • OS: Ubuntu 24.04.4 LTS
  • Node: v24.14.1
  • Install: npm global via nvm
  • Gateway bind: loopback, port 18789
  • systemd: not installed

Symptoms

  • Agents stop mid-work and only continue after I send another message.
  • openclaw-gateway reaches very high CPU:
    • observed 322% CPU
    • later observed 504% CPU
  • Gateway RPC becomes unreliable:
    • openclaw gateway stability --json --limit 100 failed with gateway timeout after 10000ms
    • openclaw cron show ... failed with gateway closed (1000 normal closure): no close reason
    • openclaw logs --plain ... failed with gateway closed (1000)
  • HTTP liveness still worked:
    • curl http://127.0.0.1:18789/healthz returned {"ok":true,"status":"live"}
    • curl http://[::1]:18789/healthz returned {"ok":true,"status":"live"}
    • /readyz returned {"ready":true,"failing":[]} at least once
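
To capture the "HTTP is fine while WS/RPC is not" state when it happens, the liveness checks above can be scripted. This is a minimal sketch, not part of OpenClaw: the `probe` helper and its `opener` parameter are hypothetical names, and only the address, port, and the `/healthz` and `/readyz` paths come from this report.

```python
import json
import urllib.request
from typing import Callable

# Loopback bind and port taken from the Environment section of this report.
BASE = "http://127.0.0.1:18789"


def probe(path: str, timeout: float = 2.0,
          opener: Callable = urllib.request.urlopen) -> dict:
    """Fetch one health endpoint; return its decoded JSON or an error record.

    `opener` is injectable (hypothetical parameter) so the function can be
    exercised without a running gateway.
    """
    try:
        with opener(BASE + path, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
        return {"path": path, "ok": True, "body": body}
    except Exception as exc:  # timeouts, refused connections, invalid JSON
        return {"path": path, "ok": False, "error": repr(exc)}
```

Calling `probe("/healthz")` and `probe("/readyz")` at the moment an RPC command times out would record both sides of the mismatch in one place.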

Steps to reproduce

Run OpenClaw 2026.4.23, 2026.4.24, 2026.4.25, or 2026.4.26. Other users have reported the same problem online.

Expected behavior

In 2026.4.22, agents still respond quickly, and the Gateway uses at most 10% CPU at peak request load.

Actual behavior

CPU climbs to 100% within a few minutes of the gateway starting. Agents keep responding, but more and more slowly, until they stop entirely mid-session.

OpenClaw version

2026.4.23/24/25/26

Operating system

Ubuntu 24.04.4 LTS

Install method

npm global via nvm

Model

Minimax/Minimax m2.7

Provider / routing chain

Minimax/Minimax m2.7

Additional provider/model setup details

No response

Logs, screenshots, and evidence

## Evidence

Process list while CPU was high:

    openclaw-gateway 504% CPU

Per-thread CPU showed many hot `libuv-worker` threads:

    libuv-worker 78.4%
    libuv-worker 63.9%
    libuv-worker 46.5%
    libuv-worker 45.1%
    ...
    openclaw-gateway main thread ~21%

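
To compare worker-thread load against the main thread, the per-thread readings can be summed by thread name. The sketch below assumes each line has the form `<thread name> <percent>%`, as in the listing above; the `cpu_by_thread_name` helper is a hypothetical name, and the sample holds only the four worker readings quoted in this report.

```python
from collections import defaultdict

# The four worker readings quoted in this report (the elided "..." rows are unknown).
SAMPLE = """\
libuv-worker 78.4%
libuv-worker 63.9%
libuv-worker 46.5%
libuv-worker 45.1%
"""


def cpu_by_thread_name(text: str) -> dict:
    """Sum CPU percentages per thread name from '<name> <pct>%' lines."""
    totals = defaultdict(float)
    for line in text.splitlines():
        parts = line.rsplit(None, 1)  # split off the last whitespace-separated field
        if len(parts) != 2 or not parts[1].endswith("%"):
            continue
        try:
            pct = float(parts[1].rstrip("%"))
        except ValueError:
            continue  # skip approximate values such as "~21%"
        totals[parts[0]] += pct
    return {name: round(total, 1) for name, total in totals.items()}


print(cpu_by_thread_name(SAMPLE))
```

The four visible workers alone account for 233.9% CPU, roughly eleven times the main thread's ~21%.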
`strace -f -p <gateway-pid> -c` showed activity including:

    epoll_pwait
    statx
    futex
    write/read

Path-level strace showed repeated package/plugin/cron-state paths:

    /home/najef/.nvm/versions/node/v24.14.1/lib/node_modules/openclaw/dist/package.json = ENOENT
    /home/najef/.nvm/versions/node/v24.14.1/lib/node_modules/openclaw/package.json
    /home/najef/.nvm/versions/node/v24.14.1/lib/node_modules/openclaw/dist/extensions
    /home/najef/.openclaw/cron/jobs-state.json

Logs repeatedly showed a stuck heartbeat session:

    stuck session: sessionId=arnold sessionKey=agent:arnold:main:heartbeat state=processing age=6733s queueDepth=0
    stuck session: sessionId=arnold sessionKey=agent:arnold:main:heartbeat state=processing age=6763s queueDepth=0
    ...
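
To track how long each session has been stuck, the diagnostic lines can be parsed mechanically. This is a rough sketch that assumes the log format is exactly the one shown in the two lines above; the regular expression and the `stuck_ages` helper are illustrative names, not part of OpenClaw.

```python
import re

# Pattern inferred from the two log lines quoted in this report (an assumption).
STUCK_RE = re.compile(
    r"stuck session: sessionId=(?P<session_id>\S+) "
    r"sessionKey=(?P<session_key>\S+) state=(?P<state>\S+) "
    r"age=(?P<age_s>\d+)s queueDepth=(?P<queue_depth>\d+)"
)

SAMPLE = """\
stuck session: sessionId=arnold sessionKey=agent:arnold:main:heartbeat state=processing age=6733s queueDepth=0
stuck session: sessionId=arnold sessionKey=agent:arnold:main:heartbeat state=processing age=6763s queueDepth=0
"""


def stuck_ages(log_text: str) -> dict:
    """Map each sessionKey to the list of reported ages (seconds), in log order."""
    ages = {}
    for match in STUCK_RE.finditer(log_text):
        ages.setdefault(match["session_key"], []).append(int(match["age_s"]))
    return ages


for key, values in stuck_ages(SAMPLE).items():
    print(key, "max age:", max(values), "s; growth:", values[-1] - values[0], "s")
```

For the two quoted lines, the age grows by 30 seconds between reports while the state stays `processing`, which is consistent with a session that never completes.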

A cron task also timed out while this was happening:

    {
      "runtime": "cron",
      "label": "Orion Heartbeat",
      "agentId": "neon",
      "childSessionKey": "agent:neon:telegram:direct:2101884310",
      "status": "timed_out",
      "error": "cron: job execution timed out"
    }

## Expected behavior

Gateway should recover or fail stuck heartbeat/session work, not remain in `processing` for hours, and WS/RPC should not become unusable while HTTP health endpoints still respond.
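
The recovery behavior I have in mind could look roughly like the sketch below. It is purely illustrative: the `Session` type, the `expire_stuck` function, and the 600-second limit are all hypothetical and do not reflect how the Gateway is actually implemented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Session:
    key: str
    state: str    # e.g. "processing", "idle", "failed"
    age_s: float  # seconds spent in the current state


def expire_stuck(sessions: List[Session], max_processing_s: float = 600.0) -> List[str]:
    """Mark sessions stuck in 'processing' beyond the limit as 'failed'.

    Returns the keys of the sessions that were expired. The limit is an
    arbitrary placeholder, not an OpenClaw setting.
    """
    expired = []
    for session in sessions:
        if session.state == "processing" and session.age_s > max_processing_s:
            session.state = "failed"
            expired.append(session.key)
    return expired
```

With a policy like this, the heartbeat session reported at `age=6733s` would have been failed long before reaching that age instead of staying in `processing`.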

## Notes
I have not confirmed the root cause. The strongest evidence is high CPU across many `libuv-worker` threads plus repeated stuck-session diagnostics and WS/RPC failures.

Impact and severity

Newer versions of OpenClaw are unusable for me: they hang, and agents become unresponsive.

Additional information

No response
