Bug type
Behavior bug (incorrect output/state without crash)
Beta release blocker
No
Summary
During a 60s gateway CPU sample, CPU averaged 83.66% with 42 of 60 samples at or above 100%, while diagnostic heartbeats still showed active and queued work (active=1 queued=3).
Steps to reproduce
- Start OpenClaw 2026.5.20 in a gateway development session.
- Run an agent/tool workload that keeps the gateway process CPU-bound.
- Capture pidstat -h -u -r -d -p <gateway-pid> 1 60 and correlate it with gateway diagnostic heartbeats.
- Observe sustained CPU saturation while gateway heartbeats continue to report active and queued work.
Expected behavior
Under sustained CPU/event-loop pressure with queued work, lower-priority session mirror events should back off while terminal and run-scoped tool events still flow so user-visible tool cards can complete.
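The expected policy can be sketched roughly as follows. This is only an illustration of the intended behavior, not OpenClaw's actual implementation: the event class names, the `PressureSnapshot` shape, and the 1000 ms threshold are all assumptions made for the example.

```typescript
// Hypothetical event classes; the names are illustrative, not OpenClaw identifiers.
type EventClass = "terminal" | "run-tool" | "session-mirror";

// Hypothetical pressure snapshot built from a heartbeat plus a loop-delay measurement.
interface PressureSnapshot {
  queued: number; // queued work, as reported by the diagnostic heartbeat
  eventLoopDelayMs: number; // observed timer/event-loop delay in milliseconds
}

// Assumed threshold for the sketch; a real value would need tuning.
const LOOP_DELAY_THRESHOLD_MS = 1000;

// Pressure means: work is queued AND the event loop is measurably delayed.
function underPressure(s: PressureSnapshot): boolean {
  return s.queued > 0 && s.eventLoopDelayMs >= LOOP_DELAY_THRESHOLD_MS;
}

// Terminal and run-scoped tool events always flow so tool cards can complete;
// only session mirror events are held back while pressure lasts.
function shouldEmit(kind: EventClass, s: PressureSnapshot): boolean {
  if (kind === "terminal" || kind === "run-tool") {
    return true;
  }
  return !underPressure(s);
}
```

With the values seen in this report (queued=3, a timer delayed by 7175 ms), `shouldEmit("session-mirror", ...)` would return false while terminal and run-scoped tool events would still be emitted.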
Actual behavior
Diagnostic heartbeats kept reporting active and queued work (active=1 queued=3) throughout the CPU saturation, with no visible lower-priority event backoff in the correlated diagnostic window.
OpenClaw version
2026.5.20
Operating system
Linux (pidstat capture)
Install method
pnpm dev
Model
NOT_ENOUGH_INFO
Provider / routing chain
NOT_ENOUGH_INFO
Additional provider/model setup details
NOT_ENOUGH_INFO
Logs, screenshots, and evidence
pidstat summary:
rows=60
avg_cpu=83.66
avg_usr=79.42
avg_sys=4.25
cpu_ge_90_count=43
cpu_ge_100_count=42
max_cpu=190.0 at 05:07:47
avg_rd=0.00
avg_wr=90.16
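For anyone reproducing the summary, the CPU fields above can be recomputed with a small helper along these lines. This is a sketch, not the script used for this report: the `summarizePidstat` name is made up, and the assumption that %CPU is the eighth whitespace-separated field is inferred from the raw rows shown in this issue.

```typescript
interface PidstatSummary {
  rows: number;
  avgCpu: number;
  maxCpu: number;
  maxCpuAt: string;
  cpuGe90: number;
  cpuGe100: number;
}

// Assumed column index of %CPU in `pidstat -h -u -r -d` rows (0-based),
// inferred from the raw rows in this report.
const CPU_FIELD = 7;

function summarizePidstat(lines: string[]): PidstatSummary {
  let rows = 0;
  let sum = 0;
  let maxCpu = 0;
  let maxCpuAt = "";
  let cpuGe90 = 0;
  let cpuGe100 = 0;
  for (const line of lines) {
    const fields = line.trim().split(/\s+/);
    // Skip blank lines, the banner, and "#" header lines.
    if (fields.length <= CPU_FIELD || (fields[0] ?? "").startsWith("#")) continue;
    const cpu = Number(fields[CPU_FIELD]);
    if (!Number.isFinite(cpu)) continue;
    rows += 1;
    sum += cpu;
    if (cpu >= 90) cpuGe90 += 1;
    if (cpu >= 100) cpuGe100 += 1;
    if (cpu > maxCpu) {
      maxCpu = cpu;
      maxCpuAt = fields[0] ?? "";
    }
  }
  return { rows, avgCpu: rows > 0 ? sum / rows : 0, maxCpu, maxCpuAt, cpuGe90, cpuGe100 };
}
```

Values above 100 are expected for a multi-threaded process, since pidstat reports per-process CPU as a percentage of one core.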
Selected raw pidstat rows:
05:07:11 1000 <gateway-pid> 100.00 0.00 0.00 0.00 100.00 2 0.00 0.00 21352908 1381816 4.20 0.00 0.00 0.00 0 openclaw
05:07:40 1000 <gateway-pid> 100.00 0.00 0.00 0.00 100.00 3 0.00 0.00 21359092 1387828 4.22 0.00 0.00 0.00 0 openclaw
05:07:47 1000 <gateway-pid> 177.00 13.00 0.00 0.00 190.00 3 7079.00 0.00 21237004 1265796 3.85 0.00 8.00 0.00 0 openclaw
05:08:10 1000 <gateway-pid> 101.00 0.00 0.00 0.00 101.00 15 10.00 0.00 21237004 1265884 3.85 0.00 0.00 0.00 0 openclaw
Gateway log correlation:
2026-05-21T05:04:17.715+00:00 diagnostic heartbeat: webhooks=0/0/0 active=2 waiting=0 queued=5
2026-05-21T05:07:27.790+00:00 diagnostic heartbeat: webhooks=0/0/0 active=1 waiting=0 queued=3
2026-05-21T05:07:47.187+00:00 agent/embedded embedded run tool start: runId=[redacted run id] tool=exec toolCallId=[redacted tool call id]
2026-05-21T05:08:04.306+00:00 fetch-timeout timeoutMs=10000 elapsedMs=17175 timerDelayMs=7175 eventLoopDelayHint="timer delayed 7175ms, likely event-loop starvation"
2026-05-21T05:08:04.316+00:00 diagnostic heartbeat: webhooks=0/0/0 active=1 waiting=0 queued=3
2026-05-21T05:08:41.295+00:00 diagnostic heartbeat: webhooks=0/0/0 active=1 waiting=0 queued=3
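To correlate these lines with the pidstat rows, a parser like the one below may help. It is a sketch that assumes the heartbeat line format is exactly as shown above. The `parseHeartbeat` and `timerDelayMs` helpers are hypothetical names, not gateway functions. The delay calculation simply restates the arithmetic in the fetch-timeout line: elapsedMs minus timeoutMs, or 17175 - 10000 = 7175.

```typescript
interface Heartbeat {
  at: Date;
  active: number;
  waiting: number;
  queued: number;
}

// Matches lines such as:
// 2026-05-21T05:07:27.790+00:00 diagnostic heartbeat: webhooks=0/0/0 active=1 waiting=0 queued=3
const HEARTBEAT_RE = /^(\S+) diagnostic heartbeat: .*?active=(\d+) waiting=(\d+) queued=(\d+)/;

// Returns null for any line that is not a diagnostic heartbeat.
function parseHeartbeat(line: string): Heartbeat | null {
  const m = HEARTBEAT_RE.exec(line);
  if (!m) return null;
  return {
    at: new Date(m[1] ?? ""),
    active: Number(m[2] ?? "0"),
    waiting: Number(m[3] ?? "0"),
    queued: Number(m[4] ?? "0"),
  };
}

// How late a timer fired relative to its configured timeout, clamped at zero.
function timerDelayMs(timeoutMs: number, elapsedMs: number): number {
  return Math.max(0, elapsedMs - timeoutMs);
}
```

A heartbeat whose timestamp falls inside the pidstat window and still shows queued > 0, combined with a large timer delay, is the pattern this report describes.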
Impact and severity
Affected: gateway sessions under CPU-heavy agent/tool workloads.
Severity: Medium-high; queued work can remain behind CPU-heavy active work while low-priority event streams continue.
Frequency: Observed in the captured 60s CPU sample.
Consequence: Chat latency increases under load, and timer-based checks can drift during event-loop pressure.
Additional information
The fix should preserve terminal tool events so UI tool cards do not remain stale while backing off lower-priority session mirror traffic during diagnostic queue pressure.