Bug type
Regression (worked before, now fails)
Beta release blocker
Yes
Summary
The sessions.list WebSocket operation is taking 2-5 seconds to complete, causing the event loop to become saturated. This appears to be a performance regression.
Version: 2026.5.3-1 (2eae30e)
Environment:
- Linux 6.17.0-23-generic, Node v24.14.1 via NVM
- Gateway running via systemd user service
- 17 agents configured, 104 sessions in neon session store
- System: 32GB RAM, NVMe disk
Steps to reproduce
- Set up environment:
  - Install OpenClaw gateway (any recent version)
  - Configure 17+ agents with active session stores
  - Have 100+ sessions in session.json (neon agent has 104)
- Connect control UI:
  - Connect the OpenClaw control UI dashboard to the gateway
  - Multiple clients polling simultaneously increases load
- Observe symptoms:
  - Run `openclaw logs --follow` to see sessions.list taking 2-5 seconds
  - Run `openclaw health` to see "Gateway event loop: degraded"
  - Run `top` to see gateway CPU at 500-800%
  - Run `openclaw gateway stability --json` to check event loop delays
Expected behavior
- sessions.list completes in <100ms
- Event loop utilization stays around 0.1
Actual behavior
- sessions.list takes 1800-4300ms (expected: <100ms)
- Event loop utilization jumps from ~0.1 to 0.9-1.0
OpenClaw version
2026.5.3-1 (2eae30e)
Operating system
Linux 6.17.0-23-generic (ubuntu)
Install method
Node v24.14.1 via NVM
Model
Minimax/Minimax.m2.7
Provider / routing chain
Minimax/Minimax.m2.7
Additional provider/model setup details
No response
Logs, screenshots, and evidence
**Log Evidence:**
### sessions.list timing (showing slow responses)
2026-05-04T12:50:53.609Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 3244ms conn=99b02c46…bf85 id=3c271020…172f
2026-05-04T12:50:57.864Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 4234ms conn=99b02c46…bf85 id=1c9b2f72…5bb2
2026-05-04T12:51:12.478Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 2308ms conn=99b02c46…bf85 id=37a4e542…f959
2026-05-04T12:51:24.276Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 1852ms conn=99b02c46…bf85 id=37cd3730…1cf7
2026-05-04T12:51:32.188Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 2946ms conn=99b02c46…bf85 id=57c061d0…fd2a
2026-05-04T12:51:48.132Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 2932ms conn=99b02c46…bf85 id=c8c0468a…e9ec
2026-05-04T12:51:50.788Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 2440ms conn=99b02c46…bf85 id=78b75b25…c0b6
2026-05-04T12:55:15.993Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 2797ms conn=99b02c46…bf85 id=fbd3e198…08c0
2026-05-04T12:55:20.320Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 4292ms conn=99b02c46…bf85 id=7a7e4994…bf5a
2026-05-04T12:55:26.355Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 2261ms conn=99b02c46…bf85 id=24e823e…87a5
2026-05-04T12:55:31.584Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 2566ms conn=99b02c46…bf85 id=c0b6ca40…fc15
2026-05-04T12:55:36.923Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 3791ms conn=99b02c46…bf85 id=bbb582bd…d34c
2026-05-04T12:55:44.919Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 3295ms conn=99b02c46…bf85 id=1be7033d…b0e7
2026-05-04T12:55:49.070Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 2823ms conn=99b02c46…bf85 id=505cbc23…a684
2026-05-04T12:55:52.556Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 1830ms conn=99b02c46…bf85 id=b1c53436…9963
2026-05-04T12:55:57.160Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 2638ms conn=99b02c46…bf85 id=fa765188…ce9b
2026-05-04T12:56:01.930Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 3291ms conn=99b02c46…bf85 id=46ddcc14…ecdf
2026-05-04T12:56:04.907Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 1855ms conn=99b02c46…bf85 id=7f91f31b…e8a6
2026-05-04T12:56:09.190Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 3053ms conn=99b02c46…bf85 id=e7966a96…4fa0
2026-05-04T12:56:13.660Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 3410ms conn=99b02c46…bf85 id=055694c3…a1da
2026-05-04T12:56:18.158Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 3321ms conn=99b02c46…bf85 id=4a3fdffe…8b31
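For anyone comparing versions, the timings above can be summarized from saved log text with a small script. This is a minimal sketch, not part of OpenClaw: the function name `summarizeSessionsListTimings` is hypothetical, and the regex assumes response lines contain `sessions.list <N>ms` exactly as in the excerpts above.

```typescript
// Illustrative helper, not part of OpenClaw. Assumes each response line
// contains "sessions.list <N>ms" in the format shown in the log excerpts.
interface TimingSummary {
  count: number;
  minMs: number;
  maxMs: number;
  meanMs: number;
}

function summarizeSessionsListTimings(logText: string): TimingSummary | null {
  const pattern = /sessions\.list (\d+)ms/g;
  const durations: number[] = [];
  let match: RegExpExecArray | null;
  // Collect every duration that follows a "sessions.list" token.
  while ((match = pattern.exec(logText)) !== null) {
    durations.push(Number(match[1]));
  }
  if (durations.length === 0) return null;
  const total = durations.reduce((sum, d) => sum + d, 0);
  return {
    count: durations.length,
    minMs: Math.min(...durations),
    maxMs: Math.max(...durations),
    meanMs: total / durations.length,
  };
}

// Two lines copied from the evidence above.
const sample = [
  '2026-05-04T12:50:53.609Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 3244ms conn=99b02c46…bf85 id=3c271020…172f',
  '2026-05-04T12:51:24.276Z info gateway/ws {"subsystem":"gateway/ws"} ⇄ res ✓ sessions.list 1852ms conn=99b02c46…bf85 id=37cd3730…1cf7',
].join("\n");

console.log(summarizeSessionsListTimings(sample));
```

Running the same summary against logs from an older build would help confirm whether this is a regression and roughly how large it is.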
### Event loop degradation (correlating with sessions.list slowness)
2026-05-04T12:47:44.984Z warn diagnostic {"subsystem":"diagnostic"} liveness warning: reasons=event_loop_delay,cpu interval=31s eventLoopDelayP99Ms=1743.8 eventLoopDelayMaxMs=3116.4 eventLoopUtilization=0.75 cpuCoreRatio=4.151 active=1 waiting=0 queued=0
2026-05-04T12:50:16.158Z warn diagnostic {"subsystem":"diagnostic"} liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu interval=31s eventLoopDelayP99Ms=5272.2 eventLoopDelayMaxMs=5289 eventLoopUtilization=0.962 cpuCoreRatio=3.847 active=1 waiting=0 queued=0
2026-05-04T12:50:46.752Z warn diagnostic {"subsystem":"diagnostic"} liveness warning: reasons=event_loop_delay,cpu interval=31s eventLoopDelayP99Ms=3294.6 eventLoopDelayMaxMs=4680.8 eventLoopUtilization=0.923 cpuCoreRatio=4.475 active=1 waiting=0 queued=0
2026-05-04T12:55:20.323Z warn diagnostic {"subsystem":"diagnostic"} liveness warning: reasons=event_loop_delay,cpu interval=32s eventLoopDelayP99Ms=1486.9 eventLoopDelayMaxMs=4299.2 eventLoopUtilization=0.68 cpuCoreRatio=7.171 active=1 waiting=0 queued=0
### Agent cleanup timeouts (consequence of event loop saturation)
2026-05-04T12:50:57.867Z warn agent/embedded {"subsystem":"agent/embedded"} agent cleanup timed out: runId=dd0fa787-bcb0-411d-94fa-32285ac646d2 sessionId=8d026f54-297c-44b3-a4e8-b64f55a473cf step=pi-trajectory-flush timeoutMs=10000
### System load (from top)
load average: 11.41, 11.77, 10.70
%Cpu(s): 98.9 us, 1.1 sy
PID USER COMMAND %CPU
346084 najef openclaw/dist/index.js 754.5
Impact and severity
Impact:
- Gateway event loop degraded (eventLoopUtilization reaching 0.962-1.0)
- High CPU load (load average consistently 10-11, CPU at 99%)
- Slow responsiveness across all WebSocket operations
- Agent cleanup timeouts (pi-trajectory-flush timing out after 10s)
- Gateway CPU usage spikes to 754% (multi-core utilization)
Additional information
Question:
Is there something that can be optimized in the sessions.list operation? Could session listing be cached or made incremental rather than loading all sessions on every poll?
Workaround: None found - the constant polling from control UI creates a backlog that can't be cleared while the operation itself is slow.
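To make the caching question concrete, here is a rough sketch of what is meant: a short-lived cache that also coalesces concurrent requests, so several polling clients share one load instead of each triggering a full scan. Everything here is hypothetical (`createCachedSessionList`, `SessionSummary`, the `load` callback, and the TTL value); I have not looked at how the gateway actually implements sessions.list, so treat it as an illustration of the idea only.

```typescript
// Hypothetical sketch: none of these names come from the OpenClaw codebase.
type SessionSummary = { id: string; agent: string; updatedAt: number };
type Loader = () => Promise<SessionSummary[]>;

function createCachedSessionList(
  load: Loader,
  ttlMs: number,
  now: () => number = Date.now,
): () => Promise<SessionSummary[]> {
  let cached: { value: SessionSummary[]; expiresAt: number } | undefined;
  let inFlight: Promise<SessionSummary[]> | undefined;

  return async function list(): Promise<SessionSummary[]> {
    // Serve from cache while the entry is still fresh.
    if (cached && now() < cached.expiresAt) return cached.value;
    // If a load is already running, share it instead of starting another.
    if (inFlight) return inFlight;
    inFlight = load().then(
      (value) => {
        cached = { value, expiresAt: now() + ttlMs };
        inFlight = undefined;
        return value;
      },
      (err) => {
        // Do not cache failures; let the next caller retry.
        inFlight = undefined;
        throw err;
      },
    );
    return inFlight;
  };
}

// Usage: wrap the expensive loader once, then call list() per poll.
let loads = 0;
const list = createCachedSessionList(async () => {
  loads += 1;
  return [{ id: "s1", agent: "neon", updatedAt: 0 }];
}, 1000);

Promise.all([list(), list(), list()]).then(() => {
  console.log(`loader calls for 3 concurrent polls: ${loads}`);
});
```

A TTL cache trades freshness for load, so an approach that invalidates when the session store changes may fit better; the sketch only shows that repeated polls need not each pay the full cost.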