Bug type
Regression (worked before, now fails)
Summary
On 2026.3.12 and 2026.3.13, `openclaw gateway status` reports `RPC probe: ok` for `ws://127.0.0.1:18789`, but `openclaw devices list` and `openclaw nodes status` intermittently time out against the same endpoint.
Steps to reproduce
- Start the gateway normally (`openclaw gateway start` or the systemd user service).
- Run `openclaw gateway status` and confirm `Runtime: running`, `Listening: 127.0.0.1:18789`, and `RPC probe: ok`.
- Run `openclaw devices list`.
- Run `openclaw nodes status`.
- Observe intermittent timeout/failure on the same local endpoint (`ws://127.0.0.1:18789`).
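The intermittence makes single invocations unreliable evidence. A minimal repro-loop sketch (hypothetical helper, not part of openclaw; it only shells out to the commands listed above and skips cleanly when the CLI is not installed):

```python
# Repro-loop sketch: run the failing command repeatedly and count failures,
# so the intermittent timeout shows up as a rate rather than a one-off.
import shutil
import subprocess


def repro(cmd=("openclaw", "devices", "list"), runs=20, timeout=15):
    """Return the number of failed runs, or None if the binary is absent."""
    if shutil.which(cmd[0]) is None:
        return None  # openclaw not on PATH in this environment
    failures = 0
    for _ in range(runs):
        try:
            result = subprocess.run(cmd, capture_output=True, timeout=timeout)
            if result.returncode != 0:
                failures += 1
        except subprocess.TimeoutExpired:
            # Matches the reported symptom: the command hangs past the timeout.
            failures += 1
    return failures
```

The 20-run count and 15-second timeout are arbitrary choices for illustration; any nonzero failure count across a loop like this demonstrates the intermittent behavior described above.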
Expected behavior
When `openclaw gateway status` shows `RPC probe: ok` on `ws://127.0.0.1:18789`, commands using the same gateway connection (`devices list`, `nodes status`) should complete reliably without timing out.
Actual behavior
`openclaw gateway status` succeeds and reports probe OK, but `openclaw devices list` and/or `openclaw nodes status` intermittently fail with timeout/connect failures.
Observed error context includes:
- `Local loopback ws://127.0.0.1:18789`
- `Connect: failed - timeout`
- gateway log lines indicating handshake timeout / close-before-connect.
OpenClaw version
2026.3.12 and 2026.3.13 (regression); last known good 2026.3.8
Operating system
Linux 6.8.0-106-generic (x64)
Install method
npm/global install with gateway managed as systemd user service
Model
openai-codex/gpt-5.3-codex (runtime); default model openai-codex/gpt-5.3-codex
Provider / routing chain
openclaw CLI -> local gateway ws://127.0.0.1:18789 (loopback)
Config file / key location
No response
Additional provider/model setup details
No response
Logs, screenshots, and evidence
From `/tmp/openclaw/openclaw-2026-03-14.log` in the same timeframe:
- `gateway/ws handshake timeout ... remote=127.0.0.1`
- `closed before connect ... cause=handshake-timeout ... host=127.0.0.1:18789 ... code=1006`
CLI output also shows:
- `Local loopback ws://127.0.0.1:18789`
- `Connect: failed - timeout`
Meanwhile, `openclaw gateway status` reports:
- `Probe target: ws://127.0.0.1:18789`
- `RPC probe: ok`
Impact and severity
Affected: local operators managing paired devices/nodes through CLI.
Severity: High for operations (device/node management becomes unreliable).
Frequency: Intermittent but recurring on 2026.3.12/2026.3.13.
Consequence: Node/device workflows fail despite health signal reporting OK, causing confusion and blocked maintenance tasks.
Additional information
Regression window:
- Last known good: 2026.3.8
- First known bad: 2026.3.12
- Still bad on: 2026.3.13
Workaround:
Hypothesis:
- WebSocket handshake/connection handling regression where lightweight probe path succeeds but full command RPC path intermittently times out under runtime load.
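The hypothesis can be checked from outside openclaw: if a bare TCP connect to the port succeeds while the subsequent handshake exchange stalls, the lightweight-probe-vs-full-path split is plausible. A stdlib-only sketch (the upgrade request is a simplified stand-in, not the full RFC 6455 WebSocket handshake and not openclaw's actual probe; host, port, and the 10 s timeout mirror this report, not known openclaw defaults):

```python
# Distinguish "TCP connect succeeded" (what a lightweight probe may exercise)
# from "got an application-level reply" (what a full RPC call needs).
import socket


def probe(host="127.0.0.1", port=18789, timeout=10.0):
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            # Simplified upgrade-style request; a real client would send the
            # complete WebSocket handshake headers (Sec-WebSocket-Key, etc.).
            sock.sendall(
                b"GET / HTTP/1.1\r\nHost: %b\r\n"
                b"Connection: Upgrade\r\nUpgrade: websocket\r\n\r\n"
                % host.encode()
            )
            try:
                data = sock.recv(1024)
                return ("connect ok", "response" if data else "closed")
            except socket.timeout:
                # The reported symptom: connect succeeds, handshake never
                # completes (log shows close-before-connect, code=1006).
                return ("connect ok", "handshake timeout")
    except OSError as exc:
        return ("connect failed", str(exc))
```

Seeing `("connect ok", "handshake timeout")` against the live gateway while `gateway status` still reports `RPC probe: ok` would support the hypothesis that the probe path and the command RPC path diverge.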