[Bug]: devices list / nodes status timeout while gateway status shows RPC probe: ok (regression in 2026.3.12/2026.3.13) #46316

@dmmmuse

Bug type

Regression (worked before, now fails)

Summary

On 2026.3.12 and 2026.3.13, openclaw gateway status reports RPC probe: ok for ws://127.0.0.1:18789, but openclaw devices list and openclaw nodes status intermittently time out against the same endpoint.

Steps to reproduce

  1. Start gateway normally (openclaw gateway start or systemd user service).
  2. Run openclaw gateway status and confirm Runtime: running, Listening: 127.0.0.1:18789, and RPC probe: ok.
  3. Run openclaw devices list.
  4. Run openclaw nodes status.
  5. Observe intermittent timeout/failure on the same local endpoint (ws://127.0.0.1:18789).
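
Because the failure is intermittent, a single run of steps 3 and 4 may pass. A minimal sketch for repeating the commands and tallying outcomes follows. The helper names (`run_repeatedly`, `main`), the attempt count, and the 15-second limit are illustrative, and the sketch assumes the CLI exits non-zero on failure, which has not been verified here.

```python
import subprocess
import sys
from collections import Counter


def run_repeatedly(cmd, attempts=20, timeout_s=15.0):
    """Run `cmd` `attempts` times and tally ok / failed / timeout outcomes."""
    outcomes = Counter()
    for _ in range(attempts):
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout_s
            )
        except subprocess.TimeoutExpired:
            # The command hung longer than timeout_s and was killed.
            outcomes["timeout"] += 1
            continue
        # Assumption: a non-zero exit code means the command failed.
        outcomes["ok" if result.returncode == 0 else "failed"] += 1
    return outcomes


def main():
    # Optional first argument: number of attempts per command.
    attempts = int(sys.argv[1]) if len(sys.argv) > 1 else 20
    commands = (
        ["openclaw", "gateway", "status"],
        ["openclaw", "devices", "list"],
        ["openclaw", "nodes", "status"],
    )
    for cmd in commands:
        print(" ".join(cmd), dict(run_repeatedly(cmd, attempts=attempts)))
```

Calling `main()` on an affected host should give a rough failure rate per command, which makes it easier to compare 2026.3.8 against 2026.3.12/2026.3.13.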

Expected behavior

When openclaw gateway status shows RPC probe: ok on ws://127.0.0.1:18789, commands using the same gateway connection (devices list, nodes status) should complete reliably without timeout.

Actual behavior

openclaw gateway status succeeds and reports probe OK, but openclaw devices list and/or openclaw nodes status intermittently fail with timeout or connection errors.
Observed error context includes:

  • Local loopback ws://127.0.0.1:18789
  • Connect: failed - timeout
  • Gateway log lines indicating a handshake timeout and a close before connect.

OpenClaw version

2026.3.12 and 2026.3.13 (regression); last known good 2026.3.8

Operating system

Linux 6.8.0-106-generic (x64)

Install method

npm/global install with gateway managed as systemd user service

Model

openai-codex/gpt-5.3-codex (runtime); default model openai-codex/gpt-5.3-codex

Provider / routing chain

openclaw CLI -> local gateway ws://127.0.0.1:18789 (loopback)

Config file / key location

No response

Additional provider/model setup details

No response

Logs, screenshots, and evidence

From `/tmp/openclaw/openclaw-2026-03-14.log` in the same timeframe:
- `gateway/ws handshake timeout ... remote=127.0.0.1`
- `closed before connect ... cause=handshake-timeout ... host=127.0.0.1:18789 ... code=1006`
CLI output also shows:
- `Local loopback ws://127.0.0.1:18789`
- `Connect: failed - timeout`
Meanwhile, `openclaw gateway status` reports:
- `Probe target: ws://127.0.0.1:18789`
- `RPC probe: ok`
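
To correlate CLI failures with gateway events, the log can be scanned for the markers quoted above. This is a rough sketch: it matches only the substrings visible in the excerpts, since the full log line format is elided here, and the names `MARKERS`, `count_handshake_events`, and `scan_file` are illustrative.

```python
from collections import Counter

# Substrings taken from the log excerpts in this report; the full line
# format is not shown, so nothing beyond these markers is assumed.
MARKERS = {
    "handshake_timeout": "handshake timeout",
    "closed_before_connect": "closed before connect",
    "code_1006": "code=1006",
}


def count_handshake_events(lines):
    """Count how many lines contain each marker substring."""
    counts = Counter()
    for line in lines:
        for name, marker in MARKERS.items():
            if marker in line:
                counts[name] += 1
    return counts


def scan_file(path):
    """Count marker occurrences in a log file, tolerating odd bytes."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_handshake_events(handle)
```

For example, `scan_file("/tmp/openclaw/openclaw-2026-03-14.log")` returns a per-marker count that can be compared with the number of failed CLI runs in the same window.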

Impact and severity

Affected: local operators managing paired devices/nodes through CLI.
Severity: High for operations (device/node management becomes unreliable).
Frequency: Intermittent but recurring on 2026.3.12/2026.3.13.
Consequence: Node/device workflows fail despite health signal reporting OK, causing confusion and blocked maintenance tasks.

Additional information

Regression window:

  • Last known good: 2026.3.8
  • First known bad: 2026.3.12
  • Still bad on: 2026.3.13

Workaround:

  • Roll back to 2026.3.8

Hypothesis:

  • A regression in WebSocket handshake or connection handling, where the lightweight probe path succeeds but the full command RPC path intermittently times out under runtime load.
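
One way to narrow this down is to time the TCP connect and the HTTP upgrade separately against the loopback endpoint. The sketch below uses only the Python standard library and the standard WebSocket opening handshake (RFC 6455). The function name `time_ws_upgrade`, the request path `/`, and the 5-second timeout are assumptions. The "handshake timeout" in the gateway log may also refer to an application-level handshake after the upgrade, which this sketch does not exercise.

```python
import base64
import os
import socket
import time


def time_ws_upgrade(host="127.0.0.1", port=18789, path="/", timeout_s=5.0):
    """Return (tcp_connect_seconds, upgrade_seconds, status_line)."""
    start = time.monotonic()
    # The timeout applies to the connect and to every later recv/send.
    with socket.create_connection((host, port), timeout=timeout_s) as sock:
        tcp_done = time.monotonic()
        key = base64.b64encode(os.urandom(16)).decode("ascii")
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\n"
            "Upgrade: websocket\r\n"
            "Connection: Upgrade\r\n"
            f"Sec-WebSocket-Key: {key}\r\n"
            "Sec-WebSocket-Version: 13\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        response = b""
        # Read until the end of the HTTP response headers.
        while b"\r\n\r\n" not in response:
            chunk = sock.recv(4096)  # raises socket.timeout if the server stalls
            if not chunk:
                break  # server closed the connection early
            response += chunk
        upgrade_done = time.monotonic()
    status_line = response.split(b"\r\n", 1)[0].decode("latin-1")
    return tcp_done - start, upgrade_done - tcp_done, status_line
```

If the TCP connect is consistently fast while the upgrade time spikes or times out during failing CLI runs, that would point at the gateway's accept or upgrade handling and away from the network layer.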

Metadata

Labels: bug (Something isn't working), regression (Behavior that previously worked and now fails)