fix(cli): auto-reconnect logs --follow on transient gateway disconnect #74782 #75059

Merged
RomneyDa merged 2 commits into openclaw:main from shashank-poola:main
May 3, 2026

Conversation

@shashank-poola
Contributor

Closes #74782

What this fixes

openclaw logs --follow exits immediately on any gateway error — including a simple
gateway restart — forcing the user to manually re-run the command.

Before:

$ openclaw logs --follow
... logs streaming ...
[gateway restarts]
Gateway not reachable. Is it running and accessible?
$   ← dead, user re-runs manually

After:

$ openclaw logs --follow --url ws://127.0.0.1:18789
... logs streaming ...
[gateway restarts]
[logs] gateway disconnected, reconnecting in 1s...
[logs] gateway disconnected, reconnecting in 2s...
... logs resume ...

How it works

The --follow polling loop now retries on transient transport failures instead of
calling process.exit(1). Backoff: 1 s → 2 s → 4 s → … → 30 s cap, up to 8
attempts. The retry counter resets after every successful fetch.
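
The schedule described above can be sketched roughly as follows. This is only an illustration, not the code in src/cli/logs-cli.ts: the constant names, `backoffDelayMs`, `backoffSchedule`, and `pollWithRetry` are hypothetical, and reading "up to 8 attempts" as "give up on the failure after the 8th retry" is an assumption.

```typescript
// Hypothetical names; the numbers mirror the description above (1 s base, 30 s cap, 8 attempts).
const BASE_DELAY_MS = 1_000;
const MAX_DELAY_MS = 30_000;
const MAX_ATTEMPTS = 8;

/** Delay before retry number `attempt` (1-based): doubles each time, capped at 30 s. */
function backoffDelayMs(attempt: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** (attempt - 1), MAX_DELAY_MS);
}

/** Every delay used during one uninterrupted outage. */
function backoffSchedule(): number[] {
  return Array.from({ length: MAX_ATTEMPTS }, (_, i) => backoffDelayMs(i + 1));
}

/**
 * Poll until `shouldStop` returns true. Transient failures are retried with backoff,
 * and the counter resets after every successful fetch. `fetchOnce` is assumed to
 * include the normal wait between polls.
 */
async function pollWithRetry(
  fetchOnce: () => Promise<void>,
  isTransient: (err: unknown) => boolean,
  sleep: (ms: number) => Promise<void>,
  shouldStop: () => boolean,
): Promise<void> {
  let attempt = 0;
  while (!shouldStop()) {
    try {
      await fetchOnce();
      attempt = 0; // a successful fetch resets the retry counter
    } catch (err) {
      attempt += 1;
      // Non-transient errors and exhausted retries propagate to the caller.
      if (!isTransient(err) || attempt > MAX_ATTEMPTS) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Under these assumptions a single outage waits at most 1 + 2 + 4 + 8 + 16 + 30 + 30 + 30 = 121 seconds in total before giving up.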

Not everything is retried — non-recoverable errors exit immediately:

  • Close code 1008 (policy violation / pairing required)
  • Close code 4000–4999 (auth, rate-limit)
  • Any error that is not a transport-level disconnect
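
As a rough sketch of the classification in the list above: the `TransportErrorLike` shape and the `isRetryable` helper are hypothetical, and the real GatewayTransportError may expose the close code under a different field.

```typescript
// Hypothetical error shape used only for this illustration.
interface TransportErrorLike {
  kind: "transport";
  closeCode?: number;
}

function isTransportError(err: unknown): err is TransportErrorLike {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { kind?: unknown }).kind === "transport"
  );
}

/** True only for transport-level disconnects that are worth retrying. */
function isRetryable(err: unknown): boolean {
  if (!isTransportError(err)) return false; // not a transport-level disconnect
  const code = err.closeCode;
  if (code === 1008) return false; // policy violation / pairing required
  if (code !== undefined && code >= 4000 && code <= 4999) return false; // auth, rate-limit
  return true;
}
```

For example, an abnormal closure (code 1006) or a disconnect with no close code would be retried, while 1008 and 4001 would exit immediately.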

Scope

This adds explicit retry for --url targets. For implicit loopback connections the
existing local-file fallback already prevents exit — that path is preserved as-is.

Verified

pnpm test src/cli/logs-cli.test.ts                                              # 21/21
pnpm exec oxfmt --check --threads=1 src/cli/logs-cli.ts src/cli/logs-cli.test.ts  # clean

@openclaw-barnacle openclaw-barnacle Bot added the docs, cli, and size: S labels Apr 30, 2026
@clawsweeper
Contributor

clawsweeper Bot commented Apr 30, 2026

Codex review: needs changes before merge.

Summary
The PR adds bounded exponential-backoff retry handling for transient openclaw logs --follow Gateway transport failures, focused CLI tests, and a docs note.

Reproducibility: yes. Current main has a source-level reproduction: an explicit --url follow fetch that raises a closed GatewayTransportError reaches the catch path and exits, and the existing explicit-URL test asserts that fatal behavior.

Next step before merge
The remaining defects are narrow and file-local: preserve JSON-mode output for retry notices and add the active changelog entry while keeping the PR scope intact.

Security
Cleared: The diff is limited to CLI retry handling, focused tests, and docs, with no dependency, workflow, permission, secret, install, or release-path changes.

Review findings

  • [P2] Emit retry notices as JSON in --json mode — src/cli/logs-cli.ts:343-348
  • [P3] Add the required changelog entry — src/cli/logs-cli.ts:339
Review details

Best possible solution:

Land one canonical focused CLI fix that retries transient follow-mode Gateway disconnects, preserves fatal auth/pairing behavior and implicit-loopback fallback, keeps JSON output machine-readable, includes tests/docs/changelog, and supersedes the competing PR if needed.

Do we have a high-confidence way to reproduce the issue?

Yes. Current main has a source-level reproduction: an explicit --url follow fetch that raises a closed GatewayTransportError reaches the catch path and exits, and the existing explicit-URL test asserts that fatal behavior.

Is this the best way to solve the issue?

No, not as-is. The retry loop is the right boundary, but this branch should emit retry notices as JSON under --json and add the required changelog entry before it is the best merge candidate.

Full review comments:

  • [P2] Emit retry notices as JSON in --json mode — src/cli/logs-cli.ts:343-348
    The new retry branch always writes a styled text line to stderr. In openclaw logs --json, existing errors and notices are JSON records and the docs advertise line-delimited JSON events, so this plain text breaks machine consumers during a recoverable disconnect. Branch on jsonMode and emit a notice JSON record instead.
    Confidence: 0.86
  • [P3] Add the required changelog entry — src/cli/logs-cli.ts:339
    This is a user-facing CLI fix, but the PR head only changes docs, the logs CLI, and tests. Add a single-line active CHANGELOG.md Fixes entry for the reconnect behavior with appropriate human credit before merge.
    Confidence: 0.92
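
The [P2] finding above could be addressed along these lines. This is a minimal sketch: `formatRetryNotice` is a hypothetical helper, and the `type` and `message` field names are assumptions rather than the CLI's actual JSON record schema.

```typescript
/**
 * Build the retry notice for either output mode.
 * Text mode matches the "[logs] ..." line shown in the PR description;
 * JSON mode emits one line-delimited record (field names are assumed).
 */
function formatRetryNotice(delayMs: number, jsonMode: boolean): string {
  const message = `gateway disconnected, reconnecting in ${delayMs / 1000}s...`;
  return jsonMode
    ? JSON.stringify({ type: "notice", message })
    : `[logs] ${message}`;
}
```

Keeping the branch in one place means a consumer parsing `--json` output line by line never sees a plain-text line during a recoverable disconnect.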

Overall correctness: patch is incorrect
Overall confidence: 0.86

Acceptance criteria:

  • pnpm test src/cli/logs-cli.test.ts
  • pnpm exec oxfmt --check --threads=1 src/cli/logs-cli.ts src/cli/logs-cli.test.ts docs/cli/logs.md CHANGELOG.md
  • git diff --check
  • pnpm check:changed in Testbox before merge if the branch is promoted

What I checked:

Likely related people:

  • steipete: History and shortlog point to Peter Steinberger as the dominant maintainer for logs-cli, tests, docs, and gateway/call, including the commits that added local logs fallback and typed Gateway transport failures. (role: current logs CLI and Gateway transport maintainer; confidence: high; commits: 306fe841f54b, e25b54210097, 023d3371a533; files: src/cli/logs-cli.ts, src/cli/logs-cli.test.ts, docs/cli/logs.md)
  • shakkernerd: Recent adjacent history includes the lazy CLI Gateway helper runtime split used by the callGatewayFromCli path that logs.tail relies on. (role: adjacent CLI Gateway helper maintainer; confidence: medium; commits: 36c8282795; files: src/cli/gateway-rpc.ts, src/cli/gateway-rpc.runtime.ts)
  • vincentkoc: Shortlog and recent commits show related maintenance in src/gateway/call.ts, especially credential and secret-input handling around the shared Gateway request path used by the logs CLI. (role: adjacent Gateway call maintainer; confidence: medium; commits: 935bd6de7f, 74e7b8d47b, 42e3d8d693; files: src/gateway/call.ts)

Remaining risk / open question:

Codex review notes: model gpt-5.5, reasoning high; reviewed against 9fff2b779159.

@shashank-poola
Contributor Author

Codex review: needs maintainer review before merge.

What this changes:

The PR adds bounded exponential-backoff retry handling for openclaw logs --follow transient Gateway disconnects, focused CLI tests, and a docs note for the reconnect behavior.

Maintainer follow-up before merge:

This is already an open implementation PR tied to the reconnect issue; the remaining action is normal maintainer review and validation, not a separate automated repair PR.

Security review:

Security review cleared: The diff is limited to CLI retry handling, focused tests, and docs, with no dependency, workflow, permission, secret, install, or release-path changes.

Review details

Fixed in the latest commit: the errorLine() return value is now checked in the retry branch, consistent with the rest of the file.

@shashank-poola
Contributor Author

@steipete this PR addresses #74782 by adding bounded exponential-backoff retry for logs --follow on transient gateway disconnects. CI passes, and the P3 bot feedback has been addressed.

Happy to adjust anything before merge.

@RomneyDa
Contributor

RomneyDa commented May 3, 2026

Nice, I attempted a similar fix here but like this one better.
#75372

@RomneyDa RomneyDa merged commit 23fe355 into openclaw:main May 3, 2026
83 checks passed
vincentkoc added a commit that referenced this pull request May 3, 2026
* 'main' of https://github.com/openclaw/openclaw:
  fix(cli): auto-reconnect logs --follow on transient gateway disconnect #74782 (#75059)
vincentkoc added a commit that referenced this pull request May 3, 2026
#75059 (fixes #74782) added user-facing CLI behavior — bounded
exponential reconnect for `openclaw logs --follow` on transient
gateway disconnect — and updated docs/cli/logs.md, but landed without a
`## Unreleased` entry. Add the missing line so the credited human
contributor is captured in the active release window. Thanks
@shashank-poola.
vincentkoc added a commit that referenced this pull request May 3, 2026
#75372 added `[logs] gateway reconnected` notice and JSON `notice`
records as a follow-up to #75059 and landed today, but its changelog
entry was placed under `## 2026.4.29` (already released). Move it next
to the related #75059 entry under `## Unreleased ### Fixes` so the
released section stays frozen and the credited contributor lands in the
right release window. Thanks @RomneyDa.
lxe pushed a commit to lxe/openclaw that referenced this pull request May 6, 2026
…openclaw#74782 (openclaw#75059)

* fix(cli): auto-reconnect logs --follow on transient gateway disconnect

* fix(cli): honor errorLine return value in follow retry warning

Labels

cli CLI command changes docs Improvements or additions to documentation size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature: openclaw logs --follow should auto-reconnect instead of exiting on transient gateway disconnect

2 participants