fix(cli): emit JSON for gateway transport failures#79233

Closed
TurboTheTurtle wants to merge 2 commits into openclaw:main from TurboTheTurtle:codex/json-gateway-errors

Conversation

@TurboTheTurtle TurboTheTurtle commented May 8, 2026

Contributor

Summary

  • Problem: gateway-backed CLI commands that advertise --json could exit on transport close/timeout without writing JSON to stdout.
  • Why it matters: health-check and device automation scripts had to parse stderr even after opting into machine-readable output.
  • What changed: added a shared GatewayTransportError JSON serializer and used it for JSON-mode health, gateway health, and devices list transport failures.
  • What did NOT change (scope boundary): non-JSON command behavior, non-transport exceptions, gateway probe output, and local pairing fallback behavior are unchanged.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: JSON-mode gateway-backed commands now emit a structured { "ok": false, "error": ..., "gateway": ... } payload when the gateway transport closes or times out before returning an RPC result.
  • Real environment tested: macOS local source checkout, Node v25.8.2, pnpm v10.33.2.
  • Real stopped/unavailable gateway target: ws://127.0.0.1:9 with a redacted proof token.
  • Exact steps or command run after this patch:
    • OPENCLAW_GATEWAY_URL=ws://127.0.0.1:9 OPENCLAW_GATEWAY_TOKEN=proof-token pnpm openclaw health --json --timeout 500
    • pnpm openclaw gateway health --url ws://127.0.0.1:9 --token proof-token --json --timeout 500
    • pnpm openclaw devices list --url ws://127.0.0.1:9 --token proof-token --json --timeout 500
  • Evidence after fix: redacted stdout from the real unavailable-gateway commands. Each command also exited non-zero (ELIFECYCLE Command failed with exit code 1).

health --json redacted output:

{
  "ok": false,
  "error": {
    "type": "gateway_transport_error",
    "kind": "closed",
    "message": "gateway closed (1006 abnormal closure (no close frame)): no close reason",
    "code": 1006,
    "reason": "no close reason"
  },
  "gateway": {
    "url": "ws://127.0.0.1:9",
    "urlSource": "env OPENCLAW_GATEWAY_URL"
  }
}

gateway health --json and devices list --json produced the same structured error shape with:

{
  "gateway": {
    "url": "ws://127.0.0.1:9",
    "urlSource": "cli --url"
  }
}
  • Observed result after fix: each command wrote structured JSON to stdout and exited non-zero instead of falling through to stderr-only handling.
  • What was not tested: live Windows npm install repro from the issue.
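
The envelope shown in the redacted output can be written down as a type. The following is a minimal TypeScript sketch, not the PR's actual serializer: the field names under `error` and `gateway` come from the output above, while the type names, the `timeoutMs` field, and both function names are assumptions made for illustration.

```typescript
// Sketch only. Field names mirror the redacted output above; the type names,
// timeoutMs, toErrorEnvelope, and formatEnvelope are hypothetical.
type TransportKind = "closed" | "timeout";

interface TransportFailure {
  kind: TransportKind;
  message: string;
  code?: number; // WebSocket close code, e.g. 1006
  reason?: string;
  timeoutMs?: number; // assumed field for timeout failures
}

interface GatewayTarget {
  url: string;
  urlSource: string;
}

interface GatewayErrorEnvelope {
  ok: false;
  error: { type: "gateway_transport_error" } & TransportFailure;
  gateway: GatewayTarget;
}

// Builds the envelope; "type" is placed first so it leads the serialized object.
function toErrorEnvelope(failure: TransportFailure, gateway: GatewayTarget): GatewayErrorEnvelope {
  return {
    ok: false,
    error: { type: "gateway_transport_error", ...failure },
    gateway,
  };
}

// Pretty-prints with two-space indentation, matching the output shown above.
function formatEnvelope(envelope: GatewayErrorEnvelope): string {
  return JSON.stringify(envelope, null, 2);
}

// Rebuilds the health --json example from this PR body.
const sample = toErrorEnvelope(
  {
    kind: "closed",
    message: "gateway closed (1006 abnormal closure (no close frame)): no close reason",
    code: 1006,
    reason: "no close reason",
  },
  { url: "ws://127.0.0.1:9", urlSource: "env OPENCLAW_GATEWAY_URL" },
);
```

Declaring `ok` as the literal type `false` lets a consumer narrow on that field before reading `error`.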

Root Cause (if applicable)

callGateway already produced typed GatewayTransportError instances, but affected JSON commands awaited the gateway RPC before reaching their success-only JSON output branch. The typed transport error escaped to the generic CLI error path, which wrote human-readable stderr and exited without stdout JSON.
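
The failure mode described here can be illustrated with a simplified before/after pair. This is a sketch under stated assumptions: `TransportError` stands in for the typed GatewayTransportError, and `runBefore`, `runAfter`, `errorText`, and the `CommandResult` shape are hypothetical names that do not appear in the PR.

```typescript
// Stand-in for the typed GatewayTransportError (hypothetical shape).
class TransportError extends Error {
  kind: "closed" | "timeout";
  constructor(kind: "closed" | "timeout", message: string) {
    super(message);
    // Keeps instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.kind = kind;
  }
}

interface CommandResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

type Rpc = () => Promise<unknown>;

function errorText(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

// Before: the await sits ahead of a success-only JSON branch, so a rejected
// call never reaches stdout and ends up on the generic stderr path.
async function runBefore(call: Rpc, json: boolean): Promise<CommandResult> {
  try {
    const result = await call();
    return { stdout: json ? JSON.stringify(result) : "healthy", stderr: "", exitCode: 0 };
  } catch (err) {
    return { stdout: "", stderr: errorText(err), exitCode: 1 };
  }
}

// After: in JSON mode a typed transport failure is serialized to stdout.
// The exit code stays non-zero and every other error keeps the generic path.
// Whether the real command also writes to stderr here is not shown in the PR.
async function runAfter(call: Rpc, json: boolean): Promise<CommandResult> {
  try {
    const result = await call();
    return { stdout: json ? JSON.stringify(result) : "healthy", stderr: "", exitCode: 0 };
  } catch (err) {
    if (json && err instanceof TransportError) {
      const payload = {
        ok: false,
        error: { type: "gateway_transport_error", kind: err.kind, message: err.message },
      };
      return { stdout: JSON.stringify(payload), stderr: "", exitCode: 1 };
    }
    return { stdout: "", stderr: errorText(err), exitCode: 1 };
  }
}
```

Results are returned rather than written to the process streams so the two paths can be compared side by side without spawning a CLI.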

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • src/gateway/call.test.ts
    • src/commands/health.test.ts
    • src/cli/gateway-cli.coverage.test.ts
    • src/cli/devices-cli.test.ts
  • Scenario the test should lock in: JSON-mode command paths serialize typed gateway transport close/timeout failures instead of falling through to stderr-only handling.

User-visible / Behavior Changes

When --json is requested, transport-level gateway close/timeout failures now produce stdout JSON before exiting with code 1.

Diagram (if applicable)

N/A

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: local source checkout
  • Model/provider: N/A
  • Integration/channel (if any): Gateway CLI transport failure handling
  • Relevant config (redacted): loopback gateway URL and proof token only

Steps

  1. Run JSON-mode gateway-backed commands against an unavailable loopback gateway.
  2. Inspect stdout JSON and exit behavior.

Expected

  • Commands write structured JSON with ok: false, transport kind, message/code/timeout details, and gateway target metadata.
  • Commands still exit non-zero.

Actual

  • Real CLI proof above confirms the expected JSON payload and non-zero exit behavior for health --json, gateway health --json, and devices list --json.
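
From a script's point of view, the expected behavior above means stdout can be parsed directly. The sketch below shows one way a health-check script might interpret the result. `interpretCliOutput`, `isRecord`, and the `HealthOutcome` type are hypothetical, and only the `ok`, `error.type`, `error.kind`, `error.message`, and `gateway.url` fields are taken from the output in this PR.

```typescript
// Hypothetical consumer-side helper; not part of the openclaw CLI.
type HealthOutcome =
  | { status: "healthy"; payload: unknown }
  | { status: "gateway_unreachable"; kind: string; message: string; url: string | undefined }
  | { status: "unparseable"; raw: string };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

function interpretCliOutput(stdout: string, exitCode: number): HealthOutcome {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stdout);
  } catch {
    // Covers the pre-fix behavior, where stdout was empty on transport failure.
    return { status: "unparseable", raw: stdout };
  }

  if (isRecord(parsed) && parsed["ok"] === false) {
    const error = parsed["error"];
    const gateway = parsed["gateway"];
    if (isRecord(error) && error["type"] === "gateway_transport_error") {
      const url = isRecord(gateway) ? gateway["url"] : undefined;
      return {
        status: "gateway_unreachable",
        kind: String(error["kind"]),
        message: String(error["message"]),
        url: typeof url === "string" ? url : undefined,
      };
    }
  }

  // Valid JSON on a zero exit is treated as a normal result; anything else is
  // reported as unparseable rather than guessed at.
  return exitCode === 0
    ? { status: "healthy", payload: parsed }
    : { status: "unparseable", raw: stdout };
}
```

The exit code is still checked alongside the payload, since the commands continue to exit non-zero on transport failure.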

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: shared transport-error serialization, health --json, gateway health --json, devices list --json, and status/health wrapper option forwarding.
  • Edge cases checked: non-transport errors are not converted, non-JSON mode keeps the existing human error path, and explicit-url device listing still skips local fallback for plain pairing errors.
  • What you did not verify: live Windows shell repro.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: scripts may have expected empty stdout on JSON transport failures.
    • Mitigation: this only changes commands when --json is explicitly requested, matching the documented machine-readable contract.

Changelog

  • Added an Unreleased changelog entry for JSON gateway transport failures.

Additional Validation Note

  • pnpm exec vitest run src/gateway/call.test.ts src/commands/health.test.ts src/cli/gateway-cli.coverage.test.ts src/cli/devices-cli.test.ts --reporter=dot: passed (Test Files 6 passed, Tests 288 passed).
  • pnpm exec vitest run src/cli/program/register.status-health-sessions.test.ts --reporter=dot: passed (Test Files 1 passed, Tests 21 passed).
  • pnpm tsgo:core: passed.
  • pnpm tsgo:core:test: passed.
  • pnpm exec oxfmt --check --threads=1 src/gateway/call.ts src/gateway/call.test.ts src/commands/health.ts src/commands/health.test.ts src/cli/gateway-cli/call.ts src/cli/gateway-cli/register.ts src/cli/gateway-cli.coverage.test.ts src/cli/devices-cli.ts src/cli/devices-cli.test.ts CHANGELOG.md: passed.
  • git diff --check: passed.

@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime cli CLI command changes commands Command implementations size: L labels May 8, 2026
@TurboTheTurtle TurboTheTurtle marked this pull request as ready for review May 8, 2026 04:32
@openclaw-barnacle openclaw-barnacle Bot added the triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. label May 8, 2026

clawsweeper Bot commented May 8, 2026

Contributor

Codex review: needs changes before merge.

Summary
Adds a shared GatewayTransportError JSON serializer and wires it into JSON-mode health, gateway health, and devices list paths, with tests and a changelog entry.

Reproducibility: yes, source-level. Current main awaits gateway RPC calls before stdout JSON writes in health, gateway health, and devices list, and #79108 provides concrete unavailable-gateway commands for the failing paths.

Real behavior proof
Sufficient (live_output): The PR body includes redacted live CLI stdout from real unavailable-gateway runs after the fix for all affected commands, showing structured JSON and non-zero exits.

Next step before merge
A narrow automated repair can rebase the active PR and port the devices-list transport-error catch and test coverage into the current runtime split.

Security
Cleared: The diff changes CLI error serialization, tests, and changelog text only; it adds no dependencies, workflows, permissions, secret handling, or new network calls.

Review findings

  • [P2] Port devices JSON handling into the runtime split — src/cli/devices-cli.ts:532-540
Review details

Best possible solution:

Rebase or repair this PR onto current main, port the devices-list handling into src/cli/devices-cli.runtime.ts, then land one shared typed serializer for the linked issue.

Do we have a high-confidence way to reproduce the issue?

Yes, source-level. Current main awaits gateway RPC calls before stdout JSON writes in health, gateway health, and devices list, and #79108 provides concrete unavailable-gateway commands for the failing paths.

Is this the best way to solve the issue?

Yes in direction, but not in the current branch state. A shared serializer for typed GatewayTransportError failures is narrow and maintainable, but the devices-list catch must move to the current runtime split before merge.

Full review comments:

  • [P2] Port devices JSON handling into the runtime split — src/cli/devices-cli.ts:532-540
    Current main lazy-loads devices list into src/cli/devices-cli.runtime.ts, but this PR head adds the JSON transport-error catch to the pre-split registerDevicesCli body and does not contain the runtime file. After conflict resolution, devices list --json would still await listPairingWithFallback in runDevicesListCommand and can fall back to stderr-only output unless this catch and its coverage are ported.
    Confidence: 0.87
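
The placement concern in this finding can be sketched as follows. All names here are hypothetical stand-ins (the actual exports of src/cli/devices-cli.ts and src/cli/devices-cli.runtime.ts are not shown in this thread), and `loadRuntime` simulates what would be a dynamic `import()` in real code. The point is only that the catch has to live in the function that awaits the gateway call.

```typescript
// Stand-in for the typed GatewayTransportError (hypothetical shape).
class TransportError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

interface Output {
  stdout: string;
  exitCode: number;
}

type ListPairing = () => Promise<string[]>;

// Plays the role of the lazily loaded runtime module. The gateway call is
// awaited here, so the JSON-mode catch belongs here as well.
const devicesRuntime = {
  async runList(listPairing: ListPairing, json: boolean): Promise<Output> {
    try {
      const devices = await listPairing();
      return { stdout: json ? JSON.stringify(devices) : devices.join("\n"), exitCode: 0 };
    } catch (err) {
      if (json && err instanceof TransportError) {
        const payload = {
          ok: false,
          error: { type: "gateway_transport_error", message: err.message },
        };
        return { stdout: JSON.stringify(payload), exitCode: 1 };
      }
      throw err; // other errors keep their existing handling
    }
  },
};

// Simulates lazily importing the runtime module.
async function loadRuntime(): Promise<typeof devicesRuntime> {
  return devicesRuntime;
}

// Plays the role of the registration module after the split: it only loads and
// delegates, so the pre-split body where the catch was added no longer exists here.
async function devicesListAction(listPairing: ListPairing, json: boolean): Promise<Output> {
  const runtime = await loadRuntime();
  return runtime.runList(listPairing, json);
}
```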

Overall correctness: patch is incorrect
Overall confidence: 0.84

Acceptance criteria:

  • node scripts/run-vitest.mjs src/gateway/call.test.ts src/commands/health.test.ts src/cli/gateway-cli.coverage.test.ts src/cli/devices-cli.test.ts src/cli/program/register.status-health-sessions.test.ts
  • node scripts/crabbox-wrapper.mjs run --shell -- "pnpm check:changed"
  • git diff --check

What I checked:

Likely related people:

  • steipete: Recent src/gateway/call.ts history includes transport classification, event-loop/client-start, and handshake timeout work that provides the typed GatewayTransportError contract this PR serializes. (role: recent gateway transport contributor; confidence: high; commits: 023d3371a533, 6bbacd14a366, 7994833fac21; files: src/gateway/call.ts, src/gateway/call.test.ts, src/commands/health.ts)
  • vincentkoc: Recent history includes the devices runtime split and adjacent gateway CLI cold-path work, which controls where the devices-list portion of this fix must be ported. (role: recent CLI and devices area contributor; confidence: high; commits: fe25ed214ef5, 4c090accd39a, e31dfa989764; files: src/cli/devices-cli.ts, src/cli/devices-cli.runtime.ts, src/cli/gateway-cli/register.ts)
  • obviyus: Recent gateway diagnostics work touched probe/status behavior adjacent to the gateway probe JSON control case used as the expected comparison for this bug. (role: adjacent gateway diagnostics contributor; confidence: medium; commits: 485c258aaf96; files: src/cli/gateway-cli/register.ts, src/commands/gateway-status.ts, src/gateway/probe.ts)

Remaining risk / open question:

  • The PR head is currently CONFLICTING/DIRTY; resolving it without moving the devices-list catch into src/cli/devices-cli.runtime.ts would leave one reported command path unfixed.
  • The JSON error envelope becomes observable CLI behavior, so the final rebased patch should preserve one deliberate stable shape.
  • I did not run tests in this read-only review; confidence comes from source inspection, live PR metadata, CI status, and the PR body's live-output proof.

Codex review notes: model gpt-5.5, reasoning high; reviewed against ea16a5e9e10c.

Contributor Author

Agreed. I’ll add an active-version changelog bullet and update the PR body with after-fix output from a real unavailable-gateway run for the affected JSON commands. The goal is to show stdout contains { "ok": false, ... } and the command still exits non-zero.

@TurboTheTurtle TurboTheTurtle changed the title [codex] fix(cli): emit JSON for gateway transport failures fix(cli): emit JSON for gateway transport failures May 8, 2026
@openclaw-barnacle openclaw-barnacle Bot added triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. and removed triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels May 8, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026
@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed proof: sufficient ClawSweeper judged the real behavior proof convincing. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 8, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026
@TurboTheTurtle TurboTheTurtle force-pushed the codex/json-gateway-errors branch from f4c949c to 47982b1 Compare May 8, 2026 09:37
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026

Contributor Author

Resolved the GitHub merge conflict by rebasing this branch onto current upstream/main and preserving the JSON gateway changelog entry. Pushed 47982b167d.

Validation: git diff --check on the rebased branch. The conflict was changelog-only.

@TurboTheTurtle TurboTheTurtle force-pushed the codex/json-gateway-errors branch from 47982b1 to 66e9741 Compare May 8, 2026 13:48
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026

Contributor Author

Rebased this once more after main advanced again; new head is 66e9741c44 on latest upstream/main (1a34ef4516).

Validation: git diff --check. The repeated conflict was still changelog-only.

@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026
@TurboTheTurtle TurboTheTurtle force-pushed the codex/json-gateway-errors branch from 66e9741 to 6e11902 Compare May 10, 2026 03:16
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@TurboTheTurtle TurboTheTurtle force-pushed the codex/json-gateway-errors branch from 6e11902 to 877bff0 Compare May 10, 2026 06:56
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@TurboTheTurtle TurboTheTurtle force-pushed the codex/json-gateway-errors branch from 877bff0 to d3c63ef Compare May 11, 2026 08:50
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@TurboTheTurtle TurboTheTurtle force-pushed the codex/json-gateway-errors branch from d3c63ef to f5d4ad4 Compare May 12, 2026 07:34
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 12, 2026
@TurboTheTurtle

Contributor Author

Resolved the current merge conflict by rebasing codex/json-gateway-errors onto latest upstream/main (109493bcdd). The new head is f5d4ad45d8.

Conflict resolved:

  • src/cli/gateway-cli.coverage.test.ts: preserved upstream startGatewayServer.mockClear() and the PR-side formatGatewayTransportErrorJson reset/default setup.

Validation on rebased head:

  • pnpm test src/cli/gateway-cli.coverage.test.ts src/cli/devices-cli.test.ts src/commands/health.test.ts src/gateway/call.test.ts
  • pnpm check:changed -- --base upstream/main
  • git diff --check

Author check before push:

f5d4ad45d8 Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com> docs(changelog): note JSON gateway transport errors
ee334b29ea Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com> fix(cli): emit json for gateway transport failures

If this PR is squash-merged or reworked, please preserve author attribution or include:
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>

@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 12, 2026
vincentkoc added a commit that referenced this pull request May 16, 2026
Fix logs.tail credential-header redaction and JSON-mode gateway transport errors.

Fixes #66832.
Fixes #79108.
Supersedes #67041.
Supersedes #79233.

Co-authored-by: Mil Wang <mingjwan@microsoft.com>
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
@vincentkoc

Member

Thanks for the focused fix. I landed a repaired equivalent on main in #82690 / e06782d, with contributor credit preserved.

What changed from this PR before landing:

  • Ported the behavior onto current CLI file layout.
  • Reused the existing typed GatewayTransportError path and added a shared JSON formatter.
  • Covered health, gateway health, and devices list JSON-mode failures.

Proof: focused Vitest passed after final rebase (8 files / 348 tests), and Blacksmith Testbox through Crabbox passed pnpm check:changed: tbx_01krryzqc6djxxnbrpea1n3n0t, Actions run https://github.com/openclaw/openclaw/actions/runs/25968977106, exit 0.

@TurboTheTurtle TurboTheTurtle deleted the codex/json-gateway-errors branch May 20, 2026 03:39

Labels

cli CLI command changes commands Command implementations gateway Gateway runtime proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: supplied External PR includes structured after-fix real behavior proof. size: L

Development

Successfully merging this pull request may close these issues.

Gateway-backed --json commands emit no JSON on connection failure

2 participants