fix(gateway): broadcast agent-run error payloads by vincentkoc · Pull Request #85355 · openclaw/openclaw

vincentkoc · 2026-05-22T12:41:02Z

Summary

broadcast returned isError agent-run payloads as terminal chat error events after an agent has started
mark the chat:<runId> dedupe entry failed for those returned error payloads
add a regression covering post-agent-start idle-timeout-style error delivery without mirroring assistant transcript entries
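
The three bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in src/gateway/server-methods/chat.ts: the `ReplyPayload`, `DedupeEntry`, and `finishAgentRun` names and the `broadcastChatError` signature are hypothetical stand-ins. Only the `isError` flag, the `chat:<runId>` key, and the `state: "error"` / `errorMessage` event fields are taken from this PR thread.

```typescript
// Hypothetical shapes for illustration; the real Gateway types may differ.
type ReplyPayload = { text: string; isError?: boolean };
type ChatErrorEvent = { runId: string; state: "error"; errorMessage?: string };
type DedupeEntry = { ok: true } | { ok: false; error: string };

const dedupe = new Map<string, DedupeEntry>();
const broadcasts: ChatErrorEvent[] = [];

// Stand-in for the Gateway's broadcastChatError helper (signature assumed).
function broadcastChatError(runId: string, errorMessage: string): void {
  broadcasts.push({ runId, state: "error", errorMessage });
}

// Post-dispatch handling once an agent run has started.
function finishAgentRun(runId: string, deliveredReplies: ReplyPayload[]): void {
  const failed = deliveredReplies.find((reply) => reply.isError === true);
  if (failed) {
    // Returned (not thrown) error payload: surface it as a terminal chat
    // error and record the run as failed instead of ok.
    broadcastChatError(runId, failed.text);
    dedupe.set(`chat:${runId}`, { ok: false, error: failed.text });
    return;
  }
  // Normal final text is not mirrored into the transcript here; only the
  // successful dedupe entry is written.
  dedupe.set(`chat:${runId}`, { ok: true });
}

finishAgentRun("run-1", [
  { text: "LLM idle timeout (120s): no response from model", isError: true },
]);
finishAgentRun("run-2", [{ text: "All done." }]);
```

In this sketch the error branch returns early, so a run with a returned error payload never also records a successful dedupe entry.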

Verification

node scripts/run-vitest.mjs src/gateway/server-methods/chat.directive-tags.test.ts (2 files, 152 tests passed)
node_modules/.bin/oxfmt --check src/gateway/server-methods/chat.ts src/gateway/server-methods/chat.directive-tags.test.ts CHANGELOG.md
git diff --check origin/main...HEAD

Real behavior proof

Behavior addressed: returned isError payloads after agentRunStarted=true were persisted/logged but not broadcast as terminal chat errors.
Real environment tested: local Gateway server-method unit harness in a Codex worktree, with dispatchInboundMessage triggering onAgentRunStart and returning a final error payload.
Exact steps or command run after this patch: node scripts/run-vitest.mjs src/gateway/server-methods/chat.directive-tags.test.ts.
Evidence after fix: the new regression observes a chat event with state: "error", errorMessage: "LLM idle timeout (120s): no response from model", and a failed chat:<runId> dedupe snapshot.
Observed result after fix: connected Gateway clients receive the terminal error event while normal agent-run final text still is not mirrored into assistant transcript entries.
What was not tested: a live ACP/Ki-Agents 120s idle timeout run; CI and the focused Gateway harness cover the source-level failure path.

clawsweeper · 2026-05-22T12:43:00Z

Codex review: needs maintainer review before merge.

Latest ClawSweeper review: 2026-05-22 12:47 UTC / May 22, 2026, 8:47 AM ET.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR updates Gateway chat handling so returned isError payloads after an agent run starts are broadcast as terminal chat error events, marks the chat dedupe entry failed, and adds a focused regression plus changelog entry.

Reproducibility: yes. Source-level reproduction is high confidence: current main stores block/final replies in deliveredReplies, then the agentRunStarted branch only updates the user transcript and marks dedupe ok, while the PR harness simulates onAgentRunStart plus a final isError payload.

PR rating
Overall: 🐚 platinum hermit
Proof: 🌊 off-meta tidepool
Patch quality: 🐚 platinum hermit
Summary: Focused, reviewable patch with a regression for the source-level failure path and no blocking findings; the remaining confidence gap is live transport proof and duplicate-PR coordination.

Rank-up moves:

Choose the canonical branch between this PR and fix(gateway): surface resolved chat errors #84953 rather than landing both.
Let required checks finish, then use a live ACP/WebChat smoke only if maintainers want to retire the transport-proof gap.

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Not applicable: The external contributor proof gate does not apply to this maintainer-authored PR; the PR body provides focused unit-harness proof but not a live ACP timeout run.

Risk before merge

A live ACP/Ki-Agents 120s idle-timeout run was not exercised; the remaining merge risk is whether real connected clients observe exactly the same terminal chat error event as the focused Gateway harness.
There are open external PRs for the same linked bug, especially fix(gateway): surface resolved chat errors #84953, so maintainers should avoid landing duplicate competing fixes.

Maintainer options:

Land with focused Gateway proof (recommended)
If required checks stay green, maintainers can accept the remaining live-timeout proof gap because the patch uses the existing chat error event contract and covers the source-level failure path.
Request live transport proof
Before merge, maintainers can ask for a short ACP/WebChat smoke that injects or triggers the returned isError payload and shows the connected client receiving the terminal error.
Pick the canonical duplicate branch
Pause this branch if maintainers prefer the broader existing fix in fix(gateway): surface resolved chat errors #84953, then close or supersede the other PR to avoid drift.

Next step before merge
No automated repair is needed; this protected maintainer PR has no blocking review findings and should proceed through normal maintainer merge/CI handling.

Security
Cleared: No concrete security or supply-chain regression found; the diff touches Gateway event handling, a regression test, and changelog only, with no new dependencies, workflows, permissions, or secret handling.

Review details

Best possible solution:

Land one canonical Gateway fix that broadcasts returned agent-run error payloads, preserves the transcript ownership boundary, and then close the linked issue while superseding duplicate PRs.

Do we have a high-confidence way to reproduce the issue?

Yes. Source-level reproduction is high confidence: current main stores block/final replies in deliveredReplies, then the agentRunStarted branch only updates the user transcript and marks dedupe ok, while the PR harness simulates onAgentRunStart plus a final isError payload.

Is this the best way to solve the issue?

Yes. The PR is the narrow maintainable fix: it inspects delivered error payloads in the existing post-dispatch path, reuses broadcastChatError, updates dedupe consistently, and avoids mirroring normal Pi assistant turns into transcript history.

Label changes:

add P1: The PR fixes a real Gateway/ACP workflow where users can lose the terminal error event and see a silently stopped response.
add merge-risk: 🚨 message-delivery: The patch changes terminal chat error delivery and dedupe behavior for connected Gateway clients, which focused tests cover but live transport proof has not exercised.
add rating: 🐚 platinum hermit: Current PR rating is 🐚 platinum hermit because proof is 🌊 off-meta tidepool and patch quality is 🐚 platinum hermit. Focused, reviewable patch with a regression for the source-level failure path and no blocking findings; the remaining confidence gap is live transport proof and duplicate-PR coordination.
add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Real behavior proof is not applicable: the external contributor proof gate does not apply to this maintainer-authored PR; the PR body provides focused unit-harness proof but not a live ACP timeout run.

Label justifications:

P1: The PR fixes a real Gateway/ACP workflow where users can lose the terminal error event and see a silently stopped response.
merge-risk: 🚨 message-delivery: The patch changes terminal chat error delivery and dedupe behavior for connected Gateway clients, which focused tests cover but live transport proof has not exercised.
rating: 🐚 platinum hermit: Current PR rating is 🐚 platinum hermit because proof is 🌊 off-meta tidepool and patch quality is 🐚 platinum hermit. Focused, reviewable patch with a regression for the source-level failure path and no blocking findings; the remaining confidence gap is live transport proof and duplicate-PR coordination.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Real behavior proof is not applicable: the external contributor proof gate does not apply to this maintainer-authored PR; the PR body provides focused unit-harness proof but not a live ACP timeout run.

What I checked:

Current main drops returned agent-run error payloads: On current main, deliveredReplies records block/final payloads, but once agentRunStarted is true the post-dispatch branch only emits the user transcript update and then writes a successful chat dedupe entry. (src/gateway/server-methods/chat.ts:2815, 111bad106544)
Existing protocol supports the intended event: The Gateway protocol already defines ChatErrorEventSchema with state: "error" and optional errorMessage, so the PR uses an existing client-facing event shape rather than adding a new protocol contract. (src/gateway/protocol/schema/logs-chat.ts:121, 111bad106544)
Agent timeout path returns an error payload: The Pi embedded runner timeout path returns a normal reply payload with isError: true, matching the issue's non-throwing flow that would bypass the .catch() broadcast path. (src/agents/pi-embedded-runner/run.ts:2775, 111bad106544)
PR patch broadcasts and marks dedupe failure: The PR detects returned isError payloads after an agent starts, calls broadcastChatError, and writes a failed chat:<runId> dedupe payload instead of ok: true. (src/gateway/server-methods/chat.ts:2902, a230fcc4cc38)
Regression covers terminal error event without transcript mirroring: The new test simulates onAgentRunStart followed by a final isError payload and expects a chat error event, failed dedupe snapshot, user transcript update, and no assistant transcript mirror. (src/gateway/server-methods/chat.directive-tags.test.ts:970, a230fcc4cc38)
Maintainer discussion keeps the bug independent: A maintainer comment on the linked issue says not to close it with fix(codex): recover final text after prompt timeout #84993 and identifies this Gateway chat delivery path as the next narrow repair candidate.
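
The third item above is the reason the error was lost: a timeout that comes back as a normal payload resolves the promise, so a `.catch()` handler never runs. A minimal sketch of that distinction follows. The `runAgentThrowing`, `runAgentReturning`, and `observe` functions are hypothetical and do not exist in the repository under these names.

```typescript
type ReplyPayload = { text: string; isError?: boolean };

// Hypothetical runner that signals failure by rejecting.
async function runAgentThrowing(): Promise<ReplyPayload> {
  throw new Error("LLM idle timeout (120s): no response from model");
}

// Hypothetical runner that signals failure by resolving with isError: true.
async function runAgentReturning(): Promise<ReplyPayload> {
  return {
    text: "LLM idle timeout (120s): no response from model",
    isError: true,
  };
}

// Reports which path observed the outcome of a run.
async function observe(run: () => Promise<ReplyPayload>): Promise<string> {
  return run()
    .then((reply) => (reply.isError ? "resolved-with-error" : "resolved-ok"))
    .catch(() => "caught");
}

// Only the throwing runner reaches .catch(); the returned error payload
// has to be inspected explicitly on the resolved path.
const outcomes: Promise<string[]> = Promise.all([
  observe(runAgentThrowing),
  observe(runAgentReturning),
]);
```

A handler that broadcasts errors only from `.catch()` would therefore stay silent for the second runner unless the resolved payload is checked for `isError`.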

Likely related people:

Peter Steinberger: Current blame for the Gateway post-dispatch branch, deliveredReplies handling, and the timeout payload return path points to the same recent local-history commit. (role: recent area contributor; confidence: medium; commits: 4ee8a2ac2ea5; files: src/gateway/server-methods/chat.ts, src/gateway/server-methods/chat.directive-tags.test.ts, src/agents/pi-embedded-runner/run.ts)
vincentkoc: The live PR and linked issue are assigned to this member account, and the issue discussion includes their maintainer routing note that this Gateway delivery bug is independent of adjacent fixes. (role: current follow-up owner; confidence: medium; commits: a230fcc4cc38; files: src/gateway/server-methods/chat.ts, src/gateway/server-methods/chat.directive-tags.test.ts, CHANGELOG.md)
JulyanXu: The issue discussion and open related PRs include source-level analysis and proposed fixes for the same returned-error-payload delivery path. (role: adjacent contributor; confidence: low; files: src/gateway/server-methods/chat.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 111bad106544.

clawsweeper · 2026-05-22T12:48:59Z

ClawSweeper PR egg

✨ Hatched: 💎 rare Mossy Merge Sprite

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

Merged PRs are hatchable.
Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.
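
Read as a predicate, the rules above look roughly like this. The `PullRequest` shape and the `isHatchable` name are invented for illustration, and the `labels` array stands in for both live labels and the "durable record"; ClawSweeper's actual check is not shown in this thread.

```typescript
// Hypothetical PR shape; labels covers live labels and the durable record.
type PullRequest = { merged: boolean; labels: string[] };

const HATCHABLE_LABELS = new Set<string>([
  "status: 👀 ready for maintainer look",
  "status: 🚀 automerge armed",
  "clawsweeper:automerge",
]);

function isHatchable(pr: PullRequest): boolean {
  // Merged PRs are hatchable regardless of labels.
  if (pr.merged) return true;
  // Open PRs and closed unmerged PRs both need one of the hatchable labels.
  return pr.labels.some((label) => HATCHABLE_LABELS.has(label));
}
```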

Rarity: 💎 rare.
Trait: polishes edge cases.
Image traits: location green-check meadow; accessory lint brush; palette moonlit blue and soft silver; mood celebratory; pose balancing on a branch marker; shell glossy opal shell; lighting bright celebratory glints; background smooth stones and checkmarks.

What is this egg doing here?

An egg appears after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

…UI error (#4437)

## Release target

Refs #4434. This PR targets `v0.0.55`; #4434 should remain open until this OpenClaw upgrade is merged, tagged, and verified in the shipped `.55` release.

## Why this resolves #4434

NemoClaw #4434 reports that `openclaw tui` keeps an active spinner and `connected` status with no visible terminal error when the NVIDIA inference endpoint is unreachable.

This branch moves the sandbox OpenClaw pin from `2026.5.22` to `2026.5.27` with npm integrity:

`sha512-2N93zhdAo88KAbHt6T7KvYXf4s7XIkYXBgv1npYpn7e1Y9FvrtgtpsA38my9rtFW+70uXEojRPX5/OqnuDqJPw==`

Upstream proof:

- openclaw/openclaw#85815 and openclaw/openclaw@a668982 fix the missing `broadcastChatError()` call for synchronous `chat.send` failures.
- openclaw/openclaw#84945 and openclaw/openclaw#85355 show the broader real class of gateway errors not being broadcast to clients.

## Changes

- Bumps `Dockerfile`, `Dockerfile.base`, `agents/openclaw/manifest.yaml`, and package metadata to OpenClaw `2026.5.27`.
- Updates OpenClaw pin/integrity tests, deployment/version tests, and the existing TUI chat-correlation E2E assertion.
- Updates `scripts/patch-openclaw-chat-send.js` so NemoClaw's chat-send run-id preservation shim still recognizes the compiled OpenClaw `2026.5.27` followup-runner admission shape.
- Adds a CI-safe Vitest contract harness for the #4434 TUI failure signature and expected visible-error behavior.
- Adds the privileged live repro: `test/e2e/test-issue-4434-tui-unreachable-inference.sh`.
- Wires that live repro into `nightly-e2e.yaml` as `issue-4434-tui-unreachable-inference-e2e`, including selective dispatch, public-install target-ref handling, failure artifacts, aggregate reporting coverage, and trusted workflow-script checkout for the secret/sudo firewall job.

## Local validation

- `npm ci`
- `npm ci --include=dev`
- `npm run build:cli`
- `npm run typecheck:cli`
- `npm test -- test/fetch-guard-patch-regression.test.ts test/openclaw-chat-send-patch.test.ts test/openclaw-tui-chat-correlation.test.ts test/issue-4434-tui-unreachable-inference.test.ts`
- `npm test -- src/lib/sandbox/version.test.ts src/lib/verify-deployment.test.ts`
- `npm test -- test/validate-e2e-coverage.test.ts test/e2e-advisor-dispatch.test.ts test/e2e-script-workflow.test.ts test/issue-4434-tui-unreachable-inference.test.ts nemoclaw/src/package-metadata.test.ts`
- `shellcheck test/e2e/test-issue-4434-tui-unreachable-inference.sh`
- `bash -n test/e2e/test-issue-4434-tui-unreachable-inference.sh`
- `bash -n test/e2e/test-openclaw-tui-chat-correlation.sh`
- `NEMOCLAW_ISSUE_4434_LIVE=0 bash test/e2e/test-issue-4434-tui-unreachable-inference.sh`
- `git diff --check`
- Fresh `npm pack openclaw@2026.5.27` dist smoke with `node scripts/patch-openclaw-chat-send.js "$tmp/package/dist"`
- Runtime Docker smoke: `docker build -f Dockerfile --build-arg BASE_IMAGE=ghcr.io/nvidia/nemoclaw/sandbox-base:latest -t nemoclaw-issue4434-openclaw-runtime-smoke:2026-5-27 .`
- Runtime image version smoke: `docker run --rm --entrypoint openclaw nemoclaw-issue4434-openclaw-runtime-smoke:2026-5-27 --version` -> `OpenClaw 2026.5.27 (27ae826)`
- Base-style OpenClaw install smoke in Docker for the `2026.5.27` npm integrity and install path.
- Pre-commit suite on `98e0a763efe0925f26cf89129cd4ab63cb0b05f3`: passed, including CLI/plugin coverage hooks.
- Pre-push suite reran CLI/plugin coverage; one unrelated `test/nemoclaw-start.test.ts` case timed out during the full concurrent run, then passed directly with `npx vitest run --project cli test/nemoclaw-start.test.ts -t "captures baseline snapshot when openclaw.json is valid and no baseline exists"`.

## Nightly proof

Targeted nightly E2E passed on the final PR head:

- Run: https://github.com/NVIDIA/NemoClaw/actions/runs/26586935610
- Job: https://github.com/NVIDIA/NemoClaw/actions/runs/26586935610/job/78335355241
- Head: `5f549f661fe81b485f75903146512af4225d4698`
- Job: `issue-4434-tui-unreachable-inference-e2e`
- Duration: 8m27s

The live job runs the requested end-to-end flow on Linux with the repository `NVIDIA_API_KEY` secret: public install from this PR ref, cloud onboard with NVIDIA Endpoints and `nvidia/nemotron-3-super-120b-a12b`, pre-block `nemoclaw <sandbox> status`, pre-block `nemoclaw <sandbox> connect --probe-only`, exact `DOCKER-USER` `DROP` rules for `75.2.113.119` and `99.83.136.103`, in-sandbox endpoint-block verification, `openclaw tui`, `hello`, and final TUI assertion.

The passing assertion was:

`PASS: openclaw tui surfaced a visible unreachable-inference error and stopped the spinner`

The dispatch command for reruns while this job only exists on the PR branch is:

```bash
gh workflow run nightly-e2e.yaml --repo NVIDIA/NemoClaw \
  --ref issue-4434-openclaw-2026-5-27-proof \
  -f target_ref=5f549f661fe81b485f75903146512af4225d4698 \
  -f pr_number=4437 \
  -f jobs=issue-4434-tui-unreachable-inference-e2e
```

## Remaining release note

- Baseline: #4434 already captures the `v0.0.53` / OpenClaw `2026.5.22` spinner/no-error behavior after the exact firewall block. I did not rerun the mutating baseline repro from this macOS host.
- Exact `Dockerfile.base` build was blocked locally because this Docker install does not provide `docker buildx`, while `Dockerfile.base` uses BuildKit `RUN --mount`. The runtime Docker path and a base-style OpenClaw install smoke both passed.

## Summary by CodeRabbit

* **Tests**
  * Added an opt-in live E2E repro and new unit/integration tests for TUI behavior when inference endpoints are unreachable, validating visible error reporting, spinner shutdown, and compatibility with updated runtime/followup-runner shapes.
* **Chores**
  * Bumped OpenClaw/runtime to 2026.5.27 across builds, manifests, docs, and test expectations.
* **Chores / CI**
  * Added a selective/nightly E2E job to run the repro, include its results in aggregated reports, and upload sanitized logs with sensitive tokens redacted.

---------

Co-authored-by: cjagwani <cjagwani@nvidia.com>

fix(gateway): broadcast agent-run error payloads

a230fcc

openclaw-barnacle Bot added the labels app: web-ui (App: web-ui), gateway (Gateway runtime), size: S, and maintainer (Maintainer-authored PR) on May 22, 2026

vincentkoc self-assigned this May 22, 2026

clawsweeper Bot added the labels rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.) and status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.) on May 22, 2026

vincentkoc merged commit 07e61fc into main May 22, 2026
143 of 148 checks passed

vincentkoc deleted the fix/gateway-agent-error-payload-broadcast branch May 22, 2026 12:58

This was referenced May 22, 2026

LLM idle timeout error silently dropped when agentRunStarted is true #84945

Closed

Codex app-server: long agent replies silently truncated at ~1000-1100 chars (stop=null, aborted=false) #84516

Open

github-actions Bot mentioned this pull request May 22, 2026

📡 Upstream Digest — 2026-05-22 14:22 UTC curtismercier/openclaw-mods#916

Open

steipete mentioned this pull request May 22, 2026

fix(gateway): surface resolved chat errors #84953

Closed

4 tasks

SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026

fix(gateway): broadcast agent-run error payloads (openclaw#85355)

c4bc67b

SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026

fix(gateway): broadcast agent-run error payloads (openclaw#85355)

ad0666b

SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026

fix(gateway): broadcast agent-run error payloads (openclaw#85355)

b327d9c

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026

fix(gateway): broadcast agent-run error payloads (openclaw#85355)

39c64af

galiniliev pushed a commit to galiniliev/openclaw that referenced this pull request May 25, 2026

fix(gateway): broadcast agent-run error payloads (openclaw#85355)

3fafc58

SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026

fix(gateway): broadcast agent-run error payloads (openclaw#85355)

ae42129

SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026

fix(gateway): broadcast agent-run error payloads (openclaw#85355)

e338370

SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026

fix(gateway): broadcast agent-run error payloads (openclaw#85355)

fbb1d66

ericksoa mentioned this pull request May 28, 2026

test: prove OpenClaw 2026.5.27 resolves #4434 unreachable inference TUI error NVIDIA/NemoClaw#4437

Merged

jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026

fix(gateway): broadcast agent-run error payloads (openclaw#85355)

39f96f5

SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026

fix(gateway): broadcast agent-run error payloads (openclaw#85355)

cbba84c

sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026

fix(gateway): broadcast agent-run error payloads (openclaw#85355)

ba51d56

fix(gateway): broadcast agent-run error payloads #85355
vincentkoc merged 1 commit into main from fix/gateway-agent-error-payload-broadcast
