Feat/fix dashboard timeout error display by scotthuang · Pull Request #85815 · openclaw/openclaw

scotthuang · 2026-05-23T19:56:53Z

Summary

Problem: When the chat.send RPC handler hits a synchronous error (e.g., LLM timeout), the catch (err) block only calls respond() to return an RPC error and does NOT call broadcastChatError(). The Dashboard UI relies on WebSocket chat events (state: "error") to update the interface, so it stays stuck on a loading spinner with no error message.
Why it matters: Users see the UI hang indefinitely when LLM requests time out or fail, with no feedback about what went wrong. That leaves them thinking the system is broken.
What changed: Added a broadcastChatError() call in the synchronous catch (err) block (line 3158) of the chat.send handler, mirroring the async .catch() path (line 3117) that already had this call.
What did NOT change (scope boundary): No changes to async error handling, no changes to other RPC handlers, no UI changes. Only the synchronous error path in chat.send was fixed.
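
A minimal sketch of the shape of this fix, for readers without the diff open. The types, the `startRun` stand-in, and the `handleChatSend` name are hypothetical and are not the gateway's real signatures; only `respond()`, `broadcastChatError()`, and the runId / sessionKey / errorMessage parameters come from this PR text.

```typescript
// Hypothetical shapes: the real gateway types are not shown in this PR text.
type Respond = (
  ok: boolean,
  payload?: unknown,
  error?: { code: string; message: string },
) => void;

interface ChatErrorBroadcast {
  runId: string;
  sessionKey: string;
  errorMessage: string;
}

interface ChatSendDeps {
  respond: Respond;
  broadcastChatError: (event: ChatErrorBroadcast) => void;
  // Stand-in for whatever synchronous work can throw before async dispatch.
  startRun: (runId: string, sessionKey: string) => void;
}

function handleChatSend(deps: ChatSendDeps, runId: string, sessionKey: string): void {
  try {
    deps.startRun(runId, sessionKey);
    deps.respond(true, { runId });
  } catch (err) {
    const errorMessage = String(err);
    // RPC-level failure: only the caller of chat.send sees this.
    deps.respond(false, undefined, { code: "UNAVAILABLE", message: errorMessage });
    // The added step: fan the failure out as a chat event so UI clients
    // listening on the WebSocket can leave the loading state.
    deps.broadcastChatError({ runId, sessionKey, errorMessage });
  }
}
```

The point of the sketch is that respond() and broadcastChatError() reach different audiences, so the catch block needs both.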

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Closes #
Related #
This PR fixes a bug or regression

Root Cause (if applicable)

Root cause: The catch (err) block in chat.send handler (line 3158) was missing the broadcastChatError() call that sends WebSocket events to Dashboard UI. Only respond() was called, which returns an RPC error but doesn't update the UI state.
Missing detection / guardrail: The synchronous error path (for errors thrown before the async dispatch) was not properly broadcasting errors to connected WebSocket clients.
Contributing context: Dashboard UI listens for WebSocket chat events with state: "error" to display errors. Without broadcastChatError(), the UI never receives the error event and stays in loading state forever.

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- Unit test
- Seamless / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file: src/gateway/server-methods/chat.error-broadcast.test.ts
Scenario the test should lock in: When chat.send encounters a synchronous error (e.g., session load failure), broadcastChatError() must be called so Dashboard UI receives the error event.
Why this is the smallest reliable guardrail: Tests the exact missing path (synchronous catch block) without requiring full agent dispatch setup.
Existing test that already covers this (if any): src/gateway/server-methods/chat.directive-tags.test.ts (similar patterns exist)
If no new test is added, why not: N/A - new test was added

User-visible / Behavior Changes

Dashboard UI now properly displays error messages (e.g., "LLM timeout") when chat.send fails synchronously, instead of hanging with a loading spinner forever.
Error appears as a chat message with error state, not just a silent RPC failure.

Diagram (if applicable)

Before:
[User sends message] -> [chat.send RPC] -> [Error thrown] -> [respond() called] -> [UI hangs, no error shown]

After:
[User sends message] -> [chat.send RPC] -> [Error thrown] -> [respond() called] -> [broadcastChatError() called] -> [UI shows error message]

Security Impact (required)

New permissions/capabilities? (No)
Secrets/tokens handling changed? (No)
New/changed network calls? (No)
Command/tool execution surface changed? (No)
Data access scope changed? (No)
If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

OS: macOS
Runtime/container: Node 22+
Model/provider: N/A (error injection)
Integration/channel (if any): Dashboard UI
Relevant config (redacted): N/A

Steps

Open Dashboard UI at http://localhost:19001
Open browser DevTools → Network → filter WS
Send message containing timeout-test (triggers injected error for testing)
Observe UI behavior and WebSocket messages

Expected

UI shows error message "LLM timeout" (or similar)
Loading spinner stops
WebSocket receives chat event with state: "error" and errorMessage
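
When checking the WebSocket frames by hand, a small type guard like the following can confirm the expected payload. The field names (runId, sessionKey, state, errorMessage) are taken from this PR description; the rest of the payload shape and any outer frame envelope are assumptions, so the guard is meant for the chat payload once it has been extracted.

```typescript
// Field names follow the PR text; the full payload shape is an assumption.
interface ChatErrorEvent {
  runId: string;
  sessionKey: string;
  state: "error";
  errorMessage: string;
}

function isChatErrorEvent(value: unknown): value is ChatErrorEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    v["state"] === "error" &&
    typeof v["runId"] === "string" &&
    typeof v["sessionKey"] === "string" &&
    typeof v["errorMessage"] === "string"
  );
}
```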

Actual

Before fix: UI hangs with loading spinner forever, no error shown
After fix: UI displays error message, loading stops

Evidence

Attach at least one:

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Video: Before fix (UI stuck with loading spinner)

2026-05-24.3.55.49.mov

Before fix (from gateway.log):

[CHAT.SEND] Throwing timeout error directly...
[ws] ⇄ res ✗ chat.send errorCode=UNAVAILABLE errorMessage=Error: LLM timeout

UI hangs, no error displayed.

Video: After fix (error message displayed correctly)

2026-05-24.3.43.11.mov

After fix:

[CHAT.SEND] Throwing timeout error directly...
[ws] ⇄ res ✗ chat.send errorCode=UNAVAILABLE errorMessage=Error: LLM timeout
<-- WebSocket chat event: { state: "error", errorMessage: "Error: LLM timeout" }

UI shows error message correctly.

Real behavior proof

Behavior addressed: Dashboard UI now properly displays error messages when chat.send fails synchronously (e.g., LLM timeout), instead of hanging with a loading spinner forever.
Real environment tested: macOS 26.3 arm64, Node v22+, OpenClaw built from source (pnpm build → node dist/index.js gateway --port 19001), model hy3-preview via tencent-tokenhub, Dashboard UI at http://localhost:19001.
Exact steps or command run after this patch:
1. Built source with fix: pnpm build
2. Started dev instance: openclaw-dev restart (port 19001, state dir ~/.openclaw-dev/)
3. Opened Dashboard UI: http://localhost:19001
4. Opened browser DevTools → Network → WS → Messages
5. Sent message: hello timeout-test (triggers injected timeout error in chat.send handler)
6. Observed UI behavior and WebSocket messages
Evidence after fix: Screen recording showing the UI displays "LLM timeout" error message instead of hanging with loading spinner. Gateway log shows broadcastChatError() was called (via WebSocket chat event with state: "error"). Test file src/gateway/server-methods/chat.error-broadcast.test.ts passes.
Observed result after fix: UI receives WebSocket chat event with state: "error" and errorMessage: "Error: LLM timeout". Loading spinner stops. Error message appears in chat. Test passes: broadcastChatError() is called when synchronous error occurs in chat.send.
What was not tested: Other RPC handlers that may have similar issues (separate investigation needed), real LLM timeout (vs injected error), other Dashboard UI sessions (only tested with dashboard: session key).

Human Verification (required)

What you personally verified (not just CI), and how:

Verified scenarios: Sent timeout-test message in Dashboard UI at http://localhost:19001, confirmed error message appears instead of hanging. Verified WebSocket message format matches expected structure (runId, sessionKey, state: "error", errorMessage). Ran unit test: pnpm test src/gateway/server-methods/chat.error-broadcast.test.ts.
Edge cases checked: Verified WebSocket event format is consistent with async error path (line 3117). Checked that broadcastChatError() is called with correct parameters (runId, sessionKey, errorMessage).
What you did not verify: Other RPC handlers that may have similar missing broadcastChatError() calls, real LLM timeout scenario (only tested with injected error), performance impact of additional broadcast call.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

Backward compatible? (Yes)
Config/env changes? (No)
Migration needed? (No)
If yes, exact upgrade steps: N/A

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

Risk: broadcastChatError() might be called twice if both sync and async paths trigger (e.g., error in sync path followed by async cleanup error).
- Mitigation: broadcastChatError() is idempotent for the same runId due to agentRunSeq tracking and dedup. The WebSocket clients handle duplicate events gracefully.
Risk: Change affects error visibility for all chat.send failures, not just timeouts.
- Mitigation: This is intentional - all errors should be visible to users, not just timeouts. Previously, async errors were shown but sync errors were not.
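
For the double-broadcast risk, here is one way per-run idempotence could be arranged. This is an illustrative sketch only: `TerminalRunTracker` and `broadcastErrorOnce` are hypothetical names, and the gateway's actual agentRunSeq tracking and dedup are not shown in this PR, so nothing here should be read as its implementation.

```typescript
// Hypothetical helper: remembers which runs have already emitted a terminal event.
// A real implementation would also need to evict old run ids; omitted here.
class TerminalRunTracker {
  private readonly finished = new Set<string>();

  // Returns true only the first time a run is marked terminal.
  markTerminal(runId: string): boolean {
    if (this.finished.has(runId)) return false;
    this.finished.add(runId);
    return true;
  }
}

function broadcastErrorOnce(
  tracker: TerminalRunTracker,
  send: (runId: string, errorMessage: string) => void,
  runId: string,
  errorMessage: string,
): boolean {
  // A second call for the same run (e.g. sync and async paths both firing) is a no-op.
  if (!tracker.markTerminal(runId)) return false;
  send(runId, errorMessage);
  return true;
}
```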

clawsweeper · 2026-05-23T20:04:00Z

Codex review: needs maintainer review before merge.

Latest ClawSweeper review: 2026-05-24 01:01 UTC / May 23, 2026, 9:01 PM ET.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

PR Surface
Source +6, Tests +65. Total +71 across 2 files.

View PR surface stats

Area	Files	Added	Net
Source	1	6	+6
Tests	1	65	+65
Docs	0	0	0
Config	0	0	0
Generated	0	0	0
Other	0	0	0
Total	2	71	+71

Summary
The PR adds broadcastChatError() to the synchronous chat.send catch path and adds a regression test for addChatRun throwing before async dispatch.

Reproducibility: yes. Current main can throw synchronously after clientRunId and sessionKey are established, then respond with an RPC error without sending the chat error event the dashboard uses to clear the run state.

PR rating
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster ✨ media proof bonus
Patch quality: 🐚 platinum hermit
Summary: Strong visual proof and a small, well-targeted patch make this a normal good PR with no blocking findings from this review.

Rank-up moves:

none

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Sufficient (recording): The PR body provides before/after dashboard recordings, local steps, and log snippets; inspected frames show the after-fix dashboard rendering Error: LLM timeout instead of remaining in progress.

Risk before merge

This read-only review did not execute the new Vitest file; final merge confidence should rely on CI or a maintainer-run node scripts/run-vitest.mjs src/gateway/server-methods/chat.error-broadcast.test.ts.
This patch intentionally fixes only the synchronous catch path; the related agent-started idle-timeout path remains separate in fix(gateway): broadcast idle timeout errors to clients after agent run started #85176.

Maintainer options:

Decide the mitigation before merge
Merge the narrow gateway broadcast fix after required CI and maintainer review, while leaving the separate agent-started timeout path to its own PR.
Pause or close
Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge
No ClawSweeper repair lane is needed because there are no blocking findings; the remaining action is ordinary maintainer review and CI.

Security
Cleared: The diff only adds gateway error fanout and a focused unit test; it does not touch dependencies, CI, credentials, permissions, or command execution paths.

Review details

Best possible solution:

Merge the narrow gateway broadcast fix after required CI and maintainer review, while leaving the separate agent-started timeout path to its own PR.

Do we have a high-confidence way to reproduce the issue?

Yes. Current main can throw synchronously after clientRunId and sessionKey are established, then respond with an RPC error without sending the chat error event the dashboard uses to clear the run state.

Is this the best way to solve the issue?

Yes. Mirroring the existing async broadcastChatError() call inside the synchronous catch is the narrow owner-boundary fix and avoids protocol, config, or UI changes.

Label changes:

add proof: sufficient: Contributor real behavior proof is sufficient. The PR body provides before/after dashboard recordings, local steps, and log snippets; inspected frames show the after-fix dashboard rendering Error: LLM timeout instead of remaining in progress.

Label justifications:

P2: The PR fixes a normal-priority dashboard chat failure mode where synchronous chat.send errors leave users without visible error feedback, with limited gateway/UI blast radius.
rating: 🐚 platinum hermit: Current PR rating is 🐚 platinum hermit because proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit. Strong visual proof and a small, well-targeted patch make this a normal good PR with no blocking findings from this review.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (recording): The PR body provides before/after dashboard recordings, local steps, and log snippets; inspected frames show the after-fix dashboard rendering Error: LLM timeout instead of remaining in progress.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body provides before/after dashboard recordings, local steps, and log snippets; inspected frames show the after-fix dashboard rendering Error: LLM timeout instead of remaining in progress.
proof: 🎥 video: Contributor real behavior proof includes video or recording evidence. The PR body provides before/after dashboard recordings, local steps, and log snippets; inspected frames show the after-fix dashboard rendering Error: LLM timeout instead of remaining in progress.

What I checked:

Current main synchronous catch: Current main records the synchronous chat.send error in dedupe and calls respond(false, ...), but the catch block does not emit the chat error event that UI clients consume. (src/gateway/server-methods/chat.ts:3158, acf265d4d51d)
Existing async error contract: The async .catch() path already calls broadcastChatError() with context, runId, sessionKey, and errorMessage, so the PR mirrors an established path instead of adding a new protocol shape. (src/gateway/server-methods/chat.ts:3147, acf265d4d51d)
Broadcast helper behavior: broadcastChatError() emits a chat payload with state: "error", sends it to the session node, and clears the run sequence for the run. (src/gateway/server-methods/chat.ts:1958, acf265d4d51d)
Dashboard error-event consumer: The Control UI chat controller treats state: "error" as a terminal event, reconciles the active run as failed, and sets lastError from errorMessage. (ui/src/ui/controllers/chat.ts:754, acf265d4d51d)
Live PR diff: The live PR head b677b7262291a00a13fb97b72955fb3d589cc35e changes only src/gateway/server-methods/chat.ts and adds src/gateway/server-methods/chat.error-broadcast.test.ts; the source change is the six-line error broadcast call. (src/gateway/server-methods/chat.ts:3178, b677b7262291)
Real behavior proof inspected: Downloaded and inspected the before/after recordings from the PR body: the before frame shows the chat stuck in progress after hello timeout-test, while the after frame shows Error: LLM timeout rendered in the dashboard. (b677b7262291)
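
To illustrate the consumer side described in the checks above, a minimal sketch of how a client might treat state: "error" as terminal. The state shape and the `applyChatEvent` name are hypothetical; the real controller in ui/src/ui/controllers/chat.ts is not reproduced here, and only the ideas of a terminal error event and a lastError field come from the review.

```typescript
// Hypothetical UI state, not the Control UI's real controller state.
interface ChatUiState {
  activeRunId: string | null;
  loading: boolean;
  lastError: string | null;
}

interface ChatEvent {
  runId: string;
  state: string;
  errorMessage?: string;
}

function applyChatEvent(state: ChatUiState, event: ChatEvent): ChatUiState {
  // Ignore events for runs the UI is not currently waiting on.
  if (event.runId !== state.activeRunId) return state;
  if (event.state === "error") {
    // Terminal event: stop the spinner and surface the message.
    return {
      activeRunId: null,
      loading: false,
      lastError: event.errorMessage ?? "Unknown error",
    };
  }
  return state;
}
```

Under this model an RPC error response alone never changes `loading`, which matches the stuck-spinner symptom reported before the fix.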

Likely related people:

steipete: Local blame attributes the current broadcastChatError() helper and chat.send error-path structure to a705a9c911; recent path history also shows repeated gateway and UI chat maintenance by this account. (role: recent area contributor; confidence: high; commits: a705a9c911bc, f739edcf4c7e, b22926601fca; files: src/gateway/server-methods/chat.ts, ui/src/ui/controllers/chat.ts)
vincentkoc: Recent main history includes fix(gateway): broadcast agent-run error payloads, which is directly adjacent to this PR's chat error fanout behavior. (role: adjacent error-broadcast contributor; confidence: medium; commits: 07e61fc847e5, 02908db62b30, cff991c88d04; files: src/gateway/server-methods/chat.ts, ui/src/ui/controllers/chat.ts)
BunsDev: Recent UI history includes Control UI run-status cleanup and responsiveness work in the same chat lifecycle surface that consumes terminal chat events. (role: recent Control UI chat-state contributor; confidence: medium; commits: 4935e24c7a7f, 60171e863882, 6b3cd9043ee6; files: ui/src/ui/controllers/chat.ts, ui/src/ui/app-gateway.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against acf265d4d51d.

clawsweeper · 2026-05-23T20:10:56Z

ClawSweeper PR egg

✨ Hatched: 🌱 uncommon Sunspot Signal Puff

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

Merged PRs are hatchable.
Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🌱 uncommon.
Trait: finds missing screenshots.
Image traits: location flaky test forest; accessory little merge flag; palette plum, gold, and soft gray; mood curious; pose leaning over a miniature review desk; shell frosted glass shell; lighting bright celebratory glints; background little resolved-comment flags.

What is this egg doing here?

Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

… barrel Fixes the chat error-broadcast regression test so it can resolve its type import. The previous `../types.js` path does not exist in the gateway tree; the shared types are re-exported from `src/gateway/server-methods/types.ts`, so the test must use `./types.js`. Addresses ClawSweeper review on PR openclaw#85815.

scotthuang · 2026-05-23T20:46:28Z

Fixed the P1 finding: changed the test type import to the local server-methods barrel (./types.js) instead of the non-existent ../types.js.

Verified locally with the exact command from the review:

node scripts/run-vitest.mjs run src/gateway/server-methods/chat.error-broadcast.test.ts

Result:

Test Files  2 passed (2)
     Tests  2 passed (2)

Pushed as e7d953f70b.

@clawsweeper re-review

clawsweeper · 2026-05-23T20:46:31Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26343172648
Updated: 2026-05-23T20:57:17.903Z

steipete · 2026-05-24T01:25:09Z

Landed. Proof before merge: focused gateway chat error-broadcast regression test, autoreview clean, and live CI green.

Merge commit: a668982

Thanks @scotthuang.

scotthuang · 2026-05-24T04:48:57Z

@clawsweeper hatch

clawsweeper · 2026-05-24T04:49:00Z

🦞👀
ClawSweeper could not hatch this PR egg yet.

Reason: hatch requires an open pull request.

* fix(gateway): broadcast error to UI when chat.send fails synchronously
* test(gateway): verify broadcastChatError is called on chat.send error
* test(gateway): import GatewayRequestContext from local server-methods barrel

Fixes the chat error-broadcast regression test so it can resolve its type import. The previous `../types.js` path does not exist in the gateway tree; the shared types are re-exported from `src/gateway/server-methods/types.ts`, so the test must use `./types.js`. Addresses ClawSweeper review on PR openclaw#85815.

---------

Co-authored-by: scotthuang <scotthuang@tencent.com>

…UI error (#4437)

## Release target

Refs #4434. This PR targets `v0.0.55`; #4434 should remain open until this OpenClaw upgrade is merged, tagged, and verified in the shipped `.55` release.

## Why this resolves #4434

NemoClaw #4434 reports that `openclaw tui` keeps an active spinner and `connected` status with no visible terminal error when the NVIDIA inference endpoint is unreachable. This branch moves the sandbox OpenClaw pin from `2026.5.22` to `2026.5.27` with npm integrity:

`sha512-2N93zhdAo88KAbHt6T7KvYXf4s7XIkYXBgv1npYpn7e1Y9FvrtgtpsA38my9rtFW+70uXEojRPX5/OqnuDqJPw==`

Upstream proof:

- openclaw/openclaw#85815 and openclaw/openclaw@a668982 fix the missing `broadcastChatError()` call for synchronous `chat.send` failures.
- openclaw/openclaw#84945 and openclaw/openclaw#85355 show the broader real class of gateway errors not being broadcast to clients.

## Changes

- Bumps `Dockerfile`, `Dockerfile.base`, `agents/openclaw/manifest.yaml`, and package metadata to OpenClaw `2026.5.27`.
- Updates OpenClaw pin/integrity tests, deployment/version tests, and the existing TUI chat-correlation E2E assertion.
- Updates `scripts/patch-openclaw-chat-send.js` so NemoClaw's chat-send run-id preservation shim still recognizes the compiled OpenClaw `2026.5.27` followup-runner admission shape.
- Adds a CI-safe Vitest contract harness for the #4434 TUI failure signature and expected visible-error behavior.
- Adds the privileged live repro: `test/e2e/test-issue-4434-tui-unreachable-inference.sh`.
- Wires that live repro into `nightly-e2e.yaml` as `issue-4434-tui-unreachable-inference-e2e`, including selective dispatch, public-install target-ref handling, failure artifacts, aggregate reporting coverage, and trusted workflow-script checkout for the secret/sudo firewall job.

## Local validation

- `npm ci`
- `npm ci --include=dev`
- `npm run build:cli`
- `npm run typecheck:cli`
- `npm test -- test/fetch-guard-patch-regression.test.ts test/openclaw-chat-send-patch.test.ts test/openclaw-tui-chat-correlation.test.ts test/issue-4434-tui-unreachable-inference.test.ts`
- `npm test -- src/lib/sandbox/version.test.ts src/lib/verify-deployment.test.ts`
- `npm test -- test/validate-e2e-coverage.test.ts test/e2e-advisor-dispatch.test.ts test/e2e-script-workflow.test.ts test/issue-4434-tui-unreachable-inference.test.ts nemoclaw/src/package-metadata.test.ts`
- `shellcheck test/e2e/test-issue-4434-tui-unreachable-inference.sh`
- `bash -n test/e2e/test-issue-4434-tui-unreachable-inference.sh`
- `bash -n test/e2e/test-openclaw-tui-chat-correlation.sh`
- `NEMOCLAW_ISSUE_4434_LIVE=0 bash test/e2e/test-issue-4434-tui-unreachable-inference.sh`
- `git diff --check`
- Fresh `npm pack openclaw@2026.5.27` dist smoke with `node scripts/patch-openclaw-chat-send.js "$tmp/package/dist"`
- Runtime Docker smoke: `docker build -f Dockerfile --build-arg BASE_IMAGE=ghcr.io/nvidia/nemoclaw/sandbox-base:latest -t nemoclaw-issue4434-openclaw-runtime-smoke:2026-5-27 .`
- Runtime image version smoke: `docker run --rm --entrypoint openclaw nemoclaw-issue4434-openclaw-runtime-smoke:2026-5-27 --version` -> `OpenClaw 2026.5.27 (27ae826)`
- Base-style OpenClaw install smoke in Docker for the `2026.5.27` npm integrity and install path.
- Pre-commit suite on `98e0a763efe0925f26cf89129cd4ab63cb0b05f3`: passed, including CLI/plugin coverage hooks.
- Pre-push suite reran CLI/plugin coverage; one unrelated `test/nemoclaw-start.test.ts` case timed out during the full concurrent run, then passed directly with `npx vitest run --project cli test/nemoclaw-start.test.ts -t "captures baseline snapshot when openclaw.json is valid and no baseline exists"`.

## Nightly proof

Targeted nightly E2E passed on the final PR head:

- Run: https://github.com/NVIDIA/NemoClaw/actions/runs/26586935610
- Job: https://github.com/NVIDIA/NemoClaw/actions/runs/26586935610/job/78335355241
- Head: `5f549f661fe81b485f75903146512af4225d4698`
- Job: `issue-4434-tui-unreachable-inference-e2e`
- Duration: 8m27s

The live job runs the requested end-to-end flow on Linux with the repository `NVIDIA_API_KEY` secret: public install from this PR ref, cloud onboard with NVIDIA Endpoints and `nvidia/nemotron-3-super-120b-a12b`, pre-block `nemoclaw <sandbox> status`, pre-block `nemoclaw <sandbox> connect --probe-only`, exact `DOCKER-USER` `DROP` rules for `75.2.113.119` and `99.83.136.103`, in-sandbox endpoint-block verification, `openclaw tui`, `hello`, and final TUI assertion.

The passing assertion was:

`PASS: openclaw tui surfaced a visible unreachable-inference error and stopped the spinner`

The dispatch command for reruns while this job only exists on the PR branch is:

```bash
gh workflow run nightly-e2e.yaml --repo NVIDIA/NemoClaw \
  --ref issue-4434-openclaw-2026-5-27-proof \
  -f target_ref=5f549f661fe81b485f75903146512af4225d4698 \
  -f pr_number=4437 \
  -f jobs=issue-4434-tui-unreachable-inference-e2e
```

## Remaining release note

- Baseline: #4434 already captures the `v0.0.53` / OpenClaw `2026.5.22` spinner/no-error behavior after the exact firewall block. I did not rerun the mutating baseline repro from this macOS host.
- Exact `Dockerfile.base` build was blocked locally because this Docker install does not provide `docker buildx`, while `Dockerfile.base` uses BuildKit `RUN --mount`. The runtime Docker path and a base-style OpenClaw install smoke both passed.

## Summary by CodeRabbit

* **Tests**
  * Added an opt-in live E2E repro and new unit/integration tests for TUI behavior when inference endpoints are unreachable, validating visible error reporting, spinner shutdown, and compatibility with updated runtime/followup-runner shapes.
* **Chores**
  * Bumped OpenClaw/runtime to 2026.5.27 across builds, manifests, docs, and test expectations.
* **Chores / CI**
  * Added a selective/nightly E2E job to run the repro, include its results in aggregated reports, and upload sanitized logs with sensitive tokens redacted.

---------

Co-authored-by: cjagwani <cjagwani@nvidia.com>

openclaw-barnacle Bot added app: web-ui App: web-ui gateway Gateway runtime size: S proof: supplied External PR includes structured after-fix real behavior proof. labels May 23, 2026

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 23, 2026

steipete force-pushed the feat/fix-dashboard-timeout-error-display branch from e7d953f to 6ad1283 Compare May 23, 2026 23:57

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 23, 2026

clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 24, 2026

scotthuang added 3 commits May 24, 2026 01:50

fix(gateway): broadcast error to UI when chat.send fails synchronously

0811354

test(gateway): verify broadcastChatError is called on chat.send error

eed2d34

steipete force-pushed the feat/fix-dashboard-timeout-error-display branch from 6ad1283 to b677b72 Compare May 24, 2026 00:56

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 24, 2026

clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 24, 2026

steipete merged commit a668982 into openclaw:main May 24, 2026
110 checks passed

github-actions Bot mentioned this pull request May 24, 2026

📡 Upstream Digest — 2026-05-24 02:32 UTC curtismercier/openclaw-mods#930

Open

ericksoa mentioned this pull request May 28, 2026

test: prove OpenClaw 2026.5.27 resolves #4434 unreachable inference TUI error NVIDIA/NemoClaw#4437

Merged

steipete merged 3 commits into openclaw:main from scotthuang:feat/fix-dashboard-timeout-error-display
