Feat/fix dashboard timeout error display #85815

Merged
steipete merged 3 commits into openclaw:main from scotthuang:feat/fix-dashboard-timeout-error-display
May 24, 2026

Conversation

@scotthuang

Contributor

Summary

  • Problem: When the chat.send RPC handler hits a synchronous error (e.g., an LLM timeout), the catch (err) block only calls respond() to return an RPC error and does NOT call broadcastChatError(). The Dashboard UI relies on WebSocket chat events (state: "error") to update the interface, so it stays stuck on a loading spinner with no error message.
  • Why it matters: Users see the UI hang indefinitely when an LLM request times out or fails, with no feedback about what went wrong, which leaves them thinking the system is broken.
  • What changed: Added a broadcastChatError() call in the synchronous catch (err) block (line 3158) of the chat.send handler, mirroring the async .catch() path (line 3117), which already made this call.
  • What did NOT change (scope boundary): No changes to async error handling, no changes to other RPC handlers, no UI changes. Only the synchronous error path in chat.send was fixed.
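
The change described above can be illustrated with a minimal sketch. It is not the actual gateway code: the names respond and broadcastChatError come from this PR description, but their signatures, the Deps type, the startRun stand-in, and the "started" status are assumptions made only for illustration.

```typescript
// Hypothetical shapes: the real gateway types are not shown in this PR description.
type ChatErrorEvent = {
  runId: string;
  sessionKey: string;
  state: "error";
  errorMessage: string;
};

type Deps = {
  // Returns the RPC result to the caller (name from the PR text, signature assumed).
  respond: (ok: boolean, payload?: unknown, error?: { code: string; message: string }) => void;
  // Fans the error out to WebSocket clients (name from the PR text, signature assumed).
  broadcastChatError: (event: ChatErrorEvent) => void;
  // Hypothetical stand-in for work that can throw before async dispatch starts.
  startRun: (runId: string, sessionKey: string) => void;
};

function handleChatSend(deps: Deps, runId: string, sessionKey: string): void {
  try {
    deps.startRun(runId, sessionKey);
    deps.respond(true, { runId, status: "started" });
  } catch (err) {
    const errorMessage = String(err);
    // Before the fix, only this RPC error was returned.
    deps.respond(false, undefined, { code: "UNAVAILABLE", message: errorMessage });
    // After the fix, the chat error event is also sent to connected clients.
    deps.broadcastChatError({ runId, sessionKey, state: "error", errorMessage });
  }
}
```

In this sketch the RPC caller still receives the failure through respond(), while every connected client additionally receives an event it can use to leave the loading state.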

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: The catch (err) block in the chat.send handler (line 3158) was missing the broadcastChatError() call that sends WebSocket events to the Dashboard UI. Only respond() was called, which returns an RPC error but does not update the UI state.
  • Missing detection / guardrail: The synchronous error path (errors thrown before the async dispatch) did not broadcast errors to connected WebSocket clients.
  • Contributing context: The Dashboard UI listens for WebSocket chat events with state: "error" to display errors. Without broadcastChatError(), the UI never receives the error event and stays in the loading state indefinitely.
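
To make the contributing context concrete, here is a rough sketch of how a client might consume such events. The Dashboard UI code is not part of this PR, so ChatViewState, applyChatEvent, and the fallback message are hypothetical; only the state: "error" and errorMessage fields are taken from the description.

```typescript
// Hypothetical client-side model; the real Dashboard UI state is not shown in this PR.
type ChatEvent = { runId: string; state: string; errorMessage?: string };

type ChatViewState = {
  activeRunId: string | null;
  loading: boolean;
  error: string | null;
};

// Pure reducer: returns the next view state for an incoming WebSocket chat event.
function applyChatEvent(view: ChatViewState, event: ChatEvent): ChatViewState {
  // Events for other runs do not affect the active run.
  if (event.runId !== view.activeRunId) return view;
  if (event.state === "error") {
    // The error event is what ends the loading state and surfaces the message.
    return {
      activeRunId: null,
      loading: false,
      error: event.errorMessage ?? "Unknown error",
    };
  }
  return view;
}
```

Under this model, a client that only ever receives the RPC error and no chat event keeps loading: true, which matches the hang described above.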

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seamless / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/gateway/server-methods/chat.error-broadcast.test.ts
  • Scenario the test should lock in: When chat.send encounters a synchronous error (e.g., session load failure), broadcastChatError() must be called so Dashboard UI receives the error event.
  • Why this is the smallest reliable guardrail: Tests the exact missing path (synchronous catch block) without requiring full agent dispatch setup.
  • Existing test that already covers this (if any): src/gateway/server-methods/chat.directive-tags.test.ts (similar patterns exist)
  • If no new test is added, why not: N/A - new test was added

User-visible / Behavior Changes

  • Dashboard UI now properly displays error messages (e.g., "LLM timeout") when chat.send fails synchronously, instead of hanging with a loading spinner forever.
  • Error appears as a chat message with error state, not just a silent RPC failure.

Diagram (if applicable)

Before:
[User sends message] -> [chat.send RPC] -> [Error thrown] -> [respond() called] -> [UI hangs, no error shown]

After:
[User sends message] -> [chat.send RPC] -> [Error thrown] -> [respond() called] -> [broadcastChatError() called] -> [UI shows error message]

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 22+
  • Model/provider: N/A (error injection)
  • Integration/channel (if any): Dashboard UI
  • Relevant config (redacted): N/A

Steps

  1. Open Dashboard UI at http://localhost:19001
  2. Open browser DevTools → Network → filter WS
  3. Send message containing timeout-test (triggers injected error for testing)
  4. Observe UI behavior and WebSocket messages

Expected

  • UI shows error message "LLM timeout" (or similar)
  • Loading spinner stops
  • WebSocket receives chat event with state: "error" and errorMessage

Actual

  • Before fix: UI hangs with loading spinner forever, no error shown
  • After fix: UI displays error message, loading stops

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Video: Before fix (UI stuck with loading spinner)

2026-05-24.3.55.49.mov

Before fix (from gateway.log):

[CHAT.SEND] Throwing timeout error directly...
[ws] ⇄ res ✗ chat.send errorCode=UNAVAILABLE errorMessage=Error: LLM timeout

UI hangs, no error displayed.

Video: After fix (error message displayed correctly)

2026-05-24.3.43.11.mov

After fix:

[CHAT.SEND] Throwing timeout error directly...
[ws] ⇄ res ✗ chat.send errorCode=UNAVAILABLE errorMessage=Error: LLM timeout
<-- WebSocket chat event: { state: "error", errorMessage: "Error: LLM timeout" }

UI shows error message correctly.

Real behavior proof

  • Behavior addressed: Dashboard UI now properly displays error messages when chat.send fails synchronously (e.g., LLM timeout), instead of hanging with a loading spinner forever.
  • Real environment tested: macOS 26.3 arm64, Node v22+, OpenClaw built from source (pnpm build, then node dist/index.js gateway --port 19001), model hy3-preview via tencent-tokenhub, Dashboard UI at http://localhost:19001.
  • Exact steps or command run after this patch:
    1. Built source with fix: pnpm build
    2. Started dev instance: openclaw-dev restart (port 19001, state dir ~/.openclaw-dev/)
    3. Opened Dashboard UI: http://localhost:19001
    4. Opened browser DevTools → Network → WS → Messages
    5. Sent message: hello timeout-test (triggers injected timeout error in chat.send handler)
    6. Observed UI behavior and WebSocket messages
  • Evidence after fix: Screen recording showing the UI displays "LLM timeout" error message instead of hanging with loading spinner. Gateway log shows broadcastChatError() was called (via WebSocket chat event with state: "error"). Test file src/gateway/server-methods/chat.error-broadcast.test.ts passes.
  • Observed result after fix: UI receives WebSocket chat event with state: "error" and errorMessage: "Error: LLM timeout". Loading spinner stops. Error message appears in chat. Test passes: broadcastChatError() is called when synchronous error occurs in chat.send.
  • What was not tested: Other RPC handlers that may have similar issues (separate investigation needed), real LLM timeout (vs injected error), other Dashboard UI sessions (only tested with dashboard: session key).

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: Sent timeout-test message in Dashboard UI at http://localhost:19001, confirmed error message appears instead of hanging. Verified WebSocket message format matches expected structure (runId, sessionKey, state: "error", errorMessage). Ran unit test: pnpm test src/gateway/server-methods/chat.error-broadcast.test.ts.
  • Edge cases checked: Verified WebSocket event format is consistent with async error path (line 3117). Checked that broadcastChatError() is called with correct parameters (runId, sessionKey, errorMessage).
  • What you did not verify: Other RPC handlers that may have similar missing broadcastChatError() calls, real LLM timeout scenario (only tested with injected error), performance impact of additional broadcast call.
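
When checking frames by hand in DevTools, a small runtime check can confirm the event shape listed above (runId, sessionKey, state: "error", errorMessage). This is a sketch under assumptions: the four field names come from this PR, but the helper name isChatErrorEvent is hypothetical, and the real frame may wrap the payload in an envelope that is not shown here.

```typescript
// Field names are taken from the PR text; the type name itself is an assumption.
type ChatErrorEvent = {
  runId: string;
  sessionKey: string;
  state: "error";
  errorMessage: string;
};

// Type guard: true only when all four fields are present with the expected types.
function isChatErrorEvent(value: unknown): value is ChatErrorEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["runId"] === "string" &&
    typeof v["sessionKey"] === "string" &&
    v["state"] === "error" &&
    typeof v["errorMessage"] === "string"
  );
}
```

A payload copied from the WS Messages tab could be passed through JSON.parse and then through this guard to check that the sync and async paths emit the same structure.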

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

  • Risk: broadcastChatError() might be called twice if both sync and async paths trigger (e.g., error in sync path followed by async cleanup error).
    • Mitigation: broadcastChatError() is idempotent for the same runId due to agentRunSeq tracking and dedup. The WebSocket clients handle duplicate events gracefully.
  • Risk: Change affects error visibility for all chat.send failures, not just timeouts.
    • Mitigation: This is intentional - all errors should be visible to users, not just timeouts. Previously, async errors were shown but sync errors were not.
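
The duplicate-broadcast mitigation can be pictured with a simple guard. This sketch does not reproduce the agentRunSeq tracking mentioned above, whose implementation is not shown in this PR; createOnceBroadcaster is a hypothetical helper that only illustrates the general idea of sending at most one event per runId.

```typescript
// Hypothetical helper: wraps a send function so each runId is broadcast at most once.
function createOnceBroadcaster<E extends { runId: string }>(
  send: (event: E) => void,
): (event: E) => boolean {
  const seenRunIds = new Set<string>();
  return (event: E): boolean => {
    // A second event for the same run is dropped instead of re-sent.
    if (seenRunIds.has(event.runId)) return false;
    seenRunIds.add(event.runId);
    send(event);
    return true;
  };
}
```

The returned boolean reports whether the event was actually sent, so a caller could log suppressed duplicates. A long-lived process would also need to evict old run IDs from the set, which this sketch omits.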

@openclaw-barnacle openclaw-barnacle Bot added app: web-ui App: web-ui gateway Gateway runtime size: S proof: supplied External PR includes structured after-fix real behavior proof. labels May 23, 2026
@clawsweeper

clawsweeper Bot commented May 23, 2026

Contributor

Codex review: needs maintainer review before merge.

Latest ClawSweeper review: 2026-05-24 01:01 UTC / May 23, 2026, 9:01 PM ET.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

PR Surface
Source +6, Tests +65. Total +71 across 2 files.

View PR surface stats
Area       Files  Added  Removed  Net
Source     1      6      0        +6
Tests      1      65     0        +65
Docs       0      0      0        0
Config     0      0      0        0
Generated  0      0      0        0
Other      0      0      0        0
Total      2      71     0        +71

Summary
The PR adds broadcastChatError() to the synchronous chat.send catch path and adds a regression test for addChatRun throwing before async dispatch.

Reproducibility: yes. Current main can throw synchronously after clientRunId and sessionKey are established, then respond with an RPC error without sending the chat error event the dashboard uses to clear the run state.

PR rating
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster ✨ media proof bonus
Patch quality: 🐚 platinum hermit
Summary: Strong visual proof and a small, well-targeted patch make this a normal good PR with no blocking findings from this review.

Rank-up moves:

  • none
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Sufficient (recording): The PR body provides before/after dashboard recordings, local steps, and log snippets; inspected frames show the after-fix dashboard rendering Error: LLM timeout instead of remaining in progress.

Risk before merge

Maintainer options:

  1. Decide the mitigation before merge
    Merge the narrow gateway broadcast fix after required CI and maintainer review, while leaving the separate agent-started timeout path to its own PR.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge
No ClawSweeper repair lane is needed because there are no blocking findings; the remaining action is ordinary maintainer review and CI.

Security
Cleared: The diff only adds gateway error fanout and a focused unit test; it does not touch dependencies, CI, credentials, permissions, or command execution paths.

Review details

Best possible solution:

Merge the narrow gateway broadcast fix after required CI and maintainer review, while leaving the separate agent-started timeout path to its own PR.

Do we have a high-confidence way to reproduce the issue?

Yes. Current main can throw synchronously after clientRunId and sessionKey are established, then respond with an RPC error without sending the chat error event the dashboard uses to clear the run state.

Is this the best way to solve the issue?

Yes. Mirroring the existing async broadcastChatError() call inside the synchronous catch is the narrow owner-boundary fix and avoids protocol, config, or UI changes.

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body provides before/after dashboard recordings, local steps, and log snippets; inspected frames show the after-fix dashboard rendering Error: LLM timeout instead of remaining in progress.

Label justifications:

  • P2: The PR fixes a normal-priority dashboard chat failure mode where synchronous chat.send errors leave users without visible error feedback, with limited gateway/UI blast radius.
  • rating: 🐚 platinum hermit: Current PR rating is 🐚 platinum hermit because proof is 🦞 diamond lobster, patch quality is 🐚 platinum hermit, and strong visual proof plus a small, well-targeted patch make this a normal good PR with no blocking findings from this review.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (recording): The PR body provides before/after dashboard recordings, local steps, and log snippets; inspected frames show the after-fix dashboard rendering Error: LLM timeout instead of remaining in progress.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body provides before/after dashboard recordings, local steps, and log snippets; inspected frames show the after-fix dashboard rendering Error: LLM timeout instead of remaining in progress.
  • proof: 🎥 video: Contributor real behavior proof includes video or recording evidence. The PR body provides before/after dashboard recordings, local steps, and log snippets; inspected frames show the after-fix dashboard rendering Error: LLM timeout instead of remaining in progress.

What I checked:

Likely related people:

  • steipete: Local blame attributes the current broadcastChatError() helper and chat.send error-path structure to a705a9c911; recent path history also shows repeated gateway and UI chat maintenance by this account. (role: recent area contributor; confidence: high; commits: a705a9c911bc, f739edcf4c7e, b22926601fca; files: src/gateway/server-methods/chat.ts, ui/src/ui/controllers/chat.ts)
  • vincentkoc: Recent main history includes fix(gateway): broadcast agent-run error payloads, which is directly adjacent to this PR's chat error fanout behavior. (role: adjacent error-broadcast contributor; confidence: medium; commits: 07e61fc847e5, 02908db62b30, cff991c88d04; files: src/gateway/server-methods/chat.ts, ui/src/ui/controllers/chat.ts)
  • BunsDev: Recent UI history includes Control UI run-status cleanup and responsiveness work in the same chat lifecycle surface that consumes terminal chat events. (role: recent Control UI chat-state contributor; confidence: medium; commits: 4935e24c7a7f, 60171e863882, 6b3cd9043ee6; files: ui/src/ui/controllers/chat.ts, ui/src/ui/app-gateway.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against acf265d4d51d.

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: 🎥 video Contributor real behavior proof includes video or recording evidence. rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. P2 Normal backlog priority with limited blast radius. labels May 23, 2026
@clawsweeper

clawsweeper Bot commented May 23, 2026

Contributor

ClawSweeper PR egg

✨ Hatched: 🌱 uncommon Sunspot Signal Puff

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🌱 uncommon.
Trait: finds missing screenshots.
Image traits: location flaky test forest; accessory little merge flag; palette plum, gold, and soft gray; mood curious; pose leaning over a miniature review desk; shell frosted glass shell; lighting bright celebratory glints; background little resolved-comment flags.
Share on X: post this hatch
Copy: My PR egg hatched a 🌱 uncommon Sunspot Signal Puff in ClawSweeper.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

scotthuang pushed a commit to scotthuang/openclaw that referenced this pull request May 23, 2026
… barrel

Fixes the chat error-broadcast regression test so it can resolve its
type import. The previous `../types.js` path does not exist in the
gateway tree; the shared types are re-exported from
`src/gateway/server-methods/types.ts`, so the test must use `./types.js`.

Addresses ClawSweeper review on PR openclaw#85815.
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 23, 2026
@scotthuang

Contributor Author

Fixed the P1 finding: changed the test type import to the local server-methods barrel (./types.js) instead of the non-existent ../types.js.

Verified locally with the exact command from the review:

node scripts/run-vitest.mjs run src/gateway/server-methods/chat.error-broadcast.test.ts

Result:

Test Files  2 passed (2)
     Tests  2 passed (2)

Pushed as e7d953f70b.

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 23, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 23, 2026
@steipete steipete force-pushed the feat/fix-dashboard-timeout-error-display branch from e7d953f to 6ad1283 Compare May 23, 2026 23:57
steipete pushed a commit to scotthuang/openclaw that referenced this pull request May 23, 2026
… barrel

Fixes the chat error-broadcast regression test so it can resolve its
type import. The previous `../types.js` path does not exist in the
gateway tree; the shared types are re-exported from
`src/gateway/server-methods/types.ts`, so the test must use `./types.js`.

Addresses ClawSweeper review on PR openclaw#85815.
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 23, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 24, 2026
scotthuang added 3 commits May 24, 2026 01:50
… barrel

Fixes the chat error-broadcast regression test so it can resolve its
type import. The previous `../types.js` path does not exist in the
gateway tree; the shared types are re-exported from
`src/gateway/server-methods/types.ts`, so the test must use `./types.js`.

Addresses ClawSweeper review on PR openclaw#85815.
@steipete steipete force-pushed the feat/fix-dashboard-timeout-error-display branch from 6ad1283 to b677b72 Compare May 24, 2026 00:56
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 24, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 24, 2026
@steipete steipete merged commit a668982 into openclaw:main May 24, 2026
110 checks passed
@steipete

Contributor

Landed. Proof before merge: focused gateway chat error-broadcast regression test, autoreview clean, and live CI green.

Merge commit: a668982

Thanks @scotthuang.

@scotthuang

Contributor Author

@clawsweeper hatch

@clawsweeper

clawsweeper Bot commented May 24, 2026

Contributor

🦞👀
ClawSweeper could not hatch this PR egg yet.

Reason: hatch requires an open pull request.

SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026
* fix(gateway): broadcast error to UI when chat.send fails synchronously

* test(gateway): verify broadcastChatError is called on chat.send error

* test(gateway): import GatewayRequestContext from local server-methods barrel

Fixes the chat error-broadcast regression test so it can resolve its
type import. The previous `../types.js` path does not exist in the
gateway tree; the shared types are re-exported from
`src/gateway/server-methods/types.ts`, so the test must use `./types.js`.

Addresses ClawSweeper review on PR openclaw#85815.

---------

Co-authored-by: scotthuang <scotthuang@tencent.com>
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
* fix(gateway): broadcast error to UI when chat.send fails synchronously

* test(gateway): verify broadcastChatError is called on chat.send error

* test(gateway): import GatewayRequestContext from local server-methods barrel

Fixes the chat error-broadcast regression test so it can resolve its
type import. The previous `../types.js` path does not exist in the
gateway tree; the shared types are re-exported from
`src/gateway/server-methods/types.ts`, so the test must use `./types.js`.

Addresses ClawSweeper review on PR openclaw#85815.

---------

Co-authored-by: scotthuang <scotthuang@tencent.com>
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026
* fix(gateway): broadcast error to UI when chat.send fails synchronously

* test(gateway): verify broadcastChatError is called on chat.send error

* test(gateway): import GatewayRequestContext from local server-methods barrel

Fixes the chat error-broadcast regression test so it can resolve its
type import. The previous `../types.js` path does not exist in the
gateway tree; the shared types are re-exported from
`src/gateway/server-methods/types.ts`, so the test must use `./types.js`.

Addresses ClawSweeper review on PR openclaw#85815.

---------

Co-authored-by: scotthuang <scotthuang@tencent.com>
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
* fix(gateway): broadcast error to UI when chat.send fails synchronously

* test(gateway): verify broadcastChatError is called on chat.send error

* test(gateway): import GatewayRequestContext from local server-methods barrel

Fixes the chat error-broadcast regression test so it can resolve its
type import. The previous `../types.js` path does not exist in the
gateway tree; the shared types are re-exported from
`src/gateway/server-methods/types.ts`, so the test must use `./types.js`.

Addresses ClawSweeper review on PR openclaw#85815.

---------

Co-authored-by: scotthuang <scotthuang@tencent.com>
SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026
* fix(gateway): broadcast error to UI when chat.send fails synchronously

* test(gateway): verify broadcastChatError is called on chat.send error

* test(gateway): import GatewayRequestContext from local server-methods barrel

Fixes the chat error-broadcast regression test so it can resolve its
type import. The previous `../types.js` path does not exist in the
gateway tree; the shared types are re-exported from
`src/gateway/server-methods/types.ts`, so the test must use `./types.js`.

Addresses ClawSweeper review on PR openclaw#85815.

---------

Co-authored-by: scotthuang <scotthuang@tencent.com>
cv pushed a commit to NVIDIA/NemoClaw that referenced this pull request Jun 4, 2026
…UI error (#4437)

## Release target

Refs #4434. This PR targets `v0.0.55`; #4434 should remain open until
this OpenClaw upgrade is merged, tagged, and verified in the shipped
`.55` release.

## Why this resolves #4434

NemoClaw #4434 reports that `openclaw tui` keeps an active spinner and
`connected` status with no visible terminal error when the NVIDIA
inference endpoint is unreachable. This branch moves the sandbox
OpenClaw pin from `2026.5.22` to `2026.5.27` with npm integrity:


`sha512-2N93zhdAo88KAbHt6T7KvYXf4s7XIkYXBgv1npYpn7e1Y9FvrtgtpsA38my9rtFW+70uXEojRPX5/OqnuDqJPw==`

Upstream proof:

- openclaw/openclaw#85815 and
openclaw/openclaw@a668982
fix the missing `broadcastChatError()` call for synchronous `chat.send`
failures.
- openclaw/openclaw#84945 and
openclaw/openclaw#85355 show the broader real
class of gateway errors not being broadcast to clients.

## Changes

- Bumps `Dockerfile`, `Dockerfile.base`,
`agents/openclaw/manifest.yaml`, and package metadata to OpenClaw
`2026.5.27`.
- Updates OpenClaw pin/integrity tests, deployment/version tests, and
the existing TUI chat-correlation E2E assertion.
- Updates `scripts/patch-openclaw-chat-send.js` so NemoClaw's chat-send
run-id preservation shim still recognizes the compiled OpenClaw
`2026.5.27` followup-runner admission shape.
- Adds a CI-safe Vitest contract harness for the #4434 TUI failure
signature and expected visible-error behavior.
- Adds the privileged live repro:
`test/e2e/test-issue-4434-tui-unreachable-inference.sh`.
- Wires that live repro into `nightly-e2e.yaml` as
`issue-4434-tui-unreachable-inference-e2e`, including selective
dispatch, public-install target-ref handling, failure artifacts,
aggregate reporting coverage, and trusted workflow-script checkout for
the secret/sudo firewall job.

## Local validation

- `npm ci`
- `npm ci --include=dev`
- `npm run build:cli`
- `npm run typecheck:cli`
- `npm test -- test/fetch-guard-patch-regression.test.ts
test/openclaw-chat-send-patch.test.ts
test/openclaw-tui-chat-correlation.test.ts
test/issue-4434-tui-unreachable-inference.test.ts`
- `npm test -- src/lib/sandbox/version.test.ts
src/lib/verify-deployment.test.ts`
- `npm test -- test/validate-e2e-coverage.test.ts
test/e2e-advisor-dispatch.test.ts test/e2e-script-workflow.test.ts
test/issue-4434-tui-unreachable-inference.test.ts
nemoclaw/src/package-metadata.test.ts`
- `shellcheck test/e2e/test-issue-4434-tui-unreachable-inference.sh`
- `bash -n test/e2e/test-issue-4434-tui-unreachable-inference.sh`
- `bash -n test/e2e/test-openclaw-tui-chat-correlation.sh`
- `NEMOCLAW_ISSUE_4434_LIVE=0 bash
test/e2e/test-issue-4434-tui-unreachable-inference.sh`
- `git diff --check`
- Fresh `npm pack openclaw@2026.5.27` dist smoke with `node
scripts/patch-openclaw-chat-send.js "$tmp/package/dist"`
- Runtime Docker smoke: `docker build -f Dockerfile --build-arg
BASE_IMAGE=ghcr.io/nvidia/nemoclaw/sandbox-base:latest -t
nemoclaw-issue4434-openclaw-runtime-smoke:2026-5-27 .`
- Runtime image version smoke: `docker run --rm --entrypoint openclaw
nemoclaw-issue4434-openclaw-runtime-smoke:2026-5-27 --version` ->
`OpenClaw 2026.5.27 (27ae826)`
- Base-style OpenClaw install smoke in Docker for the `2026.5.27` npm
integrity and install path.
- Pre-commit suite on `98e0a763efe0925f26cf89129cd4ab63cb0b05f3`:
passed, including CLI/plugin coverage hooks.
- Pre-push suite reran CLI/plugin coverage; one unrelated
`test/nemoclaw-start.test.ts` case timed out during the full concurrent
run, then passed when run directly with `npx vitest run --project cli
test/nemoclaw-start.test.ts -t "captures baseline snapshot when
openclaw.json is valid and no baseline exists"`.

## Nightly proof

Targeted nightly E2E passed on the final PR head:

- Run: https://github.com/NVIDIA/NemoClaw/actions/runs/26586935610
- Job URL:
https://github.com/NVIDIA/NemoClaw/actions/runs/26586935610/job/78335355241
- Head: `5f549f661fe81b485f75903146512af4225d4698`
- Job name: `issue-4434-tui-unreachable-inference-e2e`
- Duration: 8m27s

The live job runs the requested end-to-end flow on Linux with the
repository `NVIDIA_API_KEY` secret: public install from this PR ref,
cloud onboard with NVIDIA Endpoints and
`nvidia/nemotron-3-super-120b-a12b`, pre-block `nemoclaw <sandbox>
status`, pre-block `nemoclaw <sandbox> connect --probe-only`, exact
`DOCKER-USER` `DROP` rules for `75.2.113.119` and `99.83.136.103`,
in-sandbox endpoint-block verification, `openclaw tui`, `hello`, and
final TUI assertion.
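
The firewall step in that flow could be generated along these lines. This is only a sketch under stated assumptions: the helper names are hypothetical, and it assumes each rule matches on destination IP and is inserted at the head of the `DOCKER-USER` chain, which may not be exactly how the live script builds its rules.

```typescript
// Assumption: destination-match DROP rules in the DOCKER-USER chain.
// The actual E2E script may construct its rules differently.
const BLOCKED_IPS: readonly string[] = ["75.2.113.119", "99.83.136.103"];

type RuleAction = "insert" | "delete";

function dockerUserDropRule(ip: string, action: RuleAction): string[] {
  // -I inserts the rule at the top of the chain; -D removes the same rule.
  const flag = action === "insert" ? "-I" : "-D";
  return ["iptables", flag, "DOCKER-USER", "-d", ip, "-j", "DROP"];
}

function blockCommands(ips: readonly string[]): string[] {
  return ips.map((ip) => dockerUserDropRule(ip, "insert").join(" "));
}

function cleanupCommands(ips: readonly string[]): string[] {
  return ips.map((ip) => dockerUserDropRule(ip, "delete").join(" "));
}
```

Building the insert and delete forms from the same function keeps cleanup symmetric with the block, so a failed run is less likely to leave stale `DROP` rules on the runner.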

The passing assertion was:

`PASS: openclaw tui surfaced a visible unreachable-inference error and
stopped the spinner`

The dispatch command for reruns while this job only exists on the PR
branch is:

```bash
gh workflow run nightly-e2e.yaml --repo NVIDIA/NemoClaw \
  --ref issue-4434-openclaw-2026-5-27-proof \
  -f target_ref=5f549f661fe81b485f75903146512af4225d4698 \
  -f pr_number=4437 \
  -f jobs=issue-4434-tui-unreachable-inference-e2e
```

## Remaining release note

- Baseline: #4434 already captures the `v0.0.53` / OpenClaw `2026.5.22`
spinner/no-error behavior after the exact firewall block. I did not
rerun the mutating baseline repro from this macOS host.
- Exact `Dockerfile.base` build was blocked locally because this Docker
install does not provide `docker buildx`, while `Dockerfile.base` uses
BuildKit `RUN --mount`. The runtime Docker path and a base-style
OpenClaw install smoke both passed.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
  * Added an opt-in live E2E repro and new unit/integration tests for TUI
    behavior when inference endpoints are unreachable, validating visible
    error reporting, spinner shutdown, and compatibility with updated
    runtime/followup-runner shapes.

* **Chores**
  * Bumped OpenClaw/runtime to 2026.5.27 across builds, manifests, docs,
    and test expectations.

* **Chores / CI**
  * Added a selective/nightly E2E job to run the repro, include its
    results in aggregated reports, and upload sanitized logs with sensitive
    tokens redacted.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: cjagwani <cjagwani@nvidia.com>

sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
* fix(gateway): broadcast error to UI when chat.send fails synchronously

* test(gateway): verify broadcastChatError is called on chat.send error

* test(gateway): import GatewayRequestContext from local server-methods barrel

Fixes the chat error-broadcast regression test so it can resolve its
type import. The previous `../types.js` path does not exist in the
gateway tree; the shared types are re-exported from
`src/gateway/server-methods/types.ts`, so the test must use `./types.js`.

Addresses ClawSweeper review on PR openclaw#85815.

---------

Co-authored-by: scotthuang <scotthuang@tencent.com>

Labels

- app: web-ui (App: web-ui)
- gateway (Gateway runtime)
- P2 (Normal backlog priority with limited blast radius.)
- proof: sufficient (ClawSweeper judged the real behavior proof convincing.)
- proof: supplied (External PR includes structured after-fix real behavior proof.)
- proof: 🎥 video (Contributor real behavior proof includes video or recording evidence.)
- rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.)
- size: S
- status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.)
