fix #71992: [Bug]: Control UI webchat duplicates every assistant reply on 2026.4.21 — regression from #5964/#39469 by zhangguiping-xydt · Pull Request #88786 · openclaw/openclaw

zhangguiping-xydt · 2026-05-31T20:50:58Z

Summary

Fixes [Bug]: Control UI webchat duplicates every assistant reply on 2026.4.21 — regression from #5964/#39469 #71992 by clearing active Control UI WebChat stream/run state before committing terminal assistant messages.
Preserves final and aborted fallback text behavior by snapshotting the live stream before lifecycle reconciliation.
Adds regression assertions that final and aborted terminal commits happen only after chatStream and chatRunId are cleared.

Root Cause

Root cause: Terminal final / aborted events could append an assistant message into chatMessages while the same assistant content still existed in the live chatStream / chatRunId state. Because the Control UI renders committed history and the live stream item separately, Lit batching could briefly render the completed assistant reply twice.
Why this is root-cause fix: The terminal path now snapshots any fallback stream text, reconciles the run lifecycle first, and only then commits the assistant message. That removes the overlapping committed-message + live-stream render state instead of hiding duplicate DOM output after the fact.
Fix classification: Minimal root-cause UI state-ordering fix for the Control UI WebChat renderer.
Patch quality notes: The diff is limited to WebChat terminal event handling and focused regression coverage; no protocol, schema, provider, or persisted transcript format changes are included.

Real behavior proof

Behavior or issue addressed: Control UI WebChat should render a completed assistant reply exactly once after a send completes, with no leftover streaming bubble for the same assistant turn.
Real environment tested: Local OpenClaw gateway serving the real Control UI from this PR head (095b9b496f1e320b9ff61617c2b2b4eb0f92e186) on loopback, driven by Playwright in Chromium. The gateway used the repo's deterministic qa mock-openai provider so the proof exercises a real browser WebChat send without depending on external model billing/auth.
Exact steps or command run after this patch:
- Started pnpm openclaw qa mock-openai --host 127.0.0.1 --port 44180.
- Started node scripts/run-node.mjs gateway run with OPENCLAW_CONFIG_PATH=/tmp/openclaw-dev-pr88786-mock-proof/.openclaw-dev/openclaw.json and gateway port 19190.
- Ran node /media/vdc/code/ai/aispace/openclaw-pr-88786-evidence/control-ui-send-proof-after-sync.mjs with OPENCLAW_PROOF_SESSION=dailyfix-88786-mock-proof and OPENCLAW_EXPECTED_REPLY=daily-fix-88786-proof.

Evidence after fix: Copied proof result from /media/vdc/code/ai/aispace/openclaw-pr-88786-evidence/control-ui-send-proof-after-sync.json plus screenshot artifact /media/vdc/code/ai/aispace/openclaw-pr-88786-evidence/control-ui-after-one-reply-after-sync.png (sha256:05db04780c0b9b707d78920c714fae19b34d9586ddc448d0a40cbeb755eadc14).

{
  "url": "http://127.0.0.1:19190/chat?session=dailyfix-88786-mock-proof#token=<redacted>",
  "prompt": "reply exactly `daily-fix-88786-proof`",
  "expectedReply": "daily-fix-88786-proof",
  "before": {
    "messageCount": 0,
    "userCount": 0,
    "assistantCount": 0,
    "chatRunId": null,
    "chatStream": null,
    "liveStreamingBubbleCount": 0
  },
  "after": {
    "messageCount": 2,
    "userCount": 1,
    "assistantCount": 1,
    "messageSummaries": [
      { "role": "user", "text": "reply exactly `daily-fix-88786-proof`" },
      { "role": "assistant", "text": "daily-fix-88786-proof" }
    ],
    "assistantGroups": [
      { "text": "daily-fix-88786-proof Assistant 2026年6月3日 0:45", "bubbleCount": 1, "streamingBubbleCount": 0 }
    ],
    "chatRunId": null,
    "chatStream": null,
    "liveStreamingBubbleCount": 0
  },
  "result": {
    "newUserMessages": 1,
    "newAssistantMessages": 1,
    "newAssistantTexts": ["daily-fix-88786-proof"],
    "chatRunIdCleared": true,
    "chatStreamCleared": true,
    "liveStreamingBubbleCount": 0,
    "completedAssistantReply": true,
    "exactlyOneCompletedAssistantReply": true
  }
}

Observed result after fix: One WebChat send added exactly one user message and exactly one completed assistant message. The completed assistant text was daily-fix-88786-proof; chatRunId and chatStream were both null; .chat-bubble.streaming count was 0; the rendered assistant group had one bubble.
What was not tested: This proof does not claim to fix the separate persisted-transcript structural duplicate shape described in BUG: Control UI (webchat) double-records assistant messages in session JSONL #39469. It also does not depend on a third-party live model response; model-provider billing/auth behavior is outside this UI rendering fix.

Review findings addressed

RF-001 / RF-002 / RF-003: Added an after-fix real Control UI WebChat send proof with copied JSON, runtime log, and screenshot artifact showing one completed assistant reply and no live streaming duplicate.
RF-004: Explicitly out of scope: persisted transcript structural duplicates from BUG: Control UI (webchat) double-records assistant messages in session JSONL #39469 are a separate backend/history shape and are not treated as fixed by this PR.
RF-005: No further code change was needed after review; the missing item was contributor-supplied WebChat behavior proof, now recorded against current head.

Regression Test Plan

Target test file: ui/src/ui/controllers/chat.test.ts.
Ran node scripts/run-vitest.mjs ui/src/ui/controllers/chat.test.ts.
Ran pnpm tsgo:test:ui.
Ran git diff --check.
Ran the Control UI WebChat Playwright proof described above against the current PR head.

Merge risk

Why it matters / User impact: Users sending messages in Control UI WebChat should not see every assistant response duplicated after the terminal event lands.
Risk labels considered: app: web-ui, size: S, impact:session-state, impact:message-loss.
Risk explanation: The change touches only terminal event ordering in Control UI chat state. The main risk is losing fallback text when a terminal payload does not include a visible assistant message; the implementation snapshots fallback text before clearing stream state to preserve that behavior.
Why acceptable: Unit coverage asserts final and aborted terminal commits happen after stream/run clearing, and the browser proof shows a real WebChat send completing with one rendered assistant reply and no streaming duplicate.
Maintainer-ready confidence: High for the client-side duplicate visible reply in [Bug]: Control UI webchat duplicates every assistant reply on 2026.4.21 — regression from #5964/#39469 #71992; separate persisted transcript duplicates remain out of scope as documented above.

clawsweeper · 2026-05-31T20:53:20Z

Codex review: needs maintainer review before merge. Reviewed June 2, 2026, 1:27 PM ET / 17:27 UTC.

Summary
The PR moves Control UI terminal WebChat final/aborted message commits after run/stream lifecycle reconciliation and adds regression tests for the assignment ordering.

PR surface: Source +5, Tests +30. Total +35 across 2 files.

Reproducibility: yes. source-level. On current main, active-run final/aborted handling can assign chatMessages while chatStream is still non-null, and the renderer appends the active stream separately.

Review metrics: none identified.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

[P1] The PR intentionally fixes only the visible client-side WebChat stream/history overlap; persisted transcript structural duplicate reports should remain separately tracked and not be considered solved by this patch.

Maintainer options:

Decide the mitigation before merge
Land the focused Control UI ordering fix for the client-side WebChat duplicate-render path while keeping backend transcript dedupe as separate work.
Pause or close
Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

[P2] No repair lane is needed because the PR already contains the narrow code/test change and sufficient after-fix behavior proof; remaining action is normal maintainer merge validation.

Security
Cleared: The diff only changes Control UI TypeScript state ordering and tests, with no dependency, CI, secret, package, or supply-chain surface touched.

Review details

Best possible solution:

Land the focused Control UI ordering fix for the client-side WebChat duplicate-render path while keeping backend transcript dedupe as separate work.

Do we have a high-confidence way to reproduce the issue?

Yes, source-level. On current main, active-run final/aborted handling can assign chatMessages while chatStream is still non-null, and the renderer appends the active stream separately.

Is this the best way to solve the issue?

Yes. Reordering terminal lifecycle reconciliation before the committed assistant-message append fixes the state overlap at the owner boundary; renderer-side dedupe would only hide the duplicate output and backend transcript dedupe is a separate issue.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against f789081bae12.

Label changes

Label changes:

add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes copied after-fix WebChat Playwright output showing one completed assistant reply, cleared chatRunId/chatStream, and zero live streaming bubbles.
add rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes copied after-fix WebChat Playwright output showing one completed assistant reply, cleared chatRunId/chatStream, and zero live streaming bubbles.
remove rating: 🌊 off-meta tidepool: Current PR rating is rating: 🦞 diamond lobster, so this older rating label is no longer current.

Label justifications:

P2: This is a focused user-facing Control UI WebChat duplication regression with limited blast radius and no crash or security boundary impact.
rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes copied after-fix WebChat Playwright output showing one completed assistant reply, cleared chatRunId/chatStream, and zero live streaming bubbles.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes copied after-fix WebChat Playwright output showing one completed assistant reply, cleared chatRunId/chatStream, and zero live streaming bubbles.

Evidence reviewed

PR surface:

Source +5, Tests +30. Total +35 across 2 files.

View PR surface stats

Area	Files	Added	Removed	Net
Source	1	37	32	+5
Tests	1	31	1	+30
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	2	68	33	+35

What I checked:

Repository policy read: Root and scoped UI review policy were read fully; the root policy requires whole-path review and real behavior proof, and ui/AGENTS.md adds no conflicting rule for these files. (AGENTS.md:1, f789081bae12)
Current main has the overlap shape: On current main, the final and aborted branches assign state.chatMessages before calling reconcileTerminalRun, so a render scheduled by that mutation can still see live chatStream state. (ui/src/ui/controllers/chat.ts:1130, f789081bae12)
Lifecycle clear contract: reconcileChatRunLifecycle is the callee that clears chatStream/chatStreamStartedAt and chatRunId when called with clearChatStream and clearLocalRun. (ui/src/ui/chat/run-lifecycle.ts:177, f789081bae12)
Renderer consumes stream separately: buildChatItems appends a separate active stream item whenever props.stream is non-null, which explains why committed history plus uncleared stream can render as duplicate assistant output. (ui/src/ui/chat/build-chat-items.ts:646, f789081bae12)
Patch ordering: The PR snapshots visible/fallback terminal text, calls reconcileTerminalRun first, then appends the terminal assistant message, preserving fallback text while removing the transient overlap. (ui/src/ui/controllers/chat.ts:1129, 095b9b496f1e)
Regression test shape: The PR adds a chatMessages assignment tracker and asserts final/aborted commits happen after chatRunId and chatStream are already null. (ui/src/ui/controllers/chat.test.ts:82, 095b9b496f1e)

Likely related people:

vincentkoc: Recent GitHub file history shows multiple Control UI chat startup, history, and streaming changes in the same controller/test area. (role: recent area contributor; confidence: high; commits: c69a8d633d45, 43ced7bc4949, b3742b9edbd7; files: ui/src/ui/controllers/chat.ts, ui/src/ui/controllers/chat.test.ts)
steipete: GitHub file history shows recent Control UI chat and webchat-related work touching the same controller/test surface and surrounding chat state behavior. (role: recent adjacent contributor; confidence: high; commits: 2682c027746b, dd457474b391, 4f4d10863916; files: ui/src/ui/controllers/chat.ts, ui/src/ui/controllers/chat.test.ts)
BunsDev: Introduced the shared Control UI run lifecycle cleanup path that this PR now calls before terminal message commits. (role: adjacent lifecycle owner; confidence: medium; commits: 4935e24c7a7f, 60171e863882; files: ui/src/ui/chat/run-lifecycle.ts, ui/src/ui/controllers/chat.ts, ui/src/ui/controllers/chat.test.ts)
NVIDIAN: Local blame for the current final/aborted terminal-event block points at the current shallow-boundary commit by this author, so they may have recent context even though the full history is better represented by GitHub file history. (role: recent line owner; confidence: low; commits: eb417bc672e6; files: ui/src/ui/controllers/chat.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

zhangguiping-xydt · 2026-06-02T17:19:56Z

@clawsweeper re-review

clawsweeper · 2026-06-02T17:20:00Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26836267706
Updated: 2026-06-02T17:28:20.961Z

vincentkoc · 2026-06-03T08:45:47Z

Maintainer verification for f6244fb6808533f3128a7e5ff6fba409cf57ac67:

Reviewed the full Control UI terminal-event path: handleChatEvent now snapshots visible/fallback final text, clears the live run/stream through reconcileChatRunLifecycle, then commits the terminal assistant message. That is the right owner-boundary fix for the client-side chatMessages + chatStream render overlap.
Scoped caveat: this fixes the visible WebChat stream/final duplication race for [Bug]: Control UI webchat duplicates every assistant reply on 2026.4.21 — regression from #5964/#39469 #71992. It does not claim to fix broader persisted transcript duplication; that stays separate.
Local proof:
- git diff --check origin/main...HEAD
- node scripts/run-vitest.mjs run ui/src/ui/controllers/chat.test.ts ui/src/ui/chat/run-lifecycle.test.ts ui/src/ui/chat/build-chat-items.test.ts -> 3 files / 135 tests passed
- node scripts/run-vitest.mjs run ui/src/ui/app-chat.test.ts ui/src/ui/controllers/sessions.test.ts -> 2 shards / 158 tests passed
- node scripts/run-vitest.mjs run --config test/vitest/vitest.ui-e2e.config.ts --configLoader runner ui/src/ui/e2e/chat-flow.e2e.test.ts -> 11 mocked Control UI E2E tests passed
Remote proof: Blacksmith Testbox via Crabbox tbx_01kt6a4zn7awkdy12d6b0q2d1q, run https://github.com/openclaw/openclaw/actions/runs/26873514898, command node scripts/crabbox-wrapper.mjs run --provider blacksmith-testbox -- env OPENCLAW_CHECK_CHANGED_REMOTE_CHILD=1 OPENCLAW_CHANGED_LANES_RAW_SYNC=1 corepack pnpm check:changed, exit 0.
Autoreview: .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main -> clean, no accepted/actionable findings; overall patch correct 0.84.
PR CI after maintainer rebase/push: MERGEABLE/CLEAN, 121 pass / 10 skipped, no failing checks.

Best-fix verdict: best narrow fix for this UI race. Renderer-side dedupe would hide symptoms and backend transcript dedupe is a different bug class.

openclaw-barnacle Bot added app: web-ui App: web-ui size: S proof: supplied External PR includes structured after-fix real behavior proof. labels May 31, 2026

zhangguiping-xydt force-pushed the feat/issue-71992 branch 3 times, most recently from 2fe5f20 to 095b9b4 Compare June 2, 2026 16:50

vincentkoc self-assigned this Jun 3, 2026

fix(ui): clear chat stream before terminal commits

f6244fb

vincentkoc force-pushed the feat/issue-71992 branch from 095b9b4 to f6244fb Compare June 3, 2026 08:41

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 3, 2026

vincentkoc merged commit c2d7b4a into openclaw:main Jun 3, 2026
136 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix #71992: [Bug]: Control UI webchat duplicates every assistant reply on 2026.4.21 — regression from #5964/#39469#88786

fix #71992: [Bug]: Control UI webchat duplicates every assistant reply on 2026.4.21 — regression from #5964/#39469#88786
vincentkoc merged 1 commit into
openclaw:mainfrom
zhangguiping-xydt:feat/issue-71992

zhangguiping-xydt commented May 31, 2026 •

edited

Loading

Uh oh!

clawsweeper Bot commented May 31, 2026 •

edited

Loading

Uh oh!

zhangguiping-xydt commented Jun 2, 2026

Uh oh!

clawsweeper Bot commented Jun 2, 2026 •

edited

Loading

Uh oh!

vincentkoc commented Jun 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

zhangguiping-xydt commented May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root Cause

Real behavior proof

Review findings addressed

Regression Test Plan

Merge risk

Uh oh!

clawsweeper Bot commented May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zhangguiping-xydt commented Jun 2, 2026

Uh oh!

clawsweeper Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vincentkoc commented Jun 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

zhangguiping-xydt commented May 31, 2026 •

edited

Loading

clawsweeper Bot commented May 31, 2026 •

edited

Loading

clawsweeper Bot commented Jun 2, 2026 •

edited

Loading