Skip to content

fix #71992: [Bug]: Control UI webchat duplicates every assistant reply on 2026.4.21 — regression from #5964/#39469#88786

Merged
vincentkoc merged 1 commit into
openclaw:mainfrom
zhangguiping-xydt:feat/issue-71992
Jun 3, 2026
Merged

fix #71992: [Bug]: Control UI webchat duplicates every assistant reply on 2026.4.21 — regression from #5964/#39469#88786
vincentkoc merged 1 commit into
openclaw:mainfrom
zhangguiping-xydt:feat/issue-71992

Conversation

@zhangguiping-xydt

@zhangguiping-xydt zhangguiping-xydt commented May 31, 2026

Copy link
Copy Markdown
Contributor

Summary

Root Cause

  • Root cause: Terminal final / aborted events could append an assistant message into chatMessages while the same assistant content still existed in the live chatStream / chatRunId state. Because the Control UI renders committed history and the live stream item separately, Lit batching could briefly render the completed assistant reply twice.
  • Why this is root-cause fix: The terminal path now snapshots any fallback stream text, reconciles the run lifecycle first, and only then commits the assistant message. That removes the overlapping committed-message + live-stream render state instead of hiding duplicate DOM output after the fact.
  • Fix classification: Minimal root-cause UI state-ordering fix for the Control UI WebChat renderer.
  • Patch quality notes: The diff is limited to WebChat terminal event handling and focused regression coverage; no protocol, schema, provider, or persisted transcript format changes are included.

Real behavior proof

  • Behavior or issue addressed: Control UI WebChat should render a completed assistant reply exactly once after a send completes, with no leftover streaming bubble for the same assistant turn.

  • Real environment tested: Local OpenClaw gateway serving the real Control UI from this PR head (095b9b496f1e320b9ff61617c2b2b4eb0f92e186) on loopback, driven by Playwright in Chromium. The gateway used the repo's deterministic qa mock-openai provider so the proof exercises a real browser WebChat send without depending on external model billing/auth.

  • Exact steps or command run after this patch:

    • Started pnpm openclaw qa mock-openai --host 127.0.0.1 --port 44180.
    • Started node scripts/run-node.mjs gateway run with OPENCLAW_CONFIG_PATH=/tmp/openclaw-dev-pr88786-mock-proof/.openclaw-dev/openclaw.json and gateway port 19190.
    • Ran node /media/vdc/code/ai/aispace/openclaw-pr-88786-evidence/control-ui-send-proof-after-sync.mjs with OPENCLAW_PROOF_SESSION=dailyfix-88786-mock-proof and OPENCLAW_EXPECTED_REPLY=daily-fix-88786-proof.
  • Evidence after fix: Copied proof result from /media/vdc/code/ai/aispace/openclaw-pr-88786-evidence/control-ui-send-proof-after-sync.json plus screenshot artifact /media/vdc/code/ai/aispace/openclaw-pr-88786-evidence/control-ui-after-one-reply-after-sync.png (sha256:05db04780c0b9b707d78920c714fae19b34d9586ddc448d0a40cbeb755eadc14).

    {
      "url": "http://127.0.0.1:19190/chat?session=dailyfix-88786-mock-proof#token=<redacted>",
      "prompt": "reply exactly `daily-fix-88786-proof`",
      "expectedReply": "daily-fix-88786-proof",
      "before": {
        "messageCount": 0,
        "userCount": 0,
        "assistantCount": 0,
        "chatRunId": null,
        "chatStream": null,
        "liveStreamingBubbleCount": 0
      },
      "after": {
        "messageCount": 2,
        "userCount": 1,
        "assistantCount": 1,
        "messageSummaries": [
          { "role": "user", "text": "reply exactly `daily-fix-88786-proof`" },
          { "role": "assistant", "text": "daily-fix-88786-proof" }
        ],
        "assistantGroups": [
          { "text": "daily-fix-88786-proof Assistant 2026年6月3日 0:45", "bubbleCount": 1, "streamingBubbleCount": 0 }
        ],
        "chatRunId": null,
        "chatStream": null,
        "liveStreamingBubbleCount": 0
      },
      "result": {
        "newUserMessages": 1,
        "newAssistantMessages": 1,
        "newAssistantTexts": ["daily-fix-88786-proof"],
        "chatRunIdCleared": true,
        "chatStreamCleared": true,
        "liveStreamingBubbleCount": 0,
        "completedAssistantReply": true,
        "exactlyOneCompletedAssistantReply": true
      }
    }
  • Observed result after fix: One WebChat send added exactly one user message and exactly one completed assistant message. The completed assistant text was daily-fix-88786-proof; chatRunId and chatStream were both null; .chat-bubble.streaming count was 0; the rendered assistant group had one bubble.

  • What was not tested: This proof does not claim to fix the separate persisted-transcript structural duplicate shape described in BUG: Control UI (webchat) double-records assistant messages in session JSONL #39469. It also does not depend on a third-party live model response; model-provider billing/auth behavior is outside this UI rendering fix.

Review findings addressed

  • RF-001 / RF-002 / RF-003: Added an after-fix real Control UI WebChat send proof with copied JSON, runtime log, and screenshot artifact showing one completed assistant reply and no live streaming duplicate.
  • RF-004: Explicitly out of scope: persisted transcript structural duplicates from BUG: Control UI (webchat) double-records assistant messages in session JSONL #39469 are a separate backend/history shape and are not treated as fixed by this PR.
  • RF-005: No further code change was needed after review; the missing item was contributor-supplied WebChat behavior proof, now recorded against current head.

Regression Test Plan

  • Target test file: ui/src/ui/controllers/chat.test.ts.
  • Ran node scripts/run-vitest.mjs ui/src/ui/controllers/chat.test.ts.
  • Ran pnpm tsgo:test:ui.
  • Ran git diff --check.
  • Ran the Control UI WebChat Playwright proof described above against the current PR head.

Merge risk

  • Why it matters / User impact: Users sending messages in Control UI WebChat should not see every assistant response duplicated after the terminal event lands.
  • Risk labels considered: app: web-ui, size: S, impact:session-state, impact:message-loss.
  • Risk explanation: The change touches only terminal event ordering in Control UI chat state. The main risk is losing fallback text when a terminal payload does not include a visible assistant message; the implementation snapshots fallback text before clearing stream state to preserve that behavior.
  • Why acceptable: Unit coverage asserts final and aborted terminal commits happen after stream/run clearing, and the browser proof shows a real WebChat send completing with one rendered assistant reply and no streaming duplicate.
  • Maintainer-ready confidence: High for the client-side duplicate visible reply in [Bug]: Control UI webchat duplicates every assistant reply on 2026.4.21 — regression from #5964/#39469 #71992; separate persisted transcript duplicates remain out of scope as documented above.

@openclaw-barnacle openclaw-barnacle Bot added app: web-ui App: web-ui size: S proof: supplied External PR includes structured after-fix real behavior proof. labels May 31, 2026
@clawsweeper

clawsweeper Bot commented May 31, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge. Reviewed June 2, 2026, 1:27 PM ET / 17:27 UTC.

Summary
The PR moves Control UI terminal WebChat final/aborted message commits after run/stream lifecycle reconciliation and adds regression tests for the assignment ordering.

PR surface: Source +5, Tests +30. Total +35 across 2 files.

Reproducibility: yes. source-level. On current main, active-run final/aborted handling can assign chatMessages while chatStream is still non-null, and the renderer appends the active stream separately.

Review metrics: none identified.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • [P1] The PR intentionally fixes only the visible client-side WebChat stream/history overlap; persisted transcript structural duplicate reports should remain separately tracked and not be considered solved by this patch.

Maintainer options:

  1. Decide the mitigation before merge
    Land the focused Control UI ordering fix for the client-side WebChat duplicate-render path while keeping backend transcript dedupe as separate work.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • [P2] No repair lane is needed because the PR already contains the narrow code/test change and sufficient after-fix behavior proof; remaining action is normal maintainer merge validation.

Security
Cleared: The diff only changes Control UI TypeScript state ordering and tests, with no dependency, CI, secret, package, or supply-chain surface touched.

Review details

Best possible solution:

Land the focused Control UI ordering fix for the client-side WebChat duplicate-render path while keeping backend transcript dedupe as separate work.

Do we have a high-confidence way to reproduce the issue?

Yes, source-level. On current main, active-run final/aborted handling can assign chatMessages while chatStream is still non-null, and the renderer appends the active stream separately.

Is this the best way to solve the issue?

Yes. Reordering terminal lifecycle reconciliation before the committed assistant-message append fixes the state overlap at the owner boundary; renderer-side dedupe would only hide the duplicate output and backend transcript dedupe is a separate issue.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against f789081bae12.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes copied after-fix WebChat Playwright output showing one completed assistant reply, cleared chatRunId/chatStream, and zero live streaming bubbles.
  • add rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes copied after-fix WebChat Playwright output showing one completed assistant reply, cleared chatRunId/chatStream, and zero live streaming bubbles.
  • remove rating: 🌊 off-meta tidepool: Current PR rating is rating: 🦞 diamond lobster, so this older rating label is no longer current.

Label justifications:

  • P2: This is a focused user-facing Control UI WebChat duplication regression with limited blast radius and no crash or security boundary impact.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes copied after-fix WebChat Playwright output showing one completed assistant reply, cleared chatRunId/chatStream, and zero live streaming bubbles.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes copied after-fix WebChat Playwright output showing one completed assistant reply, cleared chatRunId/chatStream, and zero live streaming bubbles.
Evidence reviewed

PR surface:

Source +5, Tests +30. Total +35 across 2 files.

View PR surface stats
Area Files Added Removed Net
Source 1 37 32 +5
Tests 1 31 1 +30
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 2 68 33 +35

What I checked:

  • Repository policy read: Root and scoped UI review policy were read fully; the root policy requires whole-path review and real behavior proof, and ui/AGENTS.md adds no conflicting rule for these files. (AGENTS.md:1, f789081bae12)
  • Current main has the overlap shape: On current main, the final and aborted branches assign state.chatMessages before calling reconcileTerminalRun, so a render scheduled by that mutation can still see live chatStream state. (ui/src/ui/controllers/chat.ts:1130, f789081bae12)
  • Lifecycle clear contract: reconcileChatRunLifecycle is the callee that clears chatStream/chatStreamStartedAt and chatRunId when called with clearChatStream and clearLocalRun. (ui/src/ui/chat/run-lifecycle.ts:177, f789081bae12)
  • Renderer consumes stream separately: buildChatItems appends a separate active stream item whenever props.stream is non-null, which explains why committed history plus uncleared stream can render as duplicate assistant output. (ui/src/ui/chat/build-chat-items.ts:646, f789081bae12)
  • Patch ordering: The PR snapshots visible/fallback terminal text, calls reconcileTerminalRun first, then appends the terminal assistant message, preserving fallback text while removing the transient overlap. (ui/src/ui/controllers/chat.ts:1129, 095b9b496f1e)
  • Regression test shape: The PR adds a chatMessages assignment tracker and asserts final/aborted commits happen after chatRunId and chatStream are already null. (ui/src/ui/controllers/chat.test.ts:82, 095b9b496f1e)

Likely related people:

  • vincentkoc: Recent GitHub file history shows multiple Control UI chat startup, history, and streaming changes in the same controller/test area. (role: recent area contributor; confidence: high; commits: c69a8d633d45, 43ced7bc4949, b3742b9edbd7; files: ui/src/ui/controllers/chat.ts, ui/src/ui/controllers/chat.test.ts)
  • steipete: GitHub file history shows recent Control UI chat and webchat-related work touching the same controller/test surface and surrounding chat state behavior. (role: recent adjacent contributor; confidence: high; commits: 2682c027746b, dd457474b391, 4f4d10863916; files: ui/src/ui/controllers/chat.ts, ui/src/ui/controllers/chat.test.ts)
  • BunsDev: Introduced the shared Control UI run lifecycle cleanup path that this PR now calls before terminal message commits. (role: adjacent lifecycle owner; confidence: medium; commits: 4935e24c7a7f, 60171e863882; files: ui/src/ui/chat/run-lifecycle.ts, ui/src/ui/controllers/chat.ts, ui/src/ui/controllers/chat.test.ts)
  • NVIDIAN: Local blame for the current final/aborted terminal-event block points at the current shallow-boundary commit by this author, so they may have recent context even though the full history is better represented by GitHub file history. (role: recent line owner; confidence: low; commits: eb417bc672e6; files: ui/src/ui/controllers/chat.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal backlog priority with limited blast radius. labels May 31, 2026
@zhangguiping-xydt zhangguiping-xydt force-pushed the feat/issue-71992 branch 3 times, most recently from 2fe5f20 to 095b9b4 Compare June 2, 2026 16:50
@clawsweeper clawsweeper Bot added rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 2, 2026
@zhangguiping-xydt

Copy link
Copy Markdown
Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. labels Jun 2, 2026
@vincentkoc vincentkoc self-assigned this Jun 3, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 3, 2026
@vincentkoc

Copy link
Copy Markdown
Member

Maintainer verification for f6244fb6808533f3128a7e5ff6fba409cf57ac67:

  • Reviewed the full Control UI terminal-event path: handleChatEvent now snapshots visible/fallback final text, clears the live run/stream through reconcileChatRunLifecycle, then commits the terminal assistant message. That is the right owner-boundary fix for the client-side chatMessages + chatStream render overlap.
  • Scoped caveat: this fixes the visible WebChat stream/final duplication race for [Bug]: Control UI webchat duplicates every assistant reply on 2026.4.21 — regression from #5964/#39469 #71992. It does not claim to fix broader persisted transcript duplication; that stays separate.
  • Local proof:
    • git diff --check origin/main...HEAD
    • node scripts/run-vitest.mjs run ui/src/ui/controllers/chat.test.ts ui/src/ui/chat/run-lifecycle.test.ts ui/src/ui/chat/build-chat-items.test.ts -> 3 files / 135 tests passed
    • node scripts/run-vitest.mjs run ui/src/ui/app-chat.test.ts ui/src/ui/controllers/sessions.test.ts -> 2 shards / 158 tests passed
    • node scripts/run-vitest.mjs run --config test/vitest/vitest.ui-e2e.config.ts --configLoader runner ui/src/ui/e2e/chat-flow.e2e.test.ts -> 11 mocked Control UI E2E tests passed
  • Remote proof: Blacksmith Testbox via Crabbox tbx_01kt6a4zn7awkdy12d6b0q2d1q, run https://github.com/openclaw/openclaw/actions/runs/26873514898, command node scripts/crabbox-wrapper.mjs run --provider blacksmith-testbox -- env OPENCLAW_CHECK_CHANGED_REMOTE_CHILD=1 OPENCLAW_CHANGED_LANES_RAW_SYNC=1 corepack pnpm check:changed, exit 0.
  • Autoreview: .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main -> clean, no accepted/actionable findings; overall patch correct 0.84.
  • PR CI after maintainer rebase/push: MERGEABLE/CLEAN, 121 pass / 10 skipped, no failing checks.

Best-fix verdict: best narrow fix for this UI race. Renderer-side dedupe would hide symptoms and backend transcript dedupe is a different bug class.

@vincentkoc vincentkoc merged commit c2d7b4a into openclaw:main Jun 3, 2026
136 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

app: web-ui App: web-ui P2 Normal backlog priority with limited blast radius. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. size: S status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Control UI webchat duplicates every assistant reply on 2026.4.21 — regression from #5964/#39469

2 participants