
fix(webchat): stabilize live transcript run state#85956

Merged
BunsDev merged 5 commits into main from meow/webchat-transcript-run-state-truth
May 25, 2026
Conversation

@BunsDev

@BunsDev BunsDev commented May 24, 2026

Member

Summary

Fixes #83528 and #82611; refs #83949.

  • Mirror Codex app-server inbound prompts at turn start with the existing ${turnId}:prompt identity so WebChat can see external-channel user messages before the turn finishes, while preserving suppressNextUserMessagePersistence for already-persisted inbound prompts.
  • Keep hidden external-channel live runs private by sending chat/tool/agent updates only to exact selected-session message subscribers instead of broadcasting globally.
  • Subscribe Control UI to the selected session message stream on connect/reconnect and session switch, adopt observed run ids for non-WebChat-originated runs including the default main alias/canonical key pair, dedupe accumulated stream snapshots around tool cards, and keep selected-session session.message reloads alias-aware.
  • Emit stable messageSeq values for multi-message Codex mirror batches so selected-session subscribers receive per-message transcript positions instead of post-batch duplicates.
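
A minimal sketch of the idempotency contract in the first bullet, not the code in run-attempt.ts: the early mirror and the end-of-turn mirror write under the same ${turnId}:prompt id, so the prompt lands once, and the early write is skipped when suppressNextUserMessagePersistence is set. The Transcript class, the ${turnId}:assistant id, and both function names are hypothetical, and the end-of-turn handling of the suppressed case is an assumption.

```typescript
// Hypothetical in-memory transcript keyed by row id; the real store is not shown on this page.
type TranscriptRow = { id: string; role: "user" | "assistant"; text: string };

class Transcript {
  private readonly rows = new Map<string, TranscriptRow>();

  // Appends a row unless one with the same id exists. Returns true only for a new row.
  appendOnce(row: TranscriptRow): boolean {
    if (this.rows.has(row.id)) return false;
    this.rows.set(row.id, row);
    return true;
  }

  list(): TranscriptRow[] {
    return Array.from(this.rows.values());
  }
}

type TurnContext = {
  turnId: string;
  prompt: string;
  // Flag named in the PR: set when the inbound prompt was already persisted elsewhere.
  suppressNextUserMessagePersistence: boolean;
};

// Early mirror at turn start; skipped entirely when persistence is suppressed.
function mirrorPromptAtTurnStart(transcript: Transcript, ctx: TurnContext): boolean {
  if (ctx.suppressNextUserMessagePersistence) return false;
  return transcript.appendOnce({ id: `${ctx.turnId}:prompt`, role: "user", text: ctx.prompt });
}

// End-of-turn mirror; reuses the prompt id, so an early-mirrored prompt is not written twice.
// Returns the number of rows that were newly appended.
function mirrorTurnAtEnd(transcript: Transcript, ctx: TurnContext, reply: string): number {
  let appended = 0;
  const promptRow: TranscriptRow = { id: `${ctx.turnId}:prompt`, role: "user", text: ctx.prompt };
  if (!ctx.suppressNextUserMessagePersistence && transcript.appendOnce(promptRow)) appended++;
  // The ":assistant" suffix is an illustrative id, not taken from the PR.
  if (transcript.appendOnce({ id: `${ctx.turnId}:assistant`, role: "assistant", text: reply })) {
    appended++;
  }
  return appended;
}
```

With the early mirror applied, the end-of-turn call appends only the assistant row; with suppression set, neither call writes a user row.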

Real behavior proof

  • Behavior addressed: WebChat/Control UI now has durable inbound prompt truth at Codex turn start, selected-session live updates for external-channel runs without unrelated-session leakage, passive run-id adoption and session.message handling across the default main alias/canonical key pair, reconnect-safe selected-session subscription state, stable mirrored transcript message sequences, suppression-safe prompt mirroring, and no duplicated accumulated stream text around tool cards.
  • Real environment tested: Local OpenClaw Codex worktree on branch meow/webchat-transcript-run-state-truth, latest head 23f7c0f47037.
  • Exact steps or command run after this patch:
    • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts -- -t "does not mirror the Codex prompt early when user message persistence is suppressed"
    • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts extensions/codex/src/app-server/transcript-mirror.test.ts
    • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs src/gateway/server-chat.agent-events.test.ts src/gateway/session-message-events.test.ts
    • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs ui/src/ui/controllers/chat.test.ts ui/src/ui/controllers/sessions.test.ts ui/src/ui/chat/build-chat-items.test.ts ui/src/ui/app-gateway.sessions.node.test.ts ui/src/ui/app-render.helpers.node.test.ts ui/src/ui/app-gateway-chat-load.node.test.ts ui/src/ui/app-settings.refresh-active-tab.node.test.ts
    • node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo
    • node --import tsx scripts/check-no-extension-test-core-imports.ts
    • git diff --check origin/main..HEAD
  • Evidence after fix:
    • Suppression regression: failed before the guard because the forbidden prompt appeared; passed after the guard.
    • Codex app-server tests: 2 files passed, 233 tests passed.
    • Gateway/session event tests: 4 files passed, 116 tests passed.
    • Control UI tests: 7 files passed, 238 tests passed.
    • Rebased-base test-type command passed.
    • Boundary check passed: check-no-extension-test-core-imports checked 1920 extension files and 1 plugin helper.
    • git diff --check origin/main..HEAD passed on latest head.
    • The regressions cover: message-bearing prompt transcript updates with message ids and stable sequences; no early-mirrored duplicate user row when prompt persistence is suppressed; multi-message mirror batch messageSeq [1, 2]; selected-session sequence delivery; alias/canonical subscription de-dupe; stale subscription completion cleanup; main-alias session.message reloads; and passive main-alias run-id adoption.
  • Observed result after fix: The early Codex prompt mirror writes one user transcript row and emits a message-bearing selected-session update before assistant completion only when prompt persistence is not suppressed; final mirrored batches emit correct per-message sequence metadata; hidden external-channel chat events route only to exact selected-session subscribers; selected Control UI sessions subscribe/re-subscribe without alias churn or stale async overwrite; canonical agent:main:main transcript events reload selected main; passive chat state adopts observed run ids; and accumulated stream snapshots render only their new suffix after tool cards.
  • What was not tested: Live Feishu or Telegram external-channel browser proof was not run in this pass. Remote pnpm check:changed through Crabbox was attempted before this follow-up but blocked: Blacksmith Testbox is not configured for .github/workflows/crabbox-hydrate.yml, and default AWS Crabbox could not refresh credentials because no EC2 IMDS role/credentials were available on this machine.
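
The "render only their new suffix after tool cards" behavior can be illustrated with a small sketch. It assumes each text item is an accumulated snapshot of the whole stream so far; the StreamItem and ChatItem shapes and the buildChatItems name are hypothetical and are not the types in build-chat-items.ts.

```typescript
type StreamItem =
  | { kind: "text"; text: string } // assumed: accumulated snapshot of all text so far
  | { kind: "tool"; name: string };

type ChatItem = { kind: "text"; text: string } | { kind: "tool"; name: string };

function buildChatItems(stream: StreamItem[]): ChatItem[] {
  const items: ChatItem[] = [];
  let committed = ""; // text already rendered before the most recent tool card
  let openSegment = ""; // text rendered since the most recent tool card
  let hasOpenSegment = false;

  for (const item of stream) {
    if (item.kind === "tool") {
      // A tool card closes the current text segment.
      committed += openSegment;
      openSegment = "";
      hasOpenSegment = false;
      items.push(item);
      continue;
    }
    // Strip the prefix that was already rendered; fall back to the full text
    // when the snapshot does not extend what was rendered.
    const suffix = item.text.startsWith(committed)
      ? item.text.slice(committed.length)
      : item.text;
    if (suffix.length === 0) continue;
    if (hasOpenSegment) {
      // A newer snapshot replaces the open segment rather than adding a second copy.
      items[items.length - 1] = { kind: "text", text: suffix };
    } else {
      items.push({ kind: "text", text: suffix });
      hasOpenSegment = true;
    }
    openSegment = suffix;
  }
  return items;
}
```

For the stream "Checking", a tool card, then "Checking the logs.", this yields the text "Checking", the tool card, and the text " the logs." rather than repeating "Checking" after the card.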

Verification

  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts -- -t "does not mirror the Codex prompt early when user message persistence is suppressed"
  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts extensions/codex/src/app-server/transcript-mirror.test.ts
  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs src/gateway/server-chat.agent-events.test.ts src/gateway/session-message-events.test.ts
  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs ui/src/ui/controllers/chat.test.ts ui/src/ui/controllers/sessions.test.ts ui/src/ui/chat/build-chat-items.test.ts ui/src/ui/app-gateway.sessions.node.test.ts ui/src/ui/app-render.helpers.node.test.ts ui/src/ui/app-gateway-chat-load.node.test.ts ui/src/ui/app-settings.refresh-active-tab.node.test.ts
  • node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo
  • node --import tsx scripts/check-no-extension-test-core-imports.ts
  • git diff --check origin/main..HEAD

Copilot AI review requested due to automatic review settings May 24, 2026 06:30
@openclaw-barnacle openclaw-barnacle Bot added app: web-ui App: web-ui gateway Gateway runtime extensions: codex size: M maintainer Maintainer-authored PR labels May 24, 2026
@BunsDev

BunsDev commented May 24, 2026

Member Author

@clawsweeper review

Please review this as the canonical maintainer PR for #83528 and #82611. Focus on the early Codex prompt mirror idempotency, exact selected-session hidden-run delivery, and Control UI selected-session subscription/run-id adoption boundaries.

@clawsweeper

clawsweeper Bot commented May 24, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@clawsweeper

clawsweeper Bot commented May 24, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed May 24, 2026, 11:49 PM ET / 03:49 UTC.

Summary
The branch mirrors Codex prompts at turn start, routes hidden gateway live events to selected-session subscribers, makes Control UI selected-session subscription/run-id handling alias-aware, deduplicates accumulated stream snapshots, and adds regression tests plus a changelog entry.

PR surface: Source +289, Tests +404, Docs +1, Other -1. Total +693 across 44 files.

Reproducibility: yes. Source inspection gives a high-confidence reproduction basis: current main requires exact UI session-key matches for passive run handling and does not subscribe the selected Control UI session message stream, while Codex transcript mirroring only emits file-level updates after append. I did not run tests because this review is read-only.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Mantis proof suggestion
A visible WebChat observer proof would materially reduce the remaining live-behavior uncertainty for this session/message-delivery PR. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

visual task: verify WebChat selected-session view live-updates for a channel-originated hidden run and does not show updates from another session.

Risk before merge

  • This intentionally changes live message/session-state delivery for hidden external-channel runs; green unit shards do not fully settle upgrade behavior in real channel/browser observer setups.
  • The PR body says live Feishu or Telegram external-channel browser proof was not run, so maintainers must decide whether the targeted local regression proof is enough before merge.
  • The branch includes incidental formatter-only churn in scripts/tests outside the core behavior path, which broadens review slightly even though no behavior change was found there.

Maintainer options:

  1. Land With Targeted Regression Proof (recommended)
    Maintainers can merge once they accept the listed Codex, gateway, session-message, and Control UI regression shards plus the documented live-proof gap.
  2. Ask For Live Observer Proof
    Before merge, request a short WebChat/Control UI recording or redacted logs showing a channel-originated run updating only the selected session live.
  3. Respin If Formatter Churn Is Unwanted
    If maintainers want the landing diff narrower, ask for a respin that drops incidental script/test formatting churn while preserving the WebChat/gateway/Codex fix stack.

Next step before merge
No automated repair is needed; this protected maintainer PR needs human merge/proof acceptance for the P1 session/message-delivery change.

Security
Cleared: No concrete security or supply-chain regression was found; the patch does not change dependencies, workflows, lockfiles, credential handling, or permissions, and the event-routing changes narrow hidden-run delivery to selected-session subscribers.

Review details

Best possible solution:

Land this focused fix after maintainer review accepts the targeted regression proof and any desired live observer proof, then close the linked user reports through the merged PR.

Do we have a high-confidence way to reproduce the issue?

Yes, source inspection gives a high-confidence reproduction basis: current main requires exact UI session-key matches for passive run handling and does not subscribe the selected Control UI session message stream, while Codex transcript mirroring only emits file-level updates after append. I did not run tests because this review is read-only.

Is this the best way to solve the issue?

Yes, this is a maintainable repair shape: it reuses the existing transcript idempotency and sessions.messages.subscribe contracts, adds alias-aware comparisons at the UI boundary, and preserves suppressNextUserMessagePersistence for already-persisted prompts.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 9db04a27eb20.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix terminal proof with exact focused Vitest, tsgo, boundary-check, and diff-check commands plus observed red/green regression results; live external-channel browser proof remains explicitly untested but is not the only proof offered.

Label justifications:

  • P1: The PR targets live WebChat/Control UI observer regressions for channel-originated runs, a real user workflow affecting message visibility now.
  • merge-risk: 🚨 message-delivery: The diff changes which websocket subscribers receive hidden live chat/agent/tool events and must not leak or drop selected-session updates.
  • merge-risk: 🚨 session-state: The diff changes selected-session subscription state, run-id adoption, main alias/canonical equivalence, and deferred session.message reload handling.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes after-fix terminal proof with exact focused Vitest, tsgo, boundary-check, and diff-check commands plus observed red/green regression results; live external-channel browser proof remains explicitly untested but is not the only proof offered.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix terminal proof with exact focused Vitest, tsgo, boundary-check, and diff-check commands plus observed red/green regression results; live external-channel browser proof remains explicitly untested but is not the only proof offered.
Evidence reviewed

PR surface:

Source +289, Tests +404, Docs +1, Other -1. Total +693 across 44 files.

Area Files Added Removed Net
Source 14 347 58 +289
Tests 22 505 101 +404
Docs 1 1 0 +1
Config 0 0 0 0
Generated 0 0 0 0
Other 7 27 28 -1
Total 44 880 187 +693

What I checked:

  • Repository policy and scoped guides inspected: Root and touched-scope guides were read; relevant policy treats gateway/session-state/message-delivery changes as compatibility-sensitive and protected maintainer items as human-handled. (AGENTS.md:1, 9db04a27eb20)
  • PR diff scope checked: The effective PR diff against current main is 44 files with 880 insertions and 187 deletions, centered on Codex app-server transcript mirroring, gateway session events, Control UI state, tests, and a changelog entry. (fc688af8c4a6)
  • Codex early prompt mirror implementation: PR head calls mirrorPromptAtTurnStartBestEffort after the app-server turn is active and skips the early mirror when suppressNextUserMessagePersistence is set, preserving the existing already-persisted prompt contract. (extensions/codex/src/app-server/run-attempt.ts:3149, fc688af8c4a6)
  • Transcript update payloads are message-bearing and sequenced: mirrorCodexAppServerTranscript now tracks appended messages under the write lock and emits one session transcript update per newly appended message with messageId, message, and stable messageSeq. (extensions/codex/src/app-server/transcript-mirror.ts:133, fc688af8c4a6)
  • Hidden live events route to exact selected-session subscribers: server-chat sends hidden chat/agent payloads through sessionMessageSubscribers instead of global broadcast, and the regression test proves only the selected session connection receives hidden live chat/final events. (src/gateway/server-chat.ts:631, fc688af8c4a6)
  • Control UI selected-session subscription and alias handling: PR head subscribes the selected session on connect and session switch, tracks requested versus canonical keys to avoid alias churn, and compares main alias/canonical session keys consistently for chat and session.message handling. (ui/src/ui/controllers/sessions.ts:536, fc688af8c4a6)
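
The routing rule in the "Hidden live events" item can be sketched as follows. This is an illustration under stated assumptions, not the code in server-chat.ts: sessionMessageSubscribers is the name used in the review, but its shape here (session key to a set of connection ids), the LiveEvent type, and the LiveEventRouter class are hypothetical.

```typescript
type LiveEvent = { sessionKey: string; runId: string; hidden: boolean; payload: string };
type Send = (event: LiveEvent) => void;

class LiveEventRouter {
  private readonly connections = new Map<string, Send>();
  // Assumed shape: session key -> ids of connections subscribed to that session's messages.
  private readonly sessionMessageSubscribers = new Map<string, Set<string>>();

  connect(connId: string, send: Send): void {
    this.connections.set(connId, send);
  }

  subscribe(connId: string, sessionKey: string): void {
    const subscribers = this.sessionMessageSubscribers.get(sessionKey) ?? new Set<string>();
    subscribers.add(connId);
    this.sessionMessageSubscribers.set(sessionKey, subscribers);
  }

  publish(event: LiveEvent): void {
    let targets: string[];
    if (event.hidden) {
      // Hidden runs: deliver only to connections subscribed to this exact session key.
      const subscribers = this.sessionMessageSubscribers.get(event.sessionKey);
      targets = subscribers ? Array.from(subscribers) : [];
    } else {
      // Visible runs: broadcast to every connection.
      targets = Array.from(this.connections.keys());
    }
    for (const connId of targets) {
      this.connections.get(connId)?.(event);
    }
  }
}
```

A hidden event for a session with no subscribers is delivered to nobody, which is the narrowing the Security note relies on.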

Likely related people:

  • Vincent Koc: Current shallow/grafted blame for the Control UI chat handler, Codex transcript mirror, and sessions message subscription methods points to 0acc3e3 across the central source paths inspected. (role: current-main area contributor; confidence: medium; commits: 0acc3e32164a; files: ui/src/ui/controllers/chat.ts, extensions/codex/src/app-server/transcript-mirror.ts, src/gateway/server-methods/sessions.ts)
  • Galin Iliev: Recent main history includes 42bdc94, a gateway session-tool mirror/backoff change in src/gateway/server-chat.ts adjacent to this PR's selected-session live event routing. (role: recent adjacent gateway contributor; confidence: medium; commits: 42bdc949f297; files: src/gateway/server-chat.ts)
  • Val Alexander: Author-history search for the UI area shows multiple recent Control UI/session/settings commits, making this a plausible review/routing candidate for the UI side of the patch. (role: recent Control UI contributor; confidence: low; commits: a45ebf3281ed, 2cfb660a9bb8, d0c83777fb58; files: ui/src/ui/controllers/chat.ts, ui/src/ui/app-gateway.ts, ui/src/ui/controllers/sessions.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Copilot AI left a comment

Contributor

Pull request overview

Stabilizes WebChat/Control UI live transcript run state for external-channel sessions by making inbound prompts visible earlier, tightening live event routing for hidden runs, and improving client-side run/stream handling.

Changes:

  • Codex app-server now mirrors the user prompt into the transcript at turn start using the existing ${turnId}:prompt identity for idempotent dedupe later.
  • Gateway routes live chat/agent updates for hidden (non-Control-UI-visible) runs only to selected-session message subscribers instead of broadcasting globally.
  • Control UI subscribes/re-subscribes to selected-session message streams on connect/reconnect and session switch, adopts observed runIds for passive observation, and deduplicates accumulated stream snapshots around tool cards.
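
Passive run-id adoption can be sketched as below, assuming the default main alias and the agent:main:main canonical key named elsewhere in this PR are treated as the same session. The ChatState and ChatDelta shapes and all function names are hypothetical and are not the controller's actual types.

```typescript
const MAIN_ALIAS = "main";
const MAIN_CANONICAL = "agent:main:main";

// Maps the default alias onto its canonical key; other keys pass through unchanged.
function normalizeSessionKey(key: string): string {
  return key === MAIN_ALIAS ? MAIN_CANONICAL : key;
}

function sameSession(a: string, b: string): boolean {
  return normalizeSessionKey(a) === normalizeSessionKey(b);
}

type ChatState = { selectedSessionKey: string; activeRunId: string | null };
type ChatDelta = { sessionKey: string; runId: string };

// Returns true when the delta belongs to the selected session's current run.
// With no local run active, the observed run id is adopted.
function acceptDelta(state: ChatState, delta: ChatDelta): boolean {
  if (!sameSession(state.selectedSessionKey, delta.sessionKey)) return false;
  if (state.activeRunId === null) {
    state.activeRunId = delta.runId; // passive adoption of an externally started run
    return true;
  }
  return state.activeRunId === delta.runId;
}
```

A UI that has "main" selected therefore accepts a delta tagged agent:main:main and adopts its run id, while deltas for other sessions or other runs are ignored.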

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
ui/src/ui/controllers/sessions.ts Adds selected-session message stream subscribe/unsubscribe syncing state.
ui/src/ui/controllers/sessions.test.ts Adds tests for selected-session message stream subscription behavior.
ui/src/ui/controllers/chat.ts Adopts observed runId for selected-session deltas when no local run is active.
ui/src/ui/controllers/chat.test.ts Adds coverage for runId adoption from externally-originated deltas.
ui/src/ui/chat/build-chat-items.ts Deduplicates accumulated stream snapshots around tool card boundaries.
ui/src/ui/chat/build-chat-items.test.ts Tests stream snapshot deduplication across tool cards.
ui/src/ui/app.ts Adds app state field for selected-session message subscription key.
ui/src/ui/app-view-state.ts Extends view-state typing for selected-session message subscription key.
ui/src/ui/app-settings.refresh-active-tab.node.test.ts Updates mocks for new sessions controller export.
ui/src/ui/app-render.helpers.ts Subscribes selected-session message stream on session switch.
ui/src/ui/app-render.helpers.node.test.ts Asserts session switch triggers selected-session message subscription sync.
ui/src/ui/app-gateway.ts Syncs selected-session message subscription on connect/reconnect (forced).
ui/src/ui/app-gateway.sessions.node.test.ts Updates sessions controller mocks for new export.
ui/src/ui/app-gateway-chat-load.node.test.ts Updates sessions controller mocks for new export.
src/gateway/server-runtime-subscriptions.ts Plumbs sessionMessageSubscribers into agent event handler wiring.
src/gateway/server-chat.ts Adds targeted delivery for hidden runs via session message subscribers; refactors send helpers.
src/gateway/server-chat.agent-events.test.ts Verifies hidden live chat is delivered only to exact session message subscribers.
extensions/codex/src/app-server/run-attempt.ts Mirrors Codex prompt into transcript at turn start (best-effort).
extensions/codex/src/app-server/run-attempt.test.ts Tests early prompt mirroring + idempotent dedupe with end-of-turn mirror.
CHANGELOG.md Documents the WebChat/Control UI stability fix and related issues.

Comment thread ui/src/ui/controllers/sessions.ts Outdated
Comment thread ui/src/ui/controllers/chat.ts Outdated
@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 24, 2026
@clawsweeper

clawsweeper Bot commented May 24, 2026

Contributor

ClawSweeper PR egg

✨ Hatched: 🌱 uncommon Tiny Signal Puff

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🌱 uncommon.
Trait: polishes edge cases.
Image traits: location release reef; accessory miniature diff map; palette moss green and polished brass; mood bright-eyed; pose waving from a small platform; shell frosted glass shell; lighting golden review-room light; background miniature CI buoys.
Copy: My PR egg hatched a 🌱 uncommon Tiny Signal Puff in ClawSweeper.

What is this egg doing here?
  • The egg appears after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@clawsweeper clawsweeper Bot added P1 High-priority user-facing bug, regression, or broken workflow. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. labels May 24, 2026
@BunsDev

BunsDev commented May 24, 2026

Member Author

@clawsweeper re-review

Follow-up commit ba4fb81a18 addresses the open findings:

  • prompt mirror now emits message-bearing transcript updates with messageId for newly appended mirrored messages only;
  • selected-session subscription state tracks requested and canonical keys separately and ignores/cleans up stale async completions;
  • Control UI passive run-id adoption treats the default main alias and agent:main:main canonical key as equivalent.
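
A rough sketch of the second bullet: requested and canonical keys tracked separately, with stale completions ignored and cleaned up. It uses a callback-style transport so the completion order is explicit; the class, its fields, and both function types are hypothetical, and error handling is omitted.

```typescript
type SubscribeFn = (key: string, done: (canonicalKey: string) => void) => void;
type UnsubscribeFn = (canonicalKey: string) => void;

class SelectedSessionSubscription {
  requestedKey: string | null = null; // what the UI last asked for (may be an alias)
  canonicalKey: string | null = null; // what the server resolved the latest request to
  private generation = 0; // bumped on every new request
  private settled = 0; // generation reflected in canonicalKey
  private readonly leftovers = new Set<string>(); // candidate keys to unsubscribe
  private readonly subscribe: SubscribeFn;
  private readonly unsubscribe: UnsubscribeFn;

  constructor(subscribe: SubscribeFn, unsubscribe: UnsubscribeFn) {
    this.subscribe = subscribe;
    this.unsubscribe = unsubscribe;
  }

  // force=true models a reconnect, where the same key must be re-subscribed.
  sync(key: string, force = false): void {
    if (!force && key === this.requestedKey) return;
    const generation = ++this.generation;
    this.requestedKey = key;
    this.subscribe(key, (canonicalKey) => {
      if (generation === this.generation) {
        if (this.canonicalKey !== null) this.leftovers.add(this.canonicalKey);
        this.canonicalKey = canonicalKey;
        this.settled = generation;
      } else {
        // Stale completion: a newer request owns the state, so only record the key.
        this.leftovers.add(canonicalKey);
      }
      if (this.settled === this.generation) this.cleanup();
    });
  }

  // Unsubscribes every leftover key except the one currently in use.
  private cleanup(): void {
    const keep = this.canonicalKey;
    const drop = Array.from(this.leftovers).filter((key) => key !== keep);
    this.leftovers.clear();
    for (const key of drop) this.unsubscribe(key);
  }
}
```

Deferring cleanup until the newest request has settled means a stale completion that resolves to the same canonical key as the current one (for example main and agent:main:main) is not unsubscribed.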

Verification on this head:

  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts extensions/codex/src/app-server/transcript-mirror.test.ts -> 2 files, 231 tests passed.
  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs src/gateway/server-chat.agent-events.test.ts src/gateway/session-message-events.test.ts -> 4 files, 111 tests passed.
  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs ui/src/ui/controllers/chat.test.ts ui/src/ui/controllers/sessions.test.ts ui/src/ui/chat/build-chat-items.test.ts ui/src/ui/app-gateway.sessions.node.test.ts ui/src/ui/app-render.helpers.node.test.ts ui/src/ui/app-gateway-chat-load.node.test.ts ui/src/ui/app-settings.refresh-active-tab.node.test.ts -> 7 files, 237 tests passed.
  • git diff --check origin/main..HEAD -> passed.

@clawsweeper

clawsweeper Bot commented May 24, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@BunsDev

BunsDev commented May 24, 2026

Member Author

@clawsweeper re-review

Boundary-only follow-up 160f62c973 keeps the new transcript mirror regression on the public plugin-sdk runtime surface. Additional proof on latest head:

  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs extensions/codex/src/app-server/transcript-mirror.test.ts -> 1 file, 12 tests passed.
  • node --import tsx scripts/check-no-extension-test-core-imports.ts -> passed.
  • git diff --check origin/main..HEAD -> passed.

@clawsweeper

clawsweeper Bot commented May 24, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@clawsweeper clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. labels May 24, 2026
@BunsDev

BunsDev commented May 24, 2026

Member Author

@clawsweeper re-review

Follow-up commit e7d411c861 addresses the two latest P2 findings from the durable review:

  • session.message handling now uses the same default main/agent:main:main equivalence as chat events, including deferred reload/queue reconciliation paths; added a selected main regression for canonical agent:main:main transcript events.
  • Codex transcript mirror now computes and emits stable per-message messageSeq values for newly appended mirror batches after the write lock is released; added a multi-message batch regression proving [1, 2].
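
The per-message sequence idea in the second bullet can be sketched as follows. It assumes messageSeq is the 1-based position of a message in the transcript, which is consistent with the [1, 2] regression but is not confirmed by this page; the Message and TranscriptUpdate shapes and the appendBatch name are hypothetical.

```typescript
type Message = { id: string; text: string };
type TranscriptUpdate = { messageId: string; message: Message; messageSeq: number };

// Appends the messages in `batch` that are not already in `transcript` and returns
// one update per newly appended message, each carrying its own sequence number.
function appendBatch(transcript: Message[], batch: Message[]): TranscriptUpdate[] {
  const known = new Set(transcript.map((message) => message.id));
  const updates: TranscriptUpdate[] = [];
  for (const message of batch) {
    if (known.has(message.id)) continue; // already mirrored: no duplicate row, no update
    transcript.push(message);
    known.add(message.id);
    // Capture the position at append time instead of reading the final length
    // after the whole batch, which would give every message the same value.
    updates.push({ messageId: message.id, message, messageSeq: transcript.length });
  }
  return updates;
}
```

Appending a two-message batch to an empty transcript yields sequences 1 and 2, and re-sending the same batch yields no updates.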

Latest proof on head e7d411c861:

  • red/green regression shard: OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs extensions/codex/src/app-server/transcript-mirror.test.ts ui/src/ui/app-gateway.sessions.node.test.ts -> 2 files, 36 tests passed after failing before the fix.
  • Codex app-server shard -> 2 files, 232 tests passed.
  • Gateway/session event shard -> 4 files, 111 tests passed.
  • Control UI shard -> 7 files, 238 tests passed.
  • node --import tsx scripts/check-no-extension-test-core-imports.ts -> passed.
  • git diff --check origin/main..HEAD -> passed.

@clawsweeper

clawsweeper Bot commented May 24, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@clawsweeper clawsweeper Bot added status: 🔁 re-review loop A fresh ClawSweeper review was explicitly requested after the latest review. and removed status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 24, 2026
@BunsDev BunsDev force-pushed the meow/webchat-transcript-run-state-truth branch from e7d411c to 318eb0d Compare May 24, 2026 13:27
@BunsDev BunsDev self-assigned this May 24, 2026
@BunsDev

BunsDev commented May 25, 2026

Member Author

@copilot resolve the merge conflicts in this pull request

Copilot AI commented May 25, 2026

Contributor

@copilot resolve the merge conflicts in this pull request

Resolved. I merged the latest origin/main into this branch and fixed the conflict in src/gateway/session-message-events.test.ts, preserving the selected-session message-sequence regression and upstream wording updates. Addressed in commit 0d7559e.

@clawsweeper clawsweeper Bot added rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. labels May 25, 2026
@openclaw-barnacle openclaw-barnacle Bot added cli CLI command changes scripts Repository scripts agents Agent runtime and tooling size: XL and removed size: L labels May 25, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 25, 2026
@BunsDev BunsDev force-pushed the meow/webchat-transcript-run-state-truth branch from fc688af to a3b75ec Compare May 25, 2026 03:53
@openclaw-barnacle openclaw-barnacle Bot added size: L and removed cli CLI command changes scripts Repository scripts agents Agent runtime and tooling size: XL labels May 25, 2026
@BunsDev

BunsDev commented May 25, 2026

Member Author

@clawsweeper re-review

Latest cleaned/rebased head: a3b75eccc2a2.

I rebased the PR stack onto current origin/main and removed the merge-heavy PR head; the diff is back to the narrow WebChat/Codex/gateway/UI transcript surfaces plus changelog.

Proof on this head:

  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts extensions/codex/src/app-server/transcript-mirror.test.ts (2 files, 233 tests)
  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs src/gateway/server-chat.agent-events.test.ts src/gateway/session-message-events.test.ts (4 files, 116 tests)
  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs ui/src/ui/controllers/chat.test.ts ui/src/ui/controllers/sessions.test.ts ui/src/ui/chat/build-chat-items.test.ts ui/src/ui/app-gateway.sessions.node.test.ts ui/src/ui/app-render.helpers.node.test.ts ui/src/ui/app-gateway-chat-load.node.test.ts ui/src/ui/app-settings.refresh-active-tab.node.test.ts (7 files, 238 tests)
  • node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo
  • node --import tsx scripts/check-no-extension-test-core-imports.ts
  • git diff --check origin/main..HEAD

clawsweeper Bot commented May 25, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@BunsDev force-pushed the meow/webchat-transcript-run-state-truth branch from a3b75ec to 23f7c0f on May 25, 2026 04:02

BunsDev commented May 25, 2026

Member Author

Maintainer merge verification for 23f7c0f4703783dc07f77ef4a2b15675dc61c1ba:

  • Rebased cleanly onto current origin/main; final diff is scoped to WebChat/Codex/gateway/UI transcript state plus changelog.
  • Review threads: 0 unresolved.
  • GitHub merge state: CLEAN.
  • Current-head CI: 101 green, 0 pending. Non-green rollup entries are canceled superseded automation from prior force-pushed heads (auto-response, dispatch, label, Real behavior proof).
  • ClawSweeper durable review/labels: proof: sufficient, rating: 🦞 diamond lobster, status: 👀 ready for maintainer look. The final re-review run for the previous head was canceled by the last rebase-only force-push.

Local proof on the cleaned stack:

OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts extensions/codex/src/app-server/transcript-mirror.test.ts
# 2 files, 233 tests passed

OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs src/gateway/server-chat.agent-events.test.ts src/gateway/session-message-events.test.ts
# 4 files, 116 tests passed

OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 node scripts/run-vitest.mjs ui/src/ui/controllers/chat.test.ts ui/src/ui/controllers/sessions.test.ts ui/src/ui/chat/build-chat-items.test.ts ui/src/ui/app-gateway.sessions.node.test.ts ui/src/ui/app-render.helpers.node.test.ts ui/src/ui/app-gateway-chat-load.node.test.ts ui/src/ui/app-settings.refresh-active-tab.node.test.ts
# 7 files, 238 tests passed

node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo
node --import tsx scripts/check-no-extension-test-core-imports.ts
git diff --check origin/main..HEAD

Known proof gap accepted for merge: no fresh live Feishu/Telegram browser observer proof was run in this pass; the PR has focused local regressions plus current-head CI and sufficient proof labels.

@BunsDev merged commit 119a01c into main on May 25, 2026
101 of 105 checks passed
@BunsDev deleted the meow/webchat-transcript-run-state-truth branch on May 25, 2026 04:07

Labels

  • app: web-ui (App: web-ui)
  • extensions: codex
  • gateway (Gateway runtime)
  • maintainer (Maintainer-authored PR)
  • merge-risk: 🚨 message-delivery (🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.)
  • merge-risk: 🚨 session-state (🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state.)
  • P1 (High-priority user-facing bug, regression, or broken workflow.)
  • proof: sufficient (ClawSweeper judged the real behavior proof convincing.)
  • rating: 🦞 diamond lobster (Very strong PR readiness with only minor maintainer review expected.)
  • size: L
  • status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.)

Development

Successfully merging this pull request may close these issues.

[Bug]: Codex runtime delays inbound user transcript writes until turn end, so WebChat/Control UI cannot see external messages immediately

3 participants