fix: render WebChat message tool replies by 100yenadmin · Pull Request #81144 · openclaw/openclaw

100yenadmin · 2026-05-12T20:05:25Z

This PR restores the visible WebChat reply for Codex same-session message(action="send") calls by carrying the sanitized tool-result text into the Codex telemetry/rendering path. The safety boundary is that WebChat displays the sanitized result detail, not the raw tool arguments that may include markdown or same-session routing internals.

Summary

This is the option-2 alternative for #81109, alongside #81110. It keeps the Codex message tool path available for WebChat so the model can deliberately send the visible reply after a tool-heavy turn, but now renders only the sanitized same-session message text. That preserves the personality-restoring message tool behavior without leaking raw reasoning-tag content from original tool arguments.

Problem: same-session WebChat message(action="send") calls could be treated as external delivery or suppressed, and the first renderer version could have reused raw telemetry text.
Why it matters: the message tool is the mechanism that lets Codex recover a warm, user-facing reply after tool execution instead of ending with a sterile final answer or a bare "Sent." reply.
What changed: same-session WebChat sends return status: "ok", keep semantic deliveryStatus: "sent", feed sanitized result text into Codex telemetry, and render that text through source-suppression-safe reply payloads.
What did NOT change (scope boundary): no duplicate Codex-native workspace tools, no Codex app-server filtering change, and no external channel delivery behavior change.
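
As a rough illustration of the "What changed" item, the same-session result could look something like the sketch below. The type name, the function name, and the exact nesting of fields are assumptions made for this example; only the values `status: "ok"`, `deliveryStatus: "sent"`, `webchat-session`, `webchat`, and the `details.message` location are taken from this PR's description and review.

```typescript
// Hypothetical shape for illustration only; not the actual OpenClaw source.
// Assumption: everything except `status` lives under `details`.
type SameSessionSendResult = {
  status: "ok"; // bridge-recognized success status
  details: {
    deliveryStatus: "sent"; // semantic delivery state is preserved
    delivery: "webchat-session";
    channel: "webchat";
    message: string; // sanitized visible text
  };
};

// Hypothetical builder: takes text that has ALREADY been sanitized.
function buildSameSessionResult(sanitizedText: string): SameSessionSendResult {
  return {
    status: "ok",
    details: {
      deliveryStatus: "sent",
      delivery: "webchat-session",
      channel: "webchat",
      message: sanitizedText,
    },
  };
}
```

The point of the sketch is that the visible text travels in the tool result, so downstream consumers do not need to look at the original arguments.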

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Related: #81109 (Codex harness default can suppress direct WebChat final replies)
Alternative to: #81110 (fix: keep codex webchat replies automatic)
This PR fixes a bug or regression

Real behavior proof (required for external PRs)

Behavior or issue addressed: WebChat turns configured for message_tool_only can use the message tool for the visible reply without routing through an external channel, without being suppressed by final source-reply suppression, and without rendering unsanitized reasoning-tag text.
Real environment tested: local OpenClaw checkout at /Volumes/LEXAR/repos/openclaw-webchat-message-renderer, branch fix/webchat-message-tool-renderer, latest patch 45e0d6de92a, using real source modules and focused Vitest coverage. This is not a full browser WebChat E2E run.
Exact steps or command run after this patch:

OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test src/agents/tools/message-tool.test.ts extensions/codex/src/app-server/dynamic-tools.test.ts src/agents/pi-embedded-runner/run/payloads.test.ts src/agents/pi-embedded-runner/run/tool-media-payloads.test.ts
pnpm exec oxfmt --check --threads=1 src/agents/tools/message-tool.ts src/agents/tools/message-tool.test.ts extensions/codex/src/app-server/dynamic-tools.ts extensions/codex/src/app-server/dynamic-tools.test.ts
git diff --check

Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output):

[test] passed 3 Vitest shards in 9.61s
Test Files  2 passed (2), 1 passed (1), 1 passed (1)
Tests  29 passed (29), 50 passed (50), 25 passed (25)

oxfmt: All matched files use the correct format.
git diff --check: passed

Focused regressions now assert:

same-session WebChat result.details.message === "Actual visible reply after tools."
Codex bridge telemetry.messagingToolSentTexts === ["Visible reply from Codex."]
Codex bridge telemetry target text === "Visible reply from Codex."

Observed result after fix: same-session WebChat message sends are bridge-successful (status: "ok", deliveryStatus: "sent"), the renderer receives source-suppression-safe payload metadata, and Codex telemetry prefers sanitized tool-result details over raw original args.
What was not tested: full browser WebChat DOM rendering and broad repo type/lint. Those should remain GitHub CI/Testbox work per local-resource policy.
Before evidence (optional but encouraged): ClawSweeper identified that raw attempt.messagingToolSentTexts could render <think>hidden</think>Visible reply even though createMessageTool sanitized its copied params. The new test locks the sanitized result path.

Root Cause (if applicable)

Root cause: the first renderer consumed Codex bridge messaging telemetry collected from original tool args, while createMessageTool strips reasoning tags on a copied params object before returning same-session result details.
Missing detection / guardrail: coverage proved the WebChat message-tool payload existed, but did not assert reasoning-tag sanitization at the Codex bridge telemetry boundary.
Contributing context (if known): WebChat same-session delivery is intentionally not an outbound channel send, so the safe display text must come from the sanitized tool result, not raw transport args.
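
A minimal sketch of why the root cause arises, assuming a simple `<think>...</think>` stripper: sanitizing a copy leaves the original argument object untouched, so anything that later reads the original still sees the raw text. `stripReasoningTags`, `sanitizeCopy`, and `MessageArgs` are hypothetical names; the real tag set and stripping rules inside createMessageTool are not shown in this PR.

```typescript
// Hypothetical sanitizer: removes <think>...</think> blocks only.
function stripReasoningTags(text: string): string {
  return text.replace(/<think>[\s\S]*?<\/think>/g, "").trim();
}

type MessageArgs = { action: string; message: string };

// Copy first, then sanitize the copy; the caller's object is not mutated.
function sanitizeCopy(args: MessageArgs): MessageArgs {
  return { ...args, message: stripReasoningTags(args.message) };
}

const originalArgs: MessageArgs = {
  action: "send",
  message: "<think>hidden</think>Visible reply",
};
const sanitizedArgs = sanitizeCopy(originalArgs);

// sanitizedArgs.message is "Visible reply";
// originalArgs.message still contains the <think> block.
```

Telemetry that captures `originalArgs` therefore needs a different source for display text, which is what the sanitized result details provide.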

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file: src/agents/tools/message-tool.test.ts, extensions/codex/src/app-server/dynamic-tools.test.ts
Scenario the test should lock in: same-session WebChat message sends sanitize reasoning tags before display, and Codex dynamic bridge telemetry records sanitized result text.
Why this is the smallest reliable guardrail: it covers both the source sanitizer and the bridge telemetry seam without requiring a browser WebChat E2E.
Existing test that already covers this (if any): none before the sanitizer follow-up.
If no new test is added, why not: N/A, new regressions are included.

User-visible / Behavior Changes

WebChat users can receive a real visible assistant reply from the message tool path in message_tool_only mode. The rendered text is sanitized and can pass through source-reply suppression intentionally.

Diagram (if applicable)

flowchart LR
  call["message(action=send)"] --> sanitize["Message tool sanitizes visible text"]
  sanitize --> result["Tool result detail text"]
  result --> telemetry["Codex dynamic-tool telemetry"]
  telemetry --> payload["Embedded-run payload"]
  payload --> webchat["WebChat visible reply"]
  raw["Raw original args"] -. "not used for visible text" .-> telemetry

Before:
[Codex calls message tool] -> [message tool sanitizes copied params]
  -> [Codex telemetry stores raw original args]
  -> [WebChat renderer could display raw telemetry]

After:
[Codex calls message tool] -> [message tool returns sanitized same-session result]
  -> [Codex telemetry prefers sanitized result.details.message]
  -> [WebChat renderer displays sanitized visible reply]
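
The "After" flow can be sketched as a small preference function: read the text from record-shaped result details first, and fall back to the original args only when no such text exists. All helper names here (`isRecord`, `readDetailsMessage`, `pickTelemetryText`) are hypothetical; only the preference order and the `result.details.message` location come from this PR.

```typescript
// Hypothetical helpers; not the actual dynamic-tools.ts implementation.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Reads result.details.message only when both levels are record-shaped.
function readDetailsMessage(result: unknown): string | undefined {
  if (!isRecord(result)) return undefined;
  const details = result["details"];
  if (!isRecord(details)) return undefined;
  const message = details["message"];
  return typeof message === "string" && message.length > 0 ? message : undefined;
}

// Prefer sanitized result text; fall back to the original args otherwise.
function pickTelemetryText(args: unknown, result: unknown): string | undefined {
  const fromResult = readDetailsMessage(result);
  if (fromResult !== undefined) return fromResult;
  if (isRecord(args)) {
    const raw = args["message"];
    if (typeof raw === "string") return raw;
  }
  return undefined;
}
```

For a same-session WebChat send, the result always carries sanitized details, so the raw fallback should only matter for other tool results.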

Security Impact (required)

New permissions/capabilities? (Yes/No): No
Secrets/tokens handling changed? (Yes/No): No
New/changed network calls? (Yes/No): No
Command/tool execution surface changed? (Yes/No): No
Data access scope changed? (Yes/No): No
If any Yes, explain risk + mitigation: N/A. The security-sensitive review finding was about reasoning-tag text disclosure; the patch mitigates it by rendering sanitized result text.

Repro + Verification

Environment

OS: macOS local development host
Runtime/container: local Lexar-backed OpenClaw checkout
Model/provider: not model-provider dependent
Integration/channel (if any): WebChat same-session message tool path and Codex dynamic bridge
Relevant config (redacted): sourceReplyDeliveryMode: "message_tool_only", current channel webchat

Steps

Run the focused message tool and Codex bridge tests.
Verify same-session result details contain sanitized message text.
Verify bridge telemetry records sanitized result text and classifies status: "ok" as success.

Expected

Same-session WebChat message send returns success semantics.
Renderable telemetry contains sanitized visible text only.
Source-suppression payload metadata survives.

Actual

Focused tests passed, 104 total checks across 3 Vitest shards.
Formatter and whitespace checks passed.

Evidence

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

Verified scenarios: same-session WebChat message result, Codex bridge success classification, sanitized telemetry, payload metadata preservation.
Edge cases checked: reasoning-tag message args, status: "ok" plus deliveryStatus: "sent", external sends remain outside this same-session path.
What you did not verify: browser DOM rendering and full GitHub CI matrix.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

ClawSweeper P2 addressed by 45e0d6de92a: render sanitized WebChat message-tool text.

Compatibility / Migration

Backward compatible? (Yes/No): Yes
Config/env changes? (Yes/No): No
Migration needed? (Yes/No): No
If yes, exact upgrade steps: N/A

Risks and Mitigations

Risk: maintainers may prefer the automatic-delivery product shape of #81110 (fix: keep codex webchat replies automatic) instead of this message-tool renderer.
- Mitigation: this remains explicitly framed as the option-2 alternative, with #81110 available as the narrower automatic-delivery path.

clawsweeper · 2026-05-12T20:08:43Z

Codex review: needs real behavior proof before merge.

Summary
The PR adds a same-session WebChat message-tool result path, prefers sanitized tool-result details in Codex telemetry, and marks the resulting embedded-run payloads deliverable despite source reply suppression.

Reproducibility: yes, from source inspection, but not by executing current main: current main can suppress WebChat final delivery in message_tool_only mode while Codex telemetry reads original message-tool args instead of the sanitized result details.

Real behavior proof
Needs real behavior proof before merge: the PR body and comments provide copied focused Vitest/format output, but before merge the contributor should add redacted live WebChat/Gateway proof such as a terminal log, screenshot, recording, or linked artifact. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, ask a maintainer to comment @clawsweeper re-review.

Next step before merge
Human handling is needed to choose this renderer approach versus the open automatic-delivery alternative and to require live behavior proof beyond tests before merge.

Security
Cleared: No concrete security or supply-chain issue found; the diff adds no dependencies, permissions, secret handling, or network calls, and it reduces the reasoning-tag disclosure risk by preferring sanitized result details.

Review details

Best possible solution:

Pick one canonical fix for #81109: land this sanitized renderer path with live behavior proof, or land #81110 and close the unused alternative.

Do we have a high-confidence way to reproduce the issue?

Yes, from source inspection, but not by executing current main: current main can suppress WebChat final delivery in message_tool_only mode while Codex telemetry reads original message-tool args instead of the sanitized result details.

Is this the best way to solve the issue?

Unclear as a product choice. The implementation is a maintainable sanitized renderer option, but the narrower alternative is to keep internal WebChat replies automatic via #81110.

Acceptance criteria:

Contributor proof should exercise a real WebChat/Gateway path in message_tool_only mode and show the sanitized visible reply, with private endpoints, tokens, and user data redacted.
If maintainers choose this PR, focused validation should include pnpm test src/agents/tools/message-tool.test.ts extensions/codex/src/app-server/dynamic-tools.test.ts src/agents/pi-embedded-runner/run/payloads.test.ts src/agents/pi-embedded-runner/run/tool-media-payloads.test.ts plus the relevant changed gate.

What I checked:

Current main sanitizer boundary: Current main copies message-tool arguments and strips reasoning tags from text/content/message/caption before outbound handling, so sanitized text exists only on the copied params/result path. (src/agents/tools/message-tool.ts:826, 8a6c18a08a03)
Current main telemetry root cause: Current main Codex dynamic-tool telemetry records message text from original args, which is the unsafe source if WebChat renders that telemetry after the sanitizer copied params. (extensions/codex/src/app-server/dynamic-tools.ts:281, 8a6c18a08a03)
Current main suppression bypass seam: Current dispatch can deliver payloads despite source-reply suppression only when reply-payload metadata sets deliverDespiteSourceReplySuppression and send policy does not deny it. (src/auto-reply/reply/dispatch-from-config.ts:1557, 8a6c18a08a03)
PR same-session WebChat path: The PR head returns status ok, deliveryStatus sent, delivery webchat-session, channel webchat, and sanitized message details for same-session WebChat message sends before outbound delivery is invoked. (src/agents/tools/message-tool.ts:784, 45e0d6de92ac)
PR sanitized Codex telemetry: The PR head makes Codex telemetry prefer text from result.details before falling back to original args, and adds a helper that reads only record-shaped result details. (extensions/codex/src/app-server/dynamic-tools.ts:281, 45e0d6de92ac)
PR embedded-run rendering path: The PR head renders WebChat messaging-tool texts only for message_tool_only WebChat runs and marks those payloads for source-suppression delivery. (src/agents/pi-embedded-runner/run.ts:266, 45e0d6de92ac)
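
The suppression bypass seam in the checks above can be read as a small predicate. This is a sketch under stated assumptions, not the code in dispatch-from-config.ts: the `ReplyPayload` and `SendPolicy` types and the `shouldDeliver` function are hypothetical, and treating a denying policy as blocking unsuppressed payloads too is an assumption. Only the `deliverDespiteSourceReplySuppression` metadata flag and the "policy does not deny" condition come from the review.

```typescript
// Hypothetical types for illustration only.
type ReplyPayload = {
  text: string;
  metadata?: { deliverDespiteSourceReplySuppression?: boolean };
};
type SendPolicy = "allow" | "deny";

function shouldDeliver(
  payload: ReplyPayload,
  sourceReplySuppressed: boolean,
  policy: SendPolicy,
): boolean {
  // Assumption: a denying send policy always wins.
  if (policy === "deny") return false;
  // Nothing to bypass when source replies are not suppressed.
  if (!sourceReplySuppressed) return true;
  // Under suppression, only explicitly marked payloads get through.
  return payload.metadata?.deliverDespiteSourceReplySuppression === true;
}
```

Under this reading, the PR's embedded-run payloads reach WebChat because they carry the flag, while unmarked final replies stay suppressed in message_tool_only mode.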

Likely related people:

@steipete: Recent history shows central work on message-tool delivery, message-tool-only reply guidance, and bounded Codex dynamic-tool responses in the affected paths. (role: feature owner / adjacent owner; confidence: high; commits: 5e8e77ed83eb, b62166301efd, 09baec68eac7; files: src/agents/tools/message-tool.ts, extensions/codex/src/app-server/dynamic-tools.ts)
@pashpashpash: Path history for the Codex app-server dynamic-tool and harness surfaces includes deferred dynamic tools, structured tool replies, and Codex runtime policy work. (role: Codex dynamic-tool area contributor; confidence: high; commits: 3f217964d1f9, 439d8edf68e2, 02fe0d8978db; files: extensions/codex/src/app-server/dynamic-tools.ts, src/auto-reply/reply/dispatch-from-config.ts)
@vincentkoc: Recent merged work hardened Codex harness control surfaces and sanitizer-related agent behavior near the same telemetry and message rendering boundaries. (role: adjacent owner; confidence: medium; commits: ac3cd1a0ca8c, 92d33e4de85a, 47f6a98909b5; files: extensions/codex/src/app-server/dynamic-tools.ts, src/agents/tools/message-tool.ts, src/plugin-sdk/channel-streaming.test.ts)

Remaining risk / open question:

The supplied proof is only focused Vitest/format output; it does not show a live WebChat browser or Gateway path rendering the sanitized reply after the fix.
Maintainers still need to choose between this message-tool renderer shape and the open automatic-delivery alternative, #81110 (fix: keep codex webchat replies automatic).

Codex review notes: model gpt-5.5, reasoning high; reviewed against 8a6c18a08a03.

100yenadmin · 2026-05-12T20:59:18Z

Addressed ClawSweeper's P2 in cf2766eb792:

same-session WebChat sends now return bridge-recognized status: "ok" and preserve semantic delivery state as deliveryStatus: "sent"
added Codex dynamic bridge coverage proving the same-session WebChat message(action="send") result is classified as success: true instead of an error
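
For illustration, the success classification described above might reduce to a check like the following. This is a hedged sketch: `classifyToolResult`, `ToolResultLike`, and the one-entry `SUCCESS_STATUSES` set are hypothetical, and the real bridge may recognize more statuses than `"ok"`.

```typescript
// Hypothetical classifier; not the actual Codex dynamic bridge code.
type ToolResultLike = { status?: string };

// Assumption: "ok" is the only status shown here as bridge-recognized.
const SUCCESS_STATUSES: ReadonlySet<string> = new Set(["ok"]);

function classifyToolResult(result: ToolResultLike): { success: boolean } {
  const status = result.status;
  return { success: status !== undefined && SUCCESS_STATUSES.has(status) };
}
```

With a check of this kind, a result that reports its outcome only through a status the bridge does not recognize would be treated as an error, which is why the same-session path keeps the semantic state in a separate `deliveryStatus` field.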

Validation run after the patch:

OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test src/agents/tools/message-tool.test.ts extensions/codex/src/app-server/dynamic-tools.test.ts src/agents/pi-embedded-runner/run/payloads.test.ts src/agents/pi-embedded-runner/run/tool-media-payloads.test.ts
# passed: 3 Vitest shards, 102 tests

pnpm exec oxfmt --check --threads=1 src/agents/tools/message-tool.ts src/agents/tools/message-tool.test.ts extensions/codex/src/app-server/dynamic-tools.test.ts
# passed

git diff --check
# passed before commit

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/25766152487
Updated: 2026-05-12T22:40:39.426Z

100yenadmin · 2026-05-12T21:11:57Z

Follow-up on the CI failures from my previous push:

Fixed the extension-boundary violation by removing the direct ../../../../src/agents/tools/message-tool.js import from extensions/codex/src/app-server/dynamic-tools.test.ts.
Kept the proof split correctly: src/agents/tools/message-tool.test.ts verifies the real WebChat same-session message tool result now reports status: "ok" plus deliveryStatus: "sent", while the Codex extension bridge test verifies that status: "ok" message sends are classified as successful dynamic tool results.

Local validation from /Volumes/LEXAR/repos/openclaw-webchat-message-renderer:

OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test extensions/codex/src/app-server/dynamic-tools.test.ts src/agents/tools/message-tool.test.ts passed.
pnpm exec oxfmt --check --threads=1 extensions/codex/src/app-server/dynamic-tools.test.ts src/agents/tools/message-tool.ts src/agents/tools/message-tool.test.ts passed.
pnpm run lint:plugins:no-extension-test-core-imports passed.
pnpm tsgo:extensions:test passed.

100yenadmin · 2026-05-12T22:24:13Z

Addressed the ClawSweeper sanitizer finding in 45e0d6de92a: Codex bridge telemetry now prefers sanitized message tool result details over raw original args, and same-session WebChat message output is covered at both the message-tool and Codex dynamic bridge seams. Local proof: focused 3-shard command passed 104 tests; oxfmt --check and git diff --check passed. @clawsweeper re-review please

100yenadmin · 2026-05-14T10:32:32Z

@pashpashpash looks like #81586 landed the broader internal-UI message-tool sink for the same WebChat/Codex issue this PR was carrying as the option-2 renderer path. All good on the final shape, but I think your clanker forgot two housekeeping bits: close this as superseded and credit me / #81144 in the changelog 😄

steipete · 2026-05-15T13:32:08Z

Thanks for working on this. This WebChat/TUI current-run message-tool path has now been fixed on main by #81586, merged as 78eb92e.

I rechecked the current code path: the message tool now returns the internal UI source reply sink, Codex telemetry extracts it, and the Pi payload builder projects it back into visible WebChat/TUI reply payloads plus transcript mirroring. Since this PR is superseded by the landed broader fix, I’m closing it to keep the queue clean.

steipete · 2026-05-15T13:32:12Z

Superseded by #81586, which is merged on main.

fix: render webchat message tool replies

fa83e78

clawsweeper Bot added the "proof: sufficient" label (ClawSweeper judged the real behavior proof convincing) on May 12, 2026

fix: classify webchat message sends as codex successes

cf2766e

openclaw-barnacle Bot added the "extensions: codex" label on May 12, 2026

test: keep codex message bridge test on extension boundary

486d54d

openclaw-barnacle Bot removed the "proof: sufficient" label on May 12, 2026

clawsweeper Bot added the "proof: sufficient" label on May 12, 2026

clawsweeper Bot mentioned this pull request on May 12, 2026, in #81110 (fix: keep codex webchat replies automatic), which is closed

fix: render sanitized webchat message-tool text

45e0d6d

openclaw-barnacle Bot removed the "proof: sufficient" label on May 12, 2026

clawsweeper Bot added the "proof: sufficient" label on May 12, 2026

openclaw-barnacle Bot removed the "proof: sufficient" label on May 12, 2026

steipete closed this on May 15, 2026

100yenadmin wants to merge 4 commits into openclaw:main from electricsheephq:fix/webchat-message-tool-renderer
