fix: keep codex webchat replies automatic #81110

Closed
100yenadmin wants to merge 3 commits into openclaw:main from electricsheephq:fix/codex-webchat-direct-delivery

Conversation

@100yenadmin 100yenadmin commented May 12, 2026

Contributor

This PR separates internal WebChat turns from external channel-origin WebChat delivery so Codex harness defaults do not accidentally suppress normal same-surface replies. Internal WebChat keeps automatic final delivery, while explicit external origins still use the message-tool-only policy where that contract is required.

Summary

This is the narrow safety fix for #81109. Codex harnesses default direct visible replies to the message tool, which is right for external direct-chat surfaces, but internal WebChat is not an outbound delivery target. This PR keeps WebChat-only turns automatic so the user sees the assistant's final reply, while preserving message_tool_only for explicit external origins that happen to be operated through WebChat.

  • Problem: internal WebChat Codex turns could suppress the visible final reply and steer the model toward message(action="send"), producing only "Sent." or no useful visible answer.
  • Why it matters: WebChat is the local interactive surface; users need the answer rendered directly unless the turn is explicitly acting on behalf of an external origin.
  • What changed: harness sourceVisibleReplies: "message_tool" defaults are skipped only for internal-only WebChat source contexts.
  • What did NOT change (scope boundary): external-origin WebChat routes still keep the Codex message-tool default, and external channel delivery behavior is unchanged.
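
The policy in the bullets above can be pictured as a small pure function. This is a minimal sketch, not the code in src/auto-reply/reply/dispatch-from-config.ts: the names `resolveDeliveryMode`, `DeliveryInput`, and `internalOnlyWebChat` are hypothetical, and only the string values "message_tool", "message_tool_only", and "automatic" are taken from this PR.

```typescript
// Hypothetical types for illustration; the real OpenClaw shapes may differ.
type SourceVisibleReplies = "automatic" | "message_tool";
type SourceReplyDeliveryMode = "automatic" | "message_tool_only";

interface DeliveryInput {
  // Harness-level default, e.g. the Codex harness sets "message_tool".
  harnessDefault?: SourceVisibleReplies;
  // Assumed to be computed elsewhere: true when the turn has no external origin.
  internalOnlyWebChat: boolean;
}

function resolveDeliveryMode(input: DeliveryInput): SourceReplyDeliveryMode {
  // The harness default is skipped only for internal-only WebChat turns.
  const effective = input.internalOnlyWebChat ? undefined : input.harnessDefault;
  // A "message_tool" default maps to the message_tool_only delivery mode;
  // anything else falls back to automatic final delivery.
  return effective === "message_tool" ? "message_tool_only" : "automatic";
}
```

With this shape, an internal WebChat turn under the Codex default resolves to "automatic", while the same default on an external-origin turn still resolves to "message_tool_only".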

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: a direct internal WebChat Codex turn should render the assistant final reply to the WebChat thread instead of suppressing automatic delivery, while explicit external WebChat origins should still use the Codex message-tool default.
  • Real environment tested: local OpenClaw checkout at /Volumes/LEXAR/repos/openclaw-codex-webchat-delivery, branch fix/codex-webchat-direct-delivery, latest patch d34e6ee76d4, including dispatch policy tests and in-process Gateway WebSocket WebChat proof.
  • Exact steps or command run after this patch:
OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test src/auto-reply/reply/dispatch-from-config.test.ts src/gateway/server.chat.gateway-server-chat-b.test.ts
pnpm exec oxfmt --check --threads=1 src/auto-reply/reply/dispatch-from-config.ts src/auto-reply/reply/dispatch-from-config.test.ts src/gateway/server.chat.gateway-server-chat-b.test.ts
git diff --check
  • Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output):
[test] passed 2 Vitest shards in 16.09s
Gateway shard: Test Files 1 passed, Tests 18 passed
Auto-reply shard: Test Files 1 passed, Tests 108 passed

oxfmt: All matched files use the correct format.
git diff --check: passed
  • Observed result after fix: the internal WebChat regression asserts sourceReplyDeliveryMode === "automatic", queuedFinal === true, and that the final payload text is "visible webchat reply". The new external-origin regression asserts sourceReplyDeliveryMode === "message_tool_only", queuedFinal === false, and that no automatic final WebChat send occurs for an explicit OriginatingChannel: "imessage" route.
  • What was not tested: manual browser DOM rendering. The Gateway proof uses the same in-process WebSocket route that WebChat uses.
  • Before evidence (optional but encouraged): ClawSweeper identified the first guard as too broad because it matched WebChat surface/provider even when an explicit external origin was present.

Root Cause (if applicable)

  • Root cause: the first guard treated any WebChat provider/surface/origin as internal WebChat, but chat.send can carry WebChat surface metadata while explicitly operating on behalf of another channel.
  • Missing detection / guardrail: coverage proved internal WebChat stayed automatic but did not assert the external-origin WebChat route kept message_tool_only.
  • Contributing context (if known): issue #81092 ("Dashboard webchat turns can enter message-tool-only mode without a send target") explicitly calls out preserving message-tool delivery when WebChat is acting as an external-origin operator surface.
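
The difference between the first guard and the narrowed one can be sketched as two predicates. This is an illustration under assumptions, not the patched code: `SourceContext`, `isWebChatBroad`, and `isInternalOnlyWebChat` are hypothetical names, and the `Provider` and `Surface` field names are assumed (only `OriginatingChannel` appears in this PR).

```typescript
// Hypothetical context shape; field names other than OriginatingChannel are assumed.
interface SourceContext {
  Provider?: string;
  Surface?: string;
  OriginatingChannel?: string;
}

const isWebChat = (value: string | undefined): boolean =>
  value?.trim().toLowerCase() === "webchat";

// First guard (too broad): any WebChat marker counts as internal WebChat.
function isWebChatBroad(ctx: SourceContext): boolean {
  return isWebChat(ctx.Provider) || isWebChat(ctx.Surface) || isWebChat(ctx.OriginatingChannel);
}

// Narrowed guard: a WebChat marker is still required, but an explicit
// origin, when present, must itself be WebChat.
function isInternalOnlyWebChat(ctx: SourceContext): boolean {
  const hasExternalOrigin =
    ctx.OriginatingChannel !== undefined && !isWebChat(ctx.OriginatingChannel);
  return isWebChatBroad(ctx) && !hasExternalOrigin;
}

// WebChat talking to itself versus WebChat operating on behalf of iMessage.
const internalTurn: SourceContext = { Provider: "webchat", Surface: "webchat", OriginatingChannel: "webchat" };
const operatorTurn: SourceContext = { Provider: "webchat", Surface: "webchat", OriginatingChannel: "imessage" };
```

Under these assumptions the broad guard accepts `operatorTurn`, which is the over-match ClawSweeper flagged, while the narrowed guard accepts only `internalTurn`.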

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/auto-reply/reply/dispatch-from-config.test.ts, src/gateway/server.chat.gateway-server-chat-b.test.ts
  • Scenario the test should lock in: internal-only WebChat stays automatic; explicit external WebChat origins keep Codex message-tool delivery.
  • Why this is the smallest reliable guardrail: dispatch policy is deterministic and the Gateway test exercises the WebChat chat.send route without needing a browser.
  • Existing test that already covers this (if any): none for the explicit external-origin harness default before this patch.
  • If no new test is added, why not: N/A, a new regression is included.

User-visible / Behavior Changes

Internal WebChat Codex replies render automatically again. WebChat sessions that explicitly deliver on behalf of an external channel still preserve message-tool-only behavior.

Diagram (if applicable)

flowchart LR
  turn["Incoming WebChat turn"] --> origin{"External origin?"}
  origin -->|no| internal["Internal WebChat"]
  internal --> auto["Automatic final reply"]
  origin -->|yes| external["Explicit external-origin route"]
  external --> messageOnly["message_tool_only contract"]
  messageOnly --> channel["Channel-specific delivery"]
Internal WebChat:
[WebChat source only] -> [Codex harness default skipped] -> [automatic final reply]

Explicit external origin through WebChat:
[WebChat surface + OriginatingChannel=imessage]
  -> [Codex harness default preserved]
  -> [message_tool_only, no accidental automatic external final]

Security Impact (required)

  • New permissions/capabilities? (Yes/No): No
  • Secrets/tokens handling changed? (Yes/No): No
  • New/changed network calls? (Yes/No): No
  • Command/tool execution surface changed? (Yes/No): No
  • Data access scope changed? (Yes/No): No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS local development host
  • Runtime/container: local Lexar-backed OpenClaw checkout
  • Model/provider: Codex harness delivery-default policy, not model-provider dependent
  • Integration/channel (if any): WebChat dispatch and in-process Gateway WebSocket route
  • Relevant config (redacted): harness deliveryDefaults.sourceVisibleReplies: "message_tool"

Steps

  1. Run the dispatch and Gateway WebChat proof command.
  2. Confirm internal WebChat uses automatic source reply delivery.
  3. Confirm explicit external-origin WebChat keeps message_tool_only.

Expected

  • Internal WebChat final text is queued and visible.
  • Explicit external-origin WebChat does not bypass the Codex message-tool default.

Actual

  • 126 focused tests passed across dispatch and Gateway shards.
  • Formatter and whitespace checks passed.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: internal WebChat automatic delivery, external-origin WebChat message-tool preservation, Gateway WebSocket WebChat final reply path.
  • Edge cases checked: OriginatingChannel: "webchat" vs OriginatingChannel: "imessage", explicit deliver route, Codex harness message_tool default.
  • What you did not verify: manual browser DOM rendering and full GitHub CI matrix.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

ClawSweeper P2 addressed by d34e6ee76d4: limit the WebChat bypass to internal-only sources.

Compatibility / Migration

  • Backward compatible? (Yes/No): Yes
  • Config/env changes? (Yes/No): No
  • Migration needed? (Yes/No): No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

@openclaw-barnacle openclaw-barnacle Bot added size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 12, 2026
@clawsweeper

clawsweeper Bot commented May 12, 2026

Contributor

Codex review: needs maintainer review before merge.

Summary
The PR changes reply dispatch so internal Codex WebChat turns skip harness message-tool defaults, with dispatch and Gateway WebSocket regressions for internal WebChat and external-origin WebChat routes.

Reproducibility: yes, by source inspection. Current main applies the Codex message_tool harness default to direct WebChat turns, and the shared policy maps it to message_tool_only, suppressing automatic final delivery. I did not execute tests because this review is read-only.

Real behavior proof
Sufficient (terminal): The PR body provides after-fix terminal output from focused dispatch tests and an in-process Gateway WebSocket WebChat proof that exercises the changed route.

Next step before merge
No ClawSweeper repair lane is needed; the patch has no blocking findings and should proceed through normal maintainer review, CI, and product choice against the open alternative.

Security
Cleared: The diff only changes reply-dispatch policy and tests, with no new permissions, dependencies, scripts, secret handling, or code execution surface.

Review details

Best possible solution:

Land one narrow WebChat/Codex delivery fix that keeps internal WebChat replies visible while preserving message-tool delivery for explicit external origins.

Do we have a high-confidence way to reproduce the issue?

Yes, by source inspection: current main applies the Codex message_tool harness default to direct WebChat turns, and the shared policy maps it to message_tool_only, suppressing automatic final delivery. I did not execute tests because this review is read-only.

Is this the best way to solve the issue?

Yes: the patch targets the harness-default path for internal-only WebChat contexts and adds a regression for explicit external-origin WebChat, which is the narrow maintainable boundary for this bug.

What I checked:

Likely related people:

  • pashpashpash: History shows this account introduced the Codex harness visible-reply default and later Codex routing work related to this delivery path. (role: introduced behavior and Codex harness contributor; confidence: high; commits: 439d8edf68e2, 42e259a696a2, 02fe0d8978db; files: extensions/codex/harness.ts, src/auto-reply/reply/dispatch-from-config.ts, src/auto-reply/reply/source-reply-delivery-mode.ts)
  • steipete: Recent history includes Codex harness setup/refactors and visible-reply policy commits across the same surfaces. (role: adjacent Codex and visible-reply area contributor; confidence: medium; commits: dd26e8c44d4e, 47c0ce5f8531, e1fd27fb24ae; files: extensions/codex/harness.ts, src/auto-reply/reply/source-reply-delivery-mode.ts)
  • vincentkoc: Recent history shows gateway chat.send and source-reply visibility fixes near the context-building path this PR exercises. (role: recent gateway and source-reply contributor; confidence: medium; commits: 7fd7f6f35591, a2f1d1dfd8ab; files: src/gateway/server-methods/chat.ts, src/auto-reply/reply/source-reply-delivery-mode.ts)

Remaining risk / open question:

Codex review notes: model gpt-5.5, reasoning high; reviewed against 73abe9e98a29.

@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 12, 2026
@openclaw-barnacle openclaw-barnacle Bot added the gateway Gateway runtime label May 12, 2026
@100yenadmin

100yenadmin commented May 12, 2026

Contributor Author

Added a gateway-level WebChat proof for the ClawSweeper real-behavior gate.

What changed after the review:

  • Added src/gateway/server.chat.gateway-server-chat-b.test.ts coverage that opens a real in-process Gateway WebSocket client in WebChat mode, registers the Codex harness default sourceVisibleReplies: "message_tool", sends chat.send, waits for the final chat event, and asserts that the final reply remains automatic with the text "visible webchat reply".
  • Updated the PR body's Real behavior proof section with the gateway proof commands and terminal output.

Validation from /Volumes/LEXAR/repos/openclaw-codex-webchat-delivery:

  • OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test src/gateway/server.chat.gateway-server-chat-b.test.ts -- -t "webchat chat.send keeps Codex harness visible replies automatic" passed.
  • OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test src/gateway/server.chat.gateway-server-chat-b.test.ts passed, 18/18 tests.
  • pnpm exec oxfmt --check --threads=1 src/gateway/server.chat.gateway-server-chat-b.test.ts passed.
  • git diff --check passed.

@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 12, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 12, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 12, 2026
@100yenadmin

Contributor Author

Addressed the ClawSweeper internal-only WebChat finding in d34e6ee76d4: the bypass now applies only to internal WebChat source contexts, while explicit external WebChat origins keep message_tool_only. Local proof: dispatch + Gateway WebChat tests passed 126 tests; oxfmt --check and git diff --check passed. @clawsweeper re-review please

@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 12, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 12, 2026
@pashpashpash

Contributor

Thanks @100yenadmin for digging into this and for separating internal WebChat from external-origin delivery here. That helped clarify the real product boundary.

I’m going to close this in favor of #81586. The reason is that this PR makes internal WebChat automatic again, which is a reasonable safety patch, but it routes around the Codex harness contract instead of fixing the missing sink. In Codex message-tool-only mode, the model is expected to use tools.message for source-visible replies. WebChat and the TUI are internal current-run surfaces, not outbound channels, so the fix we’re taking is to let a targetless message.send satisfy the active internal UI turn while keeping WebChat/TUI non-targetable and not inheriting lastChannel.

That preserves explicit external channel routing and fixes the TUI/WebChat shape the same way, without adding a fallback or suppressing Codex’s intended message-tool path.
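
The alternative described in this comment can be sketched roughly as follows. This is a hedged illustration of the idea only, not the implementation in #81586: `routeMessageSend`, `MessageSendArgs`, `ActiveTurn`, and `SendResult` are hypothetical names, and the real tools.message contract may look quite different.

```typescript
// Hypothetical shapes; not the actual tools.message contract.
interface MessageSendArgs {
  action: "send";
  text: string;
  target?: string; // explicit external channel target, if any
}

interface ActiveTurn {
  surface: string; // e.g. "webchat", "tui", or an external channel name
  lastChannel?: string; // deliberately ignored by targetless sends below
}

type SendResult =
  | { kind: "internal_ui"; text: string }
  | { kind: "external"; target: string; text: string }
  | { kind: "error"; reason: string };

const INTERNAL_SURFACES = new Set<string>(["webchat", "tui"]);

function routeMessageSend(args: MessageSendArgs, turn: ActiveTurn): SendResult {
  if (args.target !== undefined) {
    // Internal UI surfaces stay non-targetable.
    if (INTERNAL_SURFACES.has(args.target)) {
      return { kind: "error", reason: `"${args.target}" is not a valid send target` };
    }
    return { kind: "external", target: args.target, text: args.text };
  }
  // A targetless send satisfies the active internal UI turn and does not
  // fall back to turn.lastChannel.
  if (INTERNAL_SURFACES.has(turn.surface)) {
    return { kind: "internal_ui", text: args.text };
  }
  return { kind: "error", reason: "a target is required outside internal UI turns" };
}
```

In this sketch the model keeps using the message tool in message-tool-only mode; the change is that a send with no target has somewhere to land when the current run is a WebChat or TUI turn.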

@100yenadmin

Contributor Author

@pashpashpash looks like #81586 landed the broader internal-UI message-tool sink for the same WebChat/Codex issue this PR was carrying as the option-2 renderer path. All good on the final shape, but I think your clanker forgot two housekeeping bits: close this as superseded and credit me / #81144 in the changelog 😄

Labels

  • gateway (Gateway runtime)
  • proof: sufficient (ClawSweeper judged the real behavior proof convincing.)
  • proof: supplied (External PR includes structured after-fix real behavior proof.)
  • size: S

Development

Successfully merging this pull request may close these issues.

Codex harness default can suppress direct WebChat final replies

2 participants