fix(agents): preserve commentary/final_answer phase separation by ringlochid · Pull Request #59643 · openclaw/openclaw

ringlochid · 2026-04-02T11:34:10Z

Summary

Problem: assistant turns that contain both commentary and final_answer text can be flattened into one visible output, which leaks commentary into user-facing replies and can produce duplicate or malformed final delivery.
Why it matters: this breaks the expected final-only user experience, causes duplicate replies after tool/send paths, and corrupts replay/context because mixed-phase text is persisted and replayed ambiguously.
What changed: preserved phase separation end-to-end across stored-message conversion, replay/input-item rebuilding, WebSocket partial phase propagation, and visible extraction/delivery so user-visible output prefers final_answer while still falling back safely when no final text exists.
What did NOT change (scope boundary): this does not globally redefine every text extractor in the repo, does not change tool-call semantics, and does not attempt a broader phase-aware audit outside the main OpenAI WS -> embedded subscribe -> visible delivery path.

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Closes Commentary text can leak into final assistant replies, and duplicate visible replies can occur after tool sends #59150
Related ACP stream deliveryMode: final_only still sends duplicate block+final replies #56198
Related ACP block replies not marked as visible text on Discord, causing full message duplication #58892
Related [Bug]: Telegram/Codex still leaks commentary-phase tool-call trace text with multilingual garbage tokens in 2026.3.13 #52084
Related Text between tool calls leaks to messaging channels #25592
Related message.send triggers duplicate delivery-mirror replies #44467
Related PR fix: strip raw OpenAI function-call wire format from user-facing text #30479
Related PR WhatsApp: send commentary live #57484
This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

Root cause: assistant text phase truth already existed at block level, but several layers still flattened mixed-phase text. Stored assistant messages could carry both commentary and final-answer blocks under one misleading top-level phase; replay collapsed them back together; stream partials could lose item-phase attribution; and visible extraction/delivery then consumed flattened text instead of phase-aware text.
Missing detection / guardrail: there was no focused regression coverage for mixed-phase stored messages, phase-aware replay splitting, signature-only phased partials, or text_end / message_end delivery interactions where commentary previews must be suppressed/replaced by final output.
Prior context (git blame, prior PR, issue, or refactor if known): this bug family overlaps longstanding leakage/duplication reports in Commentary text can leak into final assistant replies, and duplicate visible replies can occur after tool sends #59150, ACP stream deliveryMode: final_only still sends duplicate block+final replies #56198, ACP block replies not marked as visible text on Discord, causing full message duplication #58892, [Bug]: Telegram/Codex still leaks commentary-phase tool-call trace text with multilingual garbage tokens in 2026.3.13 #52084, Text between tool calls leaks to messaging channels #25592, and message.send triggers duplicate delivery-mirror replies #44467. Adjacent prior art includes fix: strip raw OpenAI function-call wire format from user-facing text #30479 (stripping raw user-facing protocol leakage) and WhatsApp: send commentary live #57484 (commentary-delivery semantics on a channel path), but those did not fix mixed-phase persistence/replay/delivery end-to-end.
Why this regressed now: the issue is not a single recent regression; it is an accumulated phase-separation gap that became more visible once commentary, block replies, tool sends, and final-answer delivery all coexisted on the same assistant turn path.
If unknown, what was ruled out: ruled out “display-only” root cause. Investigation confirmed the problem begins upstream in stored-message conversion/replay semantics, not only in final rendering.

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file:
- src/agents/openai-ws-stream.test.ts
- src/agents/pi-embedded-utils.test.ts
- src/agents/pi-embedded-subscribe.handlers.messages.test.ts
- src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.emits-block-replies-text-end-does-not.test.ts
- src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.does-not-append-text-end-content-is.test.ts
- src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.does-not-duplicate-text-end-repeats-full.test.ts
Scenario the test should lock in: mixed commentary/final stored messages stay phase-separated on replay; stream partials preserve phase; visible extraction prefers final_answer -> commentary -> legacy/unphased; commentary text_end block replies are suppressed until final delivery; final replacement at message_end works; and duplicate/prefix-extension regressions remain fixed.
Why this is the smallest reliable guardrail: the bug spans stored-message conversion, stream partial attribution, and delivery seams. Unit-only coverage at one layer would miss the cross-layer collapse that caused the visible duplication/leak.
Existing test that already covers this (if any): none before this branch for the mixed-phase end-to-end path.
If no new test is added, why not: N/A

User-visible / Behavior Changes

Visible assistant output now prefers final_answer text when both commentary and final-answer phases exist in one turn.
Commentary-only previews are no longer allowed to leak through as the final visible reply in the main embedded delivery path.
When commentary streamed first and final text arrives later, the final visible reply replaces the preview instead of duplicating it.
If no final-answer text exists, commentary/unphased fallback still works instead of producing an empty reply.

Diagram (if applicable)

Before:
[mixed commentary + final_answer blocks]
  -> [stored/replayed as flattened assistant text]
  -> [visible extractor concatenates all text]
  -> [commentary leak and/or duplicate final reply]

After:
[mixed commentary + final_answer blocks]
  -> [phase preserved in storage/replay/partials]
  -> [visible extractor prefers final_answer]
  -> [commentary preview suppressed/replaced]
  -> [single intended final visible reply]

Security Impact (required)

New permissions/capabilities? No
Secrets/tokens handling changed? No
New/changed network calls? No
Command/tool execution surface changed? No
Data access scope changed? No
If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

OS: Ubuntu (local dev host)
Runtime/container: local Node/pnpm repo checkout
Model/provider: OpenAI WS / embedded subscribe path
Integration/channel (if any): embedded delivery path with Telegram/Discord-adjacent visible-output semantics
Relevant config (redacted): default OpenAI WS / embedded subscribe test harnesses; no special secrets required

Steps

Produce or replay an assistant turn containing both commentary and final_answer text blocks.
Observe stored/replayed assistant content and the user-visible delivery path.
Confirm whether visible output leaks commentary, duplicates final delivery, or collapses mixed phases.

Expected

Stored and replayed assistant content preserves phase boundaries.
User-visible extraction prefers final_answer when present.
Commentary previews are suppressed or replaced rather than duplicated.

Actual

Before this fix: mixed-phase turns could flatten into one visible assistant reply, leak commentary, or produce duplicate final delivery.
On this branch: phase separation is preserved through replay and delivery, and the targeted duplicate/leak regressions are covered by tests.

Evidence

Attach at least one:

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)
local transcript evidence showed assistant turns with both commentary and final_answer blocks in a single message
targeted regression slice passed: 8 suites / 164 tests
fresh independent review loop passed for storage/replay, stream-phase, visible-delivery, and holistic closure before branch submission

Human Verification (required)

What you personally verified (not just CI), and how:

Verified scenarios:
- inspected mixed-phase transcript evidence and confirmed commentary + final-answer coexistence in one assistant turn
- verified stored-message/replay, stream partial propagation, and visible delivery changes in the touched files
- ran the targeted regression slice and confirmed 8 suites / 164 tests passed
- confirmed the branch diff remains scoped to the intended 10 files
Edge cases checked:
- commentary leaking through text_end block replies
- final replacement at message_end after commentary streamed first
- legacy/unphased + phased replay collapse
- signature-only phased partials without top-level partial.phase
- prefix-extension and duplicate text_end regressions
What you did not verify:
- full-repo tsc --noEmit on this host (prior typecheck attempts were memory-constrained)
- a broader follow-up audit of other phase-blind helper paths such as src/agents/tools/sessions-helpers.ts

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? Yes
Config/env changes? No
Migration needed? No
If yes, exact upgrade steps: N/A

Risks and Mitigations

Risk: other assistant-text consumers outside the main embedded delivery path may still use phase-blind flattening and show adjacent inconsistencies.
- Mitigation: this PR keeps scope tight to the verified main issue path and leaves sessions-helpers parity as explicit follow-up watchpoint rather than silently broadening behavior.
Risk: replay/delivery edge cases could regress around partial/final transitions.
- Mitigation: regression coverage now locks in mixed-phase replay splitting, phase-aware partial attribution, commentary suppression, final replacement, and duplicate/text-end edge cases.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f545716520

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-02T11:39:46Z

+      const blockReplyChunk = replace ? cleanedText : deltaText;
+      if (!shouldDeferTextEndBlockReply && blockReplyChunk) {
+        if (ctx.blockChunker) {
+          ctx.blockChunker.append(blockReplyChunk);
+        } else {


Append full final text when commentary was deferred

When blockReplyChunking is enabled with blockReplyBreak: "text_end", phase-aware handling appends only deltaText for final_answer blocks (replace ? cleanedText : deltaText). If commentary was previously deferred and the final answer extends that commentary (for example "Hello" -> "Hello world"), the chunk buffer contains only the suffix. On providers/paths that do not emit text_end before message_end (the OpenAI WS path streams text_delta + done), handleMessageEnd drains that buffer and emits a truncated block reply (e.g. " world") instead of the full final answer.

Useful? React with 👍 / 👎.

Copilot

Pull request overview

This PR fixes mixed-phase assistant turns (commentary + final_answer) being flattened into user-visible output by preserving phase separation across storage/replay, OpenAI WS streaming partials, embedded subscribe handling, and visible-text extraction/delivery.

Changes:

Added phase-aware visible text extraction that prefers final_answer over commentary, with legacy fallback.
Updated embedded subscribe streaming/delivery to suppress commentary block replies and replace previews when final output arrives.
Updated OpenAI WS stream partial emission and replay conversion to carry phase via textSignature and split replayed assistant text on phase changes; added regression tests across seams.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
src/agents/pi-embedded-utils.ts	Adds phase-aware extraction (`extractAssistantVisibleText`) and shared sanitization helpers.
src/agents/pi-embedded-utils.test.ts	Adds unit coverage for visible-text phase preference and legacy fallback behavior.
src/agents/pi-embedded-subscribe.handlers.messages.ts	Uses phase-aware extraction during streaming and message_end delivery; suppresses commentary block replies and supports replacement updates.
src/agents/pi-embedded-subscribe.handlers.messages.test.ts	Adds regression coverage for preview replacement and message_end replacement emission.
src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.emits-block-replies-text-end-does-not.test.ts	Extends seam tests for text_end/message_end duplication and phase-aware suppression behaviors.
src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.does-not-duplicate-text-end-repeats-full.test.ts	Makes tests async and flushes microtasks to validate non-duplication reliably.
src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.does-not-append-text-end-content-is.test.ts	Makes parametrized test async and flushes microtasks for text_end behavior.
src/agents/openai-ws-stream.ts	Emits accumulated, phase-attributed text partials using output item phase mapping and stable `textSignature`.
src/agents/openai-ws-stream.test.ts	Adds replay splitting + mixed-phase message construction tests and WS streaming phase mapping tests.
src/agents/openai-ws-message-conversion.ts	Splits replayed assistant text into separate input items when block signatures indicate phase changes; omits top-level phase for mixed-phase stored messages.

Comments suppressed due to low confidence (1)

src/agents/pi-embedded-subscribe.handlers.messages.ts:651

In the message_end block-reply path, ctx.state.lastBlockReplyText is set to the raw text value (no stripDowngradedToolCallText + no trimEnd), but flushBlockReplyBuffer() later emits via emitBlockChunk(), which compares against lastBlockReplyText using a different normalization (stripDowngradedToolCallText(stripBlockTags(...)).trimEnd()). If the buffered blockBuffer contains trailing whitespace/newlines or downgraded tool-call markers, the equality check can fail and the buffered block can be emitted again, reintroducing duplicate replies in blockReplyBreak === "message_end" flows.

Consider normalizing lastBlockReplyText the same way emitBlockChunk() does and/or clearing/resetting blockBuffer when emitting the final message_end reply so the subsequent flushBlockReplyBuffer() cannot re-emit the same content.

    if (hasBufferedBlockReply && ctx.blockChunker?.hasBuffered()) {
      ctx.blockChunker.drain({ force: true, emit: ctx.emitBlockChunk });
      ctx.blockChunker.reset();
    } else if (text !== ctx.state.lastBlockReplyText) {
      // Check for duplicates before emitting (same logic as emitBlockChunk).
      const normalizedText = normalizeTextForComparison(text);
      if (
        isMessagingToolDuplicateNormalized(
          normalizedText,
          ctx.state.messagingToolSentTextsNormalized,
        )
      ) {
        ctx.log.debug(
          `Skipping message_end block reply - already sent via messaging tool: ${text.slice(0, 50)}...`,
        );
      } else {
        ctx.state.lastBlockReplyText = text;
        emitSplitResultAsBlockReply(ctx.consumeReplyDirectives(text, { final: true }));
      }

greptile-apps · 2026-04-02T11:42:38Z

Greptile Summary

This PR preserves commentary/final_answer phase separation end-to-end: stored-message conversion now splits mixed-phase text into separate input items during replay, the WebSocket stream layer tracks per-item phases and emits phase-attributed textSignature-stamped partials, and the delivery layer (handleMessageUpdate/handleMessageEnd) prefers final_answer text via the new extractAssistantVisibleText helper and suppresses commentary block replies until the final phase arrives. All remaining findings are P2.

Confidence Score: 5/5

Safe to merge; all findings are minor style and performance suggestions with no correctness impact.
The core phase-separation logic is sound across all four layers (conversion, stream, utils, delivery). The lastBlockReplyText guard is safe because it is reset at message_start. Only P2 issues remain: a redundant normalizeAssistantPhase call, a missing explanatory comment on the guard, and up to six content-array passes per streaming delta in extractAssistantVisibleText.
No files require special attention.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/agents/pi-embedded-utils.ts
Line: 258-260

Comment:
**Double call to `normalizeAssistantPhase`**

`normalizeAssistantPhase(parsed.phase)` is invoked twice — once to check truthiness and once to include the value. Extract it to a local variable to avoid the redundant call.

```suggestion
    const phase = normalizeAssistantPhase(parsed.phase);
    return {
      ...(typeof parsed.id === "string" ? { id: parsed.id } : {}),
      ...(phase ? { phase } : {}),
    };
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/agents/pi-embedded-subscribe.handlers.messages.ts
Line: 400-412

Comment:
**`lastBlockReplyText` guard scope**

The `!ctx.state.lastBlockReplyText` guard prevents the forced buffer reset when `text_end` carries the final-answer phase, but `lastBlockReplyText` is reset to `undefined` at `message_start` (in `pi-embedded-subscribe.ts`), so this is safe within a single message turn.

That said, the comment explaining why only-first-block-reply-in-this-turn gets the forced reset would help future readers understand this isn't protecting against a cross-turn leak.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/agents/pi-embedded-utils.ts
Line: 333-345

Comment:
**Up to three content-array passes per `extractAssistantVisibleText` call**

Each `extractAssistantTextForPhase` call does two scans of `msg.content` (`hasExplicitPhasedTextBlocks` scan + filter), so `extractAssistantVisibleText` can trigger up to six iterations over the content array. This is called on every streaming delta from `handleMessageUpdate`, which compounds for long responses. A single-pass accumulator that buckets text by phase would reduce this to O(n) once per call. Not blocking, but worth tracking for high-throughput paths.

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "Merge branch 'openclaw:main' into fix/co..." | Re-trigger Greptile}

greptile-apps · 2026-04-02T11:42:41Z

+      ...(normalizeAssistantPhase(parsed.phase)
+        ? { phase: normalizeAssistantPhase(parsed.phase) }
+        : {}),


Double call to normalizeAssistantPhase

normalizeAssistantPhase(parsed.phase) is invoked twice — once to check truthiness and once to include the value. Extract it to a local variable to avoid the redundant call.

Suggested change

...(normalizeAssistantPhase(parsed.phase)

? { phase: normalizeAssistantPhase(parsed.phase) }

: {}),

const phase = normalizeAssistantPhase(parsed.phase);

return {

...(typeof parsed.id === "string" ? { id: parsed.id } : {}),

...(phase ? { phase } : {}),

};

Prompt To Fix With AI

This is a comment left during a code review. Path: src/agents/pi-embedded-utils.ts Line: 258-260 Comment: **Double call to `normalizeAssistantPhase`** `normalizeAssistantPhase(parsed.phase)` is invoked twice — once to check truthiness and once to include the value. Extract it to a local variable to avoid the redundant call. ```suggestion const phase = normalizeAssistantPhase(parsed.phase); return { ...(typeof parsed.id === "string" ? { id: parsed.id } : {}), ...(phase ? { phase } : {}), }; ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-04-02T11:42:43Z

+      if (
+        !shouldDeferTextEndBlockReply &&
+        evtType === "text_end" &&
+        !ctx.state.lastBlockReplyText &&
+        cleanedText
+      ) {
+        if (ctx.blockChunker) {
+          ctx.blockChunker.reset();
+          ctx.blockChunker.append(cleanedText);
+        } else {
+          ctx.state.blockBuffer = cleanedText;
+        }
+      }


lastBlockReplyText guard scope

The !ctx.state.lastBlockReplyText guard prevents the forced buffer reset when text_end carries the final-answer phase, but lastBlockReplyText is reset to undefined at message_start (in pi-embedded-subscribe.ts), so this is safe within a single message turn.

That said, the comment explaining why only-first-block-reply-in-this-turn gets the forced reset would help future readers understand this isn't protecting against a cross-turn leak.

Prompt To Fix With AI

This is a comment left during a code review. Path: src/agents/pi-embedded-subscribe.handlers.messages.ts Line: 400-412 Comment: **`lastBlockReplyText` guard scope** The `!ctx.state.lastBlockReplyText` guard prevents the forced buffer reset when `text_end` carries the final-answer phase, but `lastBlockReplyText` is reset to `undefined` at `message_start` (in `pi-embedded-subscribe.ts`), so this is safe within a single message turn. That said, the comment explaining why only-first-block-reply-in-this-turn gets the forced reset would help future readers understand this isn't protecting against a cross-turn leak. How can I resolve this? If you propose a fix, please make it concise.

_{Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!}

greptile-apps · 2026-04-02T11:42:43Z

+export function extractAssistantVisibleText(msg: AssistantMessage): string {
+  const finalAnswerText = extractAssistantTextForPhase(msg, "final_answer");
+  if (finalAnswerText.trim()) {
+    return finalAnswerText;
+  }
+
+  const commentaryText = extractAssistantTextForPhase(msg, "commentary");
+  if (commentaryText.trim()) {
+    return commentaryText;
+  }
+
+  return extractAssistantTextForPhase(msg);
+}


Up to three content-array passes per extractAssistantVisibleText call

Each extractAssistantTextForPhase call does two scans of msg.content (hasExplicitPhasedTextBlocks scan + filter), so extractAssistantVisibleText can trigger up to six iterations over the content array. This is called on every streaming delta from handleMessageUpdate, which compounds for long responses. A single-pass accumulator that buckets text by phase would reduce this to O(n) once per call. Not blocking, but worth tracking for high-throughput paths.

Prompt To Fix With AI

This is a comment left during a code review. Path: src/agents/pi-embedded-utils.ts Line: 333-345 Comment: **Up to three content-array passes per `extractAssistantVisibleText` call** Each `extractAssistantTextForPhase` call does two scans of `msg.content` (`hasExplicitPhasedTextBlocks` scan + filter), so `extractAssistantVisibleText` can trigger up to six iterations over the content array. This is called on every streaming delta from `handleMessageUpdate`, which compounds for long responses. A single-pass accumulator that buckets text by phase would reduce this to O(n) once per call. Not blocking, but worth tracking for high-throughput paths. How can I resolve this? If you propose a fix, please make it concise.

ringlochid · 2026-04-02T12:06:01Z

Short blocker note from triage:

The commentary/final_answer phase-separation slice itself is green on this PR head: the directly touched suites passed locally (6 files / 138 tests).
The branch head currently includes a merge from main (f545716), and main is already red in overlapping CI families.
Concrete base/main evidence:
- main CI run 23898123099 is failing in unrelated families including:
  - src/secrets/exec-secret-ref-id-parity.test.ts
  - src/config/doc-baseline.integration.test.ts
  - src/config/schema.base.generated.test.ts
  - src/secrets/target-registry.test.ts
  - src/secrets/runtime.test.ts
  - Windows timeout in src/tasks/task-executor.test.ts
- Independent local repro on both this PR head and upstream/main:
  - pnpm exec vitest run src/daemon/service-env.test.ts
  - failing assertions: 'gateway service env' does not default NODE_EXTRA_CA_CERTS on non-macOS and 'node service env' does not default NODE_EXTRA_CA_CERTS on non-macOS
  - actual mismatch: expected NODE_EXTRA_CA_CERTS to be undefined, got /etc/ssl/certs/ca-certificates.crt

So current CI looks blocked by base/main failures unless a branch-specific regression shows up outside the touched phase-separation slice.

100yenadmin · 2026-04-05T15:31:39Z

#59643 looks directionally right for the mixed-phase assistant-turn bug family in the main WS -> embedded subscribe path.

One thing we found while comparing this against a live webchat/dashboard repro: the remaining leak may be outside the exact scope of this PR, because some non-embedded consumers still appear to render assistant text generically rather than using phase-aware visible-text extraction.

In other words, I think this PR fixes the core stream/subscriber path, but there may still be a follow-up needed for webchat/dashboard history/render consumers so #59150 can fully close.

steipete · 2026-04-05T17:16:55Z

Maintainer note from current main before review:

main now also includes 8f9b1ad (fix: remove assistant replay canonicalization repair)
that cleanup removed the old assistant replay canonicalization repair from the sanitize/replay path
after checking current message-shaping code, that repair layer does not look like the root cause for the mixed commentary / final_answer visible-output bug family this PR is targeting

So I am reviewing this PR as the active phase-separation candidate on top of a cleaner main, not as something blocked on the older replay canonicalization path.

steipete

Two blockers from review against current main:

src/agents/pi-embedded-subscribe.handlers.messages.ts:299-330 / :347-429
- visible-output suppression is still computed from evt.message, not from assistantMessageEvent.partial
- this PR then uses partialAssistant to derive phaseAwareVisibleText and can emit a commentary preview from partial.phase === "commentary"
- the added test "replaces commentary preview text when a final_answer partial arrives" locks that behavior in
- that is the opposite of the current fix direction from #61282: commentary partials should stay suppressed, not become visible and later get replaced
src/agents/openai-ws-stream.ts:556-675
- the late-map fallback still emits an unphased text_delta when response.output_text.delta arrives before response.output_item.added
- the added test "degrades safely when a text delta arrives before its item mapping" explicitly accepts that first unphased partial
- for the bug family here, that is not safe: an unphased partial is treated as visible text by downstream consumers, so a commentary chunk can still leak before the phase metadata arrives

Because both cases still allow commentary-visible partial delivery on realistic fallback/event-order paths, I do not think this is landable yet as the fix for #59150.

@ringlochid

… (thanks @ringlochid)

@ringlochid

…hanks @ringlochid)

…@ringlochid

…@ringlochid)

steipete · 2026-04-05T18:03:44Z

Landed.

rebased head sha: 501dcb1f315ebbe91e232e331215fbfbea3863ea
merge sha: 318111286fbd02526d65dd6cb741e90f746314b6

What changed on top:

strict visible-delivery phase resolution for mixed assistant blocks
commentary-only partials stay hidden until final_answer
OpenAI WS tests tightened: no dangling first-call stream, proxy store=false asserted at payload-builder level, full session cleanup in afterEach

Local gates:

pnpm check
pnpm build
pnpm test src/agents/openai-ws-stream.test.ts
pnpm test src/agents/pi-embedded-subscribe.handlers.messages.test.ts src/agents/pi-embedded-utils.test.ts src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.emits-block-replies-text-end-does-not.test.ts

Thanks @ringlochid.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 501dcb1f31

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-05T18:10:55Z

+              if (event.type === "response.output_text.delta") {
+                const key = getOutputTextKey(event.item_id, event.content_index);
+                const nextText = `${outputTextByPart.get(key) ?? ""}${event.delta}`;
+                outputTextByPart.set(key, nextText);
+                if (outputItemPhaseById.has(event.item_id)) {


Allow HTTP fallback when phase mapping has not emitted output

When a response.output_text.delta arrives before response.output_item.added, this branch buffers text and emits nothing until phase metadata is known, but sawWsOutput is still marked true earlier in the same handler. If the socket then drops before the item mapping event, auto mode skips both retry/fallback (!sawWsOutput check in the catch path) even though no user-visible partial was emitted, so the turn fails instead of cleanly falling back to HTTP on transient disconnects.

Useful? React with 👍 / 👎.

100yenadmin · 2026-04-05T18:17:58Z

Thanks for landing this and for the follow-up tightening. The merged phase-separation coverage looks solid for the core WS → embedded subscribe path.

One narrow test gap I noticed after the merge: the late-map buffering test currently only exercises commentary-phase items. There's no coverage for the case where multiple pre-map deltas arrive and the eventual item resolves to final_answer — which is the more realistic streaming race under WS timing jitter.

I'll send a small test-only follow-up PR for that if it's useful. Closed my earlier #61457 since it's now redundant against merged main.

@ringlochid

… (thanks @ringlochid)

@ringlochid

…hanks @ringlochid)

@ringlochid

… (thanks @ringlochid)

@ringlochid

…hanks @ringlochid)

@ringlochid

… (thanks @ringlochid)

@ringlochid

…hanks @ringlochid)

@ringlochid

… (thanks @ringlochid)

@ringlochid

…hanks @ringlochid)

Copilot AI review requested due to automatic review settings April 2, 2026 11:34

openclaw-barnacle Bot added agents Agent runtime and tooling size: XL labels Apr 2, 2026

Copilot started reviewing on behalf of ringlochid April 2, 2026 11:34 View session

chatgpt-codex-connector Bot reviewed Apr 2, 2026

View reviewed changes

Copilot AI reviewed Apr 2, 2026

View reviewed changes

greptile-apps Bot reviewed Apr 2, 2026

View reviewed changes

omarshahine mentioned this pull request Apr 5, 2026

Commentary text can leak into final assistant replies, and duplicate visible replies can occur after tool sends #59150

Closed

steipete requested changes Apr 5, 2026

View reviewed changes

ringlochid and others added 3 commits April 5, 2026 18:34

Fix commentary/final answer phase separation

0ea8f56

fix: hide commentary partial leaks until final answer (openclaw#59643)…

9f5ed94

… (thanks @ringlochid)

fix: preserve strict final-answer delivery phases (openclaw#59643) (t…

501dcb1

…hanks @ringlochid)

steipete force-pushed the fix/commentary-final-answer-phase-separation branch from f545716 to 501dcb1 Compare April 5, 2026 18:03

steipete merged commit 3181112 into openclaw:main Apr 5, 2026
9 checks passed

steipete added a commit that referenced this pull request Apr 5, 2026

fix: hide commentary partial leaks until final answer (#59643) (thanks …

98ce1c2

…@ringlochid)

100yenadmin mentioned this pull request Apr 5, 2026

fix(agents): suppress commentary partial visibility and buffer unphased early deltas #61457

Closed

3 tasks

chatgpt-codex-connector Bot reviewed Apr 5, 2026

View reviewed changes

100yenadmin mentioned this pull request Apr 5, 2026

fix(agents,gateway): phase-aware assistant text extraction — suppress OpenAI commentary leaks in sessions-helpers, TUI, and REST history #61463

Merged

5 tasks

github-actions Bot mentioned this pull request Apr 5, 2026

📡 Upstream Digest — 2026-04-05 18:32 UTC curtismercier/openclaw-mods#479

Open

This was referenced Apr 5, 2026

fix(agents): harden OpenAI phase-aware visible text — suppress commentary partials, prevent empty final_answer fallback leak #61481

Closed

Text between tool calls leaks to messaging channels #25592

Open

steipete mentioned this pull request Apr 5, 2026

fix: prefer final-answer text across web chat surfaces #61506

Merged

afurm mentioned this pull request Apr 6, 2026

fix(agents): suppress commentary leaks in history-based replies #61747

Closed

19 tasks

tlnini-afk mentioned this pull request Apr 6, 2026

DeepSeek preamble text leaks to Telegram after 3.13 → 4.5 upgrade (untagged assistant text bypasses commentary filter) #62121

Open

bogdanovich mentioned this pull request Apr 25, 2026

Regression: Telegram topic leaks raw tool_uses JSON before final answer on 2026.4.23 #71475

Closed

lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026

fix: hide commentary partial leaks until final answer (openclaw#59643)…

95a9ead

… (thanks @ringlochid)

lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026

fix: preserve strict final-answer delivery phases (openclaw#59643) (t…

4d85de5

…hanks @ringlochid)

clawsweeper Bot mentioned this pull request Apr 29, 2026

[Bug]: Discord early commentary is not committed immediately and can be lost on gateway restart #55036

Closed

ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026

fix: hide commentary partial leaks until final answer (openclaw#59643)…

ee36961

… (thanks @ringlochid)

ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026

fix: preserve strict final-answer delivery phases (openclaw#59643) (t…

2ea1624

…hanks @ringlochid)

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026

fix: hide commentary partial leaks until final answer (openclaw#59643)…

534c260

… (thanks @ringlochid)

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026

fix: preserve strict final-answer delivery phases (openclaw#59643) (t…

70bf2af

…hanks @ringlochid)

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026

fix: hide commentary partial leaks until final answer (openclaw#59643)…

4e406d1

… (thanks @ringlochid)

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026

fix: preserve strict final-answer delivery phases (openclaw#59643) (t…

ac13e49

…hanks @ringlochid)

Uh oh!

Conversation

ringlochid commented Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Root Cause / Regression History (if applicable)

Regression Test Plan (if applicable)

User-visible / Behavior Changes

Diagram (if applicable)

Security Impact (required)

Repro + Verification

Environment

Steps

Expected

Actual

Evidence

Human Verification (required)

Review Conversations

Compatibility / Migration

Risks and Mitigations

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

greptile-apps Bot commented Apr 2, 2026

Greptile Summary

Confidence Score: 5/5

Uh oh!

greptile-apps Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

ringlochid commented Apr 2, 2026

Uh oh!

100yenadmin commented Apr 5, 2026

Uh oh!

steipete commented Apr 5, 2026

Uh oh!

steipete left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

steipete commented Apr 5, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

100yenadmin commented Apr 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ringlochid commented Apr 2, 2026 •

edited

Loading