feat(auto-reply): durable inter-tool commentary via verbose standalone progress (supersedes #89850/#89890) #91976

Merged
obviyus merged 6 commits into openclaw:main from anagnorisis2peripeteia:feat/verbose-commentary-progress on Jun 11, 2026

Conversation

@anagnorisis2peripeteia anagnorisis2peripeteia commented Jun 10, 2026

Contributor

Supersedes #89850 and #89890, reshaped per @obviyus's guidance there: instead of persisting the ephemeral streaming draft (new persistProgress config key, Telegram-only), this adds a second consumer for inter-tool commentary in core's verbose standalone-progress path — the same place tool summaries become durable messages in dispatch-from-config — gated by the existing /verbose on. No new config key, and it works on every channel that uses the generic dispatch path.

It also closes the backend gap that made "verbose already does tool summaries" untrue for CLI runs: the CLI runner now emits the same durable tool summaries the embedded runner does, so /verbose on yields the full interleaved record (💬 commentary + 🛠️ tool summaries) on both backends and in both streaming modes.

What it does

Commit 1 — core (dispatch-from-config):

  • With verbose progress on, kind:"preamble" item events (inter-tool commentary, emitted by the Claude CLI parser since #89834 "feat(cli): emit commentary progress events from Claude CLI parser", and natively by codex) are delivered as standalone 💬 progress messages through the same delivery/guard chain as tool summaries (shouldSendToolSummaries, progress-delivery suppression, late-text drop, route-to-originating vs dispatcher).
  • The latest text per itemId is buffered so snapshot-style producers send one message per commentary block; the buffer flushes when the producer moves on (next item, tool start/result, block reply, final reply) and always drains before the final answer. Retractions (empty text for a buffered item) drop unsent blocks.
  • Verbose runs force commentary classification on (commentaryProgressEnabled: true), so inter-tool narration routes to the commentary lane instead of being folded into the final answer text — the answer message stays purely the final answer.
  • New onVerboseProgressVisibility reply option: dispatch hands the channel a live getter for whether the durable verbose lane is active.
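
A minimal sketch of the per-itemId buffering described in the bullets above, to make the flush and retraction rules concrete. Every name here (`createCommentaryBuffer`, `PreambleEvent`, `send`) is hypothetical and not taken from dispatch-from-config; treating whitespace-only text as a retraction is also an assumption.

```typescript
// Hypothetical sketch: illustrative names, not the code in dispatch-from-config.
type PreambleEvent = { itemId: string; text: string };

function createCommentaryBuffer(send: (message: string) => void) {
  let pending: PreambleEvent | undefined;

  // Sends the buffered block, if any, as one standalone message.
  function flush(): void {
    if (!pending) return;
    const { text } = pending;
    pending = undefined;
    send(`💬 ${text}`);
  }

  return {
    onPreamble(event: PreambleEvent): void {
      if (event.text.trim() === "") {
        // Retraction: empty text for the buffered item drops the unsent block.
        if (pending?.itemId === event.itemId) pending = undefined;
        return;
      }
      // Item transition: the producer moved on, so the previous block is complete.
      if (pending && pending.itemId !== event.itemId) flush();
      // Snapshot-style producers resend the whole text, so keep only the latest.
      pending = event;
    },
    // Call on tool start/result, block reply, and before the final reply.
    flush,
  };
}
```

Under these assumptions, three growing snapshots for the same itemId followed by a tool start produce a single 💬 message carrying the last snapshot.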

Commit 2 — telegram:

  • With streaming on, the dispatcher used to divert tool-kind payloads (which would include the new commentary messages) into the ephemeral progress draft, where they were discarded at final — so verbose runs lost their progress record whenever streaming was enabled. While the durable lane is active, tool payloads are now sent as real standalone messages, and the draft yields its commentary lines (tool/plan status lines keep the draft for liveness). Reasoning lane and tool status reactions unchanged.

Commit 3 — CLI tool summaries:

  • The CLI parser already emits tool result events (name, toolCallId, isError, sanitized result), but the runner bridge dropped them — so CLI-backed runs had no durable tool record under verbose while embedded runs did. The bridge now forwards result events, and both runners render the same formatToolAggregate summaries the embedded runner emits (args-derived meta captured at tool start; output block under /verbose full), delivered through each runner's existing tool-result route.
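
The start/result pairing above can be sketched as follows. This is an illustration only: `createToolSummaryTrackerSketch`, its options, and the summary format are hypothetical stand-ins, not the real tracker or `formatToolAggregate`. Reading the meta from a `command` argument and rendering an untracked result without meta are assumptions.

```typescript
// Hypothetical sketch: event fields mirror the ones named in the PR text
// (name, toolCallId, isError, result); everything else is illustrative.
type ToolStartEvent = { toolCallId: string; name: string; args: Record<string, unknown> };
type ToolResultEvent = { toolCallId: string; name: string; isError: boolean; result: string };

function createToolSummaryTrackerSketch(options: {
  enabled: boolean; // verbose gate
  fullOutput: boolean; // stands in for /verbose full
  deliver: (summary: string) => void;
}) {
  const metaByCallId = new Map<string, string>();

  return {
    // Capture args-derived meta at tool start, keyed by toolCallId.
    onStart(event: ToolStartEvent): void {
      if (!options.enabled) return;
      const command = event.args["command"];
      if (typeof command === "string") metaByCallId.set(event.toolCallId, command);
    },
    // Render one summary per result, joining it with the meta from the start event.
    onResult(event: ToolResultEvent): void {
      if (!options.enabled) return;
      const meta = metaByCallId.get(event.toolCallId);
      metaByCallId.delete(event.toolCallId);
      const parts = [`🛠️ ${event.name}`];
      if (meta !== undefined) parts.push(`: ${meta}`);
      if (event.isError) parts.push(" (error)");
      let summary = parts.join("");
      if (options.fullOutput && event.result !== "") summary += `\n${event.result}`;
      options.deliver(summary);
    },
  };
}
```

Keying the meta by toolCallId lets interleaved tool calls resolve to the right start event, which is why the result event alone is not enough to render the summary.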

Commit 4 — discord (review finding):

  • Discord consumes the same dispatch visibility getter as Telegram: while the durable lane is delivering commentary standalone, its progress draft skips preamble lines, so commentary renders exactly once. Covered by an active/inactive regression pair.
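
To illustrate how a draft-rendering channel might honor the visibility getter so commentary renders exactly once, here is a minimal sketch. Only `onVerboseProgressVisibility` and `kind: "preamble"` come from the PR text; the draft object, the `"tool"` and `"plan"` kind names, and the callback shape are assumptions.

```typescript
// Hypothetical sketch of a channel progress draft that yields commentary
// to the durable lane while that lane is active.
type DraftItemEvent = { kind: "preamble" | "tool" | "plan"; text: string };
type VisibilityGetter = () => boolean;

function createProgressDraftSketch() {
  // Defaults to inactive until dispatch hands over the live getter.
  let isDurableLaneActive: VisibilityGetter = () => false;
  const lines: string[] = [];

  return {
    // A live getter is used instead of a snapshot boolean, so the draft sees
    // the lane's current state on every event.
    onVerboseProgressVisibility(getter: VisibilityGetter): void {
      isDurableLaneActive = getter;
    },
    onItemEvent(event: DraftItemEvent): void {
      // Commentary belongs to the durable lane while it is active;
      // tool/plan status lines stay in the draft for liveness.
      if (event.kind === "preamble" && isDurableLaneActive()) return;
      lines.push(event.text);
    },
    render(): string {
      return lines.join("\n");
    },
  };
}
```

The active/inactive regression pair mentioned above corresponds to exercising `onItemEvent` once with the getter returning `true` and once with it returning `false`.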

Result (with /verbose on)

| streaming | before | after |
| --- | --- | --- |
| off | answer only (claude-cli emitted no tool summaries; commentary folded into the answer) | interleaved 💬 commentary + 🛠️ tool messages in real time, clean answer last |
| progress | progress only in the ephemeral draft, vanishes at final | same durable 💬 + 🛠️ record; draft keeps live tool status; answer last |

Real behavior proof

  • Behavior or issue addressed: Inter-tool commentary (assistant narration between tool calls) was only visible inside the ephemeral Telegram streaming draft — or folded into the final answer with streaming off — and was lost the moment the final answer arrived. /verbose on had no durable commentary record. After this PR, commentary lands as standalone durable 💬 progress messages in both streaming modes.
  • Real environment tested: real OpenClaw gateway built from this branch (pnpm openclaw gateway) in a Linux container with a desktop; real Telegram bot + real Telegram account on Telegram Desktop; real claude-cli/claude-opus-4-8 backend (live Anthropic API); tdlib user-driver sending real DMs. The baseline build (merge-base 050c0813b39) ran in the identical setup for before/after.
  • Exact steps or command run after this patch: started the gateway from the built branch; sent /new, then a DM asking the agent to run date -u and uname -a via exec, narrating before each command; repeated with streaming.mode: "off" and streaming.mode: "progress" (agents.defaults.verboseDefault: "on" in both); screen-recorded Telegram Desktop throughout. This recreates the Mantis telegram-desktop proof flow locally (same mechanism: telegram-desktop + user-driver + screen capture).
  • Evidence after fix: screenshots and screen recordings below (PR vs baseline, both streaming modes). Gateway runtime log excerpt for the streamed PR run shows the durable sends landing ahead of the final: [telegram] sendMessage ok chat=… message=425, message=426 (commentary), then outbound send ok … messageId=427 (final answer).
  • Observed result after fix: with verbose on, each inter-tool commentary block arrived as its own Telegram message in real time, in both streaming modes; the streaming draft still rendered live tool status and was discarded at final as before; the final answer contained no interleaved narration. Baseline runs show no commentary anywhere.
  • What was not tested: other channels' draft renderers (Discord/Slack) live — they take the unchanged generic dispatch path, covered by the unit tests; codex-native preamble producers live (same item-event shape, covered by the snapshot-collapse tests).

| | streaming off | streaming on (progress) |
| --- | --- | --- |
| baseline | before-stream-off | before-stream-on |
| PR | after-stream-off | after-stream-on |

Baseline runs answer with no visible commentary or tool record in both modes. PR runs show the full interleaved durable record (💬 "I'll run these one at a time…", 🛠️ date -u, 💬 "Got the time. Next: uname -a…", 🛠️ uname -a) landing in real time before the clean final answer, in both streaming modes.

Motion captures (drafts building, messages landing live):
after-stream-off · after-stream-on · before-stream-off · before-stream-on

Tests

  • 7 new dispatch-from-config cases: ordering (commentary before its tool summary; trailing commentary before final), snapshot collapse per itemId, item-transition flush, retraction drop, verbose-off passthrough (channel callback still forwarded, nothing standalone, no classification forced), visibility getter on/off.
  • 5 new CLI tool-summary tracker cases (meta from start args, output block under full verbose, disabled gate, error propagation, untracked result) and a Discord active/inactive regression pair proving commentary renders exactly once.
  • tsgo clean; dispatch-from-config (200) + followup-runner/agent-runner-execution/agent-runner-cli-dispatch + discord message-handler.process — 572 tests green on Windows and the dispatch suite in-box on Linux. The one telegram-suite failure on Windows ("records streamed final replies into the prompt context cache") is byte-identical on pristine upstream/main (pre-existing environment issue).

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: 🎥 video Contributor real behavior proof includes video or recording evidence. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. mantis: telegram-visible-proof Mantis should capture Telegram visible proof. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. labels Jun 10, 2026
@clawsweeper

clawsweeper Bot commented Jun 10, 2026

Contributor

Codex review: needs changes before merge. Reviewed June 10, 2026, 10:40 PM ET / 02:40 UTC.

Summary
The branch adds durable standalone verbose commentary and CLI tool-result summaries, then makes Telegram and Discord progress drafts yield duplicate commentary to the durable lane.

PR surface: Source +233, Tests +436. Total +669 across 10 files.

Reproducibility: yes, from source inspection: enable verbose progress with a commentary-enabled Slack or Microsoft Teams draft and emit a preamble item; the channel draft handles it while shared dispatch also creates the durable progress message.

Review metrics: 2 noteworthy metrics.

  • Draft-consumer adoption: 2 handled, at least 2 unhandled. Telegram and Discord honor the new visibility signal, while Slack and Microsoft Teams also render item commentary without it.
  • Shared reply-option surface: 1 callback added. The callback becomes an internal compatibility contract controlling which lane owns user-visible progress.

Merge readiness
Overall: 🦐 gold shrimp
Proof: 🦞 diamond lobster ✨ media proof bonus
Patch quality: 🦐 gold shrimp
Result: needs maintainer review before merge.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
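
As a small illustration of the "weaker of the two" rule, the sketch below orders the ranks from this comment's legend and picks the lower one. The function and constant names are hypothetical, and this is not ClawSweeper's actual implementation.

```typescript
// Ranks ordered weakest to strongest, following the legend in this review comment.
// "off-meta tidepool" is omitted because it means the rating does not apply.
const RANKS_WEAKEST_FIRST = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS_WEAKEST_FIRST)[number];

// Overall readiness is capped by whichever input ranks lower.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  const proofIndex = RANKS_WEAKEST_FIRST.indexOf(proof);
  const patchIndex = RANKS_WEAKEST_FIRST.indexOf(patchQuality);
  return proofIndex <= patchIndex ? proof : patchQuality;
}
```

With proof at diamond lobster and patch quality at gold shrimp, this yields gold shrimp, which matches the overall rating reported above.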

Rank-up moves:

  • Migrate or gate every sibling commentary-draft consumer.
  • [P2] Add active/inactive exactly-once regressions for Slack, Microsoft Teams, and any other affected channel.

Risk before merge

  • [P1] Merging this exact head can make Slack and Microsoft Teams show the same verbose commentary in both their ephemeral progress draft and a durable standalone message.
  • [P1] The new shared callback is an internal compatibility surface; future draft consumers can reintroduce duplication or suppression unless ownership is centralized or contract-tested.

Maintainer options:

  1. Complete sibling adoption (recommended)
    Update Slack, Microsoft Teams, and every other commentary-draft consumer to yield while durable verbose progress is active, with active/inactive exactly-once tests.
  2. Gate unsupported channels
    Restrict durable commentary to channels that explicitly declare ownership-contract support until every draft consumer is migrated.
  3. Pause the shared rollout
    Keep the existing transient behavior rather than merging a partially adopted cross-channel delivery contract.
@clawsweeper automerge

Special instructions:
Audit every production `onItemEvent` commentary draft consumer; make it yield preamble commentary while durable verbose progress is active, and add active/inactive exactly-once regression tests for each affected channel.

Next step before merge

  • The blocking defect is a bounded mechanical migration of identified channel consumers plus focused exactly-once regressions.

Security
Cleared: The diff does not alter dependencies, workflows, permissions, secrets, publishing metadata, downloaded artifacts, or other supply-chain execution paths.

Review findings

  • [P1] Handle every commentary draft consumer before enabling the durable lane — src/auto-reply/get-reply-options.types.ts:112
Review details

Best possible solution:

Use one explicit shared progress-ownership contract, migrate every current commentary-draft consumer in the same change, and preserve active/inactive exactly-once regression coverage per affected channel.

Do we have a high-confidence way to reproduce the issue?

Yes, from source inspection: enable verbose progress with a commentary-enabled Slack or Microsoft Teams draft and emit a preamble item; the channel draft handles it while shared dispatch also creates the durable progress message.

Is this the best way to solve the issue?

No. Durable verbose commentary is a reasonable direction, but changing the generic dispatch lane while coordinating only Telegram and Discord is not the narrowest complete fix; every existing commentary-draft consumer must adopt or be gated from the contract.

Full review comments:

  • [P1] Handle every commentary draft consumer before enabling the durable lane — src/auto-reply/get-reply-options.types.ts:112
    Slack and Microsoft Teams also render onItemEvent preamble commentary in ephemeral progress drafts, but they never receive or honor onVerboseProgressVisibility. With verbose enabled, shared dispatch now sends the preamble as a durable progress message while those drafts still render it, producing duplicate commentary. Migrate every current draft consumer to the ownership signal, or gate durable delivery to channels that declare support, and add active/inactive exactly-once regressions.
    Confidence: 0.99

Overall correctness: patch is incorrect
Overall confidence: 0.99

AGENTS.md: found and applied where relevant.

Codex review notes: reasoning high; reviewed against d4fcc3869621.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR provides valid real Telegram before/after screenshots and recordings for streaming-off and progress modes, visibly showing durable commentary and tool summaries before a clean final answer; the evidence does not cover the unhandled sibling channels.
  • add rating: 🦐 gold shrimp: Overall readiness is 🦐 gold shrimp; proof is 🦞 diamond lobster and patch quality is 🦐 gold shrimp.
  • add status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Sufficient (recording): The PR provides valid real Telegram before/after screenshots and recordings for streaming-off and progress modes, visibly showing durable commentary and tool summaries before a clean final answer; the evidence does not cover the unhandled sibling channels.
  • remove rating: 🐚 platinum hermit: Current PR rating is rating: 🦐 gold shrimp, so this older rating label is no longer current.
  • remove status: 👀 ready for maintainer look: Current PR status label is status: ⏳ waiting on author.

Label justifications:

  • P2: The opt-in feature has a concrete duplicate-delivery defect in sibling channels, but it does not affect default non-verbose runs.
  • merge-risk: 🚨 compatibility: The new shared callback contract is adopted by only part of the current channel-draft surface.
  • merge-risk: 🚨 message-delivery: Unhandled channel consumers can deliver the same commentary through both ephemeral and durable paths.
  • rating: 🦐 gold shrimp: Overall readiness is 🦐 gold shrimp; proof is 🦞 diamond lobster and patch quality is 🦐 gold shrimp.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Sufficient (recording): The PR provides valid real Telegram before/after screenshots and recordings for streaming-off and progress modes, visibly showing durable commentary and tool summaries before a clean final answer; the evidence does not cover the unhandled sibling channels.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR provides valid real Telegram before/after screenshots and recordings for streaming-off and progress modes, visibly showing durable commentary and tool summaries before a clean final answer; the evidence does not cover the unhandled sibling channels.
  • proof: 🎥 video: Contributor real behavior proof includes video or recording evidence. The PR provides valid real Telegram before/after screenshots and recordings for streaming-off and progress modes, visibly showing durable commentary and tool summaries before a clean final answer; the evidence does not cover the unhandled sibling channels.
  • mantis: telegram-visible-proof: Mantis should capture Telegram visible proof. The PR changes visible Telegram commentary and tool-summary persistence and ordering, which is directly demonstrable in Telegram Desktop.
Evidence reviewed

PR surface:

Source +233, Tests +436. Total +669 across 10 files.

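| Area | Files | Added | Removed | Net |
| --- | --- | --- | --- | --- |
| Source | 7 | 272 | 39 | +233 |
| Tests | 3 | 437 | 1 | +436 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 10 | 709 | 40 | +669 |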

Acceptance criteria:

  • [P2] node scripts/run-vitest.mjs extensions/slack/src/monitor/message-handler/dispatch.preview-fallback.test.ts.
  • [P1] node scripts/run-vitest.mjs extensions/msteams/src/reply-dispatcher.test.ts.
  • [P1] node scripts/run-vitest.mjs extensions/discord/src/monitor/message-handler.process.test.ts.
  • [P1] node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.test.ts.

What I checked:

Likely related people:

  • obviyus: Authored the final exact-head refactor and Discord test adjustment, is assigned to this PR, and has direct context on the intended shared ownership contract. (role: recent area contributor and reviewer; confidence: high; commits: 0bfdb4d2ec15, 1b1fc2065780; files: src/auto-reply/reply/dispatch-from-config.ts, extensions/discord/src/monitor/message-handler.process.ts, extensions/discord/src/monitor/message-handler.process.test.ts)
  • anagnorisis2peripeteia: Introduced the merged Claude CLI commentary producer in feat(cli): emit commentary progress events from Claude CLI parser #89834 and authored the initial durable-commentary, Telegram, CLI-summary, and Discord commits in this stack. (role: feature introducer; confidence: high; commits: 7a602c7385ba, ae6dee7a321b, fc35be15fe18; files: src/auto-reply/reply/dispatch-from-config.ts, src/auto-reply/reply/agent-runner-cli-dispatch.ts, extensions/telegram/src/bot-message-dispatch.ts)
  • Shakker: Current-main history introduced the Microsoft Teams progress item-event consumer that must be accounted for by the new cross-channel ownership contract. (role: recent adjacent contributor; confidence: medium; commits: 400b3e04fb; files: extensions/msteams/src/reply-dispatcher.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@openclaw-barnacle openclaw-barnacle Bot added the proof: supplied External PR includes structured after-fix real behavior proof. label Jun 10, 2026
@anagnorisis2peripeteia anagnorisis2peripeteia marked this pull request as draft June 10, 2026 17:16
@openclaw-barnacle openclaw-barnacle Bot added channel: discord Channel integration: discord channel: telegram Channel integration: telegram size: L labels Jun 10, 2026
@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels Jun 10, 2026
@anagnorisis2peripeteia

Contributor Author

@clawsweeper re-review

Addressed the P1: Discord now consumes the same durable-lane visibility getter as Telegram and yields its preamble draft lines while standalone verbose commentary is active (commit 4, with an active/inactive regression pair proving exactly-once rendering).

Also folded in the sibling backend gap this review surfaced indirectly: CLI-backed runs emitted no durable tool summaries under verbose at all (the parser's tool result events were dropped at the runner bridge), so the durable lane was commentary-only on claude-cli. Commit 3 forwards those events and renders the same formatToolAggregate summaries the embedded runner emits. Fresh real-Telegram captures in the body now show the full interleaved durable record (commentary + tool summaries + clean answer) in both streaming modes.

@clawsweeper

clawsweeper Bot commented Jun 10, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 10, 2026
@anagnorisis2peripeteia anagnorisis2peripeteia marked this pull request as ready for review June 10, 2026 18:36
@obviyus obviyus self-assigned this Jun 11, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 11, 2026
… progress messages

When verbose progress is enabled, preamble item events now flush as durable
standalone progress messages through the same delivery path as tool summaries,
instead of living only in ephemeral channel streaming drafts. The latest text
per item id is buffered so snapshot-style producers send one message per item;
the buffer flushes when the producer moves on (next item, tool event, block
reply, or final reply) and drains before the final answer.

Verbose runs also force commentary classification on (commentaryProgressEnabled),
so inter-tool text routes to the commentary lane rather than being folded into
the final answer text.

Dispatch additionally exposes a live verbose-progress visibility getter via the
new onVerboseProgressVisibility reply option, so draft-rendering channels can
route progress to the durable lane while it is active.
anagnorisis2peripeteia and others added 5 commits June 11, 2026 08:14
…to the streaming draft

With streaming on, the dispatcher diverted tool-kind payloads (including the
new durable commentary messages) into the ephemeral progress draft, where they
were discarded when the final answer arrived - so verbose runs lost their
progress record whenever streaming was enabled. While the durable verbose lane
is active (per the dispatch visibility getter), tool payloads are now sent as
real standalone messages and the draft yields its commentary lines; tool/plan
draft lines keep the draft since they have no durable counterpart. Reasoning
lane and tool status reactions are unaffected.
…sults

The CLI parser already emits tool result events (name, toolCallId, isError,
sanitized result), but the runner bridge dropped them, so CLI-backed runs had
no durable tool record under verbose while embedded runs did. The bridge now
forwards result events, and both runners feed a summary tracker that renders
the same formatToolAggregate line the embedded runner emits (meta captured
from the start event args), plus the tool output block when full verbose
output is enabled. Delivery rides each runner's existing tool-result route, so
verbose gating, ordering ahead of the final answer, and the Telegram durable
routing all apply unchanged.
…ss is active

Discord consumes the dispatch verbose-progress visibility getter the same way
Telegram does: while the durable lane is delivering commentary as standalone
messages, the ephemeral progress draft skips its preamble lines so commentary
renders exactly once. Covered by an active/inactive regression pair.
@obviyus obviyus force-pushed the feat/verbose-commentary-progress branch from 1b1fc20 to f8d867e June 11, 2026 02:47
@obviyus obviyus merged commit bd96e4d into openclaw:main Jun 11, 2026
21 checks passed
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 11, 2026
@obviyus

obviyus commented Jun 11, 2026

Contributor

Landed via rebase onto main (head f8d867e5e2, merge tip bd96e4d22dafe64f558fb7f3ba5977aa3a93aee6).

  • Pre-land validation (rebased onto latest main): tsgo core/extensions prod+test lanes, oxlint/oxfmt on touched files, and the touched suites — dispatch-from-config + agent-runner-cli-dispatch + followup-runner (280 tests), Discord message-handler.process (104), Telegram bot-message-dispatch (124) — all green.
  • The two red checks-node-agentic-agents-* lanes on the PR head were pre-existing breakage on main itself (reproduced on pristine origin/main; unrelated to this change).
  • Maintainer commits on top of your branch: a one-line test-type fix and a small distill pass (unified the onItemEvent wiring, shared the followup tool-summary delivery between runners) — behavior unchanged.
  • Changelog: release-only in this repo; release notes derive from the PR body, which already documents the behavior change.

Thanks @anagnorisis2peripeteia — and for the careful reshape from the persist-mode PRs to this core-owned design!

obviyus pushed a commit to ragesaq/openclaw that referenced this pull request Jun 11, 2026
…ane is off

After openclaw#91976, the claude-cli JSONL parser reclassifies assistant text that
precedes a tool_use block as commentary. The classification gate
(commentaryProgressEnabled !== undefined) was looser than the delivery gate
(commentaryProgressEnabled === true && onItemEvent), so any channel that
defined the flag as false engaged classification with no consumer wired:
flushPendingClaudeCommentaryText() called an undefined onCommentaryText and
silently discarded the text. On Discord with verbose off this dropped all
inter-tool narration and the pre-final-answer preamble text.

Two-layer fix:
- Align the classify gate with the delivery gate in both CLI dispatch sites
  (agent-runner-execution, followup-runner) so classification only engages
  when a commentary consumer exists.
- Defense in depth: flushPendingClaudeCommentaryText() now falls back to the
  assistant text lane instead of discarding when no consumer is wired, so no
  future gate mismatch can silently eat model output.

Reported on Discord: claude-cli backend lost interleaved narration and the
regular-text reasoning preamble with or without /verbose on.
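
The gate mismatch and the two-layer fix described in this commit message can be sketched as below. The two conditions are the ones quoted in the message; the option type, the function names, and the `onAssistantText` fallback parameter are hypothetical.

```typescript
// Hypothetical sketch of the classify/delivery gate mismatch and its fix.
type CommentaryOptionsSketch = {
  commentaryProgressEnabled?: boolean;
  onItemEvent?: (text: string) => void;
};

// Before: classification engaged whenever the flag was defined, even as false.
function shouldClassifyBefore(options: CommentaryOptionsSketch): boolean {
  return options.commentaryProgressEnabled !== undefined;
}

// Delivery gate: requires the flag to be true and a consumer to be wired.
function canDeliver(options: CommentaryOptionsSketch): boolean {
  return options.commentaryProgressEnabled === true && options.onItemEvent !== undefined;
}

// After (layer 1): classification uses the same condition as delivery.
function shouldClassifyAfter(options: CommentaryOptionsSketch): boolean {
  return canDeliver(options);
}

// After (layer 2, defense in depth): with no commentary consumer wired,
// fall back to the assistant text lane instead of discarding the text.
function flushCommentarySketch(
  text: string,
  onCommentaryText: ((text: string) => void) | undefined,
  onAssistantText: (text: string) => void,
): void {
  if (onCommentaryText) {
    onCommentaryText(text);
  } else {
    onAssistantText(text);
  }
}
```

For a channel that sets `commentaryProgressEnabled: false`, the old classify gate returns `true` while the delivery gate returns `false`, which is the window in which text was classified as commentary and then dropped.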
obviyus pushed a commit that referenced this pull request Jun 11, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request Jun 12, 2026
anagnorisis2peripeteia added a commit to anagnorisis2peripeteia/openclaw that referenced this pull request Jun 12, 2026
The pin-from-here mirror replays the origin run's agent-event bus into each
pinned target's own dispatch. The CLI runner emits stream:"tool" events with
both phase:"start" and phase:"result", but the resolver routed EVERY tool event
to onToolStart — so phase:"result" events (which drive the durable verbose tool
summary, openclaw#91976) were mis-rendered and the mirror lost its tool record while
still showing commentary.

Run the bus tool events through the same createCliToolSummaryTracker the native
CLI dispatch uses: "start" captures args-meta by toolCallId; "result" formats the
aggregate and delivers it via the target dispatch's onToolResult (which still
gates the actual send on verbose). The mirror's tool summaries are now
byte-identical to a native turn's, in both streaming modes. toolProgressDetail is
threaded from the origin config so the args detail matches.

Tests: resolver renders a durable summary from start+result and routes result to
onToolResult (not onToolStart); error propagation; existing stream-routing
regression. echo-mirror-resolver 9 + mirror-dispatch 5 green.
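
A minimal sketch of the routing fix described in this commit message: tool events from the bus are split by phase instead of all going to onToolStart. The event and handler shapes are assumptions based on the fields the message names (`stream`, `phase`, `toolCallId`); the real resolver also runs events through the summary tracker, which is omitted here.

```typescript
// Hypothetical sketch: route bus tool events by phase.
type BusToolEvent = {
  stream: "tool";
  phase: "start" | "result";
  toolCallId: string;
  name: string;
};

type ToolHandlersSketch = {
  onToolStart: (event: BusToolEvent) => void;
  onToolResult: (event: BusToolEvent) => void;
};

function routeBusToolEvent(event: BusToolEvent, handlers: ToolHandlersSketch): void {
  switch (event.phase) {
    case "start":
      // Start events carry the args used for the summary meta.
      handlers.onToolStart(event);
      break;
    case "result":
      // Result events drive the durable verbose tool summary.
      handlers.onToolResult(event);
      break;
  }
}
```

Because the bug was every phase collapsing into onToolStart, a regression test only needs to assert that a `phase: "result"` event reaches onToolResult and not onToolStart.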

Labels

channel: discord Channel integration: discord channel: telegram Channel integration: telegram mantis: telegram-visible-proof Mantis should capture Telegram visible proof. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. P2 Normal backlog priority with limited blast radius. proof: supplied External PR includes structured after-fix real behavior proof. proof: 🎥 video Contributor real behavior proof includes video or recording evidence. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. size: L status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action.
