fix(agents/cli): bridge CLI tool_use lifecycle events to channel preview#80046

Merged
obviyus merged 1 commit into
openclaw:main from
adele-with-a-b:fix/cli-backend-telegram-tool-progress
May 29, 2026

Conversation

@adele-with-a-b adele-with-a-b commented May 10, 2026

Contributor

Summary

PR #76914 (merged 2026-05-09) bridged CLI-runtime assistant text deltas into channel preview pipelines, closing the silence gap on the API vs CLI runtime split for streaming text. Tool-progress events were left out of that fix — they're still dropped on the CLI runtime path while the embedded native (API) runtime emits them normally. Channel previews that subscribe to stream: "tool" for live tool-progress decoration (Telegram's "📖 Read:", "🛠️ Bash:", etc.) stay silent during tool-heavy CLI-backed turns, so users see typing indicators and final text but never the work-in-progress trail the API path already produces.

This PR closes the rest of the API/CLI parity gap by surfacing tool-use lifecycle events from the CLI stream output and bridging them through the same agent-event bus #76914 used for assistant text.

Behavior change

Two scope tiers:

  • Parser layer (src/agents/cli-output.ts): claude-cli-specific. Recognises the claude --output-format stream-json dialect's tool_use content blocks and tool_result user-messages and surfaces them through new optional callbacks on createCliJsonlStreamingParser. Other CLI dialects (codex-cli, kiro-cli) ship their own parsers and continue to work as today; this PR only adds parsing for claude-cli's dialect because that's the one OpenClaw bundles a parser for.
  • Bridge layer (src/agents/cli-runner/execute.ts, src/agents/cli-runner/claude-live-session.ts, src/auto-reply/reply/agent-runner-execution.ts): CLI-generic. The runCliAgent execution path is gated by isCliProvider(cliExecutionProvider, runtimeConfig) at agent-runner-execution.ts:1415, so any CLI runtime emitting agent-event { stream: "tool" } for its run id (whether claude-cli today or future CLIs that adopt the same emission shape) gets the events forwarded to params.opts.onToolStart automatically. This is the matching layer to PR #76914's assistant-bridge addition and shares its serialization, drain, and silentExpected guards.

Layer-by-layer details:

  1. Parser: new dispatchClaudeCliStreamingToolEvent in src/agents/cli-output.ts recognises three claude stream-json record shapes, plus one fallback case:

    • stream_event { event: { type: "content_block_start", content_block: { type: "tool_use", id, name } } } → record id+name in tracker (no emit yet; args haven't streamed).
    • assistant { message: { content: [{ type: "tool_use", id, name, input }] } } → emit onToolUseStart({ toolCallId, name, args: input }) with full args.
    • user { message: { content: [{ type: "tool_result", tool_use_id, is_error, content }] } } → emit onToolResult({ toolCallId, name, isError, result }).
    • content_block_stop without preceding assistant snapshot → fallback emit start with empty args (turn-aborted edge case).

    The shape {phase: "start" | "result", name, toolCallId, args | isError | result} matches what the embedded native runtime already emits at src/agents/pi-embedded-subscribe.handlers.tools.ts:696-705,998-1009, so existing channel consumers pick them up unchanged. Per-parser ToolUseTracker dedupes start/result events. New optional callbacks; when neither is subscribed, behavior is byte-identical to before.

  2. Parser consumer — both claude-cli execution paths in src/agents/cli-runner/execute.ts (live-session and JSONL-streaming) wire the new callbacks to emitAgentEvent({ runId, stream: "tool", data: {...} }). Args and results are passed through sanitizeToolArgs / sanitizeToolResult from src/agents/pi-embedded-subscribe.tools.ts before emission, matching the embedded native runtime's privacy contract at pi-embedded-subscribe.handlers.tools.ts:703,705,1005,1007 — redacts Authorization headers, API key fields, and shell commands containing tokens before they reach the shared bus. The parser is constructed fresh per turn in claude-live-session.ts:createTurn, so the ToolUseTracker is per-turn — no cross-turn state leak.

  3. Agent-event bus consumer (CLI-generic): src/auto-reply/reply/agent-runner-execution.ts adds a parallel rawUnsubscribeToolBridge = onAgentEvent(...) next to the assistant-text bridge from #76914 inside the isCliProvider(...) branch. It filters by runId, respects silentExpected (heartbeat / NO_REPLY runs do not leak), and forwards phase === "start" and "update" events to params.opts.onToolStart. It mirrors #76914's serialization pattern (toolBridgeDelivery Promise chain + drainToolBridgeDelivery) so callbacks land in order even when onToolStart is async, and unsubscribe is mirrored at every place the assistant unsubscribe is called (success, catch, finally) so the listener never leaks.
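
To make the parser step in item 1 concrete, here is a minimal sketch of how the listed record shapes could map onto start/result callbacks with per-parser dedup. It is an illustration only, not the code in src/agents/cli-output.ts: createTrackerSketch, dispatchToolRecordSketch, and the types are hypothetical names, the records are reduced to the fields listed above, and the fallback is simplified to "any announced id that has not started yet".

```typescript
// Hypothetical sketch; names and types are illustrative, not the repo's API.
type Json = Record<string, unknown>;
type ToolStart = { toolCallId: string; name: string; args: unknown };
type ToolResult = { toolCallId: string; name: string | undefined; isError: boolean; result: unknown };

interface ToolCallbacks {
  onToolUseStart?: (event: ToolStart) => void;
  onToolResult?: (event: ToolResult) => void;
}

// Per-parser state: announced names plus the ids that already emitted start/result.
interface TrackerSketch {
  names: Map<string, string>;
  started: Set<string>;
  resulted: Set<string>;
}

function createTrackerSketch(): TrackerSketch {
  return { names: new Map(), started: new Set(), resulted: new Set() };
}

function contentBlocks(record: Json): Json[] {
  const message = record.message as Json | undefined;
  const content = message?.content;
  return Array.isArray(content) ? (content as Json[]) : [];
}

// Emit a start event at most once per tool call id.
function emitStart(tracker: TrackerSketch, cb: ToolCallbacks, id: string, args: unknown): void {
  if (tracker.started.has(id)) return;
  tracker.started.add(id);
  cb.onToolUseStart?.({ toolCallId: id, name: tracker.names.get(id) ?? "unknown", args });
}

function dispatchToolRecordSketch(record: Json, tracker: TrackerSketch, cb: ToolCallbacks): void {
  if (record.type === "stream_event") {
    const event = (record.event ?? {}) as Json;
    if (event.type === "content_block_start") {
      // Record id + name only; args have not streamed yet.
      const block = (event.content_block ?? {}) as Json;
      if (block.type === "tool_use" && typeof block.id === "string" && typeof block.name === "string") {
        tracker.names.set(block.id, block.name);
      }
    } else if (event.type === "content_block_stop") {
      // Fallback (simplified assumption): announced but never started, so emit with empty args.
      tracker.names.forEach((_name, id) => emitStart(tracker, cb, id, {}));
    }
    return;
  }
  if (record.type === "assistant") {
    // Full snapshot: emit start with complete args.
    for (const block of contentBlocks(record)) {
      if (block.type !== "tool_use" || typeof block.id !== "string") continue;
      if (typeof block.name === "string") tracker.names.set(block.id, block.name);
      emitStart(tracker, cb, block.id, block.input ?? {});
    }
    return;
  }
  if (record.type === "user") {
    for (const block of contentBlocks(record)) {
      if (block.type !== "tool_result" || typeof block.tool_use_id !== "string") continue;
      const id = block.tool_use_id;
      if (tracker.resulted.has(id)) continue;
      tracker.resulted.add(id);
      cb.onToolResult?.({
        toolCallId: id,
        name: tracker.names.get(id),
        isError: block.is_error === true,
        result: block.content,
      });
    }
  }
}
```

Because the tracker is created once per parser, constructing the parser fresh per turn gives the per-turn isolation described in item 2.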

No new public APIs in plugin-sdk; channel-side code is unchanged. Telegram, Discord, Slack, etc. preview pipelines automatically light up because they already subscribe to stream: "tool" on the agent-event bus through params.opts.onToolStart.

Scope note: this PR forwards phase: "start" and phase: "update" events through onToolStart (the channel-progress activation surface). phase: "result" is emitted on the bus but not forwarded to the channel here — the embedded handler's result events also include a meta field that this CLI emit omits, and the channel-progress pipeline today only consumes activation events. Adding result-phase forwarding (with meta parity) is left as a follow-up PR if/when a consumer needs completion events on the CLI path.
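
The bridge behavior described above (runId filter, silentExpected gate, start/update-only forwarding, and ordered delivery through a Promise chain with a drain) can be sketched as follows. This is a simplified illustration under assumptions, not the code in agent-runner-execution.ts: createToolBridgeSketch and its types are hypothetical, and swallowing callback errors is my assumption about how the chain is kept alive.

```typescript
// Hypothetical sketch of the tool bridge; not the repo's implementation.
type ToolBusEvent = {
  runId: string;
  stream: string;
  data: { phase: "start" | "update" | "result"; name: string; args?: unknown };
};

type ToolStartPayload = { name: string; phase: "start" | "update"; args?: unknown };

function createToolBridgeSketch(params: {
  runId: string;
  silentExpected: boolean;
  onToolStart?: (payload: ToolStartPayload) => void | Promise<void>;
}) {
  // Tail of the delivery chain; each new event is appended behind the previous one.
  let delivery: Promise<void> = Promise.resolve();

  const handle = (evt: ToolBusEvent): void => {
    // Only tool events for this run are relevant.
    if (evt.stream !== "tool" || evt.runId !== params.runId) return;
    // Heartbeat / NO_REPLY style runs must not leak progress.
    if (params.silentExpected) return;
    const { phase, name, args } = evt.data;
    // Result events stay on the bus but are not forwarded to the channel.
    if (phase !== "start" && phase !== "update") return;
    const deliver = params.onToolStart;
    if (!deliver) return;
    delivery = delivery
      .then(() => deliver({ name, phase, args }))
      // Assumption: a failing callback should not block later events.
      .catch(() => undefined);
  };

  // Resolves once every event queued so far has been delivered.
  const drain = (): Promise<void> => delivery;

  return { handle, drain };
}
```

Chaining each delivery onto the previous one is what keeps callbacks in order when onToolStart is async; awaiting drain() before unsubscribing ensures queued events are not dropped at the end of a turn.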

Verification

  • pnpm test src/agents/cli-output.test.ts → 21/21 PASS (5 new tests: full happy-path, fallback on content_block_stop without assistant snapshot, dedup when both content_block_start and assistant snapshot announce same tool, error tool_result with name passthrough, parser-vs-consumer privacy-contract regression locking in that args/result surface raw at the parser boundary and the consumer applies sanitizeToolArgs/sanitizeToolResult before emission).
  • pnpm test src/auto-reply/reply/agent-runner-execution.test.ts → 73/73 PASS (2 new bridge tests: forwards tool events to onToolStart with correct args; respects silentExpected).
  • pnpm test src/agents/cli-runner → 50/50 PASS.
  • pnpm check:changed → all gates green (typecheck core + tests, lint core, runtime sidecar loaders, import cycles, webhook/pairing guards).
  • pnpm exec oxfmt --check --threads=1 src/agents/cli-output.ts src/agents/cli-output.test.ts src/agents/cli-runner/execute.ts src/agents/cli-runner/claude-live-session.ts src/auto-reply/reply/agent-runner-execution.ts src/auto-reply/reply/agent-runner-execution.test.ts → clean.
  • Pre-existing failures on origin/main@cb86388cec: pi-embedded-runner.run-embedded-pi-agent.auth-profile-rotation.e2e.test.ts has 3 failures + 1 unhandled rejection on clean main (verified by stashing this branch and re-running). The failing backoff/auth-rotation assertion reports: expected "vi.fn()" to be called 2 times, but got 3 times. Unrelated to this PR.

Real behavior proof

Behavior or issue addressed: CLI-runtime turns silently dropping tool_use content blocks and tool_result user-messages between cli-output.ts and the channel preview pipeline, so Telegram's tool-progress UI never lights up for CLI-backed turns even though the underlying CLI is producing the data.

Real environment tested: Live OpenClaw 2026.5.7 deployment on macOS with the claude-cli backend driving Telegram. Same setup I used for the proof on #76914.

Exact steps or command run after this patch: I hand-applied this PR's three structural changes to the installed 2026.5.7 dist files (claude-live-session-*.js, execute.runtime-*.js, agent-runner.runtime-*.js), instrumented each layer with a tracing log, restarted the OpenClaw gateway via launchctl kickstart -k "gui/$(id -u)/ai.openclaw.gateway", and sent a real Telegram turn: "Read /etc/hosts, then read /etc/zshrc, then count their combined lines. Use Bash for the count." — three tool calls per turn, driven through the live gateway. I also captured the parser layer in isolation by piping a real claude-cli --output-format stream-json JSONL fixture through the patched createCliJsonlStreamingParser:

echo "Run 'ls /' and tell me how many entries." \
  | claude -p --output-format stream-json --include-partial-messages \
           --verbose --setting-sources user --allowedTools Bash \
  > /tmp/claude-stream-fixture.jsonl
node --import tsx ./proof.mts

proof.mts:

import { readFileSync } from "node:fs";
import { createCliJsonlStreamingParser } from "./src/agents/cli-output.ts";

// Load the JSONL fixture captured from the real claude CLI run above.
const raw = readFileSync("/tmp/claude-stream-fixture.jsonl", "utf-8");
const events: Array<{ kind: string; data: unknown }> = [];

// Subscribe to the new tool callbacks; assistant text deltas are ignored here.
const parser = createCliJsonlStreamingParser({
  backend: { command: "claude", output: "jsonl", jsonlDialect: "claude-stream-json" },
  providerId: "claude-cli",
  onAssistantDelta: () => undefined,
  onToolUseStart: (d) => events.push({ kind: "tool.start", data: d }),
  onToolResult: (d) => events.push({ kind: "tool.result", data: d }),
});

// Feed the whole fixture through the parser, then flush it.
parser.push(raw);
parser.finish();

for (const e of events) console.log(`${e.kind}: ${JSON.stringify(e.data)}`);

Evidence after fix: Redacted runtime logs from the live Telegram + claude-cli turn, captured directly from ~/.openclaw/logs/gateway.err.log with the dist hand-patches in place. Each layer of the pipeline writes a [demo-trace] line so the chain is visible end-to-end:

[claude-cli stream-json record arrives on stdout]
parser → dispatchClaudeCliStreamingToolEvent type=stream_event
parser → emitToolStart name=Read toolCallId=toolu_...01 hasCallback=true
runtime onToolUseStart name=Read toolCallId=toolu_...01 runId=8be3085a-...
agent-runner BUS evt stream=tool phase=start name=Read
agent-runner forward to onToolStart hasCallback=true name=Read
[Telegram channel] PREVIEW partial-branch enabled=true suppressed=false normalized=📖 Read: from /etc/hosts
[Telegram editMessageText emitted]

Same chain fires for the second Read, then for the Bash call, then for tool results. The parser-isolation harness captured this terminal output for the same kind of turn driven through the real claude CLI:

Captured 2 tool events from 29 JSONL records:
  - tool.start: {"toolCallId":"toolu_01EXAMPLE","name":"Bash","args":{"command":"ls / | wc -l","description":"Count entries in root directory"}}
  - tool.result: {"toolCallId":"toolu_01EXAMPLE","name":"Bash","isError":false,"result":"      16"}

Observed result after fix: With this PR's structural change applied, my Telegram turn against the live gateway showed the bot's preview message updating with 📖 Read: from /etc/hosts, then 📖 Read: from /etc/zshrc, then 🛠️ Bash: ... lines while the agent worked, before delivering the final answer — the same per-tool decoration users already see on embedded-native-runtime agents. With streaming.mode: "progress" the lines accumulate for the full turn; with streaming.mode: "partial" (the default) they flash before the streaming answer text replaces the preview, matching the existing embedded-runtime behavior. Without the patch (stock 2026.5.7), the same prompt produces zero [demo-trace] BUS evt stream=tool log lines and the Telegram preview shows only the assistant text — the onToolStart callback the channel pipeline already exposes is never invoked because no CLI-side emitter publishes to it.

What was not tested: I did not exercise non-claude CLI runtimes (codex-cli, kiro-cli) against the bridge layer. Those don't ship a stream-json parser in this repo, so the parser layer change doesn't apply to them. The bridge layer at agent-runner-execution.ts is gated by isCliProvider(...) and would forward any stream: "tool" events those runtimes might emit in the future, but that path is not exercised here. I also did not test phase: "result" channel-side rendering: the bridge filters out result events to keep parity with #76914's activation-only forwarding (see Scope note above), so the result-phase emit is invisible to consumers that only handle start/update today.

@openclaw-barnacle openclaw-barnacle Bot added the labels "agents" (Agent runtime and tooling), "size: L", and "triage: needs-real-behavior-proof" (Candidate: external PR needs after-fix proof from a real setup) May 10, 2026
clawsweeper Bot commented May 10, 2026

Contributor

Codex review: needs changes before merge. Reviewed May 29, 2026, 3:17 AM ET / 07:17 UTC.

Summary
The PR adds Claude CLI stream-json tool-use/result parsing, sanitized stream: "tool" agent events, and CLI lifecycle forwarding into channel tool-progress callbacks with focused tests.

PR surface: Source +370, Tests +513. Total +883 across 9 files.

Reproducibility: yes. Source inspection gives a high-confidence reproduction path: emit CLI stream: "tool" start/update events during a main message_tool_only turn and the new adapter forwards them unconditionally, while current embedded code and tests suppress progress after message-send completion.

Review metrics: none identified.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🦞 diamond lobster
Patch quality: 🧂 unranked krab
Result: blocked by patch quality or review findings.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Mirror embedded message_tool_only suppression in the main CLI tool bridge.
  • [P2] Add a focused main-path regression test for CLI tool events after message.send source delivery.

Mantis proof suggestion
A native Telegram run would directly show whether claude-cli Read/Bash progress appears and whether source-delivery turns avoid extra progress after message.send. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram desktop proof: verify a claude-cli Telegram turn shows Read/Bash tool progress and suppresses extra progress after message.send source delivery.

Risk before merge

  • [P1] Merging now can surface channel tool-progress callbacks in main-turn CLI message_tool_only runs after a message tool has already delivered the source reply.

Maintainer options:

  1. Mirror embedded source-delivery suppression (recommended)
    Update the CLI tool bridge so main turns carry enough tool lifecycle state to suppress progress after message_tool_only message.send delivery, matching the embedded handler.
  2. Hold for maintainer delivery semantics
    If maintainers want different CLI behavior from embedded source-delivery runs, pause merge until that policy is explicitly recorded and tested.
Recommended automerge instruction:
@clawsweeper automerge

Special instructions:
Update the CLI tool bridge so `sourceReplyDeliveryMode: "message_tool_only"` preserves the embedded handler's suppression after a message.send tool completes. Preserve toolCallId/result state as needed, keep result-phase channel rendering out of scope, and add focused regression coverage in `src/auto-reply/reply/agent-runner-execution.test.ts`.

Next step before merge

  • [P2] A narrow automated repair can preserve the embedded message_tool_only suppression contract in the PR branch and add the missing regression coverage.

Security
Cleared: No concrete security or supply-chain issue found; the diff does not add dependencies, workflows, downloads, or credentials handling, and tool args/results are passed through the existing sanitizer before bus emission.

Review findings

  • [P1] Honor message-tool-only suppression in the main CLI bridge — src/auto-reply/reply/agent-runner-execution.ts:2029-2039
Review details

Best possible solution:

Land the parser and bridge after the main CLI tool bridge preserves the same message-tool-only suppression semantics as the embedded path and the focused regression passes.

Do we have a high-confidence way to reproduce the issue?

Yes, source inspection gives a high-confidence reproduction path: emit CLI stream: "tool" start/update events during a main message_tool_only turn and the new adapter forwards them unconditionally, while current embedded code and tests suppress progress after message-send completion.

Is this the best way to solve the issue?

No, not quite yet: the parser and bridge are the right narrow fix for the CLI/API parity gap, but the main CLI consumer needs the embedded source-delivery suppression before this is the safest implementation.

Full review comments:

  • [P1] Honor message-tool-only suppression in the main CLI bridge — src/auto-reply/reply/agent-runner-execution.ts:2029-2039
    This new adapter forwards every CLI tool start/update straight to onToolStart, but the embedded path tracks message_tool_only message.send calls and suppresses later progress once that source delivery completes. Because the CLI bridge drops the toolCallId/result lifecycle before this layer, a CLI-backed source-delivery turn can still emit channel progress after the message tool already delivered the reply; please preserve the same suppression contract here and cover the main CLI path with a regression test.
    Confidence: 0.89
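
The suppression contract in this finding can be illustrated with a small gate that decides whether a tool event may still be forwarded as channel progress. This is a hedged sketch, not the embedded handler: createProgressGateSketch, ToolLifecycleEvent, and the isMessageSend predicate are hypothetical, and the exact rules (for example, that the message-send start itself is still forwarded) are assumptions. Only the sourceReplyDeliveryMode value "message_tool_only" comes from the review text.

```typescript
// Hypothetical sketch of message_tool_only suppression; not the repo's handler.
type ToolLifecycleEvent = {
  phase: "start" | "update" | "result";
  name: string;
  toolCallId: string;
  isError?: boolean;
};

function createProgressGateSketch(params: {
  sourceReplyDeliveryMode: string;
  // Assumption: the caller knows how to recognise a message-send tool call.
  isMessageSend: (evt: ToolLifecycleEvent) => boolean;
}): (evt: ToolLifecycleEvent) => boolean {
  const pendingSendIds = new Set<string>();
  let sourceDelivered = false;

  // Returns true when the event should be forwarded to onToolStart.
  return (evt) => {
    const gated = params.sourceReplyDeliveryMode === "message_tool_only";
    if (evt.phase === "result") {
      // A successful result for a tracked send marks the source reply as delivered.
      if (gated && pendingSendIds.delete(evt.toolCallId) && evt.isError !== true) {
        sourceDelivered = true;
      }
      return false; // result events are never forwarded as progress
    }
    if (gated && sourceDelivered) return false; // suppress progress after delivery
    if (gated && evt.phase === "start" && params.isMessageSend(evt)) {
      pendingSendIds.add(evt.toolCallId);
    }
    return true;
  };
}
```

The gate needs both toolCallId and result-phase events to know when delivery completed, which is why a bridge that drops that lifecycle before this layer cannot apply the same suppression.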

Overall correctness: patch is incorrect
Overall confidence: 0.9

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 0bc591a7d781.

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body and follow-up comment include redacted live Telegram plus claude-cli logs and parser harness output showing the central tool-progress path after the patch, though the remaining suppression bug still needs a code fix.

Label justifications:

  • P2: This is a normal-priority agent/channel preview bug fix with a concrete remaining message-delivery regression, not an emergency runtime outage.
  • merge-risk: 🚨 message-delivery: The PR changes channel-visible tool-progress delivery and currently can emit extra progress after message_tool_only source delivery on the main CLI path.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🦞 diamond lobster and patch quality is 🧂 unranked krab.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body and follow-up comment include redacted live Telegram plus claude-cli logs and parser harness output showing the central tool-progress path after the patch, though the remaining suppression bug still needs a code fix.
  • mantis: telegram-visible-proof: Mantis should capture Telegram visible proof. The PR changes Telegram-visible tool-progress preview behavior, which is suitable for a short Telegram Desktop proof once the remaining bridge bug is fixed.
Evidence reviewed

PR surface:

Source +370, Tests +513. Total +883 across 9 files.

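PR surface stats:

| Area | Files | Added | Removed | Net |
| --- | --- | --- | --- | --- |
| Source | 6 | 378 | 8 | +370 |
| Tests | 3 | 513 | 0 | +513 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 9 | 891 | 8 | +883 |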

Acceptance criteria:

  • [P1] node scripts/run-vitest.mjs src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/followup-runner.test.ts src/agents/cli-output.test.ts --run.
  • [P1] pnpm tsgo:core.
  • [P1] pnpm tsgo:core:test.

What I checked:

  • Repository policy read: Root AGENTS.md and the scoped agents guide were read; the review applied the OpenClaw guidance that channel delivery and agent runtime changes need sibling-surface proof and real behavior proof. (AGENTS.md:8, 0bc591a7d781)
  • Telegram maintainer note: The Telegram note says streaming owns one visible preview message and Telegram behavior PRs need real Telegram proof when they touch streaming or transport-visible behavior. (.agents/maintainer-notes/telegram.md:5, 0bc591a7d781)
  • PR bridge source: At PR head, the new main-run onToolEvent adapter forwards every start/update directly to signalToolStart and onToolStart without checking sourceReplyDeliveryMode or message-tool completion state. (src/auto-reply/reply/agent-runner-execution.ts:2029, 52d11f5743cb)
  • Current embedded suppression contract: Current main's embedded handler tracks message_tool_only message-send tool calls, returns once progress should be suppressed, and only then forwards onToolStart. (src/auto-reply/reply/agent-runner-execution.ts:2267, 0bc591a7d781)
  • Existing regression coverage: Current main already has a regression test proving progress callbacks are suppressed after message_tool_only delivery completes on the embedded path. (src/auto-reply/reply/agent-runner-execution.test.ts:3683, 0bc591a7d781)
  • PR parser source: At PR head, the parser recognizes Claude tool_use, server_tool_use, mcp_tool_use, and tool-result records, and emits start/result callbacks through a per-parser tracker. (src/agents/cli-output.ts:490, 52d11f5743cb)

Likely related people:

  • Peter Steinberger: Current-main blame on the embedded tool-progress and message_tool_only suppression block points to recent work in agent-runner-execution.ts. (role: recent area contributor; confidence: medium; commits: 0e86ca135225, 560c7440fb88; files: src/auto-reply/reply/agent-runner-execution.ts)
  • jack-stormentswe: The linked merged CLI assistant preview bridge was credited to this contributor and is the adjacent bridge shape this PR extends to tool events. (role: adjacent feature contributor; confidence: medium; commits: 560c7440fb88; files: src/auto-reply/reply/agent-runner-execution.ts)
  • Ayaan Zaidi: Merged reasoning-bridge work touched the same CLI lifecycle path shortly before this PR and is relevant to the shared bridge structure. (role: adjacent bridge contributor; confidence: medium; commits: 02f2e08493f4; files: src/auto-reply/reply/agent-runner-execution.ts)
  • Cameron Beeley: A nearby commit added the CLI assistant text-delta reasoning bridge that this PR's tool bridge follows. (role: adjacent bridge contributor; confidence: low; commits: 01b55b5c6876; files: src/auto-reply/reply/agent-runner-execution.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added the "proof: sufficient" label (ClawSweeper judged the real behavior proof convincing) May 10, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the "proof: sufficient" label May 10, 2026
@adele-with-a-b adele-with-a-b force-pushed the fix/cli-backend-telegram-tool-progress branch from e82b32c to b761ec2 May 10, 2026 01:44
@openclaw-barnacle openclaw-barnacle Bot added the "proof: supplied" label (External PR includes structured after-fix real behavior proof) and removed the "triage: needs-real-behavior-proof" label May 10, 2026
@adele-with-a-b adele-with-a-b force-pushed the fix/cli-backend-telegram-tool-progress branch 2 times, most recently from dc4dd60 to b58bfe6 May 10, 2026 02:01
@adele-with-a-b adele-with-a-b marked this pull request as ready for review May 10, 2026 02:01
@clawsweeper clawsweeper Bot added the "proof: sufficient" label May 10, 2026
obviyus pushed a commit to anagnorisis2peripeteia/openclaw that referenced this pull request May 14, 2026
Add a CLI-runtime-gated bridge in runAgentTurnWithFallback that subscribes
to `stream: "assistant"` agent-events for the current runId and re-emits
them as reasoning content through `params.opts.onReasoningStream`. Mirrors
the assistant-text bridge from openclaw#76914 and the tool-event bridge from openclaw#80046:
same Promise-chain serialization + drain, same silentExpected gate, same
unsubscribe pattern at success/catch/finally.

The reply lane is untouched -- `onPartialReply` continues to settle the
final assistant text via openclaw#76914. The reasoning lane now reflects the
model's live text output during streaming, which is the only "what is the
model producing right now" signal available for claude-opus-4-7 over
claude-cli (Anthropic suppresses readable thinking_delta events on the
wire for opus-4-7; only thinking content_block + signature_delta arrive).

The bridge is gated on isCliProvider so API/native runtimes that already
get reasoning content from real thinking_delta events do NOT double-receive
text_delta as reasoning.

Tests cover:
- Forwards assistant agent-events to onReasoningStream with correct text
- Respects silentExpected (heartbeat / NO_REPLY runs don't emit)
- Does not fire on the API/native runtime path (gate works)
obviyus pushed a commit that referenced this pull request May 14, 2026
@clawsweeper clawsweeper Bot added the "mantis: telegram-visible-proof" label (Mantis should capture Telegram visible proof) May 14, 2026
@adele-with-a-b adele-with-a-b force-pushed the fix/cli-backend-telegram-tool-progress branch from b58bfe6 to ee29656 May 19, 2026 16:30
@openclaw-barnacle openclaw-barnacle Bot removed the "proof: sufficient" label May 19, 2026
@adele-with-a-b

Contributor Author

@clawsweeper re-review

Rebased onto current upstream/main at 48acdd3d85 and reconciled with the new runCliAgentWithLifecycle reusable bridge shape per the prior review's "rebase + integrate the tool bridge alongside the newer assistant/reasoning bridge structure" recommendation. Old SHA b58bfe6dd3 → new SHA ee29656ad9.

What changed in the rebase

The pre-rebase PR did the bridging inline in agent-runner-execution.ts (~190 lines of subscribe/unsubscribe/drain). Upstream main has since refactored that surface into runCliAgentWithLifecycle (src/auto-reply/reply/agent-runner-cli-dispatch.ts) with onAssistantText / onReasoningText / onErrorBeforeLifecycle callbacks and a shared bridge lifecycle. The new shape:

  • src/auto-reply/reply/agent-runner-cli-dispatch.ts — adds a CliToolEventPayload type and a new createToolEventBridge helper symmetric to the existing createAssistantTextBridge. Adds an optional onToolEvent parameter to runCliAgentWithLifecycle. Subscribes/unsubscribes/drains the tool bridge in the existing try/catch/finally alongside assistant + reasoning. The bridge filters phase to "start" | "update" (matches the embedded native path at agent-runner-execution.ts:1885) and shares the suppressAssistantBridge flag (= silentExpected).
  • src/auto-reply/reply/agent-runner-execution.ts:1660 — adds an onToolEvent adapter that forwards the bus event to params.opts?.onToolStart({name, phase, args, detailMode: toolProgressDetail}) and pairs it with params.typingSignals.signalToolStart() via Promise.all, mirroring the embedded native handler at 1882-1890.
  • src/auto-reply/reply/followup-runner.ts:611 — symmetric wiring: onToolEvent forwards to opts?.onToolStart with detailMode: toolProgressDetail.
  • Parser + emit changes (cli-output.ts, cli-output.test.ts, claude-live-session.ts, execute.ts) — unchanged from the pre-rebase PR. The bus-emit shape for stream: "tool" events matches what the embedded native runtime emits.
  • Tests (agent-runner-execution.test.ts:1357 + :1429) — survived the rebase cleanly. Both still assert the bus-emit → bridge → onToolStart path: 1357 verifies start phase reaches onToolStart while result phase is filtered, 1429 verifies silentExpected suppresses tool events.
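
As a rough illustration of the onToolEvent adapter described in the second bullet, the sketch below pairs the channel callback with the typing signal via Promise.all. The names createOnToolEventAdapterSketch and AdapterDeps are hypothetical, and typing detailMode as a plain string is an assumption; only the payload fields (name, phase, args, detailMode) and the Promise.all pairing come from the description above.

```typescript
// Hypothetical sketch of the onToolEvent adapter; types are assumptions.
type CliToolEvent = { name: string; phase: "start" | "update"; args?: unknown };

interface AdapterDeps {
  // Channel-side progress callback (optional, as on the real opts object).
  onToolStart?: (payload: CliToolEvent & { detailMode: string }) => void | Promise<void>;
  // Wakes the typing indicator when a tool starts running.
  signalToolStart: () => void | Promise<void>;
  toolProgressDetail: string;
}

function createOnToolEventAdapterSketch(deps: AdapterDeps) {
  return async ({ name, phase, args }: CliToolEvent): Promise<void> => {
    // Run both side effects concurrently and wait for both to settle successfully.
    await Promise.all([
      deps.signalToolStart(),
      deps.onToolStart?.({ name, phase, args, detailMode: deps.toolProgressDetail }),
    ]);
  };
}
```

Running the two calls together means a slow channel edit does not delay the typing indicator, and the adapter still works when no onToolStart callback is registered.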

Reviewer pass before push

Ran our own reviewer on the rebased diff before pushing. Findings addressed in the amended commit:

  1. CHANGELOG.md — dropped entirely. Per repo policy ("Contributor PR authors should not edit CHANGELOG.md; maintainer/AI adds entries during landing/merge") and to remove the duplicated bullets / stale rebase block. Maintainer can add the changelog entry on landing.
  2. Typing wake-up parity — added params.typingSignals.signalToolStart() to the onToolEvent adapter at agent-runner-execution.ts:1660, mirroring the embedded native path at 1882-1890. Without this, a long-running Bash tool on a CLI-backed run would not start the typing indicator the way the same tool does on the embedded path.
  3. Documented silentExpected divergence — added a comment at agent-runner-cli-dispatch.ts:150 explaining that the CLI path is stricter than embedded-native (which doesn't gate tool events on silentExpected). This is intentional and locked in by test 1429.

Real-runtime proof from the rebased branch

$ node scripts/run-vitest.mjs run \
    src/agents/cli-output.test.ts \
    src/auto-reply/reply/agent-runner-execution.test.ts
 RUN  v4.1.6 /Users/adele_with_a_b/workplace/oss/openclaw

 Test Files  2 passed (2)
      Tests  126 passed (126)
   Duration  ~5.7s
$ pnpm tsgo:core         # exit 0 (silent)
$ pnpm tsgo:core:test    # exit 0 (silent)
$ pnpm exec oxfmt --check --threads=1 \
    src/auto-reply/reply/agent-runner-cli-dispatch.ts \
    src/auto-reply/reply/agent-runner-execution.ts
All matched files use the correct format.
$ node scripts/run-oxlint.mjs \
    src/auto-reply/reply/agent-runner-cli-dispatch.ts \
    src/auto-reply/reply/agent-runner-execution.ts
Found 0 warnings and 0 errors.
$ pnpm build             # exit 0

Bundle proof that the new helper-shape integration flows through:

$ grep -n "createToolEventBridge\|onToolEvent" \
    dist/agent-runner-execution-*.js dist/agent-runner.runtime-*.js | head
dist/agent-runner-execution-Czo2UYph.js:77: function createToolEventBridge(params) {
dist/agent-runner-execution-Czo2UYph.js:132: const toolBridge = createToolEventBridge({
dist/agent-runner-execution-Czo2UYph.js:135:   deliver: params.onToolEvent
dist/agent-runner-execution-Czo2UYph.js:1196:   onToolEvent: async ({ name, phase, args }) => {
dist/agent-runner.runtime-KGMqduG3.js:1909:    onToolEvent: async ({ name, phase, args }) => {

Both call sites (agent-runner-execution-*.js:1196 for the runWithModelFallback path, agent-runner.runtime-*.js:1909 for the followup-runner.ts path) pass onToolEvent through. createToolEventBridge lives in the shared dispatch module exactly as the source intends.

What this does NOT prove

A live claude-cli-backed Telegram run with the freshly-built bundle is not exercised here. The original PR (pre-rebase) included redacted live Telegram logs at b58bfe6dd3 and ClawSweeper marked the proof as sufficient. The rebase preserves that emit path bit-for-bit (no changes to cli-output.ts, cli-output.test.ts, claude-live-session.ts, execute.ts); only the consumer side moved into the new helper. The unit-level proof above plus the bundle-grep (showing createToolEventBridge and onToolEvent reaching both consumer sites) cover the consumer-side change comprehensively.

@clawsweeper

clawsweeper Bot commented May 19, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P2 Normal backlog priority with limited blast radius. labels May 19, 2026
anagnorisis2peripeteia added a commit to anagnorisis2peripeteia/openclaw that referenced this pull request May 20, 2026
… CLI tool-event surface

PR openclaw#80046 (fix/cli-backend-telegram-tool-progress) is the canonical CLI
tool-event emitter: it adds `dispatchClaudeCliStreamingToolEvent` to
cli-output.ts, recognises three claude `--output-format stream-json` record
shapes (content_block_start + assistant tool_use snapshot + user tool_result),
sanitises args AND results via the embedded-native-runtime contract, dedupes
via a per-turn ToolUseTracker, and ships the matching agent-event bus →
opts.onToolStart bridge in agent-runner-execution.ts gated by isCliProvider().

openclaw#82285 was shipping a parallel emitter with a different callback signature
(`onToolEvent` phase-dispatch vs openclaw#80046's explicit `onToolUseStart` +
`onToolResult`), narrower record-shape coverage (only the streaming
content_block path), no result-phase, and a duplicate followup-path bridge.
Landing both as-is would double-emit on every tool call and ship two
diverging contracts.

Decision (per Cameron): openclaw#80046 lands first; openclaw#82285 reduces to the
Telegram-side rendering layer that consumes openclaw#80046's emitted bus events.

Restores to the pre-openclaw#82285 state:
- src/agents/cli-output.ts — drops ClaudeToolEvent, ClaudeToolBlockEntry,
  createClaudeToolUseTracker, sanitizeToolArgs import, all thinking_delta
  field additions (openclaw#80046 owns these)
- src/agents/cli-output.test.ts — drops the Claude tool-use tracker tests
- src/agents/cli-runner/execute.ts — drops onToolEvent forwarding in both
  live-session and headless JSONL paths
- src/agents/cli-runner/claude-live-session.ts — drops onToolEvent
  passthrough through createTurn + runClaudeLiveSessionTurn
- src/auto-reply/reply/agent-runner-execution.ts — drops the followup-path
  cliToolBridge (openclaw#80046's bus → onToolStart bridge will serve the followup
  path too once it lands; reuse not duplication)
- src/gateway/server-chat.ts — drops the `replacement` flag forwarding (only
  emitter was the now-removed in-text rolling timer)

Kept and shipping in this PR:
- extensions/telegram/src/bot-message-dispatch.ts — interleaved-output state,
  rolling-timer interval, injectToolLineIntoInterleave, updateInterleavedDisplay,
  finalize-ordering fix, and integration with the existing onToolStart /
  onReasoningStream / onReasoningEnd callbacks. This is the user-facing
  rendering layer — single Telegram message with interleaved reasoning
  (italic) + tool-progress (`[HH:MM:SS] ToolName: detail`) + rolling
  `_Ns — HH:MM:SS_` timer suffix while a tool runs.

Net diff is now extensions/telegram/* only.

@anagnorisis2peripeteia anagnorisis2peripeteia left a comment

Nice work consolidating the CLI tool-event surface into one parser. Two findings while integrating this with a downstream Telegram-render PR that depends on the bus events you ship:

Blocking finding (P1) — every tool call ships with args: {}:

dispatchClaudeCliStreamingToolEvent registers the tool at content_block_start (id + name only) and never accumulates the input_json_delta chunks that carry the args. The content_block_stop branch then emits via emitToolStartOnce with hardcoded args: {}. The assistant snapshot branch you have as a backup is suppressed by startedIds (which content_block_stop just populated), so it can't backfill. Net: every onToolUseStart callback fires with empty args, downstream channels render Bash: / Read: / WebFetch: lines with no detail.

Repro: any real claude-cli-interactive turn through this parser produces args: {} events at the stream:"tool" agent-event bus. Verified live against the gateway running this branch.

The fix is small (3-line patch): track inputJsonParts on the pending entry, append each input_json_delta's partial_json, JSON.parse on stop. Suggestion blocks inline below.

Optional nit — block-type coverage:

The parser only matches block.type === "tool_use". Misses server_tool_use (Anthropic's hosted tools like web_search) and mcp_tool_use (their newer 2025 native MCP integration where claude talks to an MCP server server-side). Not exercised by current OpenClaw workflows so non-blocking — but worth a 2-line || block.type === "server_tool_use" || block.type === "mcp_tool_use" to future-proof.
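One way to express the broader block-type match suggested in the nit is a small predicate. This is only a sketch: `isToolUseBlock` and `TOOL_USE_BLOCK_TYPES` are hypothetical names, and the set simply lists the three block types named in the comment above.

```typescript
// Block types named in the review comment; kept in one place so the
// parser's match condition stays a single call.
const TOOL_USE_BLOCK_TYPES: ReadonlySet<string> = new Set([
  "tool_use",
  "server_tool_use",
  "mcp_tool_use",
]);

// Hypothetical helper: true when the value looks like a tool-use content block.
function isToolUseBlock(block: unknown): block is { type: string } {
  if (typeof block !== "object" || block === null) return false;
  const type = (block as { type?: unknown }).type;
  return typeof type === "string" && TOOL_USE_BLOCK_TYPES.has(type);
}
```

A parser branch could then test `isToolUseBlock(block)` instead of comparing `block.type` against a single literal.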

Comment thread src/agents/cli-output.ts Outdated
};
}

type PendingToolUse = { toolCallId: string; name: string };

Extend the pending shape to carry accumulated partial-JSON chunks:

Suggested change
- type PendingToolUse = { toolCallId: string; name: string };
+ type PendingToolUse = {
+   toolCallId: string;
+   name: string;
+   inputJsonParts: string[];
+ };

Comment thread src/agents/cli-output.ts Outdated
const toolCallId = typeof block.id === "string" ? block.id.trim() : "";
const name = typeof block.name === "string" ? block.name.trim() : "";
if (toolCallId && name) {
tracker.pendingByIndex.set(event.index, { toolCallId, name });

Initialise the accumulator when registering the pending tool, and add a content_block_delta handler that pushes each input_json_delta.partial_json chunk into it. Paired with the content_block_stop change below, this is the full fix for args: {}.

Suggested change
-       tracker.pendingByIndex.set(event.index, { toolCallId, name });
+       tracker.pendingByIndex.set(event.index, { toolCallId, name, inputJsonParts: [] });
+     }
+   }
+   return;
+ }
+ if (
+   event.type === "content_block_delta" &&
+   typeof event.index === "number" &&
+   isRecord(event.delta)
+ ) {
+   if (event.delta.type === "input_json_delta" && typeof event.delta.partial_json === "string") {
+     const pending = tracker.pendingByIndex.get(event.index);
+     pending?.inputJsonParts.push(event.delta.partial_json);
+   }
+   return;
+ }

Comment thread src/agents/cli-output.ts Outdated
const pending = tracker.pendingByIndex.get(event.index);
tracker.pendingByIndex.delete(event.index);
if (pending) {
emitToolStartOnce(tracker, pending.toolCallId, pending.name, {}, params.onToolUseStart);

Parse the accumulated parts and emit with real args. The assistant-snapshot path still exists as a backup for the turn-aborted case (when no input_json_delta chunks arrived).

Suggested change
- emitToolStartOnce(tracker, pending.toolCallId, pending.name, {}, params.onToolUseStart);
+ let args: Record<string, unknown> = {};
+ if (pending.inputJsonParts.length > 0) {
+   try {
+     const parsed: unknown = JSON.parse(pending.inputJsonParts.join(""));
+     if (parsed && typeof parsed === "object" && !Array.isArray(parsed)) {
+       args = parsed as Record<string, unknown>;
+     }
+   } catch {
+     // Malformed/truncated partial JSON: fall through with empty
+     // args; assistant snapshot branch may still backfill below.
+   }
+ }
+ emitToolStartOnce(tracker, pending.toolCallId, pending.name, args, params.onToolUseStart);
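Taken together, the three suggestions in this review amount to a start/delta/stop accumulator. The standalone sketch below shows that flow end to end under simplifying assumptions: the event union is a reduced version of the streaming record shapes, and `createToolUseAccumulator`, `parseArgs`, and `ToolStart` are hypothetical names rather than the helpers in cli-output.ts.

```typescript
// Simplified event shapes, assumed for this example only.
type StreamEvent =
  | {
      type: "content_block_start";
      index: number;
      content_block: { type: string; id: string; name: string };
    }
  | {
      type: "content_block_delta";
      index: number;
      delta: { type: string; partial_json?: string };
    }
  | { type: "content_block_stop"; index: number };

type PendingToolUse = { toolCallId: string; name: string; inputJsonParts: string[] };
type ToolStart = { toolCallId: string; name: string; args: Record<string, unknown> };

// Join the chunks and parse; anything that is not a JSON object yields {}.
function parseArgs(parts: string[]): Record<string, unknown> {
  if (parts.length === 0) return {};
  try {
    const parsed: unknown = JSON.parse(parts.join(""));
    if (parsed && typeof parsed === "object" && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>;
    }
  } catch {
    // Truncated or malformed JSON: fall back to empty args.
  }
  return {};
}

function createToolUseAccumulator(onToolUseStart: (e: ToolStart) => void) {
  const pendingByIndex = new Map<number, PendingToolUse>();
  const startedIds = new Set<string>();
  return (event: StreamEvent): void => {
    switch (event.type) {
      case "content_block_start": {
        const block = event.content_block;
        if (block.type === "tool_use" && block.id && block.name) {
          // Register the tool with an empty accumulator for its args.
          pendingByIndex.set(event.index, {
            toolCallId: block.id,
            name: block.name,
            inputJsonParts: [],
          });
        }
        return;
      }
      case "content_block_delta": {
        const { delta } = event;
        if (delta.type === "input_json_delta" && typeof delta.partial_json === "string") {
          pendingByIndex.get(event.index)?.inputJsonParts.push(delta.partial_json);
        }
        return;
      }
      case "content_block_stop": {
        const pending = pendingByIndex.get(event.index);
        pendingByIndex.delete(event.index);
        // Emit at most once per tool call id.
        if (!pending || startedIds.has(pending.toolCallId)) return;
        startedIds.add(pending.toolCallId);
        onToolUseStart({
          toolCallId: pending.toolCallId,
          name: pending.name,
          args: parseArgs(pending.inputJsonParts),
        });
        return;
      }
    }
  };
}
```

Emitting on `content_block_stop` rather than on `content_block_start` is what lets the callback carry complete args; the cost is that the start event is delayed until the last chunk of the tool input has arrived.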

@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 22, 2026
@adele-with-a-b adele-with-a-b force-pushed the fix/cli-backend-telegram-tool-progress branch from 059c54a to 005dfaf Compare May 22, 2026 15:49
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 22, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 22, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
Add a CLI-runtime-gated bridge in runAgentTurnWithFallback that subscribes
to `stream: "assistant"` agent-events for the current runId and re-emits
them as reasoning content through `params.opts.onReasoningStream`. Mirrors
the assistant-text bridge from openclaw#76914 and the tool-event bridge from openclaw#80046:
same Promise-chain serialization + drain, same silentExpected gate, same
unsubscribe pattern at success/catch/finally.

The reply lane is untouched -- `onPartialReply` continues to settle the
final assistant text via openclaw#76914. The reasoning lane now reflects the
model's live text output during streaming, which is the only "what is the
model producing right now" signal available for claude-opus-4-7 over
claude-cli (Anthropic suppresses readable thinking_delta events on the
wire for opus-4-7; only thinking content_block + signature_delta arrive).

The bridge is gated on isCliProvider so API/native runtimes that already
get reasoning content from real thinking_delta events do NOT double-receive
text_delta as reasoning.

Tests cover:
- Forwards assistant agent-events to onReasoningStream with correct text
- Respects silentExpected (heartbeat / NO_REPLY runs don't emit)
- Does not fire on the API/native runtime path (gate works)
@obviyus obviyus self-assigned this May 29, 2026
@obviyus obviyus force-pushed the fix/cli-backend-telegram-tool-progress branch from 005dfaf to 104b7cb Compare May 29, 2026 06:39
@openclaw-barnacle openclaw-barnacle Bot added size: L and removed size: XL proof: sufficient ClawSweeper judged the real behavior proof convincing. labels May 29, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels May 29, 2026
@obviyus obviyus force-pushed the fix/cli-backend-telegram-tool-progress branch from 104b7cb to 52d11f5 Compare May 29, 2026 07:10
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 29, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 29, 2026
@obviyus obviyus force-pushed the fix/cli-backend-telegram-tool-progress branch from 52d11f5 to 391fe55 Compare May 29, 2026 07:27
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 29, 2026
@obviyus obviyus merged commit 9de6abd into openclaw:main May 29, 2026
101 of 102 checks passed
@obviyus

obviyus commented May 29, 2026

Landed via rebase onto main.

  • Scoped tests: node scripts/run-vitest.mjs src/agents/cli-output.test.ts src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/followup-runner.test.ts
  • Format/lint: pnpm exec oxfmt --check --threads=1 src/agents/cli-output.test.ts src/agents/cli-output.ts src/agents/cli-runner/claude-live-session.ts src/agents/cli-runner/execute.ts src/auto-reply/reply/agent-runner-cli-dispatch.ts src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/agent-runner-execution.ts src/auto-reply/reply/followup-runner.test.ts src/auto-reply/reply/followup-runner.ts; node scripts/run-oxlint.mjs src/agents/cli-output.test.ts src/agents/cli-output.ts src/agents/cli-runner/claude-live-session.ts src/agents/cli-runner/execute.ts src/auto-reply/reply/agent-runner-cli-dispatch.ts src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/agent-runner-execution.ts src/auto-reply/reply/followup-runner.test.ts src/auto-reply/reply/followup-runner.ts; git diff --check origin/main...HEAD
  • Type proof: node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test-pr80046.tsbuildinfo
  • Autoreview: .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main reported no accepted/actionable findings before landing fixes.
  • Real E2E: local Claude CLI claude -p 'Use the Bash tool to run: printf OPENCLAW_E2E_TOOL_PROGRESS. Then answer with exactly DONE.' --output-format stream-json --include-partial-messages --verbose --allowedTools Bash emitted Bash tool_use start, chunked args, matching tool_result, and final DONE; OpenClaw lifecycle harness emitted the Bash tool start event from the real claude binary.
  • CI: required GitHub checks green on 391fe55589b498ee5e727c2c85f855b7a7abb9eb, including check-test-types.
  • Changelog: skipped; CHANGELOG.md is release-owned, and release-note context stays in the PR body.
  • Land commit: 391fe55589b498ee5e727c2c85f855b7a7abb9eb
  • Merge commit: 9de6abd8d7753aee2ecd16c863b08225f3c218b6

Thanks @adele-with-a-b!

jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026

Labels

  • agents (Agent runtime and tooling)
  • mantis: telegram-visible-proof (Mantis should capture Telegram visible proof.)
  • merge-risk: 🚨 message-delivery (🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.)
  • P2 (Normal backlog priority with limited blast radius.)
  • proof: supplied (External PR includes structured after-fix real behavior proof.)
  • rating: 🧂 unranked krab (Not merge-ready due to missing proof or serious correctness/safety concerns.)
  • size: L
  • status: ⏳ waiting on author (ClawSweeper has contributor-facing work open and is waiting for author action.)

3 participants