fix(agents/cli): bridge CLI tool_use lifecycle events to channel preview#80046
Conversation
|
Codex review: needs changes before merge. Reviewed May 29, 2026, 3:17 AM ET / 07:17 UTC. Summary PR surface: Source +370, Tests +513. Total +883 across 9 files. Reproducibility: yes. source inspection gives a high-confidence reproduction path: emit CLI Review metrics: none identified. Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Mantis proof suggestion Risk before merge
Maintainer options:
Copy recommended automerge instructionNext step before merge
Security Review findings
Review detailsBest possible solution: Land the parser and bridge after the main CLI tool bridge preserves the same message-tool-only suppression semantics as the embedded path and the focused regression passes. Do we have a high-confidence way to reproduce the issue? Yes, source inspection gives a high-confidence reproduction path: emit CLI Is this the best way to solve the issue? No, not quite yet: the parser and bridge are the right narrow fix for the CLI/API parity gap, but the main CLI consumer needs the embedded source-delivery suppression before this is the safest implementation. Full review comments:
Overall correctness: patch is incorrect AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against 0bc591a7d781. Label changesLabel changes:
Label justifications:
Evidence reviewedPR surface: Source +370, Tests +513. Total +883 across 9 files. View PR surface stats
Acceptance criteria:
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
e82b32c to
b761ec2
Compare
dc4dd60 to
b58bfe6
Compare
Add a CLI-runtime-gated bridge in runAgentTurnWithFallback that subscribes to `stream: "assistant"` agent-events for the current runId and re-emits them as reasoning content through `params.opts.onReasoningStream`. Mirrors the assistant-text bridge from openclaw#76914 and the tool-event bridge from openclaw#80046: same Promise-chain serialization + drain, same silentExpected gate, same unsubscribe pattern at success/catch/finally. The reply lane is untouched -- `onPartialReply` continues to settle the final assistant text via openclaw#76914. The reasoning lane now reflects the model's live text output during streaming, which is the only "what is the model producing right now" signal available for claude-opus-4-7 over claude-cli (Anthropic suppresses readable thinking_delta events on the wire for opus-4-7; only thinking content_block + signature_delta arrive). The bridge is gated on isCliProvider so API/native runtimes that already get reasoning content from real thinking_delta events do NOT double-receive text_delta as reasoning. Tests cover: - Forwards assistant agent-events to onReasoningStream with correct text - Respects silentExpected (heartbeat / NO_REPLY runs don't emit) - Does not fire on the API/native runtime path (gate works)
Add a CLI-runtime-gated bridge in runAgentTurnWithFallback that subscribes to `stream: "assistant"` agent-events for the current runId and re-emits them as reasoning content through `params.opts.onReasoningStream`. Mirrors the assistant-text bridge from #76914 and the tool-event bridge from #80046: same Promise-chain serialization + drain, same silentExpected gate, same unsubscribe pattern at success/catch/finally. The reply lane is untouched -- `onPartialReply` continues to settle the final assistant text via #76914. The reasoning lane now reflects the model's live text output during streaming, which is the only "what is the model producing right now" signal available for claude-opus-4-7 over claude-cli (Anthropic suppresses readable thinking_delta events on the wire for opus-4-7; only thinking content_block + signature_delta arrive). The bridge is gated on isCliProvider so API/native runtimes that already get reasoning content from real thinking_delta events do NOT double-receive text_delta as reasoning. Tests cover: - Forwards assistant agent-events to onReasoningStream with correct text - Respects silentExpected (heartbeat / NO_REPLY runs don't emit) - Does not fire on the API/native runtime path (gate works)
b58bfe6 to
ee29656
Compare
|
@clawsweeper re-review Rebased onto current What changed in the rebaseThe pre-rebase PR did the bridging inline in
Reviewer pass before pushRan our own reviewer on the rebased diff before pushing. Findings addressed in the amended commit:
Real-runtime proof from the rebased branchBundle proof that the new helper-shape integration flows through: Both call sites ( What this does NOT proveA live |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
… CLI tool-event surface PR openclaw#80046 (fix/cli-backend-telegram-tool-progress) is the canonical CLI tool-event emitter: it adds `dispatchClaudeCliStreamingToolEvent` to cli-output.ts, recognises three claude `--output-format stream-json` record shapes (content_block_start + assistant tool_use snapshot + user tool_result), sanitises args AND results via the embedded-native-runtime contract, dedupes via a per-turn ToolUseTracker, and ships the matching agent-event bus → opts.onToolStart bridge in agent-runner-execution.ts gated by isCliProvider(). openclaw#82285 was shipping a parallel emitter with a different callback signature (`onToolEvent` phase-dispatch vs openclaw#80046's explicit `onToolUseStart` + `onToolResult`), narrower record-shape coverage (only the streaming content_block path), no result-phase, and a duplicate followup-path bridge. Landing both as-is would double-emit on every tool call and ship two diverging contracts. Decision (per Cameron): openclaw#80046 lands first; openclaw#82285 reduces to the Telegram-side rendering layer that consumes openclaw#80046's emitted bus events. Restores to the pre-openclaw#82285 state: - src/agents/cli-output.ts — drops ClaudeToolEvent, ClaudeToolBlockEntry, createClaudeToolUseTracker, sanitizeToolArgs import, all thinking_delta field additions (openclaw#80046 owns these) - src/agents/cli-output.test.ts — drops the Claude tool-use tracker tests - src/agents/cli-runner/execute.ts — drops onToolEvent forwarding in both live-session and headless JSONL paths - src/agents/cli-runner/claude-live-session.ts — drops onToolEvent passthrough through createTurn + runClaudeLiveSessionTurn - src/auto-reply/reply/agent-runner-execution.ts — drops the followup-path cliToolBridge (openclaw#80046's bus → onToolStart bridge will serve the followup path too once it lands; reuse not duplication) - src/gateway/server-chat.ts — drops the `replacement` flag forwarding (only emitter was the now-removed in-text rolling timer) Kept and shipping in this PR: - extensions/telegram/src/bot-message-dispatch.ts — interleaved-output state, rolling-timer interval, injectToolLineIntoInterleave, updateInterleavedDisplay, finalize-ordering fix, and integration with the existing onToolStart / onReasoningStream / onReasoningEnd callbacks. This is the user-facing rendering layer — single Telegram message with interleaved reasoning (italic) + tool-progress (`[HH:MM:SS] ToolName: detail`) + rolling `_Ns — HH:MM:SS_` timer suffix while a tool runs. Net diff is now extensions/telegram/* only.
anagnorisis2peripeteia
left a comment
There was a problem hiding this comment.
Nice work consolidating the CLI tool-event surface into one parser. Two findings while integrating this with a downstream Telegram-render PR that depends on the bus events you ship:
Blocking finding (P1) — every tool call ships with args: {}:
dispatchClaudeCliStreamingToolEvent registers the tool at content_block_start (id + name only) and never accumulates the input_json_delta chunks that carry the args. The content_block_stop branch then emits via emitToolStartOnce with hardcoded args: {}. The assistant snapshot branch you have as a backup is suppressed by startedIds (which content_block_stop just populated), so it can't backfill. Net: every onToolUseStart callback fires with empty args, downstream channels render Bash: / Read: / WebFetch: lines with no detail.
Repro: any real claude-cli-interactive turn through this parser produces args: {} events at the stream:"tool" agent-event bus. Verified live against the gateway running this branch.
The fix is small (3-line patch): track inputJsonParts on the pending entry, append each input_json_delta's partial_json, JSON.parse on stop. Suggestion blocks inline below.
Optional nit — block-type coverage:
The parser only matches block.type === "tool_use". Misses server_tool_use (Anthropic's hosted tools like web_search) and mcp_tool_use (their newer 2025 native MCP integration where claude talks to an MCP server server-side). Not exercised by current OpenClaw workflows so non-blocking — but worth a 2-line || block.type === "server_tool_use" || block.type === "mcp_tool_use" to future-proof.
| }; | ||
| } | ||
|
|
||
| type PendingToolUse = { toolCallId: string; name: string }; |
There was a problem hiding this comment.
Extend the pending shape to carry accumulated partial-JSON chunks:
| type PendingToolUse = { toolCallId: string; name: string }; | |
| type PendingToolUse = { | |
| toolCallId: string; | |
| name: string; | |
| inputJsonParts: string[]; | |
| }; |
| const toolCallId = typeof block.id === "string" ? block.id.trim() : ""; | ||
| const name = typeof block.name === "string" ? block.name.trim() : ""; | ||
| if (toolCallId && name) { | ||
| tracker.pendingByIndex.set(event.index, { toolCallId, name }); |
There was a problem hiding this comment.
Initialise the accumulator when registering the pending tool, and add a content_block_delta handler that pushes each input_json_delta.partial_json chunk into it. Paired with the content_block_stop change below, this is the full fix for args: {}.
| tracker.pendingByIndex.set(event.index, { toolCallId, name }); | |
| tracker.pendingByIndex.set(event.index, { toolCallId, name, inputJsonParts: [] }); | |
| } | |
| } | |
| return; | |
| } | |
| if ( | |
| event.type === "content_block_delta" && | |
| typeof event.index === "number" && | |
| isRecord(event.delta) | |
| ) { | |
| if (event.delta.type === "input_json_delta" && typeof event.delta.partial_json === "string") { | |
| const pending = tracker.pendingByIndex.get(event.index); | |
| pending?.inputJsonParts.push(event.delta.partial_json); | |
| } | |
| return; | |
| } |
| const pending = tracker.pendingByIndex.get(event.index); | ||
| tracker.pendingByIndex.delete(event.index); | ||
| if (pending) { | ||
| emitToolStartOnce(tracker, pending.toolCallId, pending.name, {}, params.onToolUseStart); |
There was a problem hiding this comment.
Parse the accumulated parts and emit with real args. The assistant-snapshot path still exists as a backup for the turn-aborted case (when no input_json_delta chunks arrived).
| emitToolStartOnce(tracker, pending.toolCallId, pending.name, {}, params.onToolUseStart); | |
| let args: Record<string, unknown> = {}; | |
| if (pending.inputJsonParts.length > 0) { | |
| try { | |
| const parsed: unknown = JSON.parse(pending.inputJsonParts.join("")); | |
| if (parsed && typeof parsed === "object" && !Array.isArray(parsed)) { | |
| args = parsed as Record<string, unknown>; | |
| } | |
| } catch { | |
| // Malformed/truncated partial JSON — fall through with empty | |
| // args; assistant snapshot branch may still backfill below. | |
| } | |
| } | |
| emitToolStartOnce(tracker, pending.toolCallId, pending.name, args, params.onToolUseStart); |
059c54a to
005dfaf
Compare
Add a CLI-runtime-gated bridge in runAgentTurnWithFallback that subscribes to `stream: "assistant"` agent-events for the current runId and re-emits them as reasoning content through `params.opts.onReasoningStream`. Mirrors the assistant-text bridge from openclaw#76914 and the tool-event bridge from openclaw#80046: same Promise-chain serialization + drain, same silentExpected gate, same unsubscribe pattern at success/catch/finally. The reply lane is untouched -- `onPartialReply` continues to settle the final assistant text via openclaw#76914. The reasoning lane now reflects the model's live text output during streaming, which is the only "what is the model producing right now" signal available for claude-opus-4-7 over claude-cli (Anthropic suppresses readable thinking_delta events on the wire for opus-4-7; only thinking content_block + signature_delta arrive). The bridge is gated on isCliProvider so API/native runtimes that already get reasoning content from real thinking_delta events do NOT double-receive text_delta as reasoning. Tests cover: - Forwards assistant agent-events to onReasoningStream with correct text - Respects silentExpected (heartbeat / NO_REPLY runs don't emit) - Does not fire on the API/native runtime path (gate works)
005dfaf to
104b7cb
Compare
104b7cb to
52d11f5
Compare
52d11f5 to
391fe55
Compare
|
Landed via rebase onto main.
Thanks @adele-with-a-b! |
Add a CLI-runtime-gated bridge in runAgentTurnWithFallback that subscribes to `stream: "assistant"` agent-events for the current runId and re-emits them as reasoning content through `params.opts.onReasoningStream`. Mirrors the assistant-text bridge from openclaw#76914 and the tool-event bridge from openclaw#80046: same Promise-chain serialization + drain, same silentExpected gate, same unsubscribe pattern at success/catch/finally. The reply lane is untouched -- `onPartialReply` continues to settle the final assistant text via openclaw#76914. The reasoning lane now reflects the model's live text output during streaming, which is the only "what is the model producing right now" signal available for claude-opus-4-7 over claude-cli (Anthropic suppresses readable thinking_delta events on the wire for opus-4-7; only thinking content_block + signature_delta arrive). The bridge is gated on isCliProvider so API/native runtimes that already get reasoning content from real thinking_delta events do NOT double-receive text_delta as reasoning. Tests cover: - Forwards assistant agent-events to onReasoningStream with correct text - Respects silentExpected (heartbeat / NO_REPLY runs don't emit) - Does not fire on the API/native runtime path (gate works)
Add a CLI-runtime-gated bridge in runAgentTurnWithFallback that subscribes to `stream: "assistant"` agent-events for the current runId and re-emits them as reasoning content through `params.opts.onReasoningStream`. Mirrors the assistant-text bridge from openclaw#76914 and the tool-event bridge from openclaw#80046: same Promise-chain serialization + drain, same silentExpected gate, same unsubscribe pattern at success/catch/finally. The reply lane is untouched -- `onPartialReply` continues to settle the final assistant text via openclaw#76914. The reasoning lane now reflects the model's live text output during streaming, which is the only "what is the model producing right now" signal available for claude-opus-4-7 over claude-cli (Anthropic suppresses readable thinking_delta events on the wire for opus-4-7; only thinking content_block + signature_delta arrive). The bridge is gated on isCliProvider so API/native runtimes that already get reasoning content from real thinking_delta events do NOT double-receive text_delta as reasoning. Tests cover: - Forwards assistant agent-events to onReasoningStream with correct text - Respects silentExpected (heartbeat / NO_REPLY runs don't emit) - Does not fire on the API/native runtime path (gate works)
Summary
PR #76914 (merged 2026-05-09) bridged CLI-runtime assistant text deltas into channel preview pipelines, closing the silence gap on the API vs CLI runtime split for streaming text. Tool-progress events were left out of that fix — they're still dropped on the CLI runtime path while the embedded native (API) runtime emits them normally. Channel previews that subscribe to
stream: "tool"for live tool-progress decoration (Telegram's "📖 Read:", "🛠️ Bash:", etc.) stay silent during tool-heavy CLI-backed turns, so users see typing indicators and final text but never the work-in-progress trail the API path already produces.This PR closes the rest of the API/CLI parity gap by surfacing tool-use lifecycle events from the CLI stream output and bridging them through the same agent-event bus #76914 used for assistant text.
Behavior change
Two scope tiers:
src/agents/cli-output.ts) — claude-cli-specific. Recognises the claude--output-format stream-jsondialect's tool_use content blocks and tool_result user-messages and surfaces them through new optional callbacks oncreateCliJsonlStreamingParser. Other CLI dialects (codex-cli, kiro-cli) ship their own parsers and continue to work as today; this PR only adds parsing for claude-cli's dialect because that's the one OpenClaw bundles a parser for.src/agents/cli-runner/execute.ts,src/agents/cli-runner/claude-live-session.ts,src/auto-reply/reply/agent-runner-execution.ts) — CLI-generic. TherunCliAgentexecution path is gated byisCliProvider(cliExecutionProvider, runtimeConfig)atagent-runner-execution.ts:1415, so any CLI runtime emittingagent-event { stream: "tool" }for its run id (whether claude-cli today or future CLIs that adopt the same emission shape) gets the events forwarded toparams.opts.onToolStartautomatically. This is the matching layer to PR fix(agents/cli): bridge CLI assistant deltas into channel preview (#76869) #76914's assistant-bridge addition and shares its serialization, drain, and silentExpected guards.Layer-by-layer details:
Parser — new
dispatchClaudeCliStreamingToolEventinsrc/agents/cli-output.tsrecognises three claude stream-json record shapes:stream_event { event: { type: "content_block_start", content_block: { type: "tool_use", id, name } } }→ record id+name in tracker (no emit yet; args haven't streamed).assistant { message: { content: [{ type: "tool_use", id, name, input }] } }→ emitonToolUseStart({ toolCallId, name, args: input })with full args.user { message: { content: [{ type: "tool_result", tool_use_id, is_error, content }] } }→ emitonToolResult({ toolCallId, name, isError, result }).content_block_stopwithout precedingassistantsnapshot → fallback emitstartwith empty args (turn-aborted edge case).The shape
{phase: "start" | "result", name, toolCallId, args | isError | result}matches what the embedded native runtime already emits atsrc/agents/pi-embedded-subscribe.handlers.tools.ts:696-705,998-1009, so existing channel consumers pick them up unchanged. Per-parserToolUseTrackerdedupes start/result events. New optional callbacks; when neither is subscribed, behavior is byte-identical to before.Parser consumer — both
claude-cliexecution paths insrc/agents/cli-runner/execute.ts(live-session and JSONL-streaming) wire the new callbacks toemitAgentEvent({ runId, stream: "tool", data: {...} }). Args and results are passed throughsanitizeToolArgs/sanitizeToolResultfromsrc/agents/pi-embedded-subscribe.tools.tsbefore emission, matching the embedded native runtime's privacy contract atpi-embedded-subscribe.handlers.tools.ts:703,705,1005,1007— redacts Authorization headers, API key fields, and shell commands containing tokens before they reach the shared bus. The parser is constructed fresh per turn inclaude-live-session.ts:createTurn, so theToolUseTrackeris per-turn — no cross-turn state leak.Agent-event bus consumer (CLI-generic) —
src/auto-reply/reply/agent-runner-execution.tsadds a parallelrawUnsubscribeToolBridge = onAgentEvent(...)next to the assistant-text bridge from fix(agents/cli): bridge CLI assistant deltas into channel preview (#76869) #76914 inside theisCliProvider(...)branch. Filters byrunId, respectssilentExpected(heartbeat / NO_REPLY runs do not leak), and forwardsphase === "start"and"update"events toparams.opts.onToolStart. Mirrors fix(agents/cli): bridge CLI assistant deltas into channel preview (#76869) #76914's serialization pattern (toolBridgeDeliveryPromise chain +drainToolBridgeDelivery) so callbacks land in order even whenonToolStartis async, and unsubscribe is mirrored at every place the assistant unsubscribe is called (success, catch, finally) so the listener never leaks.No new public APIs in plugin-sdk; channel-side code is unchanged. Telegram, Discord, Slack, etc. preview pipelines automatically light up because they already subscribe to
stream: "tool"on the agent-event bus throughparams.opts.onToolStart.Scope note: this PR forwards
phase: "start"andphase: "update"events throughonToolStart(the channel-progress activation surface).phase: "result"is emitted on the bus but not forwarded to the channel here — the embedded handler'sresultevents also include ametafield that this CLI emit omits, and the channel-progress pipeline today only consumes activation events. Addingresult-phase forwarding (withmetaparity) is left as a follow-up PR if/when a consumer needs completion events on the CLI path.Verification
pnpm test src/agents/cli-output.test.ts→ 21/21 PASS (5 new tests: full happy-path, fallback oncontent_block_stopwithout assistant snapshot, dedup when bothcontent_block_startandassistantsnapshot announce same tool, error tool_result with name passthrough, parser-vs-consumer privacy-contract regression locking in that args/result surface raw at the parser boundary and the consumer appliessanitizeToolArgs/sanitizeToolResultbefore emission).pnpm test src/auto-reply/reply/agent-runner-execution.test.ts→ 73/73 PASS (2 new bridge tests: forwards tool events toonToolStartwith correct args; respectssilentExpected).pnpm test src/agents/cli-runner→ 50/50 PASS.pnpm check:changed→ all gates green (typecheck core + tests, lint core, runtime sidecar loaders, import cycles, webhook/pairing guards).pnpm exec oxfmt --check --threads=1 src/agents/cli-output.ts src/agents/cli-output.test.ts src/agents/cli-runner/execute.ts src/agents/cli-runner/claude-live-session.ts src/auto-reply/reply/agent-runner-execution.ts src/auto-reply/reply/agent-runner-execution.test.ts→ clean.origin/main@cb86388cec:pi-embedded-runner.run-embedded-pi-agent.auth-profile-rotation.e2e.test.tshas 3 failures + 1 unhandled rejection on clean main (verified by stashing this branch and re-running). Backoff/auth-rotation assertionexpected "vi.fn()" to be called 2 times, but got 3 times. Unrelated.Real behavior proof
Behavior or issue addressed: CLI-runtime turns silently dropping
tool_usecontent blocks andtool_resultuser-messages betweencli-output.tsand the channel preview pipeline, so Telegram's tool-progress UI never lights up for CLI-backed turns even though the underlying CLI is producing the data.Real environment tested: Live OpenClaw
2026.5.7deployment on macOS with theclaude-clibackend driving Telegram. Same setup I used for the proof on #76914.Exact steps or command run after this patch: I hand-applied this PR's three structural changes to the installed
2026.5.7dist files (claude-live-session-*.js,execute.runtime-*.js,agent-runner.runtime-*.js), instrumented each layer with a tracing log, restarted the OpenClaw gateway vialaunchctl kickstart -k "gui/$(id -u)/ai.openclaw.gateway", and sent a real Telegram turn: "Read /etc/hosts, then read /etc/zshrc, then count their combined lines. Use Bash for the count." — three tool calls per turn, driven through the live gateway. I also captured the parser layer in isolation by piping a real claude-cli--output-format stream-jsonJSONL fixture through the patchedcreateCliJsonlStreamingParser:proof.mts:Evidence after fix: Redacted runtime logs from the live Telegram + claude-cli turn, captured directly from
~/.openclaw/logs/gateway.err.logwith the dist hand-patches in place. Each layer of the pipeline writes a[demo-trace]line so the chain is visible end-to-end:Same chain fires for the second
Read, then for theBashcall, then for tool results. The parser-isolation harness captured this terminal output for the same kind of turn driven through the realclaudeCLI:Observed result after fix: With this PR's structural change applied, my Telegram turn against the live gateway showed the bot's preview message updating with
📖 Read: from /etc/hosts, then📖 Read: from /etc/zshrc, then🛠️ Bash: ...lines while the agent worked, before delivering the final answer — the same per-tool decoration users already see on embedded-native-runtime agents. Withstreaming.mode: "progress"the lines accumulate for the full turn; withstreaming.mode: "partial"(the default) they flash before the streaming answer text replaces the preview, matching the existing embedded-runtime behavior. Without the patch (stock2026.5.7), the same prompt produces zero[demo-trace] BUS evt stream=toollog lines and the Telegram preview shows only the assistant text — theonToolStartcallback the channel pipeline already exposes is never invoked because no CLI-side emitter publishes to it.What was not tested: I did not exercise non-claude CLI runtimes (codex-cli, kiro-cli) against the bridge layer — those don't ship a stream-json parser in this repo, so the parser layer change doesn't apply to them. The bridge layer at
agent-runner-execution.tsis gated byisCliProvider(...)and would forward anystream: "tool"events those runtimes might emit in the future, but that future is not exercised here. I also did not test thephase: "result"channel-side rendering — the bridge filtersresultto keep parity with #76914's activation-only forwarding (see Scope note above), so this PR's change is invisible to consumers that only handlestart/updatetoday.