feat(telegram): interleave CLI tool-progress + reasoning + rolling timer in one Telegram message #82285

Closed
anagnorisis2peripeteia wants to merge 9 commits into openclaw:main from anagnorisis2peripeteia:feat/cli-tool-use-injection

Conversation

@anagnorisis2peripeteia anagnorisis2peripeteia commented May 15, 2026

Contributor

Summary

  • Problem. When a tool runs mid-turn, the Telegram reasoning ("Thinking") message goes silent — no signal which tool is running or that the turn is still alive. On slow tools (Read/Bash/Grep/WebFetch) the user sees a frozen indicator and abandons a turn that's still in flight.
  • What changed. Two parts that close the loop for CLI backends:
    1. Detect — the claude stream-json JSONL parser now spots tool_use/server_tool_use/mcp_tool_use blocks and emits a stream:"tool" agent-event ({phase, name, args, toolCallId}), which followup-runner already turns into onToolStart.
    2. Interleave — the Telegram dispatcher injects that tool-progress into the reasoning-lane rolling message ([HH:MM:SS] 🛠️ ToolName: detail + a rolling Ns — HH:MM timer) instead of a separate channel-progress message, falling back to the legacy per-tool message when the reasoning lane isn't active.
  • Scope boundary. Gated by the claude stream-json dialect; non-Claude CLI backends are untouched. Reuses upstream's followup-runner (stream:"tool" → onToolStart) rather than adding a new consumer. No change to tool execution; the embedded path is unaffected (it already emits tool events upstream).
  • AI-assisted: yes — drafted with Claude.
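
The detect step can be sketched as follows. This is a minimal illustration, assuming each stream-json line is one JSON record whose assistant message carries Anthropic-style content blocks. The function name `detectToolEvents`, the record layout, and the `"start"` phase value are assumptions made for the example, not the PR's actual parser.

```typescript
// Hypothetical sketch: names and record layout are assumptions, not the PR's code.
type ToolEvent = {
  stream: "tool";
  phase: "start";
  name: string;
  args: unknown;
  toolCallId: string;
};

// Block types the summary says the parser should recognize.
const TOOL_BLOCK_TYPES = new Set(["tool_use", "server_tool_use", "mcp_tool_use"]);

function detectToolEvents(jsonlLine: string): ToolEvent[] {
  let record: unknown;
  try {
    record = JSON.parse(jsonlLine);
  } catch {
    return []; // not a JSON record: nothing to detect
  }
  const content = (record as { message?: { content?: unknown } } | null)?.message?.content;
  if (!Array.isArray(content)) return [];

  const events: ToolEvent[] = [];
  for (const block of content) {
    if (typeof block !== "object" || block === null) continue;
    const b = block as { type?: unknown; id?: unknown; name?: unknown; input?: unknown };
    if (typeof b.type !== "string" || !TOOL_BLOCK_TYPES.has(b.type)) continue;
    if (typeof b.name !== "string" || typeof b.id !== "string") continue;
    events.push({
      stream: "tool",
      phase: "start",
      name: b.name,
      args: b.input ?? {},
      toolCallId: b.id,
    });
  }
  return events;
}
```

Text blocks and malformed lines yield an empty array, so a caller can forward whatever comes back without extra checks.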

Real behavior proof

  • Behavior addressed: during a turn that invokes tools, the reasoning message shows interleaved [HH:MM:SS] 🛠️ ToolName lines + a rolling elapsed timer in one updating message, instead of a silent gap or a separate progress message.
  • Real environment tested: Windows 11 Pro 26200, Node v24.12.0, live OpenClaw gateway delivering over Telegram, Opus.
  • Exact steps or command run after this patch: unset NODE_OPTIONS && pnpm exec vitest run extensions/telegram/src/bot-message-dispatch.test.ts; then a live Telegram turn that invokes one or more tools; observe the interleaved tool lines + the rolling timer in the single reasoning message, and the timer stripping when assistant text resumes.
  • Evidence after fix: live Telegram capture of the interleaved rolling-timer message (screenshot: "telegram rolling-timer"). Each [HH:MM:SS] 🛠️ … line is one onToolStart injected into the reasoning draft.
  • Observed result after fix: tool calls render as [HH:MM:SS] 🛠️ ToolName: detail lines inside the reasoning message with the timer ticking; when assistant text resumes the timer line is stripped and text continues from a clean boundary.
  • What was not tested: non-Telegram channels (Telegram-only renderer); the focused bot-message-dispatch interleave unit test is the planned guardrail to land with this change.
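
The line shapes described above can be sketched like this. The helper names (`formatClock`, `formatToolLine`, `formatTimerSuffix`, `stripTimerSuffix`) are hypothetical, and the exact timer layout is an assumption taken from the `_Ns — HH:MM_` marker quoted in this description.

```typescript
// Hypothetical sketch: helper names and exact layout are assumptions.
const TOOL_EMOJI = "🛠️";

function pad2(n: number): string {
  return (n < 10 ? "0" : "") + String(n);
}

// Local wall-clock time as HH:MM:SS.
function formatClock(at: Date): string {
  return `${pad2(at.getHours())}:${pad2(at.getMinutes())}:${pad2(at.getSeconds())}`;
}

// One interleaved tool line: "[HH:MM:SS] 🛠️ ToolName: detail".
function formatToolLine(at: Date, name: string, detail?: string): string {
  const suffix = detail ? `: ${detail}` : "";
  return `[${formatClock(at)}] ${TOOL_EMOJI} ${name}${suffix}`;
}

// Rolling timer line appended after the body: "\n_Ns — HH:MM_".
function formatTimerSuffix(elapsedMs: number, now: Date): string {
  const seconds = Math.floor(elapsedMs / 1000);
  return `\n_${seconds}s — ${pad2(now.getHours())}:${pad2(now.getMinutes())}_`;
}

// Removes a trailing timer line so assistant text resumes from a clean boundary.
function stripTimerSuffix(text: string): string {
  return text.replace(/\n_\d+s — \d{2}:\d{2}_$/, "");
}
```

Keeping the timer as a suffix that one regular expression can remove is what lets the message body stay append-only while the timer is repainted.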

Change Type

  • Feature

Scope

  • Integrations

Linked

  • Consumes the onToolStart surface that followup-runner drives from the stream:"tool" agent-event bus — no new event surface introduced here.

Risks and Mitigations

  • Risk: the rolling-timer setInterval leaks if a turn ends without a following assistant delta (e.g. abort mid-tool).
    • Mitigation: the interval is cleared on the reasoning-lane cleanup path, on the next assistant delta, and on turn finalize; the dispatcher owns the handle and clears it on every terminal path.
  • Risk: the injected _Ns — HH:MM_ line could confuse a strict-Markdown consumer.
    • Mitigation: Telegram-only surface, rendered verbatim; the marker uses a conservative Markdown-safe shape.
  • Risk: interleaving could duplicate reasoning text into both the answer and reasoning lanes.
    • Mitigation: finalize forces the reasoning lane closed when interleaved output is set, so the rolling message resolves cleanly instead of double-delivering.
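
The interval-leak mitigation can be illustrated with a small owner object whose `clear()` is idempotent and is called from a `finally` block. `RollingTimer` and `runTurn` are hypothetical names for this sketch; the PR's dispatcher may structure this differently.

```typescript
// Hypothetical sketch: class and function names are assumptions.
class RollingTimer {
  private handle: ReturnType<typeof setInterval> | undefined;

  start(onTick: () => void, intervalMs: number): void {
    this.clear(); // restarting never stacks two intervals
    this.handle = setInterval(onTick, intervalMs);
  }

  // Safe to call on every terminal path, any number of times.
  clear(): void {
    if (this.handle !== undefined) {
      clearInterval(this.handle);
      this.handle = undefined;
    }
  }

  get active(): boolean {
    return this.handle !== undefined;
  }
}

// The finally block runs on normal completion, thrown errors, and aborts alike.
async function runTurn(timer: RollingTimer, body: () => Promise<void>): Promise<void> {
  try {
    await body();
  } finally {
    timer.clear();
  }
}
```

Because the owner holds the only reference to the handle, a turn that ends mid-tool cannot leave a repeating callback behind.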

Security Impact

  • New permissions/capabilities: No.
  • Secrets/tokens handling changed: No.
  • New/changed network calls: No.
  • Command/tool execution surface changed: No — renders that tools are in flight; executes nothing.
  • Data access scope changed: No.

Compatibility / Migration

  • Backward compatible: falls back to the existing per-tool channel-progress message when the reasoning lane isn't active. No new config or env vars.

openclaw-barnacle Bot added the labels commands (Command implementations), agents (Agent runtime and tooling), extensions: anthropic, size: XL, and triage: needs-real-behavior-proof (Candidate: external PR needs after-fix proof from a real setup) on May 15, 2026
@clawsweeper

clawsweeper Bot commented May 15, 2026

Contributor

Codex review: needs real behavior proof before merge. Reviewed May 26, 2026, 4:48 PM ET / 20:48 UTC.

Summary
The PR adds Claude CLI stream-json tool-use detection plus Telegram rendering that interleaves tool-progress lines, reasoning text, and a rolling timer into one updating reasoning message.

PR surface: Source +646, Tests +20. Total +666 across 7 files.

Reproducibility: yes, by source inspection, for the blocking delivery issues: PR head emits tool progress into assistantText and also emits a structured stream:"tool" event consumed by Telegram onToolStart. I did not run a live Telegram reproduction in this read-only review.

Review metrics: 1 noteworthy metric.

  • Streaming Producer Paths: 1 new stream:"tool" producer; 1 new inline assistant marker path. The PR changes both the structured progress event stream and the assistant text stream, so reviewers need to verify they do not become competing user-visible sources.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🦪 silver shellfish
Patch quality: 🧂 unranked krab
Result: blocked until stronger real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Remove or gate the inline assistant tool marker so Telegram does not double-render tool progress.
  • Either parse native Claude thinking_delta records in the CLI parser or remove the unused thinkingDelta routing from this PR.
  • Attach current-head Telegram proof showing timer ticks, cleanup after assistant text resumes, and no duplicate tool line in the answer.

Proof guidance:
Needs stronger real behavior proof before merge: The PR supplies a static Telegram screenshot that shows tool lines, but it does not prove current-head timer ticking, timer cleanup, or absence of duplicate assistant-stream tool text. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Mantis proof suggestion
A real Telegram Desktop recording would materially verify the visible interleave, timer tick, cleanup, and duplicate-message risk. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram desktop proof: verify current head interleaves CLI tool progress into one Telegram reasoning message, timer ticks, cleanup removes the timer, and the final answer has no duplicate tool line.

Risk before merge

  • Verbose Claude CLI runs can show the same tool progress through the assistant stream and the Telegram reasoning interleave, polluting the final answer and duplicating visible progress.
  • The PR wires thinkingDelta handling in execute.ts, but the CLI parser still ignores native thinking_delta records, so the advertised reasoning path is incomplete unless a separate stacked change supplies different input.
  • The proof only shows a static Telegram message, so timer ticks, timer cleanup, and absence of duplicate answer-lane tool text remain unproven on current head.

Maintainer options:

  1. Use Structured Tool Events As The Source Of Truth (recommended)
    Keep Claude CLI assistant text clean when a structured tool-event consumer is active, and let Telegram render tool progress only from onToolStart.
  2. Make Inline Markers An Explicit Fallback
    If maintainers want inline markers for consumers without progress callbacks, gate them behind a fallback path with tests proving Telegram and live chat do not double-render.
  3. Pause For The Claude CLI Stack
    Maintainers can pause this until the related Claude CLI reasoning/backend work settles if they want one reviewed stream-json contract.
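
Options 1 and 2 share one idea: a single delivery source per consumer. A minimal sketch, assuming a hypothetical `planToolProgressDelivery` helper and a flag that says whether a structured tool-event consumer is attached (neither is confirmed by the PR):

```typescript
// Hypothetical sketch: names and flags are assumptions, not the PR's code.
type DeliveryPlan = {
  emitStructuredEvent: boolean;
  appendInlineMarker: boolean;
};

function planToolProgressDelivery(opts: {
  hasToolEventConsumer: boolean;
  verbose: boolean;
}): DeliveryPlan {
  return {
    // Structured events are always produced; consumers opt in to them.
    emitStructuredEvent: true,
    // Inline markers are only a fallback for consumers without progress callbacks.
    appendInlineMarker: opts.verbose && !opts.hasToolEventConsumer,
  };
}

// Applies the plan to the assistant text stream.
function applyInlineMarker(assistantText: string, marker: string, plan: DeliveryPlan): string {
  return plan.appendInlineMarker ? `${assistantText}\n${marker}` : assistantText;
}
```

With a consumer such as Telegram's onToolStart attached, the assistant text stays clean and the tool line is rendered once.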

Next step before merge
Contributor or maintainer follow-up is needed because the remaining blockers are delivery-contract changes plus real Telegram proof that automation cannot provide for the contributor's live setup.

Security
Cleared: No concrete security or supply-chain issue was found in the current head diff; no dependency, workflow, credential, or new network surface is added, and tool args are passed through the existing sanitizer.

Review findings

  • [P1] Keep tool progress out of assistant text — src/agents/cli-output.ts:700-707
  • [P2] Wire real thinking deltas before forwarding them — src/agents/cli-runner/execute.ts:502-508
Review details

Best possible solution:

Land after structured tool events are the single source for Telegram tool-progress delivery, CLI thinking-delta scope is either implemented or removed, and current-head Telegram proof shows interleave, timer ticking, cleanup, and no duplicate answer text.

Do we have a high-confidence way to reproduce the issue?

Yes for the blocking delivery issues by source inspection: PR head emits tool progress into assistantText and also emits a structured stream:"tool" event consumed by Telegram onToolStart. I did not run a live Telegram reproduction in this read-only review.

Is this the best way to solve the issue?

No for the current patch. The UX direction is useful, but the maintainable fix is to keep one tool-progress delivery source per consumer and either support native CLI thinking_delta records or remove the dead reasoning routing from this PR.

Full review comments:

  • [P1] Keep tool progress out of assistant text — src/agents/cli-output.ts:700-707
    This appends the tool marker into assistantText, but the same tool is also emitted as stream:"tool" and Telegram renders that event through onToolStart. In verbose Telegram runs, users can see the progress line both inside the reasoning message and in the assistant/final answer stream; keep assistant text clean when structured progress events are available or make inline markers a fallback for consumers without progress callbacks.
    Confidence: 0.86
  • [P2] Wire real thinking deltas before forwarding them — src/agents/cli-runner/execute.ts:502-508
    The new execute.ts branch forwards thinkingDelta as stream:"thinking", but the CLI parser feeding it still returns null for native thinking_delta records and only accepts text_delta. That leaves the advertised Claude CLI reasoning path silent unless another stacked change changes the parser; parse thinking_delta here with coverage or remove this dead routing from the PR scope.
    Confidence: 0.78
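
The P2 finding can be illustrated with a delta parser that accepts both delta kinds. This sketch assumes the Anthropic streaming shape in which a `content_block_delta` event carries `delta.type` of `text_delta` (with `text`) or `thinking_delta` (with `thinking`); the function name and the returned `stream` labels are hypothetical.

```typescript
// Hypothetical sketch: function name and return shape are assumptions.
type ParsedDelta = { stream: "assistant" | "thinking"; text: string };

function parseContentBlockDelta(event: unknown): ParsedDelta | null {
  if (typeof event !== "object" || event === null) return null;
  const e = event as { type?: unknown; delta?: unknown };
  if (e.type !== "content_block_delta") return null;
  if (typeof e.delta !== "object" || e.delta === null) return null;

  const d = e.delta as { type?: unknown; text?: unknown; thinking?: unknown };
  if (d.type === "text_delta" && typeof d.text === "string") {
    return { stream: "assistant", text: d.text };
  }
  // Without this branch the reasoning path stays silent, as the finding describes.
  if (d.type === "thinking_delta" && typeof d.thinking === "string") {
    return { stream: "thinking", text: d.thinking };
  }
  return null;
}
```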

Overall correctness: patch is incorrect
Overall confidence: 0.85

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 80655fe9552f.

Label changes

Label justifications:

  • P2: This is a normal-priority Telegram/agent streaming feature with limited blast radius but concrete delivery blockers.
  • merge-risk: 🚨 compatibility: The PR changes Claude CLI assistant stream contents for verbose users, which can affect existing channel/live-chat consumers during upgrade.
  • merge-risk: 🚨 message-delivery: The PR can duplicate tool progress across the reasoning lane and assistant answer stream, producing confusing visible Telegram output.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🦪 silver shellfish and patch quality is 🧂 unranked krab.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs stronger real behavior proof before merge: The PR supplies a static Telegram screenshot that shows tool lines, but it does not prove current-head timer ticking, timer cleanup, or absence of duplicate assistant-stream tool text. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
  • proof: 📸 screenshot: Contributor real behavior proof includes screenshot evidence. The PR supplies a static Telegram screenshot that shows tool lines, but it does not prove current-head timer ticking, timer cleanup, or absence of duplicate assistant-stream tool text.
  • mantis: telegram-visible-proof: Mantis should capture Telegram visible proof. This changes visible Telegram chat streaming, so a short Telegram Desktop proof should show the interleaved tool lines, rolling timer, cleanup, and final answer boundary.
Evidence reviewed

PR surface:

Source +646, Tests +20. Total +666 across 7 files.

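| Area      | Files | Added | Removed | Net  |
|-----------|-------|-------|---------|------|
| Source    | 6     | 666   | 20      | +646 |
| Tests     | 1     | 24    | 4       | +20  |
| Docs      | 0     | 0     | 0       | 0    |
| Config    | 0     | 0     | 0       | 0    |
| Generated | 0     | 0     | 0       | 0    |
| Other     | 0     | 0     | 0       | 0    |
| Total     | 7     | 690   | 24      | +666 |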

What I checked:

  • Repository policy applied: Root AGENTS.md and the Telegram maintainer note were read; Telegram streaming changes need real Telegram proof and channel/message delivery is compatibility-sensitive. (AGENTS.md:1, 80655fe9552f)
  • Scoped Telegram review note applied: The Telegram maintainer note requires real Telegram proof for streaming, transport, and visible reply behavior changes. (.agents/maintainer-notes/telegram.md:1, 80655fe9552f)
  • Inline assistant marker path: PR head appends the formatted tool line directly to assistantText and emits it through onAssistantDelta when verbose is enabled. (src/agents/cli-output.ts:700, 979043380706)
  • Structured tool-event path: The same parser also emits stream:"tool" events, and execute.ts forwards them to the agent event bus with sanitized args and toolCallId. (src/agents/cli-runner/execute.ts:624, 979043380706)
  • Telegram consumes tool events: PR head consumes onToolStart and injects the formatted tool progress into the reasoning-lane interleaved message, so the inline assistant marker becomes a second visible delivery path. (extensions/telegram/src/bot-message-dispatch.ts:2083, 979043380706)
  • Thinking delta contract: Current Anthropic transport handling recognizes content_block_delta records with delta.type === "thinking_delta", while the PR's CLI parser still returns null unless delta.type is "text_delta". (src/agents/anthropic-transport-stream.ts:1201, 80655fe9552f)

Likely related people:

  • steipete: Recent path history shows work on Telegram progress drafts, Claude CLI parser plumbing, and gateway tool/live-chat fanout surfaces. (role: recent area contributor; confidence: medium; commits: 0afccc62ab72, 77d9ac30bb8d, 459e89ada899; files: extensions/telegram/src/bot-message-dispatch.ts, src/agents/cli-output.ts, src/gateway/server-chat.ts)
  • Patrick-Erichsen: GitHub path history shows this author added Telegram progress preview flows that this PR changes. (role: feature-history contributor; confidence: medium; commits: d60ab485114a; files: extensions/telegram/src/bot-message-dispatch.ts)
  • jalehman: Recent Telegram dispatcher and reply serialization work touches the same channel delivery area. (role: recent adjacent contributor; confidence: medium; commits: 62b51a6295ee; files: extensions/telegram/src/bot-message-dispatch.ts)
  • zhouhe-xydt: Recent Claude stream-json parser work changed usage handling in the same cli-output parser surface. (role: recent parser contributor; confidence: medium; commits: 84229d995a34; files: src/agents/cli-output.ts)
  • vincentkoc: Path history shows prior live-chat merge behavior work in the gateway projector affected by this PR's replace semantics. (role: feature-history contributor; confidence: medium; commits: 013939cfc7c2; files: src/gateway/live-chat-projector.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

anagnorisis2peripeteia force-pushed the feat/cli-tool-use-injection branch from b7a08fa to afc942f on May 15, 2026 19:16
clawsweeper Bot added the label mantis: telegram-visible-proof (Mantis should capture Telegram visible proof) on May 15, 2026
@anagnorisis2peripeteia

Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 15, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

openclaw-barnacle Bot added the label gateway (Gateway runtime) on May 16, 2026
clawsweeper Bot added the label P2 (Normal backlog priority with limited blast radius) on May 16, 2026
openclaw-barnacle Bot added the label proof: supplied (External PR includes structured after-fix real behavior proof) and removed the label triage: needs-real-behavior-proof (Candidate: external PR needs after-fix proof from a real setup) on May 17, 2026
anagnorisis2peripeteia added a commit to anagnorisis2peripeteia/openclaw that referenced this pull request May 26, 2026
… subscription path

New `claude-cli-interactive` backend that runs `claude` without `-p`
under a per-turn `bun` wrapper. Bypasses the headless mode that
Anthropic's June 15, 2026 split moves onto the separate programmatic
credit pool — every event flows through a loopback HTTPS MITM proxy
that taps the API SSE stream and re-emits it as
`claude -p --output-format stream-json` JSONL records on the
wrapper's own stdout. Interactive subscription stays in play.

Backend wiring
- `extensions/anthropic/cli-backend-interactive.ts`: registers the
  new backend (`claude-cli-interactive`); inherits user-configured
  `claude-cli` overrides via `inheritUserConfigFrom` with `-p`-mode
  flags stripped; normalizeConfig classifies inherited-vs-direct-
  override using the user-configured claude-cli.command in
  OpenClawConfig (no more basename heuristic). serialize:true to
  queue same-session resumes. maxPromptArgChars: 30k (Windows
  CreateProcess hard limit) / 200k (Unix ARG_MAX).
- `extensions/anthropic/interactive-proxy/wrapper.ts`: per-turn
  process owning the proxy, cert lifecycle, claude spawn,
  request-source classification routing, and the `result` record
  on `end_turn`. Heartbeat ping every 30s of stdout silence so the
  gateway keeps the Telegram typing indicator alive through long
  tool-execution gaps. Spilled-prompt recovery: read overflow off
  wrapper's stdin, write to a per-run subdir at
  `${tmpdir()}/openclaw-interactive-proxy/run-<32-hex>/` (0700) +
  file 0600, inject `--add-dir <perRunDir>` so claude's Read tool
  resolves the path; per-run dir isolates concurrent runs.
- `extensions/anthropic/interactive-proxy/mitm-server.ts`: two-stage
  proxy (CONNECT → loopback TLS terminator). Classifies each
  outbound `/v1/messages` request body into `auxiliary` (no tools)
  / `tool_followup` (last msg has tool_result) / `compaction`
  (compact.ts prompt markers) / `normal`. Tags every emitted SSE
  event with `_reqId` + `_requestType`. Synthetic `api_error` event
  on non-SSE 4xx/5xx so the wrapper exits nonzero and the CLI runner
  triggers failover. Both stages bind port 0.
- `extensions/anthropic/interactive-proxy/cert-manager.ts`: lazy
  CA + leaf cert generation under `~/.openclaw/proxy-certs/`. CERT
  dir created at 0700 BEFORE openssl writes keys so there's no
  world-readable window. Key files hardened to 0600 with fail-closed
  regeneration if hardening fails.
- `extensions/anthropic/interactive-proxy/tty-spoof.cjs`: NODE_OPTIONS
  preload — minimal TTY-property + tty.isatty patches that keep
  claude in interactive mode under piped stdio.

Routing (wrapper)
- `auxiliary` → drop entire stream (haiku title-gen / classifier /
  skill-search side-calls; OpenClaw injects MCP tools so the real
  user turn always has tools).
- `compaction` → rewrite text_delta as thinking_delta so the
  summary surfaces as model reasoning, not as the user's reply.
- `tool_followup` + `normal` → emit events through; emit `result`
  record on `end_turn`. Belt-and-braces response-content fingerprint
  (`<analysis>` + Primary-Request-and-Intent + Pending-Tasks)
  catches anything that slipped past request-level classification.
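
The classification and routing rules above can be sketched as two small functions. The type shapes, the check order, and the `looksLikeCompaction` callback (standing in for the compact.ts prompt-marker check, whose markers are not listed here) are assumptions for illustration, not the proxy's actual code.

```typescript
// Hypothetical sketch: shapes and check precedence are assumptions.
type RequestType = "auxiliary" | "tool_followup" | "compaction" | "normal";
type ContentBlock = { type: string };
type Message = { role: string; content: string | ContentBlock[] };
type RequestBody = { tools?: unknown[]; messages: Message[] };

function classifyRequest(
  body: RequestBody,
  looksLikeCompaction: (body: RequestBody) => boolean,
): RequestType {
  // No tools: a side-call such as title generation or a classifier.
  if (!body.tools || body.tools.length === 0) return "auxiliary";
  const last = body.messages[body.messages.length - 1];
  if (last && Array.isArray(last.content) && last.content.some((b) => b.type === "tool_result")) {
    return "tool_followup";
  }
  if (looksLikeCompaction(body)) return "compaction";
  return "normal";
}

type DeltaEvent = {
  type: "content_block_delta";
  delta: { type: "text_delta"; text: string } | { type: "thinking_delta"; thinking: string };
};

// Returns the event to re-emit, or null when the stream should be dropped.
function routeEvent(requestType: RequestType, event: DeltaEvent): DeltaEvent | null {
  if (requestType === "auxiliary") return null;
  if (requestType === "compaction" && event.delta.type === "text_delta") {
    // Surface the summary as model reasoning rather than as the user's reply.
    return {
      type: "content_block_delta",
      delta: { type: "thinking_delta", thinking: event.delta.text },
    };
  }
  return event;
}
```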

Framework
- `src/agents/cli-backends.ts`: propagate `inheritUserConfigFrom`
  through `resolveSetupCliBackendPolicy` so cold-setup / live-test /
  fallback paths consult the same inheritance metadata as the
  registered-backend path. Lookup now uses
  `registered?.inheritUserConfigFrom ?? fallbackPolicy?.
  inheritUserConfigFrom`.
- `src/plugins/cli-backend.types.ts`: declare
  `inheritUserConfigFrom?` on `CliBackendPlugin`.
- `src/agents/command/attempt-execution.ts`: fallback session
  binding uses `params.originalProvider` rather than the hardcoded
  "claude-cli" id.

Scope boundary
- Existing `claude-cli` backend is untouched (its predicate
  `liveSession === "claude-stdio"` continues to match it alone).
- Default backend selection is unchanged; new backend is opt-in
  via the `claude-cli-interactive` id.
- Tool-use rendering / waiting-on-tool injection lands in openclaw#82285,
  not here; this PR establishes the `stream_event` surface that
  openclaw#82285 builds on.

(cherry picked from commit f4b9491)
(cherry picked from commit 724e260194516a0798620a9a876b91d120fa331d)
(cherry picked from commit 0e2a13e3743934efe4757393fb7dee785b9ed838)
@anagnorisis2peripeteia

anagnorisis2peripeteia commented May 26, 2026

Contributor Author

Rebased onto current upstream/main. Builds on #81851 — its commits appear in this diff because cross-repo PRs target main directly; the net-new changes here are the Telegram interleave renderer + claude stream-json tool-event surface described above. Worth reviewing #81851 first.

@socket-security

socket-security Bot commented May 26, 2026


No dependency changes detected. Learn more about Socket for GitHub.

👍 No dependency changes detected in pull request

Comment thread extensions/anthropic/interactive-proxy/mitm-server.ts Fixed
…lling-timer message

Completes the rendering side of openclaw#82285's "tool-use injection + rolling-timer
indicator". cli-output now emits structured stream:"tool" agent events
(previous commit refactored away the in-text injection); this commit ports
the live dist patch that consumes those events and renders them into one
continuously-updating Telegram message in the reasoning lane.

New state, scoped alongside splitReasoningOnNextStream / draftLaneEventQueue:
- interleavedOutput — cumulative body (italicized reasoning chunks +
  timestamped tool/item/plan/approval/command-output/patch lines)
- rawReasoningCheckpoint — index in the upstream reasoning text we've
  already mirrored, so re-deliveries don't duplicate
- activeTimerSuffix / activeTimerInterval / activeToolStartTime — rolling
  `\n_Ns — HH:MM:SS_` timer state painted between tool starts

New helpers:
- clearActiveTimer — clears the interval and resets the suffix
- formatTimerClock — local HH:MM:SS
- updateInterleavedDisplay — writes "Thinking\n\n" + body + timer suffix
  into the reasoning lane stream (marks hasStreamedMessage so the lane
  isn't garbage-collected as empty)
- startToolTimer — setInterval(3000) that repaints + flushes the lane so
  the timer actually rolls past the typing-indicator throttle. Catches
  flush errors so a torn-down lane doesn't crash the interval.
- injectToolLineIntoInterleave(line, { startTimer? }) — enqueues onto the
  existing draftLaneEventQueue, appends `\n[HH:MM:SS] {line}\n`, optionally
  restarts the timer, updates the display, and flushes. Returns false when
  the reasoning lane isn't active so callers can fall back to the legacy
  per-tool channel-progress message.

Callback integration:
- onReasoningStream — replaced the ingestDraftLaneSegments-based path with
  a checkpoint-driven append into interleavedOutput. Each new portion of
  the upstream reasoning text gets italicized line-by-line and appended,
  the active timer is cleared, the lane is marked finalized AFTER the
  append (so post-run cleanup keeps the message instead of clearing it),
  and reasoningStepState bookkeeping fires.
- onReasoningEnd — sets reasoningLane.finalized = true when
  hasStreamedMessage is set, plugging the same cleanup-deletes-message
  hole on the no-final-reasoning-burst path.
- onToolStart — tries injectToolLineIntoInterleave first with
  startTimer:true; falls back to pushStreamToolProgress when no reasoning
  lane is active (room events, suppressed reasoning level, no streaming).
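
The checkpoint-driven append in onReasoningStream can be sketched as below. It assumes the upstream reasoning text arrives cumulatively (each delivery contains everything so far), which is how the `rawReasoningCheckpoint` description reads; the function names are hypothetical and the sketch ignores chunks that split in the middle of a line.

```typescript
// Hypothetical sketch: function names are assumptions; state fields follow the commit text.
type InterleaveState = {
  interleavedOutput: string;
  rawReasoningCheckpoint: number;
};

// Wraps each non-empty line in underscores so it renders italic.
function italicizeLines(chunk: string): string {
  return chunk
    .split("\n")
    .map((line) => (line.trim() === "" ? line : `_${line}_`))
    .join("\n");
}

// Appends only the portion of the reasoning text not mirrored yet.
// Returns false for a re-delivery that adds nothing new.
function appendReasoning(state: InterleaveState, fullReasoningText: string): boolean {
  if (fullReasoningText.length <= state.rawReasoningCheckpoint) return false;
  const fresh = fullReasoningText.slice(state.rawReasoningCheckpoint);
  state.interleavedOutput += italicizeLines(fresh);
  state.rawReasoningCheckpoint = fullReasoningText.length;
  return true;
}
```

Tracking an index rather than diffing strings keeps each callback cheap and makes repeated deliveries harmless.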

Doesn't yet wire injectToolLineIntoInterleave into onItemEvent /
onPlanUpdate / onApprovalEvent / onCommandOutput / onPatchSummary — those
already render via pushStreamToolProgress today and the interleave-first
pattern there is a follow-up; tool-start is the high-frequency signal that
drives the rolling-timer UX.

(cherry picked from commit 1346e0b630db9c96dd4f3700e7f7a498671558d3)
(cherry picked from commit 60e90e7067628d37505801b0aedd220968da0079)
…vedOutput is set

Mirrors a local hot-patch the runtime added today: when a turn ends in a
path that doesn't fire onReasoningEnd (turn aborted, error mid-reasoning,
supersession), the lane.finalized check in the dispatch finally block falls
into the stream.clear() branch and the interleaved message is deleted from
Telegram. Guard the lane-iteration loop with a forced finalized=true when
interleavedOutput has content AND hasStreamedMessage is set — guarantees
the visible message survives any non-onReasoningEnd exit path.
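
A minimal sketch of the guard, assuming a simplified lane shape in which a `cleared` flag stands in for the real `stream.clear()` call; `cleanupLane` is a hypothetical name.

```typescript
// Hypothetical sketch: `cleared` stands in for stream.clear(); names are assumptions.
type Lane = {
  finalized: boolean;
  hasStreamedMessage: boolean;
  cleared: boolean;
};

function cleanupLane(lane: Lane, interleavedOutput: string): void {
  // Force-finalize so a visible interleaved message survives any exit path.
  if (interleavedOutput.length > 0 && lane.hasStreamedMessage) {
    lane.finalized = true;
  }
  // Only lanes that never finalized are cleared (deleted from the chat).
  if (!lane.finalized) {
    lane.cleared = true;
  }
}
```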

(cherry picked from commit c958dc8b25acec95acef59794117199c872b03a4)
(cherry picked from commit 45725d2a18ae0a95bfd955046013376e5d7db842)
The previous commit ported the dist hot-patch verbatim, including its
explicit `await reasoningLane.stream.flush()` after each 3s timer tick and
each tool-line injection. That flush was belt-and-braces — the
draft-stream loop (createDraftStreamLoop) already self-pumps on update():
when `now - lastSentAt >= throttleMs` the next update fires
startBackgroundFlush() immediately. The tool-timer ticks every 3000ms and
DEFAULT_THROTTLE_MS is 1000ms, so every tick is already past the throttle
window and delivers on its own.

The explicit flush() actively hurt us during Telegram backpressure: a
hard send-now bypasses the natural retry-after gating in the send path,
making rolling timers contribute to 429 storms instead of riding the
existing rate-limit handling. Dropping it lets Telegram's per-stream
throttle govern delivery cadence and the existing 429 retry-after machinery
in network-errors.ts gate us automatically.

User-visible effect: timer continues to roll every ~1-3s (gated by the
existing throttleMs), but during 429 backoff both the tick and the
underlying send pause together — which is the correct behaviour.

(cherry picked from commit 772444a608288e6cf71e67f9dfe1bc329ff9cbf5)
(cherry picked from commit 967fe511e17144d5d9de9a88902ae6da62787621)
…eper P1s

Two findings from the most recent ClawSweeper pass on openclaw#82285:

1. P1 — src/agents/cli-runner/execute.ts: `normalizeVerboseLevel`,
   `getAgentRunContext`, and `loadSessionEntryByKey` were left as imports
   when the previous commit (7b1e440) stripped the
   `shouldInjectToolInlineMarkers` resolver they fed. `tsconfig.core.json`
   has `noUnusedLocals: true`, so this fails typecheck before any runtime
   path can be exercised. Strip the three imports.

2. P1 — extensions/telegram/src/bot-message-dispatch.ts: `startToolTimer()`
   creates a `setInterval(3000ms)` repeating handle. The terminal cleanup
   path stops or clears the lane draft streams but never calls
   `clearActiveTimer()`, so a turn that ends mid-tool (or is superseded
   while a tool is running) leaks the interval for the rest of the
   process — every 3s it wakes to repaint a torn-down lane. Add
   `clearActiveTimer()` at the top of the dispatch `finally` block so it
   fires regardless of which exit path the turn takes.
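
A sketch of the second fix, assuming a small wrapper class around the interval handle. `ToolTimer` and `dispatchTurn` are hypothetical names (the real code uses `startToolTimer()` / `clearActiveTimer()` inside the dispatcher), and the turn is modelled as synchronous to keep the example short.

```typescript
type TimerHandle = ReturnType<typeof setInterval>;

// Owns at most one repeating handle at a time.
class ToolTimer {
  private handle: TimerHandle | undefined;

  start(onTick: () => void, intervalMs: number = 3000): void {
    this.clear(); // never stack two intervals
    this.handle = setInterval(onTick, intervalMs);
  }

  clear(): void {
    if (this.handle !== undefined) {
      clearInterval(this.handle);
      this.handle = undefined;
    }
  }

  isActive(): boolean {
    return this.handle !== undefined;
  }
}

// Clearing in `finally` covers normal completion, errors, and supersession.
function dispatchTurn(timer: ToolTimer, runTurn: () => void): void {
  try {
    runTurn();
  } finally {
    timer.clear();
  }
}
```

Because the clear sits in `finally`, a turn that throws mid-tool still releases the interval before the error propagates.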

(cherry picked from commit 751db41387008467e71e3a636c74687f980e459e)
(cherry picked from commit 83de5a7ff1fd14bafcedd290853f16b4106843b5)
(cherry picked from commit c292a6e1e39710dbe5971ceabdf6f3199f399ea8)
@anagnorisis2peripeteia

Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 26, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

…aude-stream-json backends

Tool calls during a Claude turn now render as an in-stream textual
marker (with a rolling 8s timer indicator) AND a structured
`stream: "tool"` agent event. Without this, the assistant text
stream just pauses for however long the tool runs — often tens of
seconds on `Read` / `Bash` / `Grep` / `WebFetch` — and the
end-user UI freezes with no signal about which tool is running.

Gated on `usesClaudeStreamJsonDialect({backend, providerId})` —
matches `backend.jsonlDialect === "claude-stream-json"` or
`providerId === "claude-cli"`. Non-Claude backends are unaffected.
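
Read literally, the gate is a two-way disjunction. The sketch below reconstructs it from the description only; the `CliBackend` type is an assumption and the real signature in the repository may differ.

```typescript
// Assumed minimal backend shape; only the field named above is modelled.
interface CliBackend {
  jsonlDialect?: string;
}

// Reconstruction of the gate as described, not the actual source.
function usesClaudeStreamJsonDialect(params: {
  backend: CliBackend;
  providerId: string;
}): boolean {
  return (
    params.backend.jsonlDialect === "claude-stream-json" ||
    params.providerId === "claude-cli"
  );
}
```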

Wiring
- `src/agents/cli-output.ts`: `createClaudeToolUseTracker` closure
  watches `content_block_start` / `content_block_delta`
  (`input_json_delta` partial-JSON accumulation) /
  `content_block_stop` for `tool_use` / `server_tool_use` /
  `mcp_tool_use` blocks. On block-stop, sanitizes args via
  `sanitizeToolArgs`, paints an inline marker through the existing
  `onAssistantDelta` path, and arms a `setInterval` keepalive that
  refreshes the rolling timer line every 8s.
- `src/agents/cli-output.ts`: new optional `onToolEvent` callback
  on `createCliJsonlStreamingParser` — additive, existing callers
  unaffected. Emits `{phase: "start", name, args, itemId}`.
- `src/agents/cli-runner/execute.ts` and
  `src/agents/cli-runner/claude-live-session.ts`: wire the parser
  callback at both Claude code paths (headless JSONL + live
  session) and forward to the agent event bus as
  `emitAgentEvent({stream: "tool", data: {...}})`.
- `src/gateway/live-chat-projector.ts` +
  `src/gateway/server-chat.ts`: thread the `replacement` flag so
  the terminal timer-strip actually replaces the painted text
  rather than appending a duplicate; resolve tool-verbose at
  emission time (per-session, not once at parser construction).
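
The block-tracking part of the wiring can be sketched as a closure keyed by block index. This is an illustration, not the real `createClaudeToolUseTracker`: the name `createToolUseTracker` and the `StreamEvent` type are assumptions, the events are taken as already unwrapped from their JSONL envelope, and sanitization, marker painting, and the keepalive interval are all omitted.

```typescript
type StreamEvent =
  | { type: "content_block_start"; index: number; content_block: { type: string; id?: string; name?: string } }
  | { type: "content_block_delta"; index: number; delta: { type: string; partial_json?: string } }
  | { type: "content_block_stop"; index: number };

interface ToolStartEvent {
  phase: "start";
  name: string;
  args: unknown;
  itemId: string;
}

const TOOL_BLOCK_TYPES = new Set(["tool_use", "server_tool_use", "mcp_tool_use"]);

function createToolUseTracker(onToolEvent: (event: ToolStartEvent) => void) {
  // Open tool blocks, keyed by content-block index.
  const open = new Map<number, { id: string; name: string; json: string }>();
  return (event: StreamEvent): void => {
    switch (event.type) {
      case "content_block_start": {
        const block = event.content_block;
        if (TOOL_BLOCK_TYPES.has(block.type)) {
          open.set(event.index, { id: block.id ?? "", name: block.name ?? "unknown", json: "" });
        }
        break;
      }
      case "content_block_delta": {
        const pending = open.get(event.index);
        // Arguments arrive as partial JSON fragments; accumulate until stop.
        if (pending && event.delta.type === "input_json_delta") {
          pending.json += event.delta.partial_json ?? "";
        }
        break;
      }
      case "content_block_stop": {
        const pending = open.get(event.index);
        if (!pending) break;
        open.delete(event.index);
        let args: unknown = {};
        try {
          args = pending.json ? JSON.parse(pending.json) : {};
        } catch {
          args = { raw: pending.json }; // keep unparseable input visible
        }
        onToolEvent({ phase: "start", name: pending.name, args, itemId: pending.id });
        break;
      }
    }
  };
}
```

Emitting on block-stop rather than block-start means the event carries the complete argument object instead of a partial fragment.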

Cleanup contract
- `clearToolKeepalive()` runs on `result`, on `finish()`, and on
  the next `text_delta` — the parser closure owns the interval
  handle and clears it on every terminal path.
- `sanitizeToolArgs` truncates and redacts known-sensitive keys
  before either emission path so file paths / WebFetch URLs /
  Bash commands don't leak to Telegram or the agent event bus.
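
One way the truncate-and-redact step could look. Everything here is an assumption for illustration: the key pattern, the 80-character cap, the `[redacted]` placeholder, and the name `sanitizeToolArgsSketch`; the real `sanitizeToolArgs` rules are not shown in this PR text.

```typescript
// Assumed list of sensitive key fragments; the real list is not shown.
const SENSITIVE_KEY = /token|secret|password|api[_-]?key|authorization/i;
// Assumed per-value cap.
const MAX_VALUE_CHARS = 80;

function sanitizeToolArgsSketch(args: Record<string, unknown>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(args)) {
    if (SENSITIVE_KEY.test(key)) {
      out[key] = "[redacted]";
      continue;
    }
    // Non-string values are serialized so they can be length-capped too.
    const text = typeof value === "string" ? value : JSON.stringify(value) ?? String(value);
    out[key] = text.length > MAX_VALUE_CHARS ? `${text.slice(0, MAX_VALUE_CHARS)}...` : text;
  }
  return out;
}
```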

Scope boundary
- Gated by dialect — no impact on OpenAI / xAI / other CLI
  backends.
- `onToolEvent` is opt-in: existing parser consumers that only
  subscribe to `onAssistantDelta` still work.
- No tool-output / tool-result event (only tool-start); the result
  is reflected in claude's next `text_delta` and the timer-strip
  on resumed text handles the UX.

(cherry picked from commit 834599963a2c0bcb5c1ed400d82ffa48f17a86e5)
(cherry picked from commit adbf73208065d6bc36085c62ed0e716427217b7a)
@anagnorisis2peripeteia

Contributor Author

@clawsweeper re-review

@anagnorisis2peripeteia

Contributor Author

@clawsweeper re-review

…ext replace contract

ClawSweeper flagged that the assistant-stream events carried a bespoke
`replacement: true` field while the live-chat merger only honours `replace`.
The field was never consumed: the CLI assistant bridge forwards full text
(it does not append deltas), so `resolveDraftPartialText` already replaces by
`text` and the rolling-timer terminal cleanup wins on the existing contract.

Remove the parallel field end-to-end (CliStreamingDelta type, the two
execute.ts emit sites, and the tick-cleanup emitter) instead of wiring a
second replacement vocabulary. Cleanup behaviour is unchanged.
@anagnorisis2peripeteia

Contributor Author

@clawsweeper re-review

…+ dedupe reasoning header

- Restore the rolling-timer terminal-cleanup signal on the existing
  live-chat `replace` contract (cli-output -> execute agent event) and
  forward it through server-chat's emitChatDelta into
  resolveMergedAssistantText, so a shorter timer-stripped prefix replaces
  the buffer instead of being kept as a stale rollback in live chat /
  Control UI. Adds merge coverage for both the kept-prefix and
  replaced-prefix cases.
- Interleaved reasoning display: store only the formatted body, not the
  full formatReasoningMessage output, so updateInterleavedDisplay's
  "Thinking" header is no longer duplicated.
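
The kept-prefix versus replaced-prefix distinction from the first bullet can be sketched as below. `mergeAssistantText` is a hypothetical stand-in; the real `resolveMergedAssistantText` is not shown here and may handle more cases.

```typescript
// Hypothetical merge rule illustrating the `replace` contract.
function mergeAssistantText(current: string, incoming: string, replace: boolean = false): string {
  // An explicit replace always wins, even when the incoming text is shorter.
  if (replace) return incoming;
  // Without replace, a shorter prefix is treated as a stale rollback
  // and the longer buffer is kept.
  if (current.startsWith(incoming)) return current;
  return incoming;
}
```

For example, once the timer line has been painted, a timer-stripped prefix only shrinks the buffer when the event carries `replace`; without it the painted timer text stays.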
@anagnorisis2peripeteia

Contributor Author

@clawsweeper re-review

@anagnorisis2peripeteia

Contributor Author

Superseded by #87072. That PR reimagines this work as an opt-in, projection-only Telegram interleaved progress lane built on the existing structured-event contract (no contract change), default-off, with unit tests — replacing this branch's bundled approach. Closing in favour of the cleaner split.


Labels

- agents Agent runtime and tooling
- channel: telegram Channel integration: telegram
- gateway Gateway runtime
- mantis: telegram-visible-proof Mantis should capture Telegram visible proof.
- merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths.
- merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.
- P2 Normal backlog priority with limited blast radius.
- proof: 📸 screenshot Contributor real behavior proof includes screenshot evidence.
- proof: supplied External PR includes structured after-fix real behavior proof.
- rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns.
- size: L
- status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

2 participants