
feat: stream partial responses via Telegram sendMessageDraft API #123

Closed

nimdraugsael wants to merge 2 commits into RichardAtCT:main from nimdraugsael:feat/streaming-drafts

Conversation

@nimdraugsael
Contributor

The Telegram Bot API recently added support for response streaming for bots. Attached is a feature preview from Durov's Telegram channel.

IMG_2258.MP4

DraftStreamer accumulates two sections — a tool header (showing tool calls and reasoning snippets as they arrive) and a response body (streamed token-by-token) — and sends throttled drafts that self-disable on API errors for graceful fallback.

SDK changes:

  • Enable --include-partial-messages for token-level StreamEvent deltas (content_block_delta / text_delta)
  • Handle StreamEvent to emit stream_delta type updates

Orchestrator changes:

  • Feed tool calls and reasoning via append_tool(), response text via append_text() (stream_delta only, to avoid duplication)
  • Skip progress_msg.edit_text when draft streaming is active

New settings: ENABLE_STREAM_DRAFTS, STREAM_DRAFT_INTERVAL

27 tests covering accumulation, throttle, composition, truncation, tool lines, and self-disable behavior.
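The accumulate/throttle/self-disable flow described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the PR's actual code: `send` stands in for the Telegram sendMessageDraft call, and the method names mirror the ones mentioned in the description.

```python
import time


class DraftStreamer:
    """Minimal sketch of the accumulate/throttle/self-disable pattern.

    Illustrative only: `send` is any callable(text) that may raise on
    API errors, at which point the streamer disables itself so the
    caller can fall back to normal message delivery.
    """

    def __init__(self, send, interval=1.0, max_len=4096):
        self._send = send          # stand-in for sendMessageDraft
        self._interval = interval  # min seconds between draft updates
        self._max_len = max_len
        self._tool_lines = []      # tool header section
        self._body = []            # response body, streamed token-by-token
        self._last_sent = 0.0
        self._enabled = True

    def append_tool(self, line):
        if self._enabled:
            self._tool_lines.append(line)
            self._maybe_send()

    def append_text(self, delta):
        if self._enabled:
            self._body.append(delta)
            self._maybe_send()

    def _compose(self):
        text = "\n".join(self._tool_lines + ["".join(self._body)])
        if len(text) > self._max_len:
            # Tail-truncate with an ellipsis prefix, keeping the newest text.
            text = "…" + text[-(self._max_len - 1):]
        return text

    def _maybe_send(self, force=False):
        now = time.monotonic()
        if not force and now - self._last_sent < self._interval:
            return  # throttled
        try:
            self._send(self._compose())
            self._last_sent = now
        except Exception:
            self._enabled = False  # self-disable; graceful fallback

    def flush(self):
        if self._enabled:
            self._maybe_send(force=True)
```

The key property is that a single API failure flips `_enabled` and turns every later `append_*` call into a no-op, so the rest of the request proceeds untouched.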

Stream Claude's tool activity and response text to the user in
real-time using Telegram's sendMessageDraft, giving a smooth
typing-preview experience in private chats.


Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@FridayOpenClawBot

PR Review
Reviewed head: ac9b6fe27d186bc79b1da6ad855f3b71c41b4e36

Summary

  • Adds DraftStreamer to stream partial Claude responses to Telegram via the new sendMessageDraft API, showing tool activity and response text in real-time
  • Integrates stream deltas from the SDK (include_partial_messages, StreamEvent handling) and wires into agentic_text for private chats only (opt-in via ENABLE_STREAM_DRAFTS)
  • 27 unit tests covering accumulation, throttle, truncation, self-disable, and flush

What looks good

  • Clean self-disabling pattern on API error — graceful fallback to normal delivery without crashing the request
  • Tail-truncation with an ellipsis prefix is correct for partial streaming (avoids an abrupt mid-word cutoff being the last thing the user sees)
  • The private-chat restriction is a sensible first-ship constraint; group threads need more testing with draft semantics

Issues / questions

  1. [Important] src/bot/utils/draft_streamer.py — generate_draft_id uses hash(), which is randomized per process in Python 3 (PYTHONHASHSEED). Two rapid calls for the same chat_id are extremely unlikely to collide, but more importantly the & 0x7FFFFFFF | 1 mask could return the same ID on a process restart for a resumed session. Is draft continuity across restarts intentional, or is a new ID on each restart fine? If fine, a simple secrets.randbits(30) | 1 is cleaner and removes the hash dependency.

  2. [Important] src/bot/orchestrator.py (_make_stream_callback) — when draft_streamer is active and verbose_level >= 1, tool lines go to the draft but NOT to tool_log. On error/fallback the progress message won't contain any tool activity. Is that intentional? If the draft streamer disables mid-run (e.g. sendMessageDraft flakes), the user may see nothing until the final response.

  3. [Important] src/bot/orchestrator.py (agentic_text) — draft_streamer.flush() is called in finally but exceptions are silently swallowed. If flush() raises (e.g. the bot object is torn down), the suppression is fine, but a logger.debug inside the except would help diagnose silent draft gaps in production.

  4. [Nit] src/bot/utils/draft_streamer.py — _MAX_TOOL_LINES = 10 uses self._tool_lines[-_MAX_TOOL_LINES:] (shows the most recent 10), but the "... +N more" prefix is prepended before those lines. The net display is "... +N more\nline1\n...\nline10", which reads a bit awkwardly if only 1–2 lines were trimmed. Consider a minimum threshold before showing the overflow notice.

  5. [Nit] src/config/settings.py — the blank line between agentic_mode and reply_quote was removed (looks unintentional, adds noise to the diff).
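The randomness fix suggested in item 1 can be sketched like this (an illustration of the suggestion, assuming only that IDs need to be non-zero and process-independent; the function name mirrors the one mentioned above):

```python
import secrets


def generate_draft_id():
    """Random 30-bit draft ID, per the review suggestion.

    secrets.randbits(30) gives randomness that does not depend on
    PYTHONHASHSEED (unlike hash()), and `| 1` guarantees a non-zero
    ID, matching the intent of the original mask.
    """
    return secrets.randbits(30) | 1
```

Unlike the hash-based version, this makes no promise of draft continuity across restarts: every call yields a fresh ID.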

Suggested tests (if needed)

  • Test that when DraftStreamer self-disables mid-stream, append_text / append_tool calls afterwards are no-ops (this path is implied but worth asserting explicitly)
  • Test orchestrator fallback: when draft streamer disables, tool_log remains empty and the final progress message still shows something useful (or confirm empty is expected)
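The first suggested test could take roughly this shape. `MiniStreamer` below is a stand-in, since the PR's real DraftStreamer code isn't shown here; only the disable/no-op contract under test is modelled.

```python
class MiniStreamer:
    """Stand-in for DraftStreamer: models only self-disable on send failure."""

    def __init__(self, send):
        self._send = send
        self._parts = []
        self._enabled = True

    def append_text(self, delta):
        if not self._enabled:
            return  # no-op once disabled
        self._parts.append(delta)
        try:
            self._send("".join(self._parts))
        except Exception:
            self._enabled = False  # self-disable on API error


def test_appends_are_noops_after_self_disable():
    calls = []

    def flaky_send(text):
        calls.append(text)
        if len(calls) == 1:
            raise RuntimeError("sendMessageDraft failed")

    s = MiniStreamer(flaky_send)
    s.append_text("a")          # send raises -> streamer disables itself
    assert not s._enabled
    s.append_text("b")          # must be a no-op
    assert s._parts == ["a"]
    assert len(calls) == 1      # no further API calls after disable
```

Asserting the no-op explicitly (rather than relying on it implicitly via other tests) pins down the fallback contract.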

Verdict
⚠️ Merge after fixes — the self-disable graceful fallback is solid, but the tool_log gap on streamer failure (#2) and the draft ID semantics (#1) deserve a quick answer before merge.

Friday, AI assistant to @RichardAtCT

Replace hash()-based draft ID with secrets.randbits(30) for cleaner
randomness, always populate tool_log as fallback when streamer disables
mid-run, add debug logging for flush failures, raise overflow notice
threshold to >= 3 lines, and restore missing blank line in settings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@FridayOpenClawBot

PR Review (incremental — new commit since last review)
Reviewed head: 26d6e0a8a1c8eb62f2e3632365c5e57cc52f6f89

Summary

  • Fix commit addressing prior review feedback: randomness, tool_log fallback, overflow UX, flush error logging

What looks good

  • secrets.randbits(30) | 1 is the right call — drops the hash(f"...") fragility and the chat_id coupling that wasn't needed
  • Always populating tool_log regardless of streamer state (orchestrator.py:710,723) correctly solves the fallback gap
  • Debug-logging the flush failure instead of silently swallowing it is good hygiene

Issues / questions

  1. [Nit] draft_streamer.py:_compose_draft — the overflow >= 3 check suppresses the "...+N more" notice for 1–2 overflow lines, but those lines are still dropped from the visible list (you still slice to [-_MAX_TOOL_LINES:]). For overflow=1 or 2, the user sees the last 10 but silently loses earlier lines with no indicator at all. Might be fine UX-wise, but worth a comment explaining the intent.
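The interplay being discussed can be sketched like this (names and threshold taken from the review text; the slicing/notice logic is an assumption about the shape of the real code):

```python
_MAX_TOOL_LINES = 10
_OVERFLOW_NOTICE_MIN = 3  # suppress "+N more" for tiny overflows


def compose_tool_header(tool_lines):
    """Show the most recent _MAX_TOOL_LINES lines, prepending an overflow
    notice only when 3+ lines were trimmed.

    Note the reviewer's point: for an overflow of 1-2, earlier lines are
    still dropped, just silently, with no indicator at all.
    """
    shown = tool_lines[-_MAX_TOOL_LINES:]
    overflow = len(tool_lines) - len(shown)
    if overflow >= _OVERFLOW_NOTICE_MIN:
        return "\n".join([f"… +{overflow} more"] + shown)
    return "\n".join(shown)
```

A comment at that threshold (or dropping the suppression entirely) would make the small-overflow behaviour an explicit decision rather than an accident of the check.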

Verdict
✅ Ready to merge — the fixes are clean and the new tests (small overflow + mid-stream disable) cover the changed behaviour well.

Friday, AI assistant to @RichardAtCT

Owner

@RichardAtCT left a comment

LGTM — streaming draft responses via sendMessageDraft is a nice UX improvement. Clean implementation.

RichardAtCT added a commit that referenced this pull request Mar 4, 2026
@RichardAtCT
Owner

Merged manually after rebasing to resolve conflicts with voice transcription PR. Included in main.
