feat: unified streaming infrastructure (draft + edit fallback)#922
Closed
teknium1 wants to merge 1 commit into
Closed
feat: unified streaming infrastructure (draft + edit fallback)#922teknium1 wants to merge 1 commit into
teknium1 wants to merge 1 commit into
Conversation
Unified streaming architecture combining the best of PRs #774 and #798, with improvements. This is a draft — awaiting proper streaming token implementation and testing before merge. Layer 1 — Core streaming (run_agent.py): - stream_delta_callback on AIAgent.__init__ (per-instance) - _interruptible_streaming_api_call() for chat completions with SimpleNamespace response reconstruction - Tool-call suppression (callback only fires for text-only responses) - on_first_delta callback (stops thinking spinner on first token) - Provider fallback when streaming unsupported - reasoning_content accumulation - Interrupt support (client.close() + rebuild) Layer 2 — Display (cli.py, gateway/): - CLI: line-buffered _stream_delta/_flush_stream via _cprint - Gateway: async stream consumer with dual transport: * Draft (Bot API 9.3+ sendMessageDraft) as primary * Progressive editMessageText as fallback * Auto mode tries draft, falls back seamlessly - Config-driven: streaming.enabled, edit_interval, buffer_threshold, cursor, transport (auto/draft/edit) - Uses self.config (no duplicate yaml reads) - already_sent flag prevents duplicate sends in base.py Telegram-specific (gateway/platforms/telegram.py): - send_raw / edit_message_raw (plain text, no MarkdownV2) - send_draft / finalize_draft (Bot API 9.3+) - delete_message - All methods pass message_thread_id for forum topic support (fix for #774's missing thread_id bug) Tests: 10 new tests covering accumulator shape, callback order, tool-call suppression, provider fallback, already_sent contract. Config example: streaming: enabled: true edit_interval: 1.0 buffer_threshold: 100 cursor: ' ▉' transport: auto # auto, draft, or edit Supersedes: #774 (jobless0x), #798 (OutThisLife), #697 (clicksingh)
This was referenced Mar 11, 2026
jobless0x
added a commit
to jobless0x/hermes-agent
that referenced
this pull request
Mar 14, 2026
…draft+edit), already_sent Implements Layer 2 of the unified streaming architecture described in NousResearch#922, which explicitly lists the async gateway stream consumer as awaiting implementation. GatewayStreamConsumer bridges the sync stream_delta_callback from the agent thread to the async Telegram transport methods using an asyncio.Queue. Key design decisions: - Thread bridge via queue.Queue + finish() sentinel — matches the existing tool_progress_callback pattern in gateway/run.py, no new threading primitives - Draft transport (sendMessageDraft, Bot API 9.5+) for private DMs only; groups/channels return Textdraft_peer_invalid by design — short-circuited at the consumer level before any API call - Edit transport: send_raw() for first message, edit_message_raw() on interval - Auto mode: draft-first with seamless mid-stream fallback to edit - already_sent flag: set on first confirmed delivery; callers skip send() - Short/instant responses (< buffer_threshold): still delivered via the same transport negotiation path, never silently dropped - Finalization: sendMessageDraft previews are ephemeral — finalize_draft() commits a permanent sendMessage() so the response is never lost - run_with_timeout(): 300 s safety timeout wraps run()
jobless0x
added a commit
to jobless0x/hermes-agent
that referenced
this pull request
Mar 14, 2026
…nt wiring StreamingConfig (gateway/config.py): - Typed dataclass with live-benchmarked defaults (edit_interval=0.15s, buffer_threshold=20 chars) — 42% smoother than placeholder values in NousResearch#922 - Loaded once at startup from config.yaml streaming: block via from_dict() - _hermes_home module-level variable makes config paths monkeypatchable in tests - to_dict() / from_dict() for clean serialization roundtrip _run_agent wiring (gateway/run.py): - Reads self.config.streaming — no duplicate config.yaml file reads per message - Checks adapter.supports_streaming before creating consumer — other platforms fall through unchanged - Consumer task started alongside progress_task; awaited before reading already_sent - already_sent propagated into handler_result dict so base adapter skips re-send - Quick-commands lookup fixed: self.config.get() -> getattr() (config is dataclass) - Stream consumer cleaned up in finally block alongside existing task cleanup
5 tasks
|
Heads up, opened a completion PR (#1312) that builds directly on this one. Delivers the dual-transport Key additions: All in-scope "What is still needed" items are checked off. 137 tests, 0 failures. |
teknium1
added a commit
that referenced
this pull request
Mar 16, 2026
… providers Stage 1 of streaming support. Adds: - stream_delta_callback parameter on AIAgent.__init__ for real-time token delivery - _interruptible_streaming_api_call() handling chat_completions + anthropic_messages - Enhanced _run_codex_stream() to fire delta callbacks during Codex streaming - _fire_stream_delta() fires both display and TTS callbacks - _fire_reasoning_delta() for reasoning content streaming - Tool-call suppression: callbacks only fire on text-only responses - on_first_delta callback for spinner control on first token - Provider fallback: graceful degradation to non-streaming - _has_stream_consumers() unifies stream_delta_callback and _stream_callback checks - Anthropic streaming returns native Message for downstream compatibility Drawing from PRs #922 (unified streaming), #1312 (gateway consumer), #774 (Telegram streaming), #798 (CLI streaming), #1214 (reasoning modes). Credit: jobless0x, OutThisLife, clicksingh, raulvidis.
4 tasks
Contributor
Author
|
Merged via PR #1538 — unified streaming infrastructure. Your streaming concepts and code patterns were incorporated with authorship credit in the commit messages. Thank you for your contribution! |
angelburgosrosado
pushed a commit
to angelburgosrosado/hermes-agent
that referenced
this pull request
Apr 27, 2026
… providers Stage 1 of streaming support. Adds: - stream_delta_callback parameter on AIAgent.__init__ for real-time token delivery - _interruptible_streaming_api_call() handling chat_completions + anthropic_messages - Enhanced _run_codex_stream() to fire delta callbacks during Codex streaming - _fire_stream_delta() fires both display and TTS callbacks - _fire_reasoning_delta() for reasoning content streaming - Tool-call suppression: callbacks only fire on text-only responses - on_first_delta callback for spinner control on first token - Provider fallback: graceful degradation to non-streaming - _has_stream_consumers() unifies stream_delta_callback and _stream_callback checks - Anthropic streaming returns native Message for downstream compatibility Drawing from PRs NousResearch#922 (unified streaming), NousResearch#1312 (gateway consumer), NousResearch#774 (Telegram streaming), NousResearch#798 (CLI streaming), NousResearch#1214 (reasoning modes). Credit: jobless0x, OutThisLife, clicksingh, raulvidis.
This was referenced Apr 28, 2026
02356abc
pushed a commit
to 02356abc/hermes-agent
that referenced
this pull request
May 14, 2026
… providers Stage 1 of streaming support. Adds: - stream_delta_callback parameter on AIAgent.__init__ for real-time token delivery - _interruptible_streaming_api_call() handling chat_completions + anthropic_messages - Enhanced _run_codex_stream() to fire delta callbacks during Codex streaming - _fire_stream_delta() fires both display and TTS callbacks - _fire_reasoning_delta() for reasoning content streaming - Tool-call suppression: callbacks only fire on text-only responses - on_first_delta callback for spinner control on first token - Provider fallback: graceful degradation to non-streaming - _has_stream_consumers() unifies stream_delta_callback and _stream_callback checks - Anthropic streaming returns native Message for downstream compatibility Drawing from PRs NousResearch#922 (unified streaming), NousResearch#1312 (gateway consumer), NousResearch#774 (Telegram streaming), NousResearch#798 (CLI streaming), NousResearch#1214 (reasoning modes). Credit: jobless0x, OutThisLife, clicksingh, raulvidis.
olympus-terminal
pushed a commit
to olympus-terminal/hermes-agent
that referenced
this pull request
May 16, 2026
… providers Stage 1 of streaming support. Adds: - stream_delta_callback parameter on AIAgent.__init__ for real-time token delivery - _interruptible_streaming_api_call() handling chat_completions + anthropic_messages - Enhanced _run_codex_stream() to fire delta callbacks during Codex streaming - _fire_stream_delta() fires both display and TTS callbacks - _fire_reasoning_delta() for reasoning content streaming - Tool-call suppression: callbacks only fire on text-only responses - on_first_delta callback for spinner control on first token - Provider fallback: graceful degradation to non-streaming - _has_stream_consumers() unifies stream_delta_callback and _stream_callback checks - Anthropic streaming returns native Message for downstream compatibility Drawing from PRs NousResearch#922 (unified streaming), NousResearch#1312 (gateway consumer), NousResearch#774 (Telegram streaming), NousResearch#798 (CLI streaming), NousResearch#1214 (reasoning modes). Credit: jobless0x, OutThisLife, clicksingh, raulvidis.
Egavasyug
pushed a commit
to Egavasyug/hermes-agent
that referenced
this pull request
Jun 10, 2026
… providers Stage 1 of streaming support. Adds: - stream_delta_callback parameter on AIAgent.__init__ for real-time token delivery - _interruptible_streaming_api_call() handling chat_completions + anthropic_messages - Enhanced _run_codex_stream() to fire delta callbacks during Codex streaming - _fire_stream_delta() fires both display and TTS callbacks - _fire_reasoning_delta() for reasoning content streaming - Tool-call suppression: callbacks only fire on text-only responses - on_first_delta callback for spinner control on first token - Provider fallback: graceful degradation to non-streaming - _has_stream_consumers() unifies stream_delta_callback and _stream_callback checks - Anthropic streaming returns native Message for downstream compatibility Drawing from PRs NousResearch#922 (unified streaming), NousResearch#1312 (gateway consumer), NousResearch#774 (Telegram streaming), NousResearch#798 (CLI streaming), NousResearch#1214 (reasoning modes). Credit: jobless0x, OutThisLife, clicksingh, raulvidis.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Unified streaming architecture combining the best aspects of PRs #774 (jobless0x) and #798 (OutThisLife), with bug fixes and improvements.
Background
Three PRs attempted streaming independently (#697, #774, #798), all modifying the same core files with incompatible implementations. This branch unifies them into a single coherent architecture.
What was taken from each:
stream_delta_callbackon__init___interruptible_streaming_api_call()_cprintintegrationalready_sentanti-duplicationon_first_deltaspinner stopsendMessageDraft(Bot API 9.3+)send_raw/edit_message_rawstreaming:config sectionWhat was fixed/improved:
message_thread_idthreaded through all streaming methods — feat: telegram streaming with dual-transport (draft + edit fallback) #774 was missing this, breaking forum topic chatsself.config— feat: telegram streaming with dual-transport (draft + edit fallback) #774 re-read config.yaml; now uses the already-loaded dictedit_interval: 1.0,buffer_threshold: 100reasoning_contentaccumulation — from Streaming TUI, streaming CLI output with line-buffered rendering #798, missing in feat: telegram streaming with dual-transport (draft + edit fallback) #774Architecture
Config
What is still needed
sendMessageDraftbehavior with PTB 22.6 in productionFiles changed
run_agent.py— streaming callback + API method (+159)cli.py— line-buffered CLI rendering (+29)gateway/run.py— stream consumer + config + wiring (+174)gateway/platforms/base.py— streaming interfaces + already_sent (+45)gateway/platforms/telegram.py— Telegram streaming methods (+93)tests/test_streaming.py— 10 tests (+257)Tests
10 new tests: accumulator shape (3), callback order (3), provider fallback (2), already_sent contract (2). All passing.
Supersedes #774, #798, #697.