feat: gateway reasoning visibility modes by raulvidis · Pull Request #1214 · NousResearch/hermes-agent

raulvidis · 2026-03-13T23:36:22Z

Summary

Per-chat reasoning visibility via /reasoning [off|on|stream]: off disables reasoning at the provider level, on enables reasoning silently (for future use), stream shows live italicized reasoning previews that auto-delete when the final response arrives.
Inline keyboard picker on Telegram — /reasoning with no argument shows an Off / On / Stream button row; callback queries route through the gateway command handler.
Reasoning streaming infrastructure — reasoning_callback on AIAgent surfaces reasoning deltas from both chat completions and Codex Responses API streams. Gateway consumes these via an async queue + background task that edits a single preview message.
Edit failure recovery — when Telegram rejects an edit (e.g. "message is not modified"), the preview is deleted and a fresh message is sent instead of silently dropping updates.
Typing metadata — send_typing now accepts metadata so typing indicators go to the correct forum thread.
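
The queue-plus-background-task flow and the edit failure recovery described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `FakeAdapter`, its `send`/`edit`/`delete` methods, the `None` sentinel, and the underscore-based italics are all assumptions made for the example; only the name `send_reasoning_updates` comes from the PR description.

```python
import asyncio


class FakeAdapter:
    """Hypothetical stand-in for a platform adapter; not the real gateway API."""

    def __init__(self, fail_edits=0):
        self.fail_edits = fail_edits  # how many upcoming edits to reject
        self.messages = {}            # message_id -> current text
        self._next_id = 1

    async def send(self, text):
        message_id = self._next_id
        self._next_id += 1
        self.messages[message_id] = text
        return message_id

    async def edit(self, message_id, text):
        if self.fail_edits > 0:
            self.fail_edits -= 1
            return False  # simulate the platform rejecting the edit
        self.messages[message_id] = text
        return True

    async def delete(self, message_id):
        self.messages.pop(message_id, None)


async def send_reasoning_updates(adapter, queue):
    """Consume reasoning deltas and keep exactly one preview message updated."""
    preview_id = None
    buffer = ""
    while True:
        delta = await queue.get()
        if delta is None:  # assumed sentinel: the final response has arrived
            break
        buffer += delta
        text = f"_{buffer}_"  # italic preview, assuming Markdown-style rendering
        if preview_id is None:
            preview_id = await adapter.send(text)
        elif not await adapter.edit(preview_id, text):
            # Edit failed: replace the preview instead of dropping the update.
            await adapter.delete(preview_id)
            preview_id = await adapter.send(text)
    if preview_id is not None:
        await adapter.delete(preview_id)  # the preview is transient


async def demo(fail_edits=0):
    adapter = FakeAdapter(fail_edits=fail_edits)
    queue = asyncio.Queue()
    task = asyncio.create_task(send_reasoning_updates(adapter, queue))
    for delta in ["Thinking", " about", " it"]:
        await queue.put(delta)
    await queue.put(None)
    await task
    return adapter
```

Running `asyncio.run(demo(fail_edits=1))` exercises the recovery path: the rejected edit causes one delete and one resend, and no preview remains once the sentinel is consumed.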

Changed files

| File | Key changes |
| --- | --- |
| `run_agent.py` | `reasoning_callback` param, `_merge_stream_reasoning()`, `_run_streaming_chat_completion()` for reasoning deltas, `last_reasoning` in result dict |
| `gateway/run.py` | `/reasoning` command handler, `_resolve_session_reasoning_config()`, `send_reasoning_updates()` background task, `_format_reasoning_preview()`, `_delete_preview_message()` |
| `gateway/platforms/telegram.py` | `delete_message()`, inline keyboard for `/reasoning`, `CallbackQueryHandler`, "not modified" treated as success in `edit_message` |
| `gateway/platforms/base.py` | `delete_message()` base method, `send_typing(metadata=)` signature, `_keep_typing` metadata forwarding |
| `gateway/session.py` | `reasoning_mode` field on `SessionEntry`, `set_reasoning_mode()`, preserved across `/reset` |
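
To illustrate the `gateway/session.py` row, here is a small sketch of a per-chat `reasoning_mode` that survives a reset. It assumes a simplified `SessionEntry` and a hypothetical in-memory `SessionStore`; the real classes carry more fields, and the `"off"` default, the `session_id` field, and the `reset()` method name are assumptions for the example.

```python
import uuid
from dataclasses import dataclass, field

VALID_MODES = ("off", "on", "stream")


@dataclass
class SessionEntry:
    """Simplified, hypothetical session record."""
    session_key: str
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    reasoning_mode: str = "off"  # assumed default


class SessionStore:
    """Hypothetical in-memory store keyed by chat/session key."""

    def __init__(self):
        self._entries = {}

    def get_or_create(self, session_key):
        if session_key not in self._entries:
            self._entries[session_key] = SessionEntry(session_key)
        return self._entries[session_key]

    def set_reasoning_mode(self, session_key, mode):
        if mode not in VALID_MODES:
            raise ValueError(f"unknown reasoning mode: {mode!r}")
        self.get_or_create(session_key).reasoning_mode = mode

    def reset(self, session_key):
        """Start a fresh session but carry the user's reasoning preference over."""
        old = self.get_or_create(session_key)
        fresh = SessionEntry(session_key, reasoning_mode=old.reasoning_mode)
        self._entries[session_key] = fresh
        return fresh
```

The design point is that the mode is a user preference rather than conversation state, so a reset replaces the session identity while copying the preference forward.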

New test files

| File | Coverage |
| --- | --- |
| `tests/gateway/test_reasoning_command.py` | 8 tests: command output, mode persistence, config resolution, preview formatting, preview deletion, edit-failure recovery |
| `tests/gateway/test_session.py` | 3 new tests: reasoning mode roundtrip, reset preservation, group session keys |
| `tests/gateway/test_telegram_format.py` | 1 new test: "not modified" edit treated as success |
| `tests/test_run_agent_reasoning_stream.py` | 1 test: streaming chat completion emits reasoning callback |
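
As a rough picture of what the "not modified" test covers, the sketch below treats an unchanged-content rejection as a successful edit. `EditRejected`, `edit_succeeded`, and `rejecting` are hypothetical names invented for this example; the real adapter wraps the Telegram client's own exception type, and the error text shown is the wording the Bot API typically returns for an unchanged edit.

```python
class EditRejected(Exception):
    """Hypothetical stand-in for the error raised when an edit is refused."""


def edit_succeeded(do_edit):
    """Run an edit callable; report an unchanged-message rejection as success."""
    try:
        do_edit()
    except EditRejected as exc:
        # The message already shows the desired text, so nothing was lost.
        return "message is not modified" in str(exc).lower()
    return True


def rejecting(reason):
    """Build an edit callable that always fails with the given reason."""
    def _edit():
        raise EditRejected(reason)
    return _edit


def test_not_modified_is_success():
    assert edit_succeeded(rejecting("Bad Request: message is not modified"))


def test_other_errors_still_fail():
    assert not edit_succeeded(rejecting("Bad Request: message to edit not found"))
```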

Test plan

pytest tests/gateway/test_reasoning_command.py — 8/8 pass
pytest tests/gateway/test_session.py — 21/21 pass (including 3 new)
pytest tests/gateway/test_telegram_format.py — 54/54 pass (including 1 new)
pytest tests/test_run_agent_reasoning_stream.py — requires full dep install (firecrawl)
Manual: /reasoning stream on Telegram shows italic preview that deletes on completion
Manual: /reasoning off disables provider-level reasoning
Manual: inline keyboard picker works in DM and group chats

🤖 Generated with Claude Code

- /reasoning now supports display modes (show/hide/stream) and effort levels (none/minimal/low/medium/high/xhigh)
- Telegram inline keyboard shows all options in 3 rows with current status displayed above
- "show" mode sends reasoning as a separate message after the response
- "stream" mode streams reasoning live and auto-deletes (Telegram only)
- "hide" mode enables reasoning at API level but doesn't display it
- Effort levels set the reasoning budget without changing display mode
- CLI reasoning callback is now a no-op (uses TUI spinner instead)
- Fix: /reasoning effort changes no longer overwritten by config reload

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
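
The split between display modes and effort levels in this commit message could be modeled along these lines. This is a sketch under assumptions: `ReasoningSettings`, `apply_reasoning_arg`, and the default values are hypothetical and are not taken from the PR; only the option names come from the commit message.

```python
from dataclasses import dataclass, replace

DISPLAY_MODES = {"show", "hide", "stream"}
EFFORT_LEVELS = {"none", "minimal", "low", "medium", "high", "xhigh"}


@dataclass(frozen=True)
class ReasoningSettings:
    """Hypothetical per-chat settings; both defaults are assumptions."""
    display: str = "hide"
    effort: str = "medium"


def apply_reasoning_arg(settings, arg):
    """Apply one /reasoning argument, changing only the axis it belongs to."""
    value = arg.strip().lower()
    if value in DISPLAY_MODES:
        return replace(settings, display=value)
    if value in EFFORT_LEVELS:
        # Effort sets the reasoning budget without touching the display mode.
        return replace(settings, effort=value)
    raise ValueError(f"unknown /reasoning option: {arg!r}")
```

Because the two option sets do not overlap, a single argument is enough to tell which axis the user means, so `/reasoning high` after `/reasoning stream` leaves streaming enabled.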

… providers

Stage 1 of streaming support. Adds:

- stream_delta_callback parameter on AIAgent.__init__ for real-time token delivery
- _interruptible_streaming_api_call() handling chat_completions + anthropic_messages
- Enhanced _run_codex_stream() to fire delta callbacks during Codex streaming
- _fire_stream_delta() fires both display and TTS callbacks
- _fire_reasoning_delta() for reasoning content streaming
- Tool-call suppression: callbacks only fire on text-only responses
- on_first_delta callback for spinner control on first token
- Provider fallback: graceful degradation to non-streaming
- _has_stream_consumers() unifies stream_delta_callback and _stream_callback checks
- Anthropic streaming returns native Message for downstream compatibility

Drawing from PRs #922 (unified streaming), #1312 (gateway consumer), #774 (Telegram streaming), #798 (CLI streaming), #1214 (reasoning modes).

Credit: jobless0x, OutThisLife, clicksingh, raulvidis.
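
A compact way to picture the delta fan-out, the first-delta hook, and tool-call suppression from this commit message is sketched below. `StreamFanout` and its method names are hypothetical; in the referenced work this behavior is described as living in `AIAgent` methods such as `_fire_stream_delta()` and `_has_stream_consumers()`. The suppression rule used here (stop emitting once a tool call is seen) is an assumption about how "text-only" is enforced.

```python
class StreamFanout:
    """Hypothetical helper that forwards text deltas to display and TTS consumers."""

    def __init__(self, display_cb=None, tts_cb=None, on_first_delta=None):
        self.display_cb = display_cb
        self.tts_cb = tts_cb
        self.on_first_delta = on_first_delta
        self._first_seen = False
        self._suppressed = False

    def has_consumers(self):
        """True when at least one callback wants streamed text."""
        return self.display_cb is not None or self.tts_cb is not None

    def note_tool_call(self):
        """Assumed rule: once a tool call appears, stop emitting text deltas."""
        self._suppressed = True

    def fire(self, text):
        if self._suppressed or not text or not self.has_consumers():
            return
        if not self._first_seen:
            self._first_seen = True
            if self.on_first_delta is not None:
                self.on_first_delta()  # e.g. stop a spinner on the first token
        for callback in (self.display_cb, self.tts_cb):
            if callback is not None:
                callback(text)
```

A caller without any consumers can check `has_consumers()` first and take the non-streaming path, which is one plausible reading of the "graceful degradation" item.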

…ment

The streaming infrastructure already fires reasoning deltas via _fire_reasoning_delta() during streaming. The remaining work is the CLI display layer: a dim reasoning box that opens on first reasoning token, streams live, then transitions to the response box.

Reference: PR #1214 (raulvidis) for gateway reasoning visibility.
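
The box transition described here amounts to a small state machine. The sketch below records output lines in a list instead of drawing a real TUI; `StreamDisplay`, its state names, and the bracketed marker strings are all hypothetical and only illustrate the idle, reasoning, response ordering.

```python
class StreamDisplay:
    """Hypothetical renderer that logs what a CLI display layer would draw."""

    def __init__(self):
        self.state = "idle"  # idle -> reasoning -> response
        self.lines = []

    def on_reasoning_delta(self, text):
        if self.state == "idle":
            self.lines.append("[reasoning box opened]")
            self.state = "reasoning"
        if self.state == "reasoning":
            self.lines.append(f"dim: {text}")
        # Reasoning that arrives after the response has started is ignored here.

    def on_response_delta(self, text):
        if self.state == "reasoning":
            self.lines.append("[reasoning box closed]")
        if self.state != "response":
            self.lines.append("[response box opened]")
            self.state = "response"
        self.lines.append(text)
```

If the model emits no reasoning at all, the first response delta opens the response box directly from the idle state.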

raulvidis · 2026-03-19T21:52:02Z

Closing — superseded by merged streaming infrastructure #1538

raulvidis and others added 11 commits March 14, 2026 00:19

feat: add gateway reasoning modes

2a9024b

fix: make gateway typing metadata-compatible

5c314d9

fix: align gateway reasoning modes

51a2021

fix: make reasoning previews transient

71c7395

fix: italicize reasoning previews

9dc293f

fix: stabilize telegram reasoning preview markdown

b425aa1

fix: avoid plain text fallback on telegram edit errors

eb1dc20

fix: keep a single reasoning preview message

fe7ce2d

fix: recover reasoning previews after edit failures

bd86914

fix: treat unchanged telegram edits as success

9320513

teknium1 mentioned this pull request Mar 16, 2026

feat: unified streaming infrastructure — real-time token delivery for CLI + gateway #1538

Merged

4 tasks

raulvidis closed this Mar 19, 2026

raulvidis wants to merge 11 commits into NousResearch:main from raulvidis:feat/reasoning-clean