fix(agent): output-token parse for OpenRouter; empty-stream guard; cron-session prefix (#38652 #38725 #38788) by ashishpatel26 · Pull Request #40405 · NousResearch/hermes-agent

ashishpatel26 · 2026-06-06T09:28:23Z

Summary

Three P1 bug fixes: infinite token-reset loop, fabricated empty stream turn, cron summary leak.

Fix 1 - #38652: parse_available_output_tokens_from_error misses OpenRouter format

Root cause: the guard required both the "max_tokens" and "available_tokens" keywords. OpenRouter errors instead say "maximum context length is N" and "K in the output". With neither keyword present, the function returns None, the caller cannot reduce max_tokens, and it loops forever.

Fix: expanded guard; added extraction computing available = context_length - text_input - tool_input.

Files: agent/model_metadata.py
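
A minimal Python sketch of the extraction described in Fix 1, under stated assumptions. The function name sketch_available_output_tokens, the regular expressions, and the sample error string are illustrative and are not taken from agent/model_metadata.py. Only the phrases "maximum context length is N", "of text input", "of tool input", "in the output", and the subtraction itself come from this PR's description.

```python
import re
from typing import Optional

# Hypothetical patterns: the provider's exact wording may differ from these.
_CONTEXT_RE = re.compile(r"maximum context length is\s+(\d[\d,]*)", re.IGNORECASE)
_TEXT_INPUT_RE = re.compile(r"(\d[\d,]*)\s+of text input", re.IGNORECASE)
_TOOL_INPUT_RE = re.compile(r"(\d[\d,]*)\s+of tool input", re.IGNORECASE)


def _to_int(raw: str) -> int:
    """Convert a matched number such as '200,000' or '200000' to an int."""
    return int(raw.replace(",", ""))


def sketch_available_output_tokens(error_message: str) -> Optional[int]:
    """Return context_length - text_input - tool_input, or None if not parseable."""
    context = _CONTEXT_RE.search(error_message)
    # Guard: only handle messages that look like the OpenRouter-style format.
    if context is None or "in the output" not in error_message.lower():
        return None

    text_in = _TEXT_INPUT_RE.search(error_message)
    tool_in = _TOOL_INPUT_RE.search(error_message)
    if text_in is None and tool_in is None:
        return None  # no input breakdown to subtract from the context length

    used = 0
    if text_in is not None:
        used += _to_int(text_in.group(1))
    if tool_in is not None:
        used += _to_int(tool_in.group(1))

    available = _to_int(context.group(1)) - used
    # Choice made for this sketch only: a non-positive budget is treated as unparseable.
    return available if available > 0 else None


# Assumed sample message, built from the phrases quoted in this PR.
sample = (
    "This endpoint's maximum context length is 200000 tokens. However, you "
    "requested about 205000 tokens (150000 of text input, 30000 of tool input, "
    "25000 in the output)."
)
print(sketch_available_output_tokens(sample))  # 200000 - 150000 - 30000 = 20000
```

Returning an integer here is what lets a caller lower max_tokens instead of retrying with the same request.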

Fix 2 - #38725: Streaming parser fabricates empty stop turn on zero-chunk stream

Root cause: after stream loop with zero chunks, finish_reason is None. The or "stop" fallback fabricated a successful empty turn, hiding the provider error.

Fix: zero-chunk guard raises RuntimeError so retry machinery handles recovery.

Files: agent/chat_completion_helpers.py
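
The zero-chunk guard in Fix 2 can be sketched roughly as follows. This is not the code in agent/chat_completion_helpers.py: the chunk shape (SketchChunk) and the function name collect_stream_sketch are hypothetical simplifications, and real stream chunks also carry reasoning and tool-call deltas that are omitted here.

```python
from typing import Iterable, Optional, TypedDict


class SketchChunk(TypedDict):
    # Hypothetical, simplified chunk shape used only for this illustration.
    content: str
    finish_reason: Optional[str]


def collect_stream_sketch(chunks: Iterable[SketchChunk]) -> dict:
    """Accumulate a stream, refusing to fabricate a turn from an empty stream."""
    parts: list[str] = []
    finish_reason: Optional[str] = None

    for chunk in chunks:
        if chunk["content"]:
            parts.append(chunk["content"])
        if chunk["finish_reason"] is not None:
            finish_reason = chunk["finish_reason"]

    # Guard: nothing arrived at all, so raise instead of defaulting to "stop".
    if finish_reason is None and not parts:
        raise RuntimeError("stream ended with no content and no finish_reason")

    # The "stop" fallback now only applies when some content was received.
    return {"content": "".join(parts), "finish_reason": finish_reason or "stop"}


try:
    collect_stream_sketch([])
except RuntimeError as exc:
    print(f"retryable error: {exc}")
```

Raising here, rather than returning an empty completion, leaves the decision to whatever retry logic wraps the call, which is the behavior the fix describes.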

Fix 3 - #38788: Cron session summary leaks into live conversations after compression

Root cause A: compression rotated session_id dropping cron_ prefix; later resume injected cron summaries into live conversations.

Root cause B: _previous_summary never cleared on session switch.

Fix A: session rotation preserves cron_ prefix when session is cron-sourced.

Fix B: added ContextCompressor.on_session_start() clearing _previous_summary on each switch.

Files: agent/conversation_compression.py, agent/context_compressor.py
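
A rough sketch of the two parts of Fix 3, assuming a simple string session ID. The ID format (a UUID hex string), the helper rotate_session_id_sketch, and the class SketchCompressor are hypothetical; only the cron_ prefix, the _previous_summary attribute, and the on_session_start() hook name come from this PR's description.

```python
import uuid
from typing import Optional

CRON_PREFIX = "cron_"


def rotate_session_id_sketch(old_session_id: str) -> str:
    """Create a new session ID, carrying the cron_ prefix forward if present."""
    new_id = uuid.uuid4().hex  # assumed ID format, for illustration only
    if old_session_id.startswith(CRON_PREFIX):
        return f"{CRON_PREFIX}{new_id}"
    return new_id


class SketchCompressor:
    """Stand-in for a compressor that remembers the last summary it produced."""

    def __init__(self) -> None:
        self._previous_summary: Optional[str] = None

    def remember(self, summary: str) -> None:
        self._previous_summary = summary

    def on_session_start(self) -> None:
        # Drop state from the previous session so it cannot leak into the next one.
        self._previous_summary = None


compressor = SketchCompressor()
compressor.remember("summary of a cron run")
compressor.on_session_start()  # called on every session switch
print(rotate_session_id_sketch("cron_nightly").startswith(CRON_PREFIX))
print(compressor._previous_summary)
```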

Test plan

#38652: OpenRouter near-full context - confirm available token count returned, no loop
#38725: zero-chunk SSE stream - confirm RuntimeError raised, not a silent empty turn
#38788A: cron compression - verify session_id retains cron_ prefix
#38788B: post-compression cron resume - verify no stale cron summary in live conversation

…on-session prefix

Three P1 fixes in one batch:

1. parse_available_output_tokens_from_error (NousResearch#38652): guard and extraction now recognise the OpenRouter/Nous error format ("maximum context length is N... K in the output"). The old guard required the "max_tokens" + "available_tokens" keywords, both absent in OpenRouter responses, so the function returned None, the caller could not reduce max_tokens, and the session entered an infinite auto-reset loop.

2. Zero-chunk stream guard (NousResearch#38725): _call_chat_completions now raises RuntimeError when the SSE loop exits with finish_reason=None and no accumulated content/tool-calls. Previously the or-"stop" fallback fabricated a syntactically valid but empty completion, masking upstream errors as successful turns.

3. Cron session_id prefix preservation (NousResearch#38788): compression-triggered session rotation now carries forward the cron_ prefix so post-compression cron sessions remain identifiable. ContextCompressor.on_session_start clears _previous_summary on every session switch, preventing stale cron summaries from bleeding into live conversations resumed via /resume.

Fixes NousResearch#38652, NousResearch#38725, NousResearch#38788.

teknium1 · 2026-06-06T15:26:18Z

Salvaged the two isolated reliability fixes (zero-chunk stream guard + OpenRouter output-cap parsing) into #40589 with credit. I split out the two compression-state changes (_previous_summary clear + cron session-id prefix, #38788) — those touch the compression hot path and warrant separate review rather than riding along. If you want, re-submit those as a focused PR and we'll review them on their own. Thanks!

#40589

@ashishpatel26

…ut-cap errors (#40589)

Two isolated reliability fixes:

- chat_completion_helpers: raise on a zero-chunk stream (no finish_reason, no content/reasoning/tool_calls) so retry handles it instead of fabricating a successful empty turn.
- model_metadata: parse the OpenRouter/Nous output-cap error phrasing ("maximum context length is N ... (A of text input, B of tool input, C in the output)") so parse_available_output_tokens_from_error returns a real cap and the caller stops looping on it.

Salvaged from #40405 (@ashishpatel26): took the two stream/error-parsing fixes. The PR also bundled compression-state changes (on_session_start clearing _previous_summary; cron session-id prefix preservation, #38788); those touch the compression hot path and are split out for separate review.

Co-authored-by: ashishpatel26 <ashishpatel26@users.noreply.github.com>


alt-glitch added the type/bug, comp/agent, provider/openrouter, and P1 labels Jun 6, 2026

teknium1 mentioned this pull request Jun 6, 2026

fix: guard empty streams + parse OpenRouter output-cap errors #40589

Merged

teknium1 closed this Jun 6, 2026

ashishpatel26 wants to merge 1 commit into NousResearch:main from ashishpatel26:fix/p1-output-tokens-empty-stream-cron-leak
