fix(agent): output-token parse for OpenRouter; empty-stream guard; cron-session prefix (#38652 #38725 #38788) #40405

Closed
ashishpatel26 wants to merge 1 commit into NousResearch:main from ashishpatel26:fix/p1-output-tokens-empty-stream-cron-leak

Conversation

ashishpatel26 commented Jun 6, 2026

Contributor

Summary

Three P1 bug fixes: infinite token-reset loop, fabricated empty stream turn, cron summary leak.

Fix 1 - #38652: parse_available_output_tokens_from_error misses OpenRouter format

Root cause: the guard required both the "max_tokens" and "available_tokens" keywords. OpenRouter phrases the error as "maximum context length is N" and "K in the output", so neither keyword is present, the function returns None, and the caller loops forever.

Fix: expanded the guard to accept the OpenRouter phrasing and added an extraction step that computes available = context_length - text_input - tool_input.

Files: agent/model_metadata.py
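
A minimal sketch of the extraction described in Fix 1, assuming the error phrasing quoted in this PR ("maximum context length is N ... (A of text input, B of tool input, C in the output)"). The function name, the regexes, and the sample message are hypothetical illustrations, not the code in agent/model_metadata.py.

```python
import re
from typing import Optional

# Hypothetical patterns for the OpenRouter-style phrasing; the real ones may differ.
_CONTEXT_RE = re.compile(r"maximum context length is\s+(\d[\d,]*)", re.IGNORECASE)
_TEXT_RE = re.compile(r"(\d[\d,]*)\s+of text input", re.IGNORECASE)
_TOOL_RE = re.compile(r"(\d[\d,]*)\s+of tool input", re.IGNORECASE)


def _to_int(raw: str) -> int:
    """Convert a possibly comma-grouped number such as '8,192' to an int."""
    return int(raw.replace(",", ""))


def parse_openrouter_available_output(error_message: str) -> Optional[int]:
    """Return context_length - text_input - tool_input, or None if unparseable."""
    ctx = _CONTEXT_RE.search(error_message)
    text = _TEXT_RE.search(error_message)
    if ctx is None or text is None:
        return None  # not the phrasing this sketch understands
    tool = _TOOL_RE.search(error_message)
    tool_tokens = _to_int(tool.group(1)) if tool else 0  # assumed optional
    available = _to_int(ctx.group(1)) - _to_int(text.group(1)) - tool_tokens
    # A non-positive budget gives the caller nothing to retry with.
    return available if available > 0 else None


# Hypothetical sample message, built only from the phrasing quoted in this PR.
SAMPLE = (
    "maximum context length is 8192 tokens "
    "(5000 of text input, 1000 of tool input, 3000 in the output)"
)
```

With the sample above, the sketch returns 8192 - 5000 - 1000 = 2192, which a caller could use as a reduced max_tokens instead of retrying with the same request.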

Fix 2 - #38725: Streaming parser fabricates empty stop turn on zero-chunk stream

Root cause: when the stream loop finishes with zero chunks, finish_reason is still None. The or "stop" fallback then fabricated a successful empty turn, hiding the provider error.

Fix: a zero-chunk guard raises RuntimeError so the retry machinery handles recovery.

Files: agent/chat_completion_helpers.py
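
The guard in Fix 2 can be sketched as follows. This is an illustration only: the chunk shape (plain dicts with optional "content", "tool_calls", and "finish_reason" keys) and the function name collect_stream are assumptions, not the structures used in agent/chat_completion_helpers.py.

```python
from typing import Iterable, Optional


def collect_stream(chunks: Iterable[dict]) -> dict:
    """Accumulate streamed chunks; raise if the stream produced nothing at all."""
    content_parts: list[str] = []
    tool_calls: list[dict] = []
    finish_reason: Optional[str] = None

    for chunk in chunks:
        if chunk.get("content"):
            content_parts.append(chunk["content"])
        tool_calls.extend(chunk.get("tool_calls") or [])
        # Keep the last non-empty finish_reason seen.
        finish_reason = chunk.get("finish_reason") or finish_reason

    # Zero-chunk guard: no finish_reason and nothing accumulated means the
    # provider sent no usable data, so surface an error for the retry layer.
    if finish_reason is None and not content_parts and not tool_calls:
        raise RuntimeError("stream ended with no chunks and no finish_reason")

    return {
        "content": "".join(content_parts),
        "tool_calls": tool_calls,
        # The "stop" fallback is now reached only when real output exists.
        "finish_reason": finish_reason or "stop",
    }
```

Raising here, rather than returning an empty turn, keeps the decision about recovery in the caller's retry logic.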

Fix 3 - #38788: Cron session summary leaks into live conversations after compression

Root cause A: compression rotated the session_id and dropped the cron_ prefix, so a later resume injected cron summaries into live conversations.

Root cause B: _previous_summary was never cleared on session switch.

Fix A: session rotation preserves the cron_ prefix when the session is cron-sourced.

Fix B: added ContextCompressor.on_session_start(), which clears _previous_summary on each session switch.

Files: agent/conversation_compression.py, agent/context_compressor.py
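
A rough sketch of the two changes in Fix 3, under stated assumptions: rotate_session_id and CompressorSketch are hypothetical stand-ins, the new id is generated with uuid purely for illustration, and a session is treated as cron-sourced when its old id starts with cron_. The actual code in agent/conversation_compression.py and agent/context_compressor.py may detect and rotate sessions differently.

```python
import uuid
from typing import Optional

CRON_PREFIX = "cron_"


def rotate_session_id(old_session_id: str) -> str:
    """Return a fresh session id, carrying the cron_ prefix forward if present."""
    new_id = uuid.uuid4().hex  # hypothetical id scheme
    if old_session_id.startswith(CRON_PREFIX):
        return f"{CRON_PREFIX}{new_id}"
    return new_id


class CompressorSketch:
    """Stand-in for a compressor that remembers its previous summary."""

    def __init__(self) -> None:
        self._previous_summary: Optional[str] = None

    def remember(self, summary: str) -> None:
        # Hypothetical helper: store the summary produced by a compression pass.
        self._previous_summary = summary

    def on_session_start(self) -> None:
        # Clear stale state so a summary from one session cannot reach the next.
        self._previous_summary = None
```

Calling on_session_start() on every switch means a summary produced during a cron run is gone before a live conversation is resumed.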

Test plan

…on-session prefix

Three P1 fixes in one batch:

1. parse_available_output_tokens_from_error (NousResearch#38652): guard and extraction
   now recognise the OpenRouter/Nous error format ("maximum context length
   is N... K in the output"). The old guard required "max_tokens" +
   "available_tokens" keywords — both absent in OpenRouter responses —
   so the function returned None, the caller could not reduce max_tokens,
   and the session entered an infinite auto-reset loop.

2. Zero-chunk stream guard (NousResearch#38725): _call_chat_completions now raises
   RuntimeError when the SSE loop exits with finish_reason=None and no
   accumulated content/tool-calls. Previously the or-"stop" fallback
   fabricated a syntactically valid but empty completion, masking upstream
   errors as successful turns.

3. Cron session_id prefix preservation (NousResearch#38788): compression-triggered
   session rotation now carries forward the cron_ prefix so post-compression
   cron sessions remain identifiable. ContextCompressor.on_session_start
   clears _previous_summary on every session switch, preventing stale cron
   summaries from bleeding into live conversations resumed via /resume.

Fixes NousResearch#38652, NousResearch#38725, NousResearch#38788.
alt-glitch added the type/bug, comp/agent, provider/openrouter, and P1 labels on Jun 6, 2026
teknium1 commented Jun 6, 2026

Contributor

Salvaged the two isolated reliability fixes (zero-chunk stream guard + OpenRouter output-cap parsing) into #40589 with credit. I split out the two compression-state changes (_previous_summary clear + cron session-id prefix, #38788) — those touch the compression hot path and warrant separate review rather than riding along. If you want, re-submit those as a focused PR and we'll review them on their own. Thanks!

teknium1 closed this Jun 6, 2026
teknium1 added a commit that referenced this pull request Jun 7, 2026
…ut-cap errors (#40589)

Two isolated reliability fixes:
- chat_completion_helpers: raise on a zero-chunk stream (no finish_reason,
  no content/reasoning/tool_calls) so retry handles it instead of
  fabricating a successful empty turn.
- model_metadata: parse the OpenRouter/Nous output-cap error phrasing
  ("maximum context length is N ... (A of text input, B of tool input,
  C in the output)") so parse_available_output_tokens_from_error returns
  a real cap and the caller stops looping on it.

Salvaged from #40405 (@ashishpatel26) — took the two stream/error-parsing
fixes. The PR also bundled compression-state changes (on_session_start
clearing _previous_summary; cron session-id prefix preservation, #38788);
those touch the compression hot path and are split out for separate review.

Co-authored-by: ashishpatel26 <ashishpatel26@users.noreply.github.com>
changman pushed a commit to changman/hermes-agent that referenced this pull request Jun 10, 2026
…ut-cap errors (NousResearch#40589)
