Skip to content

fix: guard empty streams + parse OpenRouter output-cap errors#40589

Merged
teknium1 merged 1 commit into
mainfrom
salvage/40405-stream-and-outputcap
Jun 7, 2026
Merged

fix: guard empty streams + parse OpenRouter output-cap errors#40589
teknium1 merged 1 commit into
mainfrom
salvage/40405-stream-and-outputcap

Conversation

@teknium1

@teknium1 teknium1 commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

Summary

Two isolated reliability fixes carved out of #40405:

  1. Zero-chunk stream guard (agent/chat_completion_helpers.py): a stream that yields nothing usable (no finish_reason, no content/reasoning/tool_calls) is a provider/upstream error or malformed SSE, not a legitimate empty completion. Raise so the retry machinery handles it instead of fabricating a successful empty turn.
  2. OpenRouter/Nous output-cap parsing (agent/model_metadata.py): parse_available_output_tokens_from_error now recognizes the "maximum context length is N … (A of text input, B of tool input, C in the output)" phrasing and computes the available output cap, so the caller stops looping on that 400.

Scope note

The original PR (#40405) also bundled two compression-state changes (on_session_start clearing _previous_summary, and cron session-id prefix preservation through compression, citing #38788). Those touch the compression hot path and are a different concern — split out for separate review rather than landing alongside the stream/error-parsing fixes.

Validation

tests/test_output_cap_parsing.py 4 passed (incl. Anthropic-format regression + no-room → None). py_compile OK on both files.

Salvaged from #40405 (@ashishpatel26), credited.

…ut-cap errors

Two isolated reliability fixes:
- chat_completion_helpers: raise on a zero-chunk stream (no finish_reason,
  no content/reasoning/tool_calls) so retry handles it instead of
  fabricating a successful empty turn.
- model_metadata: parse the OpenRouter/Nous output-cap error phrasing
  ("maximum context length is N ... (A of text input, B of tool input,
  C in the output)") so parse_available_output_tokens_from_error returns
  a real cap and the caller stops looping on it.

Salvaged from #40405 (@ashishpatel26) — took the two stream/error-parsing
fixes. The PR also bundled compression-state changes (on_session_start
clearing _previous_summary; cron session-id prefix preservation, #38788);
those touch the compression hot path and are split out for separate review.

Co-authored-by: ashishpatel26 <ashishpatel26@users.noreply.github.com>
@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder provider/openrouter OpenRouter aggregator provider/nous Nous Research API (OAuth) labels Jun 6, 2026
@github-actions

github-actions Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

🔎 Lint report: salvage/40405-stream-and-outputcap vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9963 on HEAD, 9962 on base (🆕 +1)

🆕 New issues (1):

Rule Count
unresolved-import 1
First entries
tests/test_output_cap_parsing.py:1: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`

✅ Fixed issues: none

Unchanged: 5167 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@teknium1 teknium1 merged commit 1fb99b1 into main Jun 7, 2026
23 checks passed
@teknium1 teknium1 deleted the salvage/40405-stream-and-outputcap branch June 7, 2026 10:52
agogo233 added a commit to agogo233/hermes-agent that referenced this pull request Jun 8, 2026
* upstream/main: (430 commits)
  fix(yuanbao): bound ws.close() so an idle server can't stall shutdown ~5s (NousResearch#40607)
  docs: add Urdu translation of README (NousResearch#40578)
  fix(hindsight): send only new-turn delta on append retains instead of whole session (NousResearch#40605)
  feat(gateway): render terminal tool calls as native bash code blocks on markdown platforms (NousResearch#41215)
  feat(desktop): stop the chat viewport from following streaming output (NousResearch#41414)
  chore(release): map AlchemistChaos co-author email for NousResearch#40135 salvage
  fix(desktop): recover chat after sleep/wake by revalidating a stale remote backend
  fix(web): make _has_env config-aware so SEARXNG_URL auto-detect honors Hermes config
  fix(web): honor Hermes config-aware SEARXNG_URL lookup
  install.sh: hint at root-owned npm cache when desktop npm install fails (NousResearch#39688)
  fix(tools): percent-encode non-ascii URL components
  fix(skills): browse shows full catalog, not first 5000 (NousResearch#41413)
  feat(desktop+gateway): remote media relay — attach images/PDFs and display gateway images over the network
  feat(desktop): full tool-backend config (pickers + per-backend settings) in Settings (NousResearch#41232)
  hardening(api-server): scan cron prompts on REST create/update for parity with the agent tool
  fix: skip MCP preflight content-type probe on reconnect when already ready (NousResearch#40604)
  fix(kanban): sweep deferred scratch parent on non-scratch child completion + tests
  fix: defer scratch workspace cleanup when task has active children (NousResearch#33774)
  feat(onboarding): opt-in structured profile-build path on first contact (NousResearch#41114)
  feat(compression): temporal anchoring in compaction summaries (NousResearch#41102)
  test(discord): align clarify/model-picker tests with fail-closed component auth (NousResearch#41338)
  chore(release): map Dusk1e and LaPhilosophie for approval fail-closed salvage (NousResearch#33844, NousResearch#33866, NousResearch#30964)
  fix(discord): fail closed for component button auth when no allowlist set
  fix(feishu): fail closed for update prompt card actions
  fix(slack): re-check gateway auth on approval and slash-confirm buttons
  fix: guard int(os.getenv()) casts against malformed env vars (NousResearch#40598)
  fix: respect Honcho env var fallback in doctor and honcho status
  chore(release): add synapsesx to AUTHOR_MAP for NousResearch#40495 salvage
  fix(research): keep tool_call/tool_response pairs intact when compressing trajectories
  fix(simplex): accept display name in SIMPLEX_ALLOWED_USERS
  fix(desktop): make the running-turn timer per-session (NousResearch#41182)
  test(approval): regression for shell-escape denylist bypass (NousResearch#36846, NousResearch#36847)
  fix(security): strip shell escapes in denylist normalizer; fail-closed on missing approval module
  fix(stream+output-cap): guard empty streams and parse OpenRouter output-cap errors (NousResearch#40589)
  fix(desktop): bootstrap falls back to installed agent install.sh on GitHub 404
  feat(dashboard): change UI font from the theme picker, independent of theme (NousResearch#41145)
  fix(cli): return bool (not None) when a destructive-slash confirmation is cancelled (NousResearch#40583)
  fix(desktop): preserve configured base_url on same-provider model switch (NousResearch#41121)
  fix(desktop): stop bare-URL autolinker swallowing trailing emphasis asterisks (NousResearch#41093)
  fix(cron): bound the desktop run-history query to one job (NousResearch#41088)
  fix(desktop): scope in-session /model switch per-session, stop process-env leak (NousResearch#41120)
  chore: map bmoore210 author email for PR NousResearch#40550 salvage
  fix(desktop): scope session list to active profile + longer timeout
  fix: harden gateway startup and turn persistence
  fix(computer_use): honor custom vision routing
  fix(aux): honor model.default_headers on auxiliary client too (NousResearch#40033)
  fix(agent): honor model.default_headers for custom OpenAI-compatible providers (NousResearch#40033)
  docs(i18n): port deep-audit corrections to zh-Hans mirror (NousResearch#41104)
  fix(compression): don't overwrite the -1 post-compression sentinel in preflight seed (NousResearch#36718)
  chore(release): map singhsanidhya741@gmail.com to sanidhyasin (NousResearch#41094)
  ...
changman pushed a commit to changman/hermes-agent that referenced this pull request Jun 10, 2026
…ut-cap errors (NousResearch#40589)

Two isolated reliability fixes:
- chat_completion_helpers: raise on a zero-chunk stream (no finish_reason,
  no content/reasoning/tool_calls) so retry handles it instead of
  fabricating a successful empty turn.
- model_metadata: parse the OpenRouter/Nous output-cap error phrasing
  ("maximum context length is N ... (A of text input, B of tool input,
  C in the output)") so parse_available_output_tokens_from_error returns
  a real cap and the caller stops looping on it.

Salvaged from NousResearch#40405 (@ashishpatel26) — took the two stream/error-parsing
fixes. The PR also bundled compression-state changes (on_session_start
clearing _previous_summary; cron session-id prefix preservation, NousResearch#38788);
those touch the compression hot path and are split out for separate review.

Co-authored-by: ashishpatel26 <ashishpatel26@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P1 High — major feature broken, no workaround provider/nous Nous Research API (OAuth) provider/openrouter OpenRouter aggregator type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants