fix: empty response recovery for reasoning models (mimo, qwen, GLM) by teknium1 · Pull Request #8609 · NousResearch/hermes-agent

teknium1 · 2026-04-12T21:16:58Z

Summary

Fixes the (empty) response bug affecting open reasoning models on OpenRouter (mimo-v2-pro, qwen3.5, GLM).

Root cause: Models like mimo-v2-pro always return reasoning/reasoning_details fields via OpenRouter — even on transient empty responses. The recovery chain had a not _has_structured guard on the retry path that blocked retries for ANY model with reasoning after the 2 prefill attempts were exhausted. Combined with retry counters that never reset during tool-calling turns, this caused permanent (empty) in long sessions.

Fixes

1. Allow retries after prefill exhaustion

# Before: retry path blocked when reasoning present
if _truly_empty and not _has_structured and self._empty_content_retries < 3:

# After: allow retries once prefill is exhausted
_prefill_exhausted = _has_structured and self._thinking_prefill_retries >= 2
if _truly_empty and (not _has_structured or _prefill_exhausted) and self._empty_content_retries < 3:
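The before/after guard can be exercised in isolation. The following is a minimal sketch, not the agent's actual code: `should_retry` and its `fixed` flag are hypothetical names, and the limits of 2 prefills and 3 retries are taken from the snippet and description above.

```python
MAX_PREFILL_RETRIES = 2  # assumed from "the 2 prefill attempts"
MAX_EMPTY_RETRIES = 3    # from the `< 3` bound in the snippet above


def should_retry(truly_empty, has_structured, prefill_retries,
                 empty_retries, *, fixed=True):
    """Hypothetical stand-in for the retry-path guard.

    fixed=False mirrors the "Before" condition, fixed=True the "After" one.
    """
    if not truly_empty or empty_retries >= MAX_EMPTY_RETRIES:
        return False
    if not fixed:
        # Old guard: any reasoning payload blocks the retry path.
        return not has_structured
    # New guard: reasoning only blocks retries while prefill attempts remain.
    prefill_exhausted = has_structured and prefill_retries >= MAX_PREFILL_RETRIES
    return (not has_structured) or prefill_exhausted


# A reasoning model that has used both prefill attempts:
print(should_retry(True, True, 2, 0, fixed=False))  # old guard
print(should_retry(True, True, 2, 0, fixed=True))   # new guard
```

With the old guard, the call returns False for every response that carries reasoning, which is the blocked state described in the root cause.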

2. Reset counters on tool-call recovery
When prefill succeeds (model makes proper tool calls after prefill), reset both _thinking_prefill_retries and _empty_content_retries. Previously these only reset on successful content, never during tool-calling turns — so a model cycling empty→prefill→tools→empty burned all prefill attempts permanently.
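A small simulation shows why the reset matters. This is a sketch under stated assumptions: `RecoveryCounters` and `simulate` are hypothetical helpers, and each round models one empty→prefill→tools cycle in which the prefill always succeeds.

```python
from dataclasses import dataclass


@dataclass
class RecoveryCounters:
    """Hypothetical container for the two counters named in the PR."""
    thinking_prefill_retries: int = 0
    empty_content_retries: int = 0

    def reset(self):
        self.thinking_prefill_retries = 0
        self.empty_content_retries = 0


def simulate(cycles, reset_on_tool_calls):
    """Count how many empty responses still had a prefill attempt available."""
    counters = RecoveryCounters()
    recovered = 0
    for _ in range(cycles):
        # An empty response arrives; prefill is only possible with budget left.
        if counters.thinking_prefill_retries < 2:
            counters.thinking_prefill_retries += 1
            recovered += 1
            # The prefill worked and the model made proper tool calls.
            if reset_on_tool_calls:
                counters.reset()
    return recovered


print(simulate(5, reset_on_tool_calls=False))  # 2
print(simulate(5, reset_on_tool_calls=True))   # 5
```

Without the reset, only the first two cycles recover. With it, every cycle starts from a full prefill budget.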

3. Strip think blocks before truly-empty check
_truly_empty now uses _strip_think_blocks() so inline <think> content doesn't make the string falsely non-empty.
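The effect of stripping can be sketched as follows. The regex-based `strip_think_blocks` here is a hypothetical stand-in for `_strip_think_blocks()`, whose real implementation is not shown in this PR, and `is_truly_empty` only looks at the content string, whereas the real check may consider other fields.

```python
import re

# Assumption: think blocks are delimited by literal <think>...</think> tags.
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)


def strip_think_blocks(text):
    """Remove inline <think>...</think> spans (illustrative stand-in)."""
    return _THINK_RE.sub("", text or "")


def is_truly_empty(content):
    """True when nothing but whitespace remains after stripping think blocks."""
    return not strip_think_blocks(content).strip()


print(is_truly_empty("<think>plan the tool call</think>"))    # True
print(is_truly_empty("<think>plan</think>\nHere is the fix"))  # False
```

A plain `not content` check would treat the first string as non-empty and skip both recovery paths.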

Reproduction

qwen/qwen3.5-9b on OpenRouter reliably produces the triggering response: the model emits tool calls as XML inside its reasoning field instead of proper function calls. The response has content=None, tool_calls=None, reasoning='...<tool_call>...'. Prefill recovery works on the first attempt, but counter accumulation caused permanent (empty) in sustained sessions.
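To spot this response shape while reproducing, a check like the one below may help. It is a sketch only: `looks_like_misplaced_tool_call` is a hypothetical helper, and the message is assumed to be a plain dict with `content`, `tool_calls`, and `reasoning` keys, which may not match the client objects the agent actually handles.

```python
def looks_like_misplaced_tool_call(message):
    """True when content and tool_calls are empty but reasoning holds <tool_call> XML."""
    reasoning = message.get("reasoning") or ""
    return (
        not message.get("content")
        and not message.get("tool_calls")
        and "<tool_call>" in reasoning
    )


# The shape described above, with the reasoning text left elided as in the report.
triggering = {"content": None, "tool_calls": None,
              "reasoning": "...<tool_call>..."}
print(looks_like_misplaced_tool_call(triggering))  # True
```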

Test plan

Updated 2 existing tests to match new recovery depth (6 attempts instead of 3)
python -m pytest tests/run_agent/test_run_agent.py -k 'empty or prefill or thinking' — 22 passed
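The "6 attempts instead of 3" depth can be reproduced with a short simulation. This is a sketch, not the test code from the repository: `attempts_before_empty` is a hypothetical function, every response is assumed to come back empty, and the prefill path is assumed to run before the plain retry path whenever reasoning is present.

```python
def attempts_before_empty(has_structured, fixed):
    """Count model calls made before giving up with (empty)."""
    prefill_retries = 0
    empty_retries = 0
    attempts = 1  # the initial call, assumed empty like all later ones
    while True:
        prefill_exhausted = has_structured and prefill_retries >= 2
        if fixed:
            retry_allowed = (not has_structured) or prefill_exhausted
        else:
            retry_allowed = not has_structured
        if has_structured and prefill_retries < 2:
            prefill_retries += 1  # spend a prefill attempt
        elif retry_allowed and empty_retries < 3:
            empty_retries += 1    # spend a plain retry
        else:
            return attempts       # nothing left: surface (empty)
        attempts += 1


print(attempts_before_empty(has_structured=True, fixed=False))  # 3
print(attempts_before_empty(has_structured=True, fixed=True))   # 6
```

Under these assumptions the count is one initial call, two prefills, and three retries.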

Three fixes for the (empty) response bug affecting open reasoning models:

1. Allow retries after prefill exhaustion: models like mimo-v2-pro always populate reasoning fields via OpenRouter, so the old 'not _has_structured' guard on the retry path blocked retries for EVERY reasoning model after the 2 prefill attempts. Now: the initial call + 2 prefills + 3 retries = 6 total attempts before (empty).

2. Reset prefill/retry counters on tool-call recovery: the counters accumulated across the entire conversation, never resetting during tool-calling turns. A model cycling empty→prefill→tools→empty burned both prefill attempts and the third empty got zero recovery. Now counters reset when prefill succeeds with tool calls.

3. Strip think blocks before _truly_empty check: inline <think> content made the string non-empty, skipping both retry paths.

Reported by users on Telegram with xiaomi/mimo-v2-pro and qwen3.5 models. Reproduced: qwen3.5-9b emits tool calls as XML in the reasoning field instead of proper function calls, causing content=None + tool_calls=None + reasoning with embedded <tool_call> XML. Prefill recovery works, but counter accumulation caused permanent (empty) in long sessions.

…o, qwen, GLM)

Upstream commit d6785dc (PR NousResearch#8609):
- Fix empty response retry logic that blocked retries for reasoning models after prefill exhaustion. Models like mimo-v2-pro always populate reasoning fields via OpenRouter, so the old not _has_structured guard prevented retries for every reasoning model after prefill.
- Remove tool timing footer (upstream cleanup)
- Remove format_tool_timing_footer import
- Replace logger.debug with pass for non-critical failures
- 134 files changed, net -16K lines (mostly test/data cleanup)

Self-improve: automated upstream merge

teknium1 merged commit d6785dc into main Apr 12, 2026
4 of 6 checks passed

teknium1 deleted the hermes/hermes-81f1601a branch April 12, 2026 22:38

github-actions Bot mentioned this pull request Apr 15, 2026

chore: bump NousResearch/hermes-agent version from v2026.4.8 to v2026.4.13 Docker-Hub-sirmark/docker-hermes-agent#1

Merged

