Skip to content

fix: retry 3 times with nudge when model returns truly empty response#6488

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-d5c0fd3b
Apr 9, 2026
Merged

fix: retry 3 times with nudge when model returns truly empty response#6488
teknium1 merged 1 commit into
mainfrom
hermes/hermes-d5c0fd3b

Conversation

@teknium1

@teknium1 teknium1 commented Apr 9, 2026

Copy link
Copy Markdown
Contributor

Summary

Fills the last gap in the empty-response recovery chain. When a model returns no content, no structured reasoning, and no tool calls (common with open models via OpenRouter/Ollama), the agent now nudges the model up to 3 times before falling through to (empty).

Problem

Open models sometimes return completely blank responses — no content field, no reasoning, no tool calls. PR #5278 removed the old retry cascade (correctly, for reasoning-only responses), and PR #5931 added thinking prefill continuation for structured reasoning. But neither covers the case where the model produces absolutely nothing. Users on Discord/Telegram see (empty) as a visible message.

Recovery chain (complete)

Step Condition Action
1 Prior tool turn had content Use _last_content_with_tools fallback
2 Structured reasoning, no text Thinking prefill continuation (#5931)
3 Truly empty (no content, no reasoning) Nudge retry up to 3 times (NEW)
4 All recovery exhausted (empty) terminal

What changed

run_agent.py: Between the thinking prefill path and the (empty) terminal, added a nudge retry loop. Each retry:

  • Appends the empty assistant message (maintains role alternation)
  • Appends a system nudge: [System: Your last response was empty. Please provide a response to the user.]
  • Continues the agent loop

Gated on _truly_empty (content is None or whitespace-only) AND not _has_structured (no API reasoning fields). Inline <think> blocks are excluded — the model chose to reason, it just produced no visible text.

tests/run_agent/test_run_agent.py:

  • Updated test_truly_empty_response → expects 4 API calls (1 original + 3 retries)
  • Added test_truly_empty_response_succeeds_on_nudge — model produces content after 1 nudge

Test results

6 passed (empty_response + truly_empty + reasoning_only tests)
674 passed total in run_agent suite (7 pre-existing failures in test_agent_loop_tool_calling unrelated)

When a model returns no content, no structured reasoning, and no tool
calls (common with open models), the agent now silently retries up to
3 times before falling through to (empty).

Silent retry (no synthetic messages) keeps the conversation history
clean, preserves prompt caching, and respects the no-synthetic-user-
injection invariant.  Most empty responses from open models are
transient (provider hiccups, rate limits, sampling flukes) so a
simple retry is sufficient.

This fills the last gap in the empty-response recovery chain:
1. _last_content_with_tools fallback (prior tool turn had content)
2. Thinking-only prefill continuation (#5931 — structured reasoning)
3. Empty response silent retry (NEW — truly empty, no reasoning)
4. (empty) terminal (last resort after all retries exhausted)

Inline <think> blocks are excluded — the model chose to reason, it
just produced no visible text.  That differs from truly empty.

Tests:
- Updated test_truly_empty to expect 4 API calls (1 + 3 retries)
- Added test_truly_empty_response_succeeds_on_nudge
@teknium1 teknium1 force-pushed the hermes/hermes-d5c0fd3b branch from d62f35f to e63727c Compare April 9, 2026 08:57
@teknium1 teknium1 merged commit 1eabbe9 into main Apr 9, 2026
4 of 6 checks passed
Tommyeds pushed a commit to Tommyeds/hermes-agent that referenced this pull request Apr 12, 2026
…rch#6488)

When a model returns no content, no structured reasoning, and no tool
calls (common with open models), the agent now silently retries up to
3 times before falling through to (empty).

Silent retry (no synthetic messages) keeps the conversation history
clean, preserves prompt caching, and respects the no-synthetic-user-
injection invariant.  Most empty responses from open models are
transient (provider hiccups, rate limits, sampling flukes) so a
simple retry is sufficient.

This fills the last gap in the empty-response recovery chain:
1. _last_content_with_tools fallback (prior tool turn had content)
2. Thinking-only prefill continuation (NousResearch#5931 — structured reasoning)
3. Empty response silent retry (NEW — truly empty, no reasoning)
4. (empty) terminal (last resort after all retries exhausted)

Inline <think> blocks are excluded — the model chose to reason, it
just produced no visible text.  That differs from truly empty.

Tests:
- Updated test_truly_empty to expect 4 API calls (1 + 3 retries)
- Added test_truly_empty_response_succeeds_on_nudge
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
…rch#6488)

When a model returns no content, no structured reasoning, and no tool
calls (common with open models), the agent now silently retries up to
3 times before falling through to (empty).

Silent retry (no synthetic messages) keeps the conversation history
clean, preserves prompt caching, and respects the no-synthetic-user-
injection invariant.  Most empty responses from open models are
transient (provider hiccups, rate limits, sampling flukes) so a
simple retry is sufficient.

This fills the last gap in the empty-response recovery chain:
1. _last_content_with_tools fallback (prior tool turn had content)
2. Thinking-only prefill continuation (NousResearch#5931 — structured reasoning)
3. Empty response silent retry (NEW — truly empty, no reasoning)
4. (empty) terminal (last resort after all retries exhausted)

Inline <think> blocks are excluded — the model chose to reason, it
just produced no visible text.  That differs from truly empty.

Tests:
- Updated test_truly_empty to expect 4 API calls (1 + 3 retries)
- Added test_truly_empty_response_succeeds_on_nudge
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…rch#6488)

When a model returns no content, no structured reasoning, and no tool
calls (common with open models), the agent now silently retries up to
3 times before falling through to (empty).

Silent retry (no synthetic messages) keeps the conversation history
clean, preserves prompt caching, and respects the no-synthetic-user-
injection invariant.  Most empty responses from open models are
transient (provider hiccups, rate limits, sampling flukes) so a
simple retry is sufficient.

This fills the last gap in the empty-response recovery chain:
1. _last_content_with_tools fallback (prior tool turn had content)
2. Thinking-only prefill continuation (NousResearch#5931 — structured reasoning)
3. Empty response silent retry (NEW — truly empty, no reasoning)
4. (empty) terminal (last resort after all retries exhausted)

Inline <think> blocks are excluded — the model chose to reason, it
just produced no visible text.  That differs from truly empty.

Tests:
- Updated test_truly_empty to expect 4 API calls (1 + 3 retries)
- Added test_truly_empty_response_succeeds_on_nudge
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
…rch#6488)

When a model returns no content, no structured reasoning, and no tool
calls (common with open models), the agent now silently retries up to
3 times before falling through to (empty).

Silent retry (no synthetic messages) keeps the conversation history
clean, preserves prompt caching, and respects the no-synthetic-user-
injection invariant.  Most empty responses from open models are
transient (provider hiccups, rate limits, sampling flukes) so a
simple retry is sufficient.

This fills the last gap in the empty-response recovery chain:
1. _last_content_with_tools fallback (prior tool turn had content)
2. Thinking-only prefill continuation (NousResearch#5931 — structured reasoning)
3. Empty response silent retry (NEW — truly empty, no reasoning)
4. (empty) terminal (last resort after all retries exhausted)

Inline <think> blocks are excluded — the model chose to reason, it
just produced no visible text.  That differs from truly empty.

Tests:
- Updated test_truly_empty to expect 4 API calls (1 + 3 retries)
- Added test_truly_empty_response_succeeds_on_nudge
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…rch#6488)

When a model returns no content, no structured reasoning, and no tool
calls (common with open models), the agent now silently retries up to
3 times before falling through to (empty).

Silent retry (no synthetic messages) keeps the conversation history
clean, preserves prompt caching, and respects the no-synthetic-user-
injection invariant.  Most empty responses from open models are
transient (provider hiccups, rate limits, sampling flukes) so a
simple retry is sufficient.

This fills the last gap in the empty-response recovery chain:
1. _last_content_with_tools fallback (prior tool turn had content)
2. Thinking-only prefill continuation (NousResearch#5931 — structured reasoning)
3. Empty response silent retry (NEW — truly empty, no reasoning)
4. (empty) terminal (last resort after all retries exhausted)

Inline <think> blocks are excluded — the model chose to reason, it
just produced no visible text.  That differs from truly empty.

Tests:
- Updated test_truly_empty to expect 4 API calls (1 + 3 retries)
- Added test_truly_empty_response_succeeds_on_nudge
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…rch#6488)

When a model returns no content, no structured reasoning, and no tool
calls (common with open models), the agent now silently retries up to
3 times before falling through to (empty).

Silent retry (no synthetic messages) keeps the conversation history
clean, preserves prompt caching, and respects the no-synthetic-user-
injection invariant.  Most empty responses from open models are
transient (provider hiccups, rate limits, sampling flukes) so a
simple retry is sufficient.

This fills the last gap in the empty-response recovery chain:
1. _last_content_with_tools fallback (prior tool turn had content)
2. Thinking-only prefill continuation (NousResearch#5931 — structured reasoning)
3. Empty response silent retry (NEW — truly empty, no reasoning)
4. (empty) terminal (last resort after all retries exhausted)

Inline <think> blocks are excluded — the model chose to reason, it
just produced no visible text.  That differs from truly empty.

Tests:
- Updated test_truly_empty to expect 4 API calls (1 + 3 retries)
- Added test_truly_empty_response_succeeds_on_nudge
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant