Bugfix: recover from structured reasoning budget exhaustion by HiddenPuppy · Pull Request #9452 · NousResearch/hermes-agent

HiddenPuppy · 2026-04-14T06:02:07Z

Summary

detect thinking-budget exhaustion for chat-completions responses that return structured reasoning fields without visible text
record usage before length/empty-response recovery so Hermes can use real token pressure for recovery decisions
compact context before continuation/prefill when a reasoning-only response already shows the conversation is over the compaction threshold
add regression tests for structured reasoning truncation and proactive compression retry paths
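
The ordering described above (record usage first, then decide between compaction and a continuation/prefill retry) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in run_agent.py: the `Usage` dataclass, `choose_recovery`, and the `compaction_ratio` default of 0.8 are hypothetical names and values.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Token counts reported for one response (hypothetical shape)."""
    prompt_tokens: int
    completion_tokens: int


def choose_recovery(usage: Usage, context_limit: int, compaction_ratio: float = 0.8) -> str:
    """Pick the next step after a reasoning-only response.

    Returns "compress" when the recorded usage already meets the (assumed)
    compaction threshold, otherwise "continue" for a continuation/prefill retry.
    """
    threshold = int(context_limit * compaction_ratio)
    total = usage.prompt_tokens + usage.completion_tokens
    return "compress" if total >= threshold else "continue"
```

The point of the sketch is that the decision reads token counts recorded from the response itself, so usage has to be stored before any recovery branch runs.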

Root Cause

Hermes already had a thinking-budget guard for inline <think> content, but OpenAI-compatible models like glm-5-turbo often return reasoning via reasoning_content/reasoning_details with empty content. Those responses skipped the guard, then walked into continuation or prefill retries that grew context further without ever giving compression a chance.
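
A minimal sketch of the kind of check this implies, assuming the response message is available as a dict and that truncation is signalled by a finish reason of "length". The helper names below are hypothetical and are not taken from run_agent.py; only the field names reasoning_content and reasoning_details come from the description above.

```python
from typing import Any, Mapping, Optional


def has_structured_reasoning(message: Mapping[str, Any]) -> bool:
    """True if the message carries reasoning in a structured field."""
    return bool(message.get("reasoning_content")) or bool(message.get("reasoning_details"))


def is_reasoning_only_truncation(message: Mapping[str, Any], finish_reason: Optional[str]) -> bool:
    """Detect a response that spent its output budget on reasoning.

    Assumption: the provider reports finish_reason == "length" when the
    output budget runs out, and visible text lives in message["content"].
    """
    content = message.get("content") or ""
    visible = content.strip() if isinstance(content, str) else content
    return finish_reason == "length" and not visible and has_structured_reasoning(message)
```

A guard that only looks for inline <think> tags in content would return nothing for such a message, which is the gap described here.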

Notes

This PR focuses on the still-present structured-reasoning recovery gap reported in #9344 ("[Bug] Thinking model (glm-5-turbo) reasoning tokens exhaust output budget, producing empty responses with no recovery path").
The separate stale _last_content_with_tools fallback issue from #7968 ("[Bug]: _last_content_with_tools fallback bypasses empty-response retries, causing silent agent loop termination mid-task") is tracked independently in #9432 ("Bugfix: avoid stale tool-turn fallback on empty responses") and is intentionally not duplicated here.

Validation

git diff --check
python3 -m py_compile run_agent.py tests/run_agent/test_run_agent.py
Full pytest was not runnable locally in this environment because the machine does not currently have the repo's required Python 3.11 + dev test toolchain installed.

Closes #9344

HiddenPuppy added 3 commits April 14, 2026 14:01

Bugfix: recover from structured reasoning budget exhaustion

90764e9

Bugfix: allow GitHub noreply contributor emails

8184e97

Bugfix: preserve retries after reasoning compression

97ff256

alt-glitch added the type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels on Apr 27, 2026

alt-glitch mentioned this pull request May 18, 2026

fix(conversation_loop): detect structured reasoning exhaustion from Ollama fallback models #28133 (Closed)

HiddenPuppy wants to merge 3 commits into NousResearch:main from HiddenPuppy:codex/fix-thinking-budget-recovery
