fix: emergency compression before max_iterations exhaustion#18607
Open
gzsiang wants to merge 2 commits into
liuhao1024 (Contributor) reviewed on May 2, 2026 and left a comment:
The emergency compression mechanism is a solid idea for handling the "provider doesn't return usage data" edge case. Two concerns:
1. No tests for a complex behavioral change
This PR adds a new IterationBudget.reset() method and a new code path that can extend the agent's effective iteration budget up to 4x max_iterations (original + 3 resets). The diff only touches run_agent.py — no test file. Given the interaction between _emergency_compression_count, iteration_budget.reset(), api_call_count reset, and the _budget_grace_call gate, this deserves at least:
- A unit test for IterationBudget.reset() (reset mid-consumption, verify remaining and used)
- An integration test showing that when remaining <= 1 and context is above threshold, emergency compression fires and the budget resets
- A test verifying the cap at 3 emergency compressions (4th exhaustion should hit _handle_max_iterations)
2. #TODO-issue placeholder in the comment block (line ~10718)
The comment has (#TODO-issue) — should reference the actual issue number before merging.
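The first requested test (reset mid-consumption, then check remaining and used) could be sketched roughly as below. This is only an illustration: the IterationBudget class here is a hypothetical stand-in, and its constructor argument max_total and its consume() method are assumptions, since the real class in run_agent.py is not shown on this page. Only reset(), remaining, and used are named in the review.

```python
class IterationBudget:
    """Hypothetical stand-in for the budget class discussed in this PR."""

    def __init__(self, max_total: int) -> None:
        self.max_total = max_total  # assumed name for the iteration cap
        self._used = 0

    def consume(self) -> bool:
        """Spend one iteration; return False if the budget is exhausted."""
        if self._used >= self.max_total:
            return False
        self._used += 1
        return True

    def reset(self) -> None:
        """Restore the full budget (the method this PR adds)."""
        self._used = 0

    @property
    def used(self) -> int:
        return self._used

    @property
    def remaining(self) -> int:
        return max(0, self.max_total - self._used)


def test_reset_mid_consumption() -> None:
    budget = IterationBudget(max_total=5)
    for _ in range(3):
        assert budget.consume()
    # Part of the budget is spent before the reset.
    assert (budget.used, budget.remaining) == (3, 2)
    budget.reset()
    # After the reset the full budget is available again.
    assert (budget.used, budget.remaining) == (0, 5)


test_reset_mid_consumption()
```

A real test would import the actual class from run_agent.py instead of redefining it, and would need to match whatever locking or accounting that class performs.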
Cyrene963 pushed a commit to Cyrene963/hermes-agent that referenced this pull request on May 3, 2026:
Community PRs applied:
- NousResearch#18596: Enable secret redaction by default (SECURITY)
- NousResearch#18650: Sanitize malformed tool messages + auto-recover on API 400
- NousResearch#18607: Emergency compression before max_iterations exhaustion
- NousResearch#18603: Compression fallback to main model on 413 rate limit
- NousResearch#18638: Pass threshold_percent on model switch
- NousResearch#18663: Strip extra_content from tool_calls for strict APIs
- NousResearch#18618: Forward explicit_api_key to OpenRouter
- NousResearch#18632: Show cache tokens in /insights breakdown
- NousResearch#18614: Add idempotency guard for patch duplicate loops
- NousResearch#18600: Raise ValueError when HERMES_HOME unset in profile mode
- NousResearch#18616: Allow ZWJ emoji in context files
- NousResearch#18582: Reload .env on /restart
- NousResearch#18547: Stabilize system prompt prefix for KV cache reuse
- NousResearch#18692: Strip FTS5 operators from session search truncation terms
Fix: Add order_by_last_active=True to list_sessions_rich call (pre-existing commit 142b4bf code sync)
3ace50e to d5bb01e
57ba10a to 68ed7a1
68ed7a1 to 9bc74a0
f027690 to 2907c0a
gzsiang added a commit to gzsiang/hermes-agent that referenced this pull request on Jun 6, 2026:
Added Chinese description of fork features:
- Circuit breaker (NousResearch#16749)
- CLI Chinese localization (NousResearch#15282)
- Message embedding (NousResearch#18059)
- Emergency compression (NousResearch#18607)
Problem
When max_iterations (default 90) was exhausted before the compression threshold was reached, the agent was killed without compressing.
This happened when:
- [...] (last_prompt_tokens == 0 → should_compress(0) never fires)
Result: Agent hits max_iterations, calls _handle_max_iterations, and the session ends mid-work.
Fix
Add emergency compression in the main loop that triggers when:
- [...] (remaining <= 1)
After compression, the budget resets and the agent continues.
Changes
- IterationBudget.reset() method
- api_call_count += 1 in the main loop
- _emergency_compression_count counter (reset at turn start)
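To make the interaction between the remaining <= 1 trigger, the budget reset, and the cap of 3 emergency compressions easier to reason about, here is a small simulation. It is a sketch under stated assumptions, not the code in run_agent.py: run_turn, context_above_threshold, and MAX_EMERGENCY_COMPRESSIONS are hypothetical names, compression itself is not modelled, and the plain integer used stands in for the iteration budget.

```python
MAX_EMERGENCY_COMPRESSIONS = 3  # cap described in the review comment


def run_turn(max_iterations: int, context_above_threshold: bool = True) -> tuple[int, int]:
    """Simulate one turn; return (iterations executed, emergency compressions)."""
    used = 0                         # stand-in for the budget's consumption
    emergency_compression_count = 0  # reset at turn start
    total_iterations = 0

    while True:
        remaining = max_iterations - used
        if (
            remaining <= 1
            and context_above_threshold
            and emergency_compression_count < MAX_EMERGENCY_COMPRESSIONS
        ):
            # Emergency path: compress (not modelled here), then reset the budget.
            emergency_compression_count += 1
            used = 0
            continue
        if remaining <= 0:
            # Budget exhausted and no resets left: this is where
            # _handle_max_iterations would take over.
            break
        used += 1
        total_iterations += 1

    return total_iterations, emergency_compression_count


print(run_turn(10))         # (37, 3)
print(run_turn(10, False))  # (10, 0)
```

In this sketch a turn with max_iterations=10 runs 37 iterations, just under the 4x upper bound mentioned in the review, because each reset fires while one iteration is still unspent. Whether the real loop behaves the same way depends on where the check sits relative to the budget consumption, which is exactly the kind of detail the requested integration tests would pin down.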