fix: emergency compression before max_iterations exhaustion #18607

Open
gzsiang wants to merge 2 commits into NousResearch:main from gzsiang:fix/emergency-compression-2

@gzsiang gzsiang commented May 2, 2026

Problem

When max_iterations (default 90) was exhausted before the compression
threshold was reached, the agent was killed without compressing.

This happened when:

  • Provider doesn't return usage data (last_prompt_tokens == 0, so should_compress(0) never fires)
  • Tool outputs are small per iteration (context accumulates slowly, never reaches threshold in 90 iterations)

Result: Agent hits max_iterations, calls _handle_max_iterations, and the session ends mid-work.
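
A minimal sketch of the two failure cases above, assuming a simple "reported tokens >= threshold" check. The function names, the 100,000-token threshold, and the per-iteration token figures are illustrative assumptions, not values taken from run_agent.py:

```python
from typing import Optional


def should_compress(prompt_tokens: int, threshold_tokens: int) -> bool:
    # Hypothetical stand-in for the regular compression check.
    return prompt_tokens >= threshold_tokens


def first_compression_iteration(
    reported_tokens_per_iteration: int,
    threshold_tokens: int,
    max_iterations: int = 90,
) -> Optional[int]:
    """Return the iteration at which compression would fire, or None if the
    iteration budget runs out first."""
    reported = 0
    for iteration in range(1, max_iterations + 1):
        reported += reported_tokens_per_iteration
        if should_compress(reported, threshold_tokens):
            return iteration
    return None


# Case 1: provider reports no usage, so the check always sees 0 tokens.
no_usage = first_compression_iteration(0, threshold_tokens=100_000)

# Case 2: small tool outputs; 90 * 500 = 45,000 tokens stays under the threshold.
slow_growth = first_compression_iteration(500, threshold_tokens=100_000)

# For contrast: larger outputs cross the threshold well inside the budget.
fast_growth = first_compression_iteration(2_000, threshold_tokens=100_000)
```

In both of the first two cases the function returns None, which corresponds to the agent reaching max_iterations without ever compressing.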

Fix

Add emergency compression in the main loop that triggers when:

  • iteration budget is nearly exhausted (remaining <= 1)
  • compression is enabled
  • estimated message tokens exceed the compression threshold
  • fewer than 3 emergency compressions this turn

After compression, the budget resets and the agent continues.
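
The four trigger conditions can be read as a single predicate. The sketch below is only an illustration of that reading: the function name, parameter names, and the example token counts are assumptions, and the actual check in run_agent.py may be structured differently.

```python
MAX_EMERGENCY_COMPRESSIONS = 3  # per-turn cap stated in the description above


def should_emergency_compress(
    remaining: int,
    compression_enabled: bool,
    estimated_tokens: int,
    threshold_tokens: int,
    emergency_count: int,
) -> bool:
    """Hypothetical predicate combining the four conditions listed above."""
    return (
        remaining <= 1                                    # budget nearly exhausted
        and compression_enabled                           # compression is enabled
        and estimated_tokens > threshold_tokens           # estimate exceeds threshold
        and emergency_count < MAX_EMERGENCY_COMPRESSIONS  # fewer than 3 this turn
    )


# Last iteration, context above threshold, no emergency compressions yet.
fires = should_emergency_compress(1, True, 120_000, 100_000, 0)

# Same situation after three emergency compressions: the cap blocks a fourth.
capped = should_emergency_compress(1, True, 120_000, 100_000, 3)
```

Because the check uses an estimate of message tokens rather than provider-reported usage, it can still fire when last_prompt_tokens stays at 0.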

Changes

  • IterationBudget.reset() method
  • Emergency compression check before api_call_count += 1 in the main loop
  • _emergency_compression_count counter (reset at turn start)
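
For readers without the diff open, here is a rough sketch of what a reset() method on an iteration budget could look like. This class is a simplified stand-in written for illustration; apart from the names IterationBudget, reset, remaining, and used, which appear in this conversation, everything here (including max_total and consume) is an assumption and not the real implementation.

```python
class IterationBudget:
    """Hypothetical, simplified stand-in for the agent's iteration budget."""

    def __init__(self, max_total: int) -> None:
        self.max_total = max_total
        self._used = 0

    @property
    def used(self) -> int:
        return self._used

    @property
    def remaining(self) -> int:
        return max(0, self.max_total - self._used)

    def consume(self) -> bool:
        """Spend one iteration; return False once the budget is exhausted."""
        if self._used >= self.max_total:
            return False
        self._used += 1
        return True

    def reset(self) -> None:
        """Restore the full budget, as described for the post-compression step."""
        self._used = 0


budget = IterationBudget(max_total=90)
for _ in range(89):
    budget.consume()
# One iteration is left here; this is where the emergency check would run.
budget.reset()
# After the reset, used is 0 and remaining is back to 90.
```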

alt-glitch added the type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels May 2, 2026

@liuhao1024 liuhao1024 left a comment

Contributor

The emergency compression mechanism is a solid idea for handling the "provider doesn't return usage data" edge case. Two concerns:

1. No tests for a complex behavioral change

This PR adds a new IterationBudget.reset() method and a new code path that can extend the agent's effective iteration budget up to 4x max_iterations (original + 3 resets). The diff only touches run_agent.py — no test file. Given the interaction between _emergency_compression_count, iteration_budget.reset(), api_call_count reset, and the _budget_grace_call gate, this deserves at least:

  • A unit test for IterationBudget.reset() (reset mid-consumption, verify remaining and used)
  • An integration test showing that when remaining <= 1 and context is above threshold, emergency compression fires and the budget resets
  • A test verifying the cap at 3 emergency compressions (4th exhaustion should hit _handle_max_iterations)
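
As a starting point for the cap test, the loop can be modelled with plain counters. This is a toy simulation under stated assumptions, not the real agent loop: it assumes the context stays above the threshold for the whole turn (worst case), that the check runs before each iteration is consumed, and that run_turn and its parameters are hypothetical names.

```python
from typing import Tuple


def run_turn(
    max_iterations: int,
    context_above_threshold: bool = True,
    cap: int = 3,
) -> Tuple[int, int]:
    """Simulate one turn; return (total iterations consumed, emergency compressions)."""
    remaining = max_iterations
    emergency_count = 0
    api_calls = 0
    while True:
        # Emergency check, modelled as running before each iteration is consumed.
        if remaining <= 1 and context_above_threshold and emergency_count < cap:
            emergency_count += 1
            remaining = max_iterations  # budget reset after compression
        if remaining <= 0:
            # Stands in for the _handle_max_iterations path.
            return api_calls, emergency_count
        remaining -= 1
        api_calls += 1


total, compressions = run_turn(90)
baseline, none_fired = run_turn(90, context_above_threshold=False)
```

In this model the turn ends after exactly 3 emergency compressions, and the total stays within the 4x max_iterations bound (4 * 90 - 3 = 357 here, since each reset fires with one iteration still unspent). A real integration test would need to assert the same cap against the actual loop, including the api_call_count reset and the _budget_grace_call gate.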

2. #TODO-issue placeholder in the comment block (line ~10718)

The comment has (#TODO-issue) — should reference the actual issue number before merging.

Cyrene963 pushed a commit to Cyrene963/hermes-agent that referenced this pull request May 3, 2026
Community PRs applied:
- NousResearch#18596: Enable secret redaction by default (SECURITY)
- NousResearch#18650: Sanitize malformed tool messages + auto-recover on API 400
- NousResearch#18607: Emergency compression before max_iterations exhaustion
- NousResearch#18603: Compression fallback to main model on 413 rate limit
- NousResearch#18638: Pass threshold_percent on model switch
- NousResearch#18663: Strip extra_content from tool_calls for strict APIs
- NousResearch#18618: Forward explicit_api_key to OpenRouter
- NousResearch#18632: Show cache tokens in /insights breakdown
- NousResearch#18614: Add idempotency guard for patch duplicate loops
- NousResearch#18600: Raise ValueError when HERMES_HOME unset in profile mode
- NousResearch#18616: Allow ZWJ emoji in context files
- NousResearch#18582: Reload .env on /restart
- NousResearch#18547: Stabilize system prompt prefix for KV cache reuse
- NousResearch#18692: Strip FTS5 operators from session search truncation terms

Fix: Add order_by_last_active=True to list_sessions_rich call
(pre-existing commit 142b4bf code sync)
gzsiang force-pushed the fix/emergency-compression-2 branch 2 times, most recently from 3ace50e to d5bb01e, May 4, 2026 16:53
gzsiang force-pushed the fix/emergency-compression-2 branch 2 times, most recently from 57ba10a to 68ed7a1, May 16, 2026 17:38
gzsiang force-pushed the fix/emergency-compression-2 branch from 68ed7a1 to 9bc74a0, May 17, 2026 12:31
gzsiang force-pushed the fix/emergency-compression-2 branch from f027690 to 2907c0a, May 17, 2026 12:50
gzsiang added a commit to gzsiang/hermes-agent that referenced this pull request Jun 6, 2026
Added Chinese description of fork features:
- Circuit breaker (NousResearch#16749)
- CLI Chinese localization (NousResearch#15282)
- Message embedding (NousResearch#18059)
- Emergency compression (NousResearch#18607)
