fix(agent): prevent infinite context compression loop with small windows #40976

Open
iamlukethedev wants to merge 1 commit into NousResearch:main from iamlukethedev:fix/40803-context-compaction-loop
@iamlukethedev

Fixes #40803: Infinite Context Compaction Loop (messages=N->N) on low context_length / limit configurations

Problem

When summary_target_ratio is set high (e.g., 0.45) with a constrained context_length, the tail budget becomes so large that the compression window shrinks to fewer than 5 messages. Compressing only 1-2 messages saves 0 tokens, because the summary overhead cancels out whatever was removed. Compression therefore triggers again on the next turn, and on every turn after that, creating an infinite loop (messages=N->N).
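To make the arithmetic concrete, here is a minimal sketch of how a large tail budget can squeeze the compression window. It assumes the tail budget is context_length * summary_target_ratio and that the newest messages are protected until that budget is used up; the function name compression_window and the token counts are hypothetical and are not taken from the PR's code.

```python
def compression_window(message_tokens, context_length, summary_target_ratio):
    """Return the indices of messages left to compress once the tail is protected.

    Assumption (not confirmed by the PR): the tail budget equals
    context_length * summary_target_ratio, and messages are protected
    newest-first until the next one would exceed that budget.
    """
    tail_budget = int(context_length * summary_target_ratio)
    protected = 0
    cut = len(message_tokens)
    # Walk backwards from the newest message, protecting as many as fit.
    for i in range(len(message_tokens) - 1, -1, -1):
        if protected + message_tokens[i] > tail_budget:
            break
        protected += message_tokens[i]
        cut = i
    # Everything before the protected tail is the compression window.
    return list(range(cut))


# Nine messages of 500 tokens each in an 8,000-token context.
history = [500] * 9
high_ratio = compression_window(history, context_length=8_000, summary_target_ratio=0.45)
low_ratio = compression_window(history, context_length=8_000, summary_target_ratio=0.20)
print(len(high_ratio), len(low_ratio))
```

With the 0.45 ratio the tail protects seven of the nine messages, leaving only two to summarize; with 0.20 the window grows to six messages.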

Solution

Detect when the compression window is too small (<5 messages) and would yield minimal token savings (<2000 tokens). When both conditions hold, fall back to a smaller tail budget (50% of the original) so that the compression window grows.
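The fallback described above could look roughly like the following sketch. The thresholds (5 messages, 2000 tokens, 50%) come from the description; the helper names split_at_tail and choose_tail_budget are hypothetical, and using the window's raw token count as a stand-in for the expected savings is an assumption, since the PR's actual estimate is not shown here.

```python
MIN_WINDOW_MESSAGES = 5    # threshold from the PR description
MIN_SAVINGS_TOKENS = 2000  # threshold from the PR description
FALLBACK_FACTOR = 0.5      # "50% of original" tail budget


def split_at_tail(message_tokens, tail_budget):
    """Return the index where the protected tail starts (hypothetical helper)."""
    protected = 0
    cut = len(message_tokens)
    for i in range(len(message_tokens) - 1, -1, -1):
        if protected + message_tokens[i] > tail_budget:
            break
        protected += message_tokens[i]
        cut = i
    return cut


def choose_tail_budget(message_tokens, tail_budget):
    """Halve the tail budget when the window is too small to be worth compressing."""
    cut = split_at_tail(message_tokens, tail_budget)
    # Assumption: tokens in the window approximate the achievable savings.
    window_tokens = sum(message_tokens[:cut])
    if cut < MIN_WINDOW_MESSAGES and window_tokens < MIN_SAVINGS_TOKENS:
        return int(tail_budget * FALLBACK_FACTOR)
    return tail_budget


history = [500] * 9
budget = choose_tail_budget(history, tail_budget=3600)
print(budget, split_at_tail(history, budget))
```

In this example the original budget of 3600 tokens leaves a two-message window, so the budget is halved to 1800 and the window grows to six messages. A longer history that already has a large window keeps its original budget.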

Testing

Added 6 tests verifying:

  1. Small windows trigger fallback
  2. Fallback creates larger compression window
  3. Large windows don't unnecessarily trigger fallback
  4. Fallback prevents no-op compression
  5. Window size calculation is correct
  6. Token estimation in windows is accurate

All 6 tests pass.
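For illustration, tests 1 and 3 from the list might be shaped like the sketch below. These are not the PR's tests: the predicate needs_fallback and its signature are hypothetical, and it assumes both thresholds must be crossed before the fallback fires.

```python
import unittest


def needs_fallback(window_messages, window_tokens, min_messages=5, min_tokens=2000):
    """Hypothetical predicate: the window is both short and light on tokens."""
    return window_messages < min_messages and window_tokens < min_tokens


class FallbackTriggerTest(unittest.TestCase):
    def test_small_window_triggers_fallback(self):
        # Two messages totalling 1000 tokens: below both thresholds.
        self.assertTrue(needs_fallback(window_messages=2, window_tokens=1000))

    def test_large_window_does_not_trigger_fallback(self):
        # Twelve messages totalling 6000 tokens: compression is worthwhile as is.
        self.assertFalse(needs_fallback(window_messages=12, window_tokens=6000))
```

Saved to a file, a sketch like this can be run with `python -m unittest`.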

New safety checks prevent:
1. No-op compressions that save 0 tokens
2. Infinite compression loops on constrained configs
3. Wasted API calls and latency on every message turn

@alt-glitch added the type/bug (Something isn't working), P1 (High: major feature broken, no workaround), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels on Jun 7, 2026
Labels

comp/agent: Core agent loop, run_agent.py, prompt builder
P1: High, major feature broken, no workaround
type/bug: Something isn't working

Development

Successfully merging this pull request may close these issues.

[Bug]: Infinite Context Compaction Loop (messages=N->N) on low context_length / limit configurations
