fix(agent): prevent infinite context compression loop with small windows #40976
Open
iamlukethedev wants to merge 1 commit into
Fixes NousResearch#40803: Infinite Context Compaction Loop (messages=N->N) on low context_length / limit configurations

Problem
When summary_target_ratio is set high (e.g., 0.45) with a constrained context_length, the tail budget becomes so large that the compression window shrinks to <5 messages. Compressing only 1-2 messages yields 0 token savings (due to summary overhead), triggering compression again on the next turn, creating an infinite loop.

Solution
This fix detects when the compression window is too small (<5 messages) and would result in minimal token savings (<2000 tokens). When detected, it falls back to a more aggressive tail budget (50% of original) to create a larger compression window.

New safety checks prevent:
1. No-op compressions that save 0 tokens
2. Infinite compression loops on constrained configs
3. Wasted API calls and latency on every message turn

Testing
Added 6 comprehensive tests verifying:
1. Small windows trigger fallback
2. Fallback creates larger compression window
3. Large windows don't unnecessarily trigger fallback
4. Fallback prevents no-op compression
5. Window size calculation is correct
6. Token estimation in windows is accurate

All 6 tests pass.

Fixes NousResearch#40803
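For reviewers, the fallback described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function names, the constants, and the way the tail is measured are all hypothetical. The formula that derives the tail budget from summary_target_ratio and context_length is not shown in this description, so the sketch takes the tail budget as an input. It also approximates the expected savings by the token count of the compression window, which is an assumption.

```python
from typing import List, Tuple

# Thresholds quoted in the PR description; the constant names are hypothetical.
MIN_WINDOW_MESSAGES = 5     # windows smaller than this are suspect
MIN_TOKEN_SAVINGS = 2000    # windows holding fewer tokens than this are suspect
FALLBACK_TAIL_RATIO = 0.5   # fallback tail budget is 50% of the original


def tail_start(token_counts: List[int], tail_budget: int) -> int:
    """Return the index of the first message kept verbatim in the tail.

    Walks backwards from the newest message and keeps messages while
    they still fit inside tail_budget.
    """
    used = 0
    start = len(token_counts)
    for i in range(len(token_counts) - 1, -1, -1):
        if used + token_counts[i] > tail_budget:
            break
        used += token_counts[i]
        start = i
    return start


def choose_window(token_counts: List[int], tail_budget: int) -> Tuple[int, bool]:
    """Return (window_end, used_fallback).

    The compression window is token_counts[:window_end]; everything
    after it is the protected tail.
    """
    end = tail_start(token_counts, tail_budget)
    window_tokens = sum(token_counts[:end])
    # Assumption: both conditions must hold before the fallback kicks in.
    too_small = end < MIN_WINDOW_MESSAGES and window_tokens < MIN_TOKEN_SAVINGS
    if not too_small:
        return end, False
    # Shrink the tail budget so more messages fall into the window.
    reduced_budget = int(tail_budget * FALLBACK_TAIL_RATIO)
    return tail_start(token_counts, reduced_budget), True
```

With 20 messages of 400 tokens each and a tail budget of 7000, the initial window covers only 3 messages (1200 tokens), so the sketch halves the budget to 3500 and the window grows to 12 messages. With a tail budget of 2000 the window already covers 15 messages and no fallback is triggered.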