
fix(compressor): prevent MINIMUM_CONTEXT_LENGTH floor from blocking compression at small contexts #15496

Closed
aj-nt wants to merge 1 commit into NousResearch:main from aj-nt:fix/context-compression-minimum-threshold


@aj-nt aj-nt commented Apr 25, 2026

Problem

When context_length equals MINIMUM_CONTEXT_LENGTH (64000 tokens), the max() floor in threshold_tokens calculation clamps the threshold to 64000 — which is 100% of the context window. Since should_compress() checks prompt_tokens >= threshold_tokens, and the API errors out before reaching 64000 tokens, compression never fires.

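The floor overrides the configured percentage for any model where context_length < MINIMUM_CONTEXT_LENGTH / threshold_percent, and it blocks compression entirely once context_length <= MINIMUM_CONTEXT_LENGTH: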

  • context_length=64000, threshold=0.70 → threshold = 64000 (100%) — compression never triggers
  • context_length=64000, threshold=0.50 → threshold = 64000 (100%) — compression never triggers
  • context_length=80000, threshold=0.70 → threshold = 64000 (80%) — works but threshold is higher than configured
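The three cases above can be reproduced with a standalone copy of the pre-fix calculation. This is a minimal sketch, not the project's code: the function name `floored_threshold` is hypothetical, and only the `max()` floor and the value 64000 are taken from this description.

```python
MINIMUM_CONTEXT_LENGTH = 64_000  # value stated in this PR


def floored_threshold(context_length: int, threshold_percent: float) -> int:
    """Pre-fix calculation: percentage-based threshold with an unconditional floor.

    Hypothetical helper for illustration only.
    """
    pct_based = int(context_length * threshold_percent)
    return max(pct_based, MINIMUM_CONTEXT_LENGTH)


# (context_length, threshold_percent) pairs from the list above
cases = [(64_000, 0.70), (64_000, 0.50), (80_000, 0.70)]

rows = []
for context_length, threshold_percent in cases:
    threshold = floored_threshold(context_length, threshold_percent)
    share = threshold / context_length  # fraction of the window the threshold occupies
    rows.append((context_length, threshold_percent, threshold, share))
    print(
        f"context={context_length} pct={threshold_percent:.2f} "
        f"-> threshold={threshold} ({share:.0%} of context)"
    )
```

In the first two rows the threshold equals the full window, so a check of the form `prompt_tokens >= threshold_tokens` cannot succeed before the request is rejected.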

The same bug exists in both __init__ and update_model().

Fix

After computing max(pct_based, MINIMUM_CONTEXT_LENGTH), check if the result >= context_length (and context_length > 0). If so, fall back to the percentage-based value.

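# Percentage-based threshold as configured
pct_based = int(context_length * threshold_percent)
# Apply the floor so large-context models do not compress prematurely
self.threshold_tokens = max(pct_based, MINIMUM_CONTEXT_LENGTH)
# A floor at or above the window can never be reached; revert to the percentage
if self.threshold_tokens >= self.context_length and self.context_length > 0:
    self.threshold_tokens = pct_based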

This preserves the floor's original intent (prevent premature compression on large-context models) while ensuring compression can actually trigger on models with context_length at or near 64000.

The floor still works correctly for large contexts — e.g. 200K at 30% gives 60000, clamped up to 64000 (32% of context, which is reasonable).
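Because the same calculation is needed in both __init__ and update_model(), one way to exercise the guarded version is a small stand-in class with a shared helper. This is a sketch under assumptions: `ThresholdSketch`, `_compute_threshold`, and the `update_model(context_length)` signature are hypothetical and do not reproduce the real compressor; only the guard itself and the default of 50% come from this PR.

```python
MINIMUM_CONTEXT_LENGTH = 64_000  # value stated in this PR


class ThresholdSketch:
    """Hypothetical stand-in for the compressor; only threshold logic is modelled."""

    def __init__(self, context_length: int, threshold_percent: float = 0.50) -> None:
        self.threshold_percent = threshold_percent
        self.update_model(context_length)

    def _compute_threshold(self, context_length: int) -> int:
        pct_based = int(context_length * self.threshold_percent)
        threshold = max(pct_based, MINIMUM_CONTEXT_LENGTH)
        # A floor at or above the window could never be reached,
        # so fall back to the configured percentage in that case.
        if threshold >= context_length and context_length > 0:
            threshold = pct_based
        return threshold

    def update_model(self, context_length: int) -> None:
        # Assumed signature: recompute the threshold when the model changes.
        self.context_length = context_length
        self.threshold_tokens = self._compute_threshold(context_length)

    def should_compress(self, prompt_tokens: int) -> bool:
        return prompt_tokens >= self.threshold_tokens


small = ThresholdSketch(64_000, threshold_percent=0.70)
print(small.threshold_tokens)         # 44800
print(small.should_compress(45_000))  # True

large = ThresholdSketch(200_000, threshold_percent=0.30)
print(large.threshold_tokens)         # 64000 (floor still applies)
```

Routing both entry points through one helper keeps the two code paths from drifting apart, though the actual patch may simply repeat the guard in each place.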

Testing

13 new tests in tests/agent/test_context_compression_threshold.py:

| Test | Covers |
| --- | --- |
| test_threshold_below_context_when_at_minimum | 64K context at 70% → 44800, not 64000 |
| test_threshold_below_context_at_minimum_with_default_percent | 64K at default 50% → 32000 |
| test_threshold_at_85_percent_minimum_context | 64K at 85% → 54400 |
| test_should_compress_fires_at_minimum_context | should_compress(true) at 45K with 64K context |
| test_should_compress_does_not_fire_below_threshold | should_compress(false) below threshold |
| test_floor_clamps_tiny_percentage_on_large_context | 200K at 30% → floor 64000 |
| test_large_context_percentage_above_floor | 200K at 70% → 140K (no floor) |
| test_threshold_never_equals_context_length_on_large_model | Floor < context on 200K |
| test_update_model_threshold_below_context_at_minimum | update_model() also fixed |
| test_update_model_preserves_floor_for_large_context | update_model() floor still works |
| test_context_length_slightly_above_minimum | 60K at 85% → 51000, not 64000 |
| test_threshold_at_boundary | 100% threshold → equals context (correct, user wants no compression) |
| test_context_length_below_minimum_at_zero | Degenerate zero context → floor wins |
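The numeric expectations in the table can be checked by hand against a standalone copy of the guarded calculation. The sketch below is not the contents of tests/agent/test_context_compression_threshold.py; `guarded_threshold` and the `expected` list are hypothetical names, and the values are the ones listed in the table.

```python
MINIMUM_CONTEXT_LENGTH = 64_000  # value stated in this PR


def guarded_threshold(context_length: int, threshold_percent: float) -> int:
    """Floor the threshold, but never let the floor reach the context window.

    Hypothetical helper mirroring the snippet in the Fix section.
    """
    pct_based = int(context_length * threshold_percent)
    threshold = max(pct_based, MINIMUM_CONTEXT_LENGTH)
    if threshold >= context_length and context_length > 0:
        threshold = pct_based
    return threshold


# (context_length, threshold_percent, expected threshold) from the table
expected = [
    (64_000, 0.70, 44_800),    # at minimum, 70%
    (64_000, 0.50, 32_000),    # at minimum, default 50%
    (64_000, 0.85, 54_400),    # at minimum, 85%
    (200_000, 0.30, 64_000),   # floor clamps a small percentage
    (200_000, 0.70, 140_000),  # percentage already above the floor
    (60_000, 0.85, 51_000),    # 60K window
    (64_000, 1.00, 64_000),    # 100% threshold equals the window
    (0, 0.50, 64_000),         # degenerate zero context: floor wins
]

results = [guarded_threshold(ctx, pct) for ctx, pct, _ in expected]
for (ctx, pct, want), got in zip(expected, results):
    assert got == want, f"context={ctx} pct={pct}: expected {want}, got {got}"
print("all threshold expectations hold")
```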

All 63 tests pass (50 existing + 13 new). No regressions.

Fixes #14690

…ompression at small contexts

When context_length equals MINIMUM_CONTEXT_LENGTH (64000), the max() floor
in threshold_tokens calculation clamps the threshold to 64000, which is 100%
of the context window. Since should_compress() checks prompt_tokens >=
threshold_tokens, and the API errors out before reaching 64000 tokens,
compression never fires.

The same bug exists in both __init__ and update_model().

Fix: after computing max(pct_based, MINIMUM_CONTEXT_LENGTH), check if the
result >= context_length (and context_length > 0). If so, fall back to the
percentage-based value. This preserves the floor's original intent (prevent
premature compression on large-context models) while ensuring compression
can actually trigger on models with context_length at or near 64000.

The floor still works correctly for large contexts (e.g. 200K at 30% gives
60000, which gets clamped up to 64000 — threshold is 32% of context, which
is reasonable).

Fixes NousResearch#14690
@alt-glitch alt-glitch added the type/bug (Something isn't working), P1 (High: major feature broken, no workaround), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels Apr 25, 2026
@alt-glitch

Collaborator

Duplicate of #15431: both fix the same MINIMUM_CONTEXT_LENGTH floor bug, where the threshold becomes 100% of context at 64K and compression can never trigger. Also overlaps #14696, which fixes this plus two additional compression bugs. See tracking issue #14690.

@aj-nt aj-nt closed this Apr 25, 2026
@aj-nt aj-nt deleted the fix/context-compression-minimum-threshold branch April 25, 2026 13:34

Development

Successfully merging this pull request may close these issues.

BUG: Context auto-compression never triggers when context_length == MINIMUM_CONTEXT_LENGTH (64000)

2 participants