fix(compressor): prevent MINIMUM_CONTEXT_LENGTH floor from blocking compression at small contexts by aj-nt · Pull Request #15496 · NousResearch/hermes-agent

aj-nt · 2026-04-25T03:53:52Z

Problem

When context_length equals MINIMUM_CONTEXT_LENGTH (64000 tokens), the max() floor in threshold_tokens calculation clamps the threshold to 64000 — which is 100% of the context window. Since should_compress() checks prompt_tokens >= threshold_tokens, and the API errors out before reaching 64000 tokens, compression never fires.

This affects any model where context_length <= MINIMUM_CONTEXT_LENGTH / threshold_percent:

context_length=64000, threshold=0.70 → threshold = 64000 (100%) — compression never triggers
context_length=64000, threshold=0.50 → threshold = 64000 (100%) — compression never triggers
context_length=80000, threshold=0.70 → threshold = 64000 (80%) — works but threshold is higher than configured

The same bug exists in both __init__ and update_model().
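
The three cases listed above can be reproduced with a minimal sketch of the pre-fix calculation. The function name `threshold_before_fix` is hypothetical and stands in for the inline computation described in this PR; only `MINIMUM_CONTEXT_LENGTH` and its value of 64000 come from the description.

```python
MINIMUM_CONTEXT_LENGTH = 64_000  # value stated in the PR description


def threshold_before_fix(context_length: int, threshold_percent: float) -> int:
    """Hypothetical stand-in for the pre-fix logic: percentage first, then the floor."""
    pct_based = int(context_length * threshold_percent)
    # The floor is applied unconditionally, even when it reaches the full window.
    return max(pct_based, MINIMUM_CONTEXT_LENGTH)


for context_length, pct in [(64_000, 0.70), (64_000, 0.50), (80_000, 0.70)]:
    threshold = threshold_before_fix(context_length, pct)
    share = threshold / context_length
    print(f"context={context_length} pct={pct:.2f} -> threshold={threshold} ({share:.0%})")
```

In the first two cases the threshold equals the whole context window, so a `prompt_tokens >= threshold_tokens` check cannot succeed before the API rejects the request.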

Fix

After computing max(pct_based, MINIMUM_CONTEXT_LENGTH), check if the result >= context_length (and context_length > 0). If so, fall back to the percentage-based value.

pct_based = int(context_length * threshold_percent)
self.threshold_tokens = max(pct_based, MINIMUM_CONTEXT_LENGTH)
if self.threshold_tokens >= self.context_length and self.context_length > 0:
    self.threshold_tokens = pct_based

This preserves the floor's original intent (prevent premature compression on large-context models) while ensuring compression can actually trigger on models with context_length at or near 64000.

The floor still works correctly for large contexts — e.g. 200K at 30% gives 60000, clamped up to 64000 (32% of context, which is reasonable).
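
As a self-contained sketch of the rule described above, the calculation can be written as a free function. The name `compute_threshold_tokens` is an assumption made for illustration; in the PR the logic lives inline in `__init__` and `update_model()`.

```python
MINIMUM_CONTEXT_LENGTH = 64_000  # value stated in the PR description


def compute_threshold_tokens(context_length: int, threshold_percent: float) -> int:
    """Hypothetical helper mirroring the fix: floor first, then undo it if it fills the window."""
    pct_based = int(context_length * threshold_percent)
    threshold = max(pct_based, MINIMUM_CONTEXT_LENGTH)
    # If the floor pushed the threshold to or past the whole window,
    # compression could never fire, so use the configured percentage instead.
    if context_length > 0 and threshold >= context_length:
        threshold = pct_based
    return threshold


for context_length, pct in [(64_000, 0.70), (64_000, 0.50), (80_000, 0.70), (200_000, 0.30)]:
    print(context_length, pct, compute_threshold_tokens(context_length, pct))
```

Under this rule the 80000 at 0.70 case from the Problem section still resolves to 64000, because the floor stays below the context length there; only contexts at or below the floor fall back to the percentage.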

Testing

13 new tests in tests/agent/test_context_compression_threshold.py:

| Test | Covers |
| --- | --- |
| `test_threshold_below_context_when_at_minimum` | 64K context at 70% → 44800, not 64000 |
| `test_threshold_below_context_at_minimum_with_default_percent` | 64K at default 50% → 32000 |
| `test_threshold_at_85_percent_minimum_context` | 64K at 85% → 54400 |
| `test_should_compress_fires_at_minimum_context` | should_compress(true) at 45K with 64K context |
| `test_should_compress_does_not_fire_below_threshold` | should_compress(false) below threshold |
| `test_floor_clamps_tiny_percentage_on_large_context` | 200K at 30% → floor 64000 |
| `test_large_context_percentage_above_floor` | 200K at 70% → 140K (no floor) |
| `test_threshold_never_equals_context_length_on_large_model` | Floor < context on 200K |
| `test_update_model_threshold_below_context_at_minimum` | update_model() also fixed |
| `test_update_model_preserves_floor_for_large_context` | update_model() floor still works |
| `test_context_length_slightly_above_minimum` | 60K at 85% → 51000, not 64000 |
| `test_threshold_at_boundary` | 100% threshold → equals context (correct, user wants no compression) |
| `test_context_length_below_minimum_at_zero` | Degenerate zero context → floor wins |

All 63 tests pass (50 existing + 13 new). No regressions.
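
A few of the behaviours in the table can be exercised against a small stand-in class. Everything here is a sketch under assumptions: `SketchCompressor`, the private helper `_set_context_length`, and the `update_model(context_length)` signature are hypothetical and are not the project's actual API. Routing both entry points through one helper is just one way to keep `__init__` and `update_model()` from diverging; the PR itself applies the fix in both places.

```python
MINIMUM_CONTEXT_LENGTH = 64_000  # value stated in the PR description


class SketchCompressor:
    """Hypothetical stand-in, not the project's compressor class."""

    def __init__(self, context_length: int, threshold_percent: float = 0.50) -> None:
        self.threshold_percent = threshold_percent
        self._set_context_length(context_length)

    def update_model(self, context_length: int) -> None:
        # Assumed signature; the real method may take other arguments.
        self._set_context_length(context_length)

    def _set_context_length(self, context_length: int) -> None:
        self.context_length = context_length
        pct_based = int(context_length * self.threshold_percent)
        self.threshold_tokens = max(pct_based, MINIMUM_CONTEXT_LENGTH)
        # Fall back to the percentage when the floor fills the whole window.
        if self.context_length > 0 and self.threshold_tokens >= self.context_length:
            self.threshold_tokens = pct_based

    def should_compress(self, prompt_tokens: int) -> bool:
        return prompt_tokens >= self.threshold_tokens


compressor = SketchCompressor(context_length=64_000, threshold_percent=0.70)
assert compressor.threshold_tokens == 44_800
assert compressor.should_compress(45_000)
assert not compressor.should_compress(40_000)

# Switching to a large-context model goes through the same calculation.
compressor.update_model(context_length=200_000)
assert compressor.threshold_tokens == 140_000
```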

Fixes #14690


alt-glitch · 2026-04-25T04:06:07Z

Duplicate of #15431: both fix the same MINIMUM_CONTEXT_LENGTH floor bug, where the threshold becomes 100% of the context at 64K and prevents compression from ever triggering. Also overlaps #14696, which fixes this plus two additional compression bugs. See tracking issue #14690.

alt-glitch added the type/bug (Something isn't working), P1 (High: major feature broken, no workaround), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels Apr 25, 2026

devilardis mentioned this pull request Apr 25, 2026

fix(compression): three bugs causing auto-compression to never trigger #14696

Closed

aj-nt closed this Apr 25, 2026

aj-nt deleted the fix/context-compression-minimum-threshold branch April 25, 2026 13:34

aj-nt wants to merge 1 commit into NousResearch:main from aj-nt:fix/context-compression-minimum-threshold
