fix(compaction): token-budget primary tail protection by BongSuCHOI · Pull Request #6240 · NousResearch/hermes-agent

BongSuCHOI · 2026-04-08T17:36:48Z

Problem

Tail protection in context compaction is effectively message-count based despite having a token budget. protect_last_n=20 acts as a hard floor, so a single 50K-token tool output (file read, large API response) causes all 20 recent messages to be preserved — even if they total 200K+ tokens. This leaves almost nothing to summarize, making compaction nearly useless in long-tool-output sessions.

Current behavior (200K context, 50K threshold):

tail_token_budget = 20K tokens  (threshold × summary_target_ratio)
protect_last_n = 20 messages    ← this wins every time

If the last 5 messages total 120K tokens, all 20 are still protected → only head + tiny middle gets summarized.
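
The arithmetic above can be reproduced with a small model. This is a sketch under assumptions, not the code in context_compressor.py: `old_tail_size` is a hypothetical helper, and the 0.4 ratio is only inferred from 20K / 50K.

```python
def old_tail_size(token_counts, tail_token_budget, protect_last_n=20):
    """Hypothetical model of the pre-PR behavior: how many trailing
    messages end up protected when protect_last_n is a hard floor."""
    used = 0
    by_budget = 0
    # Walk backward, counting messages that fit inside the token budget.
    for tokens in reversed(token_counts):
        if used + tokens > tail_token_budget:
            break
        used += tokens
        by_budget += 1
    # The message-count floor wins whenever the budget protects fewer.
    return min(len(token_counts), max(by_budget, protect_last_n))


threshold = 50_000
summary_target_ratio = 0.4  # assumption: inferred from 20K / 50K
tail_token_budget = int(threshold * summary_target_ratio)

# 30 messages: 25 small ones (1K each), then 5 large ones (24K each = 120K).
counts = [1_000] * 25 + [24_000] * 5
protected = old_tail_size(counts, tail_token_budget)
protected_tokens = sum(counts[-protected:])
```

In this model the budget alone would protect zero messages (the newest one already exceeds 20K), yet the floor still protects all 20.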

Solution

Make token budget the primary criterion for tail protection, with a small message-count floor for safety:

- min_tail reduced from protect_last_n (20) → 3 messages (hard minimum)
- The tail may overshoot the budget, up to a 1.5× soft ceiling, to avoid cutting in the middle of an oversized message
- If even 3 messages exceed 1.5× budget, compression still runs (cut after head)
- _prune_old_tool_results also respects the token budget (new protect_tail_tokens param)
- Tool group alignment (no splitting tool_call/result pairs) is preserved
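
The rules above could be sketched roughly as follows. `find_tail_cut` and its parameters are hypothetical names for illustration; the real _find_tail_cut_by_tokens may differ in its details, and tool group alignment is left out here.

```python
def find_tail_cut(token_counts, tail_token_budget, min_tail=3, soft_factor=1.5):
    """Return the index where the protected tail starts.

    Hypothetical sketch: walk backward from the newest message, always
    keep min_tail messages, then keep adding messages while the running
    total stays within soft_factor * tail_token_budget.
    """
    n = len(token_counts)
    soft_ceiling = tail_token_budget * soft_factor
    used = 0
    cut = n
    for i in range(n - 1, -1, -1):
        kept = n - i  # tail size if message i is included
        # Past the floor, refuse a message that would break the soft ceiling.
        if kept > min_tail and used + token_counts[i] > soft_ceiling:
            break
        used += token_counts[i]
        cut = i
        # Stop growing once the floor is met and the budget is spent.
        if kept >= min_tail and used >= tail_token_budget:
            break
    return cut
```

With thirty 4K-token messages and a 20K budget, this sketch protects the last five messages. When the last three messages alone exceed the soft ceiling, it still returns a cut that keeps exactly those three.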

Changes

| Component | Before | After |
| --- | --- | --- |
| Tail min messages | 20 (protect_last_n) | 3 |
| Tail budget enforcement | Hard floor at 20 msgs | Soft ceiling at 1.5× budget |
| Prune boundary | protect_last_n × 3 = 60 msgs | token budget + min floor |
| Min messages for compression | head + 20 + 1 | head + 3 + 1 |

Example (200K context, 50K threshold, 20K tail budget)

Before: 20 messages protected (could be 200K tokens) → almost nothing to summarize
After: ~20K tokens of recent messages protected (~5 normal msgs, or 1 large tool output + 2 msgs) → much more middle content available for summarization

Backward compatibility

- _prune_old_tool_results new param is optional (defaults to None → old behavior)
- protect_last_n still exists as a config param, just no longer the tail floor
- _find_tail_cut_by_tokens signature unchanged
- No changes outside context_compressor.py
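
A minimal sketch of how an optional parameter can keep the old path intact. Everything here is an assumption for illustration: the function body, the placeholder text, the `min_tail` floor, and the 4-characters-per-token estimate are not taken from context_compressor.py.

```python
PLACEHOLDER = "[old tool output pruned]"  # hypothetical placeholder text


def estimate_tokens(message):
    """Crude token estimate (assumption: about 4 characters per token)."""
    return len(message["content"]) // 4


def prune_old_tool_results(messages, protect_tail_count,
                           protect_tail_tokens=None, min_tail=3):
    """Replace tool outputs before the protected tail with a placeholder."""
    n = len(messages)
    if protect_tail_tokens is None:
        # Old behavior: a fixed message count decides the boundary.
        boundary = max(0, n - protect_tail_count)
    else:
        # New behavior: token budget decides, with a small message floor.
        used = 0
        boundary = n
        for i in range(n - 1, -1, -1):
            cost = estimate_tokens(messages[i])
            if n - i > min_tail and used + cost > protect_tail_tokens:
                break
            used += cost
            boundary = i
    pruned = []
    for i, msg in enumerate(messages):
        if i < boundary and msg["role"] == "tool":
            msg = {**msg, "content": PLACEHOLDER}  # copy, do not mutate input
        pruned.append(msg)
    return pruned
```

Callers that never pass protect_tail_tokens take the first branch and see the same count-based boundary as before.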

Tail protection was effectively message-count based despite having a token budget, because protect_last_n=20 acted as a hard floor. A single 50K-token tool output would cause all 20 recent messages to be preserved regardless of budget, leaving little room for summarization.

Changes:
- _find_tail_cut_by_tokens: min_tail reduced from protect_last_n (20) to 3; token budget is now the primary criterion
- Soft ceiling at 1.5x budget to avoid cutting mid-oversized-message
- _prune_old_tool_results: accepts optional protect_tail_tokens so pruning also respects the token budget instead of a fixed count
- compress() minimum message check relaxed from protect_first_n + protect_last_n + 1 to protect_first_n + 3 + 1
- Tool group alignment (no splitting tool_call/result) preserved
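
The tool group alignment mentioned above can be illustrated with a short sketch. `align_cut_to_tool_group` is a hypothetical helper, and moving the cut earlier (growing the tail rather than shrinking it) is an assumption about the direction of the adjustment.

```python
def align_cut_to_tool_group(messages, cut):
    """Move the tail cut earlier until it no longer lands on a tool result.

    Hypothetical sketch: a tail that starts with a "tool" message would be
    separated from the assistant message that issued the tool call, so the
    cut is pulled back to include that assistant message.
    """
    while 0 < cut < len(messages) and messages[cut]["role"] == "tool":
        cut -= 1
    return cut
```

For roles `user, assistant, tool, tool, assistant, user`, a cut at index 3 moves back to index 1, so the assistant call and both tool results stay together in the tail.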

PR NousResearch#6240 changed tail protection from protect_last_n to min(3, ...) which increased the minimum compressible message count and shifted tail boundaries. Three tests broke:
- test_summary_role_avoids_consecutive_user_messages: 6→8 msgs
- test_double_collision_user_head_assistant_tail: 7→8 msgs
- test_no_collision_scenarios_still_work: 6→8 msgs

All tests now exceed the new min_for_compress threshold (6) and maintain proper role alternation in both head and tail sections.

BongSuCHOI · 2026-04-08T18:25:44Z

Test Fix: context_compressor min_tail=3 ✅ pushed

3 failing tests in test_context_compressor.py fixed and pushed to this branch:

| Test | Issue | Fix |
| --- | --- | --- |
| `test_summary_role_avoids_consecutive_user_messages` | 6 msgs = min threshold → returned unchanged | 6→8 msgs |
| `test_double_collision_user_head_assistant_tail` | Tail shift → consecutive assistant | 7→8 msgs |
| `test_no_collision_scenarios_still_work` | Same threshold issue | 6→8 msgs, fixed roles |

Root cause: PR changed min_tail from protect_last_n to min(3, ...), raising min_for_compress to 6. Tests with exactly 6 msgs hit the early return.
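
The threshold arithmetic can be checked with a short sketch. `should_skip_compression` is a hypothetical stand-in for the early return in compress(); protect_first_n = 2 is an assumption inferred from 6 = head + 3 + 1, and the `<=` comparison is inferred from the fact that tests with exactly 6 messages were returned unchanged.

```python
MIN_TAIL = 3  # hard minimum tail size introduced by this PR


def should_skip_compression(num_messages, protect_first_n):
    """Hypothetical early-return check: True when too few messages
    remain between head and tail for a summary to be worthwhile."""
    min_for_compress = protect_first_n + MIN_TAIL + 1
    return num_messages <= min_for_compress


# Assumption: the affected tests use protect_first_n = 2.
skipped_at_6 = should_skip_compression(6, protect_first_n=2)
skipped_at_8 = should_skip_compression(8, protect_first_n=2)
```

Under these assumptions a 6-message test is returned unchanged, while the 8-message versions go through compression.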

Remaining 10 failures are pre-existing on main (quick_commands print mock, HF model, docker env, vision tools).

BongSuCHOI · 2026-04-09T04:30:05Z

CI Failure Analysis

Verified that all 10 test failures are pre-existing on main and unrelated to this PR. The changes here only touch context_compressor.py and its tests — none of the failing test files were modified.

Test failures (all pre-existing):

- test_quick_commands.py (5): print mock issue from a recent CLI interface change
- test_auxiliary_named_custom_providers.py (1): custom: prefix normalization
- test_api_key_providers.py (1): HF model MiniMaxAI/MiniMax-M2.5 not yet added to DEFAULT_CONTEXT_LENGTHS
- test_docker_environment.py (2): Docker env var handling in CI
- test_vision_tools.py (1): codex auth check

build-and-push failure: pip install hits resolution-too-deep — also fails on the latest main commit. Upstream dependency resolution issue.

9329 tests passed. Safe to merge.

teknium1 · 2026-04-09T06:54:34Z

Merged via PR #6453 — your commits were cherry-picked onto current main with authorship preserved. Added 6 new tests covering the motivating large-tool-output scenario, min tail guarantee, 1.5x soft ceiling, and token-budget prune path. Great fix, @BongSuCHOI — this makes compaction actually effective in tool-heavy sessions!

BongSuCHOI · 2026-04-09T06:59:48Z

@teknium1
Thank you!! If it's not too much trouble, could you also take a look at #6239? I think it improves usability significantly.

teknium1 mentioned this pull request Apr 9, 2026

fix(compaction): token-budget primary tail protection #6453

Merged

teknium1 closed this Apr 9, 2026

BongSuCHOI wants to merge 2 commits into NousResearch:main from BongSuCHOI:feat/token-budget-tail-protection
