fix(compaction): token-budget primary tail protection by teknium1 · Pull Request #6453 · NousResearch/hermes-agent

teknium1 · 2026-04-09T06:36:15Z

Summary

Salvage of PR #6240 by @BongSuCHOI (cherry-picked onto current main) with added test coverage.

Problem

Tail protection during compaction uses protect_last_n=20 (a hard message count). If those 20 messages include large tool outputs (50K+ chars each), the protected tail can total 200K+ tokens — leaving almost nothing for the compressor to summarize. Compaction fires frequently (especially at the 50% threshold) but accomplishes nothing.

Fix

Make token budget the primary criterion for tail protection:

	Before	After
Tail min messages	20 (`protect_last_n`)	3 (hard minimum)
Tail budget	Ignored	Primary — derived from `threshold × summary_target_ratio`
Soft ceiling	N/A	1.5× budget (avoids splitting oversized messages)
Min messages for compression	head + 20 + 1 = 24	head + 3 + 1 = 7

Example (200K context, 50% threshold, 20K tail budget)

Before: 20 messages protected (could be 200K tokens) → almost nothing to summarize
After: ~20K tokens of tail protected (~5 normal msgs, or 1 large tool output + 2 msgs) → 80K+ of middle content available for summarization

Changes

agent/context_compressor.py — token-budget tail in _find_tail_cut_by_tokens, token-budget prune in _prune_old_tool_results, lower compression guard (7 msgs)
tests/agent/test_context_compressor.py — 6 adapted existing tests + 6 new tests covering: large tool outputs no longer block compaction, min 3 tail guarantee, 1.5× soft ceiling, small conversation compression, token-budget prune path, message-count fallback

Cache impact

Zero. Changes only affect which messages survive compression and when compression triggers — the compression event itself is still a single cache break, same as before.

Test plan

42 compressor tests pass (36 existing + 6 new)
py_compile clean

Fixes the "compaction fires but accomplishes nothing" pattern reported alongside #6369.

Tail protection was effectively message-count based despite having a token budget, because protect_last_n=20 acted as a hard floor. A single 50K-token tool output would cause all 20 recent messages to be preserved regardless of budget, leaving little room for summarization. Changes: - _find_tail_cut_by_tokens: min_tail reduced from protect_last_n (20) to 3; token budget is now the primary criterion - Soft ceiling at 1.5x budget to avoid cutting mid-oversized-message - _prune_old_tool_results: accepts optional protect_tail_tokens so pruning also respects the token budget instead of a fixed count - compress() minimum message check relaxed from protect_first_n + protect_last_n + 1 to protect_first_n + 3 + 1 - Tool group alignment (no splitting tool_call/result) preserved

PR #6240 changed tail protection from protect_last_n to min(3, ...) which increased the minimum compressible message count and shifted tail boundaries. Three tests broke: - test_summary_role_avoids_consecutive_user_messages: 6→8 msgs - test_double_collision_user_head_assistant_tail: 7→8 msgs - test_no_collision_scenarios_still_work: 6→8 msgs All tests now exceed the new min_for_compress threshold (6) and maintain proper role alternation in both head and tail sections.

Tests for the new behavior paths: - Large tool outputs no longer block compaction (motivating scenario) - Hard minimum of 3 tail messages always protected - 1.5x soft ceiling for oversized messages - Small conversations still compress (min 8 messages) - Token-budget prune path in _prune_old_tool_results - Fallback to message-count when no token budget

BongSuCHOI and others added 3 commits April 8, 2026 23:33

teknium1 merged commit d40264d into main Apr 9, 2026
2 of 4 checks passed

teknium1 mentioned this pull request Apr 9, 2026

fix(compaction): token-budget primary tail protection #6240

Closed

SHL0MS mentioned this pull request Apr 11, 2026

[Tracking] /compress display bugs #7955

Closed

2 tasks

github-actions Bot mentioned this pull request Apr 15, 2026

chore: bump NousResearch/hermes-agent version from v2026.4.8 to v2026.4.13 Docker-Hub-sirmark/docker-hermes-agent#1

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(compaction): token-budget primary tail protection#6453

fix(compaction): token-budget primary tail protection#6453
teknium1 merged 3 commits into
mainfrom
hermes/hermes-b0a4b31e

teknium1 commented Apr 9, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

teknium1 commented Apr 9, 2026

Summary

Problem

Fix

Example (200K context, 50% threshold, 20K tail budget)

Changes

Cache impact

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants