fix: improve context compression quality — higher limits, tool tracking, degradation warning by SHL0MS · Pull Request #4153 · NousResearch/hermes-agent

SHL0MS · 2026-03-31T03:16:44Z

Summary

Three targeted improvements to the compression system that address the most impactful remaining quality issues documented in #499.

1. Increase serializer truncation limits

The summarizer LLM was working from heavily truncated input:

Content type	Before	After
Tool results	3,000 chars (2000+800)	6,000 chars (4000+1500)
Assistant content	3,000 chars	6,000 chars
Tool call arguments	500 chars	1,500 chars
User messages	3,000 chars	6,000 chars

A 200-line code edit or a full test output was being cut to 3,000 chars before the summarizer ever ran, so the summary was built from incomplete data. The budget that matters here is the summary model's context window, not the main model's, so there is room to raise the limits.
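To illustrate the kind of truncation the table describes, here is a minimal sketch. It assumes the "(4000+1500)" figures mean that, once a tool result exceeds the 6,000-char limit, the first 4,000 and last 1,500 characters are kept and the middle is dropped. The function name, constant names, and marker text are hypothetical and not taken from `agent/context_compressor.py`.

```python
# Hypothetical constants mirroring the "After" column for tool results.
TOOL_RESULT_LIMIT = 6000  # truncate only when content is longer than this
TOOL_RESULT_HEAD = 4000   # characters kept from the start
TOOL_RESULT_TAIL = 1500   # characters kept from the end

# Assumed marker text; the real serializer may use a different one.
TRUNCATION_MARKER = "\n...[truncated]...\n"


def truncate_middle(text: str, limit: int, head: int, tail: int) -> str:
    """Return text unchanged if it fits in `limit`; otherwise keep the
    first `head` and last `tail` characters with a marker in between."""
    if len(text) <= limit:
        return text
    # Slice from an explicit start index so tail == 0 yields an empty tail.
    return text[:head] + TRUNCATION_MARKER + text[len(text) - tail:]


if __name__ == "__main__":
    long_output = "x" * 10_000
    shortened = truncate_middle(
        long_output, TOOL_RESULT_LIMIT, TOOL_RESULT_HEAD, TOOL_RESULT_TAIL
    )
    print(len(long_output), "->", len(shortened))
```

Keeping both ends is a common choice for tool output because the start usually holds the command or file header and the end usually holds the final status or error.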

2. Add "Tools & Patterns" section to compression prompts

Both the first-pass and iterative compression prompts now include:

## Tools & Patterns
[Which tools were used, how they were used effectively, and any
tool-specific discoveries (e.g., preferred flags, working invocations,
successful command patterns)]

After compression, the agent retains tool definitions (from self.tools) but loses conversational context of HOW those tools were used effectively. This section preserves working invocations, preferred flags, and tool-specific discoveries.
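As a rough sketch of how one section can be shared by both prompts, the snippet below appends the section text to two templates. The section text is the one quoted above. The two template strings and the helper name are hypothetical stand-ins, not the actual prompts in the repository.

```python
# Section text as quoted in this PR description.
TOOLS_SECTION = (
    "## Tools & Patterns\n"
    "[Which tools were used, how they were used effectively, and any\n"
    "tool-specific discoveries (e.g., preferred flags, working invocations,\n"
    "successful command patterns)]"
)

# Hypothetical placeholders for the first-pass and iterative prompts.
FIRST_PASS_TEMPLATE = (
    "Summarize the conversation so far.\n\n## Progress\n[What has been done]"
)
ITERATIVE_TEMPLATE = (
    "Update the previous summary with the new turns.\n\n"
    "## Progress\n[What has been done]"
)


def with_tools_section(template: str) -> str:
    """Append the Tools & Patterns section unless it is already present."""
    if "## Tools & Patterns" in template:
        return template
    return template.rstrip() + "\n\n" + TOOLS_SECTION + "\n"


if __name__ == "__main__":
    for name, template in [
        ("first-pass", FIRST_PASS_TEMPLATE),
        ("iterative", ITERATIVE_TEMPLATE),
    ]:
        print(f"--- {name} ---")
        print(with_tools_section(template))
```

Defining the section once and reusing it keeps the two prompts from drifting apart when the wording changes.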

3. Degradation warning on repeated compressions

From the 2nd compression onward, the user sees a warning such as:

⚠️  Session compressed 3 times — accuracy may degrade. Consider /new to start fresh.

compression_count was already tracked (context_compressor.py:145) but never surfaced. Multiple compressions silently compound quality loss — "summaries of summaries" lose information with each pass.
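The check itself can be sketched as below, assuming the warning fires once `compression_count` reaches 2 (the commit message says "2nd+ compression"). The function name and threshold constant are hypothetical; only `compression_count` and the message text come from this PR.

```python
from typing import Optional

# Assumed threshold: warn on the 2nd compression and every one after it.
WARN_FROM_COUNT = 2


def degradation_warning(compression_count: int) -> Optional[str]:
    """Return the warning text once the session has been compressed
    WARN_FROM_COUNT times or more; otherwise return None."""
    if compression_count < WARN_FROM_COUNT:
        return None
    return (
        f"⚠️  Session compressed {compression_count} times — "
        "accuracy may degrade. Consider /new to start fresh."
    )


if __name__ == "__main__":
    for count in range(1, 4):
        message = degradation_warning(count)
        if message is not None:
            print(message)
```

Returning the message instead of printing it leaves the caller (for example the loop in `run_agent.py`) free to decide how to display it.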

Changes

24 lines changed, 9 added across agent/context_compressor.py and run_agent.py.

Ref #499

…ng, degradation warning

Three targeted improvements to the compression system:

1. Increase serializer truncation limits (3000→6000 chars for content, 500→1500 for tool args). The summarizer LLM was working from heavily truncated data — a 200-line edit got cut to 3000 chars before the summarizer ever saw it.

2. Add '## Tools & Patterns' section to both compression prompt templates (first-pass and iterative). After compression, the agent retains its tool definitions but loses conversational context of HOW tools were used effectively. This section preserves working invocations, preferred flags, and tool-specific discoveries across compaction boundaries.

3. Warn users on 2nd+ compression: 'Session compressed N times — accuracy may degrade. Consider /new to start fresh.' The compression_count was tracked but never surfaced to the user.

Ref NousResearch#499

SHL0MS · 2026-03-31T03:26:08Z

Superseded by PR from fix/compression-quality-v2 branch (rebased cleanly on latest main).

SHL0MS closed this Mar 31, 2026

SHL0MS mentioned this pull request Mar 31, 2026

fix: improve context compression quality — higher limits, tool tracking, degradation warning #4156

Closed

SHL0MS deleted the fix/compression-quality branch April 1, 2026 19:28

SHL0MS wants to merge 1 commit into NousResearch:main from SHL0MS:fix/compression-quality
