
fix: improve context compression quality — higher limits, tool tracking, degradation warning #4153

Closed
SHL0MS wants to merge 1 commit into NousResearch:main from SHL0MS:fix/compression-quality


SHL0MS commented Mar 31, 2026

Collaborator

Summary

Three targeted improvements to the compression system that address the most impactful remaining quality issues documented in #499.

1. Increase serializer truncation limits

The summarizer LLM was working from heavily truncated input:

| Content type | Before | After |
| --- | --- | --- |
| Tool results | 3,000 chars (2000+800) | 6,000 chars (4000+1500) |
| Assistant content | 3,000 chars | 6,000 chars |
| Tool call arguments | 500 chars | 1,500 chars |
| User messages | 3,000 chars | 6,000 chars |

A 200-line code edit or a full test output was cut to 3,000 chars before the summarizer ever ran, so the summary was built from incomplete data. The budget that matters here is the summary model's context window, not the main model's, so there is room to raise the limits.
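For illustration, the new limits could be applied roughly like this. This is a minimal sketch, not the code in `agent/context_compressor.py`: the constant names, the helper names, and the truncation marker are hypothetical, and it assumes the parenthesised figures in the table (4000+1500) are the head and tail that are kept once content exceeds the limit.

```python
# Hypothetical constants mirroring the "After" column of the table above.
CONTENT_MAX = 6000    # content longer than this gets truncated
CONTENT_HEAD = 4000   # chars kept from the start (assumed meaning of "4000")
CONTENT_TAIL = 1500   # chars kept from the end (assumed meaning of "1500")
TOOL_ARGS_MAX = 1500  # limit for serialized tool call arguments

# Hypothetical marker showing the summarizer that text was removed.
TRUNCATION_MARKER = "\n...[truncated]...\n"


def truncate_middle(text: str,
                    max_chars: int = CONTENT_MAX,
                    head: int = CONTENT_HEAD,
                    tail: int = CONTENT_TAIL) -> str:
    """Keep the head and tail of over-long content and drop the middle."""
    if len(text) <= max_chars:
        return text
    return text[:head] + TRUNCATION_MARKER + text[-tail:]


def truncate_args(args_json: str, max_chars: int = TOOL_ARGS_MAX) -> str:
    """Cut serialized tool call arguments to a fixed prefix."""
    if len(args_json) <= max_chars:
        return args_json
    return args_json[:max_chars] + "..."
```

Keeping both ends is a deliberate choice for tool output: the start usually shows what was run and the end usually shows the result or error.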

2. Add "Tools & Patterns" section to compression prompts

Both the first-pass and iterative compression prompts now include:

## Tools & Patterns
[Which tools were used, how they were used effectively, and any
tool-specific discoveries (e.g., preferred flags, working invocations,
successful command patterns)]

After compression, the agent retains tool definitions (from self.tools) but loses conversational context of HOW those tools were used effectively. This section preserves working invocations, preferred flags, and tool-specific discoveries.
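A rough sketch of how the section could be appended to a compression prompt. The section text is taken from this PR; `build_summary_prompt`, the example `## Goal` section, and the surrounding prompt wording are hypothetical and only stand in for the real templates.

```python
# Section text as quoted in this PR description.
TOOLS_AND_PATTERNS_SECTION = """## Tools & Patterns
[Which tools were used, how they were used effectively, and any
tool-specific discoveries (e.g., preferred flags, working invocations,
successful command patterns)]"""


def build_summary_prompt(base_sections: list[str], transcript: str) -> str:
    """Assemble a summarizer prompt that always ends with Tools & Patterns.

    `base_sections` stands in for whatever sections the real templates
    already contain; the function name and wording are illustrative.
    """
    sections = list(base_sections)
    # Add the section once, even if the caller already included it.
    if TOOLS_AND_PATTERNS_SECTION not in sections:
        sections.append(TOOLS_AND_PATTERNS_SECTION)
    template = "\n\n".join(sections)
    return (
        "Summarize the conversation below using this structure:\n\n"
        + template
        + "\n\nConversation:\n"
        + transcript
    )


if __name__ == "__main__":
    demo = build_summary_prompt(
        ["## Goal\n[What the user is trying to achieve]"],
        "user: run the tests\nassistant: (calls terminal tool)",
    )
    print(demo)
```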

3. Degradation warning on repeated compressions

From the 2nd compression onward, the user sees a warning with the current count, for example:

⚠️  Session compressed 3 times — accuracy may degrade. Consider /new to start fresh.

compression_count was already tracked (context_compressor.py:145) but never surfaced. Multiple compressions silently compound quality loss — "summaries of summaries" lose information with each pass.
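The check itself can be very small. A sketch, assuming the warning fires once `compression_count` reaches 2; the function name and threshold constant are hypothetical, and the real change lives in `run_agent.py` and may be wired differently.

```python
from typing import Optional

# Assumed threshold: warn from the 2nd compression onward.
WARN_FROM_COMPRESSION = 2


def degradation_warning(compression_count: int) -> Optional[str]:
    """Return the user-facing warning, or None if no warning is due yet."""
    if compression_count < WARN_FROM_COMPRESSION:
        return None
    return (
        f"⚠️  Session compressed {compression_count} times — "
        "accuracy may degrade. Consider /new to start fresh."
    )


if __name__ == "__main__":
    for count in (1, 2, 3):
        message = degradation_warning(count)
        if message is not None:
            print(message)
```

Returning the message instead of printing it keeps the check easy to test and leaves the display choice to the caller.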

Changes

24 lines changed, 9 added across agent/context_compressor.py and run_agent.py.

Ref #499

…ng, degradation warning

Three targeted improvements to the compression system:

1. Increase serializer truncation limits (3000→6000 chars for content,
   500→1500 for tool args). The summarizer LLM was working from heavily
   truncated data — a 200-line edit got cut to 3000 chars before the
   summarizer ever saw it.

2. Add '## Tools & Patterns' section to both compression prompt templates
   (first-pass and iterative). After compression, the agent retains its
   tool definitions but loses conversational context of HOW tools were
   used effectively. This section preserves working invocations, preferred
   flags, and tool-specific discoveries across compaction boundaries.

3. Warn users on 2nd+ compression: 'Session compressed N times — accuracy
   may degrade. Consider /new to start fresh.' The compression_count was
   tracked but never surfaced to the user.

Ref NousResearch#499

SHL0MS commented Mar 31, 2026

Collaborator Author

Superseded by the PR from the fix/compression-quality-v2 branch (rebased cleanly on latest main).
