feat(compressor): preserve user messages verbatim during compression by kshitijk4poor · Pull Request #9665 · NousResearch/hermes-agent

kshitijk4poor · 2026-04-14T13:01:21Z

Summary

When middle turns are compressed, user messages — preferences, corrections, and intent — get paraphrased or lost by the LLM summarizer. This PR preserves them verbatim.

What changed

New _extract_user_messages() static method on ContextCompressor that:

- Scans compressed turns for user messages
- Filters out system injections ([SYSTEM:...) and skill loads (---\nname:)
- Collects them newest-first within a 4K char budget
- Truncates individual messages at 300 chars
- Returns a formatted section appended to the summary

Wired into compress() as Phase 3b — after _generate_summary() returns, the preserved messages block is appended to the summary text.
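
The steps above can be sketched roughly as follows. This is not the PR's actual code: it is a standalone function rather than a static method on ContextCompressor, and the message shape ({"role", "content"} dicts), the `first_turn` parameter, the exact 4000-character value for the "4K" budget, the "..." truncation suffix, and the chronological output order are all assumptions made for illustration.

```python
from typing import Dict, List

MAX_TOTAL_CHARS = 4000   # assumed value for the "4K char budget"
MAX_MESSAGE_CHARS = 300  # per-message truncation limit from the PR description
HEADER = "## Preserved User Messages (verbatim, from compressed turns)"


def _is_injected(text: str) -> bool:
    """True for system injections and skill loads, which are not real user input."""
    stripped = text.lstrip()
    return stripped.startswith("[SYSTEM:") or stripped.startswith("---\nname:")


def extract_user_messages(turns: List[Dict[str, object]], first_turn: int = 0) -> str:
    """Build the preserved-messages section for the turns being compressed.

    `first_turn` (hypothetical) is the index of turns[0] in the full
    conversation and is used only to label each quoted message.
    """
    kept: List[str] = []
    used = 0
    # Walk newest-first so recent instructions win when the budget runs out.
    for offset in range(len(turns) - 1, -1, -1):
        msg = turns[offset]
        content = msg.get("content")
        # Skip non-user turns and non-string (e.g. multimodal list) content.
        if msg.get("role") != "user" or not isinstance(content, str):
            continue
        text = content.strip()
        if not text or _is_injected(text):
            continue
        if len(text) > MAX_MESSAGE_CHARS:
            text = text[:MAX_MESSAGE_CHARS] + "..."
        line = f'- Turn {first_turn + offset}: "{text}"'
        if used + len(line) > MAX_TOTAL_CHARS:
            break
        kept.append(line)
        used += len(line)
    if not kept:
        return ""
    kept.reverse()  # collected newest-first, displayed oldest-first
    return HEADER + "\n" + "\n".join(kept)
```

Under this sketch, compress() would append the returned string to the output of _generate_summary() only when it is non-empty.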

Example output

After compression, the summary includes:

## Preserved User Messages (verbatim, from compressed turns)
- Turn 8: "Use pytest.mark.parametrize — I prefer that style"
- Turn 15: "Wait — empty string should raise ValueError, not return None. Security reasons."
- Turn 19: "Run the full test suite before committing"

Why this matters

- "Use pytest.mark.parametrize" is a style preference the model needs to follow
- "empty string should raise ValueError" is a security correction that changes behavior
- These survive as exact quotes instead of being paraphrased as "user preferred strict validation"

Benchmark

Metric	Before	After
User preferences surviving compression	0/6	6/6
Extra cost per compression	—	~100 tokens

Test plan

- All 63 existing compressor/engine/focus tests pass
- No changes to public API

Part of #9666.

When middle turns are compressed, user messages (preferences, corrections, intent) get paraphrased or lost by the summarizer.

Add _extract_user_messages() that extracts verbatim user messages from the compressed zone (newest first, budget-capped at 4K chars) and appends them to the summary as a structured section:

## Preserved User Messages (verbatim, from compressed turns)
- Turn 8: "Use pytest.mark.parametrize -- I prefer that style"
- Turn 15: "Wait -- empty string should raise ValueError, not None"

Filters out system injections ([SYSTEM:...) and skill loads. Truncates long messages at 300 chars.

Cost: ~100 tokens per compression for the preserved messages block.

…rade, hardening

Combined salvage of PRs #9661, #9663, #9674, #9677, #9678 by kshitijk4poor.

- Smart tool output collapse: informative 1-line summaries replace generic placeholder
- Dedup identical tool results via MD5 hash, truncate large tool_call arguments
- Anti-thrashing: skip compression after 2 consecutive <10% savings passes
- Structured action-log summary template with numbered actions and Active State
- Hardening: max_tokens 1.3x cap, multimodal safety, note idempotency, adaptive cooldown

Follow-up fixes applied during salvage:

- web_extract: reads 'urls' (list) not 'url' (original PR bug)
- Multimodal list content guards in dedup and prune passes
- Kept 'Relevant Files' section in template (original PR removed it)

Skipped PRs #9665 (user msg preservation: duplication risk) and #9675 (dead code).
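
Two items from the salvage commit message above, MD5-based dedup of tool results and the anti-thrashing rule, could look roughly like this. The function name, the ThrashGuard class, the message shape, and the back-reference wording are hypothetical; only the MD5 hash, the multimodal list guard, the 10% threshold, and the two-pass limit come from the commit message.

```python
import hashlib
from typing import Dict, List


def dedup_tool_results(messages: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Replace repeated identical tool outputs with a short back-reference."""
    seen: Dict[str, int] = {}  # MD5 digest -> index of first occurrence
    result: List[Dict[str, object]] = []
    for index, msg in enumerate(messages):
        content = msg.get("content")
        # Guard: multimodal content is a list, not a string; leave it untouched.
        if msg.get("role") != "tool" or not isinstance(content, str):
            result.append(msg)
            continue
        digest = hashlib.md5(content.encode("utf-8")).hexdigest()
        if digest in seen:
            note = f"[duplicate of tool result at message {seen[digest]}]"
            result.append({**msg, "content": note})
        else:
            seen[digest] = index
            result.append(msg)
    return result


class ThrashGuard:
    """Skip compression after `limit` consecutive passes saving < `min_savings`."""

    def __init__(self, min_savings: float = 0.10, limit: int = 2) -> None:
        self.min_savings = min_savings
        self.limit = limit
        self.low_streak = 0

    def record(self, tokens_before: int, tokens_after: int) -> None:
        """Record one compression pass; a productive pass resets the streak."""
        savings = 1 - tokens_after / tokens_before if tokens_before else 0.0
        self.low_streak = self.low_streak + 1 if savings < self.min_savings else 0

    def should_skip(self) -> bool:
        return self.low_streak >= self.limit
```

The hash is used here only as a cheap equality check on tool output, not for any security purpose.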

teknium1 · 2026-04-15T05:21:48Z

Closing as part of the compression PR triage (#9666). #9665: the user message preservation concept is good but appending verbatim messages after the LLM summary risks duplication — would need to be injected into the summarizer prompt instead. #9675: should_compress_preflight() is never called by run_agent.py (the preflight loop does its own estimation directly). See #10088 for the merged improvements.
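
One way to read the suggestion above (inject the messages into the summarizer prompt rather than append them after the summary) is sketched below. The function name and the prompt wording are hypothetical and do not reflect how _generate_summary() actually builds its prompt.

```python
from typing import List


def build_summarizer_prompt(transcript: str, user_messages: List[str]) -> str:
    """Compose a prompt asking the summarizer to quote listed messages exactly once."""
    base = "Summarize the following conversation turns."
    if not user_messages:
        return base + "\n\n" + transcript
    quoted = "\n".join(f'- "{m}"' for m in user_messages)
    return (
        base + "\n"
        "Quote each user message listed below verbatim, exactly once, "
        "and do not paraphrase it elsewhere in the summary.\n\n"
        "User messages to preserve:\n" + quoted + "\n\n"
        "Conversation:\n" + transcript
    )
```

Because the summarizer sees the quotes before writing, it can place each one once in context, which is the duplication concern raised in the review; whether a given model reliably follows such an instruction would need to be checked.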


kshitijk4poor mentioned this pull request Apr 14, 2026

tracking: context compression improvements #9666

Closed

7 tasks

teknium1 mentioned this pull request Apr 15, 2026

feat(compressor): smart collapse, dedup, anti-thrashing, template upgrade, hardening #10088

Merged

teknium1 closed this Apr 15, 2026

This was referenced Apr 15, 2026

feat(compressor): implement preflight compression check #9675

Closed

Feature: Context Compaction Quality Overhaul — Handoff-Oriented Prompts, User Message Preservation, and Configurable Compaction (inspired by Codex CLI) #499

Closed

kshitijk4poor wants to merge 1 commit into NousResearch:main from kshitijk4poor:feat/preserve-user-messages
