feat(compressor): smart collapse, dedup, anti-thrashing, template upgrade, hardening#10088
Merged
Conversation
…rade, hardening Combined salvage of PRs #9661, #9663, #9674, #9677, #9678 by kshitijk4poor. - Smart tool output collapse: informative 1-line summaries replace generic placeholder - Dedup identical tool results via MD5 hash, truncate large tool_call arguments - Anti-thrashing: skip compression after 2 consecutive <10% savings passes - Structured action-log summary template with numbered actions and Active State - Hardening: max_tokens 1.3x cap, multimodal safety, note idempotency, adaptive cooldown Follow-up fixes applied during salvage: - web_extract: reads 'urls' (list) not 'url' (original PR bug) - Multimodal list content guards in dedup and prune passes - Kept 'Relevant Files' section in template (original PR removed it) Skipped PRs #9665 (user msg preservation — duplication risk) and #9675 (dead code).
This was referenced Apr 15, 2026
Closed
23 tasks
Ataraksea
pushed a commit
to Ataraksea/hermes-agent
that referenced
this pull request
May 6, 2026
ContextEngine.should_compress_preflight() is documented as the per-turn ingest entry for plugin engines, but run_agent.py never calls it. PR NousResearch#10088 explicitly noted this as dead code when skipping NousResearch#9675: > NousResearch#9675 (preflight check) — dead code, run_agent.py never calls > should_compress_preflight() This breaks plugin context engines that rely on the hook for per-turn message ingest. hermes-lcm overrides should_compress_preflight() to persist messages each turn into its DAG store, but with the hook never called, the lossless message store stays empty until compress() fires at the threshold (typically ~96K tokens). Reproducible: $ hermes chat -q "test" -Q $ sqlite3 ~/.hermes/lcm.db "SELECT COUNT(*) FROM messages;" 0 (Verified on hermes-agent v0.11.0 with hermes-lcm v0.7.0.) Add two calls to should_compress_preflight(messages): 1. Top of the main loop, right after api_call_count is incremented — per-turn ingest before each API call. 2. End of run_conversation(), before the on_session_end plugin hook — final flush so the last assistant message reaches the engine when the turn exited via the no-tool-calls branch and skipped the per-turn hook above. The return value is discarded; compression is still decided by the later should_compress(_real_tokens) call which uses the provider- reported token count. Both calls are wrapped in try/except so a misbehaving plugin engine cannot break the conversation loop. Default ContextEngine.should_compress_preflight() returns False with no work, so this is zero overhead for the built-in ContextCompressor and any engine that does not override the hook. After this fix: $ hermes chat -q "test" -Q $ sqlite3 ~/.hermes/lcm.db "SELECT COUNT(*) FROM messages;" 2 Refs: - NousResearch#9675 (closed: feat(compressor): implement preflight compression check) - NousResearch#10088 (merged body: skipped NousResearch#9675 as dead code) - stephenschoettler/hermes-lcm#68 (LCM author flagged host integration issue but could not file upstream because GitHub Issues was off on a different fork) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
11 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Combined salvage of 5 compression improvement PRs by @kshitijk4poor (from tracking issue #9666):
Follow-up fixes applied during salvage
web_extract: readsurls(list) noturl(original PR bug)Skipped
should_compress_preflight()Tests
97/97 compressor tests pass. E2E validated: smart collapse, dedup, arg pruning, anti-thrashing, multimodal safety, note idempotency.