feat(compressor): smart collapse, dedup, anti-thrashing, template upgrade, hardening by teknium1 · Pull Request #10088 · NousResearch/hermes-agent

teknium1 · 2026-04-15T05:21:19Z

Summary

Combined salvage of 5 compression improvement PRs by @kshitijk4poor (from tracking issue #9666):

feat(compressor): smart tool output collapse during context pruning #9661 — Smart tool output collapse (informative 1-line summaries instead of generic placeholder)
feat(compressor): structured action-log summary format #9663 — Structured action-log summary template with numbered actions and Active State
fix(compressor): anti-thrashing protection for ineffective compression loops #9674 — Anti-thrashing protection (skip after 2 consecutive <10% savings)
feat(compressor): deduplicate tool results + prune tool_call arguments #9677 — Dedup identical tool results + truncate large tool_call arguments
fix(compressor): hardening — max_tokens cap, multimodal safety, note fix, adaptive cooldown #9678 — Hardening: max_tokens cap, multimodal safety, note idempotency, adaptive cooldown

Follow-up fixes applied during salvage

web_extract: reads urls (list) not url (original PR bug)
Multimodal list content guards in dedup and prune passes (crash prevention)
Kept 'Relevant Files' section in template (original PR removed it)

Skipped

feat(compressor): preserve user messages verbatim during compression #9665 (user msg preservation) — duplication risk with LLM summary
feat(compressor): implement preflight compression check #9675 (preflight check) — dead code, run_agent.py never calls should_compress_preflight()

Tests

97/97 compressor tests pass. E2E validated: smart collapse, dedup, arg pruning, anti-thrashing, multimodal safety, note idempotency.

…rade, hardening Combined salvage of PRs #9661, #9663, #9674, #9677, #9678 by kshitijk4poor. - Smart tool output collapse: informative 1-line summaries replace generic placeholder - Dedup identical tool results via MD5 hash, truncate large tool_call arguments - Anti-thrashing: skip compression after 2 consecutive <10% savings passes - Structured action-log summary template with numbered actions and Active State - Hardening: max_tokens 1.3x cap, multimodal safety, note idempotency, adaptive cooldown Follow-up fixes applied during salvage: - web_extract: reads 'urls' (list) not 'url' (original PR bug) - Multimodal list content guards in dedup and prune passes - Kept 'Relevant Files' section in template (original PR removed it) Skipped PRs #9665 (user msg preservation — duplication risk) and #9675 (dead code).

ContextEngine.should_compress_preflight() is documented as the per-turn ingest entry for plugin engines, but run_agent.py never calls it. PR NousResearch#10088 explicitly noted this as dead code when skipping NousResearch#9675: > NousResearch#9675 (preflight check) — dead code, run_agent.py never calls > should_compress_preflight() This breaks plugin context engines that rely on the hook for per-turn message ingest. hermes-lcm overrides should_compress_preflight() to persist messages each turn into its DAG store, but with the hook never called, the lossless message store stays empty until compress() fires at the threshold (typically ~96K tokens). Reproducible: $ hermes chat -q "test" -Q $ sqlite3 ~/.hermes/lcm.db "SELECT COUNT(*) FROM messages;" 0 (Verified on hermes-agent v0.11.0 with hermes-lcm v0.7.0.) Add two calls to should_compress_preflight(messages): 1. Top of the main loop, right after api_call_count is incremented — per-turn ingest before each API call. 2. End of run_conversation(), before the on_session_end plugin hook — final flush so the last assistant message reaches the engine when the turn exited via the no-tool-calls branch and skipped the per-turn hook above. The return value is discarded; compression is still decided by the later should_compress(_real_tokens) call which uses the provider- reported token count. Both calls are wrapped in try/except so a misbehaving plugin engine cannot break the conversation loop. Default ContextEngine.should_compress_preflight() returns False with no work, so this is zero overhead for the built-in ContextCompressor and any engine that does not override the hook. After this fix: $ hermes chat -q "test" -Q $ sqlite3 ~/.hermes/lcm.db "SELECT COUNT(*) FROM messages;" 2 Refs: - NousResearch#9675 (closed: feat(compressor): implement preflight compression check) - NousResearch#10088 (merged body: skipped NousResearch#9675 as dead code) - stephenschoettler/hermes-lcm#68 (LCM author flagged host integration issue but could not file upstream because GitHub Issues was off on a different fork) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

teknium1 merged commit 9855190 into main Apr 15, 2026
3 of 5 checks passed

teknium1 deleted the hermes/hermes-24eb3c49 branch April 15, 2026 05:21

github-actions Bot mentioned this pull request Apr 24, 2026

chore: bump NousResearch/hermes-agent version from v2026.4.16 to v2026.4.23 Docker-Hub-sirmark/docker-hermes-agent#3

Merged

catgodtw mentioned this pull request Apr 25, 2026

fix(run_agent): wire up should_compress_preflight() per-turn ingest hook #15806

Open

23 tasks

konsisumer mentioned this pull request Jun 3, 2026

fix(agent): focus automatic compression on recent user turns #38155

Closed

11 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(compressor): smart collapse, dedup, anti-thrashing, template upgrade, hardening#10088

feat(compressor): smart collapse, dedup, anti-thrashing, template upgrade, hardening#10088
teknium1 merged 1 commit into
mainfrom
hermes/hermes-24eb3c49

teknium1 commented Apr 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

teknium1 commented Apr 15, 2026

Summary

Follow-up fixes applied during salvage

Skipped

Tests

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants