Bug report draft: /compress reports success even when no compaction occurs
Suggested title:
/compress can report a successful compression even when the transcript is unchanged
Summary
The Telegram/manual gateway /compress path can return a success banner like:
🗜️ Compressed: 19 → 19 messages
~11,726 → ~11,726 tokens
…even when no compaction actually happened.
This is a false positive. The underlying compressor can return the original message list unchanged, but the gateway still rewrites the transcript and formats a success banner as if compression succeeded.
Actual behavior
Running /compress in a gateway session can produce identical before/after message counts and token estimates, with no visible summary inserted and no history trimmed.
Example observed output:
🗜️ Compressed: 19 → 19 messages
~11,726 → ~11,726 tokens
Expected behavior
One of these should happen:
-
Actual compaction occurs:
- message and/or token counts decrease, or
- a summary/handoff message is inserted and older turns are pruned.
-
If compaction is not possible, the command should return an explicit no-op message, e.g.:
Nothing to compress yet.
No compression performed: only 19 compressible messages; current settings require more history.
It should not claim success when the transcript is unchanged.
Root cause
This appears to be a combination of two behaviors:
ContextCompressor.compress() intentionally returns the input unchanged for small histories.
- The manual gateway
/compress handler does not check whether compression actually changed anything before returning the success banner.
Why this happens
The gateway manual /compress path first filters the transcript down to only user and assistant messages:
gateway/run.py around lines 4996-5000
msgs = [
{"role": m.get("role"), "content": m.get("content")}
for m in history
if m.get("role") in ("user", "assistant") and m.get("content")
]
Then it creates a temporary AIAgent and calls _compress_context(...) with quiet_mode=True:
gateway/run.py around lines 5004-5018
tmp_agent = AIAgent(
...,
quiet_mode=True,
...,
)
compressed, _ = await loop.run_in_executor(
None,
lambda: tmp_agent._compress_context(msgs, "", approx_tokens=approx_tokens)
)
The compressor itself has an early return for small histories:
agent/context_compressor.py around lines 578-586
if n_messages <= self.protect_first_n + self.protect_last_n + 1:
...
return messages
With the current defaults, that threshold is effectively:
protect_first_n = 3
protect_last_n = 20
- so compression requires
> 24 messages to do any work.
Those defaults come from run_agent.py:
run_agent.py around lines 1093-1097 and 1132-1136
compression_protect_last = int(_compression_cfg.get("protect_last_n", 20))
...
self.context_compressor = ContextCompressor(
...,
protect_first_n=3,
protect_last_n=compression_protect_last,
)
So a 19-message user/assistant history is not compressible by design and is returned unchanged.
However, the gateway handler unconditionally formats a success banner afterward:
gateway/run.py around lines 5037-5040
return (
f"🗜️ Compressed: {original_count} → {new_count} messages\n"
f"~{approx_tokens:,} → ~{new_tokens:,} tokens"
)
There is no guard for:
compressed == msgs
new_count == original_count and new_tokens == approx_tokens
- a no-op reason such as
too_few_messages
Because quiet_mode=True, the internal warning about not being able to compress is also suppressed, so the failure mode is silent.
Minimal reproduction
This can be reproduced locally without Telegram by calling the same compression path with a 19-message alternating user/assistant history.
Example reproduction logic:
from run_agent import AIAgent
from agent.model_metadata import estimate_messages_tokens_rough
msgs = []
for i in range(19):
role = 'user' if i % 2 == 0 else 'assistant'
msgs.append({'role': role, 'content': f'message {i}'})
orig_tokens = estimate_messages_tokens_rough(msgs)
agent = AIAgent(
model='openai-codex/gpt-5.4',
quiet_mode=True,
max_iterations=1,
enabled_toolsets=['memory'],
session_id='compress_bug_repro',
)
compressed, _ = agent._compress_context(msgs, '', approx_tokens=orig_tokens)
print(len(msgs), len(compressed), compressed == msgs)
Observed result:
- before messages: 19
- after messages: 19
- transcript identical:
True
In a real gateway /compress flow, that unchanged result still gets presented as a successful compression.
Impact
- Misleading UX: users are told compression succeeded when nothing changed.
- Harder debugging: no-op cases look like silent failures or “fake compression.”
- Confusing telemetry: operators may assume compaction is working because the banner says it is.
- Manual
/compress is especially misleading because it filters to only user/assistant turns, so visible conversation length may look “long enough” while the compressible count is still below threshold.
Suggested fix direction
A good fix would be to return structured outcome metadata from compression, not just the output message list.
For example, manual /compress should be able to distinguish:
compressed
unchanged_too_few_messages
unchanged_boundary_collapse
unchanged_summary_unavailable
Minimum viable fix:
- After
_compress_context(...), detect whether the transcript materially changed.
- Only show the
🗜️ Compressed: banner if the transcript changed.
- Otherwise return a no-op explanation with the reason and current threshold.
Optional improvement:
- Make the manual
/compress response explicitly say it is counting only compressible user/assistant messages, or reconsider whether manual compression should evaluate the full transcript for eligibility.
- Revisit whether the default
compression.protect_last_n: 20 is too conservative for manual messaging sessions. Lowering it to 10 would reduce the minimum compressible size from >24 messages to >14 messages (given the hardcoded protect_first_n=3).
- Important nuance: lowering
protect_last_n is only a tuning/workaround. It does not fix the false-success bug by itself; the command still needs explicit no-op detection.
Not a duplicate of nearby issues
I checked a few adjacent issues before drafting this:
#499 — open enhancement about compaction quality and prompt design, not false-success reporting on no-op manual /compress
#2153 — closed bug about compression failing to trigger after API disconnects, different failure mode
#2771 — open bug about silent memory write failures in gateway sessions, similar "not surfaced to user" pattern but unrelated subsystem
Environment
- Repo:
NousResearch/hermes-agent
- Branch observed:
main
- Commit observed locally:
ff6a86cb529a372198b4b80d5e022e32a4a3f2cc
- Hermes version:
Hermes Agent v0.8.0 (2026.4.8)
- Python:
3.11.14
- OS:
Linux 6.17.0-14-generic #14-Ubuntu SMP PREEMPT_DYNAMIC Fri Jan 9 17:01:16 UTC 2026 x86_64 GNU/Linux
- Reproduced on the gateway/manual
/compress path and in a direct local Python reproduction
Relevant code locations
gateway/run.py — _handle_compress_command()
run_agent.py — AIAgent compressor defaults and _compress_context()
agent/context_compressor.py — ContextCompressor.compress() early no-op return
Bug report draft:
/compressreports success even when no compaction occursSuggested title:
/compresscan report a successful compression even when the transcript is unchangedSummary
The Telegram/manual gateway
/compresspath can return a success banner like:🗜️ Compressed: 19 → 19 messages~11,726 → ~11,726 tokens…even when no compaction actually happened.
This is a false positive. The underlying compressor can return the original message list unchanged, but the gateway still rewrites the transcript and formats a success banner as if compression succeeded.
Actual behavior
Running
/compressin a gateway session can produce identical before/after message counts and token estimates, with no visible summary inserted and no history trimmed.Example observed output:
🗜️ Compressed: 19 → 19 messages~11,726 → ~11,726 tokensExpected behavior
One of these should happen:
Actual compaction occurs:
If compaction is not possible, the command should return an explicit no-op message, e.g.:
Nothing to compress yet.No compression performed: only 19 compressible messages; current settings require more history.It should not claim success when the transcript is unchanged.
Root cause
This appears to be a combination of two behaviors:
ContextCompressor.compress()intentionally returns the input unchanged for small histories./compresshandler does not check whether compression actually changed anything before returning the success banner.Why this happens
The gateway manual
/compresspath first filters the transcript down to onlyuserandassistantmessages:gateway/run.pyaround lines 4996-5000Then it creates a temporary
AIAgentand calls_compress_context(...)withquiet_mode=True:gateway/run.pyaround lines 5004-5018The compressor itself has an early return for small histories:
agent/context_compressor.pyaround lines 578-586With the current defaults, that threshold is effectively:
protect_first_n = 3protect_last_n = 20> 24messages to do any work.Those defaults come from
run_agent.py:run_agent.pyaround lines 1093-1097 and 1132-1136So a 19-message user/assistant history is not compressible by design and is returned unchanged.
However, the gateway handler unconditionally formats a success banner afterward:
gateway/run.pyaround lines 5037-5040There is no guard for:
compressed == msgsnew_count == original_count and new_tokens == approx_tokenstoo_few_messagesBecause
quiet_mode=True, the internal warning about not being able to compress is also suppressed, so the failure mode is silent.Minimal reproduction
This can be reproduced locally without Telegram by calling the same compression path with a 19-message alternating user/assistant history.
Example reproduction logic:
Observed result:
TrueIn a real gateway
/compressflow, that unchanged result still gets presented as a successful compression.Impact
/compressis especially misleading because it filters to only user/assistant turns, so visible conversation length may look “long enough” while the compressible count is still below threshold.Suggested fix direction
A good fix would be to return structured outcome metadata from compression, not just the output message list.
For example, manual
/compressshould be able to distinguish:compressedunchanged_too_few_messagesunchanged_boundary_collapseunchanged_summary_unavailableMinimum viable fix:
_compress_context(...), detect whether the transcript materially changed.🗜️ Compressed:banner if the transcript changed.Optional improvement:
/compressresponse explicitly say it is counting only compressibleuser/assistantmessages, or reconsider whether manual compression should evaluate the full transcript for eligibility.compression.protect_last_n: 20is too conservative for manual messaging sessions. Lowering it to10would reduce the minimum compressible size from>24messages to>14messages (given the hardcodedprotect_first_n=3).protect_last_nis only a tuning/workaround. It does not fix the false-success bug by itself; the command still needs explicit no-op detection.Not a duplicate of nearby issues
I checked a few adjacent issues before drafting this:
#499— open enhancement about compaction quality and prompt design, not false-success reporting on no-op manual/compress#2153— closed bug about compression failing to trigger after API disconnects, different failure mode#2771— open bug about silent memory write failures in gateway sessions, similar "not surfaced to user" pattern but unrelated subsystemEnvironment
NousResearch/hermes-agentmainff6a86cb529a372198b4b80d5e022e32a4a3f2ccHermes Agent v0.8.0 (2026.4.8)3.11.14Linux 6.17.0-14-generic #14-Ubuntu SMP PREEMPT_DYNAMIC Fri Jan 9 17:01:16 UTC 2026 x86_64 GNU/Linux/compresspath and in a direct local Python reproductionRelevant code locations
gateway/run.py—_handle_compress_command()run_agent.py—AIAgentcompressor defaults and_compress_context()agent/context_compressor.py—ContextCompressor.compress()early no-op return