fix(agent): prevent double-compression on turn immediately after compress run by ashishpatel26 · Pull Request #38133 · NousResearch/hermes-agent

ashishpatel26 · 2026-06-03T10:50:01Z

Problem

After compress() runs, the scheduler sets:

last_prompt_tokens = -1 (sentinel)
awaiting_real_usage_after_compression = True

But last_real_prompt_tokens still holds the old pre-compression value (above the threshold). On the very next preflight check, should_defer_preflight_to_real_usage() hit this branch:

if self.last_real_prompt_tokens >= self.threshold_tokens:
    return False   # incorrectly skips deferral

The branch returned False, which let should_compress(preflight_tokens) fire and trigger a second compression before the provider had reported real token usage for the now-shorter conversation.
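The failure mode can be reproduced in isolation. The following is a minimal sketch, not the actual hermes-agent code: `CompressorState` is a hypothetical stand-in, and the final `return True` is an assumption standing in for the real baseline/growth logic, which this thread does not show.

```python
from dataclasses import dataclass


@dataclass
class CompressorState:
    """Hypothetical stand-in for the compressor's token bookkeeping."""

    threshold_tokens: int
    last_prompt_tokens: int = 0
    last_real_prompt_tokens: int = 0
    awaiting_real_usage_after_compression: bool = False

    def should_defer_preflight_to_real_usage(self) -> bool:
        # Pre-fix behaviour as described above: the stale real-usage
        # reading is consulted and the awaiting flag is never checked.
        if self.last_real_prompt_tokens >= self.threshold_tokens:
            return False  # no deferral, so the preflight estimate may compress
        return True  # assumption: placeholder for the remaining logic


# State right after a compression, before the provider reports usage.
state = CompressorState(threshold_tokens=100_000)
state.last_real_prompt_tokens = 120_000  # stale pre-compression value
state.last_prompt_tokens = -1            # sentinel
state.awaiting_real_usage_after_compression = True

deferred = state.should_defer_preflight_to_real_usage()
print(deferred)  # False
```

Even though the flag is set, the stale reading wins and deferral is skipped, which is the window in which a second compression can fire.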

Root cause

awaiting_real_usage_after_compression exists precisely to guard this window, but should_defer_preflight_to_real_usage() never consulted it. The stale last_real_prompt_tokens value short-circuited deferral.

Fix

Add an early-return in should_defer_preflight_to_real_usage() (agent/context_compressor.py):

if self.awaiting_real_usage_after_compression:
    return True   # defer until update_from_response() clears the flag

update_from_response() already clears awaiting_real_usage_after_compression once real prompt_tokens arrive from the provider, so the guard is active for exactly one turn.
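The guard and its one-turn lifetime can be sketched as follows. This is an illustrative model under stated assumptions: `CompressorSketch` is hypothetical, the `update_from_response(prompt_tokens)` signature is simplified (the real method may take a usage object), and the final `return True` stands in for logic not shown in this thread.

```python
class CompressorSketch:
    """Hypothetical, simplified model of the compressor's deferral state."""

    def __init__(self, threshold_tokens: int) -> None:
        self.threshold_tokens = threshold_tokens
        self.last_prompt_tokens = 0
        self.last_real_prompt_tokens = 0
        self.awaiting_real_usage_after_compression = False

    def should_defer_preflight_to_real_usage(self) -> bool:
        # New guard: while real usage is still pending, always defer.
        if self.awaiting_real_usage_after_compression:
            return True
        if self.last_real_prompt_tokens >= self.threshold_tokens:
            return False
        return True  # assumption: placeholder for the remaining logic

    def update_from_response(self, prompt_tokens: int) -> None:
        # Assumed signature; records real usage and clears the flag.
        self.last_prompt_tokens = prompt_tokens
        self.last_real_prompt_tokens = prompt_tokens
        self.awaiting_real_usage_after_compression = False


compressor = CompressorSketch(threshold_tokens=100_000)
compressor.last_real_prompt_tokens = 120_000  # stale pre-compression value
compressor.last_prompt_tokens = -1
compressor.awaiting_real_usage_after_compression = True

# The guard wins over the stale reading while the flag is set.
deferred_while_waiting = compressor.should_defer_preflight_to_real_usage()

# The first real usage report ends the guarded window.
compressor.update_from_response(prompt_tokens=30_000)
flag_after_real_usage = compressor.awaiting_real_usage_after_compression
```

Placing the flag check first means the stale `last_real_prompt_tokens` value cannot short-circuit deferral, regardless of what the later branches do.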

Test

Added two cases to TestPreflightDeferral in tests/agent/test_context_compressor.py:

test_defers_immediately_after_compression_before_real_usage_arrives — verifies should_defer_preflight_to_real_usage returns True when the flag is set and rough tokens exceed threshold (the previously broken case).
test_no_longer_defers_after_real_usage_clears_flag — verifies normal baseline/growth deferral resumes once the flag is cleared.

Test plan

Existing TestPreflightDeferral tests still pass
TestUpdateFromResponse tests still pass (flag-clearing path unchanged)
Manual: start a long conversation that triggers compression; verify only one compression fires per threshold crossing, not two back-to-back

liuhao1024 · 2026-06-03T11:27:23Z

I verified the guard logic and the regression scenario — the test is correct that deferring when awaiting_real_usage_after_compression=True prevents double-compression. However, the flag is never set to True in production.

On current main, awaiting_real_usage_after_compression is:

Initialized to False at line 554 and line 655
Cleared to False in update_from_response() at line 696
Never assigned True anywhere

The compress() method (line 1827+) does not set this flag after a successful compression. So the new guard if self.awaiting_real_usage_after_compression: return True will never trigger — the flag is always False when should_defer_preflight_to_real_usage() runs.

The tests pass because they manually set compressor.awaiting_real_usage_after_compression = True, but no production code path does this.

Suggested fix: Add this to the compress() method, after the summary is generated and before returning the compressed messages:

# Park last_prompt_tokens at -1 so the preflight check in
# should_defer_preflight_to_real_usage() knows real usage
# hasn't arrived yet.
self.last_prompt_tokens = -1
self.awaiting_real_usage_after_compression = True

Without this, the PR fixes the symptom in tests but not in production.
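A regression test that drives the flag through compress() instead of assigning it by hand would likely have exposed this gap. The sketch below is hypothetical: `CompressorSketch` and its trivial "keep the last message" summarisation are placeholders, not the real compress() implementation.

```python
class CompressorSketch:
    """Hypothetical model; only the state relevant to the flag is kept."""

    def __init__(self, threshold_tokens: int) -> None:
        self.threshold_tokens = threshold_tokens
        self.last_prompt_tokens = 0
        self.awaiting_real_usage_after_compression = False

    def compress(self, messages: list) -> list:
        # Placeholder for summarisation: keep only the most recent message.
        compressed = messages[-1:]
        # The two assignments suggested above, made before returning.
        self.last_prompt_tokens = -1
        self.awaiting_real_usage_after_compression = True
        return compressed


def test_compress_arms_the_flag() -> None:
    compressor = CompressorSketch(threshold_tokens=100_000)
    # No manual flag assignment: the production path must set it.
    compressed = compressor.compress(["older turn", "old turn", "latest turn"])
    assert compressed == ["latest turn"]
    assert compressor.awaiting_real_usage_after_compression is True
    assert compressor.last_prompt_tokens == -1


test_compress_arms_the_flag()
```

The point of the design is that the assertion fails if compress() ever stops setting the flag, whereas a test that sets the flag itself cannot detect that.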

ashishpatel26 · 2026-06-04T07:03:00Z

Great catch — you're absolutely right. The flag and the guard were both present, but compress() never actually set awaiting_real_usage_after_compression = True, so the guard in should_defer_preflight_to_real_usage() could never fire in production. The tests passed only because they set the flag manually.

Fixed in the latest push: at the end of compress() (before return compressed), added:

self.last_prompt_tokens = -1
self.awaiting_real_usage_after_compression = True

This matches your suggested fix exactly. last_prompt_tokens = -1 is also important — the -1 or 0 truthiness bug (#36718 secondary mechanism) means a zero-check alone isn't sufficient, so the flag is the reliable signal.
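The thread does not spell out the truthiness bug, so the following is one plausible reading rather than a description of #36718. In Python, -1 is truthy and 0 is falsy, so a check that relies on the numeric value alone treats the -1 sentinel as if real usage were known. The helper names here are hypothetical.

```python
SENTINEL = -1  # value parked in last_prompt_tokens after compression


def usage_known_by_truthiness(last_prompt_tokens: int) -> bool:
    # Fragile: any non-zero value, including -1, counts as known usage.
    return bool(last_prompt_tokens)


def usage_known_by_flag(awaiting_real_usage_after_compression: bool) -> bool:
    # Robust: relies on the explicit flag, not on the numeric value.
    return not awaiting_real_usage_after_compression


# `or` returns its first truthy operand, so the sentinel survives the fallback.
fallback = SENTINEL or 0

print(usage_known_by_truthiness(SENTINEL))  # True
print(usage_known_by_truthiness(0))         # False
print(fallback)                             # -1
print(usage_known_by_flag(True))            # False
```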

After compress() runs, last_prompt_tokens is set to -1 and awaiting_real_usage_after_compression=True. last_real_prompt_tokens still holds the old pre-compression value (above threshold), so should_defer_preflight_to_real_usage() incorrectly returned False on the very next turn, letting the preflight estimate re-trigger a second compression before the API reported real usage for the shorter conversation.

Fix: add an early return in should_defer_preflight_to_real_usage() that defers any above-threshold preflight compression while the awaiting_real_usage_after_compression flag is set. The flag is cleared by update_from_response() once the first real prompt_tokens arrive from the provider, restoring normal behaviour.

Closes NousResearch#36718

…s() (NousResearch#36718)

The flag and guard were present but compress() never set awaiting_real_usage_after_compression=True, so should_defer_preflight_to_real_usage() always returned False in production. Setting last_prompt_tokens=-1 and the flag before returning ensures the preflight check defers until update_from_response() receives the real post-compress token count.

alt-glitch added the type/bug, comp/agent, and P1 labels on Jun 3, 2026

ashishpatel26 added 2 commits June 5, 2026 09:40

ashishpatel26 force-pushed the fix/compression-double-trigger-36718 branch from dc745d8 to 5e60b4e on June 5, 2026 04:10

ashishpatel26 wants to merge 2 commits into NousResearch:main from ashishpatel26:fix/compression-double-trigger-36718
