Skip to content

fix: deduplicate context pressure warnings in gateway mode#6309

Closed
duan78 wants to merge 1 commit into
NousResearch:mainfrom
duan78:fix/gateway-context-pressure-dedup
Closed

fix: deduplicate context pressure warnings in gateway mode#6309
duan78 wants to merge 1 commit into
NousResearch:mainfrom
duan78:fix/gateway-context-pressure-dedup

Conversation

@duan78

@duan78 duan78 commented Apr 8, 2026

Copy link
Copy Markdown

Summary

In the Hermes gateway (Telegram, Discord, Slack, etc.), every incoming user message creates a new AIAgent instance (see gateway/run.py ~line 4569 in run_sync()). The _context_pressure_warned flag is designed to prevent duplicate "Context compaction approaching" warnings within a single agent loop, but since it's initialized to False in __init__, it resets on every message.

Result: If a session's context is already above 85% of the compaction threshold, the warning fires on every single message the user sends. This can spam users with 3+ notifications per minute.

Fix

Added class-level time-based deduplication on AIAgent (persists across instances):

  1. Class-level dict _context_pressure_last_warned: dict[str, float] tracks the last warning timestamp per session_id
  2. 5-minute cooldown (_CONTEXT_PRESSURE_COOLDOWN = 300): same session won't receive the warning more than once per 5 minutes, even across new AIAgent instances
  3. Compression reset: when compression drops context below 85%, the cooldown entry is cleared, allowing a fresh warning cycle

Changes

  • run_agent.py: ~12 lines of logic (class var + cooldown check in emission + reset in compression)
  • tests/run_agent/test_context_pressure.py: 3 new test cases:
    • test_second_instance_within_cooldown_is_suppressed — verifies dedup works across instances
    • test_warning_fires_after_cooldown_expires — verifies warning returns after 5 min
    • test_compression_resets_dedup — verifies compression clears the cooldown

Validation

  • All 26 context pressure tests pass (18 existing + 8 new)
  • All 672 run_agent tests pass
  • No gateway changes needed — the fix is entirely within run_agent.py

Related

  • Same class of issue as the _context_pressure_warned design noted in reset_session_state() not clearing the flag (intentional, per comment at line 6015-6018)

In gateway mode, a new AIAgent is created per message. The instance-level
_context_pressure_warned flag was always False on fresh instances, causing
the 'Context compaction approaching' warning to fire on every message when
context was already above 85% of the compaction threshold.

Fix: add a class-level dict _context_pressure_last_warned keyed by
session_id that tracks the timestamp of the last warning emission. If less
than 5 minutes have passed since the last warning for a given session, the
warning is suppressed. The entry is cleared after compression drops context
below the 85% threshold, allowing a fresh warning cycle.

- run_agent.py: add class-level dedup dict + cooldown logic
- tests/run_agent/test_context_pressure.py: add 3 dedup tests
teknium1 added a commit that referenced this pull request Apr 9, 2026
Combines the approaches from PR #6309 (duan78) and PR #5963 (KUSH42):

Tiered warnings (from #5963):
- Replaces boolean _context_pressure_warned with float _context_pressure_warned_at
- Fires at 85% (orange) and re-fires at 95% (red/critical)
- Adds 'compacting context...' status message before compression

Gateway dedup (from #6309):
- Class-level dict _context_pressure_last_warned survives across AIAgent
  instances (gateway creates a new instance per message)
- 5-minute cooldown per session prevents warning spam
- Higher-tier warnings bypass the cooldown (85% → 95% always fires)
- Compression reset clears the dedup entry for the session
- Stale entries evicted (older than 2x cooldown) to prevent memory leak

Does NOT inject into messages — purely user-facing via _safe_print (CLI)
and status_callback (gateway). Zero prompt cache impact.

Fixes #6309. Fixes #5963.
pull Bot pushed a commit to gmhl000/hermes-agent that referenced this pull request Apr 9, 2026
…earch#6411)

Combines the approaches from PR NousResearch#6309 (duan78) and PR NousResearch#5963 (KUSH42):

Tiered warnings (from NousResearch#5963):
- Replaces boolean _context_pressure_warned with float _context_pressure_warned_at
- Fires at 85% (orange) and re-fires at 95% (red/critical)
- Adds 'compacting context...' status message before compression

Gateway dedup (from NousResearch#6309):
- Class-level dict _context_pressure_last_warned survives across AIAgent
  instances (gateway creates a new instance per message)
- 5-minute cooldown per session prevents warning spam
- Higher-tier warnings bypass the cooldown (85% → 95% always fires)
- Compression reset clears the dedup entry for the session
- Stale entries evicted (older than 2x cooldown) to prevent memory leak

Does NOT inject into messages — purely user-facing via _safe_print (CLI)
and status_callback (gateway). Zero prompt cache impact.

Fixes NousResearch#6309. Fixes NousResearch#5963.
saxster pushed a commit to saxster/hermes-agent that referenced this pull request Apr 9, 2026
…earch#6411)

Combines the approaches from PR NousResearch#6309 (duan78) and PR NousResearch#5963 (KUSH42):

Tiered warnings (from NousResearch#5963):
- Replaces boolean _context_pressure_warned with float _context_pressure_warned_at
- Fires at 85% (orange) and re-fires at 95% (red/critical)
- Adds 'compacting context...' status message before compression

Gateway dedup (from NousResearch#6309):
- Class-level dict _context_pressure_last_warned survives across AIAgent
  instances (gateway creates a new instance per message)
- 5-minute cooldown per session prevents warning spam
- Higher-tier warnings bypass the cooldown (85% → 95% always fires)
- Compression reset clears the dedup entry for the session
- Stale entries evicted (older than 2x cooldown) to prevent memory leak

Does NOT inject into messages — purely user-facing via _safe_print (CLI)
and status_callback (gateway). Zero prompt cache impact.

Fixes NousResearch#6309. Fixes NousResearch#5963.
Tommyeds pushed a commit to Tommyeds/hermes-agent that referenced this pull request Apr 12, 2026
…earch#6411)

Combines the approaches from PR NousResearch#6309 (duan78) and PR NousResearch#5963 (KUSH42):

Tiered warnings (from NousResearch#5963):
- Replaces boolean _context_pressure_warned with float _context_pressure_warned_at
- Fires at 85% (orange) and re-fires at 95% (red/critical)
- Adds 'compacting context...' status message before compression

Gateway dedup (from NousResearch#6309):
- Class-level dict _context_pressure_last_warned survives across AIAgent
  instances (gateway creates a new instance per message)
- 5-minute cooldown per session prevents warning spam
- Higher-tier warnings bypass the cooldown (85% → 95% always fires)
- Compression reset clears the dedup entry for the session
- Stale entries evicted (older than 2x cooldown) to prevent memory leak

Does NOT inject into messages — purely user-facing via _safe_print (CLI)
and status_callback (gateway). Zero prompt cache impact.

Fixes NousResearch#6309. Fixes NousResearch#5963.
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
…earch#6411)

Combines the approaches from PR NousResearch#6309 (duan78) and PR NousResearch#5963 (KUSH42):

Tiered warnings (from NousResearch#5963):
- Replaces boolean _context_pressure_warned with float _context_pressure_warned_at
- Fires at 85% (orange) and re-fires at 95% (red/critical)
- Adds 'compacting context...' status message before compression

Gateway dedup (from NousResearch#6309):
- Class-level dict _context_pressure_last_warned survives across AIAgent
  instances (gateway creates a new instance per message)
- 5-minute cooldown per session prevents warning spam
- Higher-tier warnings bypass the cooldown (85% → 95% always fires)
- Compression reset clears the dedup entry for the session
- Stale entries evicted (older than 2x cooldown) to prevent memory leak

Does NOT inject into messages — purely user-facing via _safe_print (CLI)
and status_callback (gateway). Zero prompt cache impact.

Fixes NousResearch#6309. Fixes NousResearch#5963.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…earch#6411)

Combines the approaches from PR NousResearch#6309 (duan78) and PR NousResearch#5963 (KUSH42):

Tiered warnings (from NousResearch#5963):
- Replaces boolean _context_pressure_warned with float _context_pressure_warned_at
- Fires at 85% (orange) and re-fires at 95% (red/critical)
- Adds 'compacting context...' status message before compression

Gateway dedup (from NousResearch#6309):
- Class-level dict _context_pressure_last_warned survives across AIAgent
  instances (gateway creates a new instance per message)
- 5-minute cooldown per session prevents warning spam
- Higher-tier warnings bypass the cooldown (85% → 95% always fires)
- Compression reset clears the dedup entry for the session
- Stale entries evicted (older than 2x cooldown) to prevent memory leak

Does NOT inject into messages — purely user-facing via _safe_print (CLI)
and status_callback (gateway). Zero prompt cache impact.

Fixes NousResearch#6309. Fixes NousResearch#5963.
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
…earch#6411)

Combines the approaches from PR NousResearch#6309 (duan78) and PR NousResearch#5963 (KUSH42):

Tiered warnings (from NousResearch#5963):
- Replaces boolean _context_pressure_warned with float _context_pressure_warned_at
- Fires at 85% (orange) and re-fires at 95% (red/critical)
- Adds 'compacting context...' status message before compression

Gateway dedup (from NousResearch#6309):
- Class-level dict _context_pressure_last_warned survives across AIAgent
  instances (gateway creates a new instance per message)
- 5-minute cooldown per session prevents warning spam
- Higher-tier warnings bypass the cooldown (85% → 95% always fires)
- Compression reset clears the dedup entry for the session
- Stale entries evicted (older than 2x cooldown) to prevent memory leak

Does NOT inject into messages — purely user-facing via _safe_print (CLI)
and status_callback (gateway). Zero prompt cache impact.

Fixes NousResearch#6309. Fixes NousResearch#5963.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…earch#6411)

Combines the approaches from PR NousResearch#6309 (duan78) and PR NousResearch#5963 (KUSH42):

Tiered warnings (from NousResearch#5963):
- Replaces boolean _context_pressure_warned with float _context_pressure_warned_at
- Fires at 85% (orange) and re-fires at 95% (red/critical)
- Adds 'compacting context...' status message before compression

Gateway dedup (from NousResearch#6309):
- Class-level dict _context_pressure_last_warned survives across AIAgent
  instances (gateway creates a new instance per message)
- 5-minute cooldown per session prevents warning spam
- Higher-tier warnings bypass the cooldown (85% → 95% always fires)
- Compression reset clears the dedup entry for the session
- Stale entries evicted (older than 2x cooldown) to prevent memory leak

Does NOT inject into messages — purely user-facing via _safe_print (CLI)
and status_callback (gateway). Zero prompt cache impact.

Fixes NousResearch#6309. Fixes NousResearch#5963.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…earch#6411)

Combines the approaches from PR NousResearch#6309 (duan78) and PR NousResearch#5963 (KUSH42):

Tiered warnings (from NousResearch#5963):
- Replaces boolean _context_pressure_warned with float _context_pressure_warned_at
- Fires at 85% (orange) and re-fires at 95% (red/critical)
- Adds 'compacting context...' status message before compression

Gateway dedup (from NousResearch#6309):
- Class-level dict _context_pressure_last_warned survives across AIAgent
  instances (gateway creates a new instance per message)
- 5-minute cooldown per session prevents warning spam
- Higher-tier warnings bypass the cooldown (85% → 95% always fires)
- Compression reset clears the dedup entry for the session
- Stale entries evicted (older than 2x cooldown) to prevent memory leak

Does NOT inject into messages — purely user-facing via _safe_print (CLI)
and status_callback (gateway). Zero prompt cache impact.

Fixes NousResearch#6309. Fixes NousResearch#5963.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant