fix: resource leaks and crashes in long-running gateway #993

Closed
Himess wants to merge 1 commit into NousResearch:main from Himess:fix/gateway-resource-leaks
Conversation

Himess commented Mar 12, 2026

Contributor

Fixes #990

Four fixes for issues that compound in long-running gateway processes:

  • Log handler accumulation: guard RotatingFileHandler addition with a module-level flag so it only runs once
  • Delivery dedup mismatch: (Platform.LOCAL, None) → (Platform.LOCAL, None, None) to match the 3-tuple in seen_platforms
  • Context compressor crash: (content or "").strip() to handle None API responses
  • Thread pool leak: shutdown(wait=False) old executor before replacing in resize_tool_pool()
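
A minimal sketch of the first and third fixes, written as standalone functions for illustration. The names `_file_handler_installed`, `install_file_handler`, and `safe_strip` are hypothetical and are not taken from the codebase; the handler size limits are arbitrary placeholders.

```python
import logging
from logging.handlers import RotatingFileHandler

# Hypothetical module-level flag; the actual name in the repository may differ.
_file_handler_installed = False


def install_file_handler(log_path: str) -> None:
    """Attach a RotatingFileHandler to the root logger at most once per process."""
    global _file_handler_installed
    if _file_handler_installed:
        # Already installed by an earlier call, so do not add a duplicate.
        return
    handler = RotatingFileHandler(log_path, maxBytes=1_000_000, backupCount=2)
    logging.getLogger().addHandler(handler)
    _file_handler_installed = True


def safe_strip(content):
    """Return stripped text, treating a None API response as an empty string."""
    return (content or "").strip()
```

Without the guard, each call would append another handler and every log record would be written once per handler. The flag here is not protected by a lock, so a caller constructing agents from several threads at once would need additional synchronization.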

- Prevent RotatingFileHandler accumulation: each AIAgent.__init__ was
  appending a new handler to the root logger without checking if one
  already exists. After N messages the same log line is written N times.
  Guard with a module-level flag.

- Fix delivery dedup tuple mismatch: seen_platforms stores 3-tuples
  (platform, chat_id, thread_id) but the local check used a 2-tuple
  that never matches, causing duplicate LOCAL deliveries.

- Fix AttributeError in context compressor: content can be None when
  the API returns no text, and calling .strip() on None crashes.

- Shutdown old ThreadPoolExecutor in resize_tool_pool: the previous
  executor was replaced without shutdown(), leaking threads.
teknium1 pushed a commit that referenced this pull request Mar 14, 2026
Salvages the two still-relevant fixes from PR #993 onto current main:
- use a 3-tuple LOCAL delivery key so explicit/local-origin targets are not duplicated
- shut down the previous agent-loop ThreadPoolExecutor when resizing the global pool

Adds regression tests for both behaviors.
teknium1 added a commit that referenced this pull request Mar 14, 2026
Merging the non-redundant fixes salvaged from #993 onto current main, plus adjacent trajectory compressor hardening found during review.
@teknium1

Contributor

Merged the still-relevant parts via PR #1327 on top of current main. That salvage preserved your substantive fixes for gateway LOCAL dedup and agent_loop executor cleanup, with authorship kept in git history. The other two changes in #993 were already present on main in stronger form. Thanks.

teknium1 closed this Mar 14, 2026
Development

Successfully merging this pull request may close these issues.

Log handler accumulates on every AIAgent init, degrading gateway performance
