Skip to content

fix: store asyncio task references to prevent GC mid-execution#3267

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-dd753a5f
Mar 26, 2026
Merged

fix: store asyncio task references to prevent GC mid-execution#3267
teknium1 merged 1 commit into
mainfrom
hermes/hermes-dd753a5f

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Summary

Python's asyncio event loop holds only weak references to tasks. Without a strong reference, the GC can destroy a task while it's awaiting I/O — silently dropping messages. Python 3.12+ made this more aggressive (docs, SO discussion).

Full audit of all 12 gateway platform adapters found 6 untracked create_task calls across 6 files:

Per-message tasks (tracked via _background_tasks set from base class)

  • webhook.pyhandle_message task
  • sms.pyhandle_message task
  • signal.py — SSE response aclose task

Long-running infrastructure tasks (stored in named instance vars)

  • slack.py — Socket Mode handler → self._socket_mode_task
  • discord.py — bot client → self._bot_task
  • whatsapp.py — message poll loop → self._poll_task (2 call sites)

All other adapters (telegram, mattermost, matrix, email, homeassistant, dingtalk) already tracked their tasks correctly.

Salvaged from #3160 by @memosr — expanded from 1 file to 6.

Test plan

  • Gateway tests: 1461 passed
  • Full suite: 6226 passed (1 pre-existing failure deselected)

Python's asyncio event loop holds only weak references to tasks.
Without a strong reference, the garbage collector can destroy a task
while it's awaiting I/O — silently dropping messages. Python 3.12+
made this more aggressive.

Audit of all gateway platform adapters found 6 untracked create_task
calls across 6 files:

Per-message tasks (tracked via _background_tasks set from base class):
- gateway/platforms/webhook.py: handle_message task
- gateway/platforms/sms.py: handle_message task
- gateway/platforms/signal.py: SSE response aclose task

Long-running infrastructure tasks (stored in named instance vars):
- gateway/platforms/slack.py: Socket Mode handler (_socket_mode_task)
- gateway/platforms/discord.py: bot client (_bot_task)
- gateway/platforms/whatsapp.py: message poll loop (_poll_task, 2 sites)

All other adapters (telegram, mattermost, matrix, email, homeassistant,
dingtalk) already tracked their tasks correctly.

Salvaged from PR #3160 by memosr — expanded from 1 file to 6.
@teknium1 teknium1 merged commit 243ee67 into main Mar 26, 2026
1 of 2 checks passed
teknium1 added a commit that referenced this pull request Mar 29, 2026
Replace per-request aiohttp.ClientSession() in every WhatsApp adapter
method with a single persistent self._http_session, matching the pattern
used by Mattermost, HomeAssistant, and SMS adapters.

Changes:
- Create self._http_session in connect(), close in disconnect()
- All bridge HTTP calls (send, edit, send-media, typing, get_chat_info,
  poll_messages) now use the shared session
- Explicitly cancel _poll_task on disconnect() instead of relying
  solely on self._running = False
- Health-check sessions in connect() remain ephemeral (persistent
  session not yet created at that point)
- Remove per-method ImportError guards for aiohttp (always available
  when gateway runs via [messaging] extras)

Salvaged from PR #1851 by Himess. The _poll_task storage was already
on main from PR #3267; this adds the disconnect cancellation and the
persistent session.

Tests: 4 new tests for session close, already-closed skip, poll task
cancellation, and done-task skip.
teknium1 added a commit that referenced this pull request Mar 29, 2026
Replace per-request aiohttp.ClientSession() in every WhatsApp adapter
method with a single persistent self._http_session, matching the pattern
used by Mattermost, HomeAssistant, and SMS adapters.

Changes:
- Create self._http_session in connect(), close in disconnect()
- All bridge HTTP calls (send, edit, send-media, typing, get_chat_info,
  poll_messages) now use the shared session
- Explicitly cancel _poll_task on disconnect() instead of relying
  solely on self._running = False
- Health-check sessions in connect() remain ephemeral (persistent
  session not yet created at that point)
- Remove per-method ImportError guards for aiohttp (always available
  when gateway runs via [messaging] extras)

Salvaged from PR #1851 by Himess. The _poll_task storage was already
on main from PR #3267; this adds the disconnect cancellation and the
persistent session.

Tests: 4 new tests for session close, already-closed skip, poll task
cancellation, and done-task skip.
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
…esearch#3267)

Python's asyncio event loop holds only weak references to tasks.
Without a strong reference, the garbage collector can destroy a task
while it's awaiting I/O — silently dropping messages. Python 3.12+
made this more aggressive.

Audit of all gateway platform adapters found 6 untracked create_task
calls across 6 files:

Per-message tasks (tracked via _background_tasks set from base class):
- gateway/platforms/webhook.py: handle_message task
- gateway/platforms/sms.py: handle_message task
- gateway/platforms/signal.py: SSE response aclose task

Long-running infrastructure tasks (stored in named instance vars):
- gateway/platforms/slack.py: Socket Mode handler (_socket_mode_task)
- gateway/platforms/discord.py: bot client (_bot_task)
- gateway/platforms/whatsapp.py: message poll loop (_poll_task, 2 sites)

All other adapters (telegram, mattermost, matrix, email, homeassistant,
dingtalk) already tracked their tasks correctly.

Salvaged from PR NousResearch#3160 by memosr — expanded from 1 file to 6.
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
…Research#3818)

Replace per-request aiohttp.ClientSession() in every WhatsApp adapter
method with a single persistent self._http_session, matching the pattern
used by Mattermost, HomeAssistant, and SMS adapters.

Changes:
- Create self._http_session in connect(), close in disconnect()
- All bridge HTTP calls (send, edit, send-media, typing, get_chat_info,
  poll_messages) now use the shared session
- Explicitly cancel _poll_task on disconnect() instead of relying
  solely on self._running = False
- Health-check sessions in connect() remain ephemeral (persistent
  session not yet created at that point)
- Remove per-method ImportError guards for aiohttp (always available
  when gateway runs via [messaging] extras)

Salvaged from PR NousResearch#1851 by Himess. The _poll_task storage was already
on main from PR NousResearch#3267; this adds the disconnect cancellation and the
persistent session.

Tests: 4 new tests for session close, already-closed skip, poll task
cancellation, and done-task skip.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…esearch#3267)

Python's asyncio event loop holds only weak references to tasks.
Without a strong reference, the garbage collector can destroy a task
while it's awaiting I/O — silently dropping messages. Python 3.12+
made this more aggressive.

Audit of all gateway platform adapters found 6 untracked create_task
calls across 6 files:

Per-message tasks (tracked via _background_tasks set from base class):
- gateway/platforms/webhook.py: handle_message task
- gateway/platforms/sms.py: handle_message task
- gateway/platforms/signal.py: SSE response aclose task

Long-running infrastructure tasks (stored in named instance vars):
- gateway/platforms/slack.py: Socket Mode handler (_socket_mode_task)
- gateway/platforms/discord.py: bot client (_bot_task)
- gateway/platforms/whatsapp.py: message poll loop (_poll_task, 2 sites)

All other adapters (telegram, mattermost, matrix, email, homeassistant,
dingtalk) already tracked their tasks correctly.

Salvaged from PR NousResearch#3160 by memosr — expanded from 1 file to 6.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…Research#3818)

Replace per-request aiohttp.ClientSession() in every WhatsApp adapter
method with a single persistent self._http_session, matching the pattern
used by Mattermost, HomeAssistant, and SMS adapters.

Changes:
- Create self._http_session in connect(), close in disconnect()
- All bridge HTTP calls (send, edit, send-media, typing, get_chat_info,
  poll_messages) now use the shared session
- Explicitly cancel _poll_task on disconnect() instead of relying
  solely on self._running = False
- Health-check sessions in connect() remain ephemeral (persistent
  session not yet created at that point)
- Remove per-method ImportError guards for aiohttp (always available
  when gateway runs via [messaging] extras)

Salvaged from PR NousResearch#1851 by Himess. The _poll_task storage was already
on main from PR NousResearch#3267; this adds the disconnect cancellation and the
persistent session.

Tests: 4 new tests for session close, already-closed skip, poll task
cancellation, and done-task skip.
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
…esearch#3267)

Python's asyncio event loop holds only weak references to tasks.
Without a strong reference, the garbage collector can destroy a task
while it's awaiting I/O — silently dropping messages. Python 3.12+
made this more aggressive.

Audit of all gateway platform adapters found 6 untracked create_task
calls across 6 files:

Per-message tasks (tracked via _background_tasks set from base class):
- gateway/platforms/webhook.py: handle_message task
- gateway/platforms/sms.py: handle_message task
- gateway/platforms/signal.py: SSE response aclose task

Long-running infrastructure tasks (stored in named instance vars):
- gateway/platforms/slack.py: Socket Mode handler (_socket_mode_task)
- gateway/platforms/discord.py: bot client (_bot_task)
- gateway/platforms/whatsapp.py: message poll loop (_poll_task, 2 sites)

All other adapters (telegram, mattermost, matrix, email, homeassistant,
dingtalk) already tracked their tasks correctly.

Salvaged from PR NousResearch#3160 by memosr — expanded from 1 file to 6.
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
…Research#3818)

Replace per-request aiohttp.ClientSession() in every WhatsApp adapter
method with a single persistent self._http_session, matching the pattern
used by Mattermost, HomeAssistant, and SMS adapters.

Changes:
- Create self._http_session in connect(), close in disconnect()
- All bridge HTTP calls (send, edit, send-media, typing, get_chat_info,
  poll_messages) now use the shared session
- Explicitly cancel _poll_task on disconnect() instead of relying
  solely on self._running = False
- Health-check sessions in connect() remain ephemeral (persistent
  session not yet created at that point)
- Remove per-method ImportError guards for aiohttp (always available
  when gateway runs via [messaging] extras)

Salvaged from PR NousResearch#1851 by Himess. The _poll_task storage was already
on main from PR NousResearch#3267; this adds the disconnect cancellation and the
persistent session.

Tests: 4 new tests for session close, already-closed skip, poll task
cancellation, and done-task skip.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…esearch#3267)

Python's asyncio event loop holds only weak references to tasks.
Without a strong reference, the garbage collector can destroy a task
while it's awaiting I/O — silently dropping messages. Python 3.12+
made this more aggressive.

Audit of all gateway platform adapters found 6 untracked create_task
calls across 6 files:

Per-message tasks (tracked via _background_tasks set from base class):
- gateway/platforms/webhook.py: handle_message task
- gateway/platforms/sms.py: handle_message task
- gateway/platforms/signal.py: SSE response aclose task

Long-running infrastructure tasks (stored in named instance vars):
- gateway/platforms/slack.py: Socket Mode handler (_socket_mode_task)
- gateway/platforms/discord.py: bot client (_bot_task)
- gateway/platforms/whatsapp.py: message poll loop (_poll_task, 2 sites)

All other adapters (telegram, mattermost, matrix, email, homeassistant,
dingtalk) already tracked their tasks correctly.

Salvaged from PR NousResearch#3160 by memosr — expanded from 1 file to 6.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…Research#3818)

Replace per-request aiohttp.ClientSession() in every WhatsApp adapter
method with a single persistent self._http_session, matching the pattern
used by Mattermost, HomeAssistant, and SMS adapters.

Changes:
- Create self._http_session in connect(), close in disconnect()
- All bridge HTTP calls (send, edit, send-media, typing, get_chat_info,
  poll_messages) now use the shared session
- Explicitly cancel _poll_task on disconnect() instead of relying
  solely on self._running = False
- Health-check sessions in connect() remain ephemeral (persistent
  session not yet created at that point)
- Remove per-method ImportError guards for aiohttp (always available
  when gateway runs via [messaging] extras)

Salvaged from PR NousResearch#1851 by Himess. The _poll_task storage was already
on main from PR NousResearch#3267; this adds the disconnect cancellation and the
persistent session.

Tests: 4 new tests for session close, already-closed skip, poll task
cancellation, and done-task skip.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…esearch#3267)

Python's asyncio event loop holds only weak references to tasks.
Without a strong reference, the garbage collector can destroy a task
while it's awaiting I/O — silently dropping messages. Python 3.12+
made this more aggressive.

Audit of all gateway platform adapters found 6 untracked create_task
calls across 6 files:

Per-message tasks (tracked via _background_tasks set from base class):
- gateway/platforms/webhook.py: handle_message task
- gateway/platforms/sms.py: handle_message task
- gateway/platforms/signal.py: SSE response aclose task

Long-running infrastructure tasks (stored in named instance vars):
- gateway/platforms/slack.py: Socket Mode handler (_socket_mode_task)
- gateway/platforms/discord.py: bot client (_bot_task)
- gateway/platforms/whatsapp.py: message poll loop (_poll_task, 2 sites)

All other adapters (telegram, mattermost, matrix, email, homeassistant,
dingtalk) already tracked their tasks correctly.

Salvaged from PR NousResearch#3160 by memosr — expanded from 1 file to 6.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…Research#3818)

Replace per-request aiohttp.ClientSession() in every WhatsApp adapter
method with a single persistent self._http_session, matching the pattern
used by Mattermost, HomeAssistant, and SMS adapters.

Changes:
- Create self._http_session in connect(), close in disconnect()
- All bridge HTTP calls (send, edit, send-media, typing, get_chat_info,
  poll_messages) now use the shared session
- Explicitly cancel _poll_task on disconnect() instead of relying
  solely on self._running = False
- Health-check sessions in connect() remain ephemeral (persistent
  session not yet created at that point)
- Remove per-method ImportError guards for aiohttp (always available
  when gateway runs via [messaging] extras)

Salvaged from PR NousResearch#1851 by Himess. The _poll_task storage was already
on main from PR NousResearch#3267; this adds the disconnect cancellation and the
persistent session.

Tests: 4 new tests for session close, already-closed skip, poll task
cancellation, and done-task skip.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant