Skip to content

fix(weixin): schedule direct sends on adapter loop#17557

Closed
tonyxu-io wants to merge 1 commit into
NousResearch:mainfrom
tonyxu-io:fix/weixin-direct-live-adapter-loop
Closed

fix(weixin): schedule direct sends on adapter loop#17557
tonyxu-io wants to merge 1 commit into
NousResearch:mainfrom
tonyxu-io:fix/weixin-direct-live-adapter-loop

Conversation

@tonyxu-io

@tonyxu-io tonyxu-io commented Apr 29, 2026

Copy link
Copy Markdown

Summary

  • record the Weixin adapter's owning asyncio event loop when it connects
  • route live-adapter direct sends back onto that adapter loop instead of using its aiohttp session from the caller loop
  • add regression coverage for cron/send_message direct sends from a different loop

Why this is needed

send_weixin_direct() is used by one-shot delivery paths such as send_message and cron delivery. When a live Weixin adapter is available, it reuses that adapter so outbound delivery can keep the adapter's context-token and media behavior.

The live adapter's _send_session is an aiohttp.ClientSession created during WeixinAdapter.connect(), so it belongs to the gateway adapter's asyncio event loop. Cron/tool delivery can call send_weixin_direct() from a different worker thread/event loop. Before this change, that path directly awaited live_adapter.send(...) from the caller loop, which can use the live adapter's aiohttp session on the wrong loop.

That cross-loop aiohttp session access can fail with event-loop/task errors such as:

Timeout context manager should be used inside a task

In practice, this can make a scheduled job complete successfully but fail when delivering the result to Weixin.

Implementation details

This PR stores the adapter's loop at connect time:

self._loop = asyncio.get_running_loop()

Then, when send_weixin_direct() finds a live adapter with an open send session:

  • if the caller is already on the adapter loop, it sends directly as before
  • if the caller is on another loop, it schedules the send coroutine onto the adapter loop with asyncio.run_coroutine_threadsafe(...) and awaits the wrapped future
  • if there is no usable live adapter loop, the existing standalone one-shot session fallback remains available

This preserves live adapter reuse while avoiding cross-loop use of the adapter's aiohttp.ClientSession.

Test Plan

  • /home/tonyxu/.hermes/hermes-agent/venv/bin/python -m pytest tests/gateway/test_weixin.py::TestWeixinSendMessageIntegration::test_send_weixin_direct_schedules_live_adapter_send_on_adapter_loop tests/gateway/test_weixin.py::TestWeixinSendMessageIntegration::test_send_to_platform_routes_weixin_media_to_native_helper -q -o 'addopts='
  • /home/tonyxu/.hermes/hermes-agent/venv/bin/python -m pytest tests/gateway/test_weixin.py -q -o 'addopts='

Results:

  • targeted tests: 2 passed, 2 warnings
  • full Weixin adapter test file: 50 passed, 2 warnings

@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/gateway Gateway runner, session dispatch, delivery platform/wecom WeCom / WeChat Work adapter labels Apr 29, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Likely duplicate of #12810 and #15911 — same root cause: cross-event-loop aiohttp session reuse in send_weixin_direct. Multiple PRs address this (#14530, #13361, #12810, #15911). This PR offers the same fix (schedule on adapter loop).

@tonyxu-io

Copy link
Copy Markdown
Author

Thanks for flagging the duplicates. I agree this is the same underlying Weixin/aiohttp cross-event-loop issue.

I checked the related PRs:

This PR intentionally takes the “schedule on the adapter loop” variant rather than the “skip live adapter / fresh session” variant:

  1. Keep live-adapter reuse for Weixin direct sends so context-token/media behavior stays on the live adapter path.
  2. Avoid cross-loop aiohttp.ClientSession access by recording the adapter’s owning loop at connect() time and using asyncio.run_coroutine_threadsafe(...) when the caller is on a different loop.
  3. Keep the standalone one-shot session fallback when there is no usable live adapter loop.
  4. Add a regression test proving a different caller loop schedules onto the adapter loop instead of directly awaiting adapter.send() from the caller loop.

If maintainers prefer the simpler fresh-session fallback used in #12810/#15911/#13361, I’m happy to rework this PR in that direction or close it in favor of whichever duplicate you want to merge. The reason I kept this open is that it preserves live adapter reuse while still fixing the loop-affinity problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/gateway Gateway runner, session dispatch, delivery P2 Medium — degraded but workaround exists platform/wecom WeCom / WeChat Work adapter type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants