[Bug]: model_tools async bridge recreates loops in running-loop contexts #16570

@chezzdev

Bug Description

When model_tools._run_async() is called from a thread that already has a running asyncio loop, it bridges by spinning up a fresh worker thread and running the coroutine with asyncio.run() for that single call.

Because asyncio.run() creates a new event loop, every tool call made from an async context gets its own loop, which is closed when the call returns. Cached async clients such as AsyncOpenAI/httpx can stay bound to those short-lived loops, so their transports end up tied to closed loops, and long-lived gateway processes see file-descriptor and resource churn.
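To make the per-call behavior concrete, here is a minimal sketch using only the standard library. `run_in_throwaway_thread()`, `capture_loop()`, and `gateway_turns()` are hypothetical stand-ins, not the actual Hermes code; they only mimic the pattern described above (one worker thread and one asyncio.run() per call).

```python
import asyncio
import threading


def run_in_throwaway_thread(coro_factory):
    """Hypothetical stand-in for the running-loop branch:
    one worker thread and one asyncio.run() per call."""
    result = {}

    def worker():
        # asyncio.run() creates a new loop, runs the coroutine, then closes the loop.
        result["value"] = asyncio.run(coro_factory())

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result["value"]


async def capture_loop():
    # Return the loop this coroutine runs on so the caller can inspect it.
    return asyncio.get_running_loop()


async def gateway_turns(n=3):
    # Called from inside a running loop, as a gateway handler would be.
    return [run_in_throwaway_thread(capture_loop) for _ in range(n)]


loops = asyncio.run(gateway_turns())
distinct_loops = len({id(loop) for loop in loops})
all_closed = all(loop.is_closed() for loop in loops)
print(distinct_loops, all_closed)
```

Each call ends up on a different loop object, and every one of those loops is already closed by the time the caller sees the result.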

Steps to Reproduce

  1. Run Hermes in a long-lived gateway or another async context.
  2. Trigger an async tool path repeatedly, for example one that goes through async_call_llm().
  3. Observe that _run_async() creates a fresh event loop on every call that takes the running-loop branch, instead of reusing a stable bridge loop.

Expected Behavior

Running-loop callers should submit coroutines to a persistent bridge loop so cached async clients remain bound to a live event loop across gateway turns. Shutdown paths should explicitly stop and close that bridge loop.

Actual Behavior

The running-loop branch uses per-call asyncio.run() in a disposable worker thread. Cached async clients can outlive the loop they were created on, causing stale-loop cleanup hazards and resource churn.

Affected Component

  • Tools (async tool dispatch / model_tools._run_async())
  • Gateway (long-lived async process)

Root Cause Analysis

model_tools._run_async() already uses persistent loops for the main thread and worker threads, but the branch for callers inside an active asyncio loop still uses a throwaway thread with asyncio.run(). asyncio.run() creates and closes an event loop each time, which conflicts with cached async clients that retain loop-bound transports.
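The conflict with cached clients can be illustrated without any network code. In the sketch below, `FakeAsyncClient` is a hypothetical stand-in for a cached client such as AsyncOpenAI/httpx; the only assumption is that the client remembers the loop it was created on and later schedules work on it, which is how asyncio transports typically handle close callbacks.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical stand-in for a cached client tied to one event loop."""

    def __init__(self):
        # Bind to whichever loop is running at construction time.
        self.loop = asyncio.get_running_loop()

    def schedule_cleanup(self):
        # Scheduling on a closed loop raises RuntimeError.
        self.loop.call_soon(lambda: None)


_cache = {}


async def get_client():
    # Module-level cache: the client survives across asyncio.run() calls.
    if "client" not in _cache:
        _cache["client"] = FakeAsyncClient()
    return _cache["client"]


async def second_call():
    client = await get_client()
    return client, asyncio.get_running_loop()


first = asyncio.run(get_client())                  # loop A: client created, loop closed
client, current_loop = asyncio.run(second_call())  # loop B: cached client reused

stale = client.loop is not current_loop and client.loop.is_closed()
try:
    client.schedule_cleanup()
    cleanup_error = None
except RuntimeError as exc:
    cleanup_error = str(exc)
print(stale, cleanup_error)
```

The second call gets the same cached client back, but the loop that client is bound to has already been closed, so any cleanup it tries to schedule there fails.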

Proposed Fix

Reuse one dedicated bridge loop for running-loop callers via asyncio.run_coroutine_threadsafe(), add startup failure handling, and stop/close the bridge loop from CLI/gateway cleanup paths.
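One possible shape for this, as a sketch only: `BridgeLoop`, its `run()`/`close()` methods, and the `startup_timeout` parameter are hypothetical names chosen for illustration, not the patch itself. It assumes a single daemon thread that owns the loop and that callers block until the coroutine finishes.

```python
import asyncio
import threading


class BridgeLoop:
    """Hypothetical persistent bridge loop owned by one daemon thread."""

    def __init__(self, startup_timeout=5.0):
        self._loop = asyncio.new_event_loop()
        self._ready = threading.Event()
        self._thread = threading.Thread(
            target=self._run, name="async-bridge", daemon=True
        )
        try:
            self._thread.start()
        except RuntimeError:
            # Thread could not be started: release the loop before re-raising.
            self._loop.close()
            raise
        if not self._ready.wait(startup_timeout):
            raise RuntimeError("bridge loop did not start in time")

    def _run(self):
        asyncio.set_event_loop(self._loop)
        # Signal readiness from inside the loop, once it is actually running.
        self._loop.call_soon(self._ready.set)
        self._loop.run_forever()

    def run(self, coro, timeout=None):
        # Submit to the long-lived loop and block the caller for the result.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def close(self):
        # Intended for CLI/gateway shutdown; pending tasks are not cancelled here.
        if self._loop.is_closed():
            return
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


bridge = BridgeLoop()


async def capture_loop():
    return asyncio.get_running_loop()


async def gateway_turns(n=3):
    # Running-loop caller: every turn is submitted to the same bridge loop.
    return [bridge.run(capture_loop()) for _ in range(n)]


loops = asyncio.run(gateway_turns())
same_loop = len({id(loop) for loop in loops}) == 1
alive_during_calls = not loops[0].is_closed()
bridge.close()
print(same_loop, alive_during_calls, loops[0].is_closed())
```

With this arrangement every running-loop call lands on the same loop, which stays open until `close()` is called from a shutdown path. A fuller version would probably also cancel outstanding tasks before stopping the loop.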

Labels

  • P2 (Medium): degraded but workaround exists
  • comp/gateway: Gateway runner, session dispatch, delivery
  • comp/tools: Tool registry, model_tools, toolsets
  • type/bug: Something isn't working