Summary
In model_tools.py, _get_tool_loop() and _get_worker_loop() (lines ~39-78) create persistent event loops that are never explicitly closed. The loops, and any async resources bound to them (httpx connections, aiohttp sessions), live until process exit.
Root Cause
- If the main thread's loop is replaced (line ~54: if _tool_loop is None or _tool_loop.is_closed()), any resources bound to the old loop become orphaned.
- Worker thread loops stored in _worker_thread_local are never cleaned up when the thread pool shrinks. Thread-local storage is only released when the owning thread exits, which may not happen promptly with ThreadPoolExecutor.
The code comments acknowledge this is a conscious tradeoff to avoid "Event loop is closed" errors.
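The pattern described above can be reproduced in isolation. The following is a minimal sketch, not the actual model_tools.py code: the function body, the run_tool helper, and the created_loops list are assumptions made for illustration. The list exists only so the demo can inspect the loops afterwards; without it, they would be finalized by the garbage collector at some unspecified time rather than closed deliberately.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor

_worker_thread_local = threading.local()
created_loops = []  # demo-only bookkeeping, not part of the real module


def _get_worker_loop():
    """Return this thread's loop, creating it lazily. Nothing ever closes it."""
    loop = getattr(_worker_thread_local, "loop", None)
    if loop is None or loop.is_closed():
        loop = asyncio.new_event_loop()
        _worker_thread_local.loop = loop
        created_loops.append(loop)
    return loop


async def _noop():
    return threading.get_ident()


def run_tool(_):
    # Hypothetical stand-in for a tool call executed on the worker's loop.
    return _get_worker_loop().run_until_complete(_noop())


with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(run_tool, range(16)))

# The executor has shut down and its threads are gone,
# yet no code path has called close() on any of the loops.
open_loops = [loop for loop in created_loops if not loop.is_closed()]
```

Each worker thread that handled at least one call leaves one open loop behind, along with whatever transports or sessions were attached to it.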
Impact
In long-running gateway deployments with many worker threads:
- Event loop accumulation
- HTTP connection pool leak (connections bound to old loops never close)
- File descriptor exhaustion under sustained load
The _run_async fallback thread (line ~113) compounds this: on timeout (300s hard cap), the spawned thread continues running in the background with no cleanup, holding its own event loop and resources.
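The timeout behaviour can be illustrated with a small stand-alone sketch. It assumes the fallback works roughly like run_async_naive below (a helper thread running a disposable loop); that function, slow_tool, and the short timings are hypothetical and chosen only to keep the demo fast.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


async def slow_tool():
    await asyncio.sleep(0.3)  # stands in for a hung tool call


def run_async_naive(coro_factory, timeout):
    """Hypothetical fallback: run the coroutine on a disposable loop in a helper thread."""
    pool = ThreadPoolExecutor(max_workers=1, thread_name_prefix="fallback")
    future = pool.submit(lambda: asyncio.run(coro_factory()))
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        return None  # the caller moves on, but the helper thread keeps going
    finally:
        pool.shutdown(wait=False)  # does not interrupt work already running


result = run_async_naive(slow_tool, timeout=0.05)

# The helper thread is still alive after the caller has given up on it.
still_running = [t.name for t in threading.enumerate() if t.name.startswith("fallback")]
```

Waiting on the future with a timeout only stops the wait. The coroutine, its loop, and the thread keep running until the coroutine finishes on its own.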
Suggested Fix
- Add cleanup hooks for thread-local event loops (e.g., via atexit per thread or ThreadPoolExecutor callbacks)
- For the timeout path, cancel the task and close the disposable loop on TimeoutError
- Consider using a shared event loop with proper lifecycle management instead of per-thread loops
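One possible shape for the shared-loop option, which also covers cancellation on timeout, is sketched below. The SharedLoop class and its method names are hypothetical and not part of the existing code; the sketch assumes callers are synchronous threads that submit coroutines to a single background loop.

```python
import asyncio
import atexit
import threading
from concurrent.futures import TimeoutError as FutureTimeout


class SharedLoop:
    """Hypothetical helper: one background event loop with an explicit lifecycle."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(
            target=self._loop.run_forever, name="shared-loop", daemon=True
        )
        self._thread.start()

    def run(self, coro, timeout=None):
        """Run coro on the shared loop from any thread and wait for its result."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        try:
            return future.result(timeout)
        except FutureTimeout:
            future.cancel()  # requests cancellation of the task on the loop
            raise

    @staticmethod
    async def _cancel_pending():
        # Runs on the shared loop: cancel everything except this coroutine itself.
        current = asyncio.current_task()
        tasks = [t for t in asyncio.all_tasks() if t is not current]
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)

    def close(self):
        """Cancel leftover tasks, stop the loop thread, and close the loop."""
        if self._loop.is_closed():
            return
        asyncio.run_coroutine_threadsafe(self._cancel_pending(), self._loop).result(5)
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


shared = SharedLoop()
atexit.register(shared.close)  # safe to call twice; close() returns early if closed
```

Cancellation here is cooperative: a coroutine blocked in synchronous code will not stop until it reaches its next await. A single loop also serializes all callbacks, so whether this fits depends on how much blocking work the tools do.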
Severity
Low — resource accumulation in long-running processes; unlikely to cause issues in CLI usage.