Parallel web_search calls hang in Amplifier's async tool executor #219

@Joi

Summary

When the LLM fires multiple web_search tool calls in parallel (e.g., 9-10 simultaneous searches), all of them hang indefinitely: no results return, no timeout fires, and the session dies. The hang reproduces when the session is resumed.

Affected module: microsoft/amplifier-module-tool-web

Environment

  • macOS (Apple Silicon)
  • Python 3.11
  • ddgs 9.10.0 + primp 1.0.0
  • Amplifier CLI latest (as of 2026-02-15)

Root Cause Analysis

The WebSearchTool._real_search() method at __init__.py:103-123 has three issues:

  1. No concurrency limit: Each search creates a new DDGS() instance (which internally creates a primp Rust HTTP client) and dispatches it to the default ThreadPoolExecutor via run_in_executor(None, ...). When 9-10 run simultaneously, they appear to deadlock — likely due to thread pool starvation or contention in primp's Rust runtime.

  2. No timeout: If run_in_executor hangs, it hangs forever. There's no asyncio.wait_for() wrapper, so the session becomes permanently stuck.

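  3. Deprecated API: Uses asyncio.get_event_loop() instead of asyncio.get_running_loop(). Inside a coroutine, get_running_loop() is the recommended call; get_event_loop() emits a DeprecationWarning in some cases on recent Python versions when no loop is running.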

Reproduction

  • Have an LLM session make 9+ parallel web_search tool calls (e.g., looking up phone numbers for multiple businesses)
  • The tool:pre events fire but zero tool:post events are recorded
  • Session becomes unresponsive

Note: The same 9 parallel searches work fine in a standalone asyncio.run() test script. The hang is specific to Amplifier's runtime event loop / thread pool context.
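For reference, a standalone script of the kind mentioned in the note might look like the sketch below. It is not the script actually used: fake_search is a hypothetical stand-in for the blocking DDGS call so the example runs without network access, and the query strings are made up.

```python
import asyncio
import time


def fake_search(query: str) -> list[str]:
    """Hypothetical stand-in for the blocking DDGS call."""
    time.sleep(0.05)  # simulate blocking network I/O
    return [f"result for {query}"]


async def run_parallel(queries: list[str]) -> list[list[str]]:
    loop = asyncio.get_running_loop()
    # Dispatch every search to the default executor at once,
    # mirroring the run_in_executor(None, ...) pattern described above.
    futures = [loop.run_in_executor(None, fake_search, q) for q in queries]
    return await asyncio.gather(*futures)


def main() -> list[list[str]]:
    queries = [f"business {i} phone number" for i in range(9)]
    return asyncio.run(run_parallel(queries))


if __name__ == "__main__":
    for hits in main():
        print(hits)
```

Swapping fake_search for the real search function is what would make this comparable to the in-Amplifier behavior.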

Suggested Fix

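import asyncio
import logging

from ddgs import DDGS

logger = logging.getLogger(__name__)


class WebSearchTool:
    # Class-level, so the limit is shared by every WebSearchTool instance.
    _search_semaphore: asyncio.Semaphore | None = None
    _SEARCH_TIMEOUT = 30  # seconds allowed per search
    _MAX_CONCURRENT = 3  # searches allowed in flight at once

    def __init__(self, config):
        ...
        if WebSearchTool._search_semaphore is None:
            WebSearchTool._search_semaphore = asyncio.Semaphore(self._MAX_CONCURRENT)

    async def _real_search(self, query):
        try:
            def search_sync():
                ddgs = DDGS()
                return [...]

            async with self._search_semaphore:
                loop = asyncio.get_running_loop()
                # wait_for bounds the await only: the worker thread cannot be
                # cancelled and may keep running after the timeout fires.
                results = await asyncio.wait_for(
                    loop.run_in_executor(None, search_sync),
                    timeout=self._SEARCH_TIMEOUT,
                )
            return results

        # asyncio.TimeoutError is an alias of the built-in TimeoutError on
        # Python 3.11+, and still catches the timeout on older versions.
        except asyncio.TimeoutError:
            logger.warning(f"DuckDuckGo search timed out after {self._SEARCH_TIMEOUT}s: {query}")
            return await self._mock_search(query)
        except Exception as e:
            logger.warning(f"DuckDuckGo search failed: {e}, falling back to mock")
            return await self._mock_search(query)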

Key changes:

  • Semaphore (max 3 concurrent) prevents thread pool starvation
  • asyncio.wait_for() with 30s timeout prevents infinite hangs, falls back to mock
  • asyncio.get_running_loop() replaces deprecated get_event_loop()
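The semaphore-plus-timeout pattern can be exercised without Amplifier or network access. The sketch below is an illustration under assumptions, not the module's code: fake_search, limited_search and demo are hypothetical names, the delays are arbitrary, and the "mock:" string stands in for the _mock_search fallback.

```python
import asyncio
import threading
import time

MAX_CONCURRENT = 3
SEARCH_TIMEOUT = 0.2  # seconds; shortened for the demo

_lock = threading.Lock()
_active = 0  # searches currently running in worker threads
_peak = 0    # highest value _active ever reached


def fake_search(query: str, delay: float) -> str:
    """Hypothetical blocking search that records its own concurrency."""
    global _active, _peak
    with _lock:
        _active += 1
        _peak = max(_peak, _active)
    try:
        time.sleep(delay)
        return f"real:{query}"
    finally:
        with _lock:
            _active -= 1


async def limited_search(sem: asyncio.Semaphore, query: str, delay: float) -> str:
    loop = asyncio.get_running_loop()
    try:
        async with sem:  # at most MAX_CONCURRENT searches past this point
            return await asyncio.wait_for(
                loop.run_in_executor(None, fake_search, query, delay),
                timeout=SEARCH_TIMEOUT,
            )
    except asyncio.TimeoutError:
        return f"mock:{query}"  # stand-in for the mock fallback


async def demo() -> tuple[list[str], int]:
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    delays = [0.05] * 8 + [0.6]  # only the last search exceeds the timeout
    tasks = [limited_search(sem, f"q{i}", d) for i, d in enumerate(delays)]
    results = await asyncio.gather(*tasks)
    return list(results), _peak


if __name__ == "__main__":
    results, peak = asyncio.run(demo())
    print(results, "peak concurrency:", peak)
```

One design caveat this makes visible: after a timeout the semaphore is released while the worker thread is still blocked, so genuinely hung threads could still accumulate in the pool over time. asyncio.run() also waits for that thread to finish before it returns.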

Additional Note: primp/ddgs version mismatch

There's also a cosmetic issue: ddgs 9.10.0 requests browser impersonation profiles (e.g., chrome_126) that primp 1.0.0 doesn't recognize, producing Impersonate 'chrome_126' does not exist, using 'random' warnings. The random fallback works fine — this is noisy but not the cause of the hang. The uv.lock pins ddgs==9.9.2 + primp==0.15.0 (compatible pair) but the installed versions have drifted.


🤖 Generated with Amplifier
