[Bug] Gateway fails to register stdio MCP servers silently on macOS (launchd) #14113

@gth-ai

Description

Environment

  • Hermes Agent v0.10.0 (2026.4.16)
  • macOS 26.3.1 (arm64)
  • Python 3.11.14
  • Node v25.9.0
  • Gateway managed by launchd (ai.hermes.gateway.plist)

Summary

A stdio-based MCP server that works with hermes mcp add / hermes mcp test and with hermes chat -q … (CLI) never becomes reachable from the Telegram-facing gateway. The gateway silently fails to connect — no successful MCP: registered N tool(s) log line, no MCP child processes of the gateway PID — and sessions are built with only the built-in toolsets.

Reproduction

  1. Register a local stdio MCP server (Node subprocess):
    hermes mcp add myserver --command node --args /path/to/server.mjs --env DATABASE_URL=...
    
  2. hermes mcp test myserver → ✓ connected in ~200 ms, 26 tools discovered.
  3. hermes chat -Q -q "call tool X" (CLI) → works.
  4. Restart the gateway (hermes gateway restart), open a Telegram session, /new, then ask anything that should use the MCP.
  5. Observed:
    • pgrep -P <gateway-pid> → no child node process.
    • /reload-mcp on Telegram → No MCP servers connected.
    • gateway.error.log:
      WARNING tools.mcp_tool: Failed to connect to MCP server 'myserver' (command=node): CancelledError
      
    • Per-session tool list dumped from ~/.hermes/sessions/*.json contains only built-in toolsets (e.g. 29 entries, none prefixed mcp_).
    • hermes tools list --platform telegram still reports myserver all tools enabled — misleading since the server was never actually registered in the gateway process.
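The session check in step 5 can be scripted so it is easy to repeat after each gateway restart. This is only a rough sketch: the session JSON schema is not documented here, so it assumes nothing about the layout and simply walks every string value looking for the mcp_ prefix. The helper names (iter_strings, mcp_tool_names) are made up for illustration and are not part of Hermes.

```python
import json
from pathlib import Path


def iter_strings(node):
    """Yield every string value found anywhere in a decoded JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def mcp_tool_names(session_dir, prefix="mcp_"):
    """Map each session file name to the sorted prefix-matching strings in it.

    Assumption: tool names appear as string values somewhere in the file.
    Dict keys are not inspected, and any other string that happens to start
    with the prefix is counted too.
    """
    found = {}
    for path in sorted(Path(session_dir).expanduser().glob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or partially written session files
        found[path.name] = sorted(
            {s for s in iter_strings(data) if s.startswith(prefix)}
        )
    return found


if __name__ == "__main__":
    for name, tools in mcp_tool_names("~/.hermes/sessions").items():
        print(f"{name}: {len(tools)} mcp_ tool name(s)")
```

A session built by the affected gateway should report zero matches for every file, which is the state described above.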

What works vs. what doesn't

Context                                                | Result
hermes mcp test                                        | ✓ connects (~200 ms)
hermes chat -q … (CLI one-shot)                        | ✓ MCP tools callable
Gateway (launchd-managed, Telegram), stdio transport   | ✗ never registers, CancelledError
Gateway (launchd-managed, Telegram), HTTP transport    | ✓ works reliably

Workaround

Switching the same server to HTTP transport (StreamableHTTPServerTransport bound to 127.0.0.1:PORT) makes it register immediately and work end-to-end in Telegram sessions. So the bug appears specific to stdio + gateway-under-launchd.
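When using the HTTP workaround, it helps to confirm the server is actually accepting connections on the loopback port before restarting the gateway. Here is a minimal probe, assuming only that the server listens on plain TCP at 127.0.0.1. The port is taken from the command line because the real value is deployment-specific, and is_listening is an illustrative name.

```python
import socket
import sys


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    port = int(sys.argv[1])  # the PORT the MCP server was bound to
    state = "listening" if is_listening("127.0.0.1", port) else "not reachable"
    print(f"127.0.0.1:{port} is {state}")
```

This checks only that something accepts TCP connections on that port, not that it speaks MCP.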

Hypothesis

discover_mcp_tools() is invoked through nested executors (loop.run_in_executor(None, discover_mcp_tools) → _run_on_mcp_loop on a background thread), and the stdio client's subprocess is spawned via anyio. The outer call appears to be cancelled before the stdio handshake completes, producing the silent CancelledError. It is worth checking why the cancellation happens in the gateway context but not in a plain Python REPL or hermes chat.
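To make the suspected failure mode concrete, here is a small stand-alone asyncio sketch. It is not Hermes code: discover_blocking is a hypothetical stand-in for discover_mcp_tools(), the 0.2 s sleep stands in for the stdio handshake, and the tool names are made up. It only shows that cancelling the task awaiting run_in_executor surfaces a CancelledError with an empty message, even though the worker thread itself keeps running.

```python
import asyncio
import time


def discover_blocking(delay: float) -> list[str]:
    """Hypothetical stand-in for discover_mcp_tools(): blocks while 'connecting'."""
    time.sleep(delay)  # simulates the stdio handshake taking ~200 ms
    return ["mcp_myserver_tool_a", "mcp_myserver_tool_b"]


async def discover_from_loop(delay: float) -> list[str]:
    """Run the blocking discovery in the default executor, as in the hypothesis."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, discover_blocking, delay)


async def run_case(cancel_after):
    """Start discovery and optionally cancel the awaiting task early."""
    task = asyncio.create_task(discover_from_loop(0.2))
    if cancel_after is not None:
        await asyncio.sleep(cancel_after)
        task.cancel()  # the executor thread cannot be interrupted, only the awaiter
    try:
        tools = await task
        return f"registered {len(tools)} tool(s)"
    except asyncio.CancelledError as exc:
        # The exception carries no message, so logs show only the type name.
        return f"failed: {type(exc).__name__} (message={str(exc)!r})"


if __name__ == "__main__":
    print(asyncio.run(run_case(None)))   # handshake allowed to finish
    print(asyncio.run(run_case(0.05)))   # awaiter cancelled mid-handshake
```

The first case reports two registered tools and the second reports a CancelledError with an empty message, matching the shape of the warning in gateway.error.log. If something in the gateway startup path cancels the awaiting task in a similar way, logging who issued the cancel (or shielding the discovery call) would be a reasonable next diagnostic step.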

Evidence attached

Happy to provide full gateway.log / gateway.error.log snippets and a minimal repro repo if helpful.

Metadata

Assignees

No one assigned

Labels

  • P1 (High: major feature broken, no workaround)
  • comp/gateway (Gateway runner, session dispatch, delivery)
  • comp/tools (Tool registry, model_tools, toolsets)
  • tool/mcp (MCP client and OAuth)
  • type/bug (Something isn't working)
