MCP reconnect fails with asyncio.CancelledError on Python 3.11+ #9930

@SwavveHub

Description

Priority: P1

Problem: When the Hermes gateway restarts, the GHL MCP server (586 tools) fails to reconnect with asyncio.CancelledError.

Root Cause: In tools/mcp_tool.py, line 997, except Exception does NOT catch asyncio.CancelledError on Python 3.11+. Since Python 3.8, CancelledError inherits from BaseException, not Exception. When the connection task is cancelled during a gateway restart, CancelledError escapes the exception handler and the reconnection loop aborts silently.
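The behaviour described above can be checked in isolation. The following is a minimal sketch, not the Hermes code: connect_forever is a hypothetical stand-in for a connection task, and the long sleep simply represents an open connection. It shows that a handler written as except Exception never runs when the task is cancelled.

```python
import asyncio


async def connect_forever(log):
    """Hypothetical stand-in for a connection task (not the Hermes code)."""
    try:
        await asyncio.sleep(3600)  # pretend to hold a connection open
    except Exception:
        # Never reached on cancellation: CancelledError is not an Exception.
        log.append("caught by except Exception")


async def main():
    log = []
    task = asyncio.create_task(connect_forever(log))
    await asyncio.sleep(0)  # let the task start and reach its await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        log.append("CancelledError escaped the handler")
    return log


result = asyncio.run(main())
print(issubclass(asyncio.CancelledError, Exception), result)
```

On Python 3.8 and later the subclass check is False and the only log entry comes from the outer handler, which matches the silent abort reported here.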

Environment: Python 3.11.15, hermes-agent v0.9.0

Fix — add explicit CancelledError handling before the Exception catch in MCPServerTask.run():

except asyncio.CancelledError:
    # Task was cancelled — expected during shutdown/cancel.
    # Do NOT treat as a connection failure.
    self.session = None
    return
except Exception as exc:

This pattern is already used correctly in the shutdown handler (lines 1056-1060). The bug only affects the connect/reconnect loop path.
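To illustrate how the except ordering changes the loop's behaviour, here is a hedged sketch. ServerTaskSketch, its connect callable, max_retries, and the failures counter are all hypothetical names invented for this example; the real MCPServerTask.run() is not shown in this issue and may be structured differently.

```python
import asyncio


class ServerTaskSketch:
    """Hypothetical stand-in for MCPServerTask; only the except ordering matters."""

    def __init__(self, connect, max_retries=3):
        self._connect = connect  # assumed: async callable returning a session
        self._max_retries = max_retries
        self.session = None
        self.failures = 0

    async def run(self):
        for _ in range(self._max_retries):
            try:
                self.session = await self._connect()
                return
            except asyncio.CancelledError:
                # Cancellation is expected during shutdown/restart, not a failure.
                self.session = None
                return
            except Exception:
                self.failures += 1
                await asyncio.sleep(0)  # placeholder for a real backoff delay


async def demo():
    async def flaky():
        raise ConnectionError("boom")

    async def hang():
        await asyncio.sleep(3600)

    failing = ServerTaskSketch(flaky)
    await failing.run()  # every attempt is counted as a connection failure

    cancelled = ServerTaskSketch(hang)
    task = asyncio.create_task(cancelled.run())
    await asyncio.sleep(0)  # let run() reach the pending connect
    task.cancel()
    await task  # returns normally because run() handled the cancellation
    return failing.failures, cancelled.failures, cancelled.session


outcome = asyncio.run(demo())
print(outcome)
```

In this sketch a cancelled connect is not counted as a failure, while ordinary errors still are. Whether to return or re-raise after cleanup is a design choice: the asyncio documentation says CancelledError should be re-raised in almost all situations, so a variant that clears the session and then uses a bare raise may be preferable if callers need to observe the cancellation.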

Reproduction: Start Hermes with GHL MCP → kill/restart gateway → MCP fails with CancelledError.

Metadata

Assignees

No one assigned

    Labels

    P1: High (major feature broken, no workaround)
    comp/gateway: Gateway runner, session dispatch, delivery
    tool/mcp: MCP client and OAuth
    type/bug: Something isn't working
