Bug Description
When the Hermes gateway runs for an extended period, MCP servers using the Streamable HTTP transport lose their server-side session. Subsequent tool calls fail with:
Invalid params: Invalid or expired session
The MCP client does not detect this condition and re-establish the session automatically. The only recovery is a full gateway restart, which interrupts all connected messaging platforms.
Note: This is NOT an OAuth token expiry issue — the access token remains valid (direct API calls return HTTP 200). The failure is at the MCP transport session layer.
Steps to Reproduce
- Configure a Streamable HTTP MCP server (e.g. WordPress.com MCP)
- Run `hermes gateway run` and leave it running for several days
- Invoke any tool on that MCP server
- Observe the error
Expected Behavior
When a tool call returns "Invalid or expired session", the MCP client should automatically re-establish the session using the still-valid credentials and retry the call transparently.
Actual Behavior
Every subsequent tool call on the affected MCP server fails. The server remains broken until the gateway is manually restarted.
Relevant log entries:
ERROR tools.mcp_tool: MCP tool wpcom-mcp/wpcom-mcp-content-authoring call failed: Invalid params: Invalid or expired session
WARNING tools.mcp_tool: Failed to connect to MCP server 'wpcom-mcp': Client error '401 Unauthorized'
WARNING tools.mcp_oauth: MCP OAuth for 'wpcom-mcp': non-interactive environment and no cached tokens found.
Affected Component
- Tools (MCP client)
- Agent Core (gateway long-running stability)
Environment
- OS: Linux x86_64
- Hermes Version: 0.10.0 (2026.4.16)
- Python: 3.11.13
- MCP transport: Streamable HTTP
Root Cause Analysis
The MCP client in tools/mcp_tool.py does not treat "Invalid or expired session" as a reconnect trigger. The 3-attempt retry logic runs only at gateway startup, not on mid-session failures. Session expiry during normal operation falls through as a plain tool error with no recovery path.
Proposed Fix
In the MCP tool call handler, catch "Invalid or expired session" errors, tear down and re-initialize the MCP client for that server, then retry the original call once. The OAuth token remains valid — only the transport-layer session needs to be re-established.
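A minimal sketch of the reconnect-and-retry flow described above, not the actual code in tools/mcp_tool.py. The names `is_session_expired`, `call_with_session_recovery`, `call`, and `reconnect` are hypothetical. It assumes the expiry surfaces as an exception whose text contains "Invalid or expired session", as in the log entries, and that the handler can be given one coroutine that performs the tool call and another that tears down and re-initializes the MCP client for that server.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Assumption: the error text from the report is how session expiry is signalled.
SESSION_EXPIRED_MARKER = "Invalid or expired session"


def is_session_expired(exc: BaseException) -> bool:
    """Return True if the exception text contains the session-expiry message."""
    return SESSION_EXPIRED_MARKER.lower() in str(exc).lower()


async def call_with_session_recovery(
    call: Callable[[], Awaitable[Any]],
    reconnect: Callable[[], Awaitable[None]],
) -> Any:
    """Run `call`; on session expiry, reconnect once and retry the call once."""
    try:
        return await call()
    except Exception as exc:
        if not is_session_expired(exc):
            raise  # unrelated failures propagate unchanged
    # Hypothetical hook: tear down and re-initialize the client for this server.
    await reconnect()
    # Single retry; a second failure is raised to the caller as before.
    return await call()
```

Limiting the retry to one attempt keeps a server that is genuinely down from causing a reconnect loop. Matching on the message text is fragile; if the client library exposes a structured error code for this condition, that would be a better trigger.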
Willing to submit a PR?
Not at this time, but the fix scope is well-defined above.