feat(mcp): add SSE transport fallback and keepalive for HTTP MCP servers #3976

Closed
airudotsh wants to merge 2 commits into NousResearch:main from airudotsh:feat/mcp-sse-fallback

Conversation

@airudotsh

Summary

Adds SSE transport fallback and keepalive for HTTP-based MCP servers.

Problem

Some MCP servers (e.g., Supermemory V4 on Cloudflare Workers) use the legacy SSE protocol instead of Streamable HTTP. When Hermes attempts a Streamable HTTP connection, it fails with "Session terminated" — making these servers completely unusable.

Additionally, SSE servers may close idle connections after a few minutes of silence, causing intermittent tool call failures that are hard to debug.

Solution

  1. SSE fallback: When Streamable HTTP fails, automatically falls back to the legacy SSE transport (sse_client). The import check (_MCP_SSE_AVAILABLE) is done at module load time with a graceful fallback.

  2. SSE keepalive: A background coroutine sends periodic pings (default: 60s interval) to keep the SSE connection alive. Prevents servers from closing idle connections.

  3. Updated error messages: When neither transport is available, the error message now mentions both options instead of only Streamable HTTP.

Testing

  • Used successfully with Supermemory V4 MCP server on Cloudflare Workers
  • py_compile passes on both modified files

Some MCP servers (e.g. Supermemory V4 on Cloudflare Workers) use the
legacy SSE protocol instead of Streamable HTTP. When the HTTP client
attempts a Streamable HTTP connection to such servers, it fails with
"Session terminated" or similar errors, making the MCP server unusable.

Additionally, SSE-based servers may close idle connections after a few
minutes of silence, causing tool calls to fail unpredictably.

Changes:
- Add SSE transport detection (`_MCP_SSE_AVAILABLE`) at import time
- Try Streamable HTTP first; on failure, fall back to SSE transport
- Add SSE keepalive coroutine that sends periodic pings (default: 60s)
  to prevent idle disconnect
- Update error message to mention both transport options

Combines the best of both approaches:
- Explicit SSE detection via transport: sse config or /sse URL path
  (based on work by @amiller in NousResearch#5981)
- Automatic SSE fallback when Streamable HTTP fails (original approach)
- SSE keepalive ping to prevent idle disconnect
- OAuth 2.1 PKCE support for SSE connections
- Transport type reporting in get_mcp_status() (sse/http/stdio)
- 12 new tests for SSE detection and status reporting

Routing logic in run():
  1. If _is_sse() → connect directly via SSE (skip HTTP)
  2. If plain HTTP → try Streamable HTTP, fallback to SSE on failure

This covers both use cases: servers that advertise SSE via /sse paths
AND servers where SSE is only discovered after HTTP connection fails.
Copilot AI review requested due to automatic review settings April 13, 2026 03:50

Copilot AI left a comment

Contributor

Pull request overview

Adds compatibility for MCP servers that only support legacy SSE by introducing SSE transport support (explicit and as a fallback) plus an SSE keepalive loop to reduce idle disconnects.

Changes:

  • Add module-level SSE availability detection and updated ImportError messaging for HTTP transports.
  • Add SSE connection path (_run_sse / _run_http_sse) with periodic keepalive pinging.
  • Extend status reporting and add tests around SSE transport detection and status output.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tools/mcp_tool.py | Adds SSE transport support, Streamable HTTP→SSE fallback logic, keepalive task, and status transport labeling. |
| tests/tools/test_mcp_tool.py | Updates existing HTTP-unavailable test and adds new tests for SSE detection and status transport reporting. |

Comment thread tools/mcp_tool.py
Comment on lines +1118 to +1123
# If explicitly SSE (config or URL path), go straight to SSE.
# Otherwise try Streamable HTTP with SSE fallback.
if self._is_sse():
    await self._run_sse(config)
else:
    await self._run_http(config)

Copilot AI Apr 13, 2026

MCPServerTask._is_sse() is referenced here, but no such method is defined on the class. This will raise AttributeError at runtime (and will also break the newly added tests). Add an _is_sse() helper (e.g., check config.get('transport') == 'sse' or whether the parsed URL path ends with /sse, handling trailing slashes and query params) and keep the behavior consistent with the tests.
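
A minimal sketch of such a helper, written as a free function over the config dict so it runs standalone. The name `is_sse` and the exact config keys (`transport`, `url`) are assumptions based on the comment above, not the merged implementation.

```python
from urllib.parse import urlparse


def is_sse(config: dict) -> bool:
    """Return True when the config asks for SSE or the URL path ends in /sse."""
    if str(config.get("transport", "")).lower() == "sse":
        return True
    # urlparse splits off the query string, so "?key=1" does not affect the path.
    path = urlparse(config.get("url") or "").path
    # rstrip("/") makes "/sse/" and "/sse" equivalent.
    return path.rstrip("/").endswith("/sse")
```

Matching on `"/sse"` rather than `"sse"` avoids false positives for paths that merely end in those letters.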

Comment thread tools/mcp_tool.py
await self._run_http_streamable(
    url, headers, connect_timeout, _oauth_auth, sampling_kwargs
)
return

Copilot AI Apr 13, 2026

The broad except Exception here will also catch asyncio.CancelledError and then proceed into the SSE fallback path, which can interfere with task cancellation/shutdown. Handle asyncio.CancelledError explicitly (re-raise) before the generic exception handler so cancellation propagates correctly.

Suggested change
-         return
+         return
+     except asyncio.CancelledError:
+         raise

Comment thread tools/mcp_tool.py
Comment on lines +889 to +893
# Try Streamable HTTP first
if _MCP_HTTP_AVAILABLE:
    try:
        await self._run_http_streamable(
            url, headers, connect_timeout, _oauth_auth, sampling_kwargs

Copilot AI Apr 13, 2026

New behavior: Streamable HTTP now falls back to SSE on failure, but there are no tests asserting (1) _run_http calls the SSE path when Streamable HTTP raises (e.g., McpError('Session terminated')) and (2) the original exception is raised when SSE is unavailable. Adding focused unit tests around this branching will help prevent regressions.
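
The two requested tests might be shaped like the sketch below. To stay dependency-free it tests a hypothetical `run_http` stand-in rather than the real `_run_http`, and it defines a local `McpError` class instead of importing the SDK's error type. Only the branching under test is meant to carry over.

```python
import unittest
from unittest.mock import AsyncMock


class McpError(Exception):
    """Local stand-in for the SDK's error type, so the sketch has no dependencies."""


async def run_http(streamable, sse, sse_available: bool) -> None:
    """Hypothetical stand-in for _run_http: Streamable HTTP first, then SSE."""
    try:
        await streamable()
    except Exception:
        if not sse_available:
            raise  # no fallback possible: surface the original error
        await sse()


class FallbackTests(unittest.IsolatedAsyncioTestCase):
    async def test_falls_back_to_sse(self):
        streamable = AsyncMock(side_effect=McpError("Session terminated"))
        sse = AsyncMock()
        await run_http(streamable, sse, sse_available=True)
        sse.assert_awaited_once()

    async def test_reraises_when_sse_unavailable(self):
        streamable = AsyncMock(side_effect=McpError("Session terminated"))
        sse = AsyncMock()
        with self.assertRaises(McpError):
            await run_http(streamable, sse, sse_available=False)
        sse.assert_not_awaited()
```

Against the real class, the same assertions would presumably be made by patching `_run_http_streamable` and the SSE path on an `MCPServerTask` instance.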

Comment thread tools/mcp_tool.py
Comment on lines +2036 to +2040
# Determine transport type: sse, http, or stdio
if "url" in cfg:
    _tmp = MCPServerTask(name)
    _tmp._config = cfg
    transport = "sse" if _tmp._is_sse() else "http"

Copilot AI Apr 13, 2026

get_mcp_status() instantiates MCPServerTask just to detect SSE vs HTTP, which pulls in asyncio primitives and duplicates runtime transport detection logic. Consider extracting the SSE detection into a small pure helper (or @staticmethod) that operates directly on the config dict/URL so status rendering doesn’t need to construct a full server task.
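
One way to read this suggestion is sketched below: a `@staticmethod` that inspects only the config dict, plus a small labeling function for the status output. `MCPServerTaskSketch`, `is_sse_config`, and `transport_label` are hypothetical names, and the detection rule (a `transport: sse` setting or a `/sse` URL path) is taken from the PR description.

```python
from urllib.parse import urlparse


class MCPServerTaskSketch:
    """Hypothetical stand-in; only the static detection helper is shown."""

    @staticmethod
    def is_sse_config(cfg: dict) -> bool:
        """Pure check on the config dict; needs no instance and no event loop."""
        if str(cfg.get("transport", "")).lower() == "sse":
            return True
        path = urlparse(cfg.get("url") or "").path
        return path.rstrip("/").endswith("/sse")


def transport_label(cfg: dict) -> str:
    """Classify a server config as "sse", "http", or "stdio" for status output."""
    if "url" not in cfg:
        return "stdio"
    return "sse" if MCPServerTaskSketch.is_sse_config(cfg) else "http"
```

Because both the runtime path and the status path would call the same helper, the two cannot drift apart.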

amiller added a commit to amiller/hermes-agent that referenced this pull request Apr 22, 2026
The SSE read timeout was set to the tool timeout (60s), causing
httpcore.ReadTimeout after ~60s of silence. SSE servers like Router
Teamwork and Supermemory close idle connections, resulting in
ClosedResourceError on subsequent tool calls.

Changes:
- Add _sse_keepalive() coroutine: sends session.send_ping() every 60s
  to keep the SSE stream alive (adapted from @airouz in NousResearch#3976)
- Bump sse_read_timeout from tool_timeout to 300s (5 min safety net)
- Add OAuth 2.1 PKCE support to _run_sse (matching _run_http)
- Cancel keepalive cleanly on shutdown with try/finally
- Log transport as 'SSE' instead of 'HTTP' in _discover_and_register_server

Verified: SSE connection stays alive for 90+ seconds of idle with
keepalive. Previously died at ~60s.
@alt-glitch added the labels type/feature (New feature or request), P2 (Medium, degraded but workaround exists), and tool/mcp (MCP client and OAuth) on May 2, 2026
@alt-glitch

Collaborator

Likely duplicate of #5981 — same SSE transport fallback feature for MCP HTTP servers. Also related to #11647. Three competing PRs need consolidation.

@airudotsh

Author

Superseded by #21343. The new PR is a focused single-commit follow-up to #5981/#21323 for the remaining SSE keepalive coroutine only, rebased on current main.

@airudotsh airudotsh closed this May 7, 2026
Labels

P2 (Medium, degraded but workaround exists)
tool/mcp (MCP client and OAuth)
type/feature (New feature or request)

3 participants