Skip to content

fix(core): add streamable HTTP SSE POST fallback#4344

Closed
cyphercodes wants to merge 2 commits into
QwenLM:mainfrom
cyphercodes:fix/mcp-spring-ai-streamable-fallback
Closed

fix(core): add streamable HTTP SSE POST fallback#4344
cyphercodes wants to merge 2 commits into
QwenLM:mainfrom
cyphercodes:fix/mcp-spring-ai-streamable-fallback

Conversation

@cyphercodes

Copy link
Copy Markdown
Contributor

Summary

  • What changed: Added a qwen-code-side compatibility fallback for Streamable HTTP MCP servers that reject the spec-compliant SSE GET with HTTP 400/405. The fallback retries the SSE opening request with POST and logs a warning that the server is spec-divergent.
  • Why it changed: Spring AI 1.1.x MCP servers can reject the SDK's Streamable HTTP SSE GET, which causes Qwen Code to report transport errors even though normal Streamable HTTP requests can work.
  • Reviewer focus: Please verify the fallback is limited to method-related 400/405 responses, leaves 5xx/network failures unchanged, and stays in Qwen Code's wrapper layer without modifying the upstream MCP SDK.

Validation

  • Commands run:
    npm ci --ignore-scripts
    cd packages/core && npx vitest run src/tools/mcp-client.test.ts
    npm run typecheck --workspace=packages/core
    npm run lint --workspace=packages/core
    npm run build --workspace=packages/core
    npx prettier --check packages/core/src/tools/mcp-client.ts packages/core/src/tools/mcp-client.test.ts
    git diff --check
  • Prompts / inputs used: N/A — unit-level transport wrapper behavior.
  • Expected result: Streamable HTTP SSE GET responses with HTTP 400/405 are retried once with POST; 5xx responses are not retried.
  • Observed result: Focused unit tests passed and covered 400 fallback, 405 fallback, and non-retry for 5xx.
  • Quickest reviewer verification path: Run cd packages/core && npx vitest run src/tools/mcp-client.test.ts.
  • Evidence (output, logs, screenshots, video, JSON, before/after, etc.):
    ✓ src/tools/mcp-client.test.ts (39 tests) 567ms
    Test Files  1 passed (1)
    Tests       39 passed (39)
    
    > @qwen-code/qwen-code-core@0.15.11 typecheck
    > tsc --noEmit
    
    > @qwen-code/qwen-code-core@0.15.11 lint
    > eslint . --ext .ts,.tsx
    
    > @qwen-code/qwen-code-core@0.15.11 build
    > node ../../scripts/build_package.js
    Successfully copied files.
    
    Checking formatting...
    All matched files use Prettier code style!
    

Scope / Risk

  • Main risk or tradeoff: Servers that reject GET with 400/405 incur one extra failed round trip before the fallback. For 405, if the fallback POST is also rejected, the original no-SSE behavior is preserved.
  • Not covered / not validated: No live Spring AI server was available in this workspace; validation uses focused unit tests around the transport wrapper fetch behavior.
  • Breaking changes / migration notes: None.

Testing Matrix

🍏 🪟 🐧
npm run N/A N/A
npx N/A N/A
Docker N/A N/A N/A
Podman N/A N/A N/A
Seatbelt N/A N/A N/A

Testing matrix notes:

  • Tested on Linux in the repository workspace with Node 22.22.2/npm 10.9.7. No OS-specific code paths were changed.

Linked Issues / Bugs

Fixes #4326

method: 'POST',
headers: retryHeaders,
};
delete retryInit.body;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] Wrap the retry fetchImpl in a try/catch and cancel the original response body before the retry.

Two robustness improvements:

  1. Cancel the 400/405 body before retrying — the error response body is never useful to the caller. Cancelling it early frees the TCP connection and simplifies post-retry cleanup.
  2. Add try/catch for network errors — if fetchImpl(url, retryInit) throws (DNS failure, TCP reset, TLS error), the original response.body currently leaks. The catch ensures it's always cancelled.
Suggested change
delete retryInit.body;
await response.body?.cancel();
let retryResponse: Response;
try {
retryResponse = await fetchImpl(url, retryInit);
} catch (err) {
throw err;
}
if (retryResponse.ok) {
return retryResponse;
}
if (response.status === 405) {
await retryResponse.body?.cancel();
return response;
}
return retryResponse;

— qwen-latest-series-invite-beta-v34 via Qwen Code /review

Authorization: 'derp',
});
});

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] Add test coverage for POST retry failure paths.

The three existing tests cover happy paths (400→POST ok, 405→POST ok, 5xx no-retry), but neither "POST retry also fails" branch is tested. These paths contain the most nuanced logic — asymmetric body cancellation and different return-value semantics for 400 vs 405.

Suggested additional tests:

  1. GET 400, POST retry also fails (e.g. 503) — verify the retry's response is returned and bodies are handled correctly
  2. GET 405, POST retry also fails (e.g. 400) — verify the original 405 response is returned (not the retry's response)

These pin down the asymmetric failure semantics and guard against regressions in the body-cancellation logic.

— qwen-latest-series-invite-beta-v34 via Qwen Code /review

@cyphercodes

Copy link
Copy Markdown
Contributor Author

Updated this PR to address the Streamable HTTP fallback review feedback.

Changes:

  • Cancel the original 400/405 SSE GET response body before retrying with POST, including retry-throw paths.
  • Added regression coverage for:
    • GET 400 → POST retry also fails: returns retry response and cancels original body.
    • GET 405 → POST retry also fails: returns original response and cancels retry body.
    • POST retry throws: original body is still cancelled.

Verification:

  • cd packages/core && npx vitest run src/tools/mcp-client.test.ts
  • npx prettier --check packages/core/src/tools/mcp-client.ts packages/core/src/tools/mcp-client.test.ts
  • npx eslint packages/core/src/tools/mcp-client.ts packages/core/src/tools/mcp-client.test.ts
  • npm run typecheck --workspace=packages/core
  • npm run build --workspace=packages/core
  • git diff --check

const retryInit: RequestInit = {
...init,
method: 'POST',
headers: retryHeaders,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] Keep the original 405 response body intact until you know it will not be returned. Right now this cancels the original GET response before the POST retry; if that retry also fails, the 405 branch returns the same original response with its body already canceled, so downstream error handling cannot read the original diagnostic body.

Suggested change
headers: retryHeaders,
const retryResponse = await fetchImpl(url, retryInit);
if (retryResponse.ok) {
await response.body?.cancel();
return retryResponse;
}
if (response.status === 405) {
await retryResponse.body?.cancel();
return response;
}
await response.body?.cancel();
return retryResponse;

— gpt-5.5 via Qwen Code /review

@LaZzyMan LaZzyMan left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

Closing this PR after a deeper read of the MCP Streamable HTTP spec and the failure path on Spring AI 1.1.5. The investigation in #4326 was thorough and the symptom is real — but the root cause is in Spring AI, and the wrapper this PR adds doesn't actually fix the case it targets. Apologies for the back-and-forth; my earlier "if GET fails, retry POST" suggestion on #4326 was wrong on the spec, and the better response now is to escalate upstream while keeping our client spec-pure.

Why this isn't the right shape

The MCP Streamable HTTP spec splits the transport into two distinct channels with different roles:

  • POST channel (required) — every JSON-RPC request goes here, with Accept: application/json, text/event-stream and a JSON-RPC body. Each POST returns one response; that response can be application/json or a text/event-stream containing the events that respond to that request. This is where tool calls happen.
  • GET channel (optional) — opened with GET + Accept: text/event-stream, used for server-initiated messages that aren't tied to any client request. The spec is explicit: the server MUST either return 200 + text/event-stream or HTTP 405 Method Not Allowed. 405 is the spec-defined "this server does not offer GET SSE" signal, and the SDK's _startOrAuthSse already handles 405 gracefully by silently returning and proceeding POST-only.

Spring AI 1.1.5's WebMvcStreamableServerTransportProvider returns 400 on this GET, which is a server-side spec violation (it should be 405). That 400 is the only thing breaking — the POST channel works, tools work, the user impact in #4326 is log noise plus a reconnect loop, not lost functionality.

A "POST retry for the SSE-opening request" doesn't exist anywhere in the spec — POST is the JSON-RPC request channel, not an alternative way to open the optional listening channel. The bodiless POST this wrapper sends (no Content-Type, no JSON-RPC body, Accept: text/event-stream) doesn't match Spring AI's @PostMapping(consumes = APPLICATION_JSON_VALUE) controller and will typically come back as 415 Unsupported Media Type — so for the actual Spring AI case the PR targets, the retry fails and the SDK still throws StreamableHTTPError, just with status 415 instead of 400. If the retry ever did return 200 with a non-text/event-stream body (proxy / unrelated controller), the SDK would hand it to its SSE parser and produce silent failures.

What I'd suggest instead

If a client-side compat shim is still wanted while Spring AI is being fixed, the minimal spec-compliant version is to translate the 400 into a 405 on the SSE-opening GET so the SDK's existing graceful-degradation path runs. Roughly:

function createSseGetCompatFetch(baseFetch: FetchLike): FetchLike {
  return async (url, init) => {
    const response = await baseFetch(url, init);
    const isSseGet =
      (init?.method ?? 'GET').toUpperCase() === 'GET' &&
      new Headers(init?.headers).get('accept')?.includes('text/event-stream');
    if (isSseGet && response.status === 400) {
      await response.body?.cancel();
      return new Response(null, { status: 405, statusText: 'Method Not Allowed (translated from 400 for spec-divergent server)' });
    }
    return response;
  };
}

About 10 lines including the warning log, no SSE parser hazard, no fake handshake, and the moment Spring AI returns 405 (as the spec requires) the shim becomes a no-op naturally.

Right escalation

The durable fix belongs in Spring AI — please open an issue against spring-projects/spring-ai describing the 400 vs 405 divergence on WebMvcStreamableServerTransportProvider. If you'd like, we can carry the small translation shim above as a follow-up PR with a TODO pointing at the Spring AI issue so it gets removed once upstream lands the fix.

Thanks for the careful trace in #4326 and the work on this PR — the symptom is genuinely worth chasing, just at a different layer.

@LaZzyMan

Copy link
Copy Markdown
Collaborator

Closing per the review above — the root cause is in Spring AI's WebMvcStreamableServerTransportProvider returning 400 on the SSE-opening GET when the spec requires 405. The wrapper in this PR doesn't address that on the actual Spring AI case (the bodiless POST retry will 415 against their @PostMapping(consumes = APPLICATION_JSON_VALUE) controller). If you'd like to take a stab at the 10-line 400→405 translation shim sketched in the review, that would be very welcome. Thanks again for the investigation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MCP Streamable HTTP transport incompatible with Spring AI servers - GET method not supported

3 participants