fix(mcp): retry stale pipe transport failures as session-expired #21289

Merged
teknium1 merged 2 commits into main from hermes/hermes-b625eb32 on May 7, 2026

Conversation

@teknium1 teknium1 commented May 7, 2026

Salvage of #19935 by @subtract0 onto current main. Resolved a tiny rebase conflict against the recent "session terminated" marker addition (kept both).

Summary

Extends _SESSION_EXPIRED_MARKERS with six stdio/anyio stale-pipe variants so they trigger the existing reconnect-and-retry-once path rather than bubbling up as a one-shot failure. Covers ClosedResourceError, closed resource, transport is closed, connection closed, broken pipe, and end of file. Pairs directly with the stale-session recovery infrastructure from #13383.
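The classification and retry behavior described above can be sketched roughly as follows. This is an illustrative sketch, not the code in tools/mcp_tool.py: the names `STALE_PIPE_MARKERS`, `looks_session_expired`, and `call_with_one_retry` are hypothetical, and the tuple lists only the six markers named in this PR, not the pre-existing ones.

```python
import asyncio

# Hypothetical subset: only the six stale-pipe markers listed in this PR.
# The real _SESSION_EXPIRED_MARKERS also contains older entries.
STALE_PIPE_MARKERS = (
    "closedresourceerror",
    "closed resource",
    "transport is closed",
    "connection closed",
    "broken pipe",
    "end of file",
)


def looks_session_expired(exc: BaseException) -> bool:
    """Case-insensitive substring match on the exception message."""
    message = str(exc).lower()
    if not message:
        # Empty messages are not classified in this sketch.
        return False
    return any(marker in message for marker in STALE_PIPE_MARKERS)


async def call_with_one_retry(call, reconnect):
    """Run `call`; on a stale-session error, reconnect and retry exactly once."""
    try:
        return await call()
    except Exception as exc:
        if not looks_session_expired(exc):
            raise  # unrelated errors propagate unchanged
        await reconnect()
        return await call()  # a second failure is not caught
```

Here `call` and `reconnect` are assumed to be zero-argument coroutine functions supplied by the caller; a second stale-pipe failure after the reconnect propagates to the caller rather than looping.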

Changes

  • tools/mcp_tool.py: +6 markers to _SESSION_EXPIRED_MARKERS
  • tests/tools/test_mcp_tool_session_expired.py: +11 lines of regression coverage
  • scripts/release.py: AUTHOR_MAP entry for @subtract0

Partial fix for #19417

This covers the stale-pipe retry gap for callers that format exceptions with a message (str(exc) contains the type name or a description). It does NOT fully close #19417 because _is_session_expired_error returns False for empty-string exception messages, and bare anyio ClosedResourceError() instances have str(exc) == ''. That's a separate fix (match on type(exc).__name__ too) and deserves its own PR. Leaving #19417 open.
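One possible shape for that follow-up, matching on the exception type name as well as the message, is sketched below. Everything here is an assumption for illustration: `MARKERS` is a shortened hypothetical list, `is_stale_pipe_error` is not the project's function, and the local `ClosedResourceError` class stands in for anyio's exception so the snippet runs without dependencies.

```python
# Shortened, hypothetical marker list for illustration only.
MARKERS = ("closedresourceerror", "closed resource", "broken pipe")


class ClosedResourceError(Exception):
    """Local stand-in for anyio.ClosedResourceError (assumption)."""


def is_stale_pipe_error(exc: BaseException) -> bool:
    """Match markers against the type name plus the message, lowercased."""
    # Including the type name means a bare ClosedResourceError(), whose
    # str() is the empty string, is still recognized.
    text = f"{type(exc).__name__} {exc}".lower()
    return any(marker in text for marker in MARKERS)


bare = ClosedResourceError()
print(repr(str(bare)))            # the message alone is empty
print(is_stale_pipe_error(bare))  # matched through the type name
```

Matching on the type name widens the check, so a real change would need the same false-positive coverage as the message-based markers.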

Validation

| Check | Result |
| --- | --- |
| scripts/run_tests.sh tests/tools/test_mcp_tool*.py | 201/201 passed (+2 new tests) |
| E2E: 12 stale-session/transport error messages classified correctly | all 12 detected |
| E2E: 5 unrelated error messages | all 5 correctly NOT flagged (no false positives) |
| Case-insensitive matching for CLOSEDRESOURCEERROR variants | works |

Closes #19935. @subtract0's authorship preserved via rebase-merge.

Alexander Monas and others added 2 commits May 7, 2026 06:30
Treat closed-resource, closed-transport, broken-pipe, and EOF MCP failures as stale session equivalents so the existing reconnect/retry-once path can recover. Add regression coverage for the stale-pipe marker variants.

Checks:
- python -m py_compile tools/mcp_tool.py tests/tools/test_mcp_tool_session_expired.py
- python -m pytest tests/tools/test_mcp_tool_session_expired.py -q -o addopts=
- selected secret scan over touched files
@teknium1 teknium1 merged commit f481395 into main May 7, 2026
9 of 11 checks passed
@teknium1 teknium1 deleted the hermes/hermes-b625eb32 branch May 7, 2026 13:32
github-actions Bot commented May 7, 2026

🔎 Lint report: hermes/hermes-b625eb32 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 7531 on HEAD, 7531 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 3954 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

Labels

  • P2: Medium, degraded but workaround exists
  • tool/mcp: MCP client and OAuth
  • type/bug: Something isn't working

Development

Successfully merging this pull request may close these issues.

[Bug]: MCP tool calls fail with ClosedResourceError and empty error message

2 participants