fix: recover MCP stale session retries#19208
Open
mgh3326 wants to merge 3 commits into
Open
Conversation
Three follow-ups to the structured stale-transport return added in this PR: - Failed recovery (timeout / retry raised / retry returned JSON error) now calls _bump_server_error() so the circuit breaker still trips on a permanently broken transport. Before, the structured return bypassed the caller's bump and the model could hammer the tool indefinitely. - Successful retry uses _reset_server_error() instead of a raw count assignment, so the breaker_opened_at timestamp is also cleared. Without this, a server that recovered after tripping the breaker would keep a stale opened-at, miscomputing future cooldowns. - _stale_transport_error drops the op_description from the operator-visible message; the auth-error sibling already keeps its message scoped to the server, and the failing op is preserved in the surrounding log lines. Tests cover the three failure-mode bumps, the success-path reset (both count and opened-at), and document the _install_stub_server __dict__-copy contract that future contributors are most likely to trip over.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Verification
python -m ruff check tools/mcp_tool.py tests/tools/test_mcp_tool_session_expired.pypython -m pytest tests/tools/test_mcp_tool_session_expired.py -v -o 'addopts='(26 passed)python -m pytest tests/tools -v -k mcp -o 'addopts='(326 passed, 1 skipped, 4056 deselected, 3 warnings)Follow-ups (not in this PR)
_handle_auth_error_and_retryhas the same retry-on-stale-session race (mcp_tool.py:1612-1616) that this PR fixes for the session-expired path; tracked as a separate PR so the OAuth recovery scope stays reviewable on its own.Safety / non-actions