fix(mcp): clear stale thread interrupt before MCP discovery #21276
Merged
Fixes #9930
When an agent session is interrupted (Ctrl+C or gateway timeout), the current thread's interrupt flag is set in _interrupted_threads. asyncio executor threads are pooled and reused across sessions, so a thread that carried an interrupt flag from a prior session will immediately cancel any new asyncio work dispatched to it, including MCP server discovery.
Fix: in register_mcp_servers(), temporarily clear the interrupt flag on the current thread before running _discover_all(), then restore it afterward in a finally block so the original interrupt state is not lost.
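A minimal sketch of the snapshot / clear / restore pattern described above. This is not the code from tools/mcp_tool.py: the registry, set_interrupted, is_interrupted, interrupt_cleared, and discover are hypothetical stand-ins written only to illustrate the idea.

```python
import threading
from contextlib import contextmanager

# Hypothetical stand-in for a per-thread interrupt registry keyed by thread ident.
_interrupted_threads: set[int] = set()
_lock = threading.Lock()


def set_interrupted(flag: bool) -> None:
    """Set or clear the interrupt flag for the calling thread."""
    ident = threading.get_ident()
    with _lock:
        if flag:
            _interrupted_threads.add(ident)
        else:
            _interrupted_threads.discard(ident)


def is_interrupted() -> bool:
    """Report whether the calling thread currently carries an interrupt flag."""
    with _lock:
        return threading.get_ident() in _interrupted_threads


@contextmanager
def interrupt_cleared():
    """Snapshot the flag, clear it for the body, restore it in finally."""
    was_interrupted = is_interrupted()
    set_interrupted(False)
    try:
        yield
    finally:
        if was_interrupted:
            set_interrupted(True)


def discover() -> str:
    """Hypothetical discovery step that refuses to run on an interrupted thread."""
    if is_interrupted():
        raise InterruptedError("stale interrupt")
    return "discovered"


set_interrupted(True)           # simulate a flag left over from a prior session
with interrupt_cleared():
    result = discover()         # runs because the flag is cleared inside the block
restored = is_interrupted()     # the original flag is back after the block
```

Restoring in finally means the flag comes back even if discovery raises, so an interrupt that was set before the call is not lost.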
Salvage of #10287 by @AJV20 onto current main. Cherry-picked clean.
Summary
Clears the calling thread's stale interrupt flag before running MCP server discovery so discovery isn't immediately cancelled by interrupt state left over from a prior agent session.
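To illustrate how interrupt state can be "left over" on a pooled thread, here is a small sketch using a single-worker ThreadPoolExecutor. The registry and the two session functions are hypothetical and only mimic the behaviour described in this PR; they are not the project's code.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical registry keyed by thread ident, mimicking per-thread interrupt tracking.
interrupted_threads: set[int] = set()


def interrupted_session() -> int:
    """Simulate a session that ends with its thread's interrupt flag still set."""
    ident = threading.get_ident()
    interrupted_threads.add(ident)
    return ident


def next_session() -> tuple[int, bool]:
    """Simulate a later session checking the flag on whatever thread it lands on."""
    ident = threading.get_ident()
    return ident, ident in interrupted_threads


# One worker forces the pool to reuse the same thread for both submissions.
with ThreadPoolExecutor(max_workers=1) as pool:
    first_ident = pool.submit(interrupted_session).result()
    second_ident, saw_stale_flag = pool.submit(next_session).result()
```

Because the flag is keyed on the thread ident rather than on the session, the second task inherits it even though it never requested an interrupt.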
Root cause
tools.interrupt tracks interrupts per thread ident in _interrupted_threads. asyncio's executor thread pool reuses threads across sessions. When a prior agent session set its thread's interrupt flag (via Ctrl+C, gateway timeout, etc.) and the thread got pooled and later reused to run register_mcp_servers, the first poll iteration inside _run_on_mcp_loop sees is_interrupted() == True, cancels the discovery future, and raises InterruptedError. MCP discovery never runs, so MCP tools never get registered, so they silently don't appear in subsequent sessions.
Fix
Around _run_on_mcp_loop(_discover_all(), timeout=120) in register_mcp_servers: snapshot the current thread's interrupt state, temporarily clear it for the duration of discovery, restore on exit via finally. The user's actual interrupt semantics (if any) are preserved.
Changes
- tools/mcp_tool.py: +13 / -1 around the discovery call site
- scripts/release.py: AUTHOR_MAP entry for @AJV20
Note on #9930 reference
The original PR header says "Fixes #9930", but #9930 is a different bug (Python 3.11+ CancelledError escaping except Exception in MCPServerTask.run(), also still live on main). Both bugs are real and independent; this PR only fixes the stale-interrupt path. Leaving #9930 open for a follow-up.
Validation
- scripts/run_tests.sh tests/tools/test_mcp_tool*.py
- register_mcp_servers with the patch
- _run_on_mcp_loop directly with no guard: InterruptedError: User sent a new message (confirms bug is real)
Closes #10287 (not #9930). @AJV20's authorship preserved via rebase-merge.
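For anyone wanting to reproduce the unguarded failure from the Validation notes in miniature, the sketch below builds a polling helper on a background event loop. It assumes the real _run_on_mcp_loop polls an interrupt flag while waiting on a future, as the root cause describes. The names run_on_loop, interrupted, and discover are hypothetical, and the actual helper may differ in detail.

```python
import asyncio
import concurrent.futures
import threading
import time

# Background event loop running in its own thread (a stand-in for an MCP loop).
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

# Hypothetical per-thread interrupt registry.
interrupted: set[int] = set()


def run_on_loop(coro, timeout: float = 5.0):
    """Schedule coro on the background loop and poll for its result.

    Each poll iteration checks the calling thread's interrupt flag first,
    so a stale flag cancels the work before it can finish.
    """
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    deadline = time.monotonic() + timeout
    while True:
        if threading.get_ident() in interrupted:
            future.cancel()
            raise InterruptedError("User sent a new message")
        try:
            return future.result(timeout=0.05)
        except concurrent.futures.TimeoutError:
            if time.monotonic() >= deadline:
                future.cancel()
                raise


async def discover() -> list[str]:
    """Hypothetical discovery coroutine returning one tool name."""
    await asyncio.sleep(0.01)
    return ["tool_a"]


# Unguarded call on a thread that carries a stale flag: fails on the first poll.
interrupted.add(threading.get_ident())
try:
    run_on_loop(discover())
    error = None
except InterruptedError as exc:
    error = str(exc)

# With the flag cleared, the same call completes.
interrupted.discard(threading.get_ident())
tools = run_on_loop(discover())

loop.call_soon_threadsafe(loop.stop)
```

The first call never lets discovery finish, which matches the symptom reported here: tools are silently missing rather than failing with an error at the point of use.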