Fix "add custom context server" modal hanging indefinitely#50085
    worktree_store: Entity<WorktreeStore>,
    cx: &AsyncApp,
) -> Option<Self> {
    const EXTENSION_COMMAND_TIMEOUT: Duration = Duration::from_secs(10);
Is my understanding correct that the extension could be attempting an npm install during that command execution? If that’s the case, we have to bump this timeout (maybe 30 seconds?).
The extension gives us a command that we run ourselves. I am not aware of any extensions that try to install something when we resolve the command, so 10 seconds should be fine.
I got that impression from here: https://github.com/akbxr/zed-mcp-server-context7/blob/c6b427cff4bc5c36df15533ea64608cb47ab81ff/src/mcp_server_context7.rs#L32 (it's the one that is problematic in my setup). But I'm not familiar with this code, so I may have misinterpreted it.
No, it looks like you're right. Let's increase the timeout, as you suggested.
Bumped to 30 seconds.
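For readers following the thread, here is a minimal sketch of what bounding a command resolution with a timeout can look like. It is not the code from this PR: the real change lives in async code, whereas this sketch swaps in standard-library threads and a channel so that it stays self-contained. The name `resolve_with_timeout` and the closure-based shape are assumptions made for illustration only.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `resolve` on a helper thread and waits at most
/// `timeout` for its result. Returns `None` if the timeout elapses first
/// (or if the helper thread panicked before sending a result).
fn resolve_with_timeout<T, F>(resolve: F, timeout: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the caller timed out;
        // in that case the send error is deliberately ignored.
        let _ = tx.send(resolve());
    });
    // A slow or stuck resolver no longer blocks the caller past `timeout`.
    rx.recv_timeout(timeout).ok()
}
```

Called as `resolve_with_timeout(|| slow_lookup(), Duration::from_secs(30))`, where `slow_lookup` is a placeholder, a caller that resolves several servers in sequence is delayed by at most the timeout per slow extension instead of indefinitely.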
When adding a new MCP server through the UI modal, the "waiting for context server..." state can hang forever without surfacing any error. There are several independent ways this can happen, all addressed here.

First, `maintain_servers` resolves the configuration for every enabled server (including extension servers) through a `join_all`. Extension servers call into the extension host to resolve their command, and that call had no timeout. A single slow or stuck extension blocks the entire `join_all`, which means `maintain_servers` never returns, `populate_server_ids` never runs, and the new server never even appears in the panel, let alone starts. This is now bounded by a 30-second per-extension timeout.

Second, the `servers_to_start` loop in `maintain_servers` used `?` on `create_context_server`. If any server in the loop failed to create (e.g. the process could not be spawned), the error propagated out to `available_context_servers_changed`, which only logged it. No `ServerStatusChangedEvent` was ever emitted for the failed server, so the modal subscription waiting for Running/Stopped/Error never resolved. Other servers later in the loop were also skipped. The loop now handles each failure individually: it logs the error, emits an Error status event so the modal gets notified, and continues to the next server.

Third, `wait_for_context_server` (the future the modal awaits) had no timeout of its own. If no status event was ever emitted for any reason, it would wait forever. It now has a 120-second timeout that surfaces an actionable error message.
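The second and third points can be sketched together. This is a simplified, standard-library model rather than the actual implementation: a channel stands in for the status-event subscription, and `Status`, `StatusEvent`, `create_server`, `start_servers`, and `wait_for_server` are hypothetical names invented for this example. The sketch shows a start loop that reports each failure as an event and keeps going, and a waiter that gives up after a deadline instead of waiting forever.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::time::{Duration, Instant};

#[derive(Debug, Clone, PartialEq)]
enum Status {
    Running,
    Error(String),
}

#[derive(Debug, Clone, PartialEq)]
struct StatusEvent {
    server_id: String,
    status: Status,
}

/// Hypothetical stand-in for server creation; fails for an empty command.
fn create_server(command: &str) -> Result<(), String> {
    if command.is_empty() {
        Err("process could not be spawned".to_string())
    } else {
        Ok(())
    }
}

/// Starts every server and emits one event per server. A failure is logged
/// and reported as an `Error` event, and the loop continues with the next
/// server instead of returning early with `?`.
fn start_servers(servers: &[(&str, &str)], events: &Sender<StatusEvent>) {
    for &(id, command) in servers {
        let status = match create_server(command) {
            Ok(()) => Status::Running,
            Err(err) => {
                eprintln!("failed to create context server {}: {}", id, err);
                Status::Error(err)
            }
        };
        // Ignore send errors: nobody may be listening any more.
        let _ = events.send(StatusEvent { server_id: id.to_string(), status });
    }
}

/// Waits for the first event about `server_id`, giving up after `timeout`.
fn wait_for_server(
    events: &Receiver<StatusEvent>,
    server_id: &str,
    timeout: Duration,
) -> Result<Status, String> {
    let deadline = Instant::now() + timeout;
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match events.recv_timeout(remaining) {
            Ok(event) if event.server_id == server_id => return Ok(event.status),
            Ok(_) => continue, // event for another server; keep waiting
            Err(_) => {
                return Err(format!(
                    "timed out waiting for context server {} to report a status",
                    server_id
                ))
            }
        }
    }
}
```

With this shape, a waiter for a server that failed to spawn receives an `Error` status promptly, and a waiter for a server that never reports anything gets an error once the deadline passes.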
When adding a new MCP server through the UI modal, the "waiting for context server..." state can hang forever without surfacing any error. This PR addresses:

- In `maintain_servers` we resolve the configuration for every enabled server (including extension servers) through a `join_all`. Extension servers call into the extension host to resolve their command, and that call had no timeout. A single slow or stuck extension blocks the entire `join_all`, which means `maintain_servers` never returns, `populate_server_ids` never runs, and the new server never even appears in the panel. This is now bounded by a 30-second per-extension timeout.
- The `servers_to_start` loop in `maintain_servers` used `?` on `create_context_server`. If any server in the loop failed to create (e.g. the process could not be spawned), the error propagated out to `available_context_servers_changed`, which only logged it. No `ServerStatusChangedEvent` was ever emitted for the failed server, so the modal subscription waiting for Running/Stopped/Error never resolved. Other servers later in the loop were also skipped. The loop now handles each failure individually: it logs the error, emits an Error status event so the modal gets notified, and continues to the next server.
- And `wait_for_context_server` (the future the modal awaits) had no timeout of its own. If no status event was ever emitted for any reason, it would wait forever. It now has a 120-second timeout that surfaces an actionable error message.

Release Notes: