perf(tui): stop slow/dead MCP servers from freezing TUI startup #35273

Merged: teknium1 merged 1 commit into main from hermes/hermes-819640c9 on May 30, 2026

@teknium1 (Contributor) commented May 30, 2026

Summary

TUI startup no longer freezes on a configured-but-dead MCP server. Discovery moves to a background daemon thread so gateway.ready fires immediately (~7,500ms → ~115ms with a dead server), with a bounded first-build wait so reachable servers still land in the tool snapshot.

Salvage of #35245 by @kshitijk4poor, cherry-picked onto current main (authorship preserved).

Changes

  • tui_gateway/entry.py: run discover_mcp_tools() in a daemon thread; expose wait_for_mcp_discovery(timeout=0.75); log background failures with exc_info=True instead of swallowing.
  • tui_gateway/server.py: _make_agent briefly joins discovery (bounded) before the first agent build, since AIAgent snapshots its tool list once and never re-reads. /reload-mcp now actually rebuilds the cached agent's tool snapshot (the old hasattr(agent, "refresh_tools") guard was dead code — no such method exists on main), mirroring gateway/run.py::_execute_mcp_reload.
  • hermes_cli/banner.py: lazy rich / prompt_toolkit imports (~45ms off the TUI critical path; Console annotation under TYPE_CHECKING).
  • tests/tui_gateway/test_wait_for_mcp_discovery.py: 4 new tests (no-thread no-op, finished no-op, fast-join, hung-thread bounded).
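
For readers who want a concrete picture of the bounded join, here is a minimal sketch. It is not the code from this PR. `discover_mcp_tools`, `wait_for_mcp_discovery(timeout=0.75)`, and `_mcp_discovery_thread` are names taken from this page; `start_mcp_discovery`, `_run_discovery`, the stub discovery body, and the boolean return value are assumptions made for illustration.

```python
import logging
import threading
import time
from typing import Callable, Optional

logger = logging.getLogger("tui_gateway.sketch")

# Module-level handle to the background thread (None until discovery starts).
_mcp_discovery_thread: Optional[threading.Thread] = None


def discover_mcp_tools() -> None:
    """Stand-in for real discovery; simulates a briefly slow server."""
    time.sleep(0.05)


def _run_discovery(target: Callable[[], None]) -> None:
    """Run discovery and log failures with a traceback instead of swallowing them."""
    try:
        target()
    except Exception:
        logger.warning("background MCP discovery failed", exc_info=True)


def start_mcp_discovery(target: Callable[[], None] = discover_mcp_tools) -> None:
    """Hypothetical helper: launch discovery on a daemon thread and return at once."""
    global _mcp_discovery_thread
    _mcp_discovery_thread = threading.Thread(
        target=_run_discovery, args=(target,), name="mcp-discovery", daemon=True
    )
    _mcp_discovery_thread.start()


def wait_for_mcp_discovery(timeout: float = 0.75) -> bool:
    """Join the discovery thread for at most `timeout` seconds.

    Returns True when discovery has finished or was never started
    (the return value is an assumption of this sketch).
    """
    thread = _mcp_discovery_thread
    if thread is None or not thread.is_alive():
        return True  # no-op: nothing to wait for
    thread.join(timeout)
    return not thread.is_alive()
```

The four cases listed for the new tests map onto this shape: no thread and a finished thread both return immediately, a fast thread is joined well inside the timeout, and a hung thread costs the caller at most `timeout` seconds.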

Validation

| Metric | Before | After |
| --- | --- | --- |
| spawn → gateway.ready (dead MCP server) | ~7,500 ms | ~115 ms |
| First-prompt agent build (no MCP) | n/a | ~1µs no-op |
| import tui_gateway.server | ~115 ms | ~69 ms |

  • refresh_tools confirmed absent from main — old /reload-mcp path was dead, only /new recovered late servers.
  • discover_mcp_tools is _lock-guarded/idempotent — safe for thread + reload.
  • tests/tui_gateway/ (94) + tests/hermes_cli/test_banner.py pass locally.
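
The note that discovery is lock-guarded and idempotent is what makes it safe to call from both the background thread and a reload. A rough sketch of one way such a guard could look follows. Everything in it (`discover_tools_once`, `_registered`, the `connect` callback) is hypothetical and is not the project's implementation.

```python
import threading

_lock = threading.Lock()
_registered = {}  # server name -> list of tool names already registered


def discover_tools_once(servers, connect):
    """Register tools for servers not seen before.

    Calls for an already-registered server are no-ops, so concurrent or
    repeated invocations (background thread plus a reload) do not
    double-register anything.
    """
    with _lock:
        for name in servers:
            if name in _registered:
                continue  # idempotent: already discovered
            _registered[name] = connect(name)
        # Return a flat, sorted view of every registered tool.
        return sorted(tool for tools in _registered.values() for tool in tools)
```

Holding the lock across `connect` serializes callers, which keeps the sketch simple. A real implementation might prefer finer-grained locking.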

Infographic

[Image: chalkboard infographic]

The 'summoning hermes…' phase blocked on gateway.ready, which ran MCP
tool discovery inline. Any configured-but-unreachable MCP server burned
its full connect-retry backoff (1+2+4s ≈ 7s) before the composer
appeared — startup went from instant to ~7.5s of dead air for anyone
with a down stdio/http server in mcp_servers.
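
To make the 1+2+4s figure concrete, the sketch below reproduces the arithmetic with an injected sleep function. The delay schedule is inferred from the numbers quoted above. `connect_with_backoff` and its retry semantics (one initial attempt, then one retry after each delay) are assumptions and do not describe the actual MCP client code.

```python
import time


def connect_with_backoff(connect, delays=(1.0, 2.0, 4.0), sleep=time.sleep):
    """Try once, then retry after each delay; return None if every attempt fails."""
    for delay in (*delays, None):
        try:
            return connect()
        except ConnectionError:
            if delay is None:
                return None  # retries exhausted
            sleep(delay)  # blocks the caller for the full delay


def always_down():
    """Simulates a configured-but-unreachable server."""
    raise ConnectionError("server unreachable")


slept = []  # record requested delays instead of really sleeping
result = connect_with_backoff(always_down, sleep=slept.append)
total_wait = sum(slept)  # 1 + 2 + 4 = 7 seconds of pure waiting
```

When this loop runs inline on the startup path, the whole `total_wait` is spent before the UI can report ready, which matches the roughly 7 s of backoff described above.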

Move discovery into a background daemon thread so gateway.ready fires
immediately; tools register into the shared registry as servers connect,
and the agent isn't built until the first prompt. Measured spawn→ready:
~7500ms → ~115ms (dead twozero_td server in config).

Also drop rich.console + prompt_toolkit off banner.py's import path
(lazy-imported inside cprint/build_welcome_banner). tui_gateway.server
imports banner only to reach the lightweight prefetch_update_check
helper; the eager rich/pt imports added ~45ms before gateway.ready for
no benefit. tui_gateway.server import: ~115ms → ~69ms.
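
The lazy-import pattern described above can be sketched with a standard-library module standing in for `rich` / `prompt_toolkit`, so the example runs without extra dependencies. `build_banner_text` is a hypothetical function, not one from `hermes_cli/banner.py`. The point is only that the import happens inside the function while the annotation stays visible to type checkers.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Seen only by the type checker; nothing is imported at runtime here.
    from textwrap import TextWrapper


def build_banner_text(
    message: str, width: int = 40, wrapper: "Optional[TextWrapper]" = None
) -> str:
    """Wrap a banner message, importing the formatting module on first use."""
    import textwrap  # deferred: cost is paid on the first call, not at module import

    if wrapper is None:
        wrapper = textwrap.TextWrapper(width=width)
    return wrapper.fill(message)


banner = build_banner_text("hello world", width=5)
```

A module that imports this file only to reach an unrelated lightweight helper no longer pays for the formatting dependency. The trade-off is a small one-time delay on the first call that needs it.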
@github-actions (Contributor)

🔎 Lint report: hermes/hermes-819640c9 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9506 on HEAD, 9503 on base (🆕 +3)

🆕 New issues (1):

| Rule | Count |
| --- | --- |
| invalid-assignment | 1 |

First entries
tests/tui_gateway/test_wait_for_mcp_discovery.py:70: [invalid-assignment] invalid-assignment: Object of type `Thread` is not assignable to attribute `_mcp_discovery_thread` of type `None`

✅ Fixed issues: none

Unchanged: 4929 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@teknium1 merged commit cbf851a into main on May 30, 2026 (23 checks passed)
@teknium1 deleted the hermes/hermes-819640c9 branch on May 30, 2026 09:53
@alt-glitch added labels on May 30, 2026: type/perf (Performance improvement or optimization); comp/tui (Terminal UI: ui-tui/ + tui_gateway/); tool/mcp (MCP client and OAuth); P2 (Medium: degraded but workaround exists)
@alt-glitch (Collaborator)

Salvage of #35245 (by @kshitijk4poor) cherry-picked onto current main. Competes with #32811 (background MCP discovery via daemon thread). Both address the same startup blocking pattern first fixed in #16899 (module-level side effect) and #19326 (async discovery).
