fix(pi): await MCP bridge bootstrap in before_agent_start #490

Merged: mksglu merged 2 commits into mksglu:next from ousamabenyounes:fix/pi-mcp-bridge-race on May 9, 2026

@ousamabenyounes

Collaborator

What

Make the Pi adapter's before_agent_start handler await _mcpBridgeReady before returning so the LLM call dispatched right after the handler sees the ctx_* tools in Pi's registry.

Why — race reported in #472 (comment 4412197109)

Pi subagents may miss ctx_* tools due to an async registration race.

Each subagent starts a fresh pi --mode json -p --no-session process. context-mode's Pi adapter registers ctx_* tools through the MCP bridge, but that bootstrap runs fire-and-forget via _mcpBridgeReady. The child process can start the prompt before registration finishes, so the initial tool registry may not include ctx_execute, ctx_batch_execute, etc.

This does not look like a frontmatter allowlist issue; the agents already include the ctx tool names. context-mode likely needs to await bridge setup, or block first agent start until _mcpBridgeReady settles.

The reporter's diagnosis is correct. In src/adapters/pi/extension.ts the bridge bootstrap was kicked off fire-and-forget at the bottom of piExtension(pi):

_mcpBridgeReady = bootstrapMCPTools(pi, serverBundle).then(...)

For long-lived Pi sessions the bootstrap usually finishes before the user types a prompt, so the bug stayed invisible. For subagents (pi --mode json -p --no-session) the gap between extension load and the first before_agent_start is too small for the spawn → initialize → tools/list → pi.registerTool round-trip. As a result, the first (and often only) prompt of a subagent goes out with an empty ctx_* registry and the routing block (~2.5K tokens) becomes dead weight: the LLM is told to call ctx_execute / ctx_search / etc. but Pi has not yet registered them.

Fix

Add a single await _mcpBridgeReady at the top of the before_agent_start handler. The promise resolves on bootstrap success and on failure (matching the existing best-effort contract — failures are logged to stderr but never propagated), so a missing or broken bundle still cannot break agent start.
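The race and the fix can be reproduced in isolation. The following is a minimal simulation, not the code in src/adapters/pi/extension.ts: FakePi, bootstrapTools, registerExtension, toolsAtAgentStart, and the 10 ms delay are all hypothetical stand-ins, and only the shape (a swallowed-failure promise awaited at the top of the before_agent_start handler) is taken from the description above.

```typescript
// Hypothetical stand-in for Pi's extension API; the real types live in Pi itself.
type Handler = () => Promise<void> | void;

class FakePi {
  readonly tools: string[] = [];
  private handlers = new Map<string, Handler>();

  registerTool(name: string): void {
    this.tools.push(name);
  }
  on(event: string, handler: Handler): void {
    this.handlers.set(event, handler);
  }
  async emit(event: string): Promise<void> {
    await this.handlers.get(event)?.();
  }
}

// Simulated bridge bootstrap: tool registration lands on a later tick.
async function bootstrapTools(pi: FakePi, fail: boolean): Promise<void> {
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  if (fail) throw new Error("bundle missing");
  for (const name of ["ctx_execute", "ctx_search"]) pi.registerTool(name);
}

function registerExtension(
  pi: FakePi,
  opts: { awaitBridge: boolean; fail?: boolean },
): void {
  // Best-effort contract: failures are logged and swallowed, so this promise never rejects.
  const bridgeReady: Promise<void> = bootstrapTools(pi, opts.fail ?? false).catch((err) => {
    console.error("bridge bootstrap failed:", err);
  });

  pi.on("before_agent_start", async () => {
    // The fix: block the handler until the bootstrap has settled.
    if (opts.awaitBridge) await bridgeReady;
    // ...the rest of the handler would run here
  });
}

// Mimics a fresh subagent: load the extension, then fire the hook immediately.
async function toolsAtAgentStart(awaitBridge: boolean, fail = false): Promise<string[]> {
  const pi = new FakePi();
  registerExtension(pi, { awaitBridge, fail });
  await pi.emit("before_agent_start");
  return [...pi.tools];
}
```

In this simulation, toolsAtAgentStart(false) resolves to an empty list because the handler returns before the timer fires, while toolsAtAgentStart(true) resolves to both tool names. With fail set to true the handler still resolves without throwing, which is the best-effort behavior the fix is meant to preserve.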

Coverage added

tests/pi-extension.test.ts → Pi MCP bridge (#426) > pi-extension.ts wiring (#426 regression guard):

  • before_agent_start awaits MCP bridge bootstrap so ctx_* are registered before LLM call
    • Calls registerPiExtension, opens a session, then immediately triggers before_agent_start (the same race a fresh subagent process exhibits).
    • Pre-asserts pi.registerTool has been called 0 times (race window confirmed open — guards against the test passing for the wrong reason if the bridge ever stops racing).
    • Post-asserts the canonical ctx_* set (ctx_execute, ctx_search, ctx_index, ctx_batch_execute, ctx_fetch_and_index) has been registered through pi.registerTool by the time the handler resolves.

Verified the test fails on next without the fix (race confirmed) and passes with the fix.
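The pre-assert/post-assert shape of that regression test can be sketched without a test runner. This is not the test added in tests/pi-extension.test.ts: makeSpyPi, registerSketchExtension, runRaceRegression, and the 5 ms delay are hypothetical, and a hand-rolled spy stands in for whatever mocking the real suite uses. Only the tool names and the two assertion points come from the list above.

```typescript
// Canonical ctx_* set named in the PR description.
const CANONICAL_TOOLS = [
  "ctx_execute",
  "ctx_search",
  "ctx_index",
  "ctx_batch_execute",
  "ctx_fetch_and_index",
];

// Hypothetical spy that records registerTool calls and captures the hook handler.
function makeSpyPi() {
  const registered: string[] = [];
  let handler: (() => Promise<void>) | undefined;
  return {
    registered,
    registerTool: (name: string): void => {
      registered.push(name);
    },
    on: (_event: string, h: () => Promise<void>): void => {
      handler = h;
    },
    fire: async (): Promise<void> => {
      await handler?.();
    },
  };
}

// Hypothetical extension: bootstrap settles on a later tick, and the handler awaits it.
function registerSketchExtension(pi: ReturnType<typeof makeSpyPi>): void {
  const ready = new Promise<void>((resolve) =>
    setTimeout(() => {
      for (const name of CANONICAL_TOOLS) pi.registerTool(name);
      resolve();
    }, 5),
  );
  pi.on("before_agent_start", async () => {
    await ready;
  });
}

async function runRaceRegression(): Promise<{ before: number; after: string[] }> {
  const pi = makeSpyPi();
  registerSketchExtension(pi);
  const before = pi.registered.length; // pre-assert point: race window must still be open
  await pi.fire(); // same timing as a fresh subagent process
  return { before, after: [...pi.registered] }; // post-assert point
}
```

A caller would check that before is 0 and that after contains every name in CANONICAL_TOOLS. The pre-assert matters because, if the bootstrap ever became synchronous, the post-assert alone would pass whether or not the handler awaited anything.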

Test plan

  • npm run typecheck — passes
  • npx vitest run tests/pi-extension.test.ts — 47 passed (1 new + 46 existing, no flakes)
  • npm test — full suite green (2474 passed, 25 pre-existing skips). The two [vitest-pool]: Timeout terminating forks worker notices on kiro-hooks are pre-existing on next and unrelated to this change (16/16 tests in that file pass when run alone).
  • Pre-built bundles regenerated by npm run build were reverted before commit, per CONTRIBUTING.

Files touched

  • src/adapters/pi/extension.ts — single await _mcpBridgeReady + comment explaining the subagent race.
  • tests/pi-extension.test.ts — race regression test added to the existing pi-extension.ts wiring (#426 regression guard) block (no new test file per CONTRIBUTING).

ousamabenyounes and others added 2 commits May 9, 2026 10:40
Pi subagents (`pi --mode json -p --no-session`) spawn a fresh process
that loads the context-mode extension and immediately fires
`before_agent_start` to dispatch the LLM call. The MCP bridge
bootstrap (spawn server.bundle.mjs → initialize → tools/list →
pi.registerTool × N) was fire-and-forget via `_mcpBridgeReady`, so
the LLM call went out with an empty ctx_* tool registry and the
routing block (~2.5K tokens) became dead weight — the LLM was told
to call `ctx_execute` / `ctx_search` / etc. but Pi had not yet
registered them.

Awaiting `_mcpBridgeReady` at the top of the `before_agent_start`
handler closes the race: by the time the handler resolves, the
bridge has settled (success or failure — failures are still logged
to stderr but never propagated, matching the original best-effort
contract) and the registry contains the ctx_* tools.

Reported in mksglu#472 (comment 4412197109).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…up flake

The default 10s vitest hookTimeout is exhausted on Windows runners by
files whose afterAll loops over many better-sqlite3 handles —
tests/session/session-pipeline.test.ts is the canonical example. Local
runs finish in ~500ms but Windows fork-pool contention plus native
addon cleanup can stretch past 10s, surfacing as

  FAIL tests/session/session-pipeline.test.ts
  Error: Hook timed out in 10000ms.

Match the 30s testTimeout already in this config so the cleanup window
matches the work window — same envelope better-sqlite3 needs for tests
themselves. No change to local-dev wall time (only fires when a hook
exceeds 10s, which is a flake-level event).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
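The timeout change described in the second commit amounts to two fields in the Vitest config. The fragment below is a sketch under assumptions: the file name, the plain-object export, and the omission of every other option are mine, and only the two values (a 30s testTimeout already present, hookTimeout raised to match it) come from the commit message.

```typescript
// vitest.config.ts (assumed file name); other options in the real config are omitted.
const config = {
  test: {
    testTimeout: 30_000, // already in the config, per the commit message
    hookTimeout: 30_000, // raised from the 10_000 ms default so afterAll cleanup gets the same window
  },
};

export default config;
```

Keeping the two values equal means a slow afterAll that closes many better-sqlite3 handles gets the same budget as the tests that opened them.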
@mksglu mksglu merged commit d66f767 into mksglu:next May 9, 2026
5 checks passed