Optimize daemon cold start latency (2.5s → ~1.5s) #4748

@doudouOUC

Background

Benchmark testing (integration-tests/cli/qwen-daemon-vs-cli-benchmark.test.ts) shows that daemon cold start (boot + first session) takes ~2.5s, compared with ~0.7s for a full CLI init. The daemon amortizes this cost across subsequent sessions (warm session creation takes only ~21ms), but the initial cold start remains a UX bottleneck for first-time connections.

Cold Start Time Breakdown

0s          0.5s         1.0s         1.5s         2.0s         2.5s
|-- Node+ESM --|-- HTTP -|--- ACP spawn+relaunch ---|-- Session init ---|
   ~500ms       ~100ms     ~700ms (includes              config.initialize()
                           unnecessary relaunch)         + waitForMcpReady()
                                                         ~1200ms

Benchmark Data (darwin/arm64, Node v24.12.0)

| Metric | Value |
| --- | --- |
| CLI full init (profiler) | p50=702ms |
| Daemon cold start | p50=2,546ms |
| Warm session creation | p50=21ms |
| Daemon process tree RSS | 691MB (daemon=225 + ACP=213 + MCP=254) |

Progress

✅ Completed (merged to daemon_mode_b_main via PR #4751)

P0-1. Skip unnecessary relaunchAppInChildProcess for ACP children

Eliminates the redundant grandchild process spawn. --max-old-space-size is now passed to the ACP child directly, with container-aware (cgroup) memory detection via process.constrainedMemory() and a cap of 16GB. The process tree shrinks from 3 processes to 2.

Estimated savings: 0.2-0.3s
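
The heap-size calculation described above could look roughly like the following. This is a minimal sketch, not the code in getNodeMemoryArgs(): the helper names `maxOldSpaceMb` and `nodeMemoryArgs` and the 50% heap fraction are assumptions made for illustration. Only the 16GB cap and the use of process.constrainedMemory() come from the description.

```typescript
const MB = 1024 * 1024;
const CAP_MB = 16 * 1024; // 16GB cap, as stated in the description above

/**
 * Pick a --max-old-space-size value in MB.
 * totalBytes: physical memory, e.g. os.totalmem()
 * constrainedBytes: cgroup limit, e.g. process.constrainedMemory();
 *   0 or undefined is treated as "no limit detected".
 * fraction: ASSUMPTION, the share of available memory given to the heap.
 */
function maxOldSpaceMb(
  totalBytes: number,
  constrainedBytes: number | undefined,
  fraction = 0.5,
): number {
  const limit = constrainedBytes ?? 0;
  // Trust the cgroup limit only when it is set and tighter than physical memory.
  const availableBytes = limit > 0 && limit < totalBytes ? limit : totalBytes;
  const targetMb = Math.floor((availableBytes * fraction) / MB);
  return Math.min(targetMb, CAP_MB);
}

/** Hypothetical helper: build the Node flag passed to the ACP child. */
function nodeMemoryArgs(totalBytes: number, constrainedBytes?: number): string[] {
  return [`--max-old-space-size=${maxOldSpaceMb(totalBytes, constrainedBytes)}`];
}
```

In a real caller the two inputs would come from os.totalmem() and process.constrainedMemory(). Comparing the limit against physical memory guards against runtimes that report a very large value when no cgroup limit applies.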

P0-2. Pre-spawn ACP child at daemon boot (preheat)

Added a bridge.preheat() method that calls ensureChannel() after app.listen(), so the ACP child is ready before the first session arrives. The call is fire-and-forget and falls back to lazy spawn on failure.

Estimated savings: 0.3-0.5s (parallelizes ACP spawn with HTTP readiness)
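
The preheat pattern can be sketched as below. This is an illustration under assumptions rather than the contents of bridge.ts: the `Bridge` class shape, the `Channel` type, and the injected `spawn` function are hypothetical, while the method names preheat() and ensureChannel() are taken from the description.

```typescript
type Channel = { id: number }; // hypothetical stand-in for the ACP channel

class Bridge {
  private channel: Promise<Channel> | undefined;
  private readonly spawn: () => Promise<Channel>;

  constructor(spawn: () => Promise<Channel>) {
    this.spawn = spawn;
  }

  /** Spawn the ACP child at most once; concurrent callers share one promise. */
  ensureChannel(): Promise<Channel> {
    if (this.channel) return this.channel;
    const pending = this.spawn();
    this.channel = pending;
    // On failure, drop the cached promise so the next caller spawns lazily.
    pending.catch(() => {
      if (this.channel === pending) this.channel = undefined;
    });
    return pending;
  }

  /** Fire-and-forget warm-up: never throws and never blocks the caller. */
  preheat(): void {
    this.ensureChannel().catch(() => {
      // Swallowed on purpose: the first session retries via ensureChannel().
    });
  }
}
```

A daemon would call `bridge.preheat()` from the app.listen() callback. Because ensureChannel() caches the in-flight promise, a session that arrives during the warm-up waits on the same spawn instead of starting a second child.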

P1-3. ACP Keep-alive on idle

Added a --channel-idle-timeout-ms flag that keeps the ACP child alive after the last session closes, avoiding a cold restart on reconnect. When the flag is unset (the default), the child is killed immediately, which preserves the previous behavior.
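
One way to model the idle timeout is a session counter with a cancellable timer. The sketch below assumes this design; the `ChannelKeepAlive` class and its method names are hypothetical and are not taken from the bridge code. Only the two behaviors (immediate kill when unset, delayed kill when set) come from the description.

```typescript
class ChannelKeepAlive {
  private sessions = 0;
  private idleTimer: ReturnType<typeof setTimeout> | undefined;
  private readonly kill: () => void;
  private readonly idleTimeoutMs: number | undefined;

  // idleTimeoutMs mirrors --channel-idle-timeout-ms; undefined means "flag unset".
  constructor(kill: () => void, idleTimeoutMs?: number) {
    this.kill = kill;
    this.idleTimeoutMs = idleTimeoutMs;
  }

  sessionOpened(): void {
    this.sessions += 1;
    this.cancelTimer(); // a reconnect inside the idle window reuses the warm child
  }

  sessionClosed(): void {
    if (this.sessions === 0) return; // ignore unbalanced close calls
    this.sessions -= 1;
    if (this.sessions > 0) return;
    if (this.idleTimeoutMs === undefined) {
      this.kill(); // flag unset: kill immediately, as before
      return;
    }
    this.cancelTimer();
    this.idleTimer = setTimeout(() => {
      this.idleTimer = undefined;
      if (this.sessions === 0) this.kill();
    }, this.idleTimeoutMs);
  }

  private cancelTimer(): void {
    if (this.idleTimer !== undefined) {
      clearTimeout(this.idleTimer);
      this.idleTimer = undefined;
    }
  }
}
```

The timer callback re-checks the session count before killing, so a session that opens just as the timer fires does not lose its channel.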

❌ Not yet implemented

P1-4. MCP connection pool cross-session sharing

McpTransportPool already exists, but its scope is limited. Extending it to share MCP connections across sessions would eliminate the per-session MCP discovery cost.
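
Cross-session sharing could be built on a reference-counted pool keyed by MCP server. The sketch below shows that general technique only. It does not reflect the existing McpTransportPool API, and `SharedMcpPool`, `McpConnection`, and the `connect` callback are hypothetical names.

```typescript
type McpConnection = { serverKey: string; close: () => void }; // hypothetical

type PoolEntry = { conn: Promise<McpConnection>; refs: number };

class SharedMcpPool {
  private readonly entries = new Map<string, PoolEntry>();
  private readonly connect: (serverKey: string) => Promise<McpConnection>;

  constructor(connect: (serverKey: string) => Promise<McpConnection>) {
    this.connect = connect;
  }

  /** Reuse the live connection for this server, or open one on first use. */
  acquire(serverKey: string): Promise<McpConnection> {
    let entry = this.entries.get(serverKey);
    if (!entry) {
      const created: PoolEntry = { conn: this.connect(serverKey), refs: 0 };
      // Evict failed connections so a later session can retry.
      created.conn.catch(() => {
        if (this.entries.get(serverKey) === created) this.entries.delete(serverKey);
      });
      this.entries.set(serverKey, created);
      entry = created;
    }
    entry.refs += 1;
    return entry.conn;
  }

  /** Drop one reference; close the connection when no session uses it. */
  release(serverKey: string): void {
    const entry = this.entries.get(serverKey);
    if (!entry) return;
    entry.refs -= 1;
    if (entry.refs > 0) return;
    this.entries.delete(serverKey);
    entry.conn.then((c) => c.close()).catch(() => {});
  }
}
```

With this shape, only the first session pays the connection and discovery cost for a given server. Whether the pool should close a connection on the last release or keep it warm for an idle window is a design choice this sketch leaves open.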

Not viable (investigated and rejected)

| Approach | Why not |
| --- | --- |
| Make waitForMcpReady() non-blocking | First prompt needs MCP tools (Bash, Read, etc.). Removing the wait breaks tool availability. Requires protocol-level deferred-tool-push mechanism. |
| Eliminate double config.initialize() | Bootstrap uses skipMcpDiscovery: true, skipGeminiInitialization: true so it does minimal work. Per-session init operates on a different Config object. Savings only ~0.1-0.2s. |

Key Files

  • packages/acp-bridge/src/spawnChannel.ts — ACP child spawn, env vars
  • packages/acp-bridge/src/bridge.ts — ensureChannel(), doSpawn(), preheat()
  • packages/cli/src/gemini.tsx — relaunchAppInChildProcess decision, getNodeMemoryArgs()
  • packages/cli/src/utils/relaunch.ts — relaunchAppInChildProcess implementation
  • packages/cli/src/serve/runQwenServe.ts — daemon boot sequence
  • packages/cli/src/acp-integration/acpAgent.ts — per-session config.initialize() + waitForMcpReady()
  • integration-tests/cli/qwen-daemon-vs-cli-benchmark.test.ts — benchmark test
  • docs/e2e-tests/2026-06-03-daemon-vs-cli-benchmark-report.md — full benchmark report

🤖 Generated with Qwen Code
