Daemon mode (qwen serve): proposal & open decisions #3803

@wenshao

A complete daemon design proposal for Qwen Code, organized as a 6-chapter design series (simplified from original 14 chapters). This issue tracks implementation; the series is the source of truth.

Status (2026-05-15):

  • Stage 1 merged — PR feat(cli,sdk): qwen serve daemon (Stage 1) #3889 (merge commit 870bdf2a, +12993/-194, 84 commits)
  • Stage 1.5a §02 1 daemon = 1 workspace — PR refactor(serve): 1 daemon = 1 workspace (#3803 §02) #4113 (merge commit 790f2d04, +2051/-434) MERGED 2026-05-15
  • Stage 1.5b A1 — createInMemoryChannel helper — PR refactor(serve): extract createInMemoryChannel helper (#4156 A1) #4160 MERGED 2026-05-15 (ff63da26, +315/-40) — extracted as reusable primitive; remaining Mode A work parked
  • 🔧 Mode B prioritized (2026-05-15) — see qwen-code-daemon-design PR #131 MERGED roadmap reset: P0 production must-haves + daemon-side state CRUD; P1 typed event contract + client adapters behind flag; P2 deferred Mode A + remote-control
  • 🔧 Wave 1 started (2026-05-16): PR #4191 feat(serve): capability registry + protocol versions (Wave 1 PR 2, doudouOUC [codex], +170/-39) is OPEN. It is the first implementation PR from the 25-PR plan in Issue #4175 (proposal(serve): Mode B feature-priority roadmap toward v0.16 production-ready). It closes the chiga0 finding 5 FIXME by replacing the hard-coded STAGE1_FEATURES array with an additive registry and adding /capabilities.protocolVersions
  • ⏳ Pending: 9 production must-haves + 1.5c control-plane parity (10+ routes) + 1.5-prereq typed event contract + client adapters (TUI/channels/web/IDE behind flag) + Mode A revisit + remote-control revisit + Stage 2

Overall main-line progress is ~30-35%. This issue was auto-closed by the "closes #3803 §02" trigger in PR #4113, but only §02 (the process model decision) was implemented, so it was reopened to track the remaining stages.

Architecture (post PR #4113, with PR#131 corrections)

1 daemon process = 1 workspace × N sessions multiplexed:

qwen serve (bound to cwd = single workspace)
├─ Express HTTP front + bearer auth + Host allowlist
├─ EventBus (per-session fan-out + ring replay + Last-Event-ID reconnect)
│  └─ INTERNAL fan-out primitive — exposed to clients via HTTP/SSE projection,
│     NOT subscribed directly by external clients
└─ qwen --acp child (workspace bound)
   └─ QwenAgent.sessions: Map → {sess-1, sess-2, sess-3}
      └─ per-session Config / ToolRegistry / McpClientManager / FileReadCache

On startup, qwen serve binds to a single workspace (its cwd). The daemon embeds a single qwen --acp child, and N sessions are multiplexed through QwenAgent.sessions: Map<sessionId, Session> (packages/cli/src/acp-integration/acpAgent.ts:194). A multi-workspace deployment means multiple daemon processes: one process per workspace, whether run under systemd, Docker, or k8s.
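To make the "per-session fan-out + ring replay + Last-Event-ID reconnect" behaviour concrete, here is a minimal sketch of a per-session ring buffer. The class name `SessionEventRing`, the numeric event ids, and the capacity parameter are assumptions for illustration; the daemon's actual EventBus may be shaped quite differently.

```typescript
// Hypothetical sketch; not the daemon's actual EventBus implementation.
interface BusEvent<T> {
  id: number; // monotonically increasing within one session
  data: T;
}

type Listener<T> = (event: BusEvent<T>) => void;

class SessionEventRing<T> {
  private readonly capacity: number;
  private readonly ring: BusEvent<T>[] = [];
  private readonly listeners = new Set<Listener<T>>();
  private nextId = 1;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Store the event in the ring (evicting the oldest when full) and fan it out.
  publish(data: T): BusEvent<T> {
    const event: BusEvent<T> = { id: this.nextId++, data };
    this.ring.push(event);
    if (this.ring.length > this.capacity) this.ring.shift();
    this.listeners.forEach((listener) => listener(event));
    return event;
  }

  // Replay buffered events newer than lastEventId, then attach for live events.
  // Returns an unsubscribe function.
  subscribe(listener: Listener<T>, lastEventId = 0): () => void {
    for (const event of this.ring) {
      if (event.id > lastEventId) listener(event);
    }
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }
}
```

In this sketch a reconnecting client whose Last-Event-ID is older than the oldest buffered event silently misses the evicted events; a real implementation would probably want to signal that gap explicitly.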

Key invariants (corrected per PR#131)

  • N sessions on same daemon share: the same qwen --acp child process + workspace bound context
  • N sessions on same daemon do NOT share: Config / ToolRegistry / McpClientManager / FileReadCache — these are still created per ACP session; cross-session MCP sharing requires future pool/proxy design
  • Cross-workspace = cross-daemon process = OS process-level isolation (strongest)
  • Blast radius minimal (daemon crash only affects 1 workspace)
  • K8s cloud-native natural fit (1 pod = 1 daemon = 1 workspace)
  • WorkspaceMismatchError returned as 400 workspace_mismatch when POST /session cwd ≠ bound workspace
  • --workspace <path> CLI flag overrides process.cwd() at boot; canonicalized via canonicalizeWorkspace
  • Same-workspace Nth session cold start <200ms (attach existing qwen --acp child)
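The workspace-mismatch invariant can be sketched as a small guard. Only the `WorkspaceMismatchError` name and the `400 workspace_mismatch` response come from the list above; `canonicalizeForDemo` is a deliberately simplified stand-in for `canonicalizeWorkspace`, and the response body shape and the handling of an omitted cwd are assumptions.

```typescript
// Hypothetical sketch: only the error name and the 400 workspace_mismatch code
// come from the invariants above.
class WorkspaceMismatchError extends Error {
  readonly bound: string;
  readonly requested: string;

  constructor(bound: string, requested: string) {
    super(`cwd ${requested} does not match bound workspace ${bound}`);
    this.name = "WorkspaceMismatchError";
    this.bound = bound;
    this.requested = requested;
    Object.setPrototypeOf(this, WorkspaceMismatchError.prototype);
  }
}

// Stand-in for canonicalizeWorkspace: trims trailing slashes only.
// The real helper presumably also resolves relative paths and symlinks.
function canonicalizeForDemo(path: string): string {
  const trimmed = path.replace(/\/+$/, "");
  return trimmed === "" ? "/" : trimmed;
}

// Throws when a POST /session body asks for a cwd other than the bound workspace.
// An omitted cwd is treated as "use the bound workspace" (an assumption).
function resolveSessionCwd(boundWorkspace: string, requestedCwd?: string): string {
  const bound = canonicalizeForDemo(boundWorkspace);
  if (requestedCwd === undefined) return bound;
  const requested = canonicalizeForDemo(requestedCwd);
  if (requested !== bound) throw new WorkspaceMismatchError(bound, requested);
  return bound;
}

// Maps the error to an HTTP status and body; the body shape is illustrative.
function toHttpResponse(err: unknown): { status: number; body: { error: string; message: string } } {
  if (err instanceof WorkspaceMismatchError) {
    return { status: 400, body: { error: "workspace_mismatch", message: err.message } };
  }
  return { status: 500, body: { error: "internal_error", message: "unexpected error" } };
}
```

Comparing canonical forms rather than raw strings is what keeps `/repo/app/` and `/repo/app` from being treated as different workspaces.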

Client boundary (per PR#131)

Clients do NOT directly subscribe to in-memory EventBus. Clients connect to daemon HTTP/SSE API:

client (TUI / channels / web / IDE)
  -> DaemonSessionClient (SDK helper)
  -> POST /session/:id/{prompt,cancel,model,...} + GET /session/:id/events (SSE)
  -> daemon internal EventBus (fan-out projection)

EventBus lift (1.5-prereq finding #2) refers to extracting typed event schema + reducer + server-side fan-out primitive + transport adapter to a shared package — NOT exposing the memory object to external clients.
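On the client side, the SSE projection boils down to parsing `text/event-stream` frames and remembering the last `id` for reconnects. The following is a rough sketch of that piece only, assuming `\n` line endings and frames that carry JSON in `data`; `parseSse` and `reconnectHeaders` are hypothetical helper names, not part of the DaemonSessionClient API.

```typescript
// Hypothetical helpers; not the DaemonSessionClient API.
interface SseFrame {
  id?: string;
  event?: string;
  data: string;
}

// Splits a buffer into complete frames plus the trailing partial frame.
// Simplification: assumes "\n" line endings (the SSE format also allows "\r\n" and "\r").
function parseSse(buffer: string): { frames: SseFrame[]; rest: string } {
  const blocks = buffer.split("\n\n");
  const rest = blocks.pop() ?? "";
  const frames: SseFrame[] = [];
  for (const block of blocks) {
    const frame: SseFrame = { data: "" };
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line === "" || line.charAt(0) === ":") continue; // comment or keep-alive
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.charAt(0) === " ") value = value.slice(1);
      if (field === "id") frame.id = value;
      else if (field === "event") frame.event = value;
      else if (field === "data") dataLines.push(value);
    }
    if (dataLines.length === 0) continue; // nothing to dispatch
    frame.data = dataLines.join("\n");
    frames.push(frame);
  }
  return { frames, rest };
}

// Headers for GET /session/:id/events; Last-Event-ID asks the server to replay.
function reconnectHeaders(lastEventId?: string): Record<string, string> {
  const headers: Record<string, string> = { Accept: "text/event-stream" };
  if (lastEventId !== undefined) headers["Last-Event-ID"] = lastEventId;
  return headers;
}
```

A caller would append each network chunk to `rest`, call `parseSse` again, and keep the `id` of the last dispatched frame for the next reconnect.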

Dual deployment modes

| Mode | Command | TUI | Use case | Priority |
| --- | --- | --- | --- | --- |
| Mode B: Headless Daemon + HttpServer | qwen serve [--port N] | | Server / container / remote machine / K8s pod / unified client runtime | P0 mainline |
| Mode A: CLI + HttpServer | qwen --serve [--port N] | ✅ local | Parked. Wait until Mode B event/control/client contract stabilizes | P2 parking lot |

Both modes share the same wire protocol (Express 5 + ACP NDJSON over HTTP+SSE). Mode B was prioritized (2026-05-15 decision) because Mode A's value depends on 1.5c daemon-side state CRUD: without it, remote clients in Mode A are still thin shells, which creates a UX gap. Ship 1.5c first → remote clients fully functional → then revisit Mode A.

Design Documents (6-chapter series)

Implementation Tracker (Mode B priority — PR#131 reset)

| Priority | Stage | Description | Effort / Status |
| --- | --- | --- | --- |
| | Stage 1 | Mode B headless qwen serve daemon | PR #3889 MERGED 2026-05-13 |
| | Stage 1.5a §02 | 1 daemon = 1 workspace simplification | PR #4113 MERGED 2026-05-15 |
| | createInMemoryChannel primitive | Paired NDJSON channel helper (originally Mode A A1; now retained as reusable primitive) | PR #4160 MERGED 2026-05-15 |
| P0 | Stage 1.5a must-haves | chiga0 9 remaining must-haves: identity + lifecycle + reliability + permission per-session routing | ~2 weeks (9 PRs in parallel) |
| P0 | Stage 1.5c | daemon-side state CRUD / control-plane parity (10+ routes: memory, MCP, agents, tools, approval-mode, context, supported-commands, providers, auth, init) | ~1-2 weeks |
| P1 | Stage 1.5-prereq | typed event contract + shared DaemonSessionClient SDK + AcpChannel / EventBus / PermissionMediator lift to @qwen-code/acp-bridge | ~1 week |
| P1 behind flag | Stage 1.5-client adapters | TUI → channels → web/debug → IDE clients onboard via DaemonSessionClient; default switch must await P0/P1 | ~2-3 weeks (can run in parallel with P0/P1 pilots) |
| P2 deferred | Stage 1.5-remote-control | PR #3929 / #3930 / #3931 reposition as daemon facade reusing same contract | Wait for primary clients to converge |
| P2 parking lot | Stage 1.5b Mode A | qwen --serve flag; Issue #4156 doudouOUC 3-phase plan; A1 PR #4160 ✅ MERGED; rest parked until Mode B contract stabilizes | ~5-6d (after Mode B convergence) |
| P3 | Stage 2a-2d | Protocol completion + ecosystem + observability + perf | ~3-4 weeks |
| optional | Stage 2e | Native in-process (remove qwen --acp child) | ~1-2 weeks |

Engineering Principles (PR#131 — every PR must satisfy)

Stage 1.5 is an incremental migration, not a big rewrite. Each PR must satisfy:

| Principle | Requirement |
| --- | --- |
| Independently mergeable | Each PR self-contained with tests; main stays releasable after merge |
| Backward compatible | Don't remove existing routes / event fields / CLI behavior; new fields additive + optional |
| Default off | TUI / channels / IDE go behind flag or dual-stack adapter; default keeps existing path until validation passes |
| qwen serve unbroken | Stage 1 routes and SDK behavior preserved; new capabilities surfaced via /capabilities feature tag |
| Gradual migration | P0 must-haves / state CRUD / typed contract can run in parallel; client adapters behind flag first, then default-switch |
| Reversible | Each client adapter independently disable-able, doesn't affect other clients or daemon |
| Tests-first | New contract has unit tests; client adapters have smoke/e2e; old paths have regression tests |

P0: Stage 1.5a chiga0 must-haves (9 remaining, ~2 weeks, 9 PRs in parallel)

P0: Stage 1.5c daemon-side state CRUD (10+ wire routes, ~1-2 weeks)

Let remote clients access daemon-side state (no longer thin shell). Reference implementation: Claude Code /agents UI Deep-Dive.

  • GET / POST /workspace/memory (read/write ~/.qwen/memory.json)
  • GET /workspace/mcp + POST /workspace/mcp/:server/restart (MCP status / restart)
  • GET / POST /workspace/agents + :agentType CRUD
  • POST /workspace/tools/:name/enable (tool allowlist)
  • POST /session/:id/approval-mode
  • GET /session/:id/context (context usage — new per PR#131)
  • GET /session/:id/supported-commands (command palette / UI affordance — new per PR#131)
  • GET /workspace/providers (provider/model runtime state — new per PR#131)
  • POST /workspace/auth/device-flow or Capability RPC (auth — new per PR#131)
  • POST /workspace/init (project init / trust state)

Compatibility: All new routes have capability tags; clients fallback to old behavior or hide UI when capability missing. Read-only first, mutation second.
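As an illustration of the "hide UI when capability missing" rule, a client could gate its actions on the tags returned by /capabilities. This is a sketch under assumptions: the `features` field name and the tag strings used in the usage note are invented for the example and are not the daemon's actual registry contents.

```typescript
// Hypothetical /capabilities payload; the field and tag names are assumptions.
interface Capabilities {
  features: string[];
}

interface UiAction {
  label: string;
  requires: string; // capability tag the daemon must advertise
}

function hasFeature(caps: Capabilities, tag: string): boolean {
  return caps.features.indexOf(tag) !== -1;
}

// Keep only the actions the connected daemon can serve; the rest are hidden.
function visibleActions(caps: Capabilities, actions: UiAction[]): UiAction[] {
  return actions.filter((action) => hasFeature(caps, action.requires));
}
```

For example, against a daemon that advertises only a (hypothetical) `workspace_memory_read` tag, an action requiring `workspace_memory_write` would be filtered out rather than rendered and left to fail.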

P1: Stage 1.5-prereq typed contract + bridge primitives (~1 week)

Lift shared primitives to @qwen-code/acp-bridge package — reusable by serve/ + remoteControl/ + nonInteractive/:

  • Typed SessionEvent / ControlEvent — convert data: unknown to discriminated union
  • DaemonSessionClient — shared SDK helper for TUI / channels / web / IDE
  • AcpChannel interface + Transport interface
  • EventBus lift to shared package (fan-out primitive — internal to daemon, projected to clients via HTTP/SSE)
  • PermissionMediator interface + 4 policy strategies (first-responder / designated / consensus / local-only)
  • FileSystemService per-request abstraction
  • Capability registry per-session
  • Event reducer — build client view-model from daemon events (clients don't reinvent state machines)
  • Output sinks — JSONL / stream-json / dual-output become same event stream consumers
  • /capabilities protocol_versions + feature registry for graceful degradation

Compatibility: Old DaemonEvent envelope still parseable; typed union is SDK/helper layer additive enhancement.
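The "data: unknown to discriminated union" and "event reducer" items can be sketched together. The event names below are illustrative only (`permission_request` appears elsewhere in this issue; `message_chunk` and `turn_end` are assumptions), and the view-model fields are hypothetical.

```typescript
// Hypothetical event union and view-model; not the actual typed contract.
type SessionEvent =
  | { type: "message_chunk"; text: string }
  | { type: "permission_request"; requestId: string }
  | { type: "turn_end" };

interface SessionView {
  transcript: string;
  pendingPermissions: string[];
  busy: boolean;
}

const emptyView: SessionView = { transcript: "", pendingPermissions: [], busy: false };

// Narrow an untyped envelope payload into the union; unknown shapes are dropped
// so that older or newer daemons do not crash the client.
function decodeEvent(data: unknown): SessionEvent | undefined {
  if (typeof data !== "object" || data === null) return undefined;
  const raw = data as Record<string, unknown>;
  const type = raw["type"];
  if (type === "message_chunk") {
    const text = raw["text"];
    return typeof text === "string" ? { type: "message_chunk", text } : undefined;
  }
  if (type === "permission_request") {
    const requestId = raw["requestId"];
    return typeof requestId === "string" ? { type: "permission_request", requestId } : undefined;
  }
  if (type === "turn_end") return { type: "turn_end" };
  return undefined;
}

// Pure reducer: every client builds the same view from the same event stream.
function reduce(view: SessionView, event: SessionEvent): SessionView {
  switch (event.type) {
    case "message_chunk":
      return { ...view, transcript: view.transcript + event.text, busy: true };
    case "permission_request":
      return { ...view, pendingPermissions: [...view.pendingPermissions, event.requestId] };
    case "turn_end":
      return { ...view, busy: false };
    default: {
      const unreachable: never = event; // compile-time exhaustiveness check
      return unreachable;
    }
  }
}
```

Keeping the reducer pure and shared is what lets TUI, channels, web, and IDE clients avoid reinventing their own state machines.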

P1 behind flag: Stage 1.5-client adapters (~2-3 weeks)

Onboard primary clients to Mode B daemon — behind flag first, default switch after P0/P1:

| Client | Current state | Mode B direction | First wave |
| --- | --- | --- | --- |
| TUI | useGeminiStream direct in-process | New attach-to-daemon render target; HTTP/SSE + shared reducer; TUI no longer owns runtime | Wave 1 behind flag |
| channels | packages/channels/base/AcpBridge.ts spawns own qwen --acp | New daemon transport behind config flag; preserve channel routing, switch prompt/event/cancel/model to DaemonSessionClient | Wave 1 behind flag |
| web/debug | PR #4132 /demo OPEN | Thinnest POST+SSE client validation surface; surface event schema / reconnect / permission UI issues first | Wave 1 behind flag |
| IDE | VSCode companion spawns own qwen --acp | New daemon transport behind flag; cover session/prompt/events/cancel/model first, then file/context/control | Wave 2 behind flag |
| JSONL / stream-json / dual-output | CLI internal output adapters | Daemon event sinks; consume typed events instead of driving runtime | Parallel with contract |
| remote-control | PR #3929-3931 draft | Deferred: primary clients converge first, then remote-control as daemon facade | P2 deferred |

P2 deferred: Stage 1.5b Mode A — Issue #4156 doudouOUC 3-phase plan

Why parked: Mode A value is "local TUI super-client + remote clients sharing same daemon". Without 1.5c, remote clients in Mode A are still thin shells. Without 1.5-prereq typed event contract, TUI adapter has no shared reducer to consume. Ship Mode B contract first → Mode A revisit cleanly.

P3: Stage 2 (~3-4 weeks, split 2a-2d)

  • 2a Protocol Completion (~1 week): WebSocket bidi + /health?deep=1 + POST /ext/:method + permission policy schema + Reverse RPC 5 Client Capability classes (editor / clipboard / browser / notification / file_picker)
  • 2b Ecosystem (~1 week): OpenAPI codegen + multi-token + HttpTransport SDK adapter
  • 2c Observability (~3-5 days): Prometheus metrics + mDNS + --max-sessions guard rail
  • 2d Perf eval + docs (~3-5 days)
  • 2e (optional) Native in-process (~1-2 weeks): Remove qwen --acp child bridge; resolve acpAgent.ts:601 loadSettings(cwd) cross-workspace pollution

After Stage 2 the daemon protocol surface is locked; external integrators can build on top.

Recommended 4-week execution timeline (PR#131 finalized)

Week 0 (now, 2026-05-15)
└─ 9 × 1.5a must-have PRs launched in parallel
   ★ #2 loadSession HTTP (3-4d, biggest user pain)
   + #1 sessionScope + #3 pair tokens + per-session permission routing (blockers)
   + #4-7 reliability + #8-9 ergonomics (small PRs, 1-2d each)

Week 1
├─ must-haves PRs review + merge
├─ 1.5c daemon-side state CRUD (10+ routes) PR opens
└─ web/debug client (PR #4132) starts using daemon contract behind flag

Week 2
├─ must-haves merge complete
├─ 1.5c merge → Mode B remote clients have full state access
└─ 1.5-prereq typed event contract + DaemonSessionClient PR opens

Week 3
├─ 1.5-prereq merge → contract stabilized
├─ Wave-1 client adapters (TUI / channels / web/debug) behind flag
└─ Wave-2 client adapter (IDE) preparation

Week 4
├─ IDE adapter behind flag
├─ remote-control revisit (after primary clients converge)
└─ Mode A revisit (after Mode B contract stable)

External Reference Architecture (out of project scope)

Designed but not on qwen-code's roadmap; meant for commercial platforms / k8s operators / cloud vendors:

  • Orchestrator (~1.5-2w reference) qwen-coordinator + multi-daemon spawn / route / cleanup / aggregate API
  • Multi-tenancy (~3-4w reference) Tenant abstraction / OIDC + Bearer + mTLS / Quota engine / Audit log
  • Shell sandbox (~4-6w reference) ShellSandbox interface + 4 local + 4 remote impls
  • SaaS deployment (~3-6w reference) k8s native / Postgres + Redis + S3 / multi-region scheduling

Key design decisions matrix (PR#131 corrections)

| # | Decision | Choice |
| --- | --- | --- |
| 1 | Session sharing across clients | Default sessionScope: 'single'; Stage 1.5 #1 allows per-request override |
| 2 | State / process model | 1 daemon = 1 workspace × N session (PR #4113 simplification) |
| 3 | MCP server lifetime | Currently per-session (each ACP newSession() creates new Config, ToolRegistry owns its own McpClientManager); cross-session MCP sharing requires future pool/proxy. Corrected from previous "per-daemon shared" claim |
| 4 | FileReadCache sharing | Strictly per-session (no cross-session leak) |
| 5 | Permission flow | Reuse PR #3723 + daemon as 4th mode + SSE permission_request + first-responder vote + per-session routing POST /session/:id/permission/:requestId (PR#131 correction) |
| 6 | Multi-client concurrency | Same-session prompts FIFO + event fan-out + any client can answer permission |
| 7 | Deployment mode | Mode B mainline (P0); Mode A parking lot (P2), revisit after Mode B contract stabilizes |

See §02 decision matrix for details.
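Decisions 5 and 6 combine per-session routing with a first-responder vote: any attached client may answer a permission request, and the first answer wins. A minimal sketch of that rule follows, assuming in-memory bookkeeping; the class name, method names, and result strings are hypothetical and not the planned PermissionMediator interface.

```typescript
// Hypothetical sketch of the first-responder policy; not the PermissionMediator interface.
type Decision = "allow" | "deny";
type AnswerResult = "accepted" | "already_resolved" | "unknown_request";

interface Resolution {
  clientId: string;
  decision: Decision;
}

class FirstResponderPermissions {
  // Keyed by "sessionId/requestId"; an undefined value means still pending.
  private readonly requests = new Map<string, Resolution | undefined>();

  private key(sessionId: string, requestId: string): string {
    return `${sessionId}/${requestId}`;
  }

  // Called when the daemon emits a permission_request event for a session.
  open(sessionId: string, requestId: string): void {
    this.requests.set(this.key(sessionId, requestId), undefined);
  }

  // Would back POST /session/:id/permission/:requestId; the first answer wins.
  answer(sessionId: string, requestId: string, clientId: string, decision: Decision): AnswerResult {
    const key = this.key(sessionId, requestId);
    if (!this.requests.has(key)) return "unknown_request";
    if (this.requests.get(key) !== undefined) return "already_resolved";
    this.requests.set(key, { clientId, decision });
    return "accepted";
  }

  resolution(sessionId: string, requestId: string): Resolution | undefined {
    return this.requests.get(this.key(sessionId, requestId));
  }
}
```

Keying on the session id as well as the request id means an answer addressed to the wrong session is rejected as unknown instead of resolving another session's request.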

Related Issues / PRs


Runtime locality / environment contract (2026-05-15 update from chiga0 comment 4458840712)

The Mode B daemon is the runtime owner. MCP / Skills / shell / LSP / tool execution / provider auth / file access are all evaluated on the daemon host / pod, not on the client machine. This deployment contract is documented in qwen-code-daemon-design commit 36c9927 (codeagents):

  • §04 §5 Runtime locality / environment contract: new section with 5 concrete implications + reverse RPC scope clarification
  • §05 §8 Production deployment best practice: deny-by-default egress + credentials locality + deployment checklist
  • §06 §3 1.5c: adds 2 new diagnostic routes (GET /workspace/preflight + GET /workspace/env) + actionable failure detail requirement for GET /workspace/mcp / GET /workspace/skills (no "silent tool loss")

Concrete implications (must be documented for remote clients)

| # | Implication | Example |
| --- | --- | --- |
| 1 | stdio MCP servers spawn on daemon host | daemon host needs node / uv / python / docker / cloud CLIs + env vars / secrets / files |
| 2 | HTTP/SSE MCP servers reached from daemon host | daemon host/pod needs outbound egress to MCP endpoints + downstream APIs |
| 3 | Local resources (localhost / Unix sockets / volumes / kubeconfig / SSH agent / browser profile) | All daemon-host local, NOT client |
| 4 | Personal skills (~/.qwen/skills) / project skills (.qwen/skills) / extension skills | Must exist on daemon filesystem; client local skills NOT auto-visible |
| 5 | Locked VPC/pod without egress | SaaS MCP discovery/init or tool calls will fail unless network policy allows |

Reverse RPC scope (clarified)

The 5 Client Capability reverse RPC classes in §6 (editor / clipboard / browser / notification / file_picker) are explicit delegations to client-local resources, NOT general fallbacks for MCP/skill/shell execution. A future client-side MCP/skill fallback would need a separate design.

Stage 1.5c route enhancements (PR pending)

  • GET /workspace/preflight — daemon startup + config readiness check (providers / MCP / skills / required binaries / egress detection)
  • GET /workspace/env — daemon host environment summary (available binaries / env vars masked / mount points / network reachability)
  • GET /workspace/mcp enriched to return {status, error, errorKind: missing_binary | blocked_egress | auth_env_error | init_timeout | protocol_error} per server
  • GET /workspace/skills enriched to return {loaded, error: missing_file | parse_error | required_binary_not_found} per skill
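To show how a client might turn the enriched GET /workspace/mcp payload into actionable detail, here is a small sketch. The `errorKind` values are the ones listed above; the `status` strings, the hint wording, and the `describeMcpServer` helper are assumptions for illustration.

```typescript
// errorKind values come from the route description above; everything else is illustrative.
type McpErrorKind =
  | "missing_binary"
  | "blocked_egress"
  | "auth_env_error"
  | "init_timeout"
  | "protocol_error";

interface McpServerStatus {
  status: string; // status values are not enumerated in this issue
  error?: string;
  errorKind?: McpErrorKind;
}

// Hypothetical remediation hints, one per errorKind.
const HINTS: Record<McpErrorKind, string> = {
  missing_binary: "Install the required runtime on the daemon host.",
  blocked_egress: "Allow the endpoint in the daemon's network policy.",
  auth_env_error: "Provision the credentials or env vars on the daemon host.",
  init_timeout: "Check server startup time and retry.",
  protocol_error: "Check the server version against the daemon.",
};

// One human-readable line per server, so a failed server is never silently dropped.
function describeMcpServer(name: string, server: McpServerStatus): string {
  if (server.errorKind === undefined) return `${name}: ${server.status}`;
  const detail = server.error ? ` (${server.error})` : "";
  return `${name}: ${server.status}${detail}. ${HINTS[server.errorKind]}`;
}
```

Because `HINTS` is typed as `Record<McpErrorKind, string>`, adding a new `errorKind` to the union without a matching hint becomes a compile error.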

Deployment best practice

  • Deny-by-default egress + explicit allowlist (daemon only needs network surface required by configured providers / MCP / skills)
  • Credentials live on daemon host (OAuth tokens / API keys / SSH agent / kubeconfig); client credentials do NOT auto-transfer
  • Deployment checklist before exposing daemon to remote clients: install MCP runtimes, sync skills, provision secrets, configure network policy, surface preflight to client UI

Implementation Tracker — Issue #4175 (25-PR Wave Plan)

Issue #4175 by @doudouOUC (2026-05-15) is the authoritative 25-PR rollout plan for Mode B v0.16 production-ready. Stage 1.5 sub-stages in this issue body map to Wave 1-5 below; Wave 6 covers release hardening + v0.16.

Critical dependency chain

capability registry -> DaemonSessionClient -> typed events
  -> daemon-stamped clientId -> session-scoped permission
  -> mutation-gating helper -> control-plane mutation routes
  -> bridge extraction -> real MCP pool + full PermissionMediator

6 Wave breakdown

| Wave | Scope | PRs | Maps to Stage |
| --- | --- | --- | --- |
| 1 | Protocol foundation (no deps): baseline harness + capability registry + DaemonSessionClient skeleton + typed event schema | PR 1-4; 🔧 PR 2 OPEN (#4191) | 1.5a #9 + 1.5-prereq |
| 2 | Session lifecycle + min multi-client safety: per-request sessionScope + loadSession HTTP + minimal client identity + session-scoped permission | PR 5-8 | 1.5a #1/#2/#3 (minimal)/#5 |
| 3 | Read-only control plane + diagnostics: read-only status routes + runtime-diagnostics + MCP resource guardrails (measurement, not full pool) | PR 9-11 | 1.5c read-only + chiga0 diagnostics |
| 4 | Auth-gated mutation/control routes: mutation gating helper + memory/agents CRUD + approval/tools/init + safe file read + file write/edit + auth device-flow | PR 12-17 | 1.5c CRUD + file routes |
| 5 | Architecture extraction + full multi-client security: bridge primitives extraction + real MCP shared pool (config-hash keyed) + pairing revocation + full PermissionMediator | PR 18-20 | 1.5-prereq full + 1.5a #3 full |
| 6 | Release hardening + v0.16: alpha release docs + npm alpha publish + production token defaults + deployment references + v0.16 release | PR 21-25 | Stage 2 + release |

Key sequencing decisions (from Issue #4175)

  1. Issue/docs cleanup is not a blocking PR — body itself is source of truth
  2. No full MCP shared pool in Phase 1: McpClientManager is deeply coupled to ToolRegistry/Config/WorkspaceContext; start with measurement + guardrails (PR 11), full pool in Wave 5 (PR 19)
  3. Protocol skeleton before CRUD — capability tags / typed events / DaemonSessionClient before broad control-plane routes
  4. Minimal client identity before mutation routes — PR 7 stamped clientId → PR 8 session-scoped permission → PR 12 mutation gate → PR 13+ CRUD
  5. Read-only first, mutation second — PR 9 read-only → PR 13+ mutation
  6. MCP pool + full PermissionMediator are Wave 5 follow-ups — guardrails (PR 11) can land early

Open questions

Moved to #4175 §Open questions — implementation-decision questions live next to the 25-PR rollout plan rather than the design tracker. The questions cover: npm alpha publish timing / loopback token default / token instance path / remote-control alignment / worktree interaction / client adapter requirement for v0.16. Update them on #4175 to keep one source of truth.

See §06 §3.1 Wave breakdown for full Wave 1-6 PR detail tables with dependencies.
