A complete daemon design proposal for Qwen Code, organized as a 6-chapter design series (simplified from original 14 chapters). This issue tracks implementation; the series is the source of truth.
Overall main-line progress ~30-35%. Issue was auto-closed by PR #4113's "closes #3803 §02" trigger but only §02 (process model decision) was implemented — reopened to track remaining stages.
Architecture (post PR #4113, with PR#131 corrections)
1 daemon process = 1 workspace × N sessions multiplexed:
qwen serve (bound to cwd = single workspace)
├─ Express HTTP front + bearer auth + Host allowlist
├─ EventBus (per-session fan-out + ring replay + Last-Event-ID reconnect)
│   └─ INTERNAL fan-out primitive — exposed to clients via HTTP/SSE projection,
│      NOT subscribed directly by external clients
└─ qwen --acp child (workspace bound)
    └─ QwenAgent.sessions: Map → {sess-1, sess-2, sess-3}
        └─ per-session Config / ToolRegistry / McpClientManager / FileReadCache
qwen serve startup binds cwd to a single workspace; daemon embeds a single qwen --acp child; N sessions multiplex via QwenAgent.sessions: Map<sessionId, Session> (packages/cli/src/acp-integration/acpAgent.ts:194). Multi-workspace deployment = multiple daemon processes (systemd / docker / k8s each 1 process per workspace).
Key invariants (corrected per PR#131)
N sessions on same daemon share: the same qwen --acp child process + workspace bound context
N sessions on same daemon do NOT share: Config / ToolRegistry / McpClientManager / FileReadCache — these are still created per ACP session; cross-session MCP sharing requires future pool/proxy design
Cross-workspace = cross-daemon process = OS process-level isolation (strongest)
Blast radius minimal (daemon crash only affects 1 workspace)
K8s cloud-native natural fit (1 pod = 1 daemon = 1 workspace)
WorkspaceMismatchError returned as 400 workspace_mismatch when POST /session cwd ≠ bound workspace
--workspace <path> CLI flag overrides process.cwd() at boot; canonicalized via canonicalizeWorkspace
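The workspace-binding invariant can be illustrated with a minimal sketch. It assumes a hypothetical error class shape and uses `path.resolve` as a stand-in for `canonicalizeWorkspace` (the real helper may do more, for example symlink resolution); only the error name and the `400 workspace_mismatch` mapping come from the text above.

```typescript
import * as path from "node:path";

// Hypothetical error shape; only the name and the 400 workspace_mismatch
// mapping are taken from the issue text.
class WorkspaceMismatchError extends Error {
  readonly status = 400;
  readonly code = "workspace_mismatch";
  readonly bound: string;
  readonly requested: string;

  constructor(bound: string, requested: string) {
    super(`cwd ${requested} does not match bound workspace ${bound}`);
    this.name = "WorkspaceMismatchError";
    this.bound = bound;
    this.requested = requested;
  }
}

// Stand-in for canonicalizeWorkspace: absolute and normalized, no symlink handling.
function canonicalizeForSketch(p: string): string {
  return path.resolve(p);
}

// Returns the canonical workspace when the request targets it, otherwise throws.
function assertBoundWorkspace(bound: string, requestedCwd: string): string {
  const canonicalBound = canonicalizeForSketch(bound);
  const canonicalRequested = canonicalizeForSketch(requestedCwd);
  if (canonicalRequested !== canonicalBound) {
    throw new WorkspaceMismatchError(canonicalBound, canonicalRequested);
  }
  return canonicalBound;
}
```

A route handler for POST /session could call `assertBoundWorkspace` with the daemon's bound workspace and the request's cwd, then translate the thrown error into the HTTP response.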
Client boundary (per PR#131)
Clients do NOT directly subscribe to the in-memory EventBus. Clients connect to the daemon HTTP/SSE API:
client (TUI / channels / web / IDE)
-> DaemonSessionClient (SDK helper)
-> POST /session/:id/{prompt,cancel,model,...} + GET /session/:id/events (SSE)
-> daemon internal EventBus (fan-out projection)
EventBus lift (1.5-prereq finding #2) refers to extracting typed event schema + reducer + server-side fan-out primitive + transport adapter to a shared package — NOT exposing the memory object to external clients.
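A rough sketch of the server-side fan-out primitive with ring replay might look like the following. The class name, method names, and the numeric event id are assumptions made for illustration; the daemon's actual EventBus may differ. It stays inside the daemon, and an SSE route would call `subscribe` with the client's Last-Event-ID.

```typescript
// An event stamped with a monotonically increasing id (usable as the SSE id).
interface StampedEvent<T> {
  id: number;
  data: T;
}

type Listener<T> = (event: StampedEvent<T>) => void;

// Hypothetical per-session ring: keeps the last `capacity` events for replay.
class SessionEventRing<T> {
  private readonly buffer: StampedEvent<T>[] = [];
  private readonly listeners = new Set<Listener<T>>();
  private readonly capacity: number;
  private nextId = 1;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Stamp, retain, and fan out to every live subscriber.
  publish(data: T): StampedEvent<T> {
    const event: StampedEvent<T> = { id: this.nextId++, data };
    this.buffer.push(event);
    if (this.buffer.length > this.capacity) this.buffer.shift();
    for (const listener of this.listeners) listener(event);
    return event;
  }

  // Replay retained events newer than lastEventId, then attach for live events.
  subscribe(listener: Listener<T>, lastEventId?: number): () => void {
    if (lastEventId !== undefined) {
      for (const event of this.buffer) {
        if (event.id > lastEventId) listener(event);
      }
    }
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }
}
```

Events already evicted from the ring cannot be replayed, so a real implementation would likely need to signal the gap to the reconnecting client; that part is left out here.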
Dual deployment modes
| Mode | Command | TUI | Use case | Priority |
| --- | --- | --- | --- | --- |
| Mode B: Headless Daemon + HttpServer | qwen serve [--port N] | ❌ | Server / container / remote machine / K8s pod / unified client runtime | P0 mainline |
| Mode A: CLI + HttpServer | qwen --serve [--port N] | ✅ local | Parked. Wait until Mode B event/control/client contract stabilizes | P2 parking lot |
Both modes share the same wire protocol (Express 5 + ACP NDJSON over HTTP+SSE). Mode B prioritized (2026-05-15 decision) because Mode A value depends on 1.5c daemon-side state CRUD (else remote clients in Mode A are still thin shells, creating UX gap). Ship 1.5c first → remote clients fully functional → then Mode A revisit.
GET / POST /workspace/memory (read/write ~/.qwen/memory.json)
GET /workspace/mcp + POST /workspace/mcp/:server/restart (MCP status / restart)
GET / POST /workspace/agents + :agentType CRUD
POST /workspace/tools/:name/enable (tool allowlist)
POST /session/:id/approval-mode
GET /session/:id/context (context usage — new per PR#131)
GET /session/:id/supported-commands (command palette / UI affordance — new per PR#131)
GET /workspace/providers (provider/model runtime state — new per PR#131)
POST /workspace/auth/device-flow or Capability RPC (auth — new per PR#131)
POST /workspace/init (project init / trust state)
Compatibility: All new routes have capability tags; clients fallback to old behavior or hide UI when capability missing. Read-only first, mutation second.
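The compatibility rule can be sketched on the client side as below. The `features` array, the tag names, and the string-typed version fields are assumptions; only `protocol_versions: { acp, daemon_envelope }` and the idea of per-route capability tags appear in this issue.

```typescript
// Hypothetical /capabilities payload shape.
interface CapabilitiesPayload {
  protocol_versions: { acp: string; daemon_envelope: string };
  features: string[];
}

// A UI affordance that depends on one capability tag.
interface UiAction {
  label: string;
  requires: string;
}

function hasFeature(caps: CapabilitiesPayload, tag: string): boolean {
  return caps.features.includes(tag);
}

// Hide actions whose capability tag the daemon did not advertise.
function visibleActions(caps: CapabilitiesPayload, actions: UiAction[]): UiAction[] {
  return actions.filter((action) => hasFeature(caps, action.requires));
}
```

Against an older daemon that advertises no tag for a new route, the matching action simply drops out of the list, which corresponds to the "hide UI when capability missing" branch.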
A2 — new inProcessAcpBridge.ts (~200 LOC) implementing HttpAcpBridge interface
A3 — gemini.tsx --serve flag integration; lazy import; boot path; default port 0
Phase B — Remote bind + auth/CORS defaults (~1d)
Phase C — Lifecycle coordination (~1d)
Why parked: Mode A value is "local TUI super-client + remote clients sharing same daemon". Without 1.5c, remote clients in Mode A are still thin shells. Without 1.5-prereq typed event contract, TUI adapter has no shared reducer to consume. Ship Mode B contract first → Mode A revisit cleanly.
Mode B daemon is the runtime owner. MCP / Skills / shell / LSP / tool execution / provider auth / file access all evaluated on the daemon host / pod, not the client machine. This deployment contract is documented in qwen-code-daemon-design commit 36c9927 (codeagents):
§04 §五 Runtime locality / environment contract — new section with 5 concrete implications + reverse RPC scope clarification
§05 §八 Production deployment (生产部署) best practice — deny-by-default egress + credentials locality + deployment checklist
§06 §三 1.5c — adds 2 new diagnostic routes (GET /workspace/preflight + GET /workspace/env) + actionable failure detail requirement for GET /workspace/mcp / GET /workspace/skills (no "silent tool loss")
Concrete implications (must be documented for remote clients)
Personal skills (~/.qwen/skills) / project skills (.qwen/skills) / extension skills: must exist on daemon filesystem; client-local skills are NOT auto-visible
5. Locked VPC/pod without egress: SaaS MCP discovery/init or tool calls will fail unless network policy allows
Reverse RPC scope (clarified)
§六 Client Capability reverse RPC 5 classes (editor / clipboard / browser / notification / file_picker) are explicit delegations to client-local resources, NOT general fallbacks for MCP/skill/shell execution. Future client-side MCP/skill fallback would need separate design.
GET /workspace/env — daemon host environment summary (available binaries / env vars masked / mount points / network reachability)
GET /workspace/mcp enriched to return {status, error, errorKind: missing_binary | blocked_egress | auth_env_error | init_timeout | protocol_error} per server
GET /workspace/skills enriched to return {loaded, error: missing_file | parse_error | required_binary_not_found} per skill
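One way a client could turn the enriched MCP status into actionable messages is sketched below. The `errorKind` values are the ones listed above; the map-by-server-name response shape, the `status` values, and the hint wording are assumptions for illustration.

```typescript
type McpErrorKind =
  | "missing_binary"
  | "blocked_egress"
  | "auth_env_error"
  | "init_timeout"
  | "protocol_error";

// Hypothetical per-server entry of GET /workspace/mcp.
interface McpServerStatus {
  status: string;
  error?: string;
  errorKind?: McpErrorKind;
}

// Illustrative hints; the exhaustive switch fails to compile if a kind is added
// to the union without a matching case.
function hintFor(kind: McpErrorKind): string {
  switch (kind) {
    case "missing_binary":
      return "install the MCP runtime on the daemon host";
    case "blocked_egress":
      return "allow the endpoint in the daemon network policy";
    case "auth_env_error":
      return "provision credentials on the daemon host";
    case "init_timeout":
      return "check server startup time and reachability";
    case "protocol_error":
      return "check MCP server and client versions";
    default: {
      const unreachable: never = kind;
      return unreachable;
    }
  }
}

// One line per failing server, so no tool disappears silently.
function describeFailures(servers: Record<string, McpServerStatus>): string[] {
  const lines: string[] = [];
  for (const [name, server] of Object.entries(servers)) {
    if (server.errorKind === undefined) continue;
    lines.push(`${name}: ${server.errorKind} (${hintFor(server.errorKind)})`);
  }
  return lines;
}
```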
Deployment best practice
Deny-by-default egress + explicit allowlist (daemon only needs network surface required by configured providers / MCP / skills)
Credentials live on daemon host (OAuth tokens / API keys / SSH agent / kubeconfig); client credentials do NOT auto-transfer
Deployment checklist before exposing daemon to remote clients: install MCP runtimes, sync skills, provision secrets, configure network policy, surface preflight to client UI
Issue #4175 by @doudouOUC (2026-05-15) is the authoritative 25-PR rollout plan for Mode B v0.16 production-ready. Stage 1.5 sub-stages in this issue body map to Wave 1-5 below; Wave 6 covers release hardening + v0.16.
Critical dependency chain
capability registry -> DaemonSessionClient -> typed events
-> daemon-stamped clientId -> session-scoped permission
-> mutation-gating helper -> control-plane mutation routes
-> bridge extraction -> real MCP pool + full PermissionMediator
Issue/docs cleanup is not a blocking PR — body itself is source of truth
No full MCP shared pool in Phase 1 — McpClientManager deeply coupled to ToolRegistry/Config/WorkspaceContext; start with measurement + guardrails (PR 11), full pool in Wave 5 (PR 19)
Protocol skeleton before CRUD — capability tags / typed events / DaemonSessionClient before broad control-plane routes
MCP pool + full PermissionMediator are Wave 5 follow-ups — guardrails (PR 11) can land early
Open questions
Moved to #4175 §Open questions — implementation-decision questions live next to the 25-PR rollout plan rather than the design tracker. The questions cover: npm alpha publish timing / loopback token default / token instance path / remote-control alignment / worktree interaction / client adapter requirement for v0.16. Update them on #4175 to keep one source of truth.
Design Documents (6-chapter series)
Implementation Tracker (Mode B priority — PR#131 reset)
qwen serve daemon
DaemonSessionClient SDK + AcpChannel / EventBus / PermissionMediator lift to @qwen-code/acp-bridge
DaemonSessionClient; default switch must await P0/P1
qwen --serve flag — Issue #4156 doudouOUC 3-phase plan; A1 PR #4160 ✅ MERGED; rest parked until Mode B contract stabilizes
qwen --acp child)
Engineering Principles (PR#131 — every PR must satisfy)
Stage 1.5 is incremental migration, not big rewrite. Each PR must:
qwen serve unbroken
/capabilities feature tag
P0: Stage 1.5a chiga0 must-haves (9 remaining, ~2 weeks, 9 PRs in parallel)
sessionScope override — body { scope: 'single' | 'thread' | 'user' }
loadSession / unstable_resumeSession HTTP — POST /session/:id/load + POST /session/:id/resume (★ biggest user pain point; PR #3739 "Add background agent resume and continuation" has transcript on disk, only wire route missing)
originatorClientId (NOT self-declared)
POST /session/:id/heartbeat
permission_already_resolved event (first-responder vote loser feedback)
slow_client_warning event before client_evicted
POST /session/:id/_meta (IM-style per-session context push) + close/delete session
/capabilities actual feature negotiation — protocol_versions: { acp, daemon_envelope } payload
bbc7b8b6)
POST /session/:id/permission/:requestId (session-scoped pending map; legacy POST /permission/:requestId retained for compatibility)
P0: Stage 1.5c daemon-side state CRUD (10+ wire routes, ~1-2 weeks)
Let remote clients access daemon-side state (no longer thin shell). Reference implementation: Claude Code /agents UI Deep-Dive.
GET / POST /workspace/memory (read/write ~/.qwen/memory.json)
GET /workspace/mcp + POST /workspace/mcp/:server/restart (MCP status / restart)
GET / POST /workspace/agents + :agentType CRUD
POST /workspace/tools/:name/enable (tool allowlist)
POST /session/:id/approval-mode
GET /session/:id/context (context usage — new per PR#131)
GET /session/:id/supported-commands (command palette / UI affordance — new per PR#131)
GET /workspace/providers (provider/model runtime state — new per PR#131)
POST /workspace/auth/device-flow or Capability RPC (auth — new per PR#131)
POST /workspace/init (project init / trust state)
Compatibility: All new routes have capability tags; clients fallback to old behavior or hide UI when capability missing. Read-only first, mutation second.
P1: Stage 1.5-prereq typed contract + bridge primitives (~1 week)
Lift shared primitives to @qwen-code/acp-bridge package — reusable by serve/ + remoteControl/ + nonInteractive/:
SessionEvent / ControlEvent — convert data: unknown to discriminated union
DaemonSessionClient — shared SDK helper for TUI / channels / web / IDE
AcpChannel interface + Transport interface
EventBus lift to shared package (fan-out primitive — internal to daemon, projected to clients via HTTP/SSE)
PermissionMediator interface + 4 policy strategies (first-responder / designated / consensus / local-only)
FileSystemService per-request abstraction
/capabilities protocol_versions + feature registry for graceful degradation
Compatibility: Old DaemonEvent envelope still parseable; typed union is SDK/helper layer additive enhancement.
P1 behind flag: Stage 1.5-client adapters (~2-3 weeks)
Onboard primary clients to Mode B daemon — behind flag first, default switch after P0/P1:
useGeminiStream direct in-process
packages/channels/base/AcpBridge.ts spawns own qwen --acp
DaemonSessionClient
/demo OPEN
qwen --acp
P2 deferred: Stage 1.5b Mode A — Issue #4156 doudouOUC 3-phase plan
validateAndCanonicalizeWorkspace helper from PR #4113 "refactor(serve): 1 daemon = 1 workspace (#3803 §02)" code (~50 LOC)
createInMemoryChannel helper — PR #4160 "refactor(serve): extract createInMemoryChannel helper (#4156 A1)" ✅ MERGED 2026-05-15
inProcessAcpBridge.ts (~200 LOC) implementing HttpAcpBridge interface
gemini.tsx --serve flag integration; lazy import; boot path; default port 0
Why parked: Mode A value is "local TUI super-client + remote clients sharing same daemon". Without 1.5c, remote clients in Mode A are still thin shells. Without 1.5-prereq typed event contract, TUI adapter has no shared reducer to consume. Ship Mode B contract first → Mode A revisit cleanly.
P3: Stage 2 (~3-4 weeks, split 2a-2d)
/health?deep=1 + POST /ext/:method + permission policy schema + Reverse RPC 5 Client Capability classes (editor / clipboard / browser / notification / file_picker)
HttpTransport SDK adapter
--max-sessions guard rail
qwen --acp child bridge; resolve acpAgent.ts:601 loadSettings(cwd) cross-workspace pollution
After Stage 2 the daemon protocol surface is locked; external integrators can build on top.
Recommended 4-week execution timeline (PR#131 finalized)
External Reference Architecture (out of project scope)
Designed but not on qwen-code's roadmap; meant for commercial platforms / k8s operators / cloud vendors:
qwen-coordinator + multi-daemon spawn / route / cleanup / aggregate API
ShellSandbox interface + 4 local + 4 remote impls
Key design decisions matrix (PR#131 corrections)
sessionScope: 'single'; Stage 1.5 #1 allows per-request override
newSession() creates new Config, ToolRegistry owns its own McpClientManager); cross-session MCP sharing requires future pool/proxy — corrected from previous "per-daemon shared" claim
permission_request + first-responder vote + per-session routing POST /session/:id/permission/:requestId (PR#131 correction)
See §02 decision matrix for details.
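The first-responder vote with a session-scoped pending map could be sketched roughly as follows. Only `permission_already_resolved` and the per-session routing idea come from this issue; the class, the other result names, and the `allow` / `deny` decision values are hypothetical.

```typescript
type Decision = "allow" | "deny";

type VoteResult =
  | { event: "permission_resolved"; requestId: string; clientId: string; decision: Decision }
  | { event: "permission_already_resolved"; requestId: string; winnerClientId: string }
  | { event: "unknown_request"; requestId: string };

interface PendingEntry {
  winnerClientId?: string;
  decision?: Decision;
}

// One instance per session, so a requestId is only valid inside its own session.
class SessionPermissionMap {
  private readonly pending = new Map<string, PendingEntry>();

  // Called when the agent emits a permission_request for this session.
  open(requestId: string): void {
    this.pending.set(requestId, {});
  }

  // First vote wins; later voters are told who resolved the request.
  vote(requestId: string, clientId: string, decision: Decision): VoteResult {
    const entry = this.pending.get(requestId);
    if (entry === undefined) {
      return { event: "unknown_request", requestId };
    }
    if (entry.winnerClientId !== undefined) {
      return {
        event: "permission_already_resolved",
        requestId,
        winnerClientId: entry.winnerClientId,
      };
    }
    entry.winnerClientId = clientId;
    entry.decision = decision;
    return { event: "permission_resolved", requestId, clientId, decision };
  }
}
```

In this sketch the `clientId` passed to `vote` is meant to be the daemon-stamped identity, not a value the client declares about itself.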
Related Issues / PRs
qwen-code-daemon-design series
/demo OPEN
/agents Deep-Dive (Stage 1.5c reference implementation)
Runtime locality / environment contract (2026-05-15 update from chiga0 comment 4458840712)
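Mode B daemon is the runtime owner. MCP / Skills / shell / LSP / tool execution / provider auth / file access all evaluated on the daemon host / pod, not the client machine. This deployment contract is documented in qwen-code-daemon-design commit 36c9927 (codeagents):
§04 §五 Runtime locality / environment contract — new section with 5 concrete implications + reverse RPC scope clarification
§05 §八 Production deployment (生产部署) best practice — deny-by-default egress + credentials locality + deployment checklist
§06 §三 1.5c — adds 2 new diagnostic routes (GET /workspace/preflight + GET /workspace/env) + actionable failure detail requirement for GET /workspace/mcp / GET /workspace/skills (no "silent tool loss")
Concrete implications (must be documented for remote clients)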
node / uv / python / docker / cloud CLIs + env vars / secrets / files
localhost / Unix sockets / volumes / kubeconfig / SSH agent / browser profile)
Personal skills (~/.qwen/skills) / project skills (.qwen/skills) / extension skills
Reverse RPC scope (clarified)
§六 Client Capability reverse RPC 5 classes (editor / clipboard / browser / notification / file_picker) are explicit delegations to client-local resources, NOT general fallbacks for MCP/skill/shell execution. Future client-side MCP/skill fallback would need separate design.
Stage 1.5c route enhancements (PR pending)
GET /workspace/preflight — daemon startup + config readiness check (providers / MCP / skills / required binaries / egress detection)
GET /workspace/env — daemon host environment summary (available binaries / env vars masked / mount points / network reachability)
GET /workspace/mcp enriched to return {status, error, errorKind: missing_binary | blocked_egress | auth_env_error | init_timeout | protocol_error} per server
GET /workspace/skills enriched to return {loaded, error: missing_file | parse_error | required_binary_not_found} per skill
Deployment best practice
Implementation Tracker — Issue #4175 (25-PR Wave Plan)
Issue #4175 by @doudouOUC (2026-05-15) is the authoritative 25-PR rollout plan for Mode B v0.16 production-ready. Stage 1.5 sub-stages in this issue body map to Wave 1-5 below; Wave 6 covers release hardening + v0.16.
Critical dependency chain
6 Wave breakdown
runtime-diagnostics + MCP resource guardrails (measurement, not full pool)
Key sequencing decisions (from Issue #4175)
No full MCP shared pool in Phase 1 — McpClientManager deeply coupled to ToolRegistry / Config / WorkspaceContext; start with measurement + guardrails (PR 11), full pool in Wave 5 (PR 19)
Protocol skeleton before CRUD — capability tags / typed events / DaemonSessionClient before broad control-plane routes
clientId → PR 8 session-scoped permission → PR 12 mutation gate → PR 13+ CRUD
MCP pool + full PermissionMediator are Wave 5 follow-ups — guardrails (PR 11) can land early
Open questions
Moved to #4175 §Open questions — implementation-decision questions live next to the 25-PR rollout plan rather than the design tracker. The questions cover: npm alpha publish timing / loopback token default / token instance path / remote-control alignment / worktree interaction / client adapter requirement for v0.16. Update them on #4175 to keep one source of truth.
See §06 §三·一 Wave breakdown for full Wave 1-6 PR detail tables with dependencies.