proposal(serve): Mode B feature-priority roadmap toward v0.16 production-ready #4175

@doudouOUC

Background

Stage 1 daemon (#3889) and the 1 daemon = 1 workspace refactor (#4113) are merged. Mode B (qwen serve) is functionally runnable today: the Stage 1 HTTP/SSE routes work, auth defenses are in place, and same-workspace session multiplexing is live.

The remaining work is not “make the demo run”; it is turning the daemon into a stable multi-client runtime contract that TUI / channels / SDK / IDE clients can safely share. Browser clients are out of scope for the daemon itself — the @qwen-code/webui library is published as a standalone package for downstream embedders to host (see 2026-05-21 direction update below).

Current facts:

  • NPM @qwen-code/qwen-code@0.15.11 predates #3889 (feat(cli,sdk): qwen serve daemon (Stage 1)), so users installing from npm do not yet get qwen serve.
  • The TypeScript SDK package is @qwen-code/sdk (currently 0.1.7), not @qwen-code/sdk-typescript.
  • After #4113 (refactor(serve): 1 daemon = 1 workspace, #3803 §02), the architecture is 1 daemon process = 1 workspace × N sessions, with one workspace-bound qwen --acp child and multiple ACP sessions multiplexed inside it.
  • MCP / skills / shell / LSP / provider auth / file access execute in the daemon runtime environment, not on the attached client machine.
  • EventBus is an internal daemon fan-out primitive projected to clients via HTTP/SSE; clients should consume a typed SDK/protocol layer, not the in-memory object directly.
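As a rough illustration of what "consume a typed protocol layer, not the in-memory object" means on the client side, the sketch below parses one standard text/event-stream block into a small typed frame. It assumes the daemon emits ordinary SSE framing (id / event / data fields); the `SseFrameSketch` type and `parseSseBlockSketch` function are hypothetical names for this example and are not part of @qwen-code/sdk.

```typescript
// Hypothetical frame shape for illustration; not the SDK's actual type.
interface SseFrameSketch {
  id: string | undefined;
  event: string;
  data: string;
}

// Parse one text/event-stream block (the lines between two blank lines).
function parseSseBlockSketch(block: string): SseFrameSketch | undefined {
  let id: string | undefined;
  let event = "message"; // SSE default event name when no "event:" field is present
  const dataLines: string[] = [];

  for (const line of block.split(/\r?\n/)) {
    if (line === "" || line.startsWith(":")) continue; // blank line or comment line
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1); // SSE strips one leading space
    if (field === "id") id = value;
    else if (field === "event") event = value;
    else if (field === "data") dataLines.push(value);
  }

  if (dataLines.length === 0) return undefined; // nothing to dispatch
  return { id, event, data: dataLines.join("\n") };
}
```

A client would typically JSON-parse `data` and narrow on `event` before handing the result to a reducer, which keeps the in-memory EventBus an implementation detail of the daemon.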

Branching strategy

Updated 2026-05-19: per maintainer guidance, future Mode B feature PRs no longer merge directly into main one-at-a-time. They merge into the long-lived integration branch daemon_mode_b_main, and daemon_mode_b_main is rolled into main periodically as feature-cohesive batches. This addresses maintainer feedback that the Wave-by-implementation-layer rollout produced too many small PRs against main.

Current progress

2026-05-25 — daemon_mode_b_main → main integration merge opened as DRAFT

#4490 opened as DRAFT 2026-05-25 — first reverse-direction integration merge of daemon_mode_b_main into main per the branching strategy. 14 feature PRs landed on the integration branch since divergence base #4304 (68e3ec988, 2026-05-19): F1 (#4319 / #4334 / #4445) + F2 (#4336 / #4411 / #4460) + F3 (#4335) + F4 prereq (#4360) + F5 alpha docs (#4473 / #4483) + chiga0 daemon-ui library track (#4328 / #4353) + pre-F1 housekeeping (#4297 / #4305). 536 files / +92322/−16894 against the actual GitHub merge base.

Status: CONFLICTING. 5 main commits landed AFTER the 2026-05-24 #4469 sync (94da486e1 / 8ef73599d / ab26a5ab7 / 84f408017 / 4dc98484f — weixin / cli text-buffer / skills heap-debug / cli completion). A sibling chore(integration): sync main PR (mirroring #4469's pattern) is needed first to clear the conflict before #4490 can flip to ready-for-review.

Why DRAFT: maintainer-only decisions (merge strategy, sync timing, CI dispatch, version tagging) gate this. PR 28 npm publish scaffolding sequences AFTER #4490 lands (npm publishes from main).

Recommended merge strategy: merge commit (preserves all 14 PR # links + individual feature commits in main's history). Squash would lose the boundary structure.

2026-05-24 — F5 baseline ready, PR 27 ✅ MERGED

PR 27 (#4473) ✅ MERGED 2026-05-24 15:57Z (merge commit 63803deab). 2 commits, +172/−6 across 4 files, 2/2 review threads resolved (both copilot doc fixes adopted, zero declined). wenshao APPROVED. F5 chain first leg done.

Just merged (2026-05-24):

Still open (do not block F5):

  • 🔄 Feat/daemon react cli #4380 chiga0 feat/daemon-react-cli — CHANGES_REQUESTED (wenshao). Library-only daemon-backed React web-shell; independent of v0.16-alpha critical path.
  • 🔄 docs: Refresh daemon developer docs #4412 doudouOUC docs(developers): daemon-mode developer deep-dive — CHANGES_REQUESTED (wenshao + Copilot). Targets main. Could fold into PR 27 alpha docs or merge standalone first.

F5 chain status (post-2026-05-24 merges):

| PR | Status | Blocker |
| --- | --- | --- |
| PR 27 alpha docs + ~10 LOC SDK env fallback | #4473 MERGED 2026-05-24 15:57Z (merge 63803deab) | |
| PR 28 npm publish scaffolding | 🟢 Ready (no code) | Sequentially after PR 27 ✅ |
| PR 30a local launch refs | #4483 MERGED 2026-05-25 03:00Z (merge 74c5d4505) | |
| PR 31 v0.16-alpha.0 cut | 🟢 Ready | After PR 28 + 30a |

2026-05-24 scope freeze — v0.16-alpha = text-only + local-only deployment

Phase A scope frozen 2026-05-24 (resolves the "awaiting @wenshao decision" thread opened 2026-05-23). v0.16-alpha targets the text-only chat / coding product surface with local-only deployment. Containerized deployment (PR 30b) explicitly NOT in v0.16; ships as a later v0.16.x patch once an enterprise pilot is committed.

In scope for v0.16-alpha:

  • F5 release chain (text-only flavor, 4 PRs):
    • PR 27 alpha docs (#4473) MERGED 2026-05-24 15:57Z (merge commit 63803deab, +175/−6 across 4 files, 2 commits, base daemon_mode_b_main, wenshao APPROVED). v0.16-alpha banner + comprehensive known-limits section in docs/users/qwen-serve.md (text-only ✅ vs multimodal ❌, local launchers ✅ vs containerized ❌, BYO-token ✅ vs auto-gen ❌, hardening posture ✅/⏸️ table). SDK ergonomic micro-change: DaemonClient constructor falls back to QWEN_SERVER_TOKEN env var when opts.token is absent — closes the asymmetry where the daemon side already honors this var (--token CLI flag fallback, in main since PR 15) but the SDK forced clients to thread it through every construction. This is the entire ergonomic replacement for PR 29's SDK env/file fallback. Browser-safe via globalThis.process indirection (SDK is imported by @qwen-code/webui); whitespace stripped + empty-string treated as unset (matches daemon-side trim convention); resolved at construction not per-request; explicit opts.token wins. 4 new SDK tests + defensive snapshot on existing "omits Authorization" test. Round 1 fold-in (commit 7f850662f) adopted 2 copilot doc threads: removed stale line refs (runQwenServe.ts:175 actually at 302-318; qwen-serve.md:173) → stable symbol/section anchors; replaced misleading ~/.qwen/server-token example with explicit user-managed alternatives (openssl rand -hex 32 / cat ./my-token-file). Verification: 125/125 DaemonClient tests + tsc + eslint clean.
    • PR 28 npm publish scaffolding + post-publish smoke test. Publish list frozen 2026-05-24: @qwen-code/qwen-code (main CLI + bundled acp-bridge) + @qwen-code/qwen-code-core + @qwen-code/sdk (0.1.7 → 0.2.0 minor bump) + @qwen-code/webui (chiga0 library track). @qwen-code/acp-bridge stays workspace-internal — NOT published to npm for v0.16-alpha. Rationale: zero impact on qwen serve users (bundled into CLI tarball at release time); avoids prematurely locking ~50+ semver-tracked exports (BridgeClient / BridgeOptions / DaemonStatusProvider / PermissionMediator / 11 error classes / 11 bridge types / status 27-symbol contract) that are still being shaped by F2 / F3 / F4; preserves refactor freedom for chiga0 #18 Streamable HTTP and F4 IDE companion changes. Channels (@qwen-code/channel-*) and qwen-code-vscode-ide-companion keep their own independent release cadence.
    • PR 30a local launch references (#4483) MERGED 2026-05-25 03:00Z (merge commit 74c5d4505, +175/−4 final, 3 commits, base daemon_mode_b_main). New docs/users/qwen-serve-deploy-local.md (~175 LOC final) with copy-paste templates: systemd user unit (Linux primary + system-wide alternative + loginctl enable-linger for headless reboots) using EnvironmentFile= instead of inline Environment= to keep token in chmod-600 file; launchd LaunchAgent plist (macOS, with explicit "no ~/$HOME expansion" warning, KeepAlive with SuccessfulExit=false matching systemd's Restart=on-failure, ThrottleInterval=10, logs to ~/Library/Logs/qwen-serve/ not /tmp to avoid symlink attacks + truncate-on-load + periodic-daily cleanup); tmux session (interactive supervision); nohup one-liner wrapped in bash -c 'cd ~/your-project && ...' to prevent silent workspace_mismatch. Both templates use /PATH/TO/qwen placeholder + a "Find your qwen binary first" callout listing common locations (Linuxbrew / nvm / fnm / Volta / Apple Silicon Homebrew) since service managers don't read $PATH. Plus token rotation walkthrough covering all four launchers, smoke-check (/health + /capabilities with correct conditional-auth wording), and an "out of scope: containerized / cross-host / Windows native" callout pointing at WSL2. Source-verified qwen serve CLI flags are --hostname / --port (NOT --bind) at serve.ts:58. All templates inline QWEN_SERVER_TOKEN=... per PR 27's BYO-token convention with explicit "DO NOT COMMIT" comments + scope-this-export-to-current-shell warning to prevent profile-level leak to all subprocesses. Cross-link edits: qwen-serve.md line 32 forward reference becomes live + new "What's next" bullet; _meta.ts gets sibling nav entry under qwen-serve.
Reviewer iteration: 17 review threads resolved across 2 fold-in rounds — round 1 (commit 54d8111fc) folded 14 threads (5 copilot + 9 wenshao via qwen3.7-max): the critical --bind → --hostname fix on all 4 templates, missing loginctl enable-linger, missing nohup cd for workspace, systemd Environment= exposing token in 644 unit file, launchd /tmp logs / KeepAlive=true / etc.; round 2 (commit 265891f23) folded 2 more wenshao threads — hardcoded /usr/local/bin/qwen breaks nvm/Volta users + shell-export profile-leak warning. Pure markdown, zero code, zero tests.
    • PR 31 v0.16-alpha.0 cut
  • Any small bug fixes / docs polish folded into PR 27 / PR 31

Net server-side code work for v0.16-alpha ≈ 0. PR 27 ships a ~10 LOC SDK env fallback (not strictly server-side). PR 28 / PR 30a / PR 31 are pure release infrastructure (workflow YAML, markdown, version bumps). Per the 2026-05-24 final scope pass below, every "service-side contract polish" item is deferred until a real consumer scenario materializes.
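The token-resolution rules listed for PR 27 above (explicit opts.token wins, otherwise QWEN_SERVER_TOKEN, whitespace stripped, empty string treated as unset, read through globalThis so a browser bundle does not crash) can be sketched roughly as follows. This is an illustration under stated assumptions, not the SDK's code: `resolveTokenSketch`, `readEnvSketch`, and `EnvLike` are hypothetical names, and how the real DaemonClient treats an explicitly passed empty string is not stated in this issue.

```typescript
type EnvLike = Record<string, string | undefined>;

// Read process.env through globalThis so the module also loads where
// `process` does not exist (for example in a browser bundle).
function readEnvSketch(): EnvLike {
  const proc = (globalThis as unknown as { process?: { env?: EnvLike } }).process;
  return proc?.env ?? {};
}

// Hypothetical helper; intended to be called once at client construction,
// not per request.
function resolveTokenSketch(
  optsToken?: string,
  env: EnvLike = readEnvSketch(),
): string | undefined {
  if (optsToken !== undefined) return optsToken; // explicit opts.token wins
  const fromEnv = env["QWEN_SERVER_TOKEN"]?.trim();
  return fromEnv ? fromEnv : undefined; // empty or whitespace-only counts as unset
}
```

Passing `env` as a parameter keeps the helper easy to test without mutating the real process environment.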

Explicitly deferred (NOT v0.16-alpha):

  • PR 29 production token defaults (auto-gen daemon token, SDK env/file fallback, instance-path keying, stale cleanup) — also deferred to v0.16.x (2026-05-24). Source-verified: the boot-time security gate is already in main since PR 15 (feat(serve): mutation gating helper and --require-auth #4236): runQwenServe.ts:175 refuses to start on non-loopback bind without a token, and loopback noauth default emits an explicit stdout warning. PR 29's four features are all DX / ergonomics, not capability: auto-gen saves an openssl rand -hex 32, SDK env/file fallback is replaced by the ~10 LOC process.env.QWEN_SERVER_TOKEN fallback folded into PR 27, instance-path keying + stale cleanup only matter for multi-daemon / remote scenarios that the local-only alpha audience doesn't trigger. Picking PR 29 up "for completeness" would burn ~3-5 days of focused work for zero shipped-alpha capability gain. Re-evaluate when remote / multi-daemon / enterprise pilot becomes the target.
  • infra: enforce SDK/server MCP-restart timeout coupling (mode B follow-up to #4319) #4330 SDK/server MCP-restart timeout compat warn (Option C) — also deferred to v0.16.x (2026-05-24). Purely defensive: warns when SDK's default timeout doesn't leave enough headroom over the server's MCP_RESTART_TIMEOUT_MS. v0.16-alpha ships SDK + daemon as a single version pair (SDK default 330k, daemon 300k, 30s headroom already baked in by F1 commit b78de2719) — zero drift to detect. Re-evaluate when a real multi-version SDK ↔ daemon deployment scenario (v0.17 SDK on v0.16 daemon, or vice versa) materializes. Same logic as PR 29: no real consumer scenario in alpha → defer.
  • W133-c PoolEvent.source discriminator — also deferred to v0.16.x (2026-05-24). Contract polish: adds source: 'reconnect_budget' | 'silent_drop' to PoolEvent['failed'] so SDK reducers can distinguish without grepping lastError. Grep confirms @qwen-code/sdk + @qwen-code/webui + internal mcp-client-manager.ts:1571 onFailed all have zero consumers that distinguish today. Pick up when the first real consumer needs the split (likely F4 IDE / channel adapter polling on PoolEvent). Same logic as #4330 Option C: no real consumer → defer.
  • PR 30b containerized deployment refs (Docker / k8s / nginx + TLS / multi-instance token isolation) — defers to v0.16.x patch when the enterprise pilot target is picked. Rationale: with no one running k8s daemons in production yet, doc would rot from no-one-validating; structurally also depends on PR 29's instance-path keying, which is itself deferred above.
  • chiga0 #27 Phase A P0 items #2 / #3 / #4 (prompt absolute deadline, SSE writer idle timeout, multimodal echo bug). Source-verified non-blocking for text-only:
    • bridge.ts:1892 FIXME — already has AbortSignal + cancel + transportClosed race mitigations
    • server.ts:1888 TODO — already has 15s heartbeat + res.on('error') cleanup
    • MessageEmitter.ts multimodal echo — doesn't affect text-only path
  • chiga0 #27 Phase A P0 item #1 (--max-body-size CLI flag) — also deferred. Source-verified: daemon already enforces express.json({ limit: '10mb' }) at server.ts:512, with typed 413 handling at server.ts:1862. Text-only prompts max out well under 10 MiB (model context windows ~200 KB chars), so the default cap is sufficient for v0.16-alpha. The CLI flag's value is "lets users tune it" — only relevant for multimodal (need larger) or enterprise hardening (need smaller). Pick up in v0.16.x or v0.17 alongside whichever target needs it.
  • F4 main scope (qwen --connect CLI / IDE daemon-native / channel adapter demo / Phase-1 scale instrumentation) — deferred to v0.17. Text-only alpha uses the existing ACP-over-HTTP/SSE surface from F1 as-is; no new client adapter required for the alpha audience (dogfooding developers + library embedders via feat(daemon): add shared UI transcript layer #4328 / feat(sdk/daemon-ui): unified completeness follow-up to #4328 #4353).
  • chiga0 #18 ACP Streamable HTTP (/acp dual-transport endpoint) — design-level decision, no v0.16 commitment
  • Keesan12 #21-26 terminal receipt seam — Keesan12 confirmed (2026-05-22 18:47Z) that text-only alpha can ship without it; revisit for enterprise pilot
  • Architecture cleanup §2.1 / §2.2 / §2.3 (FileSystemService multi-hash edit + audit fan-out / acp-bridge sunk cost / TUI-IDE daemon adapter) — confirmed accepted as deferred per #4520535014
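For readers wondering what the deferred #4330 Option C warning would amount to, here is a minimal sketch of a headroom check using the figures quoted above (SDK default 330k ms, daemon 300k ms, 30 s headroom). The constant and function names are hypothetical, and treating 30 s as the required minimum is an assumption of this sketch rather than something the issue states.

```typescript
// Figures quoted in this roadmap; the names are illustrative only.
const SDK_DEFAULT_TIMEOUT_MS = 330_000;
const SERVER_MCP_RESTART_TIMEOUT_MS = 300_000;
// Assumption: the 30 s of headroom baked in by F1 is the minimum worth enforcing.
const MIN_HEADROOM_MS = 30_000;

// Returns a warning message when the SDK timeout does not leave enough
// headroom over the server-side timeout, otherwise undefined.
function timeoutHeadroomWarningSketch(
  sdkTimeoutMs: number,
  serverTimeoutMs: number,
  minHeadroomMs: number = MIN_HEADROOM_MS,
): string | undefined {
  const headroomMs = sdkTimeoutMs - serverTimeoutMs;
  if (headroomMs >= minHeadroomMs) return undefined;
  return (
    `SDK timeout ${sdkTimeoutMs}ms leaves ${headroomMs}ms headroom over ` +
    `server timeout ${serverTimeoutMs}ms (want >= ${minHeadroomMs}ms)`
  );
}
```

With the single version pair shipped in v0.16-alpha the check never fires, which is the stated reason for deferring it.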

2026-05-26 daemon observability follow-ups

Two daemon observability gaps have been split out as explicit follow-up issues instead of staying implicit inside F4 / Stage 2 wording:

Scope position: neither issue blocks the text-only + local-only v0.16-alpha path. Both are candidates for the v0.16.x / v0.17 productionization bucket, especially if F4 picks the Phase-1 scale instrumentation / enterprise pilot track.

Estimated path to alpha: ~4-5 working days of focused work on the 4-PR F5 chain (PR 27 → 28 → 30a → 31). Zero service-side code on the critical path — every contract polish / hardening item (PR 29 / PR 30b / --max-body-size / #4330 Option C / W133-c) is deferred to v0.16.x once a real consumer scenario materializes. Only ~10 LOC SDK env fallback in PR 27 is "real code"; the rest is docs + release infra.

2026-05-23 community input — v0.16-alpha scope decisions pending

Three open community threads accumulated since F2 merged (2026-05-21 15:56Z) that gate the Phase A scope freeze. None block in-flight PRs; all gate when/what v0.16-alpha ships.

  • chiga0 #27 Pre-release scope proposal (2026-05-22) — Phase A/B/C scope freeze framework + architecture cleanup three conclusions. Phase A proposes 4 P0 items: (1) --max-body-size CLI flag, (2) prompt absolute deadline (bridge.ts:1892 FIXME), (3) SSE writer idle timeout (server.ts:1888 TODO), (4) multimodal echo bug (MessageEmitter.ts hardcoded text). Architecture cleanup: §2.1 keep FileSystemService core trio (defer multi-hash edit + audit fan-out), §2.2 acp-bridge sunk cost accepted, §2.3 TUI/IDE daemon adapter not needed. My reply #27 response (2026-05-22) accepted framework + §2.1-§2.3 conclusions, but proposed alpha-target-dependent decomposition of Phase A:
    • Text-only alpha (~3-5 days): only --max-body-size; 2/3/4 are conditional. bridge.ts:1892 has AbortSignal + cancel + transportClosed race mitigations; server.ts:1888 has 15s heartbeat + res.on('error') cleanup; multimodal echo doesn't matter for text-only.
    • Multimodal alpha (~2 weeks): adds streaming upload route + MessageEmitter multimodal echo fix.
    • Enterprise pilot (~3 weeks): all 4 P0 + observability + rate limiter + load test harness.
    • Awaiting @wenshao decision on which alpha target sets Phase A scope.
  • chiga0 #18 ACP Streamable HTTP RFD vs current wire (2026-05-20) — Zed agent-client-protocol RFD #721 proposes Streamable HTTP + WebSocket transport. Question: should qwen serve add /acp dual-transport endpoint alongside existing REST+SSE? Needs explicit yes/no from @wenshao. Not a Phase A blocker; design-decision-level.
  • Keesan12 #21-26 terminal receipt seam series (2026-05-21 through 2026-05-22) — propose first-class can_this_attempt_continue state with {last_admitted_attempt, remaining_retry_budget, last_verifier_delta, inflight_tool_ids, terminal_reason}. Every shutdown path (normal stop, user abort, reconnect loss, watchdog kill, retry exhaustion) writes same terminal receipt shape so adapters / operators don't have to guess "is another attempt safe." Cross-reference: "We just tightened Martin..." — Keesan12's MartinLoop production experience. Status: pinged Keesan12 in my #4520535014 reply for MartinLoop repo reference; awaiting public link. Design-decision-level; likely post-v0.16 unless wenshao prioritizes for enterprise alpha.
  • chiga0 #28 Daemon UI SDK progress update (2026-05-22) — rebase status of feat(sdk/daemon-ui): unified completeness follow-up to #4328 #4353 onto current daemon_mode_b_main. Library-only track per 2026-05-21 direction update; does not gate daemon work.
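To make the Keesan12 terminal receipt proposal above easier to discuss, the sketch below writes the five proposed fields as a TypeScript interface and adds one possible "is another attempt safe" predicate. Only the field names and the five shutdown paths come from the thread; the value types, the `terminal_reason` literals, and the policy in `canThisAttemptContinueSketch` are assumptions made for illustration.

```typescript
// Field names follow the proposal; value types and reason literals are assumed.
type TerminalReasonSketch =
  | "normal_stop"
  | "user_abort"
  | "reconnect_loss"
  | "watchdog_kill"
  | "retry_exhaustion";

interface TerminalReceiptSketch {
  last_admitted_attempt: number;
  remaining_retry_budget: number;
  last_verifier_delta: string | null;
  inflight_tool_ids: string[];
  terminal_reason: TerminalReasonSketch;
}

// Hypothetical policy: deliberate stops and exhausted retries never continue;
// other shutdown paths may continue only with budget left and no tool calls in flight.
function canThisAttemptContinueSketch(receipt: TerminalReceiptSketch): boolean {
  switch (receipt.terminal_reason) {
    case "normal_stop":
    case "user_abort":
    case "retry_exhaustion":
      return false;
    case "reconnect_loss":
    case "watchdog_kill":
      return receipt.remaining_retry_budget > 0 && receipt.inflight_tool_ids.length === 0;
  }
}
```

The point of a single receipt shape is that every shutdown path fills in the same fields, so an adapter can apply one predicate like this instead of guessing per path.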

2026-05-21 direction update — Browser as daemon-hosted client surface dropped

The 2026-05-19 "web-first pivot" (chiga0 #4296) is itself superseded as of 2026-05-21. Direction confirmed:

  • Daemon does not host browser UI. No qwen serve /web endpoint, no cookie BFF mediation, no browser auth bridge. Browser clients can't safely hold daemon bearer tokens; cookie-BFF was only justified if browser were an active target, and it is not.
  • chiga0 #4328 + #4353 continue as library-only: @qwen-code/webui React component library + @qwen-code/sdk/daemon/ui shared reducer/store/normalizer. Downstream embedders (other tools, custom integrations) may host them; daemon itself does not.
  • F4 reverts to daemon-native client experience for non-browser ACP clients: CLI / IDE companion / Node SDK / channel adapters via the existing ACP-over-HTTP/SSE surface (from F1). Concrete sub-scope TBD — candidates: (a) qwen --connect <port> CLI multi-session attach; (b) VS Code IDE companion daemon-native rewrite; (c) channel adapter end-to-end demo on top of PR 25 OutputSink; (d) Phase-1 scale instrumentation (rate limiter / session TTL / observability / load test harness) toward 30-50 active session goal.
  • 2026-05-19 pivot's other parts still hold: native TUI / VS Code / channel adapters continue to default to direct ACP / runtime paths.

Updated 2026-05-19 09:30Z (17:30 CST 2026-05-19):

Wave 1 — protocol foundation: complete.

Wave 2 — session lifecycle and multi-client safety: complete.

Wave 2.5 — reliability and lifecycle closure: complete.

Wave 3 — read-only control plane and diagnostics: complete (PR 12 + PR 13 + PR 14 v1 + PR 14b all merged).

  • PR 12 ✅ merged: feat(serve): add read-only status routes #4241 (feat(serve): add read-only status routes) merged 2026-05-17 13:37Z. It adds GET /workspace/mcp, GET /workspace/skills, GET /workspace/providers, GET /session/:id/context, and GET /session/:id/supported-commands; payloads are redacted, protocol-versioned, and mirrored in the SDK. Two wenshao review rounds were addressed (idle-workspace status snapshots, redaction edge cases, audit-symmetric clientId forwarding for session-scoped helpers, removal of a dangling AbortController). All cross-platform CI green at merge.
  • PR 14 v1 ✅ merged: feat(serve): MCP client guardrails (#4175 Wave 3 PR 14) #4247 (feat(serve): MCP client guardrails) merged 2026-05-18 04:07Z (merge commit 96219924a). Ships in-process MCP client accounting on McpClientManager (getMcpClientAccounting returns total / byTransport / subprocessCount / reservedSlots[] / refusedServerNames[]), atomic slot-reservation enforcement at all 3 spawn sites, --mcp-client-budget=N + --mcp-budget-mode={enforce,warn,off} CLI flags forwarded via env vars to the ACP child, additive clientCount / clientBudget / budgetMode / budgets[] fields on GET /workspace/mcp (new ServeMcpBudgetStatusCell sub-interface with scope: 'workspace' — forward-compat for PR 23's pool-scope cell), per-server disabledReason: 'config' | 'budget' tag, and always-on capability tag mcp_guardrails with modes: ['warn', 'enforce']. Unblocks PR 14b (typed SSE push events on top of v1's snapshot) and PR 17 (control routes need the snapshot to gate "is this restart safe under budget?").
  • PR 14b ✅ merged: feat(serve): MCP guardrail push events + hysteresis (#4175 Wave 3 PR 14b) #4271 (feat(serve): MCP guardrail push events + hysteresis) merged 2026-05-18 17:06Z (commit 3ffe321cf). Ships typed SSE push events mcp_budget_warning (single-fire on upward 75% crossing of reservedSlots.size / clientBudget with hysteresis re-arm at 37.5% via MCP_BUDGET_REARM_FRACTION core constant, mirroring PR 10's slow_client_warning pattern at the manager level rather than per-subscriber backlog level) + mcp_child_refused_batch (coalesced once per discoverAllMcpTools* pass when ≥1 server refused; length-1 batch on readResource lazy-spawn refusal; mode: 'enforce' literal since warn mode never refuses); transport stays per-session (uses ACP child→bridge connection.extNotification with method qwen/notify/session/mcp-budget-event — mirrors the authenticate/update precedent at acpAgent.ts:355, NOT a new workspace-level notification kind); BridgeClient gains extNotification handler that resolves session via byId.get(sessionId) and republishes as a session-scoped SSE frame; always-on capability tag mcp_guardrail_events distinct from PR 14's mcp_guardrails; SDK reducer state (mcpBudgetWarningCount, lastMcpBudgetWarning, mcpChildRefusedBatchCount, lastMcpChildRefusedBatch on DaemonSessionViewState); SDK predicate guards reject mode: 'warn' on refused-batch + unknown transport families; integration cross-check pgrep -P subprocessCount vs daemon's getMcpClientAccounting().total. 
Codex P2 review round 1 (commit 195d9000b) folded in 4 findings: (1) bridge early-event buffer — BridgeClient.earlyEvents Map (64 sessions × 32 frames × 60s TTL ≈ 400 KB worst case) + bufferEarlyEvent/drainEarlyEvents helpers, drained by createSessionEntry so events fired during the child's newSession handler reach SSE subscribers as the FIRST frames of the new session; pre-fix those frames hit resolveEntry → undefined and were dropped; (2) pre-init callback registration — Config.setMcpBudgetEventCallback shim stashes the callback and applies it inside createToolRegistry BEFORE discoverAllTools (legacy blocking) or startMcpDiscoveryInBackground (default progressive) fires, closing the legacy 100% loss + progressive race window pre-fix had; (3) bulk-pass refused-batch coalescing — new bulkPassDepth re-entrant counter + try/finally on both bulk paths + emitRefusedBatchIfAny early-return guard; pre-fix discoverAllMcpToolsIncremental with N refusals produced N length-1 batches breaking the documented "one batch per pass" contract; (4) hysteresis re-arm on slot release — new releaseSlotName helper wraps reservedSlots.delete + evaluateBudgetState; tryReserveSlot calls evaluate after add; all 6 release sites migrated; pre-fix scenario "4/4 fire → drop to 1/4 → up to 4/4" never fired a second warning because slot-release paths bypassed evaluate. Codex P2 review round 2 (commit 730e66d6a) folded in 2 SDK-side findings: (1) seed SDK replay for startup guardrail events — DaemonSessionClient.createOrAttach now seeds Last-Event-ID: 0 on session.attached === false (newly-created), unifying with the existing modelServiceId carve-out. 
Pre-fix the buffered mcp_budget_warning/mcp_child_refused_batch events drained into the per-session ring before spawnOrAttach returned but the SDK's default live-only subscription missed them, defeating PR 14b's whole push channel; (2) re-export new guardrail event types — DaemonMcpBudgetWarningData, DaemonMcpBudgetWarningEvent, DaemonMcpRefusedServer, DaemonMcpChildRefusedBatchData, DaemonMcpChildRefusedBatchEvent, DaemonMcpGuardrailEvent added to both daemon/index.ts and src/index.ts barrel exports so consumers can import + narrow them via @qwen-code/sdk instead of deep-importing internal paths. Codex P2 review round 3 (commit 84a38e6f9) folded in 6 actionable items across DeepSeek / mimo / copilot agents (2 declined as documented in thread replies): (1) debug logging for budget events — evaluateBudgetState now emits debugLogger.info on warning fire AND on re-arm; acpAgent's silent .catch(() => {}) upgraded to .catch((err) => debugLogger.debug(...)); (2) rename ViewState fields → mcpChildRefusedBatchCount / lastMcpChildRefusedBatch to match slow_client_warning → slowClientWarningCount convention; (3) satisfies DaemonEvent on test fixtures for type-safety hygiene; (4) emitRefusedBatchIfAny docstring fixed to accurately describe per-pass reset semantics (only pendingRefusalNames cleared on emit; lastRefusedServerNames/lastRefusedTransports survive between passes per snapshot contract); (5) integration test renamed clientCount matches external pgrep observation (was misleadingly "subprocessCount"); (6) emitBudgetEvent(event) try/catch helper isolates synchronously-throwing callbacks from MCP discovery / readResource / disconnectServer paths. Declined: (a) bridge structural payload validation — duplicates SDK predicate logic at a trust boundary; (b) widening thresholdRatio: 0.75 literal — premature for hypothetical second threshold (better design when needed: discriminator union, not type widening). 
All 14 review threads (3 outdated copilot + 11 wenshao with DeepSeek/mimo/gpt-5.5/copilot agents) replied + resolved 2026-05-18 19:35Z. Backward compatible — every new field optional, EVENT_SCHEMA_VERSION stays at 1, old daemons advertise only mcp_guardrails. 6 commits on top of base: initial 13 files / +1828 -16; round 1 fixup 7 files / +795 -377; merge #1 resolves 4 conflicts (PR 16 + PR 19); merge #2 picks up PR 19 Windows path follow-up; round 2 fixup 4 files / +88 -8; round 3 fixup 7 files / +150 -46. 1309/1309 tests pass across 43 files; typecheck clean across 4 workspaces; lint clean on touched files. 25 new tests total across the three fixup rounds.
  • PR 13 ✅ merged: feat(serve): preflight and env diagnostics routes (#4175 Wave 3 PR 13) #4251 (feat(serve): preflight and env diagnostics routes) merged 2026-05-17 23:29Z (merge commit f44ed0941). Adds GET /workspace/env (daemon-process snapshot — runtime / platform / sandbox / proxy / presence-only secret env vars; never spawns ACP) and GET /workspace/preflight (daemon-level cells: node_version / cli_entry / workspace_dir / ripgrep / git / npm via Promise.allSettled; ACP-level cells: auth / mcp_discovery / skills / providers / tool_registry / egress, returning not_started placeholders when daemon idle). First PR to land the closed errorKind taxonomy from #4175: missing_binary | blocked_egress | auth_env_error | init_timeout | protocol_error | missing_file | parse_error — used as a typed union on ServeStatusCell.errorKind. New typed BridgeTimeoutError, mapDomainErrorToErrorKind helper, exported ACP_PREFLIGHT_KINDS + AcpPreflightKind to drive both idle placeholders and live builder from the same source (TS exhaustiveness ensures no drift). Two new capability tags (workspace_env, workspace_preflight); two new SDK helpers (DaemonClient.workspaceEnv / workspacePreflight). Cross-cutting fix: getGitVersion/getNpmVersion migrated from blocking execSync to async execFile + 5s timeout, unblocking the daemon event loop on hung git/npm. Strict invariants enforced via tests: kind: 'env_var' cells never carry value, proxy URLs go through redactProxyCredentials + URL.host (with 3-stage parse fallback for non-URL shapes — never leaks <redacted>@host form), readProxyVar uses ?? so HTTPS_PROXY="" Docker/K8s explicit-disable convention works, idle preflight asserts handles.length === 0. Drift detection: authPreflight.test.ts walks AuthType enum → CI red on missing AUTH_PREFLIGHT_ENV_KEYS entries.
4 review rounds addressed (4 Copilot inline + 4 wenshao P3 + 4 wenshao Critical/Suggestion + 4 fold-in 4 items including Windows CI fix and integration-test parity); 11/12 inline threads resolved, 1 thread closed via wenshao's APPROVED waiver on errorKind closed-union narrowing (pre-1.0 semver, deliberate type-contract choice). Final: 639/639 tests, typecheck cli + sdk clean, eslint clean, all 3 OS CI passes. 6 atomic commits, +~2300/-~80 across ~20 files.
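The single-fire warning with hysteresis described for PR 14b (fire once on an upward crossing of 75% of the budget, re-arm at 37.5%) can be sketched as a small state machine. This is an illustration only: `createBudgetWarningSketch` is a hypothetical name, the constants mirror the fractions quoted above rather than the real `MCP_BUDGET_REARM_FRACTION` export, and whether the real comparisons are inclusive is an assumption.

```typescript
const WARN_FRACTION_SKETCH = 0.75;
const REARM_FRACTION_SKETCH = 0.375; // stands in for MCP_BUDGET_REARM_FRACTION

// Returns an evaluate() function to call after every slot reserve AND release.
// evaluate() returns true exactly when a warning should be emitted.
function createBudgetWarningSketch(clientBudget: number): (reservedSlots: number) => boolean {
  let armed = true;
  return (reservedSlots: number): boolean => {
    const ratio = reservedSlots / clientBudget;
    if (armed && ratio >= WARN_FRACTION_SKETCH) {
      armed = false; // single-fire: stay quiet until usage drops far enough
      return true;
    }
    if (!armed && ratio <= REARM_FRACTION_SKETCH) {
      armed = true; // re-arm so a later upward crossing fires again
    }
    return false;
  };
}
```

Calling the evaluator on release as well as on reserve is what the round 1 fix (4) above is about: in the "4/4 fire, drop to 1/4, back up to 4/4" scenario, the second warning only fires if the drop to 1/4 is actually observed.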

Wave 4 — auth-gated mutation/control routes: complete (PR 15 + PR 16 + PR 17 + PR 18 + PR 19 + PR 20 + PR 21 all merged).

  • PR 15 ✅ merged: feat(serve): mutation gating helper and --require-auth #4236 (feat(serve): mutation gating helper and --require-auth) merged 2026-05-17 12:10Z. It adds the centralized mutation gate, --require-auth, conditional require_auth capability advertisement, and non-strict adoption markers on existing mutation routes. No new mutation route is strict yet; Wave 4 PRs 16-21 will opt into strict: true where needed.
  • PR 16 ✅ merged: feat(serve): workspace memory and agents CRUD (#4175 Wave 4 PR 16) #4249 (feat(serve): workspace memory and agents CRUD) merged 2026-05-18 06:27Z (merge commit 103090669). Adds 7 strict-gated workspace CRUD routes (GET/POST /workspace/memory, GET/POST /workspace/agents, GET/POST/DELETE /workspace/agents/:agentType), capability tags workspace_memory + workspace_agents, workspace event fan-out (memory_changed, agent_changed via bridge.publishWorkspaceEvent), bridge.knownClientIds() for workspace-level client attribution, and 7 SDK helpers on DaemonClient. Review fold-ins landed concurrent append serialization, force-refresh agent listing, builtin-name shadow rejection, scalar validation / SDK parity / CodeQL regex hardening, and event/fan-out/helper test coverage. Static QWEN.md memory is in; auto-memory remains deferred to PR 16.5.
  • PR 18 ✅ merged: refactor(serve): add FileSystemService boundary (#4175 Wave 4 PR 18) #4250 (refactor(serve): add FileSystemService boundary) merged 2026-05-18 05:14Z (merge commit 495d11f01). Pure refactor introducing per-request workspace filesystem boundary inside the qwen serve daemon; centralizes path canonicalization, symlink-aware boundary checks (40-hop limit + 7 reject classes), .gitignore/.qwenignore policy, size/binary limits (MAX_READ_BYTES = 256 KiB, MAX_WRITE_BYTES = 5 MiB, BINARY_PROBE_BYTES = 4096), and typed fs.access / fs.denied audit hooks behind a single WorkspaceFileSystem surface (resolve / stat / readText / readBytes / list / glob / writeText / edit; ResolvedPath brand). 70 review threads (Copilot 8 + wenshao 8 + DeepSeek 9 + 6 more rounds = ~52 external items + triple-agent self-review) ALL replied + resolved across 10 fix-up commits. Notable hardening folded back in-PR: TOCTOU symlink-substitution guard (assertInodeStableAfterRead + assertNotSymlinkBeforeWrite), multi-hop dangling-symlink write escape, glob 3-way kind taxonomy + escape audit, OOM hard caps + post-read byte-length defense-in-depth, safe UTF-8 truncation, 'io_error' HTTP 503 kind for ENOSPC/EIO/EMFILE class, strict-default fsFactory. 446/446 serve + 137 new fs tests pass at merge. Unblocks PR 19 (file read GET routes) and PR 20 (file write/edit POST, also still needs PR 8 ✅ + PR 15 ✅).

Round-7+ rounds added the following hardening atop earlier waves: edit() encoding round-trip via lowFs.readTextFile + _meta forwarding (BOM/iconv preserved), 8.3 short-name regex multi-digit + win32-only gate, CANONICAL_BOUND_CACHE Map for steady-state-zero workspace canonicalization, readBytes window-read truncation matching API name, glob node_modules/.git walk-time pruning, dangling-symlink chain reuses verified canonical (no re-walk TOCTOU), Object.freeze(ignore) blocks cross-request mutation, internal_error kind (HTTP 500) for non-errno errors so security oncall isn't paged on TypeError, audit sessionId forwarding for multi-session correlation, recordAndWrap audit message field gated on QWEN_AUDIT_RAW_PATHS (privacy regression caught + closed across hint as well), safeUtf8Truncate simplified to its essential 4-line form, enforceReadBytesSize signature tightened (removed dead maxBytes param), readText / readBytes post-read byte-length defense vs concurrent file growth, glob cwd === boundWorkspace realpath short-circuit, createDefaultFsAuditEmit periodic warn with payload context (every 100th drop instead of one-shot), relForAudit Windows cross-drive <cross-drive> sentinel.
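
The "safe UTF-8 truncation" hardening mentioned for PR 18 can be illustrated with a minimal sketch. This is not the daemon's actual safeUtf8Truncate; the function name, signature, and the assumption that the input is well-formed UTF-8 are all hypothetical. The idea is only that a byte budget must never cut a multi-byte character in half.

```typescript
// Hypothetical sketch, not the daemon's actual safeUtf8Truncate.
// Assumes `bytes` holds well-formed UTF-8.
function safeUtf8TruncateSketch(bytes: Uint8Array, maxBytes: number): Uint8Array {
  if (bytes.length <= maxBytes) return bytes;
  let cut = Math.max(0, maxBytes);
  // UTF-8 continuation bytes match 10xxxxxx. If the first dropped byte is a
  // continuation byte, the cut lands inside a sequence, so back up to its lead byte.
  while (cut > 0 && ((bytes[cut] ?? 0) & 0xc0) === 0x80) {
    cut--;
  }
  return bytes.subarray(0, cut);
}

// "aé" encodes as 0x61 0xC3 0xA9; a 2-byte budget must not keep half of "é".
const truncatedExample = safeUtf8TruncateSketch(new Uint8Array([0x61, 0xc3, 0xa9]), 2);
console.log(truncatedExample.length); // 1
```

The result can be shorter than the budget, which is the price of never emitting a partial character.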

  • PR 21 ✅ merged: feat(serve): auth device-flow route (#4175 Wave 4 PR 21) #4255 (feat(serve): auth device-flow route) merged 2026-05-18 14:05Z (commit 36760ca63). Brokers OAuth 2.0 Device Authorization Grant (RFC 8628) through the daemon: 4 routes under /workspace/auth/ (POST /device-flow strict + idempotent take-over, GET /device-flow/:id, DELETE /device-flow/:id strict + idempotent, GET /status); 1 capability tag auth_device_flow; 5 typed events (auth_device_flow_started/throttled/authorized/failed/cancelled) workspace-fanned via inline bridge.broadcastWorkspaceEvent (deliberately distinct from PR 16's publishWorkspaceEvent to avoid merge conflict — fold-in candidate after feat(serve): workspace memory and agents CRUD (#4175 Wave 4 PR 16) #4249). New DeviceFlowRegistry with per-providerId singleton + idempotent take-over, in-flight Promise coalescing on concurrent start, 5-min terminal grace + 30s sweeper, dispose() wired into runQwenServe.close() shutdown drain. BrandedSecret<T> for device_code / PKCE verifier with frozen plain object + WeakMap + 4-way redaction (toString/toJSON/Symbol.toPrimitive/numeric coercion → [redacted]/NaN) + unique symbol brand — earlier new String() shape leaked through +/template literals because Symbol.toPrimitive/valueOf returned the primitive. provider.persist() writes disk FIRST then in-process setCredentials (no zombie state on EACCES/EROFS). provider.poll(state, {signal}) cancellable via entry's cancelController; lost-success branch records audit. New errorKind persist_failed distinct from upstream_error. SDK 4 new helpers + lazy client.auth getter exposing DaemonAuthFlow.start(...).awaitCompletion() mirroring gh auth login UX (print code first, let consumer decide where to open browser; daemon NEVER spawns browser — static-source grep test fails build on node:child_process/open/xdg-open/shell.openExternal/execa/shelljs/process.spawn and dynamic-import variants). 
Side fixes in qwenOAuth2.ts: exports cacheQwenCredentials + folds SharedTokenManager.clearCache() into it + sets 0o600 mode on oauth_creds.json. Three pre-PR specialist agent passes (code-reviewer + silent-failure-hunter + type-design-analyzer) flagged 12 P0/P1 items, ALL folded in-PR before opening (BrandedSecret rewrite, dispose wiring, concurrent-start race, persist order, poll signal, transitionTerminal returns boolean, anchored RFC 8628 regex, dead-code removal, SDK reducer alignment with daemon, broadened static-source grep, sweeper integration test, 502 upstream_error mapping). Coordination items recorded in proposal(serve): Mode B feature-priority roadmap toward v0.16 production-ready #4175 (comment-4471728450): (a) inline broadcastWorkspaceEvent fold-in to PR 16's publishWorkspaceEvent once feat(serve): workspace memory and agents CRUD (#4175 Wave 4 PR 16) #4249 lands; (b) /workspace/auth/status vs PR 12 /workspace/providers boundary — needs @wenshao decision (kept separate in v1; merge alternative discussed). Fold-in 1 (post-merge) parks: DeviceFlowEntry discriminated union over status, single-source SDK status / ProviderId unions, awaitCompletion memoize, broadcast-100%-fail stderr elevation, SDK 404 → 'not_found_or_evicted' errorKind. 369 serve + SDK tests pass (28 new across 3 layers); typecheck + eslint --max-warnings 0 clean. +3527/-8 across 17 files (post-prettier).
  • PR 17 ✅ merged: #4282 (feat(serve): approval / tools / init / MCP-restart mutation routes) merged 2026-05-18 16:27Z (commit 6f7a48936). Implements POST /session/:id/approval-mode, POST /workspace/tools/:name/enable, POST /workspace/init, POST /workspace/mcp/:server/restart. MCP restart consults PR 14 v1's getMcpClientAccounting snapshot to gate "is this restart safe under budget?"; emitted events stamp originatorClientId.
  • PR 20 ✅ merged: #4280 (feat(serve): add workspace file write/edit routes) merged 2026-05-18 14:37Z (commit 688d64416). Strict-auth file write/edit routes on top of WorkspaceFileSystem; reuses PR 19's read-route envelope; atomic temp-file + rename writes; expectedHash / baseRevision optimistic concurrency for edit() TOCTOU per post-PR-18 follow-ups.
  • PR 19 ✅ merged: #4269 (feat(serve): safe workspace file read routes) merged 2026-05-18 08:17Z (merge commit 52d2850c). Adds four read-only routes routed through PR 18's WorkspaceFileSystem: GET /file?path= (text + encoding/BOM/lineEnding metadata, ?maxBytes, ?line+?limit, stable originalLineCount, source sizeBytes plus response returnedBytes), GET /list?path= ({name, kind, ignored} entries, boundary-side MAX_LIST_ENTRIES + 1 probe, truncated:true, ?includeIgnored=1), GET /glob?pattern= (?cwd, ?maxResults, workspace-relative matches), and GET /stat?path=. All four set no-store / nosniff headers and share the sendFsError envelope. New always-on workspace_file_read capability tag. runQwenServe now constructs + injects fsFactory; glob audit hashes the bound workspace and emits literal pattern. All review threads were replied + resolved across follow-up commits 31c57c646 and 370f2d482. Unblocks PR 20.
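
The "MAX_LIST_ENTRIES + 1 probe" noted for GET /list in PR 19 is a common way to report truncated:true without reading a whole directory. A minimal sketch of that pattern follows, assuming a pull-style reader; listWithProbe, readerOver, and the result shape are hypothetical names, not the daemon's code.

```typescript
interface ListResultSketch<T> {
  entries: T[];
  truncated: boolean;
}

// `readNext` is a hypothetical pull-style reader that returns undefined when exhausted.
function listWithProbe<T>(readNext: () => T | undefined, maxEntries: number): ListResultSketch<T> {
  const entries: T[] = [];
  for (;;) {
    const item = readNext();
    if (item === undefined) return { entries, truncated: false };
    if (entries.length === maxEntries) {
      // The probe read found entry number maxEntries + 1, so the listing is incomplete.
      return { entries, truncated: true };
    }
    entries.push(item);
  }
}

// Helper for the example: turns an array into a pull-style reader.
function readerOver<T>(items: T[]): () => T | undefined {
  let index = 0;
  return () => (index < items.length ? items[index++] : undefined);
}

const listing = listWithProbe(readerOver(["a.ts", "b.ts", "c.ts"]), 2);
console.log(listing.truncated); // true
```

Reading one entry past the cap distinguishes "exactly at the limit" from "more exists" at the cost of a single extra read.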

Wave 5+ (architecture extraction / release): server-side core complete; client direction revised twice. Wave 5's "must-do" half — daemon server-side architecture hardening (PR 22a + 22b/1 + 22b/2 + F1 + F2 + F3) — all merged into daemon_mode_b_main as of 2026-05-20. Wave 5's adapter-migration half (PR 26) was superseded 2026-05-19 by the web-first pivot recorded in chiga0 #4296, and the web-first pivot itself was then revisited 2026-05-21 — browser dropped as daemon-hosted client surface (see "2026-05-21 direction update" subsection above). Net effect: native TUI / VS Code / channel adapters keep their direct ACP / runtime paths (2026-05-19 part still holds); F4 reverts to daemon-native client experience for non-browser ACP clients (CLI / IDE / SDK / channel adapters). PR 25 (output-sink lift) is independent CLI cleanup, unrelated to current F4 direction. Wave 6 (release hardening / v0.16) is independent of both pivots and still required.

Adapter spikes all merged: #4199 IDE (merge commit 4ab20ff6b), #4202 TUI (merge commit d07c958bb), #4203 channel/web (merge commit 11ba3856d). The session lifecycle + read-only status surface they relied on (PR 12) was the blocker.

Web-first pivot (2026-05-19, recorded in chiga0 PR #4296): the original "TUI + channels + web + IDE all default-migrate to DaemonSessionClient" plan is superseded. The 3 spikes are now treated as future-reference baselines, not active migration checklists:

  • Native local TUI keeps its direct runtime / Ink / streamJson path. No daemon HTTP/SSE migration — avoids a localhost hop and preserves local UX simplicity.
  • VS Code IDE keeps the existing --acp child path as default. Daemon-backed IDE is future / behind-flag only.
  • Channel adapters (Telegram / Weixin / Dingtalk / plugin) keep ACP subprocess by default. Daemon channel integration is future / behind-flag only.
  • Web chat / web terminal is the only daemon-native client target going forward. F4 now centers on this surface. Superseded 2026-05-21 — browser dropped as daemon-hosted client surface (see "2026-05-21 direction update" subsection above). F4 reverts to daemon-native client experience for non-browser ACP clients (CLI / IDE / SDK / channel adapters).

Active library work by chiga0 (status as of 2026-05-24): #4328 feat(daemon): add shared UI transcript layer ✅ MERGED 2026-05-22 (SDK-side daemon/ui reducer + framework-free store + @qwen-code/webui React bindings); #4353 unified completeness follow-up (PR-A through PR-E) ✅ MERGED 2026-05-24 00:51Z; #4380 feat/daemon-react-cli daemon-backed React web-shell still open (CHANGES_REQUESTED). All continue as standalone React/SDK library packages — daemon does not host them; downstream embedders that want web UI bring their own hosting.

Current near-term queue

Now organized around the F1-F5 feature plan (see "Branching strategy" + "Remaining work — feature-cohesive plan"). All future work targets daemon_mode_b_main, not main directly.

  1. Pre-F1 housekeeping ✅ complete: fix(serve): post-merge fixes for #4291 review (7 threads) #4305 merged 2026-05-19 05:40Z, fix(serve): post-merge P2 corrections from Codex review on #4282 #4297 merged 2026-05-19 06:57Z. daemon_mode_b_main is now the clean F1 baseline.

  2. F1 — acp-bridge package self-sufficiency: PR #4319 MERGED 2026-05-19 16:26Z (merge commit 981bc7c7e). 15 commits / +5343 −4690 across 14 files. All 18 inline review threads resolved across 5 wenshao review rounds + 1 initial Copilot pass; 12/13 adopted in-PR (initial 12 + refactor: extract normalizeDisabledToolList shared helper (mode B follow-up to #4319) #4329 helper extraction folded after second look + closed) + 1 declined (infra: enforce SDK/server MCP-restart timeout coupling (mode B follow-up to #4319) #4330 SDK/server MCP-restart timeout coupling — needs @qwen-code/acp-bridge npm publish decision OR Option B/C); 1 deferred (bug(acp-bridge): closeSession + killSession use module-scoped channelInfo instead of channelInfoForEntry(entry) #4325 pre-existing channelInfo bug — channel-overlap regression test needs dedicated scope). Bridge core lifted (BridgeClient + defaultSpawnChannelFactory + createHttpAcpBridge factory + BridgeFileSystem injection seam); httpAcpBridge.ts shrunk 4682 → 97 LOC shim. F2-F4 now unblocked.

  3. F2 — shared MCP transport pool: PR #4336 MERGED 2026-05-21 15:56Z (merge commit 46f8d48f1, 36 commits / +10308 −147 across 38 files). All 6 feature commits landed: (1) split McpClient.discover into pure tool/prompt list, (2) McpTransportPool + SessionMcpView core, (3) cross-platform descendant pid sweep + commit-2 review fixes, (4) wire pool into QwenAgent daemon mode, (5) pool-aware status + restart routes (entryCount / entrySummary / ?entryIndex=), (6) graduate MCP budget guardrails to workspace scope (new WorkspaceMcpBudget; scope: 'workspace' cell; isWorkspaceScopedBudgetEvent SDK helper). 24 cumulative review-iteration rounds across multiple reviewers (wenshao, qwen-latest series, gpt-5.5, claude-opus-4-7) — 168 inline threads, all resolved. Critical bugs caught + fixed mid-PR: sibling-fingerprint statusChangeListener corruption (W2), descendant pid sweep missing on restart (W3), drain-timer-during-restart race (W4), discovery non-re-entrancy leak (W6), slot-release-during-spawn race (6R1), W90/W111 reverse-index ref leak on pooled in-flight path, W115/W118 stopTimedOut sticky-flag manager-reuse bug, W120/W122 silent-transport-drop zombie-attach (multi-round terminal-cleanup parity with forceShutdown / doRestart catch — R20→R21→R22 progression), W125 fast-path defense-in-depth + budget-release identity check (R22), R23 T1 discover() filter regression (legacy non-pool callers silently lost trust + include/exclude filtering), R23 T15 sweep / updateGlobalStatus ordering inversion, R24 T17 phantom budget-release on 'already_held' + R24 T19 readResource self-heal on dead pooled handle. Design doc v2.2 (docs/design/f2-mcp-transport-pool.md) records every fold-in with site / what was wrong / fold-in commit ref. 287 F2/SDK/acp-bridge/cli tests pass at merge.
Filed as F2 follow-ups (post-merge cleanup PR): R9 McpClientManager ctor 7-positional sentinels, R10 pgrep-per-PID-per-level perf (R23 T7 dup), W7 (3/4) test coverage gaps + R24 T18 doRestart failure-path test scaffolding, W8 maxReconnectAttempts health-monitor wire-up (currently dead config), W11 acquire path duplication, W12 passesSessionFilter O(M×N), W93 pool self-heal post-spawn-failure, W133-a/c McpClient.onerror upstream-error threading + SDK 'failed' event source discriminator, W134 wrapper-grandchild SIGTERM error-surfacing on silent-drop sweep, R23 T3/T7 (defensive-doc const + pid BFS perf — declined in-PR, deferred for batched perf PR), R24 T17 multi-fingerprint OAuth scenarios (single-user deployment unaffected — relevant only when multi-tenant lands).

  4. F3 — multi-client permission coordination: PR #4335 MERGED 2026-05-20 11:13Z (merge commit 8eeb51009, +5263/−417 across 17 files). 4-strategy MultiClientPermissionMediator impl (first-responder / designated / consensus / local-only) behind PR 22a's frozen PermissionMediator interface; in-memory PermissionAuditRing (FIFO 512); 4 new SSE events (permission_partial_vote, permission_forbidden) + 3 new error classes (PermissionForbiddenError 403, PermissionPolicyNotImplementedError 501, CancelSentinelCollisionError 500); permission_mediation capability tag + active policy.permission field on /capabilities; policy.permissionStrategy + policy.consensusQuorum settings (boot-time validated); cancel-sentinel injection guard; detectFromLoopback covers RFC 1122 127.0.0.0/8; SDK reducer state for permissionVoteProgress / forbiddenVotes (FIFO 32) + mergeOriginator propagation; docs. 10 fixup rounds (aac5e638c → 9ac7b022c) addressing 47 inline review threads across Copilot + wenshao + DeepSeek-v4-pro reviews, all resolved. Two final wenshao APPROVALs (4327485978 + 4327608823) with 99/99 acp-bridge + 781/781 cli + 439/439 sdk + boot-smoke verification. F4 now unblocked.

  5. F4 prereq — daemon protocol completion: PR #4360 MERGED 2026-05-21 03:11Z (merge commit a60c1c52a, 6 commits across 11 files). Three protocol-level daemon completions F4 needs before client adapter wiring can render correctly: (a) chiga0 issue #19 three P0 stampings: _meta.serverTimestamp on every SSE frame (multi-client clock-drift fix); tool_call provenance + serverId (UI dispatch on builtin / mcp / subagent); errorKind on stream_error via existing mapDomainErrorToErrorKind (typed UI retry / remediation rendering). (b) Ilya0527 issue #15 state divergence fix — daemon detects ring eviction on SSE resume + emits new state_resync_required synthetic terminal; SDK reducer adds awaitingResync flag + auto-skips non-terminal deltas until consumer recovers via loadSession + createDaemonSessionViewState. (c) Codex round 2 fold-in — FsError preservation over ACP wire in BridgeClient (writeTextFile/readTextFile wrap fileSystem call; FsError shape duck-typed; rethrown as ACP RequestError with errorKind/hint/status in data). Review iteration: 12/12 inline threads resolved across 6 reviewer rounds (Codex × 2 + wenshao × 3 + Copilot × 1) — 12 adopted in-PR + 2 declined (out-of-scope F3/chiga0 deferrals) + 5 placeholder/test deleted by reviewer. Verification: 451/451 SDK + 113/113 acp-bridge + ~30 new tests across 5 test files. F4 (PR 25 + 26) can now branch off daemon_mode_b_main — the SDK's reducer + adapter rendering have the protocol fields they need from day one.

  6. Immediate F1 follow-ups (small, can start any time):

    • feat(serve): F1 follow-up — BridgeFileSystem wiring + #4325 channelInfo fix #4334 MERGED 2026-05-20 05:10Z (merge commit dfa8ca407). Bundled three F1 follow-ups: (a) fs adapter — serve-side BridgeFileSystem adapter wrapping PR 18's WorkspaceFileSystem + runQwenServe.ts + server.ts default-bridge wiring (closes ws.ts:613 TOCTOU thread); (b) bug(acp-bridge): closeSession + killSession use module-scoped channelInfo instead of channelInfoForEntry(entry) #4325 channelInfo fix: closeSession + killSession route per-entry via channelInfoForEntry(entry) + HAZARD(#4325) comments at both fix sites (smoke regression test is single-channel only); (c) writeTextOverwrite primitive added to PR 18's WorkspaceFileSystem — atomic create-or-overwrite with mode preservation + 0o600 default (closes Copilot review on (a)). Behavior change: ACP writeTextFile now rejects symlinked targets (matches PR 20 HTTP POST /file posture). Review iteration: 19/19 inline threads resolved across 3 reviewer rounds (Copilot + 2× wenshao + DeepSeek-v4-pro) — 12 adopted in-PR + 2 declined (read-size cap regression intentional, validator-rejected-already-tested duplicate) + 5 placeholder/test deleted by reviewer. 767/767 serve + 62/62 acp-bridge tests; +700 LOC / −80 LOC across 7 commits.
    • F1 test split: #4445 MERGED 2026-05-23 08:46Z (merge commit 57d04786d, +587/−448 across 5 files). 5 commits: (1) pure git mv httpAcpBridge.test.ts → packages/acp-bridge/src/bridge.test.ts (100% rename, blame preserved across 6861 LOC); (2) extract internal/testUtils.ts (FakeAgent / makeChannel / makeBridge / WS_A / WS_B / SESS_A) + split 4 daemon-host integration tests to new packages/cli/src/serve/daemonStatusProvider.test.ts; (3) self-review round 1 fold-in (vitest alias + 4 doc/comment polish); (4) self-review round 2 fold-in (stale vitest.config.ts comment); (5) review round 3 — 4 of 7 reviewer threads adopted (T1 add bridge.shutdown() for symmetry, T5 rename cli helper to makeBridgeWithDaemonStatusProvider, T6 fix @internal JSDoc claim, T7 switched files array to negation patterns excluding dist/internal/testUtils.* + dist/**/*.test.* from npm publish; T2/T3/T4 declined as scope-creep / pre-split-verbatim / pitfall-not-triggered). Test parity: 181 pre-split → 177 (acp-bridge bridge.test.ts) + 4 (cli daemonStatusProvider.test.ts) post-split = 181. Cross-package resolution dual-channel: TS nodenext resolves via ./internal/testUtils subpath export in acp-bridge package.json; vitest reads .ts source via packages/cli/vitest.config.ts:resolve.alias so cli tests don't depend on stale dist/. Reviewer iteration: Copilot 4 + wenshao 3 inline threads, ALL 7 replied + resolved across one fold-in commit; APPROVED by wenshao 2026-05-22 18:43Z.
    • infra: enforce SDK/server MCP-restart timeout coupling (mode B follow-up to #4319) #4330 SDK/server timeout coupling — deferred to v0.16.x (2026-05-24). Option C (server advertise + SDK warn) is the chosen approach when it ships, but v0.16-alpha single-version-pair audience has 0 real drift to detect. Re-evaluate when v0.17 brings multi-version deployment.
  7. F2 post-merge cleanup PRs (deferred from feat(serve): shared MCP transport pool [F2] #4336 review, all small/low-risk, can run in parallel with F4 main scope):

    The F2 PR's 24-round review cycle filed several follow-ups that were intentionally NOT folded into the main PR (kept feat(serve): shared MCP transport pool [F2] #4336 scope to "shared MCP transport pool foundation"). Bucketed below by risk profile + owner-actionability so reviewers can pick up independently:

    • PR A — F2 perf cleanup: #4411 MERGED 2026-05-23 00:04Z (merge commit c6deb58f1, +823/−594 across 8 files, APPROVED by wenshao)

      • R9: McpClientManager constructor 7-positional sentinels → (config, toolRegistry, options?: McpClientManagerOptions); mkManager(...) test factory at top of mcp-client-manager.test.ts collapses 80 inline constructions to 1-line factory calls naming only what each test overrides. Net −104 LOC in test file.
      • W11: mcp-transport-pool.ts:acquire() extracts attachPooledSession + rollbackReservationOnSpawnFailure private helpers. Race-window invariants (W10 / W77 / W90 / W111 / W125 / R24 T17) stay at call sites because they describe surrounding ordering, not the helpers themselves.
      • W12: session-mcp-view.ts applyTools / applyPrompts precompute filter Sets once per pass instead of scanning includeTools/excludeTools arrays inside every per-tool iteration. passesSessionFilter / passesSessionPromptFilter (the array-based predicates) stay exported and unchanged for unit tests.
      • R10 / R23 T7: pid-descendants.ts switches from per-pid pgrep -P <pid> BFS (one fork per node) to single ps -A -o pid=,ppid= snapshot + in-memory tree walk. Windows analog: single Get-CimInstance Win32_Process | ConvertTo-Csv snapshot. Per-pid path retained as fallback for BusyBox ps <v1.28 (no -o support) and distroless containers without ps.
    • PR B — F2 self-heal observability: #4460 MERGED 2026-05-23 15:23Z (merge commit 0c0430939, +405/−5 across 3 files, 3 commits). Scope reduced from 3 items to 2 after source verification:

      • W93 (Declined) — source-verified non-repro post the W1 fix in F2 commit 6 (mcp-transport-pool.ts:1031-1035). spawnEntry's catch already calls entry.forceShutdown('manual') which runs the full cleanup table (status listener removal, timer clear, subscriber detach, sweep+disconnect, onClosed eviction). The "partial cleanup" claim was filed pre-W1.
      • W133-a (Adopted): McpClient gains a lastTransportError?: Error private field populated in onerror BEFORE the synchronous updateStatus(DISCONNECTED) cascade, exposed via new getLastTransportError() getter. The W120 silent-drop block reads it inline and appends : <error.message> to the lastError string on the emitted 'failed' event. Preserves the literal "silent transport drop" substring for log-grep backward compat. Operators now see the upstream EPIPE / OAuth 401 / server-crash cause directly on the wire instead of grepping --debug logs out of band.
      • W134 (Adopted, lightweight): sweepAndDisconnect returns SweepResult ({ pidSweepError?, descendantsFound?, descendantsSignaled? }) instead of Promise<void>. The silent-drop fire-and-forget caller chains to inspect the result and emits a structured debugLogger.warn line when either pid-sweep threw OR sigtermPids killed fewer descendants than discovered (partial-signal). forceShutdown / doRestart callers ignore the return (JS implicit-void at await preserves behavior). No new SSE event / SDK reducer state — full metrics surface deferred to W134-followup if maintainers want them.
      • Tests: 4 new tests in mcp-transport-pool.test.ts (W133-a happy path + fallback + W134 pidSweepError + W134 partial-signal) using new module-mocks for pid-descendants.js and a singleton-stub debugLogger.js. Test suite: 32/32 (28 pre-existing + 4 new).
      • Review iteration: 5 review threads resolved across 2 fold-in rounds. Round 1 (commit f3157ecc0) folded 4 copilot doc/comment threads — stale line refs → method anchors (T1, T3); ?? 0 → 'unknown' sentinel for descendant counts so operators can distinguish "0 found" from "not measured" (T2); rewrote singleton-stub mock comment to be unambiguous about current behavior (T4). Round 2 (commit 450e301c8) folded 1 wenshao dead-data thread — removed SweepResult.disconnectError field that no consumer read (T5; pre-existing inner debugLogger.error already gives operators the disconnect-failure signal). Zero declined.
      • Verification: 32/32 F2 + 46/46 mcp-client tests pass; tsc clean both packages; eslint clean on touched files.
    • PR C — F2 SDK breaking — deferred to v0.16.x (2026-05-24)

      • W133-c: PoolEvent['failed'] adds discriminator field source: 'reconnect_budget' | 'silent_drop'. Source-verified zero real consumers today (@qwen-code/sdk + @qwen-code/webui + internal mcp-client-manager.ts:1571 onFailed all don't distinguish). Adding the field is breaking, but with no consumers it's premature contract polish. Pick up when the first real consumer requires the split.
    • Test coverage backfill (can fold into A or B)

      • W7 (3/4): 3 of 4 test gaps the F2 PR didn't close
      • R24 T18: doRestart 3 untested failure paths (catch block, generation-superseded guard, post-await state guard) — needs ~100 LOC of mock infrastructure (McpClient connect-throws-on-restart, generation race timing, forceShutdown-during-await mocks)
    • Decision-blocked, NOT cleanup (separate F2 follow-on feature, needs design decision):

      • W8: PoolEntryOptions.maxReconnectAttempts + reconnectStrategy are currently dead config (pool mode has no health monitor). Either implement health-monitor + auto-reconnect, OR delete the fields. Underlying question: should pool actively reconnect on transport drop, or rely on the W122 self-heal pattern (onClosed → next acquire spawns fresh)? Surface for reviewer decision; not appropriate to bundle with cleanup PRs.
    • Already declined in feat(serve): shared MCP transport pool [F2] #4336, NOT for cleanup PR (listed only to prevent re-litigation):

      • R23 T3 (bootstrapSkipsMcpDiscovery = true defensive doc-as-code)
      • R24 T17 multi-fingerprint OAuth scenarios — single-user deployment unaffected; relevant only when multi-tenant lands, at which point the contract should be re-reviewed end-to-end

    Suggested ordering: PR A → PR B → PR C. PR A is risk-free pure-refactor that surfaces stale code paths PR B may need to touch. PR C bumps SDK minor so it goes last in the cluster. None blocks F4 (main scope).
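
The R10 / R23 T7 change in PR A (one process snapshot plus an in-memory tree walk, instead of one pgrep fork per node) can be sketched as a pure function. The sketch assumes input shaped like the output of `ps -A -o pid=,ppid=`, one "pid ppid" pair per line; descendantsFromSnapshot is a hypothetical name, not the function in pid-descendants.ts.

```typescript
// Hypothetical sketch: derive all descendants of `rootPid` from one snapshot.
// `psOutput` is assumed to contain one "<pid> <ppid>" pair per line.
function descendantsFromSnapshot(psOutput: string, rootPid: number): number[] {
  const childrenOf: Record<number, number[]> = {};
  for (const line of psOutput.split("\n")) {
    const parts = line.trim().split(/\s+/);
    if (parts.length !== 2) continue; // skip blank or malformed lines
    const pid = Number(parts[0]);
    const ppid = Number(parts[1]);
    if (isNaN(pid) || isNaN(ppid)) continue;
    (childrenOf[ppid] = childrenOf[ppid] || []).push(pid);
  }

  // Breadth-first walk over the parent -> children index built above.
  const found: number[] = [];
  const seen: Record<number, boolean> = {};
  seen[rootPid] = true;
  const queue: number[] = [rootPid];
  while (queue.length > 0) {
    const current = queue.shift() as number;
    for (const child of childrenOf[current] || []) {
      if (seen[child]) continue; // guard against malformed snapshots with cycles
      seen[child] = true;
      found.push(child);
      queue.push(child);
    }
  }
  return found;
}

const snapshotExample = "    1     0\n   10     1\n   11    10\n   12    10\n   30    11\n";
console.log(descendantsFromSnapshot(snapshotExample, 10)); // [ 11, 12, 30 ]
```

The snapshot costs one process spawn regardless of tree depth, which is the point of the change; the walk itself is plain in-memory work.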

  8. Optional PR 16.5 follow-up — auto-memory remains intentionally deferred from PR 16. Start only if maintainers want auto-memory before bridge extraction; static QWEN.md workspace memory CRUD is already merged.
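
The resync behavior described in item 5 (an awaitingResync flag that skips non-terminal deltas until the consumer recovers) might look roughly like the reducer below. This is a sketch under assumptions: the message_delta frame, the state shape, and stateFromSnapshotSketch are hypothetical stand-ins, and the real recovery path goes through loadSession + createDaemonSessionViewState rather than a bare string.

```typescript
type FrameSketch =
  | { type: "message_delta"; text: string } // hypothetical non-terminal delta
  | { type: "state_resync_required" }; // synthetic terminal named in item 5

interface ViewStateSketch {
  transcript: string;
  awaitingResync: boolean;
}

function reduceSketch(state: ViewStateSketch, frame: FrameSketch): ViewStateSketch {
  if (frame.type === "state_resync_required") {
    return { transcript: state.transcript, awaitingResync: true };
  }
  // While awaiting resync, deltas would apply to a stale base, so skip them.
  if (state.awaitingResync) return state;
  return { transcript: state.transcript + frame.text, awaitingResync: false };
}

// Stand-in for rebuilding view state from a freshly loaded session snapshot.
function stateFromSnapshotSketch(transcript: string): ViewStateSketch {
  return { transcript, awaitingResync: false };
}

const framesExample: FrameSketch[] = [
  { type: "message_delta", text: "a" },
  { type: "state_resync_required" },
  { type: "message_delta", text: "b" }, // skipped: state is awaiting resync
];
const stalled = framesExample.reduce(reduceSketch, stateFromSnapshotSketch(""));
console.log(stalled); // { transcript: 'a', awaitingResync: true }
```

Keeping the reducer pure means recovery is just replacing the state with one built from the reloaded session, after which deltas apply again.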

Goal

Deliver Mode B incrementally through small, independently mergeable PRs. The order below keeps the protocol stable before adding many endpoints, and puts minimal multi-client safety ahead of state-changing routes.

This is a feature-priority plan, but not a “mutation routes first” plan. Read-only/status routes can land early; routes that change daemon/session/workspace state must wait for auth/client-identity/session-scoped permission groundwork.

Key sequencing decisions

  1. Issue/docs cleanup is not a blocking PR. This issue body is the source-of-truth rollout plan. Release docs and alpha warnings can be folded into the release PR unless an npm release is imminent.
  2. Do not build a full MCP shared pool in Phase 1. McpClientManager is currently tied deeply to ToolRegistry, Config, and WorkspaceContext; a real shared pool needs the bridge/package split and a config-hash key, not just (workspaceCwd, serverName). Start with measurement and guardrails.
  3. Add protocol skeleton before CRUD. Capability tags, protocol versions, DaemonSessionClient, and typed event schemas should exist before broad control-plane routes land.
  4. Add minimal client identity before mutation routes. Daemon-stamped clientId and session-scoped permission routing are needed before remote clients can safely mutate approval mode, tools, files, auth, or workspace state.
  5. Add reliability/lifecycle before broad control-plane mutation. Heartbeat, replay/backpressure behavior, slow-client warnings, and session metadata/close/delete semantics are part of the P0 contract from Daemon mode (qwen serve): proposal & open decisions #3803, not optional polish.
  6. Read-only first, mutation second. Status/diagnostics routes come before write/edit/control routes.
  7. MCP pool and full PermissionMediator are architecture/security follow-ups. Guardrails can land early; full shared transport/process pool and policy strategies come after bridge extraction.
  8. Stage 2 protocol/ecosystem work is explicitly deferred unless called out below. WebSocket bidi, /ext/:method, reverse Client Capability RPC, OpenAPI codegen, Prometheus, and mDNS should not block Mode B v0.16 unless we intentionally move them into scope.
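
Decision 2 says a shared pool needs a config-hash key rather than just (workspaceCwd, serverName). A minimal sketch of why: two sessions can name the same server but launch it with different arguments or environment, and those must not share a transport. The config shape and function names below are hypothetical, and the sketch uses a canonical JSON string where a real implementation would presumably hash it.

```typescript
// Hypothetical config shape, for illustration only.
interface McpServerConfigSketch {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Serializes with sorted object keys so that key order does not change the result.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalJson(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).sort();
    return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalJson(obj[k])).join(",") + "}";
  }
  return JSON.stringify(value);
}

// A real pool would likely hash the canonical string; it is kept readable here.
function poolKeySketch(workspaceCwd: string, serverName: string, config: McpServerConfigSketch): string {
  return [workspaceCwd, serverName, canonicalJson(config)].join("\u0000");
}

const keyA = poolKeySketch("/repo", "github", { command: "npx", args: ["srv"], env: { TOKEN: "a" } });
const keyB = poolKeySketch("/repo", "github", { command: "npx", args: ["srv"], env: { TOKEN: "b" } });
console.log(keyA === keyB); // false
```

Same workspace and server name, different environment, different key: the two sessions get separate pool entries.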

PR breakdown and dependencies

Wave 1 — Protocol foundation

These are the first real implementation PRs.

PR Title Contents Depends on
1 test/perf: add daemon baseline harness Done: #4205 merged 2026-05-16. Captures RSS curve, same-workspace attach latency, prompt p50/p99, MCP child count, SSE replay/backpressure basics. No optimization yet. Reference patterns from opencode: test/memory/abort-leak.test.ts (forced-GC + process.memoryUsage().heapUsed baseline→iterations→growth shape), src/cli/heap.ts (periodic RSS poll + writeHeapSnapshot on threshold, useful as a Wave 6 production tool reference), src/util/cpu-watchdog.ts (event-loop lag drift sampling with per-subsystem counter dumps). Daemon-level multi-session RSS scan + prompt percentile + MCP child count + SSE backpressure are net new — neither opencode nor qwen-code has them today. none
2 feat(serve): capability registry and protocol versions Done: #4191 merged 2026-05-16. Replace hard-coded STAGE1_FEATURES with an additive registry; add /capabilities.protocolVersions; keep existing v1 fields backward compatible. none
3 feat(sdk): add DaemonSessionClient skeleton Done: #4201 merged 2026-05-16; hardening follow-up #4225 merged 2026-05-17. Shared SDK helper over DaemonClient: create/attach, prompt, events, cancel, model; intended for TUI/channels/web/IDE adapters. PR 2
4 feat(protocol): typed daemon event schema v1 Done: #4217 merged 2026-05-17; follow-up #4226 merged 2026-05-17 10:43Z. SDK-layer discriminated union over the 8 emitted daemon frames, narrow helper, type guard, pure SessionState reducer; daemon advertises typed_event_schema capability tag. Raw DaemonEvent { data: unknown } and DaemonClient.subscribeEvents / DaemonSessionClient.events() signatures preserved. PR 2, PR 3
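
PR 4's combination of a discriminated union, a narrow helper, and a raw DaemonEvent { data: unknown } escape hatch can be illustrated as follows. The two event names appear elsewhere in this plan, but their payload shapes here are assumptions, and every identifier ending in Sketch is hypothetical rather than the SDK's real API.

```typescript
interface RawDaemonEventSketch {
  type: string;
  data: unknown;
}

// Payload shapes below are assumptions for illustration only.
type TypedEventSketch =
  | { type: "slow_client_warning"; data: { queued: number } }
  | { type: "permission_already_resolved"; data: { requestId: string } };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Returns a typed event when the raw frame validates, otherwise undefined.
function narrowSketch(raw: RawDaemonEventSketch): TypedEventSketch | undefined {
  const data = raw.data;
  if (!isRecord(data)) return undefined;
  if (raw.type === "slow_client_warning") {
    const queued = data["queued"];
    return typeof queued === "number" ? { type: "slow_client_warning", data: { queued } } : undefined;
  }
  if (raw.type === "permission_already_resolved") {
    const requestId = data["requestId"];
    return typeof requestId === "string"
      ? { type: "permission_already_resolved", data: { requestId } }
      : undefined;
  }
  return undefined; // unknown frame types stay on the raw path
}

// The switch is exhaustive over the union, so the compiler checks every case.
function describeSketch(event: TypedEventSketch): string {
  switch (event.type) {
    case "slow_client_warning":
      return "slow client, queued=" + event.data.queued;
    case "permission_already_resolved":
      return "permission " + event.data.requestId + " already resolved";
  }
}
```

Because unknown or malformed frames return undefined instead of throwing, older clients keep working when the daemon adds event types, which fits the additive capability model from PR 2.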

Wave 2 — Session lifecycle and minimum multi-client safety

These should land before any broad state-changing control-plane routes.

PR Title Contents Depends on
5 feat(serve): per-request sessionScope Done: #4209 merged 2026-05-16. POST /session accepts { sessionScope: 'single' | 'thread' }; daemon default remains single; invalid values return 400 invalid_session_scope. New capability tag session_scope_override advertised on /capabilities.features. Backward compatible — omitting the field preserves pre-PR wire shape bit-for-bit. If user scope remains proposed in #3803, keep it capability-gated until its isolation semantics are clear. Follow-up #4214 aligns stale integration-test expectations and docs.
6 feat(serve): HTTP load/resume session Done: #4222 merged 2026-05-17 04:58. Adds POST /session/:id/load and POST /session/:id/resume; SDK DaemonSessionClient.load/.resume. Three @wenshao review passes addressed (25 inline threads); fixes include cached restoreState for late attachers, no defaultEntry promote on restore, symmetric coalesce guard (cross-action races both reject with RestoreInProgressError), synchronous coalesceState.count reservation to close the spawn-owner-disconnect race, pendingRestoreIds in killSession teardown, transportClosed.catch() to suppress dangling rejection, Retry-After: 5 alignment, isAcpSessionResourceNotFound exact-match fallback. Protocol + user docs updated. PR 3, PR 5
7 feat(serve): minimal daemon-stamped client identity Done: #4231 by @chiga0 merged 2026-05-17 08:19Z. Daemon assigns/stamps clientId; clients echo it back via X-Qwen-Client-Id; emitted events use trusted originatorClientId; no full revocation yet. PR 3, PR 4
8 feat(serve): session-scoped permission route Done: #4232 merged 2026-05-17 09:48Z. Adds POST /session/:id/permission/:requestId; keeps legacy POST /permission/:requestId; adds permission_already_resolved event. PR 7
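
PR 5's request validation (accept 'single' or 'thread', return 400 invalid_session_scope otherwise, and leave the default untouched when the field is omitted) can be sketched as a small pure function. The function name and result shape are hypothetical; only the accepted values, status code, and error string come from the row above.

```typescript
type SessionScopeSketch = "single" | "thread";

type ParseResultSketch =
  | { ok: true; sessionScope: SessionScopeSketch | undefined }
  | { ok: false; status: 400; error: "invalid_session_scope" };

// Hypothetical validator for the POST /session body.
function parseSessionScopeSketch(body: unknown): ParseResultSketch {
  const raw =
    typeof body === "object" && body !== null
      ? (body as Record<string, unknown>)["sessionScope"]
      : undefined;
  // Omitted field: the daemon default applies and the wire shape is unchanged.
  if (raw === undefined) return { ok: true, sessionScope: undefined };
  if (raw === "single" || raw === "thread") return { ok: true, sessionScope: raw };
  return { ok: false, status: 400, error: "invalid_session_scope" };
}

console.log(parseSessionScopeSketch({ sessionScope: "user" }));
// { ok: false, status: 400, error: 'invalid_session_scope' }
```

Rejecting 'user' here reflects the note that user scope stays capability-gated until its isolation semantics are clear.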

Wave 2.5 — Reliability and session lifecycle closure

These are P0 contract gaps from #3803 and should land before broad mutation/control-plane work.

| PR | Title | Contents | Depends on |
| --- | --- | --- | --- |
| 9 | feat(serve): client heartbeat | Done: #4235 merged 2026-05-17 10:57Z. Adds POST /session/:id/heartbeat (capability tag client_heartbeat); bridge heartbeat bookkeeping; SDK DaemonClient.heartbeat() + DaemonSessionClient.heartbeat(). Pure additive. | PR 7 |
| 10 | feat(serve): SSE replay and slow-client warnings | Done: #4237 merged 2026-05-17 11:30Z. Configurable replay ring sizing, ?maxQueued=N, slow_client_warning, SDK event typing/reducer support, and docs. | PR 1, PR 4 |
| 11 | feat(serve): session metadata and close/delete lifecycle | Done: #4240 merged 2026-05-17 12:42Z. Adds explicit session close/delete lifecycle, mutable session metadata, enriched session listings, typed lifecycle events, and SDK helpers. | PR 6, PR 7 |
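
To make the "replay ring" idea from PR 10 concrete, here is a minimal sketch of a bounded event ring that a reconnecting client could replay from. It is an assumption-laden illustration: `ReplayRing`, `DaemonEvent`, and `replayAfter` are hypothetical names, and the per-subscriber queue cap (?maxQueued=N) and slow_client_warning emission are not modeled.

```typescript
// Hypothetical event shape; the daemon's real envelope is not shown in this issue.
interface DaemonEvent {
  id: number;
  type: string;
  data: unknown;
}

class ReplayRing {
  private readonly buf: DaemonEvent[] = [];
  private readonly capacity: number;
  private nextId = 1;

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  // Append an event, evicting the oldest one once the ring is full.
  publish(type: string, data: unknown): DaemonEvent {
    const ev: DaemonEvent = { id: this.nextId++, type, data };
    this.buf.push(ev);
    if (this.buf.length > this.capacity) this.buf.shift();
    return ev;
  }

  // Return events newer than lastSeenId. `gap` is true when some of the
  // events the client missed have already been evicted from the ring.
  replayAfter(lastSeenId: number): { events: DaemonEvent[]; gap: boolean } {
    const first = this.buf[0];
    const oldestId = first ? first.id : this.nextId;
    const gap = lastSeenId + 1 < oldestId;
    return { events: this.buf.filter((e) => e.id > lastSeenId), gap };
  }
}
```

A larger ring tolerates longer disconnects at the cost of memory, which is presumably why the sizing is configurable; a `gap` result is the point where a client would need a full state refresh instead of a replay.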

Wave 3 — Read-only control plane and diagnostics

This lets remote clients see the daemon runtime state before they can mutate it.

PR Title Contents Depends on
12 feat(serve): read-only workspace/session status routes Done: #4241 merged 2026-05-17 13:37Z. GET /workspace/mcp, GET /workspace/skills, GET /workspace/providers, GET /session/:id/context, GET /session/:id/supported-commands. Status payloads are protocol-versioned, redacted, and typed in SDK. Idle workspace returns initialized: false without spawning ACP. Session-scoped SDK helpers forward clientId for audit symmetry. PR 2, PR 4
13 feat(serve): preflight and env diagnostics routes Done: #4251 merged 2026-05-17 23:29Z (merge commit f44ed0941). GET /workspace/env + GET /workspace/preflight with closed errorKind taxonomy (7 literals); typed BridgeTimeoutError + mapDomainErrorToErrorKind + ACP_PREFLIGHT_KINDS shared between idle placeholder and live builder (TS exhaustiveness, no drift). getGitVersion/getNpmVersion migrated from blocking execSync to async execFile + 5s timeout (cross-cutting fix). 4 review rounds folded in across 12 inline threads (11 resolved, 1 closed via APPROVED waiver). 639/639 tests, all 3 OS CI green. Egress probe deferred to PR 14b. PR 12
14 (v1) feat(serve): MCP client guardrails Done: #4247 merged 2026-05-18 04:07Z (merge commit 96219924a). Ships in-process MCP client accounting on McpClientManager + atomic slot-reservation enforcement at all 3 spawn sites + --mcp-client-budget=N / --mcp-budget-mode={enforce,warn,off} flags + additive clientCount / clientBudget / budgetMode / budgets[] fields on GET /workspace/mcp + per-server disabledReason: 'budget' tag + always-on capability tag mcp_guardrails. Snapshot-only (typed push events split into PR 14b). Unblocks PR 14b + PR 17. Not the full shared pool — that's Wave 5 PR 23. PR 1, PR 12, PR 15
14b feat(serve): MCP guardrail push events + hysteresis Done: #4271 merged 2026-05-18 17:06Z (commit 3ffe321cf). Per-session typed SSE push events mcp_budget_warning (single-fire on upward 75% crossing of reservedSlots.size / clientBudget with hysteresis re-arm at 37.5% via new core constant MCP_BUDGET_REARM_FRACTION) + mcp_child_refused_batch (coalesced once per discoverAllMcpTools* pass + length-1 from readResource lazy-spawn refusal; mode: 'enforce' literal). Hysteresis state machine in McpClientManager (not bridge — per-session manager state, not per-subscriber backlog). Transport via existing connection.extNotification('qwen/notify/session/mcp-budget-event', ...) mirroring acpAgent.ts:355 precedent — pivoted away from the originally planned new workspace-level notification kind because PR 14 v1's per-session scope correction made the session-scoped notification surface the natural fit. BridgeClient.extNotification handler resolves session via byId.get(sessionId) and republishes as session-scoped SSE frame; unknown methods/kinds/sessionIds drop silently for forward-compat. Always-on capability tag mcp_guardrail_events (distinct from PR 14's mcp_guardrails). SDK reducer state: mcpBudgetWarningCount, lastMcpBudgetWarning, mcpRefusedBatchCount, lastMcpRefusedBatch on DaemonSessionViewState. SDK predicate guards reject mode: 'warn' on refused-batch (literal-'enforce' invariant) + unknown transport families. Integration cross-check pgrep -P subprocessCount vs daemon's getMcpClientAccounting().total validates the in-process counter as event source (POSIX non-sandbox skip-gating). Backward compatible — every new field optional, EVENT_SCHEMA_VERSION stays at 1, old daemons advertise only mcp_guardrails, SDK consumers fall back to snapshot polling. 13 files, +1828/-16 lines. 865/865 tests pass; 22 new tests across 6 layers. Purely additive on top of PR 14 v1; runs in parallel with Wave 4 PRs. PR 14 v1, PR 1
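
The single-fire warning with hysteresis described for PR 14b (fire once on an upward crossing of 75% of the budget, re-arm only after usage falls to 37.5%) can be sketched as a two-state gate. This is a minimal sketch under assumptions: `BudgetWarningGate` and `WARN_FRACTION` are hypothetical names, the constant `MCP_BUDGET_REARM_FRACTION` is mirrored locally rather than imported, and whether the thresholds are inclusive is a guess.

```typescript
// Values taken from the PR 14b row; inclusive comparisons are an assumption.
const WARN_FRACTION = 0.75;
const MCP_BUDGET_REARM_FRACTION = 0.375;

class BudgetWarningGate {
  private armed = true;
  private readonly clientBudget: number;

  constructor(clientBudget: number) {
    if (clientBudget <= 0) throw new RangeError("clientBudget must be positive");
    this.clientBudget = clientBudget;
  }

  // Feed the current number of reserved slots; returns true exactly when
  // a warning event should be emitted for this observation.
  observe(reservedSlots: number): boolean {
    const ratio = reservedSlots / this.clientBudget;
    if (this.armed && ratio >= WARN_FRACTION) {
      this.armed = false; // fire once, then stay quiet until re-armed
      return true;
    }
    if (!this.armed && ratio <= MCP_BUDGET_REARM_FRACTION) {
      this.armed = true; // usage dropped far enough to warn again later
    }
    return false;
  }
}
```

The gap between the two thresholds is what prevents a client count hovering around 75% from producing a stream of repeated warnings.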

Wave 4 — Auth-gated mutation/control routes

These are the first broad mutation routes. They should reuse a single mutation-gating helper rather than open-code auth checks per route.

PR Title Contents Depends on
15 feat(serve): mutation gating helper and --require-auth Done: #4236 merged 2026-05-17 12:10Z. Central createMutationGate({ tokenConfigured, requireAuth }); --require-auth; conditional require_auth capability tag; existing mutation routes adopt the default non-strict gate as a centralization marker. Unblocks Wave 4 PRs 16-21. PR 7
16 feat(serve): memory and agents CRUD Done: #4249 merged 2026-05-18 06:27Z (merge commit 103090669). GET/POST /workspace/memory, GET/POST /workspace/agents, and agent-type scoped CRUD GET/POST/DELETE /workspace/agents/:agentType; strict mutation gate on writes; capability tags workspace_memory + workspace_agents; typed events memory_changed + agent_changed via bridge.publishWorkspaceEvent; bridge.knownClientIds() for workspace-level client attribution; 7 SDK helpers. Review fold-ins covered concurrent append serialization, fresh agent listing, builtin-name shadow rejection, validation/SDK parity/CodeQL regex hardening, and event/fan-out/helper tests. Auto-memory remains deferred to PR 16.5. PR 15, PR 12
17 feat(serve): approval tools init and MCP restart controls Done: #4282 merged 2026-05-18 16:27Z (commit 6f7a48936). POST /session/:id/approval-mode, POST /workspace/tools/:name/enable, POST /workspace/init, POST /workspace/mcp/:server/restart. MCP-restart consults PR 14 v1's getMcpClientAccounting snapshot to gate "is this restart safe under budget?" and stamps originatorClientId on SSE fan-out. PR 15, PR 12, PR 14 v1
18 refactor(serve): add FileSystemService boundary Done: #4250 merged 2026-05-18 05:14Z (merge commit 495d11f01). Per-request WorkspaceFileSystem boundary in packages/cli/src/serve/fs/. Centralizes chain-aware path canonicalization (ENOENT-tolerant ancestor walk), suspicious-pattern detection (NTFS ADS / 8.3 / UNC / DOS device names / trailing dots), .gitignore/.qwenignore enforcement, trust gate over Intent union, MAX_READ_BYTES = 256 KiB hard cap stat'd before slurp, SHA-256-hashed audit events. 70 review threads ALL resolved across 10 fix-up commits (Copilot 8 + wenshao 8 + DeepSeek 9 + 6 more rounds; TOCTOU symlink-substitution guards, multi-hop dangling-symlink escape, OOM defense-in-depth, safe UTF-8 truncation, 'io_error' HTTP 503 kind, strict-default fsFactory). 446 serve + 137 fs tests pass at merge. Unblocks PR 19 + PR 20. PR 12, PR 15
19 feat(serve): safe workspace file read routes Done: #4269 merged 2026-05-18 08:17Z (merge commit 52d2850c). Four read-only routes through WorkspaceFileSystem: /file, /list, /glob, /stat; no-store/nosniff headers; shared sendFsError envelope; workspace_file_read capability tag; runQwenServe fsFactory injection; glob audit pattern; stable /file metadata (sizeBytes source size + returnedBytes response size; nullable originalLineCount); boundary-side list cap probe. All review threads replied + resolved. Unblocks PR 20. PR 18
20 feat(serve): file write/edit routes behind auth Done: #4280 merged 2026-05-18 14:37Z (commit 688d64416). Strict-auth write/edit routes; reuses PR 19's route envelope and PR 18's WorkspaceFileSystem; enforces trust/qwenignore behavior, audit hooks, explicit symlink policy, atomic temp-file + rename writes, and expectedHash / baseRevision optimistic concurrency for edit() TOCTOU (#4250 thread on ws.ts:613). PR 8, PR 15, PR 19
21 feat(serve): auth device-flow route Done: #4255 merged 2026-05-18 14:05Z (commit 36760ca63). OAuth 2.0 Device Authorization Grant (RFC 8628) brokering: 4 routes under /workspace/auth/, 1 capability tag (auth_device_flow), 5 typed events (auth_device_flow_started/throttled/authorized/failed/cancelled). DeviceFlowRegistry with per-providerId singleton + idempotent take-over + in-flight Promise coalescing + 5-min terminal grace + sweeper + dispose. BrandedSecret<T> (frozen plain object + WeakMap + unique symbol brand + 4-way redaction). Disk-first persist(). Cancellable poll(state, {signal}). SDK client.auth.start(...).awaitCompletion() UX mirroring gh auth login. Pre-PR specialist passes (code-reviewer + silent-failure-hunter + type-design-analyzer) flagged 12 P0/P1 items, all folded before opening. 369 tests pass; typecheck + eslint clean. +3527/-8 across 17 files. PR 15, PR 12
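
PR 20 mentions expectedHash / baseRevision optimistic concurrency for edit(). The baseRevision variant can be sketched with an in-memory store: an edit succeeds only if the caller's revision still matches the stored one. This is an illustration only; `RevisionedStore`, `EditResult`, and the reason strings are hypothetical, and the real routes operate on disk through WorkspaceFileSystem with atomic temp-file + rename writes, which this sketch does not model.

```typescript
// Hypothetical result shape; the daemon's actual error envelope may differ.
type EditResult =
  | { ok: true; revision: number }
  | { ok: false; reason: "not_found" }
  | { ok: false; reason: "revision_conflict"; currentRevision: number };

class RevisionedStore {
  private readonly files = new Map<string, { content: string; revision: number }>();

  // Create or reset a file at revision 1.
  create(path: string, content: string): number {
    this.files.set(path, { content, revision: 1 });
    return 1;
  }

  read(path: string): { content: string; revision: number } | undefined {
    return this.files.get(path);
  }

  // Apply the edit only when the caller read the latest revision.
  edit(path: string, baseRevision: number, next: string): EditResult {
    const current = this.files.get(path);
    if (!current) return { ok: false, reason: "not_found" };
    if (current.revision !== baseRevision) {
      return {
        ok: false,
        reason: "revision_conflict",
        currentRevision: current.revision,
      };
    }
    const revision = current.revision + 1;
    this.files.set(path, { content: next, revision });
    return { ok: true, revision };
  }
}
```

With two clients editing the same file, the second writer gets a conflict instead of silently overwriting the first, and can re-read before retrying.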

Remaining work — feature-cohesive plan (replaces Wave 5/6 fragmented split)

Per maintainer guidance (2026-05-19), the remaining 10+ Wave 5/6 PRs are consolidated into 5 feature-cohesive PRs targeting daemon_mode_b_main. The Wave 5 / Wave 6 tables below are kept for traceability of the original split but should be read as superseded by F1-F5.

Feature PR Replaces Scope Est. LOC Depends on
F1: acp-bridge package self-sufficiency (#4319 ✅ MERGED 2026-05-19 16:26Z, merge commit 981bc7c7e) PR 22b/3 + PR 22b' Mechanical bulk lift of BridgeClient + defaultSpawnChannelFactory + createHttpAcpBridge factory closure (~3000 LOC); shrunk cli/src/serve/httpAcpBridge.ts from 4682 → 97 LOC re-export shim; BridgeFileSystem injection seam landed (writeText / readText ACP fs methods delegate to the seam when wired, fall back to inline proxy otherwise). 15 commits / +5343 −4690 across 14 files. Review history: 18 inline threads resolved across 5 wenshao rounds + 1 Copilot pass; 12 adopted in-PR + 1 declined (#4330) + 1 deferred (#4325) + 1 originally declined then folded after second look (#4329 ✅ closed). Follow-up ✅ MERGED: #4334, merged 2026-05-20 05:10Z (merge commit dfa8ca407). Three F1 follow-ups bundled into one PR on daemon_mode_b_main: (1) serve-side WorkspaceFileSystem adapter wiring (closes ws.ts:613 TOCTOU thread); (2) #4325 channelInfo channel-overlap fix with HAZARD(#4325) review-time markers; (3) new WorkspaceFileSystem.writeTextOverwrite primitive: atomic create-or-overwrite with mode preservation + 0o600 default (replaces the inline-proxy posture for ACP writes). Behavior change documented: ACP writes now reject symlinked targets (consistent with PR 20 HTTP fs). 19/19 review threads resolved across Copilot + 2× wenshao + DeepSeek-v4-pro rounds. 767/767 serve + 62/62 acp-bridge tests. Still actionable: F1 test-file split + move (deferred; requires shared test-utils extraction), #4330 (SDK/server timeout coupling; needs npm-publish or Option B/C decision). Net: @qwen-code/acp-bridge is now a complete, injectable, testable package; F2-F4 unblocked. ~3500 (mostly git mv); shipped at +5343/−4690 across 14 files PR 22b/2 ✅
F2: shared MCP transport pool (#4336 — ✅ MERGED 2026-05-21 15:56Z, merge commit 46f8d48f1) PR 23 Real shared MCP transport / process pool keyed by canonical workspace + server config hash + auth/env/runtime inputs; replaces per-session spawn; reference counting + drain grace + max-idle hard cap; pool scope cell on GET /workspace/mcp complementing PR 14 v1's workspace scope cell. All 6 feature commits landed: McpClient.discover pure-snapshot split, McpTransportPool + SessionMcpView core, cross-platform descendant pid sweep, QwenAgent daemon-mode pool wiring, pool-aware status + restart routes (entryCount / entrySummary / ?entryIndex=), WorkspaceMcpBudget workspace-scope budget guardrails. Review history: 168 inline threads resolved across 24 review-iteration rounds (wenshao + qwen-latest series + gpt-5.5 + claude-opus-4-7). Notable critical fixes mid-PR: sibling-fingerprint statusChangeListener cross-corruption (W2), descendant pid sweep missing on restart (W3), drain-timer-during-restart race (W4), discovery non-re-entrancy leak (W6), W90/W111 reverse-index ref leak, W115/W118 stopTimedOut sticky-flag, W120→W122→R21→R22 silent-drop terminal-cleanup parity progression, W125 fast-path defense-in-depth + identity-checked eviction helper, R23 T1 discover() filter regression for legacy callers, R23 T15 sweep / updateGlobalStatus ordering, R24 T17/T19 budget-tracking + readResource self-heal. Design doc v2.2 (docs/design/f2-mcp-transport-pool.md) records every fold-in. 287 tests at merge. F2 follow-ups bucket (post-merge cleanup PR — see item 3 in "Current near-term queue" above for full list): R9/R10 perf + ctor cleanups, W7/R24 T18 test coverage scaffolding, W8 health-monitor wire-up, W11/W12 acquire-path / filter perf, W93 post-spawn-failure self-heal, W133-a/c upstream error threading + SDK discriminator, W134 silent-drop sweep error surfacing. ~800-1200 estimated; shipped at +10308/−147 across 38 files (36 commits) F1 ✅
F3: multi-client permission coordination (#4335 — ✅ MERGED 2026-05-20 11:13Z, merge commit 8eeb51009) PR 24 4-strategy MultiClientPermissionMediator impl (first-responder, designated, consensus, local-only) behind PR 22a's frozen interface; policy.permissionStrategy + policy.consensusQuorum workspace-settings keys (boot-time validated); structured PermissionAuditRing (in-memory FIFO 512) capturing 5 record kinds (requested / voted / forbidden / resolved / timeout) with decisionReason discriminated union for forensics; new SSE events permission_partial_vote (consensus quorum progress) + permission_forbidden (rejected vote) + 3 new error classes wired through HTTP (403/501/500). Pair-token revocation API explicitly out-of-scope (filed for follow-up — consensus voter set keyed on clientId registry; cancel sentinel cross-policy by design + documented). ~1500-2000 (final +5263/−417) F1 ✅
F4: daemon-native client experience for ACP clients (revised 2026-05-21 — 2026-05-19 web-first pivot itself superseded; see "2026-05-21 direction update" subsection above) revised Browser is out of scope for the daemon. F4 focuses on the existing ACP-over-HTTP/SSE surface (from F1) being used end-to-end by non-browser clients: CLI / IDE companion / Node SDK / channel adapters. Concrete sub-scope TBD — candidates: (a) qwen --connect <port> CLI multi-session attach (cross-terminal session sharing, F3 multi-client coordination in CLI domain); (b) VS Code IDE companion daemon-native rewrite (currently spawns CLI; could become daemon client for multi-window session sharing); (c) channel adapter end-to-end demo on top of PR 25 OutputSink (reference implementation for new wire-protocol channels); (d) Phase-1 scale instrumentation toward 30-50 active session goal (rate limiter / session TTL / per-session resource accounting / backpressure / observability / load test harness — independent of process model, prerequisite to any larger scaling work). F4 (1c) qwen serve /web HTTP endpoint was DROPPED 2026-05-21: browsers can't safely hold daemon bearer tokens (XSS risk), cookie-BFF mediation was only justified if browser were an active target. chiga0 #4328 + #4353 continue as library-only React/SDK packages; downstream embedders may host them, daemon does not. Native TUI / VS Code companion / channel adapters explicitly stay on their direct ACP / runtime paths — the spike PRs #4199 / #4202 / #4203 are future-reference baselines, not migration checklists. TBD per chosen sub-scope F1 ✅, F3 ✅
F5: production release chain PR 27 + 28 + 30a + 31 (PR 29 + PR 30b both deferred to v0.16.x) Alpha release docs (README known-limits, loopback noauth warning, daemon runtime locality, durability, deployment notes, BYO QWEN_SERVER_TOKEN guide + ~10 LOC SDK env fallback) + npm publish scaffolding + PR 30a local launch refs (systemd / macOS launchd / nohup-tmux) + v0.16-alpha.0 cut. 2026-05-24 scope freeze: v0.16-alpha = text-only + local-only, no new protocol/runtime knobs, no new auth machinery. Existing PR 15 boot-time gate (runQwenServe.ts:175) already refuses unsafe non-loopback boots; alpha audience is loopback-default dogfooding developers who don't need token at all. --max-body-size (chiga0 #27 P0 #1), PR 29 production token defaults (auto-gen + instance-path keying + SDK file fallback + stale cleanup), and PR 30b containerized deployment refs are all explicitly NOT in v0.16 — each ships as v0.16.x patch when the consumer scenario validating them (multimodal / remote / multi-daemon / enterprise pilot) is actually committed; otherwise the work would rot from no-one-validating. ~300-500 (29 + 30b excluded) F1-F3 ✅
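
The consensus strategy in the F3 row (policy.consensusQuorum, with permission_partial_vote reporting quorum progress) can be illustrated with a small vote tally. This is a sketch under stated assumptions, not the MultiClientPermissionMediator: `ConsensusTally`, `Decision`, and `TallyState` are hypothetical, and the rules that a re-vote from the same clientId overwrites the earlier one and that the first decision to reach quorum wins are guesses about semantics the issue does not spell out.

```typescript
type Decision = "allow" | "deny";

type TallyState =
  | { status: "pending"; votes: number; quorum: number }
  | { status: "resolved"; decision: Decision };

class ConsensusTally {
  private readonly votes = new Map<string, Decision>();
  private readonly quorum: number;
  private resolved: Decision | undefined;

  constructor(quorum: number) {
    if (!Number.isInteger(quorum) || quorum < 1) {
      throw new RangeError("quorum must be a positive integer");
    }
    this.quorum = quorum;
  }

  // Record one vote per clientId and report progress toward quorum.
  vote(clientId: string, decision: Decision): TallyState {
    if (this.resolved !== undefined) {
      return { status: "resolved", decision: this.resolved };
    }
    this.votes.set(clientId, decision); // assumption: a re-vote overwrites

    let agreeing = 0;
    this.votes.forEach((d) => {
      if (d === decision) agreeing++;
    });

    if (agreeing >= this.quorum) {
      this.resolved = decision;
      return { status: "resolved", decision };
    }
    // A "pending" result is where a partial-vote progress event would fit.
    return { status: "pending", votes: agreeing, quorum: this.quorum };
  }
}
```

In this model a quorum of 1 behaves like first-responder, which is one way to see the four strategies as points on the same spectrum; designated and local-only would need an additional notion of which clientIds are allowed to vote.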

Pre-F1 housekeeping ✅ complete: #4305 merged 2026-05-19 05:40Z and #4297 merged 2026-05-19 06:57Z. F1 can now branch off daemon_mode_b_main cleanly.

Parallelism: F2 and F3 are independent of each other (both depend on F1 only); they can run concurrently after F1 lands. F4 depends on F3's permission contract for multi-client UX. F5 sequences last. (Status as of 2026-05-21: F1 / F2 / F3 all merged into daemon_mode_b_main. F4 revised 2026-05-21 — daemon-native client experience for non-browser ACP clients (CLI / IDE / SDK / channel adapters); web hosting dropped. F4 (1c) qwen serve /web endpoint was DROPPED 2026-05-21 (browser not a target). chiga0 #4328 + #4353 continue as library-only React/SDK packages, independent of F4 critical path. F4 concrete sub-scope TBD — see F4 row above for candidates (CLI --connect, IDE daemon-native, channel adapter demo, Phase-1 scale instrumentation). PR 25 OutputSink lift is independent CLI cleanup. Native TUI / IDE / channel adapter migrations are NO LONGER planned (2026-05-19 part still holds). Shared TUI+web render-core deferred to far-future wishlist.)

Wave 5 — Architecture extraction, output sinks, and full multi-client security

Do not start this until the protocol skeleton, permission route, and lifecycle events are stable.

⚠️ Superseded by feature-cohesive F1-F5 above. Kept for traceability of the original 22a/22b/1/22b/2/22b/3 + 23/24/25/26 split.

📊 Wave 5 status as of 2026-05-20 (post web-first pivot recorded in chiga0 #4296):

| Original PR | Status | Mapping / disposition |
| --- | --- | --- |
| 22a | ✅ MERGED | #4295 (interfaces + skeleton) |
| 22b/1 | ✅ MERGED | #4298 (status / paths / errors / bridge types) |
| 22b/2 | ✅ MERGED | #4304 (BridgeOptions + DaemonStatusProvider seam) |
| 22b/3 + 22b' | ✅ MERGED via F1 | #4319 (acp-bridge core lift + BridgeFileSystem seam); follow-up #4334 |
| 23 | ✅ MERGED via F2 | #4336 (shared MCP transport pool) |
| 24 | ✅ MERGED via F3 | #4335 (MultiClientPermissionMediator + audit ring + 4-strategy dispatch) |
| 25 | 🟢 CLI-internal cleanup, independent of revised F4 direction (2026-05-21) | chiga0 confirmed 2026-05-20: pure CLI-internal cleanup. Daemon SSE keeps existing JSON-envelope wire format (no ?format=jsonl HTTP query param), sink lives in packages/cli/src/output/ or a new CLI-only package, MUST NOT land under packages/sdk-typescript/src/daemon/ (browser-safe bundle invariant guarded by chiga0's assertBrowserSafeBundle). Could become a candidate basis for revised F4 sub-scope (c) "channel adapter end-to-end demo" if someone picks that path. Otherwise: unblocked but low priority; ~1-2 days refactor whenever someone picks it up. |
| 26 | SUPERSEDED | Native TUI / VS Code / channel adapter migration to DaemonSessionClient is no longer planned (2026-05-19 pivot, confirmed). The 3 spike PRs #4199 / #4202 / #4203 stay as future-reference docs only. The web-only piece was also dropped 2026-05-21; daemon does not host web UI. chiga0 #4328 + #4353 continue as library-only React/SDK packages, independent of daemon. |

Net: Wave 5's "must-do" server-side hardening (22a/22b/1/22b/2 + F1+F2+F3) is fully landed. PR 25 is unblocked but low priority, and PR 26 is dropped. There is no remaining "must-start" Wave 5 work as of this writing.

PR Title Contents Depends on
22a refactor(acp-bridge): create skeleton + lift zero-coupling primitives Done: #4295 merged 2026-05-18 17:23Z (commit f97cb680a). Creates packages/acp-bridge/ (@qwen-code/acp-bridge internal package). Lifts the three zero-coupling primitives via git mv (preserves blame): eventBus.ts (578 LOC), inMemoryChannel.ts (73 LOC), and AcpChannel/AcpChannelExitInfo/ChannelFactory types from httpAcpBridge.ts:638-677. Adds type-only PermissionMediator interface + PermissionPolicy literal union (4 strategies — first-responder, designated, consensus, local-only) as the contract for PR 24. Backward-compat re-export wrappers in cli/src/serve/eventBus.ts and inMemoryChannel.ts; httpAcpBridge.ts:638-677 is now an import type + export type re-export. 28 acp-bridge tests + 567 cli serve tests pass; typecheck clean. Zero /capabilities, route, SDK, or spawn-site behavior changes. PR 4, PR 8, PR 11
22b/1 refactor(acp-bridge): lift status, paths, errors, and bridge types Done: #4298 merged 2026-05-18 23:00Z (commit 6c2605f74). Pure-type / pure-utility lift: status.ts (600 LOC, includes the 27-symbol contract acpAgent.ts:85-113 consumes), canonicalizeWorkspace + MAX_WORKSPACE_PATH_LENGTH to workspacePaths.ts, 11 bridge error classes to bridgeErrors.ts, 11 bridge type interfaces (BridgeSession*, HttpAcpBridge interface) to bridgeTypes.ts. ~1500 LOC of mechanical moves with re-export wrappers; 4 commits (2 checkpoint commits + 1 review-feedback commit dropping premature subpath exports + adding bridgeErrors.ts module JSDoc + 1 self-review commit fixing missing barrel re-exports). Codex P2 + Copilot inline both flagged the same premature-exports issue → fixed pre-merge. github-actions Medium #4 (mapDomainErrorToErrorKind regex tech debt) tracked as follow-up issue #4299. PR 22a
22b/2 refactor(acp-bridge): lift BridgeOptions + DaemonStatusProvider injection seam Done: #4304 merged 2026-05-19 01:27Z (commit 68e3ec988). Design slice — judgment-heavy lift completed in 5 commits with wenshao review fold-ins (try/catch around provider calls, createServeApp default-bridge wiring, createIdleEnvStatus helper extraction, 4 fallback-path tests, SkillError cross-bundle defense closing #4298 thread). Lift BridgeOptions interface (~150 LOC) from httpAcpBridge.ts:695-831 to acp-bridge/src/bridgeOptions.ts. Introduce new DaemonStatusProvider interface in same file — the daemon-host-specific seam for env / preflight cells (buildEnvStatusFromProcess, buildDaemonPreflightCells stay in serve, get injected via BridgeOptions.statusProvider). New serve/daemonStatusProvider.ts (~150 LOC) wraps the daemon-host implementations. runQwenServe.ts passes the provider into the bridge factory. Review focus: DaemonStatusProvider interface shape — argument granularity ((boundWorkspace, acpChannelLive) vs context object), return-type granularity (full Serve*Status vs cells), abort/timeout support, fallback when omitted, single interface vs split (EnvStatusProvider + PreflightProvider). Total ~600 LOC, no behavior change. PR 22b/1
22b/3 refactor(acp-bridge): lift BridgeClient + spawn factory + createHttpAcpBridge Unblocked. Mechanical bulk move, zero new decisions. BridgeClient class (~400 LOC) + defaultSpawnChannelFactory (~250 LOC) + createHttpAcpBridge factory closure (~2240 LOC) → acp-bridge/src/{bridgeClient,spawnChannel,bridge}.ts. Factory consumes DaemonStatusProvider per the contract 22b/2 froze (no design judgment in this PR). Move 5064-LOC httpAcpBridge.test.ts → acp-bridge/src/bridge.test.ts. Final shrink of cli/src/serve/httpAcpBridge.ts to ~80-line re-export shim. WorkspaceFileSystem injection into BridgeClient.writeTextFile/readTextFile deferred to PR 22b' (~200 LOC follow-up). Channels (packages/channels/base/AcpBridge.ts:61) and VSCode IDE companion (packages/vscode-ide-companion/src/services/acpConnection.ts:132) own-spawn migrations follow as separate PRs. Suitable for IDE-driven manual git mv (1-2 hours) given the mechanical nature; review focus is import path correctness + git rename detection. PR 22b/2
23 feat(mcp): shared MCP transport/process pool ✅ MERGED via F2 (#4336). Real shared pool keyed by canonical workspace + server config hash + auth/env/runtime inputs; lifecycle/refcount tests. PR 22, PR 14
24 feat(security): client pairing revocation and PermissionMediator ✅ MERGED via F3 (#4335 — merge commit 8eeb51009, 2026-05-20 11:13Z). 4-strategy MultiClientPermissionMediator (first-responder / designated / consensus / local-only) on PR 22a's interface + audit ring + capability surface + SDK reducer state. Pair tokens / revocation API explicitly out-of-scope (filed for follow-up — F3 v1 keys consensus voter set on clientId registry + cancel sentinel cross-policy by design). PR 8, PR 22
25 refactor(output): daemon-compatible output sinks 🟢 CLI-internal cleanup, independent of revised F4 direction (2026-05-21) — chiga0 reply 2026-05-20. Abstract JSONL / stream-json / dual-output behind an OutputSink interface used only by native CLI (nonInteractiveCli.ts + ui/utils/export/formatters/jsonl.ts + ui/hooks/useGeminiStream.ts). Sink lives in packages/cli/src/output/ or a new CLI-only package; must not land under packages/sdk-typescript/src/daemon/ (browser-safe bundle invariant). Daemon SSE wire format unchanged. Could become a candidate basis for revised F4 sub-scope (c) "channel adapter end-to-end demo" if someone picks that path. ~1-2 days refactor whenever someone picks it up. PR 4
26 feat(adapters): flag-gated daemon client adapters (SUPERSEDED 2026-05-19 by web-first pivot — chiga0 #4296; web-only piece also dropped 2026-05-21) Original plan (TUI / channels / web-debug / IDE all migrate to DaemonSessionClient) is shelved. Native TUI / IDE / channels keep their direct ACP / runtime paths as defaults (2026-05-19 part still holds). The web-only piece was also dropped 2026-05-21 — daemon does not host web UI. chiga0 #4328 + #4353 continue as library-only React/SDK packages, independent of daemon. The 3 spike PRs #4199 / #4202 / #4203 stay as future-reference baselines. PR 3, PR 4, PR 25 — superseded
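
The shared pool from PR 23 / F2 is described as keyed entries with reference counting and a drain grace period. A minimal sketch of that lifecycle follows. It is an assumption-based illustration: `RefCountedPool` and its methods are hypothetical, keys are opaque strings supplied by the caller (the real key combines canonical workspace, server config hash, and auth/env/runtime inputs), time is passed in explicitly instead of using timers, and the max-idle hard cap is not modeled.

```typescript
interface PoolEntry<T> {
  resource: T;
  refs: number;
  idleSince: number | null; // timestamp when refs last dropped to zero
}

class RefCountedPool<T> {
  private readonly entries = new Map<string, PoolEntry<T>>();
  private readonly create: (key: string) => T;
  private readonly destroy: (resource: T) => void;
  private readonly drainGraceMs: number;

  constructor(
    create: (key: string) => T,
    destroy: (resource: T) => void,
    drainGraceMs: number,
  ) {
    this.create = create;
    this.destroy = destroy;
    this.drainGraceMs = drainGraceMs;
  }

  // Reuse the entry for this key if present, otherwise spawn one.
  acquire(key: string): T {
    let entry = this.entries.get(key);
    if (!entry) {
      entry = { resource: this.create(key), refs: 0, idleSince: null };
      this.entries.set(key, entry);
    }
    entry.refs++;
    entry.idleSince = null; // re-acquiring cancels a pending drain
    return entry.resource;
  }

  // Drop one reference; the entry starts draining when none remain.
  release(key: string, now: number): void {
    const entry = this.entries.get(key);
    if (!entry || entry.refs === 0) return;
    entry.refs--;
    if (entry.refs === 0) entry.idleSince = now;
  }

  // Destroy entries that have been idle for at least the grace period.
  sweep(now: number): number {
    let closed = 0;
    this.entries.forEach((entry, key) => {
      if (entry.idleSince !== null && now - entry.idleSince >= this.drainGraceMs) {
        this.destroy(entry.resource);
        this.entries.delete(key);
        closed++;
      }
    });
    return closed;
  }
}
```

The grace period is what lets a session that closes and quickly reopens reuse the same MCP child instead of paying for a respawn.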

Wave 6 — Release hardening and v0.16

⚠️ Superseded by F5 in the feature-cohesive plan above. Kept for traceability of the original 27/28/29/30/31 split.

| PR | Title | Contents | Depends on |
| --- | --- | --- | --- |
| 27 | docs(serve): alpha release docs and runtime locality | README/docs known limits, loopback noauth warning, daemon runtime locality, durability semantics, deployment notes. Can be folded into release PR if small. | before npm release |
| 28 | chore(release): publish qwen serve alpha | Cut npm release containing Mode B alpha and post-publish smoke test. | selected Wave 1/2/2.5/3 baseline |
| 29 | feat(security): production token defaults | NOT in v0.16: deferred to v0.16.x (2026-05-24 scope freeze). Original scope: auto-generate daemon token, SDK env/file fallback, token instance path keyed by host + port + workspace hash, stale cleanup. PR 15 (#4236, already on main) supplies the boot-time security gate (runQwenServe.ts:175 refuses non-loopback boot without token; loopback noauth default with stdout warning). PR 29's features are pure DX / ergonomics on top of that gate and only matter for multi-daemon / remote / pilot scenarios that the v0.16-alpha local-only audience does not trigger. SDK env fallback minimum (~10 LOC) folded into PR 27 instead. Re-evaluate when remote / multi-daemon / enterprise pilot is committed. | F3 ✅ (post-v0.16) |
| 30a | docs(deploy): local launch references | In v0.16-alpha scope (2026-05-24 freeze): systemd unit (Linux) + macOS launchd plist + nohup qwen serve & / tmux one-liner + daemon restart/crash semantics. systemd / launchd templates write Environment=QWEN_SERVER_TOKEN=... directly using the BYO-token pattern from PR 27 (no PR 29 dependency). ~150-250 LOC markdown, 0 code. Can run alongside PR 27 once PR 27 is in. | PR 27 |
| 30b | docs(deploy): containerized deployment references | NOT in v0.16: deferred to v0.16.x (2026-05-24 scope freeze). Dockerfile / docker-compose, k8s Deployment manifest, nginx/Caddy reverse-proxy with TLS termination, multi-instance token isolation (uses PR 29 instance-path keying), containerized secret management. Structurally depends on PR 29, which is itself deferred. Picks up when an enterprise pilot target is committed; without a real container deployment to validate against, the doc would rot. | PR 29 (post-v0.16) |
| 31 | chore(release): v0.16-alpha.0 cut | Scope (2026-05-24): text-only chat / coding + local-only deployment, no new protocol knobs, no new auth machinery. Cuts v0.16-alpha.0 after PR 27 + 28 + 30a are in. PR 29 (token defaults) and PR 30b (containers) ship as later v0.16.x patches when a remote / multi-daemon / pilot scenario is actually committed. | F3 ✅, PR 27, PR 28, PR 30a |
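
The "~10 LOC SDK env fallback" folded into PR 27 could look roughly like the helper below: prefer an explicitly passed token, otherwise fall back to QWEN_SERVER_TOKEN from the environment. This is a guess at the shape, not the SDK's code; `resolveServerToken` is a hypothetical name, and treating an empty string as "not set" is an assumption.

```typescript
// Hypothetical helper; only the QWEN_SERVER_TOKEN variable name is from the roadmap.
function resolveServerToken(
  explicit: string | undefined,
  env: Record<string, string | undefined>,
): string | undefined {
  // An explicitly configured token always wins.
  if (explicit !== undefined && explicit !== "") return explicit;

  // Otherwise use the bring-your-own token from the environment, if any.
  const fromEnv = env["QWEN_SERVER_TOKEN"];
  return fromEnv !== undefined && fromEnv !== "" ? fromEnv : undefined;
}
```

Taking the environment as a parameter keeps the helper testable; a Node caller would pass `process.env`. An `undefined` result corresponds to the loopback no-auth default described for the alpha audience.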

Critical dependency chain

capability registry
  -> DaemonSessionClient
  -> typed daemon events
  -> daemon-stamped client identity
  -> session-scoped permission route
  -> heartbeat/replay/lifecycle closure
  -> mutation-gating helper
  -> control-plane mutation routes
  -> bridge extraction
  -> real MCP pool + full PermissionMediator

Work that can run in parallel:

Route safety definitions

Read-only routes return daemon/session/workspace state and should not mutate runtime behavior. Examples: GET /capabilities, GET /workspace/mcp, GET /workspace/preflight, GET /workspace/env, GET /session/:id/context.

Mutation routes change daemon/session/workspace state or cause runtime action. Examples: POST /session/:id/prompt, POST /session/:id/cancel, POST /session/:id/approval-mode, POST /workspace/tools/:name/enable, POST /workspace/mcp/:server/restart, POST /workspace/file/edit, POST /session/:id/permission/:requestId.
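
The two definitions above can be expressed as a lookup that a single mutation-gating helper might consult. The sketch below uses only the example routes listed in this section; `classifyRoute`, `RouteClass`, and the pattern-matching helper are hypothetical, and a real registry would cover every route rather than these examples.

```typescript
type RouteClass = "read-only" | "mutation" | "unknown";

// Example routes copied from the definitions above.
const READ_ONLY_ROUTES = [
  "GET /capabilities",
  "GET /workspace/mcp",
  "GET /workspace/preflight",
  "GET /workspace/env",
  "GET /session/:id/context",
];

const MUTATION_ROUTES = [
  "POST /session/:id/prompt",
  "POST /session/:id/cancel",
  "POST /session/:id/approval-mode",
  "POST /workspace/tools/:name/enable",
  "POST /workspace/mcp/:server/restart",
  "POST /workspace/file/edit",
  "POST /session/:id/permission/:requestId",
];

// Match "METHOD /a/:param/b" against a concrete method and path.
function matchesPattern(pattern: string, method: string, path: string): boolean {
  const [patternMethod, patternPath] = pattern.split(" ");
  if (patternMethod !== method) return false;
  const want = patternPath.split("/");
  const got = path.split("/");
  if (want.length !== got.length) return false;
  // A ":name" segment accepts any non-empty value; others must match exactly.
  return want.every((seg, i) =>
    seg.startsWith(":") ? got[i].length > 0 : seg === got[i],
  );
}

function classifyRoute(method: string, path: string): RouteClass {
  if (READ_ONLY_ROUTES.some((p) => matchesPattern(p, method, path))) return "read-only";
  if (MUTATION_ROUTES.some((p) => matchesPattern(p, method, path))) return "mutation";
  return "unknown";
}
```

Treating unlisted routes as "unknown" rather than read-only is a deliberately conservative choice in this sketch: a gate built on it could default unknown routes to the stricter path.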

PR organization by feature area

Cross-cutting view of merged Mode B PRs grouped by what they deliver (complement to the Wave tables above which show the timeline / dependency view). Useful for new contributors orienting on "what's in Mode B today" without walking the wave-by-wave history.

1. Protocol foundation / capability negotiation (4 PRs)

2. Session lifecycle (Wave 2 + 2.5, 8 PRs)

3. Diagnostic / read-only status routes (Wave 3, 2 PRs)

4. MCP resource guardrails (Wave 3, 3 PRs)

5. Auth infrastructure (Wave 4 prereq + auth route, 3 PRs)

6. Workspace state mutation routes (Wave 4, 2 PRs)

7. File system service + file routes (Wave 4, 4 PRs)

8. Bridge package extraction (Wave 5 PR 22, 4 PRs)

9. Error classification (1 PR)

10. Client adapter spikes (3 PRs, chiga0)

11. CI / test alignment (1 PR; also #4214, #4279, #4291, #4306 cross-listed in their feature areas)
