proposal(serve): qwen --serve (Mode A) — TUI + in-process HTTP daemon, 3-phase plan (Stage 1.5b) #4156

@doudouOUC

Background

Stage 1's qwen serve (#3889) shipped a headless daemon with HTTP+SSE for remote clients; #4113 is consolidating the "1 daemon = 1 workspace" architecture. But today only Mode B (headless) exists: with a TUI process running, you cannot also run a daemon — so while a local user has the TUI open, mobile / IDE / IM bot / web clients cannot connect.

The design proposal at qwen-code-daemon-design §04 / §06 names this work Stage 1.5b — Mode A qwen --serve flag: attach an HttpServer inside a normal TUI process; the TUI acts as a super-client over the in-process EventBus, and remote clients connect via HTTP+SSE sharing the same daemon and same session pool.

⚠️ Strongly recommended to start this issue after #4113 merges. APIs introduced by #4113 are dependencies: boundWorkspace is required, canonicalizeWorkspace and WorkspaceMismatchError are exported. Sub-task A0 (extract a workspace-validation helper) is itself a small refactor of #4113's code; sub-task A1 (extract the in-memory channel helper) has no #4113 dependency and can start immediately.

Architecture comparison

| Dimension | Mode A (qwen --serve), this issue | Mode B (qwen serve), already shipped |
| --- | --- | --- |
| TUI | ✅ in-process super-client (full ~15 Ink dialogs + local-jsx slash commands) | ❌ headless |
| Agent | in-process QwenAgent (same process) | spawn qwen --acp child |
| Default auth | loopback no-token | bearer required |
| Default CORS | loopback only | configuration-driven |
| Shutdown | Ctrl+C / /quit → drain HTTP → exit | daemon SIGTERM → drain → close |
| TUI session count | 1 (additional remote sessions exist over HTTP but TUI cannot see them) | N/A |
| Crash isolation | ⚠️ daemon exception kills TUI (same process) | ✅ child isolated |
| MCP child count | TUI's set + daemon's per-session set (N×2 amplification) | daemon's per-session set only |

Key technical finding

createServeApp(opts, deps) (packages/cli/src/serve/server.ts:53) already supports deps.bridge?: HttpAcpBridge injection, and HttpAcpBridge (packages/cli/src/serve/httpAcpBridge.ts:87) is a transport-agnostic interface. So Mode A does not need to change the routing / SSE / EventBus layers — it only needs a createInProcessAcpBridge(agent) implementation that wraps the in-process QwenAgent as a bridge to inject. SSE / Last-Event-ID / 15s heartbeat / ring replay all reuse #3889 directly.

⚠️ Scope clarification: the design proposal's "TUI startup auto-POST /session, remote attaches to the same X" full semantics requires refactoring the TUI to flow through QwenAgent — that's a Phase D-class refactor (see Phase A detailed design §1 below). This issue's Phase A/B/C does NOT include it.

Three-phase plan overview (3 stacked PRs)

Each phase is reviewed and merged independently; B and C have no hard dependency on each other. The detailed Phase A design is in its own section below.

Phase A — Loopback-only minimal skeleton (~4.3 days, split into A0/A1/A2/A3 four stacked PRs)

qwen --serve [--workspace /path] [--serve-port N] brings up TUI + local daemon, with remote curl over loopback exercising the full prompt + SSE flow. Local debugging only — no auth, no graceful shutdown, TUI and remote sessions not shared, remote cannot call authenticate, daemon exception will kill TUI. Detailed design below.

Phase B — Remote bind + auth/CORS defaults (~1 day)

Goal: qwen --serve --serve-host 0.0.0.0 is reachable from LAN/container; bearer token required for non-loopback; loopback still no-token. Team / container alpha-quality.

Changes:

  1. Add --serve-host flag (default 127.0.0.1) and --serve-token flag + QWEN_SERVER_TOKEN env fallback
  2. Reuse packages/cli/src/serve/runQwenServe.ts:60-75 token trim+env logic
  3. Loopback detection via isLoopbackBind() (packages/cli/src/serve/loopbackBinds.ts); non-loopback without token → boot fails with same stderr message as Mode B's "Refusing to bind X without a bearer token"
  4. Auto-generate token written to ~/.qwen/serve/token, TUI banner shows the path (matches Mode B existing behavior)
  5. CORS / Host allowlist fully reuses packages/cli/src/serve/auth.ts three middlewares (bearerAuth / denyBrowserOriginCors / hostAllowlist); they are transparent to the in-process bridge
  6. Disallow --serve-port 0 + non-loopback host combination (OS ephemeral port + remote = operator can't tell what they exposed)
  7. Tests: non-loopback no-token → boot reject; with token → accept; wrong token → 401
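
The bind rules in items 1, 3 and 6 above can be sketched as a single pure function. This is a minimal sketch under stated assumptions, not the code in runQwenServe.ts: `isLoopbackHost` is a hypothetical stand-in for `isLoopbackBind()`, `checkBind` and `BindDecision` are illustrative names, and the port-0 error wording is invented. Only the "Refusing to bind X without a bearer token" message comes from the plan. Token auto-generation (item 4) is not modelled.

```typescript
// Hypothetical stand-in for isLoopbackBind() in loopbackBinds.ts;
// the real helper may recognise more address forms.
function isLoopbackHost(host: string): boolean {
  return host === 'localhost' || host === '::1' || host.startsWith('127.');
}

type BindDecision =
  | { ok: true; token: string | undefined }
  | { ok: false; error: string };

// Decide whether the daemon may bind, given the --serve-host / --serve-port /
// --serve-token values and the QWEN_SERVER_TOKEN environment fallback.
function checkBind(
  host: string,
  port: number,
  flagToken: string | undefined,
  envToken: string | undefined,
): BindDecision {
  // Assumption: the flag wins over the env var, and blank tokens count as absent.
  const token = [flagToken, envToken]
    .map((t) => t?.trim())
    .find((t) => t !== undefined && t.length > 0);

  if (isLoopbackHost(host)) {
    return { ok: true, token }; // loopback stays token-optional
  }
  if (!token) {
    return { ok: false, error: `Refusing to bind ${host} without a bearer token` };
  }
  if (port === 0) {
    // An OS-assigned port on a remote bind hides what is actually exposed.
    return { ok: false, error: `Refusing OS-assigned port 0 on non-loopback host ${host}` };
  }
  return { ok: true, token };
}
```

Keeping the decision pure makes the three Phase B test cases (no token rejected, token accepted, port 0 rejected) checkable without opening a socket.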

Phase B does NOT do: Mutual TLS / client certificates (Stage 2), token revocation API (chiga0 must-have #3)

Phase C — Lifecycle coordination (~1 day)

Goal: TUI exit (Ctrl+C / /quit / exception) drains HTTP first, then unmounts ink, then exits process; remote clients see clean close not TCP RST; remote disconnect doesn't affect TUI. Production-ready.

Changes:

  1. Centralized shutdown handle: TUI startup hook returns { shutdown(): Promise<void> } wrapping bridge.shutdown() + server.close() + 5s force-close (reuses runQwenServe.ts:15 SHUTDOWN_FORCE_CLOSE_MS)
  2. qwen.tsx top-level single SIGINT/SIGTERM handler, ordered: ① EventBus pushes daemon_shutting_down to all SSE → ② server.close() rejects new connections, awaits in-flight → ③ bridge.shutdown() → ④ ink unmount → ⑤ process.exit(0)
  3. /quit reuses the same shutdown() function — avoid two divergent exit paths
  4. Double-SIGINT force exit (matches #4113's BkUyD behavior): second Ctrl+C → skip drain, exit(130)
  5. Coordinate with ink's default SIGINT handler: ink's handler must be removed or wrapped at TUI startup, otherwise ink unmounts first and the HTTP drain never runs (the only likely pitfall in this phase; leave a 0.5d buffer)
  6. E2E: a) TUI + remote SSE connected, TUI process SIGINT → remote sees close event + 200 response within 5s, TUI exit code 0; b) remote kill -9 SSE client → TUI still running, /help still responds
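
Items 1 to 3 above describe one shared, ordered shutdown path with a force-close timeout. The sketch below illustrates that shape under stated assumptions: `createShutdownHandle` and `Step` are hypothetical names, the steps are injected rather than wired to a real EventBus, server or bridge, and the handle reports `'drained'` or `'forced'` for illustration where the plan specifies `Promise<void>`. Double-SIGINT handling and ink coordination are not covered.

```typescript
// Mirrors the constant the plan says lives in runQwenServe.ts.
const SHUTDOWN_FORCE_CLOSE_MS = 5_000;

type Step = () => Promise<void> | void;
type Outcome = 'drained' | 'forced';

// Runs the steps in order (notify SSE, close server, shut down bridge, ...)
// and gives up waiting once forceCloseMs has elapsed.
function createShutdownHandle(steps: Step[], forceCloseMs: number = SHUTDOWN_FORCE_CLOSE_MS) {
  let inFlight: Promise<Outcome> | undefined;

  return {
    shutdown(): Promise<Outcome> {
      // /quit and the signal handler share this path; repeat calls reuse it.
      if (inFlight) return inFlight;

      inFlight = (async (): Promise<Outcome> => {
        let timer: ReturnType<typeof setTimeout> | undefined;
        const drain = (async (): Promise<Outcome> => {
          for (const step of steps) await step();
          return 'drained';
        })();
        const force = new Promise<Outcome>((resolve) => {
          timer = setTimeout(() => resolve('forced'), forceCloseMs);
        });
        try {
          return await Promise.race([drain, force]);
        } finally {
          if (timer !== undefined) clearTimeout(timer);
        }
      })();
      return inFlight;
    },
  };
}
```

A caller would pass the drain steps in the order listed in item 2 and call `process.exit` only after `shutdown()` settles.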

Phase C does NOT do: Daemon hot-reload (restart daemon without restarting TUI), Persistent SSE replay across restart (Stage 2 durable scope)

Three-phase characteristics

| Property | Phase A | Phase B | Phase C |
| --- | --- | --- | --- |
| User-facing | Local debug only + remote independent sessions | Team / container alpha | Production stable |
| Shippable midstate? | ⚠️ internal dogfooding | ✅ alpha | ✅ stable |
| Dependencies | Strongly recommended after #4113 (sub-task A1 can start independently) | A | A recommended first but not hard-required |
| Main risks | runAcpAgent side-effect isolation, authenticate security gate, MCP resource amplification, ACP paired-channel wiring | Mostly reuse, low risk | ink + signal coordination potential traps |
| Independent PRs | ✅ (split into A0/A1/A2/A3 stacked PRs, see §8) | | |

Phase A detailed design

0. opencode reference and trade-offs

opencode's tui command (packages/opencode/src/cli/cmd/tui/thread.ts) implements TUI + server in one process:

| opencode approach | Borrow? | Reason |
| --- | --- | --- |
| TUI on main thread + server in Worker thread, communicating via Rpc.client | | qwen-code is single-threaded + ink; introducing Worker is an independent large refactor |
| process.argv.includes('--port') to distinguish internal vs external | ✅ in spirit | qwen-code simplifies to: listen only when --serve is passed + lazy import |
| createWorkerFetch / createEventSource wrap RPC as fetch | | Same process same thread does not need a fetch abstraction |
| withTimeout(client.call("shutdown"), 5000) 5s force | | Matches runQwenServe.ts:15 SHUTDOWN_FORCE_CLOSE_MS = 5_000 |
| tui() entry accepts url + fetch + events injection | ⚠️ partial | qwen-code's TUI entry is more deeply coupled today; refactoring to "injectable transport" is Phase D's work |

Core borrow: opencode proves "TUI + server in same process" is viable + 5s shutdown is a sensible constant. Key difference: opencode's TUI already abstracts data access behind fetch/events injection; qwen-code's TUI does not. So Phase A cannot simply copy "TUI as daemon client" — that's Phase D's work.

1. Honest scope statement ⚠️

Phase A actually achieves:

  • qwen --serve brings up TUI with HttpServer attached
  • ✅ Remote clients can curl /capabilities, POST /session to get an independent session in the same workspace, receive SSE events, send prompts
  • ✅ TUI's own session continues normally without regression

Known limitations (differences from Mode B; must be documented for users):

| Limitation | Cause | Impact | Mitigation |
| --- | --- | --- | --- |
| TUI and daemon sessions not shared | TUI does not flow through QwenAgent | Remote cannot see TUI's conversation; TUI cannot see remote's | Phase D refactor |
| Remote authenticate request rejected (returns ACP error) | QwenAgent.authenticate() calls clearCachedCredentialFile + refreshAuth → directly clears TUI's credentials | Remote cannot switch auth method; only the TUI's currently-authenticated method is usable | Phase D, daemon ↔ TUI credential mediator |
| Daemon exception kills TUI | Same process, no child isolation | uncaughtException / OOM kills the whole process | Phase A wraps try/catch at bridge layer; full fix requires Phase E in-process Stage 2e reverse refactor |
| MCP child count amplification (TUI's set + daemon's per-session set) | acpAgent.ts:618 newSessionConfig builds a new Config per session | N MCP servers × (1 + daemon session count) children; 10 MCP × 10 sessions ≈ 5-10 GB implicit memory | Phase A daemon --max-sessions defaults to 5; long-term via chiga0 finding #3 (MCP per-daemon shared state) |
| Process-level state contention (OAuth refresh / FileReadCache / quota) | TUI/daemon share singletons | Concurrent token refresh may race | Phase A documents it; later add process-level Mutex |
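
The MCP amplification row reduces to one line of arithmetic. The helper below is a hypothetical illustration of the N × (1 + daemon session count) formula stated in the table; `mcpChildCount` is not a function in the codebase, and the sketch counts processes only, without modelling per-child memory.

```typescript
// Children spawned = one set for the TUI plus one set per daemon session.
function mcpChildCount(mcpServers: number, daemonSessions: number): number {
  return mcpServers * (1 + daemonSessions);
}

// The table's example: 10 MCP servers with 10 daemon sessions.
const uncapped = mcpChildCount(10, 10);

// The same 10 servers under the proposed --max-sessions default of 5.
const capped = mcpChildCount(10, 5);

console.log({ uncapped, capped });
```

Under the formula, the default cap of 5 sessions roughly halves the worst-case child count for the table's 10-server example.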

"TUI ↔ daemon session unification" is broken out as Phase D (not in this issue's scope).

2. File-level change list

| # | File | Type | Description |
| --- | --- | --- | --- |
| 0 | extract from packages/cli/src/serve/runQwenServe.ts + export from packages/cli/src/serve/index.ts | modify ~50 lines (A0 sub-PR) | Extract a shared helper validateAndCanonicalizeWorkspace(rawPath: string): string: bundle runQwenServe.ts:121-160's path.isAbsolute / fs.statSync / isDirectory / ENOENT/EACCES/EPERM validation + canonicalizeWorkspace call; export for Mode A reuse. Rationale: #4113's server.ts docblock explicitly calls out "If a future entry point binds createServeApp directly to user input, it MUST replicate the runQwenServe validation (or call into a shared helper if one is extracted)". Mode A is exactly that scenario; extracting the helper is the clean path |
| 1 | packages/cli/src/index.ts or yargs middleware layer | modify | Add top-level --serve (boolean) + --serve-port (number, default 0, OS-assigned to avoid clashing with Mode B 4170) flags; mutex check against serve subcommand / --acp / -p. --workspace <path> flag reuses the same-named flag added by #4113 (do NOT introduce --serve-workspace) |
| 2 | packages/cli/src/serve/inMemoryChannel.ts | new ~80 lines | Extract the paired NDJSON channel pattern existing at httpAcpBridge.test.ts:151-154 as production code: createPairedChannel(): { clientStream, agentStream }, two TransformStream<Uint8Array, Uint8Array> pairs back-to-back + SDK's existing ndJsonStream (NOT PassThrough) |
| 3 | packages/cli/src/serve/inProcessAcpBridge.ts | new ~200 lines (an order of magnitude smaller than #4113's httpAcpBridge.ts ~2400 lines, because in-process has no child / spawn race / SIGTERM grace) | Implement HttpAcpBridge interface; inline a side-effect-free equivalent of runAcpAgent: ① create paired channel ② new QwenAgent(sharedConfig, sharedSettings, fabricatedArgv, agentSideConnection) ③ do NOT redirect console.log/info/debug ④ do NOT wrap process.stdout/stdin ⑤ do NOT register SIGINT/SIGTERM (left for Phase C) ⑥ do NOT runExitCleanup + process.exit on stream end ⑦ at the authenticate request forwarding point, return ACP error directly: "remote authenticate disabled in Mode A" ⑧ wrap sendPrompt / newSession calls in try/catch so thrown errors are returned as ACP errors instead of surfacing as uncaughtException |
| 4 | packages/cli/src/gemini.tsx around line ~705 | modify ~40 lines | In the if (config.isInteractive()) branch, before render(), if argv.serve: lazy import await import('./serve/inProcessDaemon.js') → call A0's extracted validateAndCanonicalizeWorkspace(argv.workspace ?? process.cwd()) to obtain boundWorkspace → start daemon → listen URL goes to writeStderrLine (not writeStdoutLine, to avoid polluting ink's stdout) |
| 5 | packages/cli/src/serve/inProcessAcpBridge.test.ts | new ~450 lines | Interface contract tests: all HttpAcpBridge methods aligned with httpAcpBridge.test.ts; reuse #4113's makeBridge() helper pattern (default boundWorkspace: WS_A, override explicitly when needed); add 5 cases: remote authenticate rejected, remote prompt-throw doesn't kill process, port conflict exit 1, lazy import verification, omitting cwd falls back to boundWorkspace |
| 6 | packages/cli/src/serve/serveFlag.test.ts | new ~200 lines | E2E: spawn qwen --serve --serve-port 0 subprocess, parse listen port from stderr, curl /capabilities and POST /session, verify --serve + serve / --acp / -p three mutex combinations all exit 1 |
| 7 | docs/users/qwen-serve.md | modify ~50 lines (add chapter to existing file, not new) | In the file already updated by #4113, add "Mode A: qwen --serve" chapter + Mode A vs Mode B selection table, document §1's 5 known limitations. Do NOT modify docs/developers/qwen-serve-protocol.md (HTTP protocol, ACP wire, workspace_mismatch body shape are identical for Mode A/B). docs/developers/examples/daemon-client-quickstart.md only needs a top-of-file note: "examples apply to both Mode A and Mode B; daemon startup differs but client integration is identical" |

3. Key technical decisions

Decision 1: in-process bridge uses paired channel + full ACP

| Dimension | paired channel + ACP (chosen) | direct method calls |
| --- | --- | --- |
| server.ts / eventBus.ts changes | 0 | large |
| ACP protocol evolution | auto-follow | dual maintenance |
| Implementation complexity | medium (~200 + 80 lines) | low (~150 lines) |
| Debuggability | good (ACP frames dumpable) | average |

The TransformStream pair + ndJsonStream pattern at httpAcpBridge.test.ts:151-154 already exists; just extract to inMemoryChannel.ts.

Simplification advantage (vs #4113's httpAcpBridge.ts): in-process has no child process / spawn race / child crash race / SIGTERM grace window, so it does not need ChannelInfo.isDying state, aliveChannels set, killAllSync path, or tanzhenxin's BkUyD double-SIGINT invariant. Estimated in-process bridge ~180-220 lines, an order of magnitude smaller than #4113's httpAcpBridge.ts (~2400 lines).
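
The back-to-back TransformStream wiring behind the paired channel can be sketched as follows. This is a minimal sketch assuming a runtime with the WHATWG streams globals (Node 18+ or a browser); it cross-wires the raw byte streams only and leaves out the SDK's `ndJsonStream` wrapping that the plan calls for. `echoFrame` is a hypothetical helper added purely to demonstrate one frame crossing the channel.

```typescript
// Two identity TransformStreams, cross-wired so that whatever one side
// writes becomes readable on the other side.
function createPairedChannel() {
  const clientToAgent = new TransformStream<Uint8Array, Uint8Array>();
  const agentToClient = new TransformStream<Uint8Array, Uint8Array>();
  return {
    clientStream: { writable: clientToAgent.writable, readable: agentToClient.readable },
    agentStream: { writable: agentToClient.writable, readable: clientToAgent.readable },
  };
}

// Demo: send one NDJSON frame from the client side and read it on the agent side.
async function echoFrame(frame: string): Promise<string> {
  const { clientStream, agentStream } = createPairedChannel();
  const writer = clientStream.writable.getWriter();
  const reader = agentStream.readable.getReader();

  // Start the write first but do not await it yet: an identity TransformStream
  // applies backpressure until the readable side is actually read.
  const pendingWrite = writer.write(new TextEncoder().encode(frame));
  const { value } = await reader.read();
  await pendingWrite;

  return new TextDecoder().decode(value);
}
```

Because both directions are plain web streams, the same pair can feed a client-side and an agent-side connection without either end knowing it is in-process.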

Decision 2: TUI ↔ daemon session relationship — decoupled

See §1. The daemon runs an independent QwenAgent in the same process; TUI session and daemon session are mutually invisible.

Decision 3: when to listen + boot failure + log direction

  • Position: gemini.tsx, after initializeApp(config, settings) (ensures settings/auth are ready), before render(<App ...>)
  • Failure handling: port conflict / bind error / workspace validation failure → TUI does not start, writeStderrLine prints error, process.exit(1)
  • Listen URL printing: only via writeStderrLine (stdout is occupied by ink; writing to stdout would pollute TUI rendering)
  • Default port: 0 (OS-assigned), to avoid clashing with Mode B's default 4170; users must check stderr for the actual port

Decision 4: EventBus instance sharing

createServeApp instantiates an EventBus internally for SSE. Phase A: in-process QwenAgent flows via paired channel → ACP sessionUpdate notification → bridge forwards to EventBus → SSE fan-out. Does not do "TUI directly subscribes to EventBus" — that's Phase D.

Decision 5: --serve mutex with existing flags

| Combination | Behavior |
| --- | --- |
| qwen --serve | Mode A: TUI + daemon |
| qwen --serve --workspace /path | Mode A, daemon binds to /path instead of cwd (legal, reuses #4113's --workspace semantics) |
| qwen --serve --continue / --resume X / --prompt-interactive "..." / --model X | Legal (still interactive TUI) |
| qwen serve | Mode B (existing, mutex) |
| qwen --serve serve | Boot fails |
| qwen --acp --serve | Boot fails (ACP uses stdio, mutex with daemon) |
| qwen -p "hello" --serve | Boot fails (headless prompt mode mutex with interactive daemon) |
| qwen --bare --serve | Boot fails (bare mode skips settings load, daemon needs settings) |
| qwen --input-format stream-json --serve | Boot fails (non-interactive mode) |
| qwen --json-schema "..." --serve | Boot fails (gemini.tsx:738 already errors for interactive mode, implicit mutex) |
| qwen --serve on non-TTY (nohup ... &) | Boot fails + hint "use qwen serve for headless" |

Validation lives near gemini.tsx:417, alongside --bare / --prompt-interactive mutex checks.
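
The mutex table for Decision 5 can be expressed as one validation function that returns the first conflict it finds. The sketch below is an illustration only: `ServeFlagInput`, `validateServeFlags`, the `subcommand` field and the error wording are assumptions rather than the real yargs shape, and only the "use qwen serve for headless" hint is taken from the table.

```typescript
// Assumed, simplified view of the parsed CLI arguments.
interface ServeFlagInput {
  serve?: boolean;
  subcommand?: string;   // e.g. 'serve' when `qwen serve` was invoked
  acp?: boolean;
  prompt?: string;       // -p
  bare?: boolean;
  inputFormat?: string;  // --input-format
  jsonSchema?: string;   // --json-schema
}

// Returns an error message for an illegal combination, or null when legal.
function validateServeFlags(argv: ServeFlagInput, isTTY: boolean): string | null {
  if (!argv.serve) return null; // nothing to validate without --serve
  if (argv.subcommand === 'serve') return '--serve cannot be combined with the serve subcommand';
  if (argv.acp) return '--serve cannot be combined with --acp (ACP uses stdio)';
  if (argv.prompt !== undefined) return '--serve cannot be combined with -p (headless prompt mode)';
  if (argv.bare) return '--serve cannot be combined with --bare (daemon needs settings)';
  if (argv.inputFormat === 'stream-json') return '--serve cannot be combined with --input-format stream-json';
  if (argv.jsonSchema !== undefined) return '--serve cannot be combined with --json-schema';
  if (!isTTY) return '--serve requires a TTY; use qwen serve for headless';
  return null;
}
```

A caller would print the returned message with writeStderrLine and exit 1, matching the boot-failure handling in Decision 3.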

Decision 6: Phase A forces loopback

--serve-host is not exposed in Phase A; hard-coded 127.0.0.1. Remote bind must come with token — that's Phase B's work.

Decision 7: boundWorkspace is boot-time snapshot + canonical

Mode A's boundWorkspace = validateAndCanonicalizeWorkspace(argv.workspace ?? process.cwd()) at --serve boot time, never changes. Even if TUI calls process.chdir() during runtime (triggered by some commands), the daemon still serves the original workspace. This matches #4113's 1-daemon-1-workspace semantics.

Critical: canonicalizeWorkspace is the idempotent helper already exported from httpAcpBridge.ts; handles symlinks + case-insensitive FS. Phase A must pre-canonicalize — otherwise /capabilities.workspaceCwd may drift from the bridge's internal canonical form (although #4113 has its own re-canonicalize fallback, aligning with runQwenServe's pattern is cleaner).

Empirical: grepping the entire codebase for process.chdir() in ui/commands/, services/, and acp-integration/ finds zero calls, so snapshot and dynamic behave identically today. But future sub-agent tools (EnterWorktree PR #4073) involve directory switching; in dynamic mode they would silently change boundWorkspace, and every connected client would start receiving workspace_mismatch without notification, an error that creeps in unnoticed. Snapshot is strictly superior to dynamic.
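
A boot-time snapshot along the lines of Decision 7 might look like the sketch below. It is not the helper from runQwenServe.ts: the error messages are invented, and `fs.realpathSync.native` is used as a stand-in for `canonicalizeWorkspace`, which the plan says also handles case-insensitive file systems. Using `process.cwd()` in place of `argv.workspace ?? process.cwd()` is likewise a simplification.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Validate a user-supplied workspace path and return its canonical form.
function validateAndCanonicalizeWorkspace(rawPath: string): string {
  if (!path.isAbsolute(rawPath)) {
    throw new Error(`workspace must be an absolute path: ${rawPath}`);
  }
  let stat: fs.Stats;
  try {
    stat = fs.statSync(rawPath);
  } catch (err) {
    // ENOENT / EACCES / EPERM all surface here with a readable code.
    const code = (err as { code?: string }).code ?? 'UNKNOWN';
    throw new Error(`cannot access workspace ${rawPath} (${code})`);
  }
  if (!stat.isDirectory()) {
    throw new Error(`workspace is not a directory: ${rawPath}`);
  }
  // Stand-in for canonicalizeWorkspace: resolves symlinks to one stable path.
  return fs.realpathSync.native(rawPath);
}

// Snapshot once at boot; later process.chdir() calls do not touch this value.
const boundWorkspace = validateAndCanonicalizeWorkspace(process.cwd());
```

Because the result is computed once and stored, canonicalizing it again yields the same string, which is the idempotence the plan relies on.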

Decision 8: reject remote authenticate request forwarding

The in-process bridge intercepts the authenticate method at the ACP request routing layer and returns ACP error directly: { code: -32601, message: "remote authenticate disabled in Mode A; use TUI /auth instead" }. Rationale per §1: QwenAgent.authenticate() clears TUI credentials.
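
The interception in Decision 8 amounts to a guard placed in front of request forwarding. The sketch below assumes simplified JSON-RPC 2.0 shapes; `JsonRpcRequest`, `JsonRpcErrorResponse` and `interceptRequest` are hypothetical names, while the error code and message are the ones quoted above. Code -32601 is the JSON-RPC 2.0 "Method not found" code.

```typescript
interface JsonRpcRequest {
  jsonrpc: '2.0';
  id: number | string;
  method: string;
  params?: unknown;
}

interface JsonRpcErrorResponse {
  jsonrpc: '2.0';
  id: number | string;
  error: { code: number; message: string };
}

const AUTHENTICATE_DISABLED = {
  code: -32601,
  message: 'remote authenticate disabled in Mode A; use TUI /auth instead',
};

// Returns an error response when the request must be blocked, or null when
// the caller should forward the request to the in-process agent unchanged.
function interceptRequest(req: JsonRpcRequest): JsonRpcErrorResponse | null {
  if (req.method !== 'authenticate') return null;
  return { jsonrpc: '2.0', id: req.id, error: { ...AUTHENTICATE_DISABLED } };
}
```

Answering at the routing layer means QwenAgent.authenticate() is never reached, so the TUI's credential file is never cleared by a remote caller.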

Decision 9: share config / settings, fabricate argv

The daemon's QwenAgent uses the TUI's already-constructed config + settings instances (avoiding settings drift), but argv is freshly built as a clean CliArgs (containing only daemon-relevant fields like cwd, mode), to avoid TUI's flags like --prompt-interactive poisoning daemon behavior.

InProcessBridgeOptions TS signature (aligned with #4113's BridgeOptions):

```typescript
interface InProcessBridgeOptions {
  boundWorkspace: string;           // required (matches #4113 BridgeOptions.boundWorkspace)
  sharedConfig: Config;             // required
  sharedSettings: LoadedSettings;   // required
  daemonArgv: CliArgs;              // required (fabricated)
  maxSessions?: number;             // optional, defaults to 5 (per D2 decision)
  // No maxConnections needed (in-process has no listener layer)
  // No sessionScope override needed (sticks with single default)
}

export function createInProcessAcpBridge(opts: InProcessBridgeOptions): HttpAcpBridge;
// Note: matches #4113's createHttpAcpBridge(opts: BridgeOptions), opts has no default
```

fabricateDaemonArgv(orig: CliArgs) field disposition (keep / drop):

| Fields | Disposition | Reason |
| --- | --- | --- |
| model, yolo, approvalMode, extensions, includeDirectories, mcpConfig, allowedMcpServerNames, telemetry*, openaiApiKey, openaiBaseUrl, proxy, authType, coreTools, excludeTools, disabledSlashCommands, allowedTools, maxSessionTurns, chatRecording, checkpointing, debug, screenReader, sandbox, sandboxImage, channel | ✅ keep | provider/tool/limit/UX config; daemon should have the same view |
| prompt, promptInteractive, query, bare, inputFormat, outputFormat, inputFile, jsonFd, jsonFile, jsonSchema, includePartialMessages, acp, experimentalAcp, experimentalLsp, openaiLogging, openaiLoggingDir, listExtensions, continue, resume, sessionId | ❌ clear (set undefined / false) | startup prompt / I/O shape / one-shot commands / TUI session resume should NOT poison daemon |
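
The keep/clear split above can be sketched on a reduced argument shape. `CliArgsSketch` is a hypothetical subset of the real CliArgs, and `CLEARED_FIELDS` lists only a few of the fields from the table's second row. The sketch clears fields by deleting them, which leaves them undefined; the table also allows setting booleans to false, and an allowlist built from the first row would be an equally valid reading.

```typescript
// Hypothetical subset of CliArgs, just enough to show both dispositions.
interface CliArgsSketch {
  model?: string;
  approvalMode?: string;
  debug?: boolean;
  prompt?: string;
  promptInteractive?: string;
  continue?: boolean;
  resume?: string;
  acp?: boolean;
  bare?: boolean;
}

// A few of the fields the table marks as "clear".
const CLEARED_FIELDS = [
  'prompt',
  'promptInteractive',
  'continue',
  'resume',
  'acp',
  'bare',
] as const;

// Copy the TUI's argv and strip everything that describes the TUI's own
// startup prompt, I/O shape, or session resume.
function fabricateDaemonArgv(orig: CliArgsSketch): CliArgsSketch {
  const daemonArgv: CliArgsSketch = { ...orig };
  for (const field of CLEARED_FIELDS) {
    delete daemonArgv[field];
  }
  return daemonArgv;
}
```

Working on a copy keeps the TUI's own argv intact, so the TUI session still honours flags such as --prompt-interactive while the daemon never sees them.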

Decision 10: lazy import

gemini.tsx's daemon startup logic is wrapped with await import('./serve/inProcessDaemon.js'). No --serve → no ESM cold-start cost (~50ms, matches Mode B's commands/serve.ts:106 lazy pattern).

4. Data flow diagram

┌──────────────────────────────────────────────────────────────────┐
│                     qwen --serve process                          │
│                                                                   │
│  ┌─────────────┐                                                  │
│  │  ink TUI    │ ─── existing path (no ACP / no daemon) ──→       │
│  │ (gemini.tsx)│      sendMessage / GeminiClient / ...            │
│  └─────────────┘                                                  │
│         │                                                          │
│         │ Mutually invisible to QwenAgent below (Phase A limit)    │
│         │ Shares: config / settings / OAuth / FileReadCache       │
│         ▼                                                          │
│  ┌─────────────────────────────────────────────────────┐         │
│  │  Daemon subsystem (only when --serve, lazy-imported) │         │
│  │  boundWorkspace = validateAndCanonicalizeWorkspace() │         │
│  │                                                       │         │
│  │  Express(createServeApp({ workspace: boundWorkspace }))│       │
│  │       │                                               │         │
│  │       │ deps.bridge = createInProcessAcpBridge(...)  │         │
│  │       ▼                                               │         │
│  │  HttpAcpBridge (in-process impl, ~200 lines)         │         │
│  │       │  · Rejects authenticate request               │         │
│  │       │  · try/catch around uncaughtException         │         │
│  │       │  · No console redirect / stdio wrap / sigreg  │         │
│  │       │  · No isDying / aliveChannels / killAllSync   │         │
│  │       │  paired in-memory NDJSON channel              │         │
│  │       │  (TransformStream pair + ndJsonStream)        │         │
│  │       ▼                                               │         │
│  │  ClientSideConnection ←───────→ AgentSideConnection  │         │
│  │                                       │               │         │
│  │                                       ▼               │         │
│  │                  new QwenAgent(sharedConfig,          │         │
│  │                                sharedSettings,        │         │
│  │                                fabricatedArgv,        │         │
│  │                                conn)                  │         │
│  │                  sessions: Map<id, S>                 │         │
│  └─────────────────────────────────────────────────────┘         │
│                          ▲                                         │
│                          │ HTTP + SSE on 127.0.0.1:N (N OS-assigned) │
└──────────────────────────│─────────────────────────────────────────┘
                           │
            ┌──────────────┴──────────────┐
            │                              │
       curl / SDK                    IDE / mobile / IM
       (remote client A)             (remote client B)
       gets sessionId Y              attach Y or get new Z
       (cwd may be omitted →         (cwd mismatch → 400
        falls back to boundWorkspace)  workspace_mismatch)

5. Test matrix

| Test | File | Purpose |
| --- | --- | --- |
| Paired channel bidirectional frame round-trip | inMemoryChannel.test.ts | NDJSON boundaries, backpressure, close propagation |
| inProcessAcpBridge 8-method contract | inProcessAcpBridge.test.ts | Same assertions as httpAcpBridge.test.ts (spawnOrAttach sendPrompt cancelSession subscribeEvents respondToPermission listWorkspaceSessions setSessionModel killSession), without spawning child; reuse #4113's makeBridge() helper pattern |
| Omitting cwd still creates session | inProcessAcpBridge.test.ts | POST /session body without cwd → gets sessionId, response workspaceCwd = boundWorkspace (verifies #4113-introduced fallback works for in-process bridge too) |
| Remote authenticate rejected (decision 8 + §1) | inProcessAcpBridge.test.ts | Remote sends ACP authenticate request → receives method-disabled error; TUI's current OAuth credentials file mtime unchanged |
| Remote prompt throwing exception doesn't kill process | inProcessAcpBridge.test.ts | Mock QwenAgent.newSession to throw Error('boom') → bridge converts to ACP error returned to client, process still alive |
| Lazy import verification | inProcessAcpBridge.test.ts | Without --serve startup → require.cache does not include inProcessAcpBridge.js |
| qwen --serve startup | serveFlag.test.ts | Subprocess runs qwen --serve --serve-port 0, parses listen port from stderr, curl /capabilities 200 |
| --workspace flag wiring | serveFlag.test.ts | qwen --serve --workspace /tmp/x → /capabilities.workspaceCwd === '/tmp/x' |
| --workspace boot validation | serveFlag.test.ts | --workspace /no/such/path / relative path / file-not-directory → exit 1 + friendly error (validates A0 helper invocation) |
| TUI + daemon coexistence regression | serveFlag.test.ts | TUI mode existing startup smoke test still passes with --serve enabled |
| --serve mutex validation | serveFlag.test.ts | --serve serve / --serve --acp / --serve -p / --serve --bare four combinations all exit 1 |
| Port already in use → exit 1 + error message includes port number | serveFlag.test.ts | After occupying a port, --serve --serve-port <that port> → stderr contains port number + exit 1 |
| Remote creates independent session | serveFlag.test.ts | Remote POST /session gets sessionId Y, GET /session/Y/events SSE works; TUI's session sessionId X does not appear in daemon's listWorkspaceSessions view (validates §1 limitation) |

6. Acceptance criteria

  • qwen --serve starts TUI; /help etc. work normally inside TUI
  • Same machine curl http://127.0.0.1:N/capabilities returns workspaceCwd = TUI's cwd (N read from stderr)
  • Same machine curl -X POST http://127.0.0.1:N/session -d '{}' returns sessionId (omitting cwd takes fallback)
  • Same machine curl -X POST http://127.0.0.1:N/session -d '{"cwd":"/wrong"}' returns 400 + code: workspace_mismatch
  • curl -N http://127.0.0.1:N/session/{id}/events streams ACP events
  • Remote POST /session/{id}/prompt triggers agent work and SSE returns streaming tokens
  • Remote ACP authenticate request → receives method-disabled error, TUI credentials untouched
  • qwen --serve --workspace /no/such/path → exit 1 + friendly error
  • TUI exit (Ctrl+C) exits the entire process (does not guarantee SSE clean close — that's Phase C)
  • Without --serve startup → ESM load graph does NOT include serve module
  • CI: vitest all green, new tests pass

7. Phase A explicit non-goals

| Not doing | Defer to |
| --- | --- |
| Remote bind / bearer token / CORS | Phase B |
| Ctrl+C graceful drain SSE / /quit coordination / double-Ctrl+C force exit | Phase C |
| TUI ↔ daemon session unification (incl. remote authenticate capability) | Phase D |
| process-level Mutex around OAuth refresh to resolve TUI/daemon concurrent races | Phase D |
| MCP child reuse between TUI and daemon | chiga0 finding #3 (MCP per-daemon shared state) or Stage 2 |
| uncaughtException process-level quarantine (distinguish daemon-origin vs TUI-origin) | Phase E (Stage 2e in-process reverse refactor) |
| Banner showing listen URL / current client count / token path | Phase B |
| Server cleanup on the --serve mode process.exit path (avoid listener leak) | Phase C |
| Telemetry root span merge with TUI (currently each independent root) | Phase D |
| mDNS discovery (opencode has it, useful for LAN deployment) | Stage 2c (in roadmap) |
| Port discovery file ~/.qwen/serve/instances/<pid>.json | Phase B |

8. PR split recommendation

A 4.3-day diff in a single PR is too much review burden; split into 4 stacked PRs:

| Sub-PR | Estimate | Content | Dependencies |
| --- | --- | --- | --- |
| A0 | ~0.3d | Extract validateAndCanonicalizeWorkspace shared helper, refactor runQwenServe.ts:121-160 to call it (this is a small refactor PR over #4113's code, does not introduce Mode A by itself) | #4113 must merge first |
| A1 | ~1d | Extract inMemoryChannel.ts + unit tests (pure refactor of httpAcpBridge.test.ts:151-154's existing pattern, zero behavior change) | None (can start immediately, even before #4113) |
| A2 | ~2d | Implement inProcessAcpBridge.ts + contract tests (incl. §3 decision 8 authenticate rejection, §3 decision 9 fabricate argv, §3 decision 10 lazy import prep; uses A0 helper, canonicalizeWorkspace, WorkspaceMismatchError, makeBridge() pattern) | A0 + A1 + #4113 |
| A3 | ~1d | --serve flag + --workspace wiring + gemini.tsx integration + mutex validation + e2e + docs | A2 |

A1 can start now (no #4113 dependency); A0/A2/A3 wait for #4113 merge before stacking, to avoid rebases.

Phase A workload estimate (revised)

| Sub-task | Estimate |
| --- | --- |
| A0 extract validateAndCanonicalizeWorkspace helper | 0.3d |
| A1 inMemoryChannel.ts + tests | 1d |
| A2 inProcessAcpBridge.ts (incl. inline side-effect-free runAcpAgent + authenticate rejection + uncaughtException wrapping + cwd fallback) | 2d |
| A3 --serve + --workspace flag + gemini.tsx lazy import + stderr printing + listen error handling + e2e + docs | 1d |
| Total | ~4.3d |

The main added costs are:

  • §3 decision 1 paired channel + ACP wiring: ~1.5d (in-process simplification saves back 0.5d)
  • §3 decision 8 authenticate rejection forwarding + tests: ~0.5d
  • §1 known limitations documentation + uncaughtException wrapping: ~0.5d
  • A0 helper extraction: ~0.3d

ROI is positive: the added cost buys zero invasion of the server layer, automatic tracking of ACP protocol changes, no security or resource amplification problems, and alignment with the #4113 workflow.


Routing decision point ⚠️

#3929 / #3930 / #3931 (chiga0) are an alternative route that overlaps in functionality but uses a different protocol: qwen remote-control + qwen --remote-control, WebSocket + stream-json. Both routes do "TUI as super-client + mobile/browser thin client". Before implementing 1.5b, design alignment is needed:

Out of scope for this issue

  • Phase D: TUI ↔ daemon session unification (make the TUI a real daemon client; requires refactoring the TUI to flow through QwenAgent; also unlocks remote authenticate / process-level credential mediator / telemetry span merge), separate issue
  • Phase E: Stage 2e in-process reverse refactor (process-level quarantine so daemon exceptions don't kill the TUI), separate issue
  • chiga0's 6 prereq refactors (PermissionMediator extraction, EventBus elevation, MCP per-daemon shared state, etc.), separate issue
  • Stage 1.5a remaining 9 must-haves (per-request sessionScope override, loadSession HTTP, heartbeat, token revocation, etc.), separate issue
  • Stage 1.5c daemon-side state CRUD (memory/mcp/agents/tools/approval-mode/init/auth 7 routes), separate issue

References

Daemon series upstream (@wenshao-led)

Design docs

Routing decision point

External reference

  • opencode reference implementation: packages/opencode/src/cli/cmd/tui/thread.ts (Worker + RPC fetch)

📖 简体中文版本(点击展开)

背景

Stage 1 的 qwen serve (#3889) 提交了 headless daemon,远端 client 走 HTTP+SSE 接入;#4113 在收口"1 daemon = 1 workspace"的架构。但目前只有 Mode B(headless):在带 TUI 的进程里没法同时跑 daemon,本地用户开着 TUI 时,手机 / IDE / IM bot / web 接不进来。

设计提案 qwen-code-daemon-design §04 / §06 把这一块定义为 Stage 1.5b — Mode A qwen --serve flag:在普通 TUI 进程里附挂一个 HttpServer,TUI 当 super-client 走 in-process EventBus,远端 client 通过 HTTP+SSE 共享同一 daemon、同一 session 集合。

⚠️ 强烈建议在 #4113 合入后开工本 issueboundWorkspace 必填、canonicalizeWorkspace / WorkspaceMismatchError 等 API 由 #4113 引入;A0 子任务(抽 workspace 校验 helper)正好是对 #4113 代码的小重构,A1 子任务(提取 in-memory channel)则与 #4113 无依赖、可立刻开工。

架构区分

维度 Mode A (qwen --serve) — 本 issue Mode B (qwen serve) — 已实现
TUI ✅ in-process super-client(保留全部 ~15 Ink dialog + local-jsx slash) ❌ headless
Agent in-process QwenAgent(同进程) spawn qwen --acp child
默认 auth loopback 免 token bearer 强制
默认 CORS loopback only 配置驱动
关闭 Ctrl+C / /quit → drain HTTP → 退 daemon SIGTERM → drain → close
TUI 绑 session 数 1(其余远端 session 在 HTTP 层成立但 TUI 看不到) N/A
Crash isolation ⚠️ daemon 异常会带崩 TUI(同进程) ✅ child 隔离
MCP child 数量 TUI 一份 + daemon 每 session 一份(N×2 放大) daemon 每 session 一份

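Key technical finding

createServeApp(opts, deps) (packages/cli/src/serve/server.ts:53) already supports deps.bridge?: HttpAcpBridge injection, and HttpAcpBridge (packages/cli/src/serve/httpAcpBridge.ts:87) is a transport-agnostic interface. So Mode A does not need to change the routing / SSE / EventBus layers; it only needs a createInProcessAcpBridge(agent) implementation that wraps the current process's QwenAgent as a bridge and injects it. SSE / Last-Event-ID / 15s heartbeat / ring replay are all reused directly from #3889.

⚠️ Scope clarification: the full semantics in the design proposal, "TUI startup = automatic POST /session, remote clients attach to the same X", require refactoring the TUI to go through the QwenAgent path. That is a Phase D-level refactor (see §1 of the Phase A detailed design below). Phases A/B/C of this issue do not include it.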

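Three-phase plan overview (split into 3 independent PRs)

Each phase can be reviewed and merged independently; there is no hard dependency between B and C. The Phase A detailed design is in its own section below.

Phase A: loopback-only minimal skeleton (~4.3 days, split into four stacked PRs A0/A1/A2/A3)

qwen --serve [--workspace /path] [--serve-port N] starts the TUI plus a local daemon, and a remote curl over loopback can run a complete prompt + SSE stream. Usable for local debugging only: no auth, no graceful exit, TUI and remote sessions are not shared, remote clients cannot call authenticate, and a daemon exception takes down the TUI. Detailed design below.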

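Phase B: remote binding + distinct auth/CORS defaults (~1 day)

Goal: qwen --serve --serve-host 0.0.0.0 is reachable from the LAN or from outside the container, with a mandatory bearer token; loopback still needs no token. Usable as an alpha within a team or inside containers.

Changes:

  1. --serve-host flag (default 127.0.0.1) and --serve-token flag, with the QWEN_SERVER_TOKEN env var as fallback
  2. Reuse the token trim + env parsing logic from packages/cli/src/serve/runQwenServe.ts:60-75
  3. Loopback detection calls isLoopbackBind() (packages/cli/src/serve/loopbackBinds.ts); non-loopback with no token → startup fails, and stderr prints the same "Refusing to bind X without a bearer token" message as Mode B
  4. Auto-generate a token and write it to ~/.qwen/serve/token; the TUI banner shows the path (matching Mode B's existing behavior)
  5. CORS / Host allowlist fully reuse the three middlewares in packages/cli/src/serve/auth.ts (bearerAuth / denyBrowserOriginCors / hostAllowlist); they are transparent to the in-process bridge
  6. Forbid combining --serve-port 0 with a non-loopback host (OS ephemeral port + remote exposure = the operator cannot tell where they are exposed)
  7. Tests: non-loopback without token → boot rejected; with token → accepted; wrong token → 401

Phase B does not cover: mutual TLS / client certificates (Stage 2), token revocation API (chiga0 must-have #3).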
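The bind rules in items 1, 3, and 6 can be sketched as one pure decision function. This is a minimal sketch under stated assumptions: decideBind, BindRequest, and isLoopbackHost are hypothetical names (the real isLoopbackBind() signature is not shown in this issue), and the token auto-generation from item 4 is left out. Only the refusal message text comes from the list above.

```typescript
// Hypothetical sketch of the Phase B bind policy; not the actual qwen-code code.
interface BindRequest {
  host: string;      // value of --serve-host
  port: number;      // value of --serve-port (0 = OS-assigned)
  token?: string;    // value of --serve-token, if given
  envToken?: string; // value of QWEN_SERVER_TOKEN, if set
}

type BindDecision =
  | { ok: true; host: string; port: number; token?: string }
  | { ok: false; error: string };

// Stand-in for isLoopbackBind(); assumed behavior, covers the common loopback spellings.
function isLoopbackHost(host: string): boolean {
  const h = host.trim().toLowerCase();
  return h === 'localhost' || h === '::1' || h === '[::1]' || /^127(\.\d{1,3}){3}$/.test(h);
}

function decideBind(req: BindRequest): BindDecision {
  // The flag wins over the env var; whitespace-only tokens count as absent.
  const token = (req.token ?? '').trim() || (req.envToken ?? '').trim() || undefined;
  const loopback = isLoopbackHost(req.host);

  if (!loopback && token === undefined) {
    return { ok: false, error: `Refusing to bind ${req.host} without a bearer token` };
  }
  if (!loopback && req.port === 0) {
    // An OS-assigned port on a remote-reachable host hides where the daemon is exposed.
    return { ok: false, error: `Refusing --serve-port 0 on non-loopback host ${req.host}` };
  }
  return { ok: true, host: req.host, port: req.port, token };
}
```

Keeping the policy free of I/O makes the three test cases in item 7 cheap to cover without opening a socket.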

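Phase C: lifecycle coordination (~1 day)

Goal: when the TUI exits (Ctrl+C / /quit / exception), drain HTTP first, then let ink unmount, and exit the process last; remote clients receive a clean close rather than a TCP RST when the TUI exits; a remote disconnect does not affect the TUI. Production-ready.

Changes:

  1. Centralized shutdown handle: the TUI startup hook returns { shutdown(): Promise<void> }, which wraps bridge.shutdown() + server.close() + a 5s forced close (reusing SHUTDOWN_FORCE_CLOSE_MS from runQwenServe.ts:15)
  2. A single top-level SIGINT/SIGTERM handler in qwen.tsx, in this order: ① EventBus pushes daemon_shutting_down to all SSE streams → ② server.close() refuses new connections and waits for in-flight requests → ③ bridge.shutdown() → ④ ink unmount → ⑤ process.exit(0)
  3. /quit reuses the same shutdown() function so the two exit paths do not diverge
  4. Double-SIGINT force quit (matching the #4113 BkUyD behavior): a second Ctrl+C skips the drain and calls exit(130) directly
  5. Coordination with ink's default SIGINT handler: ink's handler must be removed or wrapped at TUI startup, otherwise ink unmounts first and the HTTP drain never gets a chance to run (the only likely pitfall in this phase; keep a 0.5d buffer)
  6. E2E: a) with the TUI running and a remote SSE client connected, SIGINT the TUI process → the remote receives a close event + 200 response within 5s, and the TUI process exits with code 0; b) kill -9 the remote SSE client → the TUI keeps running and /help still responds

Phase C does not cover: daemon hot-reload (restarting the daemon without restarting the TUI), persistent SSE replay across restarts (Stage 2 durable scope).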
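The centralized handle from items 1 to 3 can be sketched as an ordered, idempotent drain raced against a force-close deadline. This is a minimal sketch: createShutdown and Step are hypothetical names, and the step bodies stand in for the EventBus push, server.close(), and bridge.shutdown(). Only the 5s constant and the ordering come from the list above.

```typescript
// Hypothetical sketch of the shared shutdown handle; not the actual qwen-code code.
type Step = { name: string; run: () => Promise<void> | void };

// The issue cites this constant from runQwenServe.ts:15.
const SHUTDOWN_FORCE_CLOSE_MS = 5_000;

function createShutdown(steps: Step[], forceMs: number = SHUTDOWN_FORCE_CLOSE_MS) {
  let inFlight: Promise<'clean' | 'forced'> | undefined;

  return function shutdown(): Promise<'clean' | 'forced'> {
    // Idempotent: Ctrl+C and /quit share one in-flight shutdown.
    if (inFlight) return inFlight;

    const drain = (async () => {
      for (const step of steps) {
        try {
          await step.run(); // steps run strictly in order
        } catch (err) {
          // One failing step must not block the rest of the teardown.
          console.error(`shutdown step "${step.name}" failed:`, err);
        }
      }
      return 'clean' as const;
    })();

    let timer: ReturnType<typeof setTimeout> | undefined;
    const deadline = new Promise<'forced'>((resolve) => {
      timer = setTimeout(() => resolve('forced'), forceMs);
    });

    // Whichever finishes first wins; the timer is cleared so it cannot keep the process alive.
    inFlight = Promise.race([drain, deadline]).finally(() => {
      if (timer) clearTimeout(timer);
    });
    return inFlight;
  };
}
```

A caller would await shutdown(), then unmount ink and call process.exit(0); a second SIGINT handler would skip the await and call process.exit(130), per item 4.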

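Key properties of the three phases

| Property | Phase A | Phase B | Phase C |
| --- | --- | --- | --- |
| User-facing surface | local debugging only + independent remote sessions | alpha within a team / inside containers | production stable |
| Shippable as an intermediate state? | ⚠️ internal dogfooding | ✅ alpha | ✅ stable |
| Dependencies | strongly recommended to start after #4113 (sub-task A1 can go first independently) | A | A (recommended first, but not a hard dependency) |
| Main risks | isolating runAcpAgent side effects, the authenticate safety gate, MCP resource amplification, ACP paired-channel wiring | mostly reuse, low risk | possible pitfalls in ink + signal coordination |
| Can be submitted as an independent PR | ✅ (split into four stacked PRs A0/A1/A2/A3, see §8) | ✅ | ✅ |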

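Phase A detailed design

0. Trade-offs taken from opencode

opencode's tui command (packages/opencode/src/cli/cmd/tui/thread.ts) runs the TUI and the server in one process:

| opencode approach | Adopted? | Reason |
| --- | --- | --- |
| TUI on the main thread + server in a Worker thread, communicating through Rpc.client | ❌ | qwen-code is single-threaded + ink; introducing a Worker is a separate large refactor |
| process.argv.includes('--port') to distinguish internal vs external | ✅ idea | qwen-code simplifies this to: listen + lazy import only when --serve is used |
| createWorkerFetch / createEventSource wrap RPC as fetch | ❌ | same process, same thread: no fetch abstraction needed |
| withTimeout(client.call("shutdown"), 5000), a 5s force timeout | ✅ | consistent with runQwenServe.ts:15 SHUTDOWN_FORCE_CLOSE_MS = 5_000 |
| tui() entry accepts three injected parameters: url + fetch + events | ⚠️ partial | qwen-code's current TUI entry is more tightly coupled; refactoring it is Phase D work |

Key takeaway: opencode shows that "TUI + server in one process" is feasible and that 5s is a suitable shutdown constant. Key difference: opencode's TUI already routes data access through injected fetch/events interfaces, while qwen-code's TUI has no such abstraction, so Phase A cannot copy the "TUI as a daemon client" approach; that is Phase D work.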

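1. Honest scope statement ⚠️

What Phase A can actually deliver:

  • ✅ qwen --serve starts the TUI with an HttpServer attached
  • ✅ Remote clients can curl /capabilities, POST /session to get an independent session in the same workspace, receive events over SSE, and send prompts
  • ✅ The TUI's own session keeps working with no regression

Known limitations (differences from Mode B; must be documented for users):

| Limitation | Cause | Impact | Mitigation |
| --- | --- | --- | --- |
| TUI and daemon sessions are not shared | the TUI does not go through the QwenAgent path | remote clients cannot see the TUI conversation; the TUI cannot see remote sessions | Phase D refactor |
| Remote authenticate requests are rejected (ACP error returned) | QwenAgent.authenticate() calls clearCachedCredentialFile + refreshAuth, which directly wipes the TUI's credentials | remote clients cannot switch auth method; they can only use the method the TUI is currently authenticated with | Phase D, a credential mediator between daemon and TUI |
| A daemon exception takes down the TUI | same process, no child isolation | uncaughtException / OOM exits the whole process | Phase A wraps the bridge layer in try/catch; a complete fix needs the Phase E in-process Stage 2e reverse refactor |
| MCP child count amplification (one set for the TUI + one set per daemon session) | acpAgent.ts:618 newSessionConfig creates a new Config per session | N MCP servers × (1 + number of daemon sessions) children; 10 MCP × 10 sessions ≈ 5-10 GB of implicit memory | Phase A lowers the daemon's --max-sessions default to 5; long term relies on chiga0 finding #3 (MCP per-daemon shared state) |
| Concurrent access to process-level state (OAuth refresh / FileReadCache / quota) | TUI and daemon share singletons | concurrent token refreshes may race | documented in Phase A; a process-level mutex is added later |

"TUI ↔ daemon session unification" is tracked separately as Phase D (out of scope for this issue).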

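2. File-level change list

| # | File | Type | Description |
| --- | --- | --- | --- |
| 0 | packages/cli/src/serve/runQwenServe.ts (extract) + packages/cli/src/serve/index.ts (export) | modify ~50 lines (sub-PR A0) | Extract a shared helper validateAndCanonicalizeWorkspace(rawPath: string): string that bundles the path.isAbsolute / fs.statSync / isDirectory / ENOENT/EACCES/EPERM checks from runQwenServe.ts:121-160 with the canonicalizeWorkspace call, and export it for Mode A to reuse. Rationale: the server.ts docblock in #4113 states explicitly "If a future entry point binds createServeApp directly to user input, it MUST replicate the runQwenServe validation (or call into a shared helper if one is extracted)". Mode A is exactly that case, and extracting the helper is the clean path |
| 1 | packages/cli/src/index.ts or the yargs middleware layer | modify | Add top-level flags --serve (boolean) and --serve-port (number, default 0: OS-assigned, to avoid clashing with Mode B's 4170); validate mutual exclusion with the serve subcommand / --acp / -p. The --workspace <path> flag reuses the flag of the same name already added by #4113 (do not add --serve-workspace) |
| 2 | packages/cli/src/serve/inMemoryChannel.ts | new, ~80 lines | Extract the paired NDJSON channel pattern that already exists at httpAcpBridge.test.ts:151-154 into production code: createPairedChannel(): { clientStream, agentStream }, built from a back-to-back pair of TransformStream<Uint8Array, Uint8Array> plus the SDK's existing ndJsonStream (not PassThrough) |
| 3 | packages/cli/src/serve/inProcessAcpBridge.ts | new, ~200 lines (an order of magnitude smaller than the ~2400-line httpAcpBridge.ts after #4113, because in-process has no child / spawn race / SIGTERM grace) | Implements the HttpAcpBridge interface and inlines a side-effect-free equivalent of runAcpAgent: ① create the paired channel ② new QwenAgent(sharedConfig, sharedSettings, fabricatedArgv, agentSideConnection) ③ do not redirect console.log/info/debug ④ do not wrap process.stdout/stdin ⑤ do not register SIGINT/SIGTERM (left to Phase C) ⑥ do not call runExitCleanup + process.exit at stream end ⑦ where the authenticate request would be forwarded, return the ACP error "remote authenticate disabled in Mode A" directly ⑧ wrap calls such as sendPrompt / newSession in try/catch to capture uncaughtException |
| 4 | packages/cli/src/gemini.tsx, near line ~705 | modify ~40 lines | Inside the if (config.isInteractive()) branch and before render(): if argv.serve is set → lazy import via await import('./serve/inProcessDaemon.js') → call the A0 helper validateAndCanonicalizeWorkspace(argv.workspace ?? process.cwd()) to obtain boundWorkspace → start the daemon → print the listen URL with writeStderrLine (not writeStdoutLine, to avoid polluting ink's stdout) |
| 5 | packages/cli/src/serve/inProcessAcpBridge.test.ts | new, ~450 lines | Interface contract tests: every HttpAcpBridge method matches httpAcpBridge.test.ts; reuses #4113's makeBridge() helper pattern (boundWorkspace: WS_A by default, overridden explicitly when a different binding is needed). Five new cases: remote authenticate is rejected, an exception in a remote prompt does not kill the process, port conflict exits 1, lazy import verification, omitted cwd falls back to boundWorkspace |
| 6 | packages/cli/src/serve/serveFlag.test.ts | new, ~200 lines | E2E: spawn a qwen --serve --serve-port 0 child process, parse the listen port from stderr, curl /capabilities, POST /session; the three mutually exclusive combinations --serve + serve / --acp / -p all exit 1 |
| 7 | docs/users/qwen-serve.md | modify ~50 lines (add a section to the same file, no new file) | In the same file that #4113 already updated, add a "Mode A: qwen --serve" section plus a Mode A vs Mode B selection table, and state the 5 known limitations from §1 explicitly. Do not change docs/developers/qwen-serve-protocol.md (the HTTP protocol, ACP wire format, and workspace_mismatch body shape are identical for Mode A/B). At the top of docs/developers/examples/daemon-client-quickstart.md, one added sentence is enough: "The examples apply to both Mode A and Mode B; the daemon is started differently, but clients connect in exactly the same way" |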

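3. Key technical decisions

Decision 1: the in-process bridge uses a paired channel + full ACP

| Dimension | paired channel + ACP (chosen) | direct method calls |
| --- | --- | --- |
| server.ts / eventBus.ts changes | 0 | |
| ACP protocol evolution | followed automatically | maintained twice |
| Implementation complexity | medium (~200 + 80 lines) | low (~150 lines) |
| Debuggability | good (ACP frames can be dumped) | average |

The TransformStream pair + ndJsonStream pattern already exists at httpAcpBridge.test.ts:151-154; it only needs to be extracted into inMemoryChannel.ts.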
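The back-to-back wiring can be sketched with Node's built-in Web Streams. This is a minimal sketch, assuming the shape createPairedChannel(): { clientStream, agentStream } named in the file list; the ByteDuplex type is a hypothetical name, and the SDK's ndJsonStream wrapping is deliberately left out because its exact signature is not shown in this issue.

```typescript
import { TransformStream, ReadableStream, WritableStream } from 'node:stream/web';

// One end of the in-memory channel: bytes written here come out on the other end.
// ByteDuplex is a hypothetical name for illustration.
interface ByteDuplex {
  readable: ReadableStream<Uint8Array>;
  writable: WritableStream<Uint8Array>;
}

function createPairedChannel(): { clientStream: ByteDuplex; agentStream: ByteDuplex } {
  // Two identity transforms, one per direction, cross-wired so that each side's
  // writable feeds the opposite side's readable.
  const clientToAgent = new TransformStream<Uint8Array, Uint8Array>();
  const agentToClient = new TransformStream<Uint8Array, Uint8Array>();
  return {
    clientStream: { writable: clientToAgent.writable, readable: agentToClient.readable },
    agentStream: { writable: agentToClient.writable, readable: clientToAgent.readable },
  };
}

// Demo: one NDJSON frame travels from the client end to the agent end.
const demo = createPairedChannel();
const demoReader = demo.agentStream.readable.getReader();
const demoWriter = demo.clientStream.writable.getWriter();
void demoWriter.write(new TextEncoder().encode('{"jsonrpc":"2.0","id":1,"method":"initialize"}\n'));
void demoReader.read().then(({ value }) => {
  console.log('agent side received:', new TextDecoder().decode(value).trim());
});
```

Because identity TransformStream instances apply backpressure, a slow reader on one side naturally stalls the writer on the other, which is one of the properties the planned inMemoryChannel.test.ts is meant to cover.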

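Simplification (vs httpAcpBridge.ts after #4113): in-process has no child process, spawn race, child crash race, or SIGTERM grace window, so it does not need the ChannelInfo.isDying state, the aliveChannels set, the killAllSync path, or tanzhenxin's BkUyD double-SIGINT invariant. The in-process bridge is estimated at ~180-220 lines, an order of magnitude smaller than httpAcpBridge.ts after #4113 (~2400 lines).

Decision 2: TUI and daemon sessions are decoupled

See §1. The daemon runs its own QwenAgent in the same process; TUI sessions and daemon sessions are invisible to each other.

Decision 3: when to listen, startup failure, and where logs go

  • Location: in gemini.tsx, after initializeApp(config, settings) (so settings/auth are ready) and before render(<App ...>)
  • Failure handling: port conflict / bind error / workspace validation failure → the TUI does not start; print the error with writeStderrLine, then process.exit(1)
  • Listen URL: printed only with writeStderrLine (stdout belongs to ink; writing to stdout would corrupt TUI rendering)
  • Default port: 0 (OS-assigned), to avoid clashing with Mode B's default 4170; users must read stderr to learn the actual port

Decision 4: EventBus instance sharing

createServeApp creates a new EventBus internally for SSE. In Phase A, the in-process QwenAgent goes through the paired channel → ACP sessionUpdate notification → the bridge forwards to the EventBus → SSE fan-out. "TUI subscribes to the EventBus directly" is not done here; that is Phase D.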

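Decision 5: mutual exclusion between --serve and existing flags

| Combination | Behavior |
| --- | --- |
| qwen --serve | Mode A: TUI + daemon |
| qwen --serve --workspace /path | Mode A, daemon bound to /path instead of cwd (valid; reuses #4113's --workspace semantics) |
| qwen --serve --continue / --resume X / --prompt-interactive "..." / --model X | valid (still an interactive TUI) |
| qwen serve | Mode B (existing, mutually exclusive) |
| qwen --serve serve | startup fails |
| qwen --acp --serve | startup fails (ACP uses stdio, which conflicts with the daemon) |
| qwen -p "hello" --serve | startup fails (headless prompt mode conflicts with the interactive daemon) |
| qwen --bare --serve | startup fails (bare mode skips settings loading; the daemon needs settings) |
| qwen --input-format stream-json --serve | startup fails (non-interactive mode) |
| qwen --json-schema "..." --serve | startup fails (gemini.tsx:738 already errors in interactive mode, so this is implicitly exclusive) |
| qwen --serve on a non-TTY (nohup ... &) | startup fails, with the hint "use qwen serve for headless" |

The validation lives near gemini.tsx:417, alongside the --bare / --prompt-interactive mutual-exclusion checks.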
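The table above can be sketched as a single validation function that returns an error message for every rejected combination. This is a minimal sketch: ServeFlagArgs and validateServeFlag are hypothetical names, the field names only approximate the real CliArgs, and the message wording other than the "use qwen serve for headless" hint is an assumption.

```typescript
// Hypothetical flag shape for illustration; the real CliArgs has many more fields.
interface ServeFlagArgs {
  serve?: boolean;
  subcommand?: string;   // 'serve' when invoked as `qwen serve`
  acp?: boolean;
  prompt?: string;       // -p
  bare?: boolean;
  inputFormat?: string;
  jsonSchema?: string;
}

// Returns an error message when --serve is combined with something it excludes,
// or null when the combination is valid (or --serve is absent).
function validateServeFlag(args: ServeFlagArgs, isTTY: boolean): string | null {
  if (!args.serve) return null;

  const conflicts: Array<[boolean, string]> = [
    [args.subcommand === 'serve', 'the serve subcommand (Mode B)'],
    [args.acp === true, '--acp (ACP uses stdio)'],
    [args.prompt !== undefined, '-p (headless prompt mode)'],
    [args.bare === true, '--bare (the daemon needs settings)'],
    [args.inputFormat === 'stream-json', '--input-format stream-json (non-interactive)'],
    [args.jsonSchema !== undefined, '--json-schema (non-interactive)'],
  ];
  for (const [hit, what] of conflicts) {
    if (hit) return `--serve cannot be combined with ${what}`;
  }

  // Mode A needs an interactive terminal; headless use belongs to Mode B.
  if (!isTTY) return '--serve requires a TTY; use qwen serve for headless';
  return null;
}
```

Flags that the table marks as valid (--continue, --resume, --prompt-interactive, --model, --workspace) are simply not listed, so they pass through.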

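Decision 6: Phase A forces loopback

--serve-host is not exposed in Phase A; the host is hard-coded to 127.0.0.1. Remote binding must come with a token, which is Phase B work.

Decision 7: boundWorkspace is a canonical snapshot taken at boot

In Mode A, boundWorkspace = validateAndCanonicalizeWorkspace(argv.workspace ?? process.cwd()), evaluated once at --serve boot time and never changed afterwards. Even if process.chdir() is called while the TUI runs (triggered by some command), the daemon keeps serving the original workspace. This matches #4113's 1-daemon-1-workspace semantics.

Key point: canonicalizeWorkspace is an idempotent helper already exported from httpAcpBridge.ts; it handles symlinks and case-insensitive file systems. Phase A must pre-canonicalize, otherwise /capabilities.workspaceCwd can drift from the canonical form held inside the bridge (#4113 has its own fallback that canonicalizes again, but matching the runQwenServe pattern is cleaner).

Evidence: grepping the codebase for process.chdir() finds zero calls under ui/, commands/, services/, and acp-integration/, so snapshot and dynamic behave identically today. But future sub-agent tools (EnterWorktree, PR #4073) involve changing directories; in dynamic mode that would silently change boundWorkspace, and every connected client would start getting workspace_mismatch without notification (a sliding error). Snapshot is strictly better than dynamic.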
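A helper with the shape described for sub-task A0 might look like the sketch below. It is an illustration under stated assumptions, not the code in runQwenServe.ts:121-160: the error messages are invented, and fs.realpathSync.native stands in for canonicalizeWorkspace, whose handling of case-insensitive file systems is not reproduced here.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Sketch of the A0 helper: validate once at boot, return a canonical absolute path.
function validateAndCanonicalizeWorkspace(rawPath: string): string {
  if (!path.isAbsolute(rawPath)) {
    throw new Error(`workspace must be an absolute path: ${rawPath}`);
  }

  let stat: fs.Stats;
  try {
    stat = fs.statSync(rawPath);
  } catch (err) {
    const code = (err as NodeJS.ErrnoException).code;
    if (code === 'ENOENT') throw new Error(`workspace does not exist: ${rawPath}`);
    if (code === 'EACCES' || code === 'EPERM') {
      throw new Error(`workspace is not accessible: ${rawPath}`);
    }
    throw err;
  }
  if (!stat.isDirectory()) {
    throw new Error(`workspace is not a directory: ${rawPath}`);
  }

  // Assumed stand-in for canonicalizeWorkspace: resolves symlinks, and is idempotent.
  return fs.realpathSync.native(rawPath);
}

// Snapshot semantics: computed once, then passed around as a plain string.
const boundWorkspaceSnapshot = validateAndCanonicalizeWorkspace(process.cwd());
console.log('bound workspace:', boundWorkspaceSnapshot);
```

Because the result is an immutable string captured at boot, a later process.chdir() has no way to alter what the daemon serves.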

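Decision 8: remote authenticate requests are rejected, not forwarded

The in-process bridge intercepts the authenticate method at the ACP request routing layer and returns the ACP error { code: -32601, message: "remote authenticate disabled in Mode A; use TUI /auth instead" } directly. Rationale in §1: QwenAgent.authenticate() would wipe the TUI's credentials.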
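The interception can be sketched as a small gate placed in front of the forwarding path. This is a minimal sketch: interceptRemoteRequest and the two interfaces are hypothetical names modeling generic JSON-RPC 2.0 frames, not the bridge's real types. The error code and message are the ones quoted above; -32601 is the JSON-RPC "method not found" code.

```typescript
// Generic JSON-RPC 2.0 shapes, reduced to what the gate needs (hypothetical names).
interface JsonRpcRequest {
  jsonrpc: '2.0';
  id: number | string;
  method: string;
  params?: unknown;
}

interface JsonRpcErrorResponse {
  jsonrpc: '2.0';
  id: number | string;
  error: { code: number; message: string };
}

// Methods that must never reach the in-process agent from a remote client.
const BLOCKED_REMOTE_METHODS = new Set<string>(['authenticate']);

// Returns an error response to send back immediately, or null to forward the request.
function interceptRemoteRequest(req: JsonRpcRequest): JsonRpcErrorResponse | null {
  if (!BLOCKED_REMOTE_METHODS.has(req.method)) return null;
  return {
    jsonrpc: '2.0',
    id: req.id, // echo the request id so the client can correlate the failure
    error: {
      code: -32601,
      message: 'remote authenticate disabled in Mode A; use TUI /auth instead',
    },
  };
}
```

Since the request never reaches QwenAgent.authenticate(), the TUI's credential file is untouched, which is what the mtime assertion in the test matrix checks.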

Decision 9: config / settings are shared, argv is fabricated

The daemon's QwenAgent reuses the config + settings instances the TUI has already constructed (avoiding settings drift), but argv is a freshly built, clean CliArgs (containing only daemon-relevant fields such as cwd and mode), so that TUI flags such as --prompt-interactive cannot mislead daemon behavior.

InProcessBridgeOptions TS signature (aligned with BridgeOptions after #4113):

interface InProcessBridgeOptions {
  boundWorkspace: string;           // required (matches #4113's BridgeOptions.boundWorkspace)
  sharedConfig: Config;             // required
  sharedSettings: LoadedSettings;   // required
  daemonArgv: CliArgs;              // required (fabricated)
  maxSessions?: number;             // optional, defaults to 5 (decision D2)
  // maxConnections is not needed (in-process has no listener layer)
  // sessionScope override is not needed (keeps the default, single)
}

export function createInProcessAcpBridge(opts: InProcessBridgeOptions): HttpAcpBridge;
// Note: as with createHttpAcpBridge(opts: BridgeOptions) after #4113, opts has no default value

fabricateDaemonArgv(orig: CliArgs) checklist (keep / drop):

| Fields | Handling | Reason |
| --- | --- | --- |
| model, yolo, approvalMode, extensions, includeDirectories, mcpConfig, allowedMcpServerNames, telemetry*, openaiApiKey, openaiBaseUrl, proxy, authType, coreTools, excludeTools, disabledSlashCommands, allowedTools, maxSessionTurns, chatRecording, checkpointing, debug, screenReader, sandbox, sandboxImage, channel | ✅ keep | provider / tool / limit / UX configuration; the daemon should see the same view |
| prompt, promptInteractive, query, bare, inputFormat, outputFormat, inputFile, jsonFd, jsonFile, jsonSchema, includePartialMessages, acp, experimentalAcp, experimentalLsp, openaiLogging, openaiLoggingDir, listExtensions, continue, resume, sessionId | ❌ clear (set to undefined / false) | startup prompt / I/O shape / one-shot commands / TUI session resumption and the like must not leak into the daemon |
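The keep/clear split can be sketched as a copy followed by explicit overrides. This is a minimal sketch on a reduced, hypothetical CliArgsSubset that carries only a handful of the fields listed above; the real CliArgs type and the complete field list live in the codebase.

```typescript
// Reduced, hypothetical stand-in for CliArgs: a few kept fields and a few cleared ones.
interface CliArgsSubset {
  // kept: provider / tool / UX configuration
  model?: string;
  approvalMode?: string;
  debug?: boolean;
  // cleared: startup prompt, I/O shape, one-shot and session-resume flags
  prompt?: string;
  promptInteractive?: string;
  outputFormat?: string;
  acp?: boolean;
  bare?: boolean;
  continue?: boolean;
  resume?: string;
  sessionId?: string;
}

function fabricateDaemonArgv(orig: CliArgsSubset): CliArgsSubset {
  return {
    ...orig, // start from the TUI's view so kept fields match exactly
    prompt: undefined,
    promptInteractive: undefined,
    outputFormat: undefined,
    acp: false,
    bare: false,
    continue: false,
    resume: undefined,
    sessionId: undefined,
  };
}
```

The spread creates a new object, so the TUI's own argv is never mutated. One trade-off of this copy-then-clear style is that a newly added CliArgs field is kept by default; an allowlist would invert that default.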

Decision 10: lazy import

In gemini.tsx the daemon startup logic is wrapped in await import('./serve/inProcessDaemon.js'), so runs without --serve do not pay the ESM cold-start cost (about 50ms; this mirrors Mode B's lazy pattern at commands/serve.ts:106).
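The gating can be sketched with the module loader passed in as a function, which also makes the "not loaded without --serve" property directly testable. This is a minimal sketch: maybeStartDaemon, DaemonModule, and the startInProcessDaemon export name are all hypothetical; in the real entry point the loader would be () => import('./serve/inProcessDaemon.js').

```typescript
// Hypothetical shape of what the lazily loaded module exports.
type DaemonModule = {
  startInProcessDaemon: (workspace: string) => Promise<{ url: string }>;
};

async function maybeStartDaemon(
  argv: { serve?: boolean; workspace?: string },
  loadDaemon: () => Promise<DaemonModule>,
): Promise<{ url: string } | undefined> {
  // Without --serve the loader is never invoked, so the serve module graph stays unloaded.
  if (!argv.serve) return undefined;

  const { startInProcessDaemon } = await loadDaemon();
  return startInProcessDaemon(argv.workspace ?? process.cwd());
}
```

Injecting the loader keeps the dynamic import() at the call site, where bundlers and the ESM loader can still see the literal specifier.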

4. Data flow diagram

┌────────────────────────────────────────────────────────────────────┐
│                        qwen --serve process                        │
│                                                                    │
│  ┌─────────────┐                                                   │
│  │  ink TUI    │ ── existing path (no ACP, bypasses daemon) ──→    │
│  │ (gemini.tsx)│     sendMessage / GeminiClient / ...              │
│  └─────────────┘                                                   │
│         │                                                          │
│         │ invisible to the QwenAgent below (Phase A limitation)    │
│         │ shared: config / settings / OAuth / FileReadCache        │
│         ▼                                                          │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │ Daemon subsystem (only with --serve, lazy imported)          │  │
│  │ boundWorkspace = validateAndCanonicalizeWorkspace()          │  │
│  │                                                              │  │
│  │ Express(createServeApp({ workspace: boundWorkspace }))       │  │
│  │      │                                                       │  │
│  │      │ deps.bridge = createInProcessAcpBridge(...)           │  │
│  │      ▼                                                       │  │
│  │ HttpAcpBridge (in-process implementation, ~200 lines)        │  │
│  │      │  · rejects authenticate requests                      │  │
│  │      │  · try/catch captures uncaughtException               │  │
│  │      │  · no console redirect / no stdio wrap / no signals   │  │
│  │      │  · no isDying / aliveChannels / killAllSync           │  │
│  │      │  paired in-memory NDJSON channel                      │  │
│  │      │  (TransformStream pair + ndJsonStream)                │  │
│  │      ▼                                                       │  │
│  │ ClientSideConnection ←───────→ AgentSideConnection           │  │
│  │                                        │                     │  │
│  │                                        ▼                     │  │
│  │                 new QwenAgent(sharedConfig,                  │  │
│  │                               sharedSettings,                │  │
│  │                               fabricatedArgv,                │  │
│  │                               conn)                          │  │
│  │                 sessions: Map<id, S>                         │  │
│  └──────────────────────────────────────────────────────────────┘  │
│                          ▲                                         │
│                          │ HTTP+SSE on 127.0.0.1:N (OS-assigned N) │
└──────────────────────────│─────────────────────────────────────────┘
                           │
            ┌──────────────┴──────────────┐
            │                             │
       curl / SDK                    IDE / mobile / IM
       (remote client A)             (remote client B)
       gets sessionId Y              attaches to Y or gets a new Z
       (cwd may be omitted →         (cwd mismatch → 400
        falls back to                 workspace_mismatch)
        boundWorkspace)

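5. Test matrix

| Test | File | Purpose |
| --- | --- | --- |
| Bidirectional frame exchange over the paired channel | inMemoryChannel.test.ts | NDJSON boundaries, backpressure, close propagation |
| Contract for the 8 inProcessAcpBridge methods | inProcessAcpBridge.test.ts | Same assertions as httpAcpBridge.test.ts (spawnOrAttach, sendPrompt, cancelSession, subscribeEvents, respondToPermission, listWorkspaceSessions, setSessionModel, killSession), without spawning a child; reuses #4113's makeBridge() helper pattern |
| Session creation with cwd omitted | inProcessAcpBridge.test.ts | POST /session body without cwd → returns a sessionId, and the response's workspaceCwd = boundWorkspace (verifies that the fallback introduced in #4113 also works for the in-process bridge) |
| Remote authenticate is rejected (Decision 8 + §1) | inProcessAcpBridge.test.ts | A remote client sends an ACP authenticate request → receives a method-disabled error; the mtime of the TUI's current OAuth credentials file is unchanged |
| An exception in a remote prompt does not kill the process | inProcessAcpBridge.test.ts | mock QwenAgent.newSession to throw Error('boom') → the bridge converts it into an ACP error returned to the client, and the process stays alive |
| Lazy import verification | inProcessAcpBridge.test.ts | Start without --serve → require.cache does not contain inProcessAcpBridge.js |
| qwen --serve startup | serveFlag.test.ts | Run qwen --serve --serve-port 0 in a child process, parse the listen port from stderr, curl /capabilities returns 200 |
| --workspace flag wiring | serveFlag.test.ts | qwen --serve --workspace /tmp/x → /capabilities.workspaceCwd === '/tmp/x' |
| --workspace boot validation | serveFlag.test.ts | --workspace /no/such/path / a relative path / a file instead of a directory → exit 1 + a friendly error (verifies the A0 helper is called correctly) |
| TUI + daemon coexist without regression | serveFlag.test.ts | The existing TUI startup smoke test still passes with --serve enabled |
| --serve mutual-exclusion validation | serveFlag.test.ts | The four combinations --serve serve / --serve --acp / --serve -p / --serve --bare all exit 1 |
| Port already in use → exit 1 + error message includes the port number | serveFlag.test.ts | Occupy a port, then run --serve --serve-port <that port> → stderr contains the port number + exit 1 |
| Remote client creates an independent session | serveFlag.test.ts | Remote POST /session returns sessionId Y and GET /session/Y/events streams over SSE; the TUI's session X does not appear in the daemon's listWorkspaceSessions view (verifies the §1 limitation) |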

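6. Acceptance criteria

  • qwen --serve starts the TUI, and commands such as /help work normally inside it
  • On the same machine, curl http://127.0.0.1:N/capabilities returns workspaceCwd = the TUI's cwd (N is read from stderr)
  • On the same machine, curl -X POST http://127.0.0.1:N/session -d '{}' returns a sessionId (an omitted cwd uses the fallback)
  • On the same machine, curl -X POST http://127.0.0.1:N/session -d '{"cwd":"/wrong"}' returns 400 + code: workspace_mismatch
  • curl -N http://127.0.0.1:N/session/{id}/events streams ACP events
  • A remote POST /session/{id}/prompt triggers agent work and streams tokens back over SSE
  • A remote ACP authenticate request receives a method-disabled error, and the TUI's credentials are unaffected
  • qwen --serve --workspace /no/such/path → exit 1 + a friendly error
  • Exiting the TUI (Ctrl+C) exits the whole process (a graceful SSE close is not guaranteed; that is Phase C)
  • Starting without --serve → the ESM load graph contains no serve modules
  • CI: vitest all green, new tests pass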

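7. What Phase A explicitly does not do

| Not done | Deferred to |
| --- | --- |
| Remote binding / bearer token / CORS | Phase B |
| Graceful SSE drain on Ctrl+C / /quit coordination / double Ctrl+C force quit | Phase C |
| TUI ↔ daemon session unification (including remote authenticate) | Phase D |
| Process-level mutex around OAuth refresh to resolve TUI/daemon races | Phase D |
| Reusing MCP child processes between TUI and daemon | chiga0 finding #3 (MCP per-daemon shared state) or Stage 2 |
| Process-level quarantine for uncaughtException (distinguishing daemon-origin from TUI-origin) | Phase E (Stage 2e in-process reverse refactor) |
| Banner showing listen URL / current client count / token path | Phase B |
| Cleaning up serve on the process.exit path under --serve (to avoid listener leaks) | Phase C |
| Merging the telemetry root span with the TUI's (separate roots for now) | Phase D |
| mDNS discovery (opencode has it; useful for LAN deployments) | Stage 2c (already on the roadmap) |
| Port discovery file ~/.qwen/serve/instances/<pid>.json | Phase B |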

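8. Suggested PR split

A single PR carrying a 4.3-day diff is a heavy review burden, so the suggestion is four stacked PRs:

| Sub-PR | Estimate | Content | Depends on |
| --- | --- | --- | --- |
| A0 | ~0.3d | Extract the shared validateAndCanonicalizeWorkspace helper and refactor runQwenServe.ts:121-160 to call it (a small refactor PR against #4113's code; it does not introduce Mode A by itself) | #4113 must merge first |
| A1 | ~1d | Extract inMemoryChannel.ts + unit tests (a pure refactor of the pattern already at httpAcpBridge.test.ts:151-154, zero behavior change) | none (can start immediately, even before #4113) |
| A2 | ~2d | Implement inProcessAcpBridge.ts + contract tests (including §3 Decision 8 authenticate rejection, §3 Decision 9 fabricated argv, and preparation for §3 Decision 10 lazy import; uses the A0 helper, canonicalizeWorkspace, WorkspaceMismatchError, and the makeBridge() pattern) | A0 + A1 + #4113 |
| A3 | ~1d | --serve flag + --workspace wiring + gemini.tsx integration + mutual-exclusion validation + e2e + docs | A2 |

A1 can start now (no dependency on #4113); A0/A2/A3 should stack after #4113 merges, to avoid rebasing.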

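Phase A effort estimate (revised)

| Sub-task | Estimate |
| --- | --- |
| A0: extract the validateAndCanonicalizeWorkspace helper | 0.3d |
| A1: inMemoryChannel.ts + tests | 1d |
| A2: inProcessAcpBridge.ts (including the inlined side-effect-free runAcpAgent, authenticate rejection, uncaughtException wrapping, cwd fallback) | 2d |
| A3: --serve + --workspace flags + gemini.tsx lazy import + stderr printing + listen error handling + e2e + docs | 1d |
| Total | ~4.3d |

The main added costs are:

  • §3 Decision 1, paired channel + ACP wiring: ~1.5d (the in-process simplification saves 0.5d)
  • §3 Decision 8, authenticate rejection + tests: ~0.5d
  • §1, documenting known limitations + uncaughtException wrapping: ~0.5d
  • A0 helper extraction: ~0.3d

The ROI is positive: in exchange we get zero intrusion into the server layer, automatic tracking of ACP protocol changes, no new security or resource-amplification problems, and alignment with the #4113 workflow.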


Routing decision point ⚠️

#3929 / #3930 / #3931 (chiga0) are another path that overlaps in functionality but uses a different protocol: qwen remote-control + qwen --remote-control, over WebSocket + stream-json. Both paths implement "TUI as super-client + thin mobile client", so a design alignment is needed before starting 1.5b.

Out of scope for this issue

  • Phase D: TUI ↔ daemon session unification (make the TUI a real daemon client #1; requires refactoring the TUI to go through the QwenAgent path; also unlocks remote authenticate / process-level credential mediator / telemetry span merge); separate issue
  • Phase E: Stage 2e in-process reverse refactor (process-level quarantine so daemon exceptions don't take down the TUI); separate issue
  • chiga0's 6 prereq refactors (PermissionMediator extraction, EventBus elevation, MCP per-daemon shared state, etc.); separate issue
  • Stage 1.5a remaining 9 must-haves (per-request sessionScope override, loadSession HTTP, heartbeat, token revocation, etc.); separate issue
  • Stage 1.5c daemon-side state CRUD (7 routes: memory/mcp/agents/tools/approval-mode/init/auth); separate issue

References

Daemon series upstream (led by @wenshao)

Design docs

Routing decision point

External reference

  • opencode reference implementation: packages/opencode/src/cli/cmd/tui/thread.ts (Worker + RPC fetch)
