docs(deploy): local launch templates for v0.16-alpha (PR 30a) #4483
Third PR in the F5 release chain (PR 27 ✅ → PR 30a → 28 → 31) per the 2026-05-24 v0.16-alpha scope freeze in #4175 (text-only + local-only). Pure markdown, zero code.

New `docs/users/qwen-serve-deploy-local.md` (~160 LOC) with copy-paste-ready templates for:

- systemd user-level unit (Linux) + system-wide alternative callout for shared dev hosts
- launchd LaunchAgent plist (macOS) with explicit "no ~ / $HOME expansion" warning since that's a common foot-gun
- tmux session for interactive supervision
- nohup one-liner with "not recommended" caveats
- curl smoke-check (/health + /capabilities) + token rotation walkthrough (covers all four launchers)

All templates inline `QWEN_SERVER_TOKEN=...` directly per the BYO-token guide PR 27 added to qwen-serve.md. No auto-gen, no token-store infrastructure: the user generates a token via `openssl rand -hex 32` and pastes it into the unit/plist. Each template carries an explicit "DO NOT COMMIT this file with a real token" comment at the token line.

Cross-references the SDK env fallback PR 27 added: one shell-level `export QWEN_SERVER_TOKEN=$(cat token-file)` covers both the daemon-side flag fallback AND the SDK-side DaemonClient construction fallback.

Restart-and-crash semantics cross-link to the existing Durability model section rather than duplicate it.

Cross-links from qwen-serve.md "v0.16-alpha known limits" line 32 (forward reference "templates land in PR 30a" becomes a live link) and "What's next" section (natural discovery hub at the bottom). _meta.ts gets a sibling nav entry under qwen-serve.

Out of scope (deferred to v0.16.x or later): containerized deployment (PR 30b), cross-host federation, auto-gen tokens, native Windows service. WSL2 footnote covers Windows users for free without committing to an unvalidated nssm wrapper.

Anchor integrity verified: links to #v016-alpha-known-limits / #authentication / #durability-model all resolve to live sections in qwen-serve.md.

Part of #4175.
📋 Review Summary
This PR adds reference templates for locally deploying
🔍 General Feedback
🎯 Specific Feedback
🟢 Medium
🔵 Low
✅ Highlights
Pull request overview
Adds a new user-facing reference page with local “long-running daemon” launch templates for qwen serve (v0.16-alpha), and wires it into the existing qwen-serve user guide via cross-links and nav.
Changes:
- New `docs/users/qwen-serve-deploy-local.md` page with systemd/launchd/tmux/nohup templates plus token rotation and smoke checks.
- Update `docs/users/qwen-serve.md` to link to the new local deployment page from the v0.16-alpha known-limits section and “What’s next”.
- Add the new page to the users docs navigation in `docs/users/_meta.ts`.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| docs/users/qwen-serve.md | Replaces the “templates land in PR…” forward reference with a live link; adds a “What’s next” cross-link. |
| docs/users/qwen-serve-deploy-local.md | New local launcher template/reference page (systemd/launchd/tmux/nohup) with operational guidance. |
| docs/users/_meta.ts | Adds a nav entry for the new local deployment doc. |
All 14 unresolved threads (5 copilot + 9 wenshao) source-verified
and ADOPTED. Net effect: every code-block in the doc is now
copy-paste-runnable + the security / restart / log-location
posture matches what real local-deployment operators expect.
CRITICAL fixes:
T1 + T2 + T3 + T12 [copilot/wenshao — `--bind` flag does NOT exist]:
Source-verified at packages/cli/src/commands/serve.ts:58 — the CLI
flag is `--hostname` (with `--port`). All 4 templates (systemd /
launchd / tmux / nohup) had `--bind 127.0.0.1` which would fail at
startup with "unknown option". Replaced with `--hostname 127.0.0.1
--port 4170` (explicit port for parity with launchd
ProgramArguments). Defaults are 127.0.0.1:4170 already, but
explicit-is-better here for copy-paste docs.
T6 [wenshao Critical — systemd missing loginctl enable-linger]:
Without `loginctl enable-linger`, the user-level systemd instance
shuts down at logout / does not start at boot. "Across reboots"
was a stated goal of the doc. Added the linger command to the
systemd manage block + a paragraph explaining why it's required
for headless dev boxes.
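
A minimal sketch of the manage-block step T6 describes, assuming the unit is installed under the hypothetical name `qwen-serve.service` (the commit message does not state the unit name). Lingering is enabled first, so the user-level systemd instance outlives the login session:

```shell
# Sketch only: `qwen-serve.service` is an assumed unit name, not taken from the doc.
enable_linger_and_start() (
  unit=${1:-qwen-serve.service}
  # Keep the user-level systemd instance alive after logout and start it at boot.
  loginctl enable-linger "$(id -un)" || exit 1
  # Pick up the new or edited unit file, then enable and start it in one step.
  systemctl --user daemon-reload || exit 1
  systemctl --user enable --now "$unit"
)
```

Afterwards, `loginctl show-user "$(id -un)" --property=Linger` should report `Linger=yes`.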
T11 [wenshao — nohup missing workspace cd]:
Daemon defaults to process.cwd() — running `nohup qwen serve` from
~ or /tmp silently binds the wrong workspace, causing every
POST /session with the expected cwd to return 400 workspace_mismatch.
Wrapped in `bash -c 'cd ~/your-project && qwen serve ...'` and added
a paragraph explaining the silent foot-gun.
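
The workspace foot-gun from T11 can be sketched as a small wrapper that refuses to launch unless the `cd` succeeds. The function name, workspace path, and log path are illustrative assumptions. Only the `cd ... && qwen serve ...` shape comes from the commit message:

```shell
# Sketch only: the function name and its arguments are placeholders.
start_in_workspace() (
  workspace=$1; logfile=$2; shift 2
  # Fail loudly instead of silently binding the wrong directory as the workspace.
  cd "$workspace" || exit 1
  # nohup makes the process ignore SIGHUP so it survives the terminal closing.
  nohup "$@" >"$logfile" 2>&1 </dev/null &
)
```

A call might look like `start_in_workspace "$HOME/your-project" "$HOME/qwen-serve.log" qwen serve --hostname 127.0.0.1 --port 4170`. The log path should be absolute, since it is opened after the `cd`.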
SUGGESTION fixes (security / correctness):
T7 [wenshao — systemd Environment= exposes token in unit file]:
Replaced inline `Environment=QWEN_SERVER_TOKEN=...` with
`EnvironmentFile=%h/.qwen-serve-token-env`. Unit file is typically
644 (world-readable); EnvironmentFile keeps the token in the
user's chmod 600 file. Added a setup step that wraps the existing
token in KEY=value form for systemd to read.
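
As a rough illustration of the T7 change, the following writes a user-level unit that reads the token from `EnvironmentFile=` rather than embedding it. The unit file name, `Description`, `WorkingDirectory`, and `WantedBy` lines are assumptions added for completeness and are not the doc's actual template. The `EnvironmentFile`, `ExecStart` flags, `Restart`, and `RestartSec` values follow the commit message:

```shell
# Sketch only: unit name, Description, WorkingDirectory and WantedBy are assumptions.
write_user_unit() (
  unit_dir=$1; qwen_bin=$2; workspace=$3
  mkdir -p "$unit_dir" || exit 1
  cat >"$unit_dir/qwen-serve.service" <<EOF
[Unit]
Description=qwen serve (local daemon)

[Service]
# The token lives in a chmod 600 file, not in this world-readable unit.
EnvironmentFile=%h/.qwen-serve-token-env
WorkingDirectory=$workspace
# systemd does not read \$PATH, so ExecStart needs an absolute path.
ExecStart=$qwen_bin serve --hostname 127.0.0.1 --port 4170
Restart=on-failure
RestartSec=5

[Install]
WantedBy=default.target
EOF
)
```

For example: `write_user_unit "$HOME/.config/systemd/user" /PATH/TO/qwen "$HOME/your-project"`, with `/PATH/TO/qwen` replaced by the real binary location.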
T8 [wenshao — launchd /tmp logs have 3 problems]:
Symlink-attack risk on shared workstations + truncate-on-load
destroys diagnostic logs at exactly the wrong moment + macOS
periodic-daily cleans /tmp after 3 days. Switched to
`~/Library/Logs/qwen-serve/{out,err}.log`. Added the mkdir step
in the manage block + a paragraph noting log truncation on
unload→load.
T9 [wenshao — launchd KeepAlive=true respawns on clean SIGTERM]:
Bare `<true/>` makes `kill <pid>` impossible (daemon respawns
immediately). Switched to `<dict><key>SuccessfulExit</key><false/></dict>`
to match systemd Restart=on-failure semantics. Added
`ThrottleInterval=10` to mirror systemd RestartSec=5 and prevent
restart storms on persistent failures.
T14 [wenshao — plist itself needs chmod 600]:
The plist embeds the inline token. Files in ~/Library/LaunchAgents/
default to 644. Added `chmod 600 ...plist` to the manage block.
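
T8, T9 and T14 together can be sketched as one generator for the LaunchAgent. This is an illustration under assumptions, not the doc's template: the label `com.example.qwen-serve`, the function name, and the `WorkingDirectory` key are hypothetical. The `KeepAlive` dict, `ThrottleInterval`, log location, and `chmod 600` follow the commit message:

```shell
# Sketch only: the label, function name and WorkingDirectory key are assumptions.
write_launch_agent() (
  agents_dir=$1; home_dir=$2; qwen_bin=$3; workspace=$4; token=$5
  plist="$agents_dir/com.example.qwen-serve.plist"
  # Create the log directory up front (the mkdir step from T8).
  mkdir -p "$agents_dir" "$home_dir/Library/Logs/qwen-serve" || exit 1
  # New files get mode 600 because the plist embeds the token.
  umask 077
  # All paths are written out in full: launchd does not expand ~ or $HOME.
  cat >"$plist" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key><string>com.example.qwen-serve</string>
  <key>ProgramArguments</key>
  <array>
    <string>$qwen_bin</string>
    <string>serve</string>
    <string>--hostname</string><string>127.0.0.1</string>
    <string>--port</string><string>4170</string>
  </array>
  <key>WorkingDirectory</key><string>$workspace</string>
  <key>EnvironmentVariables</key>
  <dict>
    <key>QWEN_SERVER_TOKEN</key><string>$token</string>
  </dict>
  <key>KeepAlive</key>
  <dict><key>SuccessfulExit</key><false/></dict>
  <key>ThrottleInterval</key><integer>10</integer>
  <key>StandardOutPath</key><string>$home_dir/Library/Logs/qwen-serve/out.log</string>
  <key>StandardErrorPath</key><string>$home_dir/Library/Logs/qwen-serve/err.log</string>
</dict>
</plist>
EOF
  # An existing file keeps its old mode on overwrite, so set it explicitly.
  chmod 600 "$plist"
)
```

A call such as `write_launch_agent "$HOME/Library/LaunchAgents" "$HOME" /PATH/TO/qwen "$HOME/your-project" "$(cat "$HOME/.qwen-serve-token")"` lets the shell expand `$HOME` before launchd ever sees the file.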
T4 [copilot — /capabilities auth wording wrong]:
Doc said /capabilities "always requires auth" — but it's only
gated when a token is configured (or --require-auth is set). On
a zero-config loopback boot neither route requires a header.
Reworded "Verifying the daemon is up" section to call out both
paths ("templates above all configure a token, so Authorization
is needed in practice").
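
Both paths described in T4 can be sketched in a single smoke-check helper. The function name and the default base URL (the `127.0.0.1:4170` default mentioned above) are assumptions. The helper sends the `Authorization` header to both routes whenever a token is supplied, which is harmless on a route that does not require it:

```shell
# Sketch only: function name and default URL are assumptions for illustration.
smoke_check() (
  base_url=${1:-http://127.0.0.1:4170}
  token=${2:-}
  for route in /health /capabilities; do
    if [ -n "$token" ]; then
      # -f turns HTTP 4xx/5xx into a non-zero exit; -sS hides progress, keeps errors.
      curl -fsS -H "Authorization: Bearer $token" "$base_url$route" >/dev/null || exit 1
    else
      # Zero-config loopback boot: no token configured, so no header is sent.
      curl -fsS "$base_url$route" >/dev/null || exit 1
    fi
  done
)
```

With the templates above, a typical call would be `smoke_check http://127.0.0.1:4170 "$(cat "$HOME/.qwen-serve-token")"`.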
T5 [copilot — token rotation missing chmod 600]:
Step 1 of token rotation now writes `~/.qwen-serve-token` AND
`~/.qwen-serve-token-env` AND chmods both 600. Mirrors the
initial generation block.
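
Step 1 of the rotation described in T5 might look like the sketch below. The two file names and the `openssl rand -hex 32` command come from this PR. The function name and the directory parameter are assumptions, added so the target location is explicit:

```shell
# Sketch only: the function name and directory parameter are assumptions.
write_token_files() (
  dir=$1
  # New files are created with mode 600.
  umask 077
  token=$(openssl rand -hex 32) || exit 1
  # Raw token, e.g. for a shell-level export.
  printf '%s\n' "$token" >"$dir/.qwen-serve-token"
  # KEY=value form for systemd's EnvironmentFile=.
  printf 'QWEN_SERVER_TOKEN=%s\n' "$token" >"$dir/.qwen-serve-token-env"
  # Overwritten files keep their previous mode, so tighten it explicitly.
  chmod 600 "$dir/.qwen-serve-token" "$dir/.qwen-serve-token-env"
)
```

After `write_token_files "$HOME"`, the running daemon still holds the old token, because the environment is read at process start. The launcher has to restart it for the new token to take effect.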
T10 [wenshao — restart-and-crash section self-contradictory]:
Said sessions "re-attach via Last-Event-ID resume" then immediately
"a restart drops sessions". Rewrote to clearly distinguish
WITHIN-process disconnects (Last-Event-ID covers them, in-memory
ring) from RESTART (drops everything; cross-restart durability
not in v0.16-alpha). Also documented the systemd vs launchd
KeepAlive semantics difference.
T13 [wenshao — bullet structure under "Generate a bearer token"]:
The original bullet list framed `--token CLI flag` and the env
var as if one consumed the other. Rewrote as a paragraph: "daemon
reads token from either --token or QWEN_SERVER_TOKEN; SDK falls
back to QWEN_SERVER_TOKEN; one shell-level export covers both".
Verification: `grep -c '\-\-bind ' docs/users/qwen-serve-deploy-local.md`
returns 0 (all bind→hostname); section structure intact (9 H2
sections, expected); 4 cross-link anchors to qwen-serve.md still
resolve (#authentication / #v016-alpha-known-limits /
#durability-model + the original out-of-scope list).
Net diff: +220/-160 (mostly net-additive — every fix added
context paragraphs explaining "why").
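
The `--bind` check in the verification step can be wrapped as a reusable guard. This is a sketch with a hypothetical function name. It treats a missing file as a failure rather than a pass:

```shell
# Sketch only: the function name is hypothetical.
check_no_bind_flag() (
  doc=$1
  # grep -c prints the number of matching lines; -e lets the pattern start
  # with dashes. grep exits 1 on zero matches, hence the "|| true".
  count=$(grep -c -e '--bind ' -- "$doc" || true)
  # An unreadable file leaves count empty, which fails the numeric test.
  [ "$count" -eq 0 ]
)
```

`check_no_bind_flag docs/users/qwen-serve-deploy-local.md` exits 0 only when the file is readable and contains no `--bind ` occurrences.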
wenshao left a comment:
Review of the docs-only PR (3 files: _meta.ts, qwen-serve-deploy-local.md, qwen-serve.md). Prior round's Critical (loginctl enable-linger) and most Suggestions (EnvironmentFile, KeepAlive semantics, /tmp logs, SSE contradiction, nohup workspace, --token wording, plist chmod) are all adopted in this commit. Two remaining Suggestions below on the systemd/launchd binary path and shell export scope.
— qwen3.7-max via Qwen Code /review
…se resolved)

T16 [wenshao - hardcoded /usr/local/bin/qwen breaks nvm/Volta/Apple Silicon Homebrew users]:
Both systemd `ExecStart` and launchd `ProgramArguments` had hardcoded `/usr/local/bin/qwen`, which is only correct for Linuxbrew / Intel macOS Homebrew / manual global install. Most Node developers use nvm (~/.nvm/...), fnm, Volta, or Homebrew on Apple Silicon (/opt/homebrew/bin/qwen) and would hit "No such file or directory" on first `systemctl --user start`. Switched both templates to the `/PATH/TO/qwen` placeholder + added a prominent callout block above each template listing the common locations (Linuxbrew, nvm, fnm, Volta on Linux; Apple Silicon Homebrew, Intel Homebrew, nvm, Volta on macOS) and explicitly pointing at `which qwen` as the discovery step. Inline comments at the ExecStart / ProgramArguments lines reinforce "systemd does NOT read $PATH" / "launchd does NOT read $PATH".

T17 [wenshao - shell-wide export leaks token to every subprocess]:
Added a callout block immediately after the `export QWEN_SERVER_TOKEN=...` setup step warning against adding it to .bashrc/.zshrc on shared workstations. Profile-level export exposes the token to every child process (IDE subprocesses, browser debuggers, `npm` scripts from unrelated projects). Points users at the systemd EnvironmentFile= / launchd EnvironmentVariables mechanisms below for persistent setups since both scope the token to just the daemon process.

T15 [wenshao - empty "test" comment]:
Resolved without code change. Comment body was just "test"; appears to be an accidental post.

Verification: `/usr/local/bin/qwen` now only appears inside the explanatory "common locations" prose blocks (NOT in the actual templates, which use the `/PATH/TO/qwen` placeholder); zero `--bind` left in the file.
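
The discovery step from T16 can be sketched as a helper that resolves the binary and fills the `/PATH/TO/qwen` placeholder in a saved template. The function name is hypothetical, and `command -v` is used here as the POSIX equivalent of the `which qwen` step the doc points at:

```shell
# Sketch only: the function name is hypothetical; `command -v` stands in for `which`.
fill_qwen_path() (
  template=$1
  # command -v prints the absolute path of an executable found on $PATH.
  qwen_bin=$(command -v qwen) || { echo "qwen not found on PATH" >&2; exit 1; }
  # | is the sed delimiter because the path contains slashes; this assumes
  # the resolved path itself contains no | or & characters.
  sed "s|/PATH/TO/qwen|$qwen_bin|g" "$template"
)
```

The result goes to stdout, so it can be reviewed before being redirected into the unit or plist file.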
wenshao left a comment:
No issues found. LGTM! ✅ — qwen3.7-max via Qwen Code /review
Squashed feature work from daemon_mode_b_main branch, rebased onto latest main to establish proper merge-base and clean PR diff.

Original commits:
- perf(core): F2 cleanup PR A — R9/W11/W12/R10 (post-merge follow-ups) (#4411)
- refactor(acp-bridge): F1 test split — lift bridge.test.ts (6861 LOC) to acp-bridge (#4445)
- fix(core): F2 cleanup PR B — self-heal observability (W133-a + W134) (#4460)
- feat(sdk/daemon-ui): unified completeness follow-up to #4328 (#4353)
- docs(serve): v0.16-alpha known limits + SDK QWEN_SERVER_TOKEN env fallback (PR 27) (#4473)
- docs(deploy): local launch templates for v0.16-alpha (PR 30a) (#4483)
- feat(daemon+sdk): cross-client real-time sync completeness (#4484)
- feat(serve): add POST /session/:id/recap (#4504)
- feat(daemon): add voterClientId to permission_resolved (A4) (#4539)
- feat(serve): --allow-origin <pattern> CORS allowlist (T2.4 #4514) (#4527)
- feat(daemon): in-session model switch reaches the bus (A1) (#4546)
- feat(serve): prompt absolute deadline + SSE writer idle timeout (#4514 T2.9) (#4530)
- Feat/daemon react cli (#4380)
* docs(deploy): local launch templates for v0.16-alpha (PR 30a)
* fix(docs): #4483 round 1 fold-in - 14 review threads adopted
* fix(docs): #4483 round 2 fold-in - 2 wenshao threads adopted (T15 noise resolved)
Summary

Third PR in the F5 release chain (PR 27 ✅ → PR 30a → PR 28 → PR 31) per the 2026-05-24 v0.16-alpha scope freeze in #4175.

PR 27 (✅ merged 2026-05-24 15:57Z, commit 63803deab) added the v0.16-alpha banner + known-limits section to `docs/users/qwen-serve.md`. That section's line 32 said "Local launch via `systemd` / `launchd` / `nohup &` / `tmux` (templates land in PR 30a)"; this PR fills in those templates as a sibling reference page. Pure markdown, zero code.

Files changed

- `docs/users/qwen-serve-deploy-local.md`
- `docs/users/qwen-serve.md`
- `docs/users/_meta.ts` … `qwen-serve`

What's in the new doc

- … `~` / `$HOME` expansion" warning (common foot-gun)
- … `curl /health` (no auth on loopback) + `curl -H "Authorization: Bearer ..." /capabilities` (always requires auth)

Design notes

- … `QWEN_SERVER_TOKEN=...` directly per the PR 27 BYO-token convention. Each template carries an explicit "DO NOT COMMIT this file with a real token" comment at the token line.
- … `export QWEN_SERVER_TOKEN=$(cat token-file)` covers both the daemon-side flag fallback AND the TypeScript SDK's `DaemonClient` constructor fallback.
- … `~/.config/systemd/user/`) is the primary path; system-wide is a 2-line alternative for shared dev hosts. User-level avoids sudo, runs as the user with their existing env/credentials/SSH keys.
- … `nssm`, Service Control Manager) is its own surface; alpha is local-only and Linux/macOS dominate the dogfooding audience.

Test plan

- … `#v016-alpha-known-limits`, `#authentication`, `#durability-model` all exist as live H2/section anchors in `qwen-serve.md`
- … `qwen-serve.md`, 1 hit in `_meta.ts`
- … `[Service]` ini, `<plist>` XML, `bash`) all close correctly
- … `_meta.ts` string addition

Backward compatibility

Out of scope (deferred)

- … `nssm` / Service Control Manager): v0.16.x patch alongside any wider Windows hardening work.
- … `qwen serve install --systemd-user` CLI that writes the unit file with a freshly-generated token: out of scope (overlaps with PR 29's auto-gen, which is itself deferred).

🤖 Generated with Qwen Code