feat(core): write runtime.json sidecar for active sessions#3714
Conversation
Port kimi-cli PR QwenLM#2082 part 1 to qwen-code. On interactive session start, atomically write a small JSON sidecar at <projectDir>/chats/<sessionId>.runtime.json recording the (pid, session_id, work_dir, hostname, started_at, qwen_version) tuple. External tools (terminal multiplexers, IDE integrations, status daemons) can map a running PID to its session id and work dir without parsing argv. Write is best-effort: a read-only filesystem must not block UI startup. OS process title (was QwenLM#3713) and dynamic OSC tab title (kimi QwenLM#2083) remain out of scope. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Config.startNewSession() reassigns this.sessionId in the same process, which is reached by /clear, /reset, /new and /resume. Previously the old <oldId>.runtime.json was left behind, falsely claiming the still- live PID for a session no longer being served, and no new sidecar was written for the incoming session. Centralize the swap by clearing the old sidecar and writing a fresh one for the new session id from inside startNewSession itself, so all same-PID transitions are covered. The refresh runs as a fire-and- forget best-effort; failures must not block the session switch. Mirrors the post-merge Codex P1 fix on kimi-cli PR QwenLM#2082 (the source of the runtime.json sidecar pattern this PR ports). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Follow-up #1 — same-PID session swap Problem: Fix: Refresh the sidecar from inside Tests: 19/19 pass. New case: |
Mirrors kimi-cli PR QwenLM#2082 commit e237951f (Codex P1 r3158754463): a short-lived non-interactive invocation (qwen --prompt, ACP, etc.) that runs `/clear` would otherwise call `Config.startNewSession()`, delete a concurrent shell's runtime.json sidecar (same outgoing session id), and never write a replacement — leaving the shell discoverable to nobody. Add a `runtimeStatusEnabled` flag on Config, flipped on by the interactive UI bootstrap immediately after the first successful sidecar write, and gate the swap-time refresh in `startNewSession()` on it. Non-interactive entry points never reach the bootstrap, so they won't touch sibling sidecars. Kimi later reverted the equivalent `write only from shell mode` guard (commit 7083975a) in favor of writing from every long-lived mode, but qwen's wire point is already interactive-only, so the narrower guard is the right shape here. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Follow-up #2 — only refresh the sidecar when this process owns it Problem: Fix: Add a Tests: still 19/19. Build clean. |
| } | ||
|
|
||
| async function renameWithRetry( | ||
| src: string, |
There was a problem hiding this comment.
[Suggestion] renameWithRetry 与 packages/core/src/utils/atomicFileWrite.ts:50-70 中的实现完全相同——相同的重试逻辑、错误码(EPERM/EACCES)、指数退避和重试次数。两者均为私有函数,形成了代码重复。对重试逻辑的 Bug 修复需要在两处同步应用,存在代码分化风险。
| src: string, | |
| // 建议:删除本地的 renameWithRetry,改为从 atomicFileWrite.js 导入重用; | |
| // 或将 writeRuntimeStatus 委托给 atomicWriteJSON 以复用同一原子写入实现。 |
— deepseek-v4-pro via Qwen Code /review
| // starting up. | ||
| try { | ||
| const sessionId = config.getSessionId(); | ||
| const runtimeStatusPath = config.storage.getRuntimeStatusPath(sessionId); |
There was a problem hiding this comment.
[Suggestion] config.storage 为公开 readonly 字段,此处直接访问 config.storage.getRuntimeStatusPath(sessionId) 暴露了内部 Storage 对象。而在 Config.startNewSession() 内部,同一操作使用的是 this.storage。如果将来 Storage 重构改变了 getRuntimeStatusPath 的语义,此处不会收到编译时反馈。
| const runtimeStatusPath = config.storage.getRuntimeStatusPath(sessionId); | |
| // 建议在 Config 上添加封装方法: | |
| getRuntimeStatusPath(sessionId: string): string { | |
| return this.storage.getRuntimeStatusPath(sessionId); | |
| } | |
| // 然后此处改为: | |
| const runtimeStatusPath = config.getRuntimeStatusPath(sessionId); |
— deepseek-v4-pro via Qwen Code /review
There was a problem hiding this comment.
One more pass over the diff — six small follow-ups, none blocking. Filing here so we can pick them up after merge.
- Dead / redundant guards (
isFiniteIntegerNotBool, the'utf-8'branch inreadRuntimeStatus) index.tsre-export is out of alphabetical order- Module-header schema-compatibility wording contradicts the strict
schema_versioncheck in the reader getRuntimeStatusPathdoesn't sanitizesessionIdpath components (defense-in-depth)Config.startNewSession's fire-and-forget IIFE has no ordering guarantee under rapid/clearbursts- No integration test for the
Config.startNewSessionpath — which is exactly the glue rounds 3–5 kept regressing
| typeof v === 'number' && | ||
| Number.isInteger(v) && | ||
| Number.isFinite(v) && | ||
| typeof v !== 'boolean' |
There was a problem hiding this comment.
[Suggestion] Both guards here are dead code:
typeof v === 'number'already excludes booleans (typeof true === 'boolean'), so the trailingtypeof v !== 'boolean'is unreachable.Number.isInteger(NaN/±Infinity)returnsfalse, soNumber.isFinite(v)afterNumber.isInteger(v)is also redundant.
Reduce to typeof v === 'number' && Number.isInteger(v) and rename the helper to isInteger for clarity.
| } | ||
| if (err instanceof Error && err.message.includes('utf-8')) { | ||
| return null; | ||
| } |
There was a problem hiding this comment.
[Suggestion] This branch is effectively dead code. Node's fs.readFile(path, 'utf-8') does not throw on invalid UTF-8 bytes — it substitutes U+FFFD and returns the decoded string. The returns null on invalid UTF-8 bytes test actually exercises the downstream JSON.parse failure path, not this err.message.includes('utf-8') check.
Suggested cleanup:
- Drop this branch and let the unconditional
return nullhandle the residual case. - Rename the test case to something like
on bytes that decode to non-JSONso future readers aren't misled about what's being verified.
| * - **Not** deleted on clean `/quit` or on crash. From an external | ||
| * observer's standpoint the recorded PID no longer exists in either | ||
| * case, so a liveness check is sufficient and an explicit cleanup | ||
| * adds nothing. |
There was a problem hiding this comment.
[Issue] The header says “External consumers should treat unknown fields as forward-compatible additions”, but readRuntimeStatus (line 168) uses strict equality on schema_version and rejects anything ≥ 2 outright — a v2 producer that only adds fields would still be rejected by every v1 reader. Doc and code disagree; pick one:
- Tighten the doc: declare
schema_versiona breaking version — bumping it is an ecosystem-wide break. - Loosen the reader: accept
>= 1and rely on the per-field type checks (which already exist) to absorb additive fields.
Most schema_version conventions imply the latter; either is fine, but the contract should be explicit.
| export * from './utils/errorParsing.js'; | ||
| export * from './utils/errors.js'; | ||
| export * from './utils/fileUtils.js'; | ||
| export * from './utils/runtimeStatus.js'; |
There was a problem hiding this comment.
[Nit] This block is alphabetical (fileUtils → filesearch/fileSearch → formatters). runtimeStatus belongs in the r section (next to retry.js); inserting it between fileUtils and filesearch breaks the existing convention.
| * the same directory used for chat history to find live sessions. | ||
| */ | ||
| getRuntimeStatusPath(sessionId: string): string { | ||
| return path.join(this.getProjectDir(), 'chats', `${sessionId}.runtime.json`); |
There was a problem hiding this comment.
[Issue] sessionId is normally a randomUUID(), but both --resume <id> and Config.startNewSession(sessionId) accept user-supplied strings without validation. A value like ../../foo would resolve outside the chats/ directory via path.join.
Impact is low (the user already owns their filesystem), but a one-liner reject for ids containing /, \, or .. is cheap defense-in-depth and lines up with how peer CLIs (kimi-cli, Codex) sanitize session ids.
| qwenVersion: cliVersion, | ||
| }); | ||
| } catch { | ||
| // ignored: best-effort cleanup |
There was a problem hiding this comment.
[Suggestion] Fire-and-forget IIFE here. If a user hits /clear twice within ~50 ms, two IIFEs interleave: unlink(A) → write(B) → unlink(B) → write(C), and the final on-disk state depends on kernel rename serialization rather than any guarantee in this code.
Two options:
- Hang a
_runtimeStatusTaskpromise chain offthisso swaps strictly serialize:this._runtimeStatusTask = (this._runtimeStatusTask ?? Promise.resolve()) .then(() => clearRuntimeStatus(oldPath)) .then(() => writeRuntimeStatus(newPath, { ... })) .catch(() => {});
- Add a one-line comment that documents “last writer wins, relies on rename atomicity” — at least the intent is recorded.
Unlikely to fire in practice, but spelling it out prevents the next contributor from re-discovering the same edge.
| const newPath = path.join(tmpDir, 'session-b.runtime.json'); | ||
| await writeRuntimeStatus(oldPath, { | ||
| sessionId: 'session-a', | ||
| workDir: '/w', |
There was a problem hiding this comment.
[Issue] The existing same-PID session swap test calls the low-level helpers directly — it does not exercise Config.startNewSession(). The real integration glue is exactly where rounds 3–5 kept finding regressions:
- whether the
runtimeStatusEnabledgate fires; - whether the
previousSessionId !== this.sessionIdshort-circuit holds; - whether the IIFE actually finishes deleting the old sidecar and writing the new one.
Suggested test in config.test.ts: build a Config, call markRuntimeStatusEnabled(), call startNewSession(newId), await new Promise(r => setImmediate(r)), then assert the old sidecar is gone and the new sidecar's fields are correct. That would catch the round 3–5 regressions in CI rather than in review.
Post-merge tmux validationRan the bundled CLI in a real interactive tmux session against the merged commit ( Setup: tmux new-session -d -s qwen-rt -x 200 -y 50 -c "$WORKDIR" \
"QWEN_RUNTIME_DIR=$RUNTIME_DIR node dist/cli.js"Result on launch — sidecar appears at the expected path: {
"schema_version": 1,
"pid": 1226115,
"session_id": "cc329ef4-3229-4271-b656-fb0cc7bd4f08",
"work_dir": "/tmp/qwen-tmux-J2c8",
"hostname": "...",
"started_at": 1778433841.934,
"qwen_version": "0.15.3"
}
Same-PID swap via
|
) - Drop dead `typeof !== 'boolean'` check in the integer-type guard (Number.isInteger(true) is already false; rename to isFiniteInteger). - Drop dead utf-8-message branch in readRuntimeStatus's catch: fs.readFile(p, 'utf-8') returns replacement chars rather than throwing, so that branch never fires; the JSON.parse fallback already handles the truncated-UTF-8 case the test exercises. - Move runtimeStatus.js export from between fileUtils and filesearch into alphabetical position next to runtimeFetchOptions. - Add Config.startNewSession integration tests pinning the runtimeStatusEnabled gate: flag-off leaves sibling sidecars alone, flag-on swaps old->new on the same PID, no-op when sessionId is unchanged. No behavior change for the runtime-status sidecar itself; this just trims dead defensive code and adds the integration coverage that was missing in #3714.
) - Drop dead `typeof !== 'boolean'` check in the integer-type guard (Number.isInteger(true) is already false; rename to isFiniteInteger). - Drop dead utf-8-message branch in readRuntimeStatus's catch: fs.readFile(p, 'utf-8') returns replacement chars rather than throwing, so that branch never fires; the JSON.parse fallback already handles the truncated-UTF-8 case the test exercises. - Move runtimeStatus.js export from between fileUtils and filesearch into alphabetical position next to runtimeFetchOptions. - Add Config.startNewSession integration tests pinning the runtimeStatusEnabled gate: flag-off leaves sibling sidecars alone, flag-on swaps old->new on the same PID, no-op when sessionId is unchanged. No behavior change for the runtime-status sidecar itself; this just trims dead defensive code and adds the integration coverage that was missing in #3714.
Summary
Write a small
runtime.jsonsidecar next to each interactive session''s chat log so external tools can map a running PID back to its session id and work directory without parsingargv.Prior art
Peer interactive CLIs already expose a PID/session mapping in some form, so external tooling has come to expect it:
~/.copilot/session-state/<sessionId>/, and shells spawned by a session are tagged so the owning session id is recoverable from the PID.~/.claude/projects/<encodedCwd>/<sessionId>.jsonl, with the running process discoverable per project.~/.codex/sessions/.../rollout-*.jsonl) whose header records the session id for the active process.This PR gives qwen-code an equivalent, lightweight surface — a single JSON file per session — without adopting any one of those formats.
Schema
File:
<projectDir>/chats/<sessionId>.runtime.json{ "schema_version": 1, "pid": 12345, "session_id": "…", "work_dir": "/abs/path", "hostname": "host", "started_at": 1730000000.123, "qwen_version": "0.15.3" }Snake_case keys are chosen so the file format stays language-agnostic and easy to consume from external tooling.
Guarantees
EPERM/EACCESretry on Windows). External readers never see a partial file.startInteractiveUI(). A read-only filesystem cannot block UI startup.readRuntimeStatusreturnsnullon missing file, malformed JSON, invalid UTF-8, unknownschema_version, or wrong field types — never coerces.Config.startNewSession()clears the outgoing sidecar and writes a fresh one for the incoming session, gated on an ownership flag set by the interactive UI bootstrap so non-interactive/clearpaths can''t trample a sibling process''s sidecar./quitor crash. A PID-liveness check on the consumer side already disambiguates dead records, so an explicit cleanup step buys nothing.Files
packages/core/src/utils/runtimeStatus.ts— new module (writeRuntimeStatus/readRuntimeStatus/clearRuntimeStatus).packages/core/src/utils/runtimeStatus.test.ts— 19 vitest cases covering atomic write, strict typing, schema validation, and same-PID session swap.packages/core/src/config/storage.ts— addsStorage.getRuntimeStatusPath(sessionId).packages/core/src/config/config.ts—runtimeStatusEnabledownership flag;startNewSession()refreshes the sidecar when this process owns it;markRuntimeStatusEnabled()exposed to the UI bootstrap.packages/core/src/index.ts— re-exports the module.packages/cli/src/gemini.tsx— callswriteRuntimeStatus()once instartInteractiveUI()next tosetWindowTitle(), wrapped in try/catch, then flips the ownership flag.Tests
npm run buildfrom the repo root succeeds.Platform coverage
The repo''s CI test job (
ci.yml) runs the full vitest suite — includingruntimeStatus.test.ts— across the matrix below on every push, withfail-fast: false:Tests use the real filesystem (no mocked
fs), so each platform exercises actualrename,unlink, and atomic-write semantics. Windows-specificEPERM/EACCESonrenameis handled with an explicit retry loop inwriteRuntimeStatus.