Skip to content

feat(core): write runtime.json sidecar for active sessions#3714

Merged
wenshao merged 3 commits into
QwenLM:mainfrom
yeelam-gordon:feature/runtime-status-sidecar
May 10, 2026
Merged

feat(core): write runtime.json sidecar for active sessions#3714
wenshao merged 3 commits into
QwenLM:mainfrom
yeelam-gordon:feature/runtime-status-sidecar

Conversation

@yeelam-gordon

@yeelam-gordon yeelam-gordon commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

Summary

Write a small runtime.json sidecar next to each interactive session''s chat log so external tools can map a running PID back to its session id and work directory without parsing argv.

Prior art

Peer interactive CLIs already expose a PID/session mapping in some form, so external tooling has come to expect it:

  • GitHub Copilot CLI — keeps per-session state under ~/.copilot/session-state/<sessionId>/, and shells spawned by a session are tagged so the owning session id is recoverable from the PID.
  • Claude Code CLI — writes per-project session transcripts to ~/.claude/projects/<encodedCwd>/<sessionId>.jsonl, with the running process discoverable per project.
  • OpenAI Codex CLI — emits a rollout file (e.g. ~/.codex/sessions/.../rollout-*.jsonl) whose header records the session id for the active process.

This PR gives qwen-code an equivalent, lightweight surface — a single JSON file per session — without adopting any one of those formats.

Schema

File: <projectDir>/chats/<sessionId>.runtime.json

{
  "schema_version": 1,
  "pid": 12345,
  "session_id": "",
  "work_dir": "/abs/path",
  "hostname": "host",
  "started_at": 1730000000.123,
  "qwen_version": "0.15.3"
}

Snake_case keys are chosen so the file format stays language-agnostic and easy to consume from external tooling.

Guarantees

  • Atomic write (tmp + rename, with EPERM/EACCES retry on Windows). External readers never see a partial file.
  • Best-effort: failures are swallowed at the call site in startInteractiveUI(). A read-only filesystem cannot block UI startup.
  • Strict reader: readRuntimeStatus returns null on missing file, malformed JSON, invalid UTF-8, unknown schema_version, or wrong field types — never coerces.
  • Same-PID session swap covered: Config.startNewSession() clears the outgoing sidecar and writes a fresh one for the incoming session, gated on an ownership flag set by the interactive UI bootstrap so non-interactive /clear paths can''t trample a sibling process''s sidecar.
  • Not deleted on /quit or crash. A PID-liveness check on the consumer side already disambiguates dead records, so an explicit cleanup step buys nothing.

Files

  • packages/core/src/utils/runtimeStatus.ts — new module (writeRuntimeStatus / readRuntimeStatus / clearRuntimeStatus).
  • packages/core/src/utils/runtimeStatus.test.ts — 19 vitest cases covering atomic write, strict typing, schema validation, and same-PID session swap.
  • packages/core/src/config/storage.ts — adds Storage.getRuntimeStatusPath(sessionId).
  • packages/core/src/config/config.tsruntimeStatusEnabled ownership flag; startNewSession() refreshes the sidecar when this process owns it; markRuntimeStatusEnabled() exposed to the UI bootstrap.
  • packages/core/src/index.ts — re-exports the module.
  • packages/cli/src/gemini.tsx — calls writeRuntimeStatus() once in startInteractiveUI() next to setWindowTitle(), wrapped in try/catch, then flips the ownership flag.

Tests

$ npx vitest run src/utils/runtimeStatus.test.ts
 Test Files  1 passed (1)
      Tests  19 passed (19)

npm run build from the repo root succeeds.

Platform coverage

The repo''s CI test job (ci.yml) runs the full vitest suite — including runtimeStatus.test.ts — across the matrix below on every push, with fail-fast: false:

macOS Windows Linux
Node 20.x
Node 22.x
Node 24.x

Tests use the real filesystem (no mocked fs), so each platform exercises actual rename, unlink, and atomic-write semantics. Windows-specific EPERM/EACCES on rename is handled with an explicit retry loop in writeRuntimeStatus.

yeelam-gordon and others added 2 commits April 28, 2026 23:14
Port kimi-cli PR QwenLM#2082 part 1 to qwen-code.

On interactive session start, atomically write a small JSON sidecar at <projectDir>/chats/<sessionId>.runtime.json recording the (pid, session_id, work_dir, hostname, started_at, qwen_version) tuple.

External tools (terminal multiplexers, IDE integrations, status daemons) can map a running PID to its session id and work dir without parsing argv. Write is best-effort: a read-only filesystem must not block UI startup.

OS process title (was QwenLM#3713) and dynamic OSC tab title (kimi QwenLM#2083) remain out of scope.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Config.startNewSession() reassigns this.sessionId in the same process,
which is reached by /clear, /reset, /new and /resume. Previously the
old <oldId>.runtime.json was left behind, falsely claiming the still-
live PID for a session no longer being served, and no new sidecar was
written for the incoming session.

Centralize the swap by clearing the old sidecar and writing a fresh
one for the new session id from inside startNewSession itself, so all
same-PID transitions are covered. The refresh runs as a fire-and-
forget best-effort; failures must not block the session switch.

Mirrors the post-merge Codex P1 fix on kimi-cli PR QwenLM#2082 (the source
of the runtime.json sidecar pattern this PR ports).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yeelam-gordon

yeelam-gordon commented Apr 28, 2026

Copy link
Copy Markdown
Contributor Author

Follow-up #1 — same-PID session swap

Problem: /clear, /reset, /new and /resume reuse the same process and just reassign this.sessionId inside Config.startNewSession(). The old <oldId>.runtime.json was left on disk pointing at the live PID, and no sidecar was written for the new session.

Fix: Refresh the sidecar from inside startNewSession() — clear the old file, write a new one. Best-effort, non-blocking.

Tests: 19/19 pass. New case: same-PID session swap.

Mirrors kimi-cli PR QwenLM#2082 commit e237951f (Codex P1 r3158754463): a
short-lived non-interactive invocation (qwen --prompt, ACP, etc.) that
runs `/clear` would otherwise call `Config.startNewSession()`, delete
a concurrent shell's runtime.json sidecar (same outgoing session id),
and never write a replacement — leaving the shell discoverable to
nobody.

Add a `runtimeStatusEnabled` flag on Config, flipped on by the
interactive UI bootstrap immediately after the first successful
sidecar write, and gate the swap-time refresh in
`startNewSession()` on it. Non-interactive entry points never reach
the bootstrap, so they won't touch sibling sidecars.

Kimi later reverted the equivalent `write only from shell mode`
guard (commit 7083975a) in favor of writing from every long-lived
mode, but qwen's wire point is already interactive-only, so the
narrower guard is the right shape here.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yeelam-gordon

yeelam-gordon commented Apr 29, 2026

Copy link
Copy Markdown
Contributor Author

Follow-up #2 — only refresh the sidecar when this process owns it

Problem: /clear also runs in non_interactive and acp modes. If two processes shared the same outgoing session id, a non-interactive clear in one could overwrite the sidecar of a live shell in the other.

Fix: Add a runtimeStatusEnabled flag on Config. The interactive UI flips it on right after the initial sidecar write succeeds. startNewSession() only refreshes the sidecar when the flag is set, so non-interactive paths leave sibling sidecars alone.

Tests: still 19/19. Build clean.

@tanzhenxin tanzhenxin added the type/feature-request New feature or enhancement request label Apr 29, 2026
}

async function renameWithRetry(
src: string,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] renameWithRetrypackages/core/src/utils/atomicFileWrite.ts:50-70 中的实现完全相同——相同的重试逻辑、错误码(EPERM/EACCES)、指数退避和重试次数。两者均为私有函数,形成了代码重复。对重试逻辑的 Bug 修复需要在两处同步应用,存在代码分化风险。

Suggested change
src: string,
// 建议:删除本地的 renameWithRetry,改为从 atomicFileWrite.js 导入重用;
// 或将 writeRuntimeStatus 委托给 atomicWriteJSON 以复用同一原子写入实现。

— deepseek-v4-pro via Qwen Code /review

// starting up.
try {
const sessionId = config.getSessionId();
const runtimeStatusPath = config.storage.getRuntimeStatusPath(sessionId);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] config.storage 为公开 readonly 字段,此处直接访问 config.storage.getRuntimeStatusPath(sessionId) 暴露了内部 Storage 对象。而在 Config.startNewSession() 内部,同一操作使用的是 this.storage。如果将来 Storage 重构改变了 getRuntimeStatusPath 的语义,此处不会收到编译时反馈。

Suggested change
const runtimeStatusPath = config.storage.getRuntimeStatusPath(sessionId);
// 建议在 Config 上添加封装方法:
getRuntimeStatusPath(sessionId: string): string {
return this.storage.getRuntimeStatusPath(sessionId);
}
// 然后此处改为:
const runtimeStatusPath = config.getRuntimeStatusPath(sessionId);

— deepseek-v4-pro via Qwen Code /review

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more pass over the diff — six small follow-ups, none blocking. Filing here so we can pick them up after merge.

  • Dead / redundant guards (isFiniteIntegerNotBool, the 'utf-8' branch in readRuntimeStatus)
  • index.ts re-export is out of alphabetical order
  • Module-header schema-compatibility wording contradicts the strict schema_version check in the reader
  • getRuntimeStatusPath doesn't sanitize sessionId path components (defense-in-depth)
  • Config.startNewSession's fire-and-forget IIFE has no ordering guarantee under rapid /clear bursts
  • No integration test for the Config.startNewSession path — which is exactly the glue rounds 3–5 kept regressing

typeof v === 'number' &&
Number.isInteger(v) &&
Number.isFinite(v) &&
typeof v !== 'boolean'

@wenshao wenshao May 5, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] Both guards here are dead code:

  • typeof v === 'number' already excludes booleans (typeof true === 'boolean'), so the trailing typeof v !== 'boolean' is unreachable.
  • Number.isInteger(NaN/±Infinity) returns false, so Number.isFinite(v) after Number.isInteger(v) is also redundant.

Reduce to typeof v === 'number' && Number.isInteger(v) and rename the helper to isInteger for clarity.

}
if (err instanceof Error && err.message.includes('utf-8')) {
return null;
}

@wenshao wenshao May 5, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] This branch is effectively dead code. Node's fs.readFile(path, 'utf-8') does not throw on invalid UTF-8 bytes — it substitutes U+FFFD and returns the decoded string. The returns null on invalid UTF-8 bytes test actually exercises the downstream JSON.parse failure path, not this err.message.includes('utf-8') check.

Suggested cleanup:

  1. Drop this branch and let the unconditional return null handle the residual case.
  2. Rename the test case to something like on bytes that decode to non-JSON so future readers aren't misled about what's being verified.

* - **Not** deleted on clean `/quit` or on crash. From an external
* observer's standpoint the recorded PID no longer exists in either
* case, so a liveness check is sufficient and an explicit cleanup
* adds nothing.

@wenshao wenshao May 5, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Issue] The header says “External consumers should treat unknown fields as forward-compatible additions”, but readRuntimeStatus (line 168) uses strict equality on schema_version and rejects anything ≥ 2 outright — a v2 producer that only adds fields would still be rejected by every v1 reader. Doc and code disagree; pick one:

  • Tighten the doc: declare schema_version a breaking version — bumping it is an ecosystem-wide break.
  • Loosen the reader: accept >= 1 and rely on the per-field type checks (which already exist) to absorb additive fields.

Most schema_version conventions imply the latter; either is fine, but the contract should be explicit.

export * from './utils/errorParsing.js';
export * from './utils/errors.js';
export * from './utils/fileUtils.js';
export * from './utils/runtimeStatus.js';

@wenshao wenshao May 5, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Nit] This block is alphabetical (fileUtilsfilesearch/fileSearchformatters). runtimeStatus belongs in the r section (next to retry.js); inserting it between fileUtils and filesearch breaks the existing convention.

* the same directory used for chat history to find live sessions.
*/
getRuntimeStatusPath(sessionId: string): string {
return path.join(this.getProjectDir(), 'chats', `${sessionId}.runtime.json`);

@wenshao wenshao May 5, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Issue] sessionId is normally a randomUUID(), but both --resume <id> and Config.startNewSession(sessionId) accept user-supplied strings without validation. A value like ../../foo would resolve outside the chats/ directory via path.join.

Impact is low (the user already owns their filesystem), but a one-liner reject for ids containing /, \, or .. is cheap defense-in-depth and lines up with how peer CLIs (kimi-cli, Codex) sanitize session ids.

qwenVersion: cliVersion,
});
} catch {
// ignored: best-effort cleanup

@wenshao wenshao May 5, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] Fire-and-forget IIFE here. If a user hits /clear twice within ~50 ms, two IIFEs interleave: unlink(A) → write(B) → unlink(B) → write(C), and the final on-disk state depends on kernel rename serialization rather than any guarantee in this code.

Two options:

  1. Hang a _runtimeStatusTask promise chain off this so swaps strictly serialize:
    this._runtimeStatusTask = (this._runtimeStatusTask ?? Promise.resolve())
      .then(() => clearRuntimeStatus(oldPath))
      .then(() => writeRuntimeStatus(newPath, { ... }))
      .catch(() => {});
  2. Add a one-line comment that documents “last writer wins, relies on rename atomicity” — at least the intent is recorded.

Unlikely to fire in practice, but spelling it out prevents the next contributor from re-discovering the same edge.

const newPath = path.join(tmpDir, 'session-b.runtime.json');
await writeRuntimeStatus(oldPath, {
sessionId: 'session-a',
workDir: '/w',

@wenshao wenshao May 5, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Issue] The existing same-PID session swap test calls the low-level helpers directly — it does not exercise Config.startNewSession(). The real integration glue is exactly where rounds 3–5 kept finding regressions:

  • whether the runtimeStatusEnabled gate fires;
  • whether the previousSessionId !== this.sessionId short-circuit holds;
  • whether the IIFE actually finishes deleting the old sidecar and writing the new one.

Suggested test in config.test.ts: build a Config, call markRuntimeStatusEnabled(), call startNewSession(newId), await new Promise(r => setImmediate(r)), then assert the old sidecar is gone and the new sidecar's fields are correct. That would catch the round 3–5 regressions in CI rather than in review.

@wenshao wenshao merged commit 464e4cf into QwenLM:main May 10, 2026
13 checks passed
@wenshao

wenshao commented May 10, 2026

Copy link
Copy Markdown
Collaborator

Post-merge tmux validation

Ran the bundled CLI in a real interactive tmux session against the merged commit (464e4cf3) to cover what the unit tests skip — actual startInteractiveUI() execution order, ink rendering startup, real getCliVersion() resolution, and the full /clear command-dispatch chain.

Setup:

tmux new-session -d -s qwen-rt -x 200 -y 50 -c "$WORKDIR" \
  "QWEN_RUNTIME_DIR=$RUNTIME_DIR node dist/cli.js"

Result on launch — sidecar appears at the expected path:

$RUNTIME_DIR/projects/-tmp-qwen-tmux-XXXX/chats/cc329ef4-3229-4271-b656-fb0cc7bd4f08.runtime.json
{
  "schema_version": 1,
  "pid": 1226115,
  "session_id": "cc329ef4-3229-4271-b656-fb0cc7bd4f08",
  "work_dir": "/tmp/qwen-tmux-J2c8",
  "hostname": "...",
  "started_at": 1778433841.934,
  "qwen_version": "0.15.3"
}

pid: 1226115 matches the live node dist/cli.js process. qwen_version matches getCliVersion(). work_dir matches launch cwd.

Same-PID swap via /clear: old cc329ef4-… sidecar is unlinked, new a36087f6-85ba-470c-9a27-4bae90462e1e.runtime.json is written under the same PID 1226115 with a fresh started_at. Confirms the runtimeStatusEnabled flag flips correctly inside startInteractiveUI() and Config.startNewSession() honors it on a live interactive session.

Check Result
Sidecar appears at <projectDir>/chats/<sessionId>.runtime.json on launch
All 7 schema fields present, snake_case, schema_version: 1
pid matches the running node process
qwen_version resolved from getCliVersion()
/clear removes old sidecar and writes new one (same PID)

wenshao added a commit that referenced this pull request May 11, 2026
)

- Drop dead `typeof !== 'boolean'` check in the integer-type guard
  (Number.isInteger(true) is already false; rename to isFiniteInteger).
- Drop dead utf-8-message branch in readRuntimeStatus's catch:
  fs.readFile(p, 'utf-8') returns replacement chars rather than throwing,
  so that branch never fires; the JSON.parse fallback already handles
  the truncated-UTF-8 case the test exercises.
- Move runtimeStatus.js export from between fileUtils and filesearch
  into alphabetical position next to runtimeFetchOptions.
- Add Config.startNewSession integration tests pinning the
  runtimeStatusEnabled gate: flag-off leaves sibling sidecars alone,
  flag-on swaps old->new on the same PID, no-op when sessionId is
  unchanged.

No behavior change for the runtime-status sidecar itself; this just
trims dead defensive code and adds the integration coverage that was
missing in #3714.
wenshao added a commit that referenced this pull request May 17, 2026
)

- Drop dead `typeof !== 'boolean'` check in the integer-type guard
  (Number.isInteger(true) is already false; rename to isFiniteInteger).
- Drop dead utf-8-message branch in readRuntimeStatus's catch:
  fs.readFile(p, 'utf-8') returns replacement chars rather than throwing,
  so that branch never fires; the JSON.parse fallback already handles
  the truncated-UTF-8 case the test exercises.
- Move runtimeStatus.js export from between fileUtils and filesearch
  into alphabetical position next to runtimeFetchOptions.
- Add Config.startNewSession integration tests pinning the
  runtimeStatusEnabled gate: flag-off leaves sibling sidecars alone,
  flag-on swaps old->new on the same PID, no-op when sessionId is
  unchanged.

No behavior change for the runtime-status sidecar itself; this just
trims dead defensive code and adds the integration coverage that was
missing in #3714.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type/feature-request New feature or enhancement request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants