Phase D (b) design: Ctrl+B to promote a running foreground shell to the background #3831

@wenshao

Design proposal for Phase D part (b) of #3634 — Ctrl+B-style shortcut to send a running foreground shell command to the background mid-flight, without losing in-progress work. Opening as a design issue (not a PR yet) so the architecture choices can be aligned before code lands.

Goal

After this lands:

  • User presses Ctrl+B while a foreground shell command is mid-flight.
  • The child process keeps running (not killed). The agent's turn is unblocked.
  • The shell becomes a regular BackgroundShellEntry — visible in /tasks, the Background tasks dialog, and steerable via task_stop.
  • The LLM-facing tool result for the original (now-promoted) call says "Promoted to background as bg_xxxxxxxx; inspect via /tasks (text) or the Background tasks dialog (interactive). Output continues streaming to <outputPath>."

This is the gesture users coming from tmux expect. It's the natural payoff of the kind framework being built around BackgroundShellRegistry already.

Why this specifically (vs. cancel + re-run)

The degenerate alternative — "Ctrl+B kills the foreground command and the agent re-runs with is_background: true" — was considered and rejected because:

  • For an npm install that is 30% done, losing the in-flight progress is exactly the cost users want to avoid.
  • The just-merged Phase D (a) hint (#3809, "feat(core): hint to background long-running foreground bash commands") already nudges agents to set is_background: true next time. A "kill + suggest" Ctrl+B would be redundant with that.
  • Ctrl+B users have tmux-shaped expectations about the gesture; mismatching the semantics (key keeps process alive in tmux, kills it in qwen-code) is a UX regression.

Scope

v1 (this design):

  • Foreground shell tool only (i.e. is_background: false invocations).
  • One running foreground shell at a time (the current architectural assumption — concurrent foreground shells aren't really a thing; if they ever become one, Ctrl+B promotes the most recent).
  • Ctrl+B keybinding on the Command enum.

Out of scope for v1:

  • Agent and monitor entries (v1 covers the foreground shell only).

Architecture options

Three plausible implementations, ordered by intrusiveness:

Option A — Pre-allocate hidden registry entry

Foreground spawn always creates BackgroundShellEntry { visible: false }. On Ctrl+B, flip visible: true, redirect output stream from in-memory accumulator to file.

  • Pro: Simple flag flip on promote.
  • Con: Output streaming has to be conditionally file-vs-memory throughout — touches ShellExecutionService heavily.
  • LOC est.: ~400-500.
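
A minimal sketch of the Option A flag flip, assuming a registry keyed by shell id. HiddenShellEntry, HiddenEntryRegistry, and their method names are hypothetical and are not the existing BackgroundShellRegistry API. The output redirect that Option A also needs is left out.

```typescript
// Hypothetical entry shape: only the `visible` flag comes from the proposal.
interface HiddenShellEntry {
  shellId: string;
  pid: number;
  visible: boolean;
}

class HiddenEntryRegistry {
  private entries = new Map<string, HiddenShellEntry>();

  // Every foreground spawn registers a hidden entry up front.
  register(shellId: string, pid: number): HiddenShellEntry {
    const entry: HiddenShellEntry = { shellId, pid, visible: false };
    this.entries.set(shellId, entry);
    return entry;
  }

  // Ctrl+B flips the flag; returns false for unknown or already visible entries.
  promote(shellId: string): boolean {
    const entry = this.entries.get(shellId);
    if (!entry || entry.visible) return false;
    entry.visible = true;
    return true;
  }

  // What a /tasks-style listing would show: promoted entries only.
  listVisible(): HiddenShellEntry[] {
    return Array.from(this.entries.values()).filter((e) => e.visible);
  }
}
```

The cost noted above is visible here: every foreground command pays for a registry entry even when Ctrl+B is never pressed.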

Option B — Lazy promote on Ctrl+B (preferred)

Foreground stays as-is until Ctrl+B fires. On Ctrl+B, synchronously: (1) create a BackgroundShellEntry with the existing pid, (2) write the accumulated output buffer to the new on-disk output file as initial content, (3) install a stream redirect so subsequent data events go to the file, (4) abort the foreground watcher with signal.reason = { kind: 'background', shellId: 'bg_xxx' }.

  • Pro: Zero overhead in the common case (no Ctrl+B → no promote logic runs). Lazy state capture.
  • Con: More state to capture at promote time (output buffer flush, stream redirection); need to handle race "command finishes within milliseconds of Ctrl+B".
  • LOC est.: ~300-400.
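
The four synchronous steps of Option B can be sketched as follows. This is an illustration under stated assumptions, not the shell.ts implementation: ForegroundShellSketch, OutputSink, PromotedEntry, and the openSink callback are hypothetical, the output file is modeled as an injected sink, and the race policy (no-op when the command has already exited) is only one of the two choices raised in the design questions.

```typescript
// All names below are hypothetical stand-ins, not the real qwen-code types.
interface OutputSink {
  write(chunk: string): void;
}

interface PromotedEntry {
  shellId: string;
  pid: number;
  outputPath: string;
}

class ForegroundShellSketch {
  readonly pid: number;
  readonly controller = new AbortController();
  private buffer = '';
  private sink: OutputSink | null = null;
  private exited = false;

  constructor(pid: number) {
    this.pid = pid;
  }

  // Data events go to memory until a promote installs the sink.
  onData(chunk: string): void {
    if (this.sink) {
      this.sink.write(chunk);
    } else {
      this.buffer += chunk;
    }
  }

  onExit(): void {
    this.exited = true;
  }

  // Runs synchronously, so no data event can interleave with the four steps.
  promote(
    shellId: string,
    outputPath: string,
    openSink: (path: string) => OutputSink,
    registry: Map<string, PromotedEntry>,
  ): PromotedEntry | null {
    // Assumed race policy: no-op when already finished or already promoted.
    if (this.exited || this.sink) return null;

    const entry: PromotedEntry = { shellId, pid: this.pid, outputPath };
    registry.set(shellId, entry); // (1) registry entry with the existing pid
    const sink = openSink(outputPath);
    sink.write(this.buffer); // (2) accumulated output becomes initial content
    this.buffer = '';
    this.sink = sink; // (3) later data events go to the file
    this.controller.abort({ kind: 'background', shellId }); // (4) release the watcher
    return entry;
  }
}
```

Nothing in onData or onExit changes for a command that is never promoted, which is the "zero overhead in the common case" property.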

Option C — signal.reason as takeover channel

Same as B but reuses the existing AbortSignal plumbing for the promote signal. ShellExecutionService differentiates abort reasons: default (no reason / { kind: 'cancel' }) kills the child as today; { kind: 'background' } only stops the watcher, leaves the child running.

  • Pro: Clean signaling; reuses existing AbortController plumbing rather than adding a side-channel.
  • Con: ShellExecutionService.execute()'s abort-handling block needs the reason discrimination — non-trivial change in a sensitive file.
  • LOC est.: ~350-450.
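
A hedged sketch of the reason discrimination Option C describes, using the standard AbortController.abort(reason) and AbortSignal.reason. ChildHandle, AbortOutcome, isBackgroundReason, and attachAbortHandler are hypothetical names for illustration; the real abort listener in ShellExecutionService tree-kills the child and has PTY teardown that is not modeled here.

```typescript
// Hypothetical reason shape; the proposal specifies only `kind` and `shellId`.
interface BackgroundReason {
  kind: 'background';
  shellId: string;
}

// Stand-in for the child process handle.
interface ChildHandle {
  kill(): void;
}

type AbortOutcome = 'killed' | 'detached';

function isBackgroundReason(reason: unknown): reason is BackgroundReason {
  if (typeof reason !== 'object' || reason === null) return false;
  const candidate = reason as { kind?: unknown; shellId?: unknown };
  return candidate.kind === 'background' && typeof candidate.shellId === 'string';
}

function attachAbortHandler(
  child: ChildHandle,
  signal: AbortSignal,
  onOutcome: (outcome: AbortOutcome) => void,
): void {
  signal.addEventListener(
    'abort',
    () => {
      if (isBackgroundReason(signal.reason)) {
        onOutcome('detached'); // stop watching, leave the child running
        return;
      }
      child.kill(); // no reason, { kind: 'cancel' }, or anything unrecognized
      onOutcome('killed');
    },
    { once: true },
  );
}
```

Treating every unrecognized reason as a kill keeps today's behavior as the default, so callers that never set a reason are unaffected.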

Initial preference: Option B + C combined — lazy promote (B) with signal.reason as the takeover channel (C). Gives the cleanest signaling without adding state for the never-pressed case.

Specific design questions

For reviewers / authors with context on the surrounding code:

  1. pid capture timing: coreToolScheduler calls setPidCallback(pid) (line ~1595) shortly after spawn. Is the TrackedExecutingToolCall.pid reliably populated by the time a user can plausibly press Ctrl+B (i.e. is the lag <500ms)?

  2. Output buffer at promote moment: foreground accumulates cumulativeOutput: string | AnsiOutput (shell.ts ~603). Writing this to the new on-disk file as initial content is straightforward for the string case but the AnsiOutput case needs a serialization step — is terminalSerializer the right tool? Any precedent?

  3. ShellExecutionService.execute() abort discrimination: today the abort listener (shellExecutionService.ts ~329) tree-kills the child. Adding a "if reason.kind === 'background', skip the kill, just stop watching" branch — does this play nicely with the existing PTY teardown / event listener cleanup, or are there listeners that need to be intentionally kept alive for the promoted entry's settle path?

  4. Concurrent-shell semantics: today only one foreground shell at a time is the assumption. If a future PR ever allows concurrent foreground shells, does the Ctrl+B → "promote the most recent" rule survive? Or do we need a UI for selecting which to promote?

  5. Race: command finishes within 100ms of Ctrl+B: should the promote attempt be idempotent (no-op if already terminal), or should it return an "already finished, here's the output" message? The latter is more honest UX.

  6. Keybinding conflicts: Ctrl+B is currently unbound (verified in keyBindings.ts and the no-readline composer paths). Any reason to prefer a different binding (e.g. Ctrl+)?

  7. aborted + promoted shape on the result (raised by @tanzhenxin in the review of PR-1, #3842, "feat(core): add signal.reason convention for ShellExecutionService (#3831 PR-1 of 3)"): PR-1 lands result.aborted: true, result.promoted: true for the promote branch. The existing tools/shell.ts consumer checks aborted first and emits "cancelled" copy, so PR-2 callers must branch on promoted BEFORE aborted. The reviewer suggested setting aborted: false when promoted: true so existing branches just work. That change was deferred to PR-2 because flipping the shape now would re-churn the consumer, and PR-2 owns the actual caller-side branching logic. Decide here:

    • Option (a): promoted: true, aborted: false — existing consumers' if (aborted) branch falls through naturally; promote handling lives in a new if (result.promoted) arm above the cancel/timeout copy. Simpler downstream.
    • Option (b): keep promoted: true, aborted: true — explicit "I was aborted (with a specific reason)"; consumers must check promoted first to short-circuit the cancel arm. Requires every consumer to remember the precedence.
    • Option (a) seems cleaner and is what the reviewer suggested. Concrete impact in tools/shell.ts:700 and any future consumer that mimics that pattern.
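
To make the precedence concern in question 7 concrete, here is a small consumer sketch. ShellResultSketch and renderToolResult are hypothetical, the field names beyond aborted and promoted are assumptions, and the "cancelled" string is placeholder copy rather than the text in tools/shell.ts.

```typescript
// Hypothetical result shape; only `aborted` and `promoted` come from the proposal.
interface ShellResultSketch {
  output: string;
  aborted: boolean;
  promoted?: boolean;
  shellId?: string;
  outputPath?: string;
}

function renderToolResult(result: ShellResultSketch): string {
  // Checked first, so the function is correct under option (a) and option (b).
  if (result.promoted) {
    return (
      `Promoted to background as ${result.shellId}; inspect via /tasks (text) ` +
      `or the Background tasks dialog (interactive). ` +
      `Output continues streaming to ${result.outputPath}.`
    );
  }
  if (result.aborted) {
    return 'Command was cancelled.'; // placeholder copy
  }
  return result.output;
}
```

Under option (a) a consumer that forgets the promoted arm merely falls through to the plain-output branch; under option (b) the same omission would report a promoted command as cancelled.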

Proposed PR sequencing

If the architecture above is roughly right, 3 PRs (not 4 — see "Why not a separate test PR" below):

  1. PR-1 (foundation, ~150 LOC): signal.reason convention + ShellExecutionService.execute() reason-aware abort handling, with its own unit tests pinning both branches (default abort still tree-kills; { kind: 'background' } only stops the watcher). No user-visible behavior change yet (no caller sets the reason). Pure plumbing — independently mergeable / revertable.
  2. PR-2 (shell.ts integration, ~280 LOC): lazy promote in shell.ts's foreground path; new BackgroundShellRegistry.promoteFromForeground(...) helper; output buffer flush + stream redirect; ToolResult shape for the promoted return; LLM-facing hint wording. Includes unit + integration tests for the promote path (string / AnsiOutput buffers; race-with-natural-exit; abort discrimination).
  3. PR-3 (UI wire-up + docs, ~120 LOC): Command.PROMOTE_TO_BACKGROUND keybinding (Ctrl+B); AppContainer detection during executing-tool state; calls into the scheduler with the existing pid + new reason. Includes E2E tests (key press → promote → registry entry visible in /tasks + dialog) and docs (keybinding in help text + Background tasks pill mention).

Total ~550 LOC across 3 PRs vs one big bang. Each is independently mergeable / revertable.

Why not a separate test PR: an earlier draft sketched a 4th "tests + docs" PR. Splitting tests off from the implementation they cover is an anti-pattern: (a) reviewers of the implementation PRs can't verify behavior matches expectation, (b) between the implementation merge and the test-PR merge, main is uncovered, (c) the cleavage breaks the "independently mergeable / revertable" property the splitting is supposed to give us. So tests live with the implementation they exercise; PR-3 picks up docs because the keybinding is its user-visible surface.

Risks

  • Foreground hot path regression: shell.ts's foreground execute is the most-used code path in the tool. Any refactor risks breaking flicker / streaming / cancellation. Mitigation: PR-1 is no-op behaviorally; behavior change starts in PR-2 with full test coverage of the existing paths.
  • PTY child-process-life management: detaching the watcher from a PTY child without leaking handles is the technically subtle piece. Need to verify that ShellExecutionService.execute()'s exit-event listener stays attached for the promoted entry's settle path.
  • Output buffer ANSI mid-line fragments: if promote fires while a partial ANSI escape is being assembled, splitting between memory and file could corrupt the rendered output. Mitigation: flush at chunk boundaries only, never mid-chunk.
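
The chunk-boundary mitigation can be illustrated with a small router that moves whole chunks only. ChunkRouter and Sink are hypothetical names, and the sketch assumes the redirect runs synchronously between data events, so a chunk is never split between memory and file.

```typescript
// Hypothetical sink interface standing in for the on-disk output file.
type Sink = { write(chunk: string): void };

// Routes whole chunks either to the in-memory list or to the file sink.
// The switch happens only between chunks, never inside one.
class ChunkRouter {
  private memory: string[] = [];
  private file: Sink | null = null;

  push(chunk: string): void {
    if (this.file) {
      this.file.write(chunk);
    } else {
      this.memory.push(chunk);
    }
  }

  // Flushes every buffered chunk in arrival order, then redirects.
  redirectTo(file: Sink): void {
    for (const chunk of this.memory) file.write(chunk);
    this.memory = [];
    this.file = file;
  }
}

// An escape sequence split across two chunks, with the redirect in between.
const fileChunks: string[] = [];
const router = new ChunkRouter();
router.push('\x1b[3'); // first half of the color escape, held in memory
router.redirectTo({ write: (c) => { fileChunks.push(c); } });
router.push('1mred\x1b[0m'); // second half, streamed straight to the file
const fileContent = fileChunks.join('');
```

Because the file receives the same chunks in the same order, fileContent is byte-identical to the unpromoted stream even though the redirect landed in the middle of an escape sequence.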

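Summary (translated from Chinese)

Goal: the user runs a foreground command such as npm run dev or pytest --slow, realizes partway through that the agent needs to move on to other work, and presses Ctrl+B. The process keeps running and is not killed, the agent is unblocked, and the command becomes a regular BackgroundShellEntry that can be inspected via /tasks or the interactive dialog and stopped with task_stop.

Why not "kill + re-run": re-running an npm install that is 30% done is the last thing users want. The Phase D (a) hint (#3809) already nudges the agent to use is_background: true next time, so a "kill + hint" Ctrl+B would duplicate it. tmux users have muscle memory that Ctrl+B keeps the process alive; inverting the semantics would be worse.

v1 scope: foreground shell only (agent / monitor are excluded), single-instance Ctrl+B (multiple parallel foreground shells are deferred until they actually exist), and the Ctrl+B key binding.

Three architecture options: A pre-allocates a hidden entry, B promotes lazily, C uses signal.reason as the takeover channel. The initial preference is B+C combined (lazy promote plus signal.reason signaling): the cleanest option, with zero overhead in the common case.

The specific design questions are listed in the English section above; they mainly cover pid timing, output buffer serialization, the cleanup path when abort discriminates on reason, concurrency semantics, race handling, and the choice of key binding.

PR slicing: 3 PRs, ~550 LOC (PR-1: signal.reason framework with its own tests / PR-2: shell.ts integration plus unit and integration tests / PR-3: UI wire-up, Ctrl+B key binding, E2E tests, and docs), each independently mergeable and revertable. No separate "test PR": splitting tests from the implementation is an anti-pattern. Reviewers of the implementation PRs cannot judge whether behavior matches expectations; between the implementation merge and the test PR merge, main has no coverage; and the "independently mergeable / revertable" property the slicing is meant to provide is broken.

Main risks: the foreground shell is the hottest path, PTY child-life management is the most technically subtle piece, and ANSI mid-line fragments in the output buffer require chunk-boundary flushing.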


Calling on contributors familiar with the kind framework / dialog (per recent #3768) and scheduler / shell internals (per recent #3739, #3792) for review on the architecture choice + the 7 design questions before opening PR-1. Will hold off on coding until either a green light or a redirect.
