Skip to content

feat(core): model-facing agent control (task_stop, send_message, per-agent transcript)#3471

Merged
tanzhenxin merged 13 commits into
mainfrom
feat/background-agent-control
Apr 27, 2026
Merged

feat(core): model-facing agent control (task_stop, send_message, per-agent transcript)#3471
tanzhenxin merged 13 commits into
mainfrom
feat/background-agent-control

Conversation

@tanzhenxin

@tanzhenxin tanzhenxin commented Apr 20, 2026

Copy link
Copy Markdown
Collaborator

TLDR

Today a parent agent can launch a background subagent but then can only wait for the final notification — no way to check progress, redirect it mid-flight, or stop it. This PR adds the three missing affordances (read the live transcript, send_message, task_stop) and switches the transcript to the same ChatRecord JSONL schema as the main session log so a single parser covers both.

Screenshots / Video Demo

N/A — no user-visible UI change. The PR adds tools the parent model calls and changes the on-disk shape of subagent transcripts; both are observable only via stream-json output, which the Reviewer Test Plan walks through.

Dive Deeper

The PR lands three new tools, a shared on-disk format for subagent transcripts, and a small lifecycle fix the control plane depends on.

   Main agent              Transcript file              Sub-agent
       │                          │                          │
       │ ────── delegate (agent) ──────────────────────────▶ │   (existing)
       │                          │                          │
       │                          │ ◀──── writes records ─── │
       │                          │                          │
       │ ────── check on it (read_file) ─▶                   │   ✱ new
       │                          │                          │
       │ ────── redirect (send_message) ───────────────────▶ │   ✱ new
       │ ────── stop (task_stop) ──────────────────────────▶ │   ✱ new
       │                          │                          │
       │ ◀───── completion (task_notification) ───────────── │   (existing)

New tools

  • task_stop — request cancellation of a running background agent by id. Aborts the agent's signal and lets its natural handler emit a terminal task_notification carrying the real partial/final result, instead of a bare "cancelled" message. The tool returns immediately with an acknowledgement so the parent isn't blocked while the target unwinds.
  • send_message — inject a message into a running background agent's reasoning loop. Messages are queued in the registry; the agent drains them between tool rounds, and they ride alongside the next tool-response user turn (prefixed with a fixed marker so downstream parsers can identify them). Each successful injection is also recorded in the subagent's transcript as a user-role record so the transcript stays a complete, reconstructible view of the run. This is what lets the parent give mid-flight guidance to a long-running agent without losing the audit trail.

Per-agent transcripts

Each background subagent emits two sibling files under <projectDir>/subagents/<sessionId>/: a canonical JSONL transcript (agent-<id>.jsonl) and a metadata sidecar (agent-<id>.meta.json). The transcript uses the same ChatRecord schema as the main session log — extended with four optional fields (agentId, agentName, agentColor, isSidechain) — so a single parser can consume both main and subagent records together, and tooling that walks chats/ is unaffected by the new sibling directory. Each record chains via parentUuid so readers can walk the transcript tree the same way they walk the main session log.

The writer subscribes to the subagent's event emitter and maps events directly to records: the launching prompt is recorded once at attach time as a user record, send_message injections become additional user records mid-stream, ROUND_TEXT and TOOL_CALL events become assistant records, and TOOL_RESULT events become tool_result records carrying the same response parts the model saw. Streaming progress (TOOL_OUTPUT_UPDATE) is intentionally not persisted — the model has a single structured record per tool call and waits for it rather than polling mid-flight chunks.

The dedicated emitter is important: reusing the parent tool's UI emitter would mix events from every concurrent fork/subagent into the same transcript. Isolation keeps each transcript to exactly one agent's events.

The launch tool response carries the JSONL path and tells the model the file is JSON-lines so it can parse with no special handling. The same path is surfaced in the completion notification XML via a new <output-file> tag, and read_file is auto-allowed under the subagents/ directory so the model's progress-polling flow runs without permission prompts. The sidecar lets external readers list subagents/<sessionId>/ and discover every subagent without opening any transcripts; it carries agentId, agentType, description, parentSessionId, parentAgentId, and createdAt.

Lifecycle

A few edges around the registry lifecycle were tightened so the new tools sit on a safe substrate: the cancel path gains an unref'd grace-timer fallback so a tool that swallows AbortSignal can no longer leave a launch without its terminal task_notification; the headless wait loop blocks on "every agent has produced a terminal notification" rather than just "no agents running"; and the registry grows the small amount of state the new tools need (outputFile, pendingMessages, notified).

Reviewer Test Plan

In non-interactive (stream-json) mode, a reviewer should be able to:

  1. Launch a background agent (e.g. a long-running explore task) and confirm the launch message reports the JSONL transcript path under <projectDir>/subagents/<sessionId>/ plus the three control paths (task_stop, send_message, read_file). The directory should contain exactly two files: agent-<id>.jsonl and agent-<id>.meta.json.
  2. While it is running, have the parent call send_message. Verify the agent acts on the message on its next tool round, and that the transcript now contains a new user record carrying the injected text.
  3. Call task_stop on a running agent — verify exactly one terminal task_notification arrives with status=cancelled and the partial result attached, and that the task-id/tool-use-id match the launch.
  4. read_file the transcript mid-run and confirm valid JSONL with the launching prompt as the first record — and no permission prompt raised for the read.
  5. Inspect the .meta.json sidecar and confirm it contains agentId, agentType, description, parentSessionId (matching the parent session's chats/<sessionId>.jsonl), parentAgentId (null for top-level), and createdAt.
  6. Run a long-running streaming shell command inside a background subagent and confirm the JSONL contains exactly one tool_result record for it — no mid-flight progress records.
  7. Cancel with SIGINT while background agents are running — verify every outstanding task_started still pairs with a task_notification before the process exits (no orphan task_started lines).
  8. Error cases: task_stop on an unknown id returns the TASK_STOP_AGENT_NOT_FOUND error; task_stop on an already-completed agent returns TASK_STOP_AGENT_NOT_RUNNING; same pattern for send_message.

Unit coverage is in background-tasks.test.ts, task-stop.test.ts, send-message.test.ts, agent-transcript.test.ts, and the background block in agent.test.ts. The grace-period fallback is covered by two new fake-timer tests: one where the natural handler never runs (fallback fires), one where it wins the race (fallback no-ops). The transcript writer tests cover the launching-prompt seed, the EXTERNAL_MESSAGEuser-record path, and the parentUuid chaining invariant.

Testing Matrix

🍏 🪟 🐧
npm run
npx
Docker
Podman - -
Seatbelt - -

Linked issues / bugs

N/A — product-driven enhancement.

…und agents

Add two new tools (task_stop, send_message) and a plain-text transcript
writer so the parent model can control and observe long-running background
subagents. The agent lifecycle is also tightened so every background launch
is paired with exactly one terminal task_notification — including under
cancellation races and pathological tools that swallow AbortSignal.
Replaces the plain-text per-agent transcript writer with one that emits
the same ChatRecord schema as the main session log. Each background
subagent now writes to <projectDir>/subagents/<sessionId>/agent-<id>.jsonl
with a .meta.json sidecar; records carry agentId/agentName/agentColor
and isSidechain so a single parser can reconstruct the parent session
and its subagents as one tree.

A new EXTERNAL_MESSAGE event is emitted when send_message injections are
drained inside agent-core, so each follow-up message is persisted as a
user-role record and the transcript remains a complete view of the run.

read_file's auto-allow set is extended to <projectDir>/subagents/ so the
model can keep polling the transcript path advertised in the launch
response and the completion notification XML.
Drop the 2000-char truncation on <result> in emitNotification. The agent
output is already a model-generated summary; truncating it strips content
the parent agent specifically asked for. The <output-file> path is still
included for anyone who wants the structured transcript.
@tanzhenxin tanzhenxin linked an issue Apr 20, 2026 that may be closed by this pull request
@github-actions

github-actions Bot commented Apr 20, 2026

Copy link
Copy Markdown
Contributor

Code Coverage Summary

Package Lines Statements Functions Branches
CLI 53.77% 53.77% 70.2% 79.58%
Core 75.26% 75.26% 77.84% 81.56%
CLI Package - Full Text Report
-------------------|---------|----------|---------|---------|-------------------
File               | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s 
-------------------|---------|----------|---------|---------|-------------------
All files          |   53.77 |    79.58 |    70.2 |   53.77 |                   
 src               |   61.85 |    62.43 |    62.5 |   61.85 |                   
  gemini.tsx       |   58.41 |    56.79 |      60 |   58.41 | ...22,730-733,741 
  ...ractiveCli.ts |   54.83 |    57.35 |   44.44 |   54.83 | ...07-715,723-724 
  ...liCommands.ts |   73.92 |     72.5 |     100 |   73.92 | ...40-264,289,389 
  ...ActiveAuth.ts |     100 |     87.5 |     100 |     100 | 66-80             
 ...cp-integration |    46.3 |    63.01 |   55.88 |    46.3 |                   
  acpAgent.ts      |   48.12 |    63.38 |   62.06 |   48.12 | ...91-793,807-815 
  authMethods.ts   |   12.19 |      100 |       0 |   12.19 | 11-31,34-38,41-50 
  errorCodes.ts    |       0 |        0 |       0 |       0 | 1-22              
  ...DirContext.ts |     100 |      100 |     100 |     100 |                   
 ...ration/service |   68.65 |    83.33 |   66.66 |   68.65 |                   
  filesystem.ts    |   68.65 |    83.33 |   66.66 |   68.65 | ...32,77-94,97-98 
 ...ration/session |   64.39 |    67.15 |   73.21 |   64.39 |                   
  ...ryReplayer.ts |   64.83 |    72.97 |   81.81 |   64.83 | ...68-269,277-278 
  Session.ts       |      59 |    63.22 |   64.28 |      59 | ...2049,2055-2058 
  ...entTracker.ts |   90.85 |    84.84 |      90 |   90.85 | ...35,199,251-260 
  index.ts         |       0 |        0 |       0 |       0 | 1-40              
  ...ssionUtils.ts |   84.21 |    77.77 |     100 |   84.21 | ...37-153,209-211 
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 ...ssion/emitters |   91.53 |    89.47 |   88.46 |   91.53 |                   
  BaseEmitter.ts   |   76.92 |    66.66 |      80 |   76.92 | 23-24,39-40,55-56 
  ...ageEmitter.ts |   82.22 |    83.33 |   83.33 |   82.22 | 29-44             
  PlanEmitter.ts   |     100 |      100 |     100 |     100 |                   
  ...allEmitter.ts |   97.96 |     91.8 |     100 |   97.96 | 226-227,316,324   
  index.ts         |       0 |        0 |       0 |       0 | 1-10              
 ...ession/rewrite |   89.69 |    85.89 |   94.11 |   89.69 |                   
  LlmRewriter.ts   |   80.53 |    79.31 |     100 |   80.53 | ...17-119,170-174 
  ...Middleware.ts |   95.83 |    85.71 |     100 |   95.83 | 119,127-129       
  TurnBuffer.ts    |     100 |      100 |     100 |     100 |                   
  config.ts        |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 src/commands      |   66.22 |      100 |    12.5 |   66.22 |                   
  auth.ts          |   46.55 |      100 |       0 |   46.55 | ...58,67-72,75-76 
  channel.ts       |   56.66 |      100 |       0 |   56.66 | 15-19,27-34       
  extensions.tsx   |   96.55 |      100 |      50 |   96.55 | 37                
  hooks.tsx        |   66.66 |      100 |       0 |   66.66 | 20-24             
  mcp.ts           |   94.73 |      100 |      50 |   94.73 | 28                
 src/commands/auth |   42.33 |    95.83 |      60 |   42.33 |                   
  handler.ts       |   27.44 |    94.44 |   14.28 |   27.44 | 55-404            
  ...veSelector.ts |     100 |    96.66 |     100 |     100 | 58                
 ...mmands/channel |    26.5 |    93.75 |   26.47 |    26.5 |                   
  ...l-registry.ts |    8.57 |      100 |       0 |    8.57 | 6-21,24-42        
  config-utils.ts  |   91.89 |      100 |   66.66 |   91.89 | 20-25             
  configure.ts     |    14.7 |      100 |       0 |    14.7 | 18-21,23-84       
  pairing.ts       |   26.31 |      100 |       0 |   26.31 | ...30,40-50,52-65 
  pidfile.ts       |   96.34 |    86.95 |     100 |   96.34 | 49,59,91          
  start.ts         |     5.1 |      100 |       0 |     5.1 | ...69-472,474-482 
  status.ts        |   17.54 |      100 |       0 |   17.54 | 15-26,32-77       
  stop.ts          |      20 |      100 |       0 |      20 | 14-48             
 ...nds/extensions |   84.53 |    88.95 |   81.81 |   84.53 |                   
  consent.ts       |   71.65 |    89.28 |   42.85 |   71.65 | ...85-141,156-162 
  disable.ts       |     100 |      100 |     100 |     100 |                   
  enable.ts        |     100 |      100 |     100 |     100 |                   
  install.ts       |    75.6 |    66.66 |   66.66 |    75.6 | ...39-142,145-153 
  link.ts          |     100 |      100 |     100 |     100 |                   
  list.ts          |     100 |      100 |     100 |     100 |                   
  new.ts           |     100 |      100 |     100 |     100 |                   
  settings.ts      |   99.15 |      100 |   83.33 |   99.15 | 151               
  uninstall.ts     |    37.5 |      100 |   33.33 |    37.5 | 23-45,57-64,67-70 
  update.ts        |   96.32 |      100 |     100 |   96.32 | 101-105           
  utils.ts         |   60.24 |    28.57 |     100 |   60.24 | ...81,83-87,89-93 
 ...les/mcp-server |       0 |        0 |       0 |       0 |                   
  example.ts       |       0 |        0 |       0 |       0 | 1-60              
 src/commands/mcp  |   92.29 |    86.08 |   88.88 |   92.29 |                   
  add.ts           |     100 |    98.03 |     100 |     100 | 293               
  list.ts          |   91.22 |    80.76 |      80 |   91.22 | ...19-121,146-147 
  reconnect.ts     |   76.72 |    71.42 |   85.71 |   76.72 | 35-48,153-175     
  remove.ts        |     100 |       80 |     100 |     100 | 21-25             
 src/config        |   92.21 |    82.59 |   84.28 |   92.21 |                   
  auth.ts          |   87.87 |    81.35 |     100 |   87.87 | ...20-221,237-238 
  config.ts        |      87 |    82.63 |      70 |      87 | ...1244,1266-1267 
  keyBindings.ts   |   95.95 |       50 |     100 |   95.95 | 160-163           
  ...idersScope.ts |      92 |       90 |     100 |      92 | 11-12             
  sandboxConfig.ts |    58.9 |    61.53 |   66.66 |    58.9 | ...54-68,73,77-89 
  settings.ts      |   83.13 |    82.55 |   85.71 |   83.13 | ...35-936,941-944 
  ...ingsSchema.ts |     100 |      100 |     100 |     100 |                   
  ...tedFolders.ts |   96.29 |       94 |     100 |   96.29 | ...88-190,205-206 
 ...nfig/migration |   94.56 |    78.94 |   83.33 |   94.56 |                   
  index.ts         |   93.93 |    88.88 |     100 |   93.93 | 85-86             
  scheduler.ts     |   96.55 |    77.77 |     100 |   96.55 | 19-20             
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 ...ation/versions |   93.63 |     94.5 |     100 |   93.63 |                   
  ...-v2-shared.ts |     100 |      100 |     100 |     100 |                   
  v1-to-v2.ts      |   81.75 |    90.19 |     100 |   81.75 | ...28-229,231-247 
  v2-to-v3.ts      |     100 |      100 |     100 |     100 |                   
 src/constants     |    3.52 |        0 |       0 |    3.52 |                   
  ...dardApiKey.ts |     100 |      100 |     100 |     100 |                   
  codingPlan.ts    |       0 |        0 |       0 |       0 | 1-347             
 src/core          |     100 |      100 |     100 |     100 |                   
  auth.ts          |     100 |      100 |     100 |     100 |                   
  initializer.ts   |     100 |      100 |     100 |     100 |                   
  theme.ts         |     100 |      100 |     100 |     100 |                   
 src/dualOutput    |   63.09 |    64.51 |   55.55 |   63.09 |                   
  ...tputBridge.ts |   62.94 |    65.51 |   56.25 |   62.94 | ...22-323,331-334 
  ...utContext.tsx |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-8               
 src/export        |       0 |        0 |       0 |       0 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-7               
 src/generated     |     100 |      100 |     100 |     100 |                   
  git-commit.ts    |     100 |      100 |     100 |     100 |                   
 src/i18n          |   48.26 |    76.19 |   38.88 |   48.26 |                   
  index.ts         |   26.92 |    76.92 |   26.66 |   26.92 | ...38-239,249-260 
  languages.ts     |    98.7 |       75 |     100 |    98.7 | 110               
 src/i18n/locales  |       0 |        0 |       0 |       0 |                   
  ca.js            |       0 |        0 |       0 |       0 | 1-2143            
  de.js            |       0 |        0 |       0 |       0 | 1-2064            
  en.js            |       0 |        0 |       0 |       0 | 1-2114            
  fr.js            |       0 |        0 |       0 |       0 | 1-2097            
  ja.js            |       0 |        0 |       0 |       0 | 1-1554            
  pt.js            |       0 |        0 |       0 |       0 | 1-2055            
  ru.js            |       0 |        0 |       0 |       0 | 1-2060            
  zh-TW.js         |       0 |        0 |       0 |       0 | 1-1676            
  zh.js            |       0 |        0 |       0 |       0 | 1-1915            
 ...nonInteractive |   68.34 |    71.68 |   68.88 |   68.34 |                   
  session.ts       |    73.1 |    69.52 |   81.81 |    73.1 | ...03-604,612-622 
  types.ts         |    42.5 |      100 |   33.33 |    42.5 | ...80-581,584-585 
 ...active/control |   77.55 |    88.23 |      80 |   77.55 |                   
  ...rolContext.ts |    7.69 |        0 |       0 |    7.69 | 47-79             
  ...Dispatcher.ts |   91.66 |    91.83 |   88.88 |   91.66 | ...54-372,388,391 
  ...rolService.ts |       8 |        0 |       0 |       8 | 46-179            
 ...ol/controllers |    7.04 |       80 |   13.33 |    7.04 |                   
  ...Controller.ts |   19.32 |      100 |      60 |   19.32 | 81-118,127-210    
  ...Controller.ts |       0 |        0 |       0 |       0 | 1-56              
  ...Controller.ts |    3.96 |      100 |   11.11 |    3.96 | ...61-379,389-494 
  ...Controller.ts |   14.06 |      100 |       0 |   14.06 | ...82-117,130-133 
  ...Controller.ts |    5.21 |      100 |       0 |    5.21 | ...21-433,442-471 
 .../control/types |       0 |        0 |       0 |       0 |                   
  serviceAPIs.ts   |       0 |        0 |       0 |       0 | 1                 
 ...Interactive/io |   97.59 |    93.06 |   95.18 |   97.59 |                   
  ...putAdapter.ts |   97.33 |    91.89 |   98.07 |   97.33 | ...1343,1368-1369 
  ...putAdapter.ts |      96 |    91.66 |   85.71 |      96 | 51-52             
  ...nputReader.ts |     100 |    94.73 |     100 |     100 | 67                
  ...putAdapter.ts |   98.28 |      100 |      90 |   98.28 | 81-82,122-123     
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/patches       |       0 |        0 |       0 |       0 |                   
  is-in-ci.ts      |       0 |        0 |       0 |       0 | 1-17              
 src/remoteInput   |   86.98 |       75 |   85.71 |   86.98 |                   
  ...utContext.tsx |     100 |      100 |     100 |     100 |                   
  ...putWatcher.ts |   88.12 |    76.08 |   91.66 |   88.12 | ...21-222,233-236 
  index.ts         |       0 |        0 |       0 |       0 | 1-8               
 src/services      |   89.99 |    89.29 |   94.11 |   89.99 |                   
  ...mandLoader.ts |     100 |     92.3 |     100 |     100 | 87                
  ...killLoader.ts |     100 |    96.29 |     100 |     100 | 44                
  ...andService.ts |    93.5 |      100 |      80 |    93.5 | 107,150-153       
  ...mandLoader.ts |   86.83 |    83.87 |     100 |   86.83 | ...30-335,340-345 
  ...omptLoader.ts |   75.32 |    80.64 |   83.33 |   75.32 | ...05-206,272-273 
  ...mandLoader.ts |     100 |      100 |     100 |     100 |                   
  ...nd-factory.ts |      91 |     90.9 |     100 |      91 | 123,132-139       
  ...ation-tool.ts |     100 |    95.45 |     100 |     100 | 125               
  commandUtils.ts  |      96 |       90 |     100 |      96 | 48                
  ...and-parser.ts |   90.69 |    85.71 |     100 |   90.69 | 63-66             
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...ght/generators |   85.95 |    86.42 |   90.47 |   85.95 |                   
  DataProcessor.ts |   85.68 |    86.46 |   92.85 |   85.68 | ...1110,1114-1121 
  ...tGenerator.ts |   98.21 |    85.71 |     100 |   98.21 | 46                
  ...teRenderer.ts |   45.45 |      100 |       0 |   45.45 | 13-51             
 .../insight/types |       0 |       50 |      50 |       0 |                   
  ...sightTypes.ts |       0 |        0 |       0 |       0 |                   
  ...sightTypes.ts |       0 |        0 |       0 |       0 | 1                 
 ...mpt-processors |   97.27 |    94.04 |     100 |   97.27 |                   
  ...tProcessor.ts |     100 |      100 |     100 |     100 |                   
  ...eProcessor.ts |   94.52 |    84.21 |     100 |   94.52 | 46-47,93-94       
  ...tionParser.ts |     100 |      100 |     100 |     100 |                   
  ...lProcessor.ts |   97.41 |    95.65 |     100 |   97.41 | 95-98             
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/services/tips |   92.38 |    84.12 |     100 |   92.38 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  tipHistory.ts    |    78.3 |    71.42 |     100 |    78.3 | ...33-148,151,160 
  tipRegistry.ts   |     100 |    95.23 |     100 |     100 | 33                
  tipScheduler.ts  |     100 |    91.66 |     100 |     100 | 55                
 src/test-utils    |   93.75 |    83.33 |      80 |   93.75 |                   
  ...omMatchers.ts |   69.69 |       50 |      50 |   69.69 | 32-35,37-39,45-47 
  ...andContext.ts |     100 |      100 |     100 |     100 |                   
  render.tsx       |     100 |      100 |     100 |     100 |                   
 src/ui            |   62.35 |     65.6 |   51.28 |   62.35 |                   
  App.tsx          |     100 |      100 |     100 |     100 |                   
  AppContainer.tsx |   64.92 |    59.51 |   66.66 |   64.92 | ...2207,2211-2215 
  ...tionNudge.tsx |    9.58 |      100 |       0 |    9.58 | 24-94             
  ...ackDialog.tsx |   29.23 |      100 |       0 |   29.23 | 25-75             
  ...tionNudge.tsx |    7.69 |      100 |       0 |    7.69 | 25-103            
  colors.ts        |   52.72 |      100 |   23.52 |   52.72 | ...52,54-55,60-61 
  constants.ts     |     100 |      100 |     100 |     100 |                   
  keyMatchers.ts   |   91.83 |    88.46 |     100 |   91.83 | 25-26,54-55       
  ...tic-colors.ts |     100 |      100 |     100 |     100 |                   
  textConstants.ts |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/ui/auth       |   29.48 |       50 |   26.08 |   29.48 |                   
  AuthDialog.tsx   |   51.74 |    51.16 |   28.57 |   51.74 | ...73,692,694,696 
  ...nProgress.tsx |       0 |        0 |       0 |       0 | 1-64              
  useAuth.ts       |    2.25 |      100 |       0 |    2.25 | 46-610            
 src/ui/commands   |   58.58 |    78.16 |   59.06 |   58.58 |                   
  aboutCommand.ts  |     100 |    85.71 |     100 |     100 | 36                
  agentsCommand.ts |   72.97 |      100 |      20 |   72.97 | ...32,37-38,42-44 
  ...odeCommand.ts |     100 |      100 |     100 |     100 |                   
  arenaCommand.ts  |   33.13 |    67.64 |    37.5 |   33.13 | ...60-565,644-649 
  authCommand.ts   |     100 |      100 |     100 |     100 |                   
  btwCommand.ts    |   95.59 |    71.42 |     100 |   95.59 | 72,154-159        
  bugCommand.ts    |   76.92 |    66.66 |      50 |   76.92 | 21-22,59-68       
  clearCommand.ts  |   91.04 |    66.66 |      50 |   91.04 | 22-23,49-50,68-69 
  ...essCommand.ts |   63.39 |       48 |      50 |   63.39 | ...48-149,163-166 
  ...extCommand.ts |    6.17 |      100 |      10 |    6.17 | ...21-522,527-528 
  copyCommand.ts   |     100 |      100 |     100 |     100 |                   
  deleteCommand.ts |     100 |      100 |     100 |     100 |                   
  ...ryCommand.tsx |   59.56 |    74.07 |      50 |   59.56 | ...24-225,234-242 
  docsCommand.ts   |   96.07 |     87.5 |      50 |   96.07 | 20-21             
  doctorCommand.ts |     100 |    93.33 |     100 |     100 | 21                
  dreamCommand.ts  |   32.43 |      100 |      50 |   32.43 | 23-50             
  editorCommand.ts |     100 |      100 |     100 |     100 |                   
  exportCommand.ts |   56.93 |    91.66 |   33.33 |   56.93 | ...52-353,361-362 
  ...onsCommand.ts |   45.08 |    85.71 |   27.27 |   45.08 | ...37-238,247-248 
  forgetCommand.ts |   26.82 |      100 |      50 |   26.82 | 18-51             
  helpCommand.ts   |     100 |      100 |     100 |     100 |                   
  hooksCommand.ts  |   19.04 |       25 |      20 |   19.04 | ...86-187,204-205 
  ideCommand.ts    |   57.33 |    57.69 |   35.29 |   57.33 | ...05-306,310-324 
  initCommand.ts   |   84.33 |    72.72 |     100 |   84.33 | 68,82-87,89-94    
  ...ghtCommand.ts |    72.8 |    66.66 |   83.33 |    72.8 | ...31-245,250-273 
  ...ageCommand.ts |   89.39 |    82.35 |   76.92 |   89.39 | ...22-325,348-349 
  mcpCommand.ts    |   86.66 |      100 |      50 |   86.66 | 14-15             
  memoryCommand.ts |   86.66 |      100 |      50 |   86.66 | 14-15             
  modelCommand.ts  |      56 |    70.58 |   66.66 |      56 | ...,67-93,118-136 
  ...onsCommand.ts |     100 |      100 |     100 |     100 |                   
  planCommand.ts   |   78.82 |    76.92 |     100 |   78.82 | 30-35,51-56,68-73 
  quitCommand.ts   |   93.93 |      100 |      50 |   93.93 | 15-16             
  recapCommand.ts  |   21.81 |      100 |      50 |   21.81 | 24-73             
  ...berCommand.ts |   32.43 |      100 |      50 |   32.43 | 23-57             
  renameCommand.ts |   85.61 |    78.18 |     100 |   85.61 | ...15-322,329-334 
  ...oreCommand.ts |    92.3 |     87.5 |     100 |    92.3 | ...,83-88,129-130 
  resumeCommand.ts |     100 |      100 |     100 |     100 |                   
  rewindCommand.ts |      80 |      100 |      50 |      80 | 19-21             
  ...ngsCommand.ts |     100 |      100 |     100 |     100 |                   
  ...hubCommand.ts |   81.43 |    65.21 |      80 |   81.43 | ...70-173,176-179 
  skillsCommand.ts |   15.04 |      100 |      25 |   15.04 | ...90-106,109-136 
  statsCommand.ts  |   79.54 |    66.66 |      50 |   79.54 | ...20-121,131-134 
  ...ineCommand.ts |     100 |      100 |     100 |     100 |                   
  ...aryCommand.ts |    6.51 |      100 |      50 |    6.51 | 28-323            
  ...tupCommand.ts |     100 |      100 |     100 |     100 |                   
  themeCommand.ts  |     100 |      100 |     100 |     100 |                   
  toolsCommand.ts  |   95.23 |      100 |      50 |   95.23 | 18-19             
  trustCommand.ts  |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
  vimCommand.ts    |   54.54 |      100 |      50 |   54.54 | 19-29             
 src/ui/components |   61.16 |    72.97 |   60.41 |   61.16 |                   
  AboutBox.tsx     |     100 |      100 |     100 |     100 |                   
  AnsiOutput.tsx   |   65.57 |      100 |      50 |   65.57 | 69-90             
  ApiKeyInput.tsx  |   18.91 |      100 |       0 |   18.91 | 30-95             
  AppHeader.tsx    |   86.79 |    42.85 |     100 |   86.79 | 32-38,40          
  ...odeDialog.tsx |     9.7 |      100 |       0 |     9.7 | 35-47,50-182      
  AsciiArt.ts      |     100 |      100 |     100 |     100 |                   
  ...Indicator.tsx |   14.63 |      100 |       0 |   14.63 | 18-56             
  ...TextInput.tsx |   66.08 |    69.76 |      50 |   66.08 | ...30-232,250,259 
  Composer.tsx     |   79.31 |    57.14 |     100 |   79.31 | ...-77,95,133,146 
  ...entPrompt.tsx |     100 |      100 |     100 |     100 |                   
  ...ryDisplay.tsx |   75.89 |    62.06 |     100 |   75.89 | ...,88,93-108,113 
  ...geDisplay.tsx |   68.42 |    57.14 |     100 |   68.42 | 16-17,31-32,42-50 
  ...ification.tsx |   28.57 |      100 |       0 |   28.57 | 16-36             
  ...gProfiler.tsx |       0 |        0 |       0 |       0 | 1-36              
  ...ogManager.tsx |   12.46 |      100 |       0 |   12.46 | 57-413            
  ...ngsDialog.tsx |    8.44 |      100 |       0 |    8.44 | 37-195            
  ExitWarning.tsx  |     100 |      100 |     100 |     100 |                   
  ...ustDialog.tsx |     100 |      100 |     100 |     100 |                   
  Footer.tsx       |   78.87 |       60 |     100 |   78.87 | ...30-134,136-140 
  ...ngSpinner.tsx |   54.28 |       50 |      50 |   54.28 | 31-48,61          
  Header.tsx       |   98.14 |    85.71 |     100 |   98.14 | 97,99             
  Help.tsx         |   98.74 |    68.75 |     100 |   98.74 | 74,129            
  ...emDisplay.tsx |   61.13 |    35.55 |     100 |   61.13 | ...75-284,287,290 
  ...ngeDialog.tsx |     100 |      100 |     100 |     100 |                   
  InputPrompt.tsx  |   81.73 |    76.66 |      80 |   81.73 | ...1227,1292,1342 
  ...Shortcuts.tsx |   20.87 |      100 |       0 |   20.87 | ...6,49-51,67-125 
  ...Indicator.tsx |     100 |    91.42 |     100 |     100 | 65,74             
  ...firmation.tsx |   91.42 |      100 |      50 |   91.42 | 26-31             
  MainContent.tsx  |   15.23 |      100 |       0 |   15.23 | 28-133            
  MemoryDialog.tsx |   53.35 |    51.21 |   57.14 |   53.35 | ...55,367,380-382 
  ...geDisplay.tsx |       0 |        0 |       0 |       0 | 1-41              
  ModelDialog.tsx  |   76.59 |    54.54 |     100 |   76.59 | ...60-476,533-537 
  ...tsDisplay.tsx |     100 |      100 |     100 |     100 |                   
  ...fications.tsx |   18.18 |      100 |       0 |   18.18 | 15-58             
  ...onsDialog.tsx |    2.13 |      100 |       0 |    2.13 | 62-133,148-1004   
  ...ryDisplay.tsx |     100 |      100 |     100 |     100 |                   
  ...icePrompt.tsx |   88.14 |    83.87 |     100 |   88.14 | ...01-105,133-138 
  PrepareLabel.tsx |   91.66 |    76.19 |     100 |   91.66 | 73-75,77-79,110   
  ...geDisplay.tsx |     100 |      100 |     100 |     100 |                   
  ...ngDisplay.tsx |   21.42 |      100 |       0 |   21.42 | 13-39             
  ...hProgress.tsx |   85.25 |    88.46 |     100 |   85.25 | 121-147           
  ...dSelector.tsx |    4.45 |      100 |       0 |    4.45 | 28-92,100-328     
  ...ionPicker.tsx |   94.76 |    87.17 |     100 |   94.76 | 99,132,253-261    
  ...onPreview.tsx |   93.38 |    78.26 |     100 |   93.38 | ...,66-67,126-128 
  ...ryDisplay.tsx |     100 |      100 |     100 |     100 |                   
  ...putPrompt.tsx |   72.56 |       80 |      40 |   72.56 | ...06-109,114-117 
  ...ngsDialog.tsx |   66.88 |    73.52 |     100 |   66.88 | ...11-819,825-826 
  ...ionDialog.tsx |    87.8 |      100 |   33.33 |    87.8 | 36-39,44-51       
  ...putPrompt.tsx |    15.9 |      100 |       0 |    15.9 | 20-63             
  ...Indicator.tsx |   57.14 |      100 |       0 |   57.14 | 12-15             
  ...MoreLines.tsx |      28 |      100 |       0 |      28 | 18-40             
  ...ionPicker.tsx |   17.59 |      100 |       0 |   17.59 | 55-172            
  StatsDisplay.tsx |     100 |      100 |     100 |     100 |                   
  ...yTodoList.tsx |   96.82 |    77.77 |     100 |   96.82 | 39-40             
  ...nsDisplay.tsx |   84.09 |    57.14 |     100 |   84.09 | ...16-118,125-127 
  ThemeDialog.tsx  |   89.95 |    46.15 |      75 |   89.95 | ...71-173,243-245 
  Tips.tsx         |   21.87 |      100 |       0 |   21.87 | 22-40,43-53       
  TodoDisplay.tsx  |     100 |      100 |     100 |     100 |                   
  ...tsDisplay.tsx |     100 |     87.5 |     100 |     100 | 31-32             
  TrustDialog.tsx  |     100 |    81.81 |     100 |     100 | 71-86             
  ...ification.tsx |   36.36 |      100 |       0 |   36.36 | 15-22             
  ...ackDialog.tsx |    7.84 |      100 |       0 |    7.84 | 24-134            
 ...nts/agent-view |   25.56 |       90 |    12.5 |   25.56 |                   
  ...tChatView.tsx |    8.28 |      100 |       0 |    8.28 | 53-275            
  ...tComposer.tsx |    9.95 |      100 |       0 |    9.95 | 57-308            
  AgentFooter.tsx  |   17.07 |      100 |       0 |   17.07 | 28-66             
  AgentHeader.tsx  |   15.38 |      100 |       0 |   15.38 | 27-64             
  AgentTabBar.tsx  |    8.25 |      100 |       0 |    8.25 | 35-55,60-167      
  ...oryAdapter.ts |     100 |    91.83 |     100 |     100 | 103,109-110,138   
  index.ts         |       0 |        0 |       0 |       0 | 1-12              
 ...mponents/arena |   45.72 |    70.53 |   60.86 |   45.72 |                   
  ArenaCards.tsx   |   73.06 |    71.79 |   85.71 |   73.06 | ...83-185,321-326 
  ...ectDialog.tsx |   83.48 |    69.86 |   88.88 |   83.48 | ...88-392,409-410 
  ...artDialog.tsx |   10.15 |      100 |       0 |   10.15 | 27-161            
  ...tusDialog.tsx |    5.63 |      100 |       0 |    5.63 | 33-75,80-288      
  ...topDialog.tsx |    6.17 |      100 |       0 |    6.17 | 33-213            
 ...nts/extensions |   45.28 |    33.33 |      60 |   45.28 |                   
  ...gerDialog.tsx |   44.31 |    34.14 |      75 |   44.31 | ...71-480,483-488 
  index.ts         |       0 |        0 |       0 |       0 | 1-9               
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...tensions/steps |   54.77 |    94.23 |   66.66 |   54.77 |                   
  ...ctionStep.tsx |   95.12 |    92.85 |   85.71 |   95.12 | 84-86,89          
  ...etailStep.tsx |    6.18 |      100 |       0 |    6.18 | 17-128            
  ...nListStep.tsx |   88.35 |    94.73 |      80 |   88.35 | 51-52,58-71,105   
  ...electStep.tsx |   13.46 |      100 |       0 |   13.46 | 20-70             
  ...nfirmStep.tsx |   19.56 |      100 |       0 |   19.56 | 23-65             
  index.ts         |     100 |      100 |     100 |     100 |                   
 ...mponents/hooks |   72.24 |    70.52 |      80 |   72.24 |                   
  ...etailStep.tsx |   96.52 |       75 |     100 |   96.52 | 33,37,50,59       
  ...etailStep.tsx |   93.27 |    73.68 |     100 |   93.27 | 41-42,99-104,110  
  ...abledStep.tsx |     100 |      100 |     100 |     100 |                   
  ...sListStep.tsx |     100 |      100 |     100 |     100 |                   
  ...entDialog.tsx |   36.09 |    47.05 |      50 |   36.09 | ...49,453-466,470 
  constants.ts     |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-13              
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...components/mcp |   18.82 |    84.37 |   77.77 |   18.82 |                   
  ...entDialog.tsx |    3.64 |      100 |       0 |    3.64 | 41-717            
  constants.ts     |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-30              
  types.ts         |     100 |      100 |     100 |     100 |                   
  utils.ts         |   96.42 |    87.09 |     100 |   96.42 | 21,96-97          
 ...ents/mcp/steps |    6.65 |      100 |       0 |    6.65 |                   
  ...icateStep.tsx |     5.1 |      100 |       0 |     5.1 | 34-95,98-334      
  ...electStep.tsx |   10.95 |      100 |       0 |   10.95 | 16-88             
  ...etailStep.tsx |    5.26 |      100 |       0 |    5.26 | 31-247            
  ...rListStep.tsx |    5.88 |      100 |       0 |    5.88 | 20-176            
  ...etailStep.tsx |   10.41 |      100 |       0 |   10.41 | ...1,67-79,82-139 
  ToolListStep.tsx |    7.14 |      100 |       0 |    7.14 | 16-146            
 ...nents/messages |   78.25 |    76.19 |   68.33 |   78.25 |                   
  ...ionDialog.tsx |   77.35 |    74.54 |    62.5 |   77.35 | ...90,508,526-528 
  BtwMessage.tsx   |     100 |      100 |     100 |     100 |                   
  ...upDisplay.tsx |   82.17 |     61.9 |     100 |   82.17 | ...95,113-116,123 
  ...onMessage.tsx |   91.93 |    82.35 |     100 |   91.93 | 57-59,61,63       
  ...nMessages.tsx |   77.35 |      100 |      70 |   77.35 | ...31-244,248-260 
  DiffRenderer.tsx |   93.19 |    86.17 |     100 |   93.19 | ...09,237-238,304 
  ...ssMessage.tsx |    12.5 |      100 |       0 |    12.5 | 18-59             
  ...edMessage.tsx |   16.66 |      100 |       0 |   16.66 | 22-38             
  ...sMessages.tsx |   55.67 |       40 |   28.57 |   55.67 | ...20-125,133-145 
  ...ryMessage.tsx |   12.82 |      100 |       0 |   12.82 | 22-59             
  ...onMessage.tsx |   73.55 |    55.81 |   33.33 |   73.55 | ...41-443,450-452 
  ...upMessage.tsx |   72.52 |    65.45 |     100 |   72.52 | ...32-247,261-262 
  ToolMessage.tsx  |   90.16 |     83.8 |   91.66 |   90.16 | ...59-564,591-593 
 ...ponents/shared |   81.46 |    77.23 |   92.64 |   81.46 |                   
  ...ctionList.tsx |   99.03 |    95.65 |     100 |   99.03 | 85                
  ...tonSelect.tsx |     100 |      100 |     100 |     100 |                   
  EnumSelector.tsx |     100 |    96.42 |     100 |     100 | 58                
  MaxSizedBox.tsx  |   82.54 |    85.15 |   88.88 |   82.54 | ...12-513,618-619 
  MultiSelect.tsx  |    5.59 |      100 |       0 |    5.59 | 34-41,44-193      
  ...tonSelect.tsx |     100 |      100 |     100 |     100 |                   
  ...eSelector.tsx |     100 |       60 |     100 |     100 | 40-45             
  TextInput.tsx    |   70.44 |    53.57 |      75 |   70.44 | ...90-194,206-212 
  ...apsedTime.tsx |     100 |      100 |     100 |     100 |                   
  ...Indicator.tsx |     100 |      100 |     100 |     100 |                   
  text-buffer.ts   |   82.82 |    75.48 |   97.61 |   82.82 | ...2272,2300,2368 
  ...er-actions.ts |   86.71 |    67.79 |     100 |   86.71 | ...07-608,809-811 
 ...ents/subagents |    32.1 |      100 |       0 |    32.1 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  reducers.tsx     |    12.1 |      100 |       0 |    12.1 | 33-190            
  types.ts         |     100 |      100 |     100 |     100 |                   
  utils.ts         |   10.95 |      100 |       0 |   10.95 | ...1,56-57,60-102 
 ...bagents/create |    9.13 |      100 |       0 |    9.13 |                   
  ...ionWizard.tsx |    7.28 |      100 |       0 |    7.28 | 34-299            
  ...rSelector.tsx |   14.75 |      100 |       0 |   14.75 | 26-85             
  ...onSummary.tsx |    4.26 |      100 |       0 |    4.26 | 27-331            
  ...tionInput.tsx |    8.63 |      100 |       0 |    8.63 | 23-177            
  ...dSelector.tsx |   33.33 |      100 |       0 |   33.33 | 20-21,26-27,36-63 
  ...nSelector.tsx |    37.5 |      100 |       0 |    37.5 | 20-21,26-27,36-58 
  ...EntryStep.tsx |   12.76 |      100 |       0 |   12.76 | 34-78             
  ToolSelector.tsx |    4.16 |      100 |       0 |    4.16 | 31-253            
 ...bagents/manage |    8.39 |      100 |       0 |    8.39 |                   
  ...ctionStep.tsx |   10.25 |      100 |       0 |   10.25 | 21-103            
  ...eleteStep.tsx |   20.93 |      100 |       0 |   20.93 | 23-62             
  ...tEditStep.tsx |   25.53 |      100 |       0 |   25.53 | ...2,37-38,51-124 
  ...ctionStep.tsx |    2.29 |      100 |       0 |    2.29 | 28-449            
  ...iewerStep.tsx |   13.72 |      100 |       0 |   13.72 | 18-73             
  ...gerDialog.tsx |    6.74 |      100 |       0 |    6.74 | 35-341            
 ...agents/runtime |    7.51 |      100 |       0 |    7.51 |                   
  ...onDisplay.tsx |    7.51 |      100 |       0 |    7.51 | ...96-526,535-573 
 ...mponents/views |   42.16 |    69.23 |   21.42 |   42.16 |                   
  ContextUsage.tsx |     4.7 |      100 |       0 |     4.7 | ...52-167,170-456 
  DoctorReport.tsx |     9.8 |      100 |       0 |     9.8 | 25-54,57-131      
  ...sionsList.tsx |   87.69 |    73.68 |     100 |   87.69 | 65-72             
  McpStatus.tsx    |   89.53 |    60.52 |     100 |   89.53 | ...72,175-177,262 
  SkillsList.tsx   |   27.27 |      100 |       0 |   27.27 | 18-35             
  ToolsList.tsx    |     100 |      100 |     100 |     100 |                   
 src/ui/contexts   |   74.71 |    79.47 |   86.66 |   74.71 |                   
  ...ewContext.tsx |   65.77 |      100 |      75 |   65.77 | ...22-225,231-241 
  AppContext.tsx   |      40 |      100 |       0 |      40 | 17-22             
  ...deContext.tsx |     100 |      100 |     100 |     100 |                   
  ...igContext.tsx |   81.81 |       50 |     100 |   81.81 | 15-16             
  ...ssContext.tsx |   81.88 |    82.26 |     100 |   81.88 | ...1153,1159-1161 
  ...owContext.tsx |   89.28 |       80 |   66.66 |   89.28 | 34,47-48,60-62    
  ...onContext.tsx |   43.06 |     62.5 |    62.5 |   43.06 | ...57-260,264-267 
  ...gsContext.tsx |   83.33 |       50 |     100 |   83.33 | 17-18             
  ...usContext.tsx |     100 |      100 |     100 |     100 |                   
  ...ngContext.tsx |   71.42 |       50 |     100 |   71.42 | 17-20             
  ...nsContext.tsx |   88.88 |       50 |     100 |   88.88 | 124-125           
  ...teContext.tsx |   85.71 |       50 |     100 |   85.71 | 173-174           
  ...deContext.tsx |   76.08 |    72.72 |     100 |   76.08 | 47-48,52-59,77-78 
 src/ui/editors    |   93.33 |    85.71 |   66.66 |   93.33 |                   
  ...ngsManager.ts |   93.33 |    85.71 |   66.66 |   93.33 | 49,63-64          
 src/ui/hooks      |   80.03 |     80.6 |   84.57 |   80.03 |                   
  ...dProcessor.ts |   83.12 |    82.56 |     100 |   83.12 | ...88-389,408-435 
  keyToAnsi.ts     |    3.92 |      100 |       0 |    3.92 | 19-77             
  ...dProcessor.ts |    94.8 |    70.58 |     100 |    94.8 | ...76-277,282-283 
  ...dProcessor.ts |   72.94 |    57.26 |   61.53 |   72.94 | ...75,799,818-822 
  ...amingState.ts |   12.22 |      100 |       0 |   12.22 | 54-158            
  ...agerDialog.ts |   88.23 |      100 |     100 |   88.23 | 20,24             
  ...ationFrame.ts |      32 |       60 |     100 |      32 | 42-44,51-90       
  ...odeCommand.ts |   58.82 |      100 |     100 |   58.82 | 28,33-48          
  ...enaCommand.ts |      85 |      100 |     100 |      85 | 23-24,29          
  ...aInProcess.ts |   19.81 |    66.66 |      25 |   19.81 | 57-175            
  ...Completion.ts |   92.77 |    89.09 |     100 |   92.77 | ...86-187,220-223 
  ...ifications.ts |   88.05 |    94.73 |     100 |   88.05 | 84-93             
  ...tIndicator.ts |     100 |    93.75 |     100 |     100 | 63                
  ...waySummary.ts |   96.22 |    69.69 |     100 |   96.22 | 125-127,169       
  ...ketedPaste.ts |    23.8 |      100 |       0 |    23.8 | 19-37             
  ...lanUpdates.ts |     100 |       92 |     100 |     100 | 59,158            
  ...ompletion.tsx |   91.28 |    79.59 |     100 |   91.28 | ...20-221,259-269 
  ...dMigration.ts |   90.62 |       75 |     100 |   90.62 | 38-40             
  useCompletion.ts |    92.4 |     87.5 |     100 |    92.4 | 68-69,93-94,98-99 
  ...nitMessage.ts |     100 |      100 |     100 |     100 |                   
  ...extualTips.ts |   76.92 |       50 |     100 |   76.92 | 55,68,71-75,88-96 
  ...eteCommand.ts |   33.33 |       50 |     100 |   33.33 | 30,34,41-90       
  ...ialogClose.ts |      20 |      100 |     100 |      20 | 71-118            
  ...oublePress.ts |   53.12 |       75 |     100 |   53.12 | 33-35,41-54       
  ...orSettings.ts |     100 |      100 |     100 |     100 |                   
  ...ionUpdates.ts |   93.45 |     92.3 |     100 |   93.45 | ...83-287,300-306 
  ...agerDialog.ts |   88.88 |      100 |     100 |   88.88 | 21,25             
  ...backDialog.ts |   50.37 |    77.77 |   33.33 |   50.37 | ...58-174,195-196 
  useFocus.ts      |     100 |      100 |     100 |     100 |                   
  ...olderTrust.ts |     100 |      100 |     100 |     100 |                   
  ...ggestions.tsx |   89.15 |     62.5 |      50 |   89.15 | ...22-124,149-150 
  ...miniStream.ts |   74.76 |    72.01 |   88.88 |   74.76 | ...2086,2099-2107 
  ...BranchName.ts |    90.9 |     92.3 |     100 |    90.9 | 19-20,55-58       
  ...oryManager.ts |   93.15 |    93.75 |     100 |   93.15 | 44,107-110        
  ...ooksDialog.ts |    87.5 |      100 |     100 |    87.5 | 19,23             
  ...stListener.ts |     100 |      100 |     100 |     100 |                   
  ...nAuthError.ts |   76.19 |       50 |     100 |   76.19 | 39-40,43-45       
  ...putHistory.ts |   92.59 |    85.71 |     100 |   92.59 | 63-64,72,94-96    
  ...storyStore.ts |     100 |    94.11 |     100 |     100 | 69                
  useKeypress.ts   |     100 |      100 |     100 |     100 |                   
  ...rdProtocol.ts |   36.36 |      100 |       0 |   36.36 | 24-31             
  ...unchEditor.ts |    9.67 |      100 |       0 |    9.67 | 11-32,39-90       
  ...gIndicator.ts |     100 |      100 |     100 |     100 |                   
  useLogger.ts     |   21.05 |      100 |       0 |   21.05 | 15-37             
  useMcpDialog.ts  |    87.5 |      100 |     100 |    87.5 | 19,23             
  ...moryDialog.ts |    87.5 |      100 |     100 |    87.5 | 19,23             
  ...oryMonitor.ts |     100 |      100 |     100 |     100 |                   
  ...ssageQueue.ts |     100 |      100 |     100 |     100 |                   
  ...delCommand.ts |     100 |       75 |     100 |     100 | 22                
  ...raseCycler.ts |   84.74 |    76.47 |     100 |   84.74 | ...49,52-53,69-71 
  useQwenAuth.ts   |     100 |      100 |     100 |     100 |                   
  ...lScheduler.ts |   84.52 |    93.33 |     100 |   84.52 | ...27-232,328-338 
  ...oryCommand.ts |       0 |        0 |       0 |       0 | 1-7               
  ...umeCommand.ts |   96.25 |    72.72 |     100 |   96.25 | 81-82,108         
  ...ompletion.tsx |   90.59 |    83.33 |     100 |   90.59 | ...01,104,137-140 
  ...ectionList.ts |   96.96 |    95.69 |     100 |   96.96 | ...82-183,237-240 
  ...sionPicker.ts |   91.16 |    71.69 |     100 |   91.16 | ...78-279,283-284 
  ...ngsCommand.ts |   18.75 |      100 |       0 |   18.75 | 10-25             
  ...ellHistory.ts |   91.74 |    79.41 |     100 |   91.74 | ...74,122-123,133 
  ...oryCommand.ts |       0 |        0 |       0 |       0 | 1-73              
  ...Completion.ts |   78.99 |    81.48 |   94.11 |   78.99 | ...77-579,587-624 
  ...tateAndRef.ts |     100 |      100 |     100 |     100 |                   
  useStatusLine.ts |     100 |    98.79 |     100 |     100 | 257               
  ...eateDialog.ts |   88.23 |      100 |     100 |   88.23 | 14,18             
  ...alProgress.ts |   54.23 |    45.45 |      75 |   54.23 | ...65,73-80,91-97 
  ...rminalSize.ts |   76.19 |      100 |      50 |   76.19 | 21-25             
  ...emeCommand.ts |   67.01 |    29.41 |     100 |   67.01 | ...10-111,115-116 
  useTimer.ts      |   88.09 |    85.71 |     100 |   88.09 | 44-45,51-53       
  ...lMigration.ts |       0 |        0 |       0 |       0 |                   
  ...rustModify.ts |     100 |      100 |     100 |     100 |                   
  ...elcomeBack.ts |   87.36 |     90.9 |     100 |   87.36 | ...,94-96,114-115 
  vim.ts           |   83.77 |    80.31 |     100 |   83.77 | ...55,759-767,776 
 src/ui/layouts    |   88.72 |    86.95 |     100 |   88.72 |                   
  ...AppLayout.tsx |   88.88 |    86.66 |     100 |   88.88 | 46-48,87-92       
  ...AppLayout.tsx |   88.46 |     87.5 |     100 |   88.46 | 53-58             
 src/ui/models     |   80.24 |    79.16 |   71.42 |   80.24 |                   
  ...ableModels.ts |   80.24 |    79.16 |   71.42 |   80.24 | ...,61-71,123-125 
 ...noninteractive |     100 |      100 |    7.14 |     100 |                   
  ...eractiveUi.ts |     100 |      100 |    7.14 |     100 |                   
 src/ui/state      |   94.91 |    81.81 |     100 |   94.91 |                   
  extensions.ts    |   94.91 |    81.81 |     100 |   94.91 | 68-69,88          
 src/ui/themes     |   98.53 |    70.58 |     100 |   98.53 |                   
  ansi-light.ts    |     100 |      100 |     100 |     100 |                   
  ansi.ts          |     100 |      100 |     100 |     100 |                   
  atom-one-dark.ts |     100 |      100 |     100 |     100 |                   
  ayu-light.ts     |     100 |      100 |     100 |     100 |                   
  ayu.ts           |     100 |      100 |     100 |     100 |                   
  color-utils.ts   |     100 |      100 |     100 |     100 |                   
  default-light.ts |     100 |      100 |     100 |     100 |                   
  default.ts       |     100 |      100 |     100 |     100 |                   
  ...inal-theme.ts |   88.59 |    85.96 |     100 |   88.59 | ...57-261,266-270 
  dracula.ts       |     100 |      100 |     100 |     100 |                   
  github-dark.ts   |     100 |      100 |     100 |     100 |                   
  github-light.ts  |     100 |      100 |     100 |     100 |                   
  googlecode.ts    |     100 |      100 |     100 |     100 |                   
  no-color.ts      |     100 |      100 |     100 |     100 |                   
  qwen-dark.ts     |     100 |      100 |     100 |     100 |                   
  qwen-light.ts    |     100 |      100 |     100 |     100 |                   
  ...tic-tokens.ts |     100 |      100 |     100 |     100 |                   
  ...-of-purple.ts |     100 |      100 |     100 |     100 |                   
  theme-manager.ts |   87.98 |    82.89 |     100 |   87.98 | ...48-357,362-363 
  theme.ts         |     100 |    38.02 |     100 |     100 | ...34-449,457-461 
  xcode.ts         |     100 |      100 |     100 |     100 |                   
 src/ui/utils      |   75.98 |    85.73 |   84.44 |   75.98 |                   
  ...Colorizer.tsx |   82.78 |    88.23 |     100 |   82.78 | ...10-111,197-223 
  ...nRenderer.tsx |   52.41 |    38.23 |      50 |   52.41 | ...49-151,171-180 
  ...wnDisplay.tsx |   86.79 |    88.88 |     100 |   86.79 | ...06-315,348-373 
  ...eRenderer.tsx |   94.45 |    81.25 |   94.11 |   94.45 | ...65,477,480-483 
  ...boardUtils.ts |   59.61 |    58.82 |     100 |   59.61 | ...,86-88,107-149 
  commandUtils.ts  |   83.95 |    89.09 |    87.5 |   83.95 | ...50-151,247-266 
  computeStats.ts  |     100 |      100 |     100 |     100 |                   
  displayUtils.ts  |   88.37 |    72.22 |     100 |   88.37 | 23,25,29,31,33    
  formatters.ts    |   95.23 |    98.24 |     100 |   95.23 | 117-120           
  gradientUtils.ts |     100 |      100 |     100 |     100 |                   
  highlight.ts     |   98.63 |       95 |     100 |   98.63 | 93                
  ...oryMapping.ts |     100 |    94.28 |     100 |     100 | 33,55             
  isNarrowWidth.ts |     100 |      100 |     100 |     100 |                   
  ...olDetector.ts |    8.23 |      100 |       0 |    8.23 | ...31-132,135-136 
  layoutUtils.ts   |     100 |      100 |     100 |     100 |                   
  ...nUtilities.ts |   69.84 |    85.71 |     100 |   69.84 | 75-91,100-101     
  ...ToolGroups.ts |   98.05 |    95.12 |     100 |   98.05 | 44-45             
  ...lsBySource.ts |     100 |    95.23 |     100 |     100 | 84                
  ...mConstants.ts |     100 |      100 |     100 |     100 |                   
  ...storyUtils.ts |   57.81 |    67.14 |      90 |   57.81 | ...64,412,417-439 
  ...ickerUtils.ts |     100 |      100 |     100 |     100 |                   
  ...izedOutput.ts |   94.94 |      100 |   88.88 |   94.94 | 112-117           
  ...wOptimizer.ts |     100 |    96.77 |     100 |     100 | 69                
  terminalSetup.ts |    4.37 |      100 |       0 |    4.37 | 44-393            
  textUtils.ts     |   96.36 |    93.93 |   88.88 |   96.36 | ...49-150,285-286 
  todoSnapshot.ts  |   81.92 |    88.46 |     100 |   81.92 | 39-51,59-60,106   
  updateCheck.ts   |     100 |    80.95 |     100 |     100 | 30-42             
 ...i/utils/export |    2.36 |        0 |       0 |    2.36 |                   
  collect.ts       |    0.87 |        0 |       0 |    0.87 | 40-394,401-697    
  index.ts         |     100 |      100 |     100 |     100 |                   
  normalize.ts     |     1.2 |      100 |       0 |     1.2 | 17-346            
  types.ts         |       0 |        0 |       0 |       0 | 1                 
  utils.ts         |      40 |      100 |       0 |      40 | 11-13             
 ...ort/formatters |    3.38 |      100 |       0 |    3.38 |                   
  html.ts          |    9.61 |      100 |       0 |    9.61 | ...28,34-76,82-84 
  json.ts          |      50 |      100 |       0 |      50 | 14-15             
  jsonl.ts         |     3.5 |      100 |       0 |     3.5 | 14-76             
  markdown.ts      |    0.94 |      100 |       0 |    0.94 | 13-295            
 src/utils         |   72.88 |    88.69 |   94.73 |   72.88 |                   
  acpModelUtils.ts |     100 |      100 |     100 |     100 |                   
  apiPreconnect.ts |   96.52 |    96.87 |     100 |   96.52 | 166-169           
  ...tification.ts |   92.59 |    71.42 |     100 |   92.59 | 36-37             
  checks.ts        |   33.33 |      100 |       0 |   33.33 | 23-28             
  cleanup.ts       |   84.12 |    93.33 |      80 |   84.12 | 75,106-115        
  commands.ts      |     100 |      100 |     100 |     100 |                   
  commentJson.ts   |   85.29 |    89.47 |     100 |   85.29 | 48-57             
  deepMerge.ts     |     100 |       90 |     100 |     100 | 41-43,49          
  ...ScopeUtils.ts |   97.56 |    88.88 |     100 |   97.56 | 67                
  doctorChecks.ts  |   68.59 |    64.28 |     100 |   68.59 | ...63-269,293-309 
  ...putCapture.ts |   90.65 |    86.02 |     100 |   90.65 | ...72,370,372-373 
  ...arResolver.ts |   94.28 |    88.46 |     100 |   94.28 | 28-29,125-126     
  errors.ts        |   98.43 |    95.55 |     100 |   98.43 | 45-46             
  events.ts        |     100 |      100 |     100 |     100 |                   
  gitUtils.ts      |   91.91 |    84.61 |     100 |   91.91 | 78-81,124-127     
  ...AutoUpdate.ts |   90.76 |    93.33 |   88.88 |   90.76 | 103-114           
  ...lationInfo.ts |     100 |      100 |     100 |     100 |                   
  languageUtils.ts |   97.89 |    96.42 |     100 |   97.89 | 132-133           
  math.ts          |       0 |        0 |       0 |       0 | 1-15              
  ...onfigUtils.ts |     100 |      100 |     100 |     100 |                   
  ...iveHelpers.ts |   96.79 |    93.28 |     100 |   96.79 | ...76-477,575,588 
  package.ts       |   88.88 |       80 |     100 |   88.88 | 33-34             
  processUtils.ts  |     100 |      100 |     100 |     100 |                   
  readStdin.ts     |   79.62 |       90 |      80 |   79.62 | 33-40,52-54       
  relaunch.ts      |   98.07 |    76.92 |     100 |   98.07 | 70                
  resolvePath.ts   |   66.66 |       25 |     100 |   66.66 | 12-13,16,18-19    
  sandbox.ts       |       0 |        0 |       0 |       0 | 1-980             
  settingsUtils.ts |   86.32 |    90.59 |   94.44 |   86.32 | ...38,569,632-644 
  spawnWrapper.ts  |     100 |      100 |     100 |     100 |                   
  ...upProfiler.ts |     100 |       96 |     100 |     100 | 110               
  ...upWarnings.ts |     100 |      100 |     100 |     100 |                   
  stdioHelpers.ts  |     100 |       60 |     100 |     100 | 23,32             
  systemInfo.ts    |   92.52 |     90.9 |   83.33 |   92.52 | 63-69,184         
  ...InfoFields.ts |   86.91 |    65.78 |     100 |   86.91 | ...16-117,138-139 
  ...entEmitter.ts |     100 |      100 |     100 |     100 |                   
  ...upWarnings.ts |   91.17 |    82.35 |     100 |   91.17 | 67-68,73-74,77-78 
  version.ts       |     100 |       50 |     100 |     100 | 11                
  windowTitle.ts   |     100 |      100 |     100 |     100 |                   
  ...WithBackup.ts |    62.1 |    77.77 |     100 |    62.1 | 93,107,118-157    
-------------------|---------|----------|---------|---------|-------------------
Core Package - Full Text Report
-------------------|---------|----------|---------|---------|-------------------
File               | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s 
-------------------|---------|----------|---------|---------|-------------------
All files          |   75.26 |    81.56 |   77.84 |   75.26 |                   
 src               |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/__mocks__/fs  |       0 |        0 |       0 |       0 |                   
  promises.ts      |       0 |        0 |       0 |       0 | 1-48              
 src/agents        |   92.18 |    79.43 |   96.96 |   92.18 |                   
  ...transcript.ts |   94.79 |    75.75 |     100 |   94.79 | ...69,193-194,281 
  ...ound-tasks.ts |    89.8 |    81.08 |   94.44 |    89.8 | ...46-353,366-367 
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/agents/arena  |    76.9 |    66.66 |   78.94 |    76.9 |                   
  ...gentClient.ts |   79.47 |    88.88 |   81.81 |   79.47 | ...68-183,189-204 
  ArenaManager.ts  |   75.84 |     62.9 |   78.57 |   75.84 | ...1889,1895-1896 
  arena-events.ts  |   64.44 |      100 |      50 |   64.44 | ...71-175,178-183 
  diff-summary.ts  |    87.5 |    73.46 |     100 |    87.5 | ...32-133,137-138 
  index.ts         |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...gents/backends |    76.4 |    86.07 |   72.41 |    76.4 |                   
  ITermBackend.ts  |   97.97 |    93.93 |     100 |   97.97 | ...78-180,255,307 
  ...essBackend.ts |   92.17 |    90.32 |   82.35 |   92.17 | ...24-244,303,403 
  TmuxBackend.ts   |    90.7 |    76.55 |   97.36 |    90.7 | ...87,697,743-747 
  detect.ts        |   31.25 |      100 |       0 |   31.25 | 34-88             
  index.ts         |     100 |      100 |     100 |     100 |                   
  iterm-it2.ts     |     100 |     92.1 |     100 |     100 | 37-38,106         
  tmux-commands.ts |    6.64 |      100 |    3.03 |    6.64 | ...93-363,386-503 
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...agents/runtime |   80.56 |    75.64 |   70.32 |   80.56 |                   
  agent-core.ts    |   73.82 |    68.49 |   58.33 |   73.82 | ...1128,1155-1201 
  agent-events.ts  |     100 |      100 |     100 |     100 |                   
  ...t-headless.ts |   79.06 |       75 |   52.38 |   79.06 | ...64-365,368-369 
  ...nteractive.ts |   81.71 |    78.12 |      75 |   81.71 | ...25,527,529,532 
  ...statistics.ts |   98.19 |    82.35 |     100 |   98.19 | 127,151,192,225   
  agent-types.ts   |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/config        |   73.73 |    76.05 |      61 |   73.73 |                   
  config.ts        |   71.13 |    73.14 |   54.94 |   71.13 | ...2701,2705-2717 
  constants.ts     |     100 |      100 |     100 |     100 |                   
  models.ts        |     100 |      100 |     100 |     100 |                   
  storage.ts       |   95.72 |    92.85 |   91.66 |   95.72 | ...06-207,241-242 
 ...nfirmation-bus |   98.29 |    97.14 |     100 |   98.29 |                   
  message-bus.ts   |   98.14 |    97.05 |     100 |   98.14 | 42-43             
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/constants     |    4.95 |      100 |       0 |    4.95 |                   
  codingPlan.ts    |    4.95 |      100 |       0 |    4.95 | ...79-291,299-309 
 src/core          |   80.42 |    80.36 |   85.54 |   80.42 |                   
  baseLlmClient.ts |   96.77 |    96.42 |      80 |   96.77 | 123-126           
  client.ts        |   70.75 |    73.59 |   73.07 |   70.75 | ...1115,1119-1135 
  ...tGenerator.ts |    72.1 |    61.11 |     100 |    72.1 | ...45,347,354-357 
  ...lScheduler.ts |   73.63 |    76.97 |   91.17 |   73.63 | ...1888,1945-1949 
  geminiChat.ts    |    89.1 |     84.5 |   85.29 |    89.1 | ...1075,1142-1143 
  geminiRequest.ts |     100 |      100 |     100 |     100 |                   
  ...htProtocol.ts |    9.09 |      100 |       0 |    9.09 | 34-42,45-49,52-87 
  logger.ts        |   82.25 |    81.81 |     100 |   82.25 | ...57-361,407-421 
  ...tyDefaults.ts |     100 |      100 |     100 |     100 |                   
  ...olExecutor.ts |   92.59 |       75 |      50 |   92.59 | 41-42             
  ...on-helpers.ts |   76.53 |    60.71 |     100 |   76.53 | ...81-182,196-205 
  prompts.ts       |    88.8 |    88.05 |      75 |    88.8 | ...-898,1101-1102 
  tokenLimits.ts   |     100 |    89.47 |     100 |     100 | 50-51             
  ...okTriggers.ts |   99.31 |     90.9 |     100 |   99.31 | 124,135           
  turn.ts          |   96.29 |    88.46 |     100 |   96.29 | ...87,400-401,449 
 ...ntentGenerator |   93.72 |    73.43 |    90.9 |   93.72 |                   
  ...tGenerator.ts |   95.99 |    72.17 |   86.66 |   95.99 | ...03-304,438,494 
  converter.ts     |   93.47 |       75 |     100 |   93.47 | ...87-488,498,558 
  index.ts         |       0 |        0 |       0 |       0 | 1-21              
 ...ntentGenerator |   91.53 |    71.21 |   93.33 |   91.53 |                   
  ...tGenerator.ts |      90 |    70.49 |   92.85 |      90 | ...77-283,301-302 
  index.ts         |     100 |       80 |     100 |     100 | 50                
 ...ntentGenerator |   91.08 |    76.14 |   85.71 |   91.08 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...tGenerator.ts |   91.04 |    76.14 |   85.71 |   91.04 | ...23,533-534,562 
 ...ntentGenerator |   76.51 |    83.51 |   89.55 |   76.51 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  converter.ts     |   73.07 |       78 |   86.36 |   73.07 | ...1311,1332-1338 
  errorHandler.ts  |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-95              
  ...tGenerator.ts |   48.78 |    91.66 |   77.77 |   48.78 | ...10-163,166-167 
  pipeline.ts      |   94.15 |    89.58 |     100 |   94.15 | ...84,454-455,463 
  ...CallParser.ts |   90.66 |     88.4 |     100 |   90.66 | ...15-319,349-350 
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 ...rator/provider |   95.98 |    85.59 |   93.75 |   95.98 |                   
  dashscope.ts     |   97.22 |    87.69 |   93.33 |   97.22 | ...10-211,287-288 
  deepseek.ts      |    91.3 |    73.91 |     100 |    91.3 | 49-50,54-55,68-69 
  default.ts       |   94.62 |    86.36 |   85.71 |   94.62 | 85-86,156-158     
  index.ts         |     100 |      100 |     100 |     100 |                   
  modelscope.ts    |     100 |      100 |     100 |     100 |                   
  openrouter.ts    |     100 |      100 |     100 |     100 |                   
  types.ts         |       0 |        0 |       0 |       0 |                   
 src/extension     |   60.71 |    79.59 |   79.03 |   60.71 |                   
  ...-converter.ts |   62.35 |    47.82 |      90 |   62.35 | ...90-791,800-832 
  ...ionManager.ts |   46.96 |    82.97 |   67.44 |   46.96 | ...1343,1364-1383 
  ...onSettings.ts |   93.46 |    93.05 |     100 |   93.46 | ...17-221,228-232 
  ...-converter.ts |   54.88 |    94.44 |      60 |   54.88 | ...35-146,158-192 
  github.ts        |   44.94 |    88.52 |      60 |   44.94 | ...53-359,398-451 
  index.ts         |     100 |      100 |     100 |     100 |                   
  marketplace.ts   |   97.29 |    93.75 |     100 |   97.29 | ...64,184-185,274 
  npm.ts           |   48.66 |    76.08 |      75 |   48.66 | ...18-420,427-431 
  override.ts      |   94.11 |    88.88 |     100 |   94.11 | 63-64,81-82       
  settings.ts      |   66.26 |      100 |      50 |   66.26 | 81-108,143-149    
  storage.ts       |   94.73 |       90 |     100 |   94.73 | 41-42             
  ...ableSchema.ts |     100 |      100 |     100 |     100 |                   
  variables.ts     |   88.75 |    83.33 |     100 |   88.75 | ...28-231,234-237 
 src/followup      |   46.18 |     92.3 |   71.87 |   46.18 |                   
  followupState.ts |      96 |    89.74 |     100 |      96 | 159-161,218-219   
  index.ts         |     100 |      100 |     100 |     100 |                   
  overlayFs.ts     |   95.06 |       84 |     100 |   95.06 | 78,108,122,133    
  speculation.ts   |   13.22 |      100 |   16.66 |   13.22 | 88-458,518-568    
  ...onToolGate.ts |     100 |    96.29 |     100 |     100 | 92                
  ...nGenerator.ts |   36.67 |    95.12 |   33.33 |   36.67 | ...24-326,361-391 
 src/generated     |       0 |        0 |       0 |       0 |                   
  git-commit.ts    |       0 |        0 |       0 |       0 | 1-10              
 src/hooks         |    80.6 |    84.37 |   84.16 |    80.6 |                   
  ...okRegistry.ts |   86.48 |    77.08 |     100 |   86.48 | ...41-344,362-369 
  ...bortSignal.ts |     100 |      100 |     100 |     100 |                   
  ...terpolator.ts |   96.66 |    93.33 |     100 |   96.66 | 66-67             
  ...HookRunner.ts |   96.68 |    87.23 |     100 |   96.68 | 110-112,231-233   
  ...Aggregator.ts |   96.37 |    90.54 |     100 |   96.37 | ...89,291-292,365 
  ...entHandler.ts |   95.58 |    84.37 |   92.59 |   95.58 | ...29,682-683,693 
  hookPlanner.ts   |   84.13 |    76.59 |      90 |   84.13 | ...38,144,162-173 
  hookRegistry.ts  |   88.83 |    86.36 |     100 |   88.83 | ...21,326,330,334 
  hookRunner.ts    |   53.63 |    72.22 |   61.11 |   53.63 | ...23-724,733-734 
  hookSystem.ts    |   75.47 |      100 |   56.41 |   75.47 | ...75-576,582-583 
  ...HookRunner.ts |   75.51 |     61.9 |      80 |   75.51 | ...05-406,424-425 
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...SkillHooks.ts |   78.75 |       75 |   66.66 |   78.75 | 62-66,137-152     
  ...oksManager.ts |    96.5 |     91.8 |     100 |    96.5 | ...90,209-210,223 
  ssrfGuard.ts     |   77.22 |    85.36 |     100 |   77.22 | ...57,261-267,273 
  trustedHooks.ts  |       0 |        0 |       0 |       0 | 1-124             
  types.ts         |   90.15 |    91.02 |   85.18 |   90.15 | ...91-392,452-456 
  urlValidator.ts  |     100 |      100 |     100 |     100 |                   
 src/ide           |   74.28 |    83.39 |   78.33 |   74.28 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  detect-ide.ts    |     100 |      100 |     100 |     100 |                   
  ide-client.ts    |    64.2 |    81.48 |   66.66 |    64.2 | ...9-970,999-1007 
  ide-installer.ts |   89.06 |    79.31 |     100 |   89.06 | ...36,143-147,160 
  ideContext.ts    |     100 |      100 |     100 |     100 |                   
  process-utils.ts |   84.84 |    71.79 |     100 |   84.84 | ...37,151,193-194 
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/lsp           |   33.39 |    43.75 |   44.91 |   33.39 |                   
  ...nfigLoader.ts |   70.27 |    35.89 |   94.73 |   70.27 | ...20-422,426-432 
  ...ionFactory.ts |    4.29 |      100 |       0 |    4.29 | ...20-371,377-394 
  ...Normalizer.ts |   23.09 |    13.72 |   30.43 |   23.09 | ...04-905,909-924 
  ...verManager.ts |   10.47 |       75 |      25 |   10.47 | ...56-675,681-711 
  ...eLspClient.ts |   17.89 |      100 |       0 |   17.89 | ...37-244,254-258 
  ...LspService.ts |   45.87 |    62.13 |   66.66 |   45.87 | ...1282,1299-1309 
  constants.ts     |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/mcp           |   78.69 |    75.34 |   75.92 |   78.69 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  ...h-provider.ts |   86.95 |      100 |   33.33 |   86.95 | ...,93,97,101-102 
  ...h-provider.ts |   73.82 |    53.92 |     100 |   73.82 | ...88-895,902-904 
  ...en-storage.ts |   98.62 |    97.72 |     100 |   98.62 | 87-88             
  oauth-utils.ts   |   70.58 |    85.29 |    90.9 |   70.58 | ...70-290,315-344 
  ...n-provider.ts |   89.83 |    95.83 |   45.45 |   89.83 | ...43,147,151-152 
 .../token-storage |   79.48 |    86.66 |   86.36 |   79.48 |                   
  ...en-storage.ts |     100 |      100 |     100 |     100 |                   
  ...en-storage.ts |   82.75 |    82.35 |   92.85 |   82.75 | ...62-172,180-181 
  ...en-storage.ts |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...en-storage.ts |   68.14 |    82.35 |   64.28 |   68.14 | ...81-295,298-314 
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/memory        |   61.95 |    74.52 |    65.3 |   61.95 |                   
  const.ts         |     100 |      100 |     100 |     100 |                   
  dream.ts         |   88.07 |    66.66 |      80 |   88.07 | ...23,131,141-147 
  ...entPlanner.ts |    55.2 |       75 |   28.57 |    55.2 | ...30,135-142,147 
  entries.ts       |   59.84 |       70 |      50 |   59.84 | ...72-180,183-189 
  extract.ts       |    95.2 |    79.16 |     100 |    95.2 | 81-86,125         
  ...entPlanner.ts |   63.08 |    65.71 |   41.17 |   63.08 | ...17,222-223,332 
  ...ionPlanner.ts |       0 |        0 |       0 |       0 | 1                 
  forget.ts        |    8.04 |      100 |       0 |    8.04 | 67-342            
  governance.ts    |       0 |        0 |       0 |       0 | 1-352             
  indexer.ts       |   83.87 |    45.45 |     100 |   83.87 | ...50,56-57,69-70 
  manager.ts       |   74.16 |    76.23 |   70.27 |   74.16 | ...77-878,891-893 
  memoryAge.ts     |   80.95 |     87.5 |      75 |   80.95 | 48-51             
  paths.ts         |   55.47 |    88.88 |   85.71 |   55.47 | ...,88-89,105-113 
  prompt.ts        |   93.36 |    71.42 |     100 |   93.36 | ...58,161,228-229 
  recall.ts        |   82.24 |    78.04 |   88.88 |   82.24 | ...71-188,246-257 
  ...ceSelector.ts |   91.56 |    73.68 |     100 |   91.56 | ...01,103-104,112 
  scan.ts          |   87.91 |    68.42 |     100 |   87.91 | ...47-48,58,82-87 
  status.ts        |   10.52 |      100 |       0 |   10.52 | 41-98             
  store.ts         |   94.44 |    83.33 |     100 |   94.44 | 56-57,92-93       
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/mocks         |       0 |        0 |       0 |       0 |                   
  msw.ts           |       0 |        0 |       0 |       0 | 1-9               
 src/models        |   89.48 |    86.09 |   87.14 |   89.48 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  ...tor-config.ts |   88.67 |     90.9 |     100 |   88.67 | 112,118,121-130   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...nfigErrors.ts |   74.22 |    47.82 |   84.61 |   74.22 | ...,67-74,106-117 
  ...igResolver.ts |   98.63 |    92.53 |     100 |   98.63 | 161,323,329       
  modelRegistry.ts |     100 |    98.21 |     100 |     100 | 182               
  modelsConfig.ts  |   85.37 |    83.54 |   81.57 |   85.37 | ...1210,1239-1240 
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/output        |     100 |      100 |     100 |     100 |                   
  ...-formatter.ts |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/permissions   |    70.5 |    87.96 |    48.2 |    70.5 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...on-manager.ts |   79.18 |    82.65 |   79.16 |   79.18 | ...85-786,793-802 
  rule-parser.ts   |   95.88 |    93.56 |     100 |   95.88 | ...40-841,990-992 
  ...-semantics.ts |   58.28 |    85.27 |    30.2 |   58.28 | ...1604-1614,1643 
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/prompts       |   83.63 |      100 |    87.5 |   83.63 |                   
  mcp-prompts.ts   |   18.18 |      100 |       0 |   18.18 | 11-19             
  ...t-registry.ts |     100 |      100 |     100 |     100 |                   
 src/qwen          |   86.03 |    79.48 |   97.18 |   86.03 |                   
  ...tGenerator.ts |   98.64 |    98.18 |     100 |   98.64 | 105-106           
  qwenOAuth2.ts    |   85.01 |    74.81 |   93.33 |   85.01 | ...,986-1002,1032 
  ...kenManager.ts |   83.79 |    76.22 |     100 |   83.79 | ...63-768,789-794 
 src/services      |   82.46 |    82.11 |   84.32 |   82.46 |                   
  ...ionService.ts |   97.95 |    94.04 |     100 |   97.95 | 255,257-261       
  ...ingService.ts |   72.04 |    78.88 |   73.07 |   72.04 | ...10-911,928-929 
  cronScheduler.ts |   97.56 |    92.98 |     100 |   97.56 | 62-63,77,155      
  ...eryService.ts |   80.43 |    95.45 |      75 |   80.43 | ...19-134,140-141 
  ...temService.ts |   89.76 |     85.1 |   88.88 |   89.76 | ...89,191,266-273 
  gitInit.ts       |     100 |      100 |     100 |     100 |                   
  gitService.ts    |   68.75 |     92.3 |   55.55 |   68.75 | ...12-122,125-129 
  ...reeService.ts |   71.83 |    68.47 |    91.3 |   71.83 | ...89-790,806,822 
  ...ionService.ts |   98.13 |     97.8 |   95.45 |   98.13 | ...32-333,380-381 
  sessionRecap.ts  |   10.71 |      100 |       0 |   10.71 | 48-161            
  ...ionService.ts |   83.91 |    71.72 |      92 |   83.91 | ...-989,1021-1022 
  sessionTitle.ts  |   93.95 |    70.37 |     100 |   93.95 | ...36-239,270-271 
  ...ionService.ts |   84.72 |    82.38 |   83.78 |   84.72 | ...8-989,995-1000 
 ...icrocompaction |   98.62 |    86.44 |     100 |   98.62 |                   
  microcompact.ts  |   98.62 |    86.44 |     100 |   98.62 | 138,142           
 src/skills        |   83.35 |    79.29 |   90.32 |   83.35 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  skill-load.ts    |   91.24 |    78.94 |     100 |   91.24 | ...37,157,169-171 
  skill-manager.ts |   80.66 |    77.85 |   88.46 |   80.66 | ...88-896,903-907 
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/subagents     |   82.44 |    80.76 |   91.11 |   82.44 |                   
  ...tin-agents.ts |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...-selection.ts |     100 |      100 |     100 |     100 |                   
  ...nt-manager.ts |   76.05 |    72.81 |   87.09 |   76.05 | ...1112,1134-1135 
  types.ts         |     100 |      100 |     100 |     100 |                   
  validation.ts    |   92.46 |    95.18 |     100 |   92.46 | 51-56,69-74,78-83 
 src/telemetry     |   68.06 |    84.31 |   73.68 |   68.06 |                   
  config.ts        |     100 |      100 |     100 |     100 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  ...-exporters.ts |   46.37 |      100 |   44.44 |   46.37 | ...85,88-89,92-93 
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...t.circular.ts |       0 |        0 |       0 |       0 | 1-111             
  ...t.circular.ts |       0 |        0 |       0 |       0 | 1-128             
  loggers.ts       |   52.09 |    61.64 |   57.77 |   52.09 | ...1218,1235-1255 
  metrics.ts       |    74.9 |    82.95 |   74.54 |    74.9 | ...58-978,981-992 
  sanitize.ts      |      80 |    83.33 |     100 |      80 | 35-36,41-42       
  sdk.ts           |   85.13 |    56.25 |     100 |   85.13 | ...78,184-185,191 
  ...etry-utils.ts |     100 |      100 |     100 |     100 |                   
  ...l-decision.ts |     100 |      100 |     100 |     100 |                   
  types.ts         |   79.13 |    94.49 |   83.33 |   79.13 | ...1136,1139-1168 
  uiTelemetry.ts   |   93.04 |    96.55 |   81.25 |   93.04 | ...95-196,202-209 
 ...ry/qwen-logger |   68.05 |    80.21 |   64.91 |   68.05 |                   
  event-types.ts   |       0 |        0 |       0 |       0 |                   
  qwen-logger.ts   |   68.05 |       80 |   64.28 |   68.05 | ...1043,1081-1082 
 src/test-utils    |   93.07 |    95.65 |   73.52 |   93.07 |                   
  config.ts        |     100 |      100 |     100 |     100 |                   
  ...st-helpers.ts |   94.11 |       90 |     100 |   94.11 | 69-70             
  index.ts         |     100 |      100 |     100 |     100 |                   
  mock-tool.ts     |   91.02 |    96.87 |   68.96 |   91.02 | ...32,196-197,210 
  ...aceContext.ts |     100 |      100 |     100 |     100 |                   
 src/tools         |   74.09 |    79.42 |   79.39 |   74.09 |                   
  ...erQuestion.ts |   87.89 |     73.8 |    90.9 |   87.89 | ...44-345,349-350 
  cron-create.ts   |   97.61 |    88.88 |   83.33 |   97.61 | 30-31             
  cron-delete.ts   |   96.55 |      100 |   83.33 |   96.55 | 26-27             
  cron-list.ts     |   96.36 |      100 |   83.33 |   96.36 | 25-26             
  diffOptions.ts   |     100 |      100 |     100 |     100 |                   
  edit.ts          |   80.72 |    85.05 |   73.33 |   80.72 | ...09-510,593-643 
  exitPlanMode.ts  |   84.61 |    85.71 |     100 |   84.61 | ...60-163,177-189 
  glob.ts          |   90.56 |    88.33 |   84.61 |   90.56 | ...24,167,297,300 
  grep.ts          |   71.24 |    87.34 |   72.22 |   71.24 | ...88,528,536-543 
  ls.ts            |   96.74 |    90.27 |     100 |   96.74 | 171-176,207,211   
  lsp.ts           |   72.58 |    60.29 |   90.32 |   72.58 | ...1202,1204-1205 
  ...nt-manager.ts |   47.47 |       60 |   44.44 |   47.47 | ...73-491,494-531 
  mcp-client.ts    |   29.65 |    71.05 |   46.87 |   29.65 | ...1434,1438-1441 
  mcp-tool.ts      |   90.92 |    88.88 |   96.42 |   90.92 | ...89-590,640-641 
  memory-config.ts |       0 |        0 |       0 |       0 | 1-48              
  ...iable-tool.ts |     100 |    84.61 |     100 |     100 | 102,109           
  read-file.ts     |    91.9 |    85.71 |   88.88 |    91.9 | ...,94-95,167-176 
  ripGrep.ts       |   94.42 |    89.33 |   91.66 |   94.42 | ...34,337,415-416 
  ...-transport.ts |    6.34 |      100 |       0 |    6.34 | 47-145            
  send-message.ts  |   97.43 |      100 |   83.33 |   97.43 | 43-44             
  shell.ts         |   82.69 |    78.12 |    92.3 |   82.69 | ...84-488,694-695 
  skill-utils.ts   |     100 |      100 |     100 |     100 |                   
  skill.ts         |   86.97 |    87.71 |   83.33 |   86.97 | ...11,315,338-360 
  task-stop.ts     |   97.33 |      100 |   83.33 |   97.33 | 39-40             
  todoWrite.ts     |   85.42 |    84.09 |   84.61 |   85.42 | ...05-410,432-433 
  tool-error.ts    |     100 |      100 |     100 |     100 |                   
  tool-names.ts    |     100 |      100 |     100 |     100 |                   
  tool-registry.ts |   67.49 |    68.91 |   65.71 |   67.49 | ...59-660,668-669 
  tools.ts         |   84.18 |    89.58 |   82.35 |   84.18 | ...25-426,442-448 
  web-fetch.ts     |   88.44 |    76.92 |    92.3 |   88.44 | ...05-306,308-309 
  write-file.ts    |   83.04 |    77.19 |   83.33 |   83.04 | ...11-414,426-461 
 src/tools/agent   |   78.41 |    82.46 |   86.11 |   78.41 |                   
  agent-context.ts |     100 |      100 |     100 |     100 |                   
  agent.ts         |   80.97 |    81.63 |   89.65 |   80.97 | ...1201,1248-1252 
  fork-subagent.ts |   42.02 |      100 |      60 |   42.02 | 54-72,91-128      
 src/utils         |   86.97 |    87.17 |   90.63 |   86.97 |                   
  LruCache.ts      |       0 |        0 |       0 |       0 | 1-41              
  ...ssageQueue.ts |     100 |      100 |     100 |     100 |                   
  ...cFileWrite.ts |   76.08 |    44.44 |     100 |   76.08 | 61-70,72          
  bareMode.ts      |   27.27 |      100 |       0 |   27.27 | 9-15,18-19        
  browser.ts       |    7.69 |      100 |       0 |    7.69 | 17-56             
  ...igResolver.ts |     100 |      100 |     100 |     100 |                   
  cronDisplay.ts   |   42.85 |    23.07 |     100 |   42.85 | 26-31,33-45,47-54 
  cronParser.ts    |   89.74 |    85.71 |     100 |   89.74 | ...,63-64,183-186 
  debugLogger.ts   |   96.12 |    93.75 |   93.75 |   96.12 | 164-168           
  editHelper.ts    |   92.67 |    82.14 |     100 |   92.67 | ...52-454,463-464 
  editor.ts        |   97.61 |    95.71 |     100 |   97.61 | ...70-271,273-274 
  ...arResolver.ts |   94.28 |    88.88 |     100 |   94.28 | 28-29,125-126     
  ...entContext.ts |     100 |       95 |     100 |     100 | 83                
  errorParsing.ts  |   97.05 |       95 |     100 |   97.05 | 39-40             
  ...rReporting.ts |   88.46 |       90 |     100 |   88.46 | 69-74             
  errors.ts        |   70.92 |    80.39 |   53.33 |   70.92 | ...03-219,223-229 
  fetch.ts         |   70.18 |    71.42 |   71.42 |   70.18 | ...42,148,161,186 
  fileUtils.ts     |   89.08 |    85.06 |   94.73 |   89.08 | ...68-875,879-885 
  forkedAgent.ts   |   62.98 |    54.54 |      75 |   62.98 | ...23-432,434-447 
  formatters.ts    |   54.54 |       50 |     100 |   54.54 | 12-16             
  ...eUtilities.ts |   89.21 |    86.66 |     100 |   89.21 | 16-17,49-55,65-66 
  ...rStructure.ts |   94.36 |    94.28 |     100 |   94.36 | ...17-120,330-335 
  getPty.ts        |    12.5 |      100 |       0 |    12.5 | 21-34             
  ...noreParser.ts |    92.3 |    89.36 |     100 |    92.3 | ...15-116,186-187 
  gitUtils.ts      |   38.88 |    84.61 |      50 |   38.88 | ...2,51-74,97-148 
  iconvHelper.ts   |     100 |      100 |     100 |     100 |                   
  ...rePatterns.ts |     100 |      100 |     100 |     100 |                   
  ...ionManager.ts |     100 |     90.9 |     100 |     100 | 26                
  ...lPromptIds.ts |     100 |      100 |     100 |     100 |                   
  jsonl-utils.ts   |   10.07 |      100 |       0 |   10.07 | ...67-200,206-212 
  ...-detection.ts |     100 |      100 |     100 |     100 |                   
  ...yDiscovery.ts |   83.85 |    79.36 |     100 |   83.85 | ...15,318,410-413 
  ...tProcessor.ts |   93.63 |       90 |     100 |   93.63 | ...96-302,384-385 
  ...Inspectors.ts |   61.53 |      100 |      50 |   61.53 | 18-23             
  ...kerChecker.ts |   82.55 |    78.57 |     100 |   82.55 | 68-69,79-84,92-98 
  notebook.ts      |   94.35 |    84.78 |     100 |   94.35 | ...10,122,174-176 
  openaiLogger.ts  |   86.27 |    82.14 |     100 |   86.27 | ...05-107,130-135 
  partUtils.ts     |     100 |      100 |     100 |     100 |                   
  pathReader.ts    |     100 |      100 |     100 |     100 |                   
  paths.ts         |   93.43 |     92.1 |     100 |   93.43 | ...50-351,353-355 
  pdf.ts           |   93.68 |    87.05 |     100 |   93.68 | ...96-297,321-325 
  ...ectSummary.ts |   89.39 |    72.41 |     100 |   89.39 | ...37-142,193-196 
  ...tIdContext.ts |     100 |      100 |     100 |     100 |                   
  proxyUtils.ts    |     100 |      100 |     100 |     100 |                   
  ...rDetection.ts |   58.57 |       76 |     100 |   58.57 | ...4,88-89,95-100 
  ...noreParser.ts |   85.45 |    85.18 |     100 |   85.45 | ...59,65-66,72-73 
  rateLimit.ts     |   91.48 |    94.11 |     100 |   91.48 | 80,93-95          
  readManyFiles.ts |   87.96 |    86.95 |     100 |   87.96 | ...05-207,223-234 
  retry.ts         |   89.81 |    88.05 |     100 |   89.81 | ...29,350,357-358 
  ripgrepUtils.ts  |   46.53 |    83.33 |   66.66 |   46.53 | ...32-233,245-322 
  ...sDiscovery.ts |   97.47 |    93.15 |     100 |   97.47 | ...03,181-182,201 
  ...tchOptions.ts |   63.85 |    64.28 |   83.33 |   63.85 | ...29-130,187-188 
  safeJsonParse.ts |   74.07 |    83.33 |     100 |   74.07 | 40-46             
  ...nStringify.ts |     100 |      100 |     100 |     100 |                   
  ...aConverter.ts |   90.78 |    87.87 |     100 |   90.78 | ...41-42,93,95-96 
  ...aValidator.ts |   93.43 |    77.41 |     100 |   93.43 | ...46,155-158,212 
  ...r-launcher.ts |   76.92 |     91.3 |   66.66 |   76.92 | ...34,136,157-195 
  ...orageUtils.ts |   92.41 |    82.82 |     100 |   92.41 | ...39,423-430,441 
  shell-utils.ts   |    83.6 |    90.63 |     100 |    83.6 | ...1040,1047-1051 
  ...lAstParser.ts |   95.58 |    85.79 |     100 |   95.58 | ...1059-1061,1071 
  ...nlyChecker.ts |   95.75 |    92.47 |     100 |   95.75 | ...00-301,313-314 
  sideQuery.ts     |     100 |    92.85 |     100 |     100 | 43                
  ...tGenerator.ts |     100 |      100 |     100 |     100 |                   
  ...ameContext.ts |     100 |      100 |     100 |     100 |                   
  symlink.ts       |   77.77 |       50 |     100 |   77.77 | 44,54-59          
  ...emEncoding.ts |   96.36 |    91.17 |     100 |   96.36 | 59-60,124-125     
  terminalSafe.ts  |     100 |      100 |     100 |     100 |                   
  ...Serializer.ts |   98.72 |       90 |     100 |   98.72 | 42-43,134,201-203 
  testUtils.ts     |   53.33 |      100 |   33.33 |   53.33 | ...53,59-64,70-72 
  textUtils.ts     |      60 |      100 |   66.66 |      60 | 36-55             
  thoughtUtils.ts  |     100 |    92.85 |     100 |     100 | 71                
  ...-converter.ts |   94.59 |    85.71 |     100 |   94.59 | 35-36             
  tool-utils.ts    |    93.6 |     91.3 |     100 |    93.6 | ...58-159,162-163 
  truncation.ts    |     100 |       92 |     100 |     100 | 52,71             
  windowsPath.ts   |   89.47 |    79.31 |     100 |   89.47 | ...57-58,62,90-91 
  ...aceContext.ts |   93.71 |    88.88 |   93.33 |   93.71 | ...24-225,249-251 
  yaml-parser.ts   |      92 |    84.31 |     100 |      92 | 49-53,65-69       
 ...ils/filesearch |   96.34 |    91.66 |     100 |   96.34 |                   
  crawlCache.ts    |     100 |      100 |     100 |     100 |                   
  crawler.ts       |   96.87 |    94.44 |     100 |   96.87 | 83-84             
  fileSearch.ts    |   93.29 |    86.76 |     100 |   93.29 | ...40-241,243-244 
  ignore.ts        |     100 |      100 |     100 |     100 |                   
  result-cache.ts  |     100 |     92.3 |     100 |     100 | 46                
 ...uest-tokenizer |   56.63 |    74.52 |   74.19 |   56.63 |                   
  ...eTokenizer.ts |   41.86 |    76.47 |   69.23 |   41.86 | ...70-443,453-507 
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...tTokenizer.ts |   68.39 |    69.49 |    90.9 |   68.39 | ...24-325,327-328 
  ...ageFormats.ts |      76 |      100 |   33.33 |      76 | 45-48,55-56       
  textTokenizer.ts |     100 |      100 |     100 |     100 |                   
  types.ts         |       0 |        0 |       0 |       0 | 1                 
-------------------|---------|----------|---------|---------|-------------------

For detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run.

Comment thread packages/core/src/agents/background-tasks.ts
The nonInteractiveCli test stub was missing two methods that the
runtime now calls when draining background agents on shutdown,
causing every runNonInteractive test to fail with TypeError.
Hard-coded forward slashes in expected paths failed on Windows where
path.join produces backslashes.
Comment thread packages/core/src/tools/read-file.ts
Comment thread packages/core/src/tools/send-message.ts Outdated
Comment thread packages/core/src/tools/agent/agent.ts Outdated
- Add internal-ID qualifier, anti-duplication clause, and large-file reading strategy to the launch tool-result template, ported from claw-code.
- Rename transcript_file to output_file for consistency.
- Reference read_file and run_shell_command via ToolNames constants instead of raw strings.
@tanzhenxin tanzhenxin marked this pull request as ready for review April 23, 2026 10:02
Comment thread packages/core/src/config/config.ts
Comment thread packages/core/src/agents/background-tasks.ts
…-control

# Conflicts:
#	packages/core/src/services/chatRecordingService.ts
These are parent-side control-plane tools for managing background
subagents. Subagents themselves cannot launch background agents
(AGENT is already excluded), so they have no agent IDs to manage
natively, and exposing the tools only widens the surface for
cross-agent interference if an ID leaks via prompt or transcript.
@wenshao

wenshao commented Apr 26, 2026

Copy link
Copy Markdown
Collaborator

谢谢推进这个 PR。先说一下整体 framing — 这两个 PR 落地的是 background tasks 的控制面和 UI 面。随着后续 task 类型加入(Bash 后台池是下一步明显方向 — 我正在 scoping 那个 PR),这里 baked-in 的每一条 agent-only 假设都得后续 rename 或 work-around 一遍。所以下面的建议大多围绕一个问题:如果同一个 registry / 工具 / dialog 也要装 shell 或 monitor,现在这套接口读起来还对吗?

高杠杆建议

  1. task_stop 写死成只能停 background agent — 应该一般化

    • 现状(packages/core/src/tools/task-stop.ts):L39 getDescription() 返回 "Stop background agent ${task_id}";L97 description "Cancel a running background agent";L52、L63 错误类型 TASK_STOP_AGENT_NOT_FOUND / _AGENT_NOT_RUNNING;消息 "No background agent found" / "Background agent ... is not running"。
    • 顾虑:将来 background shell 会注册到同一个 registry、同一个 task_id 命名空间。模型看到这个工具描述写死 agent,根本不会拿来停 shell — 即使试了,错误消息也读着别扭。后续任何 task 类型(monitor / remote 等)都同理。
    • 建议:description 改成 "Stop a running background task by its task ID.";错误改名 TASK_STOP_NOT_FOUND / TASK_STOP_NOT_RUNNING。可选:把 registry entry 里的 kind 读进错误消息("shell"/"agent"/...),保持具体而不绑定单一类型。
    • 工作量:小。
  2. SendMessage 参数 to + agent-only 措辞 → 无法延伸到非 agent 的 task 类型

    • 现状(packages/core/src/tools/send-message.ts):L24-27 interface SendMessageParams { to: string; message: string };L93 description "Send a text message to a running background agent";L56、L67 错误 SEND_MESSAGE_AGENT_*
    • 顾虑:对其他 task 类型来说,"send_message" 合理映射是:写 stdin(shell)、重启 filter(monitor)、中途 redirect(remote task)。同一个形状,下游不同。把工具锁在 agent 上就关掉了这个空间。
    • 建议:totask_id,与 task_stop 统一参数名(一套心智模型,模型/读者都更顺,整个工具家族契约一致)。错误改 SEND_MESSAGE_NOT_FOUND / SEND_MESSAGE_NOT_RUNNINGagent-core.ts:87EXTERNAL_MESSAGE_PREFIX 单独 export 这点对,保留。
    • 工作量:小。
  3. ChatRecord schema 被 4 个 agent-only 字段污染,挡住 shell/monitor JSONL transcript 共用同一 parser

    • 现状(packages/core/src/services/chatRecordingService.ts:108-114):agentIdagentNameagentColorisSidechain 全加成 ChatRecord 的 optional 字段。
    • 顾虑:本 PR 的目标是 "single parser 统吃 main session 和 subagent transcript"。如果 shell/monitor 也走 JSONL transcript(应该走),这个目标自然延伸 — 但 isSidechain 无法泛化到 backgrounded shell,而且这 4 个字段是 file-level 常量,逐条记录重复一次纯属浪费。schema 噪音 + 未来非 agent 类型的别扭。
    • 建议:4 个全部挪到现有 .meta.json sidecar — sidecar 已经是 transcript metadata 的官方位置,挪过去消除逐条重复。ChatRecord 保持干净可复用。
    • 工作量:中。
  4. transcript 路径 <projectDir>/subagents/<sessionId>/ 在目录层就写死 "subagent"

    • 现状(packages/core/src/agents/agent-transcript.ts:43-58):getSubagentSessionDir 返回 <projectDir>/subagents/<sessionId>/getAgentJsonlPathagent-<id>.jsonlread-file.ts:105-118 自动放行 <projectDir>/subagents/
    • 顾虑:这个路径通过 notification XML 里的 <output-file> 暴露给模型,已经是 model-facing contract 的一部分。延伸到其他 task 类型要么开兄弟目录(shells/monitors/)— 拆分命名空间,要么在第二种类型落地时整体改名。
    • 建议:现在就改成 <projectDir>/tasks/<sessionId>/<kind>-<id>.jsonlkind = agent / 未来类型)。read-file.ts:109 自动放行扩到 tasks/getAgentJsonlPathgetTaskJsonlPath(projectDir, sessionId, kind, id),writer 加 kind 参数。如果想推后,至少在 agent-transcript.ts:43 加 TODO,避免下个 PR 的 reviewer 措手不及。
    • 工作量:小(rename + grep)。

小建议 / 疑问

  • CANCEL_GRACE_MS = 5000 写死background-tasks.ts:28)。注释写得很好。考虑通过 Config 暴露,让测试不用 fake timer 就能压缩 — 也让未来 teardown 较慢的 task 类型可调。hasUnfinalizedAgentsbackground-tasks.ts:231)是真正的安全网,所以这条只是 polish。
  • finalizeCancelled(L185)和 finalizeCancellationIfPending(L207)微妙不同。 唯一差异是 partialResult/stats 是否带上。两个几乎一样的 method 容易调错。考虑合成一个 finalizeCancelled(agentId, partialResult?, stats?),对 notified + status !== 'cancelled' 的情况 no-op。
  • cancelled 但还没 notified 的 agent 发 SendMessage 静默丢弃send-message.ts:62-67)。如果父 agent 把 send_messagetask_stop 几乎同时发出,消息会丢但不通知父 agent。大概是有意为之 — 加一行注释说明 intentional,避免后人当 bug 改。
  • agent-context.ts AsyncLocalStorage(L17、L23)形状对 — 通过 await 边界传播 parentAgentId。未来跨 task 类型的嵌套启动可能也复用这个通道;考虑趁早把 AgentContext + parentAgentId 改成 TaskContext + parentTaskId(或保持现状以后再 rename — API 表面小,两种都行)。
  • EXTERNAL_MESSAGE_PREFIX 在循环里逐条加agent-core.ts:592)。3 条 queued message → 3 行各自带前缀。确认这是想要的(vs. 一个 block 共用一个 header)。
  • 测试覆盖扎实。 两个值得补的 gap:(a) read-file.ts:109 自动放行用的是 raw <projectDir>/subagents 没做 sanitization — 测一下 path traversal 风格的 sessionId 能不能从读取路径上逃逸;(b) 显式测 task_stop 在 agent 处于 AgentTerminateMode.CANCELLED unwind 中途时触发,验证 background-tasks.ts:125 注释里写的 cancelled → completed transition。

做对了,应该作为模式延续到未来 task 类型

  • cancel 立即返回 ack,最终 notification 由自然 unwind 发出task-stop.ts:68-83background-tasks.ts:185-203)。父 agent 不阻塞,partial result 真实保留。shell/monitor 取消也应该沿用这个契约。
  • 每个 background entry 一个独立 emitteragent.ts:1053-1069)。"一个被后台化的东西配一个 emitter" 隔离串扰 — shell 池正需要这个。
  • hasUnfinalizedAgents 作为 headless 模式退出兜底background-tasks.ts:231)。task_startedtask_notification 严格配对是正确的 SDK 契约。建议改名 hasUnfinalizedTasks,配合一般化。
  • sidecar .meta.json 与 JSONL 分离。 外部工具列 session 不用解析 transcript。形状对 — 上面建议 如何自定义密钥文件 .env可能与其他文件冲突 #3 挪进去的 4 个字段也属于这里。

拆分建议(这个切割线对未来 PR 也适用)

23 文件 / +2027 偏大但内部 coherent。两个干净的剥离点:

  • 通用控制面task-stop.ts + send-message.ts + background-tasks.ts 里的 registry lifecycle(cancel/grace/notification 原语)。与 task 类型无关。
  • agent-specific 扩展chatRecordingService.ts schema、agent-transcript.tsagent-core.ts 里的 per-agent emitter 接线、agent-context.ts。专门关于 agent transcript 和父子 agent 链接。

未来 task 类型 PR(shell 池、monitor)也应该按这个切割线 — 通用管道一个 PR,per-kind 扩展 follow-up。在这个 PR 里建立这个切割线,未来 PR 就能按同一形状 review。

English version

Thanks for taking this on. A quick framing first — these two PRs land the control plane and UI plane for background tasks. As more task types come online (a Bash background pool is the next obvious one — I'm scoping that PR), every agent-only assumption baked in here has to be either renamed away or worked around. So most comments below ask: would this still read right if the same registry / tool / dialog also held a shell or a monitor?

High-leverage suggestions

  1. task_stop is hard-wired to background agents — generalize the contract

    • Current (packages/core/src/tools/task-stop.ts): L39 getDescription() returns "Stop background agent ${task_id}"; L97 description "Cancel a running background agent"; L52,L63 errors TASK_STOP_AGENT_NOT_FOUND / _AGENT_NOT_RUNNING; messages "No background agent found" / "Background agent ... is not running".
    • Concern: A background shell will sit in the same registry under the same task_id namespace. The model sees this tool advertised as agent-only and won't try it on a shell — and the error reads wrong if it does. The same applies to any later task type (monitors, remote tasks, etc.).
    • Suggestion: Generalize description to "Stop a running background task by its task ID." Rename errors to TASK_STOP_NOT_FOUND / TASK_STOP_NOT_RUNNING. Optionally read kind from the registry entry into the error message ("shell"/"agent"/...) so it stays specific without coupling to a single type.
    • Effort: small.
  2. SendMessage parameter to and the agent-only framing won't stretch to non-agent task kinds

    • Current (packages/core/src/tools/send-message.ts): L24-27 interface SendMessageParams { to: string; message: string }; L93 description "Send a text message to a running background agent"; L56,L67 errors SEND_MESSAGE_AGENT_*.
    • Concern: For other task kinds, "send_message" plausibly maps to stdin write (shell), filter restart (monitor), or mid-flight steering (remote task). Same shape, different downstream. Locking the tool to agents in description and errors removes that headroom.
    • Suggestion: Rename totask_id to match task_stop (one parameter name across the family — easier model recall and a uniform contract). Generalize errors to SEND_MESSAGE_NOT_FOUND / SEND_MESSAGE_NOT_RUNNING. Keep exporting EXTERNAL_MESSAGE_PREFIX (agent-core.ts:87) — it's the right shared mechanism.
    • Effort: small.
  3. ChatRecord schema polluted with four agent-only fields blocks shell/monitor JSONL transcripts from sharing the same parser

    • Current (packages/core/src/services/chatRecordingService.ts:108-114): agentId, agentName, agentColor, isSidechain added as optional fields on every ChatRecord.
    • Concern: This PR's stated goal is "single parser covers main session and subagent transcripts." That goal extends naturally if shells/monitors get JSONL transcripts too — except isSidechain doesn't generalize, and these four fields are file-level constants leaking onto every record. Schema noise + a future awkwardness for non-agent kinds.
    • Suggestion: Move all four to the existing .meta.json sidecar — it's already the documented home for transcript metadata and removes per-record duplication. ChatRecord stays clean and reusable.
    • Effort: medium.
  4. Transcript path <projectDir>/subagents/<sessionId>/ builds in "subagent" at the directory level

    • Current (packages/core/src/agents/agent-transcript.ts:43-58): getSubagentSessionDir returns <projectDir>/subagents/<sessionId>/; getAgentJsonlPath writes agent-<id>.jsonl. read-file.ts:105-118 auto-allows <projectDir>/subagents/.
    • Concern: The path is exposed to the model via <output-file> in the notification XML, so it's part of the model-facing contract. Extending to other task kinds means either sibling dirs (shells/, monitors/) — splitting the namespace — or a directory rename when a second kind lands.
    • Suggestion: Rename to <projectDir>/tasks/<sessionId>/<kind>-<id>.jsonl now (kind = agent / future). read-file.ts:109 auto-allow widens to tasks/. getAgentJsonlPathgetTaskJsonlPath(projectDir, sessionId, kind, id). Add a kind parameter to the writer. If you want to defer, at minimum a TODO at agent-transcript.ts:43 so reviewers of the next PR aren't surprised.
    • Effort: small (rename + grep).

Smaller suggestions / questions

  • CANCEL_GRACE_MS = 5000 hardcoded (background-tasks.ts:28). Comment is excellent. Consider exposing via Config so tests don't need fake timers — and so future task kinds with longer teardowns can tune it. hasUnfinalizedAgents (background-tasks.ts:231) is the real backstop, so this is polish.
  • finalizeCancelled (L185) and finalizeCancellationIfPending (L207) are subtly different. Only divergence is whether partialResult/stats get attached. Easy to call the wrong one. Consider collapsing to one finalizeCancelled(agentId, partialResult?, stats?) that no-ops on notified + status !== 'cancelled'.
  • SendMessage to a cancelled-but-not-notified agent silently no-ops (send-message.ts:62-67). If parent races send_message against task_stop, message is dropped. Probably intentional — worth a one-line comment so future readers don't read it as a bug.
  • agent-context.ts AsyncLocalStorage (L17, L23) is the right shape — propagates parentAgentId through await. Future nested launches across task kinds will likely reuse this channel; consider whether AgentContext + parentAgentId should be TaskContext + parentTaskId while it's young (or keep them as-is and rename later — the API surface is small).
  • EXTERNAL_MESSAGE_PREFIX applied per-message in a loop (agent-core.ts:592). 3 queued messages → 3 prefixed lines. Confirm intentional vs. one shared header.
  • Test coverage solid. Two gaps to consider: (a) read-file.ts:109 auto-allow uses raw <projectDir>/subagents with no sanitization on lookup — exercise path-traversal sessionIds for read-side escape; (b) explicit task_stop against an agent mid-AgentTerminateMode.CANCELLED unwind, exercising the cancelled → completed transition noted at background-tasks.ts:125.

Things you got right (worth keeping as patterns for future task kinds)

  • Cancel returns immediately, terminal notification from natural unwind (task-stop.ts:68-83, background-tasks.ts:185-203). Parent isn't blocked, real partial result preserved. Keep this contract for shell/monitor cancellation too.
  • Dedicated emitter per background entry (agent.ts:1053-1069). The "one emitter per backgrounded thing" pattern isolates cross-talk — exactly what a shell pool needs.
  • hasUnfinalizedAgents headless holdback invariant (background-tasks.ts:231). Strict pairing of task_started and task_notification is the right SDK contract. Rename suggestion: hasUnfinalizedTasks along with the generalization.
  • Sidecar .meta.json separate from JSONL. External tooling can list sessions without parsing transcripts. Keep the shape and grow it (suggestion 如何自定义密钥文件 .env可能与其他文件冲突 #3).

Suggested split (and why this cleavage matters beyond this PR)

23 files / +2027 is upper-edge but coherent. Two natural carve-outs:

  • Generic control plane: task-stop.ts + send-message.ts + the registry lifecycle in background-tasks.ts (cancel/grace/notification primitives). Independent of task kind.
  • Agent-specific extension: chatRecordingService.ts schema, agent-transcript.ts, agent-core.ts per-agent emitter wiring, agent-context.ts. Specifically about agent transcripts and parent-child agent linkage.

This is also the cleavage future task-kind PRs (shell pool, monitor) should follow — generic plumbing in one PR, per-kind extensions in follow-ups. Establishing the split here makes those future PRs reviewable in the same shape.

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Critical] packages/cli/src/nonInteractiveCli.test.ts:1176: Property bySource is missing in the object passed to setupMetricsMock, but required in type ModelMetrics.

This fails the build typecheck. Fix by adding the bySource object to the token metrics mock.

— gemini-3.1-pro via Gemini CLI /review

wenshao pushed a commit that referenced this pull request Apr 26, 2026
Three [Critical] from the auto review + naming alignment with Claude Code:

- shell.ts settle: non-zero exit code or termination signal now bucket into
  `failed` instead of `completed`. The previous `if (result.error) fail else
  complete()` would misreport `false` / failed `npm test` as success because
  ShellExecutionService surfaces ordinary command failures as a non-zero
  exitCode with `error: null`. Failure reason carries the exit code or signal
  so `/tasks` shows the real cause.

- ShellExecutionService.childProcessFallback: add `streamStdout` mode that
  emits each decoded chunk through the existing onOutputEvent path. The
  default (foreground) path continues to buffer + emit the cleaned final
  blob, so existing in-line shell calls are unaffected. executeBackground
  opts in via `{ streamStdout: true }`, which is what makes the captured
  output file actually useful for long-running processes (dev servers,
  watchers) — without it the file stayed empty until the process exited.

- shell.ts test fixture: cancel-settle test was using `signal: 'SIGTERM'`
  but `ShellExecutionResult.signal` is `number | null`. TS2322 broke the
  build; switched to `signal: null`. Added a test that explicitly covers
  the new "non-zero exit → failed" path so the bucketing change has
  regression coverage.

- shell.ts comment: explicitly document why background shells force
  `shouldUseNodePty=false` (no terminal, no human; node-pty would be dead
  weight for fire-and-forget commands).

- /bashes → /tasks (alias bashes), description "List and manage background
  tasks" — matches Claude Code's command name. Currently lists shells only;
  will surface other task kinds (subagents, monitor) as those registries
  land via #3471 / #3488.
wenshao pushed a commit that referenced this pull request Apr 26, 2026
- shellExecutionService streaming: drop stdout/stderr buffer + outputChunks
  accumulation in streaming mode. Each decoded chunk goes straight to
  onOutputEvent and is GC-eligible immediately. Long-running background
  commands (dev servers, watchers) no longer accumulate unbounded memory
  proportional to total output. Buffered (foreground) mode is unchanged.

- shell.ts executeBackground: stripAnsi each chunk before writing to the
  output file. Dev servers / build tools spam color codes and cursor-move
  sequences that would render as garbage in the file the agent reads.

- bashesCommand: command description "List and manage" → "List background
  tasks" — current implementation only supports listing, cancellation
  follows when the unified task_stop tool from #3471 is wired in. Replace
  the hand-rolled formatRuntime helper with the shared formatDuration
  utility (uses hideTrailingZeros for parity with the previous output).

- backgroundShellRegistry: add a comment documenting the lack of an
  eviction policy as a known limitation. LRU / age-based / capped-size
  eviction (and on-disk output rotation) is left as a follow-up alongside
  the broader output-file lifecycle story.
wenshao pushed a commit that referenced this pull request Apr 26, 2026
Four reviewer concerns from @wenshao + @doudouOUC:

- [Critical] Config.shutdown() now also calls
  `backgroundShellRegistry.abortAll()`. Previously only the subagent
  registry was aborted, so a managed background shell could outlive the
  CLI process and orphan its child. Symmetric with how
  `BackgroundTaskRegistry.abortAll()` is wired in.

- [P1] shell.ts executeBackground strips a trailing `&` from the command
  before spawn. The managed path is itself the backgrounding mechanism;
  forwarding `node server.js &` verbatim made bash exit immediately while
  the real child outlived the wrapper, causing the registry to settle as
  `completed` while the shell was still running and chunked output to
  land on a closed stream. Strip + warn.

- [P2] Output file moves under `storage.getProjectTempDir()` (specifically
  `<projectTempDir>/background-shells/<sessionId>/shell-<id>.output`).
  `ReadFileTool` already auto-allows the project temp dir, so the LLM
  can `Read` the captured output without bouncing off a permission
  prompt — important because background-agent contexts can't surface
  interactive prompts.

- [P2] Background shells are no longer killed when the current turn's
  AbortSignal fires. Forwarding the turn signal into the entry's
  AbortController meant a Ctrl+C on the turn would also terminate
  intentionally backgrounded dev servers / watchers, contradicting the
  independent-lifecycle promise. Cancellation now flows only through
  `entryAc` (driven by future `task_stop` integration via #3471).

Tests:
- New `abortAll` registry tests cover running / mixed / empty cases.
- `runs background commands as managed pool entries` test stops asserting
  the wrapper-vs-entry signal identity since they're now structurally
  separate (no turn-to-entry forwarding).
- New `does not forward the turn signal into the background shell` test
  pins the new behavior.
- New `strips trailing & from the spawned command` test pins the strip.
- Removed the cancel-via-outer-signal settle test — that path no longer
  exists; cancellation is exercised end-to-end via the registry's own
  `cancel` and `abortAll` tests in `backgroundShellRegistry.test.ts`.
@tanzhenxin

Copy link
Copy Markdown
Collaborator Author

@wenshao Could you take another pass when you have a moment? All open comments have replies, and 57c89e4 is a no-behavior-change refactor that generalizes the model-facing framing of the two control-plane tools from "agent" to "task" (so BackgroundTaskRegistry can host other task kinds later). Thanks!

Comment thread packages/core/src/agents/runtime/agent-core.ts
Comment thread packages/core/src/agents/background-tasks.ts
Comment thread packages/core/src/agents/runtime/agent-core.ts
Comment thread packages/core/src/agents/background-tasks.ts
@tanzhenxin

Copy link
Copy Markdown
Collaborator Author

Hi @wenshao — wanted to flag a pattern across the recent /review batches on this PR.

The most recent run posted byte-for-byte duplicates of two comments I'd already rejected and resolved earlier today, against unchanged code. The originals themselves were already low-signal:

  • The "memory leak" describes bounded growth in a session-scoped map driven by user-initiated launches — single-digit agents, KB-scale results, no measured problem.
  • The "external messages stranded" edge case fires only on truly empty rounds (no text, no tool calls), where the nudge is malfunction recovery; injecting queued user messages alongside contradictory recovery instructions would worsen recovery, not help.

Earlier batches followed the same pattern — e.g., the read_file allowlist comment claimed the transcript path lived under the project root when it actually lives under storage.getProjectDir() (where the allowlist already pointed). Most of the recent comments have been triaged as false positives or pathological edge cases.

Could we (a) have the tool dedupe against resolved threads, and (b) skip re-runs when the diff hasn't changed since the last review? The current signal-to-noise is costing more triage time than the comments contribute.

@yiliang114 yiliang114 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tanzhenxin tanzhenxin dismissed wenshao’s stale review April 27, 2026 12:36

Resolved. Merge into main now.

@tanzhenxin tanzhenxin merged commit 581c74d into main Apr 27, 2026
13 checks passed
tanzhenxin added a commit that referenced this pull request Apr 27, 2026
Adds the user-facing surface for background tasks on top of the
model-facing agent control primitives merged in #3471. A dedicated
pill in the footer summarises running tasks, ↓ focuses it, and Enter
opens a combined dialog listing every task with a detail view that
shows the original prompt, live stats, and a rolling progress feed
of recent tool invocations.

Also renames BackgroundAgent* to BackgroundTask* for consistency with
the user-facing terminology and the task_* tool family.
wenshao pushed a commit that referenced this pull request Apr 28, 2026
…3488)

* feat(cli): background-task UI — pill, combined dialog, detail view

Adds the user-facing surface for background tasks on top of the
model-facing agent control primitives merged in #3471. A dedicated
pill in the footer summarises running tasks, ↓ focuses it, and Enter
opens a combined dialog listing every task with a detail view that
shows the original prompt, live stats, and a rolling progress feed
of recent tool invocations.

Also renames BackgroundAgent* to BackgroundTask* for consistency with
the user-facing terminology and the task_* tool family.

* chore: trigger CI
wenshao added a commit that referenced this pull request Apr 28, 2026
* feat(core): managed background shell pool with /bashes command

Replace shell.ts's `&` fork-and-detach background path with a managed
process registry. Background shells now have observable lifecycle, captured
output, and explicit cancellation — matching the pattern used by background
subagents (#3076).

Phase B from #3634 (background task management roadmap).

What changes
- New `BackgroundShellRegistry` (services/backgroundShellRegistry.ts):
  per-process entry with status (running / completed / failed / cancelled),
  AbortController, output file path. State transitions are one-shot
  (terminal status sticks; late callbacks no-op). Mirrors the lifecycle
  shape of #3471's BackgroundTaskRegistry so the two can be unified later.
- `shell.ts` is_background path rewritten as `executeBackground`:
  - Spawns the unwrapped command (no '&', no pgrep envelope)
  - Streams stdout to `<projectDir>/tasks/<sessionId>/shell-<id>.output`
    (path layout aligns with the direction sketched in #3471 review)
  - Bridges the external abort signal into the entry's AbortController so
    a single source of truth governs cancellation
  - Returns immediately with id + output path; agent's turn isn't blocked
  - Settles the registry entry asynchronously when ShellExecutionService
    resolves: complete (clean exit) / fail (error) / cancel (aborted)
- Removes ~120 lines of dead bg-specific code from shell.ts:
  pgrep wrapping, '&' appending, Windows ampersand cleanup, Windows
  early-return path, bg PID parsing, tempFile cleanup
- New `/bashes` slash command: lists registered shells with id, status,
  runtime, command, output path. Empty state prints a friendly message.

What this PR doesn't do
- Footer pill / dialog integration — gated on #3488 landing
- task_stop / send_message integration — gated on #3471 landing
- Auto-backgrounding heuristics for long foreground bash — Phase D

Test plan
- 11 registry unit tests (state machine + idempotent terminal transitions)
- 4 background-path tests in shell.test.ts (spawn no-wrap + complete /
  fail / cancel settle paths)
- 2 /bashes command tests (empty + populated)
- Full core suite: 247 files / 6075 passed (existing tests unaffected)

* fix(core): address PR #3642 review feedback

Three [Critical] from the auto review + naming alignment with Claude Code:

- shell.ts settle: non-zero exit code or termination signal now bucket into
  `failed` instead of `completed`. The previous `if (result.error) fail else
  complete()` would misreport `false` / failed `npm test` as success because
  ShellExecutionService surfaces ordinary command failures as a non-zero
  exitCode with `error: null`. Failure reason carries the exit code or signal
  so `/tasks` shows the real cause.

- ShellExecutionService.childProcessFallback: add `streamStdout` mode that
  emits each decoded chunk through the existing onOutputEvent path. The
  default (foreground) path continues to buffer + emit the cleaned final
  blob, so existing in-line shell calls are unaffected. executeBackground
  opts in via `{ streamStdout: true }`, which is what makes the captured
  output file actually useful for long-running processes (dev servers,
  watchers) — without it the file stayed empty until the process exited.

- shell.ts test fixture: cancel-settle test was using `signal: 'SIGTERM'`
  but `ShellExecutionResult.signal` is `number | null`. TS2322 broke the
  build; switched to `signal: null`. Added a test that explicitly covers
  the new "non-zero exit → failed" path so the bucketing change has
  regression coverage.

- shell.ts comment: explicitly document why background shells force
  `shouldUseNodePty=false` (no terminal, no human; node-pty would be dead
  weight for fire-and-forget commands).

- /bashes → /tasks (alias bashes), description "List and manage background
  tasks" — matches Claude Code's command name. Currently lists shells only;
  will surface other task kinds (subagents, monitor) as those registries
  land via #3471 / #3488.

* fix(core): address PR #3642 second-round review feedback

- shellExecutionService streaming: drop stdout/stderr buffer + outputChunks
  accumulation in streaming mode. Each decoded chunk goes straight to
  onOutputEvent and is GC-eligible immediately. Long-running background
  commands (dev servers, watchers) no longer accumulate unbounded memory
  proportional to total output. Buffered (foreground) mode is unchanged.

- shell.ts executeBackground: stripAnsi each chunk before writing to the
  output file. Dev servers / build tools spam color codes and cursor-move
  sequences that would render as garbage in the file the agent reads.

- bashesCommand: command description "List and manage" → "List background
  tasks" — current implementation only supports listing, cancellation
  follows when the unified task_stop tool from #3471 is wired in. Replace
  the hand-rolled formatRuntime helper with the shared formatDuration
  utility (uses hideTrailingZeros for parity with the previous output).

- backgroundShellRegistry: add a comment documenting the lack of an
  eviction policy as a known limitation. LRU / age-based / capped-size
  eviction (and on-disk output rotation) is left as a follow-up alongside
  the broader output-file lifecycle story.

* fix(core): address PR #3642 third-round review feedback

- shell.ts executeBackground: add 'error' listener on the output write
  stream. fs.createWriteStream surfaces write failures (disk full,
  permission, fs going away) as 'error' events; without a listener Node
  treats it as an uncaught exception and kills the entire CLI session.
  Log + drop is the sane default — the registry still settles via
  resultPromise so /tasks shows the right terminal status.

- shell.ts executeBackground: store the abort handler reference and
  removeEventListener in the settle callback. Background shells outlive
  the turn signal; the dangling listener was keeping `entryAc` (and
  transitively `outputStream`) reachable until the turn signal itself was
  GC'd, which for long sessions would never happen.

- shell.test.ts: extend the createWriteStream mock with an `on` stub so
  the new error-listener wiring doesn't crash the test suite.

* refactor(cli): drop /bashes alias and rename file to tasksCommand

Per follow-up review: the slash command should be exclusively /tasks.
Removes the `bashes` altName, renames `bashesCommand{,.test}.ts` →
`tasksCommand{,.test}.ts`, renames the exported binding `bashesCommand`
→ `tasksCommand`, and cleans up the remaining `/bashes` references in
backgroundShellRegistry.ts comments. No behavior change beyond the
alias removal.

* refactor(cli): finish tasksCommand rename — apply content changes

The previous commit (03c8503) only captured the file rename via
`git mv`; the export name change (`bashesCommand` → `tasksCommand`),
the removal of `altNames: ['bashes']`, the import update in
BuiltinCommandLoader, and the `/bashes` → `/tasks` comments in
backgroundShellRegistry.ts were unstaged when that commit landed.
Squash candidate before merge.

* fix(core): address PR #3642 fourth-round review feedback

Four reviewer concerns from @wenshao + @doudouOUC:

- [Critical] Config.shutdown() now also calls
  `backgroundShellRegistry.abortAll()`. Previously only the subagent
  registry was aborted, so a managed background shell could outlive the
  CLI process and orphan its child. Symmetric with how
  `BackgroundTaskRegistry.abortAll()` is wired in.

- [P1] shell.ts executeBackground strips a trailing `&` from the command
  before spawn. The managed path is itself the backgrounding mechanism;
  forwarding `node server.js &` verbatim made bash exit immediately while
  the real child outlived the wrapper, causing the registry to settle as
  `completed` while the shell was still running and chunked output to
  land on a closed stream. Strip + warn.

- [P2] Output file moves under `storage.getProjectTempDir()` (specifically
  `<projectTempDir>/background-shells/<sessionId>/shell-<id>.output`).
  `ReadFileTool` already auto-allows the project temp dir, so the LLM
  can `Read` the captured output without bouncing off a permission
  prompt — important because background-agent contexts can't surface
  interactive prompts.

- [P2] Background shells are no longer killed when the current turn's
  AbortSignal fires. Forwarding the turn signal into the entry's
  AbortController meant a Ctrl+C on the turn would also terminate
  intentionally backgrounded dev servers / watchers, contradicting the
  independent-lifecycle promise. Cancellation now flows only through
  `entryAc` (driven by future `task_stop` integration via #3471).

Tests:
- New `abortAll` registry tests cover running / mixed / empty cases.
- `runs background commands as managed pool entries` test stops asserting
  the wrapper-vs-entry signal identity since they're now structurally
  separate (no turn-to-entry forwarding).
- New `does not forward the turn signal into the background shell` test
  pins the new behavior.
- New `strips trailing & from the spawned command` test pins the strip.
- Removed the cancel-via-outer-signal settle test — that path no longer
  exists; cancellation is exercised end-to-end via the registry's own
  `cancel` and `abortAll` tests in `backgroundShellRegistry.test.ts`.

* fix(core): tighten trailing & strip — narrow regex + ReDoS-safe

Two reviewer concerns on the same line of #3642 round 4:

- [Critical CodeQL] `\s*&+\s*$` is a polynomial-time regex on
  uncontrolled input (long all-`&` strings backtrack quadratically).
- [P2 doudouOUC] `&+` is too greedy: it also rewrites `npm run dev &&`
  into `npm run dev` (breaks logical AND syntax) and `echo foo \&` into
  `echo foo \` (eats the escaped literal). Only the bare bash background
  operator should be stripped.

Replace the regex with a small linear-time helper
`stripTrailingBackgroundAmp` that explicitly checks for the three
"don't touch" cases (`&&`, `\&`, no trailing `&`). Plain `endsWith` /
`slice` — no regex backtracking, and the intent reads off the page.

Tests:
- Existing strip-trailing-`&` test still passes.
- New `does not strip a trailing &&` test pins the logical-AND case.
- New `does not strip an escaped trailing \\&` test pins the escape case.

* fix(core): keep binary-detection sniff in streaming mode

@doudouOUC noted that `streamStdout` shortcut returned before the
binary-sniff path, so a background command emitting binary bytes
(`cat /bin/ls`, image dump, etc.) would be text-decoded and appended
to the task output file unbounded.

Restructure handleOutput so the sniff-and-cutover logic runs in both
modes:

- Both modes accumulate up to MAX_SNIFF_SIZE for the binary check.
  The accumulator is bounded; once the threshold is reached, it stops
  growing in streaming mode (dropped on binary detection / left
  inert on text confirmation) and continues to accumulate in buffered
  mode (existing foreground behavior).
- Streaming mode emits 'binary_detected' as soon as `isBinary` trips
  so the consumer can stop writing the output file. Up to ~4KB of
  bytes may have been emitted as text chunks before detection — this
  is bounded and acceptable; the unbounded write is the pathology
  reviewers flagged.
- Streaming text mode still emits each decoded chunk immediately and
  does not accumulate stdout/stderr strings, so long-running text
  streams remain GC-friendly.
- Buffered (foreground) behavior is unchanged — the sniff accumulator
  is the same path the existing tests cover.

Tests: 50 shellExecutionService + 11 backgroundShellRegistry + 57
shell.test.ts all pass; no regressions.

* fix(core): tighten streaming sniff bound + Windows rmSync flake

Two unrelated reds on the latest CI run:

1. [P1 doudouOUC] Streaming sniff buffer leaks on small chunks.
   The previous fix recomputed `sniffedBytes` from
   `Buffer.concat(outputChunks.slice(0, 20)).length` on every chunk —
   pinned to the first 20 chunks. If those total under MAX_SNIFF_SIZE
   (line-sized stdout, e.g. dev-server logs) the byte count never grew,
   the sniff branch stayed open forever, and `outputChunks` accumulated
   every later chunk — exactly the leak `streamStdout` was meant to
   prevent.

   Track sniffed bytes by running sum (`sniffedBytes += data.length`)
   so the bound is genuine. When sniff confirms text in streaming mode,
   drop the accumulator immediately so subsequent chunks fall through
   the streaming emit path without ever touching it.

2. file-exporters.test.ts afterEach `fs.rmSync` flaked on Windows
   (ENOTEMPTY: directory not empty). The exporter's underlying write
   stream hasn't always released its handle by the time `rmSync` runs.
   Pass `maxRetries: 5, retryDelay: 50` so the cleanup retries through
   the brief Windows handle-release window instead of failing the test
   on a CI quirk.

---------

Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>
tanzhenxin pushed a commit that referenced this pull request Apr 29, 2026
* feat(core): wire background shells into the task_stop tool

Phase B follow-up #1 from #3634, unblocked by #3471 (control plane) merging
in. The model can now cancel a managed background shell with the same
`task_stop` tool it uses for subagents — no more falling back to
`kill <pid>` via BashTool.

Lookup order: subagent registry first (existing behavior), then the
background shell registry as a fallback. Agent IDs follow
`<subagentName>-<suffix>` and shell IDs follow `bg_<8 hex chars>`, so the
two namespaces cannot collide in practice; the order is fixed for
determinism (a defensive test pins agent-wins-over-shell).

The shell cancel path resolves through the entry's own AbortController
(which `BackgroundShellRegistry.cancel` triggers); the child process
exit handler then settles the registry to `cancelled` and the on-disk
output file is preserved for inspection via `/tasks` or a direct `Read`.
This matches Phase B's "registry's own AbortController is the
cancellation source of truth" design without needing the in-flight
notification framework that subagents use.

Tests: 7 task-stop tests (was 4) — added cancel-shell happy path,
NOT_RUNNING for already-exited shell, and a defensive
agent-takes-precedence-on-id-collision case.

* fix(core): defer shell terminal transition until spawn handler settles

@doudouOUC noticed that the previous task_stop path called
`BackgroundShellRegistry.cancel(id, Date.now())`, which marked the entry
`cancelled` immediately. The spawn handler's settle path only records
real exit info via cancel/complete/fail when the entry is still
`running`, so the cancel-vs-exit race could permanently hide a real
completed/failed result and `/tasks` would show a terminal endTime
while the process was still draining.

Add a `requestCancel(id)` method to `BackgroundShellRegistry` that
triggers the entry's AbortController only; status stays `running` until
the settle path observes the abort and records the real terminal state.
The immediate-mark `cancel(id, endTime)` is reserved for `abortAll()` /
shutdown, where the CLI process is tearing down anyway and there is no
settle handler to wait for.

Tests updated:
- `task-stop.test.ts` cancel-shell happy path now asserts the entry
  stays `running` with `endTime` undefined post-stop, and the abort
  signal fires (the settle path's contract, not task_stop's, is the
  one that flips status).
- 3 new `requestCancel` tests in `backgroundShellRegistry.test.ts`:
  running → abort+still-running, terminal entry no-op, unknown id no-op.

---------

Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>
@wenshao

wenshao commented Apr 29, 2026

Copy link
Copy Markdown
Collaborator

PR 3471 tmux real-user verification

  • Date: 2026-04-29
  • tmux session: pr3471-agent-control-tmux-20260429-213334
  • Command: npm run dev -- --approval-mode yolo
  • Workspace: /Users/wenshao/git/qwen-code-x7
  • Result: PASS

Scope

Real TUI verification for PR 3471 background agent control behavior:

  1. Launch a background general-purpose agent.
  2. Read its JSONL transcript while it is still running.
  3. Send a mid-flight message with send_message.
  4. Stop it with task_stop.
  5. Confirm final cancelled notification is surfaced in the TUI.

Observations

Primary evidence is in tmux-readable-full.log, section 05 final verified result / final capture before cleanup.

  • Background task launched successfully:
    • Task id: general-purpose-call_CSsQjVEFYvvJNCco3ykPxy0c
    • TUI showed general-purpose ● Running in background.
  • Transcript was readable while the task was running:
    • Path: /Users/wenshao/.qwen/projects/-Users-wenshao-git-qwen-code-x7/subagents/322c7602-2a4e-4cc6-abef-677fd24bb974/agent-general-purpose-call_CSsQjVEFYvvJNCco3ykPxy0c.jsonl
    • TUI showed successful ReadFile calls against the transcript.
  • send_message worked:
    • TUI showed SendMessage Send message to task general-purpose-call_CSsQjVEFYvvJNCco3ykPxy0c.
    • The verification run reported marker MID_FLIGHT_INJECTION_PR3471_CHECK appeared in a type:"user" transcript record.
  • task_stop worked:
    • TUI showed TaskStop Stop background task general-purpose-call_CSsQjVEFYvvJNCco3ykPxy0c and Cancelled: 验证后台控制.
    • Final notification appeared: Background agent "general-purpose: 验证后台控制" was cancelled.
    • Final notification fields included Status: cancelled, the task id, output file path, usage, and duration.

Artifacts

  • Primary readable log: tmp/pr3471-agent-control-tmux-20260429-213334/tmux-readable-full.log
  • Final screen capture: tmp/pr3471-agent-control-tmux-20260429-213334/tmux-final-capture.log
  • Scratch/current pane: tmp/pr3471-agent-control-tmux-20260429-213334/current-pane.txt
  • This report: tmp/pr3471-agent-control-tmux-20260429-213334/report.md

Side effects

  • No source files were intentionally modified by the verification prompt.
  • A background subagent transcript was written under the Qwen project storage directory shown above.
  • The tmux session was cleaned up after final capture.

wenshao pushed a commit that referenced this pull request May 2, 2026
… C) (#3684)

* feat(core): event monitor tool with throttled stdout streaming (Phase C)

Add a new Monitor tool that spawns a long-running shell command and streams
its stdout lines back to the agent as event notifications. This is Phase C
from the background task management roadmap (#3634, #3666).

What changes:
- New MonitorRegistry (services/monitorRegistry.ts): per-monitor entry with
  lifecycle (running/completed/failed/cancelled), idle timeout auto-stop,
  max events auto-stop, AbortController-based cancellation. Follows the
  same structural pattern as BackgroundTaskRegistry.
- New Monitor tool (tools/monitor.ts): spawns via child_process.spawn with
  independent AbortController (Ctrl+C won't kill monitors), separate
  stdout/stderr line buffers, token-bucket throttling (burst=5, sustain=1/s).
  Returns immediately with monitor ID; events stream as notifications.
- Sleep interception in shell.ts: detectBlockedSleepPattern() blocks
  foreground `sleep N` (N>=2) and guides model to use Monitor or
  is_background instead.
- Config integration: MonitorRegistry instantiation, accessor, shutdown
  cleanup (abortAll), lazy tool registration.
- CLI wiring: notification callbacks in useGeminiStream.ts (interactive)
  and nonInteractiveCli.ts (headless), including hold-back loop abort on
  exit and SIGINT cleanup.

What this PR doesn't do (gated on #3471/#3488):
- Footer pill / dialog integration
- task_stop / send_message integration

Test plan:
- 21 MonitorRegistry unit tests (lifecycle, idle timeout, max events,
  XML escaping, nonexistent ID guard, callback clearing)
- 20 Monitor tool unit tests (validation, spawn, line buffering, separate
  stdout/stderr buffers, throttling, signal-killed path, turn isolation)
- 7 detectBlockedSleepPattern unit tests
- 2 E2E tests (monitor invocation, sleep interception)
- Full core suite: 248 files / 6151 passed

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): hold-back loop waits for monitors + emit task_started for SDK

Two fixes from Codex review:

P1: The non-interactive hold-back loop now includes monitorRegistry.getRunning()
in its wait condition, so monitors can stream events before the CLI exits.
Previously monitors were aborted immediately after the agent's first reply.

P2: MonitorRegistry gains setRegisterCallback(), and nonInteractiveCli wires
it to emit task_started system messages. Stream-json/SDK consumers now see
a task_started for each monitor, matching the backgroundTaskRegistry contract.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): Windows process kill + pipeline sleep false-positive

Two fixes from Codex review:

P1: Monitor abort handler now uses `taskkill /f /t` on Windows instead
of POSIX-only `process.kill(-pid)`. Follows the existing pattern in
ShellExecutionService.childProcessFallback.

P2: detectBlockedSleepPattern no longer uses splitCommands (which splits
on `|` pipes). Replaced with a regex that only matches sleep followed by
sequential separators (&&, ||, ;, &, newline), not pipes. `sleep 5 | cat`
is now correctly allowed.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(test): resolve TS errors in monitor.test.ts mock types

Use Object.defineProperty for readonly ChildProcess.pid and proper
Readable type for stdout/stderr mocks to satisfy strict tsc builds.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): remove false notification promise + add early-abort guard

P1: Sleep interception guidance no longer promises "completion notification"
for is_background — that wiring doesn't exist yet (follow-up from #3642).

P2: Monitor.execute() now checks _signal.aborted before spawning, preventing
a race where cancellation during tool scheduling still launches a monitor.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(test): add getMonitorRegistry mock to useGeminiStream tests

The useGeminiStream hook now calls config.getMonitorRegistry() to wire
up monitor notification callbacks. The test mock config was missing this
method, causing 64 test failures with "config.getMonitorRegistry is not
a function".

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(test): add getMonitorRegistry mock to nonInteractiveCli tests

Same fix as useGeminiStream.test.tsx — the mock config needs
getMonitorRegistry to avoid "is not a function" errors (29 failures).

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address PR review — CORE_TOOLS, directory param, test fix

1. Add 'monitor' to PermissionManager.CORE_TOOLS so coreTools allowlist
   correctly gates the monitor tool (same as run_shell_command).

2. Add optional 'directory' parameter to MonitorTool with workspace
   validation, mirroring ShellTool's directory support for multi-root
   workspaces.

3. Fix sleep-interception E2E test: readToolLogs() doesn't expose
   toolResult, so the old assertion was dead code. Now verifies via
   the model's output text instead.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address MonitorTool review #4186888042

Addresses three [Critical] review comments on packages/core/src/tools/monitor.ts:

1. Partial-line buffer unbounded growth (processLines)
   MAX_LINE_LENGTH was only enforced after a newline, so a command emitting
   a long stream without newlines would grow buffer.value without bound and
   re-split the entire accumulated string on every chunk. Now, when the
   buffer has no newline and exceeds MAX_LINE_LENGTH, we force-emit a single
   truncated event through the throttled path and reset the buffer.

2. Missing type guard on params.command
   validateToolParamValues called params.command.trim() without a typeof
   check. Schema validation normally catches this, but SDK/direct callers
   could bypass it and hit an uncaught TypeError. Added typeof === 'string'
   guard, matching the pattern used for max_events / idle_timeout_ms.

3. Workspace check bypass via raw startsWith
   The directory validator used workspaceDirs.some(d => params.directory
   .startsWith(d)), which allowed prefix collisions (e.g. /tmp/project-evil
   against a /tmp/project workspace) and skipped canonicalisation / symlink
   resolution. Switched to WorkspaceContext.isPathWithinWorkspace, which
   already does fullyResolvedPath + segment-aware isPathWithinRoot matching
   and is the standard used elsewhere in the codebase.

Test coverage: added 6 unit tests covering non-string command guard,
non-absolute directory rejection, prefix-collision rejection, traversal
rejection, workspace acceptance, and partial-line cap behaviour
(including buffer reset). All 26 monitor.test.ts cases pass.

The same startsWith pattern also exists in ShellTool and is tracked as a
separate follow-up to keep this PR focused on Phase C scope.

* fix(core): scope monitor always-allow permissions

Populate Monitor confirmation permissionRules using the same command-rule extraction path as ShellTool, so ProceedAlways persists command-scoped Bash(...) rules instead of a broad monitor-level allow. Also add unit coverage for command-scoped rules, filtering already-allowed subcommands, and extractor fallback behavior.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): decouple monitor permission scope from Bash rules

Remove pm.isCommandAllowed() from MonitorToolInvocation.getConfirmationDetails()
to prevent existing Bash(...) allow rules from shrinking the monitor confirmation
scope. Monitor is a long-running background process with a different risk profile
than one-shot shell execution and should maintain its own permission boundary.
Only AST-based read-only filtering is retained.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): unify monitor error/exit cleanup to prevent resource leaks

Extract a shared cleanup() helper called from both the `exit` and
`error` event handlers. Previously the `error` handler did not flush
buffers, clear buffer values, remove the abort listener, or log
dropped-line stats — causing potential memory leaks when `error` fires
without a subsequent `exit` (e.g. ENOENT for missing commands).

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): add user-skills-directory guard to monitor directory validation

Mirror ShellTool's getUserSkillsDirs() check in MonitorTool's
validateToolParamValues() to prevent monitor commands from running
inside user skills directories.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(core): add Monitor(...) permission namespace for monitor tool (#3726)

Introduce a dedicated Monitor(...) permission namespace so monitor and
shell tools have independent permission boundaries. Previously monitor
emitted Bash(...) rules, causing "Always Allow" to fail for future
monitor invocations while unintentionally granting run_shell_command.

Changes:
- rule-parser.ts: add Monitor alias, SHELL_TOOL_NAMES entry,
  CANONICAL_TO_RULE_DISPLAY, DISPLAY_NAME_TO_VERB
- permission-manager.ts: extract SHELL_LIKE_TOOLS set so evaluate(),
  evaluateSingle(), hasRelevantRules(), hasMatchingAskRule() handle
  both run_shell_command and monitor
- monitor.ts: emit Monitor(...) instead of Bash(...) in permissionRules
- Tests: parseRule, matchesRule, cross-tool isolation regression,
  buildPermissionRules, buildHumanReadableRuleLabel for Monitor

Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): decouple headless monitor lifetime from final result

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): stabilize stream-json monitor session shutdown

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): deny monitor in headless approval defaults

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): honor tool aliases in headless allow checks

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address opus review — sleep regex, monitor cap, non-interactive cleanup

- Fix sleep interception false positive for backgrounded sleep (`sleep 5 &
  echo done`). Remove bare `&` from separator character class so the
  background operator is not treated as a sequential separator.
- Add MAX_CONCURRENT_MONITORS (16) check in MonitorRegistry.register()
  and early rejection in MonitorTool.execute() to prevent unbounded
  process spawning.
- Widen monitorId from 8 to 16 hex chars to reduce birthday collision risk.
- Abort all running monitors in nonInteractiveCli.ts success-path finally
  so piped stdio refs don't keep the Node event loop alive after result
  emission in one-shot (--print) mode.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): abort monitors and background shells on /clear

Without this, long-running monitors from a previous session survive
/clear and continue pushing events into the new session's notification
queue. This enables cross-session prompt injection where a malicious
monitor persists across the user's escape hatch.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): abort monitors on stream-json session shutdown

Call monitorRegistry.abortAll() in both shutdown() and
drainAndShutdown() so detached monitor child processes don't survive
session termination.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(cli): use content event type in stream tests

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): isolate session cleanup on clear and shutdown

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): finalize session cleanup after drain

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): close remaining monitor review gaps

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): preserve shell cwd in virtual permission checks

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): normalize trailing background ampersands

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): align monitor permission and wrapper handling

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(core): make monitor CI assertions cross-platform

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): align monitor wrapper normalization

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): normalize wrapped monitor commands

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): harden monitor headless edge cases

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): preserve monitor spawn errors

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): harden monitor register cleanup

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): parse monitor wrapper script token

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address PR review comments for monitor tool

- Make Bash(...) permission rules cover monitor via toolMatchesRuleToolName,
  so deny rules like Bash(rm *) also block monitor({command: "rm ..."})
- Remove dead `normalizeRuleToolName` mock reference in config.test.ts
- Fix tool description to mention stdout/stderr instead of just stdout
- Export MAX_CONCURRENT_MONITORS from monitorRegistry and use it in
  monitor.ts instead of hardcoded 16
- Rename ambiguous MAX_LINE_LENGTH constants: PARTIAL_LINE_BUFFER_CAP
  (4096, monitor.ts) and EVENT_LINE_TRUNCATE (2000, monitorRegistry.ts)
- Fix schema description text: "Max 80 characters" → "Truncated to 80
  characters in display"
- Add .unref() to SIGTERM→SIGKILL escalation timer to prevent 200ms
  exit delay

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): resolve clear command typecheck issues

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): preserve background tasks across shutdown abort

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): close monitor review gaps

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address latest monitor review comments

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): handle monitors across session switches

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(core): cover aborted monitor startup

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address remaining monitor PR review comments

Adopts four unresolved review threads on PR #3684:

* shell: trim top-level trailing comments before validating sleep
  separator so 'sleep 5 # wait' no longer bypasses
  detectBlockedSleepPattern.
* monitor: add sanitizeMonitorLine to strip C0/C1 control chars
  (except tab) and defang structural envelope tag names with U+200B
  before forwarding output to the model, blocking prompt-injection
  attempts hidden in monitored stdout/stderr.
* monitor: declare line buffers and throttledEmit before abortHandler
  to avoid TDZ on synchronous abort paths, and add
  flushPartialLineBuffers called from both abortHandler (before kill)
  and cleanup (natural exit/error) so partial-line data is no longer
  silently dropped on cancel.
* permissions: document that normalizePermissionContext relies on
  buildPermissionCheckContext to forward monitor's directory as cwd,
  and add regression tests proving relative-path Read(./...) allow
  and deny rules resolve against the monitor's explicit cwd.

* fix(core): abort running monitors in MonitorRegistry.reset()

reset() previously only cleared idle timers and emptied the map without
aborting running monitors' AbortControllers. This could orphan child
processes when reset() was called without a prior abortAll(), e.g. via
useResumeCommand → resetBackgroundStateForSessionSwitch.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fix(core): harden monitor notification XML and displayText

- Extend escapeXml to escape " and ' as defense-in-depth: safe to reuse
  the helper in any future XML attribute context without re-auditing.
- Strip C0 (except tab) and C1 control characters from the displayText
  surface before interpolation, so untrusted child-process output cannot
  leak ANSI escapes / NUL bytes into the operator's terminal even if a
  direct caller of MonitorRegistry.emitEvent skips sanitization.

Adds unit tests for both hardening paths.

* test(core): cover token-bucket throttling and commented-sleep bypass

- Add 4 unit tests for the monitor token-bucket throttle (burst=5,
  1 token/sec refill): burst cap, refill release, long-idle bucket cap,
  and whitespace lines not consuming budget. Uses vi.setSystemTime to
  exercise Date.now() without advancing pending setTimeouts.
- Add an E2E case that feeds 'sleep 5 # wait for db' through the shell
  tool to lock in trimTrailingShellComment behavior end-to-end; the
  unit-level coverage in shell.test.ts remains authoritative but the
  E2E anchor prevents a regression from silently passing unit tests.

* fix(core): address 3 remaining copilot review comments

1. shell.ts sleep interception: strip shell wrapper before detecting the
   blocked sleep pattern so `bash -c 'sleep 5'` / `sh -c ...` cannot
   route around the block. Mirrors every other sensitive check in
   shell.ts, which already normalizes through stripShellWrapper.

2. monitorRegistry.ts emitEvent auto-stop: settle the entry BEFORE
   aborting its controller so that any synchronous abort listener that
   flushes buffered output back through registry.emitEvent() (e.g. the
   Monitor tool's flushPartialLineBuffers) finds status !== 'running'
   and short-circuits instead of overshooting maxEvents and emitting a
   duplicate 'Max events reached' terminal notification.

3. monitorRegistry.ts truncateDescription: cap output at exactly
   MAX_DESCRIPTION_LENGTH by counting the ellipsis against the budget,
   instead of returning MAX_DESCRIPTION_LENGTH + 3 characters.

Each fix is covered by a new unit test.

* fix(core): address review comments — sanitize, notify, kill logging, throttle observability

- Remove double normalize in buildPermissionCheckContext (PM is single source)
- Add {notify:false} to Config.shutdown() and abortTaskRegistries() abortAll
- Swap settle-before-abort in cancel() and resetIdleTimer() to prevent races
- Add stripDisplayControlChars to emitTerminalNotification
- Sanitize monitor description at entry creation via sanitizeMonitorLine
- Surface throttle-dropped line count in terminal notification
- Add .unref() to idle timer to allow clean process exit
- Add error handler + stdio:ignore to Windows taskkill spawn
- Log SIGTERM/SIGKILL kill failures via debugLogger.warn
- Attach early child error handler to cover spawn-to-register window
- Destroy child stdio on register failure to prevent handle leaks
- Improve stripShellWrapper to handle absolute paths, combined flags, env prefix
- Improve SHELL_TOOL_NAMES documentation and toolMatchesRuleToolName clarity

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): resolve monitor tool typecheck errors

- Cast child.stdout/stderr to a minimal { destroy?: () => void } shape so
  the optional destroy() call compiles and still works with test mocks.
- Initialize droppedLines: 0 in MonitorEntry test fixtures that predate
  the field becoming required.

* fix(monitor): add missing stdio option in taskkill test assertions (#3784)

* fix(core): address monitor review feedback

* fix(core): harden monitor command lifecycle

---------

Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
mabry1985 added a commit to protoLabsAI/protoCLI that referenced this pull request May 3, 2026
Captures current state of the bg-agent subsystem in the fork (what's
already wired, what is not), maps the upstream qwen-code PRs we have
not yet ported (QwenLM#3076QwenLM#3739), and sketches three phases to close the
gap:

- Phase A: model-facing agent control + event monitor (QwenLM#3471, QwenLM#3684,
  QwenLM#3687)
- Phase B: TUI surface + /tasks command (QwenLM#3488, QwenLM#3642)
- Phase C: cross-session resume (QwenLM#3739)

Also calls out cross-cutting decisions we should make before Phase A
lands: settings layout, stop-tool naming, persistence shape, gateway
validation.

This is a planning doc, not a spec. Per-phase code-level designs come
later.

Co-authored-by: Automaker <automaker@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DragonnZhang pushed a commit that referenced this pull request May 8, 2026
… C) (#3684)

* feat(core): event monitor tool with throttled stdout streaming (Phase C)

Add a new Monitor tool that spawns a long-running shell command and streams
its stdout lines back to the agent as event notifications. This is Phase C
from the background task management roadmap (#3634, #3666).

What changes:
- New MonitorRegistry (services/monitorRegistry.ts): per-monitor entry with
  lifecycle (running/completed/failed/cancelled), idle timeout auto-stop,
  max events auto-stop, AbortController-based cancellation. Follows the
  same structural pattern as BackgroundTaskRegistry.
- New Monitor tool (tools/monitor.ts): spawns via child_process.spawn with
  independent AbortController (Ctrl+C won't kill monitors), separate
  stdout/stderr line buffers, token-bucket throttling (burst=5, sustain=1/s).
  Returns immediately with monitor ID; events stream as notifications.
- Sleep interception in shell.ts: detectBlockedSleepPattern() blocks
  foreground `sleep N` (N>=2) and guides model to use Monitor or
  is_background instead.
- Config integration: MonitorRegistry instantiation, accessor, shutdown
  cleanup (abortAll), lazy tool registration.
- CLI wiring: notification callbacks in useGeminiStream.ts (interactive)
  and nonInteractiveCli.ts (headless), including hold-back loop abort on
  exit and SIGINT cleanup.

What this PR doesn't do (gated on #3471/#3488):
- Footer pill / dialog integration
- task_stop / send_message integration

Test plan:
- 21 MonitorRegistry unit tests (lifecycle, idle timeout, max events,
  XML escaping, nonexistent ID guard, callback clearing)
- 20 Monitor tool unit tests (validation, spawn, line buffering, separate
  stdout/stderr buffers, throttling, signal-killed path, turn isolation)
- 7 detectBlockedSleepPattern unit tests
- 2 E2E tests (monitor invocation, sleep interception)
- Full core suite: 248 files / 6151 passed

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): hold-back loop waits for monitors + emit task_started for SDK

Two fixes from Codex review:

P1: The non-interactive hold-back loop now includes monitorRegistry.getRunning()
in its wait condition, so monitors can stream events before the CLI exits.
Previously monitors were aborted immediately after the agent's first reply.

P2: MonitorRegistry gains setRegisterCallback(), and nonInteractiveCli wires
it to emit task_started system messages. Stream-json/SDK consumers now see
a task_started for each monitor, matching the backgroundTaskRegistry contract.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): Windows process kill + pipeline sleep false-positive

Two fixes from Codex review:

P1: Monitor abort handler now uses `taskkill /f /t` on Windows instead
of POSIX-only `process.kill(-pid)`. Follows the existing pattern in
ShellExecutionService.childProcessFallback.

P2: detectBlockedSleepPattern no longer uses splitCommands (which splits
on `|` pipes). Replaced with a regex that only matches sleep followed by
sequential separators (&&, ||, ;, &, newline), not pipes. `sleep 5 | cat`
is now correctly allowed.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(test): resolve TS errors in monitor.test.ts mock types

Use Object.defineProperty for readonly ChildProcess.pid and proper
Readable type for stdout/stderr mocks to satisfy strict tsc builds.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): remove false notification promise + add early-abort guard

P1: Sleep interception guidance no longer promises "completion notification"
for is_background — that wiring doesn't exist yet (follow-up from #3642).

P2: Monitor.execute() now checks _signal.aborted before spawning, preventing
a race where cancellation during tool scheduling still launches a monitor.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(test): add getMonitorRegistry mock to useGeminiStream tests

The useGeminiStream hook now calls config.getMonitorRegistry() to wire
up monitor notification callbacks. The test mock config was missing this
method, causing 64 test failures with "config.getMonitorRegistry is not
a function".

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(test): add getMonitorRegistry mock to nonInteractiveCli tests

Same fix as useGeminiStream.test.tsx — the mock config needs
getMonitorRegistry to avoid "is not a function" errors (29 failures).

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address PR review — CORE_TOOLS, directory param, test fix

1. Add 'monitor' to PermissionManager.CORE_TOOLS so coreTools allowlist
   correctly gates the monitor tool (same as run_shell_command).

2. Add optional 'directory' parameter to MonitorTool with workspace
   validation, mirroring ShellTool's directory support for multi-root
   workspaces.

3. Fix sleep-interception E2E test: readToolLogs() doesn't expose
   toolResult, so the old assertion was dead code. Now verifies via
   the model's output text instead.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address MonitorTool review #4186888042

Addresses three [Critical] review comments on packages/core/src/tools/monitor.ts:

1. Partial-line buffer unbounded growth (processLines)
   MAX_LINE_LENGTH was only enforced after a newline, so a command emitting
   a long stream without newlines would grow buffer.value without bound and
   re-split the entire accumulated string on every chunk. Now, when the
   buffer has no newline and exceeds MAX_LINE_LENGTH, we force-emit a single
   truncated event through the throttled path and reset the buffer.

2. Missing type guard on params.command
   validateToolParamValues called params.command.trim() without a typeof
   check. Schema validation normally catches this, but SDK/direct callers
   could bypass it and hit an uncaught TypeError. Added typeof === 'string'
   guard, matching the pattern used for max_events / idle_timeout_ms.

3. Workspace check bypass via raw startsWith
   The directory validator used workspaceDirs.some(d => params.directory
   .startsWith(d)), which allowed prefix collisions (e.g. /tmp/project-evil
   against a /tmp/project workspace) and skipped canonicalisation / symlink
   resolution. Switched to WorkspaceContext.isPathWithinWorkspace, which
   already does fullyResolvedPath + segment-aware isPathWithinRoot matching
   and is the standard used elsewhere in the codebase.

Test coverage: added 6 unit tests covering non-string command guard,
non-absolute directory rejection, prefix-collision rejection, traversal
rejection, workspace acceptance, and partial-line cap behaviour
(including buffer reset). All 26 monitor.test.ts cases pass.

The same startsWith pattern also exists in ShellTool and is tracked as a
separate follow-up to keep this PR focused on Phase C scope.

* fix(core): scope monitor always-allow permissions

Populate Monitor confirmation permissionRules using the same command-rule extraction path as ShellTool, so ProceedAlways persists command-scoped Bash(...) rules instead of a broad monitor-level allow. Also add unit coverage for command-scoped rules, filtering already-allowed subcommands, and extractor fallback behavior.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): decouple monitor permission scope from Bash rules

Remove pm.isCommandAllowed() from MonitorToolInvocation.getConfirmationDetails()
to prevent existing Bash(...) allow rules from shrinking the monitor confirmation
scope. Monitor is a long-running background process with a different risk profile
than one-shot shell execution and should maintain its own permission boundary.
Only AST-based read-only filtering is retained.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): unify monitor error/exit cleanup to prevent resource leaks

Extract a shared cleanup() helper called from both the `exit` and
`error` event handlers. Previously the `error` handler did not flush
buffers, clear buffer values, remove the abort listener, or log
dropped-line stats — causing potential memory leaks when `error` fires
without a subsequent `exit` (e.g. ENOENT for missing commands).

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): add user-skills-directory guard to monitor directory validation

Mirror ShellTool's getUserSkillsDirs() check in MonitorTool's
validateToolParamValues() to prevent monitor commands from running
inside user skills directories.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(core): add Monitor(...) permission namespace for monitor tool (#3726)

Introduce a dedicated Monitor(...) permission namespace so monitor and
shell tools have independent permission boundaries. Previously monitor
emitted Bash(...) rules, causing "Always Allow" to fail for future
monitor invocations while unintentionally granting run_shell_command.

Changes:
- rule-parser.ts: add Monitor alias, SHELL_TOOL_NAMES entry,
  CANONICAL_TO_RULE_DISPLAY, DISPLAY_NAME_TO_VERB
- permission-manager.ts: extract SHELL_LIKE_TOOLS set so evaluate(),
  evaluateSingle(), hasRelevantRules(), hasMatchingAskRule() handle
  both run_shell_command and monitor
- monitor.ts: emit Monitor(...) instead of Bash(...) in permissionRules
- Tests: parseRule, matchesRule, cross-tool isolation regression,
  buildPermissionRules, buildHumanReadableRuleLabel for Monitor

Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): decouple headless monitor lifetime from final result

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): stabilize stream-json monitor session shutdown

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): deny monitor in headless approval defaults

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): honor tool aliases in headless allow checks

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address opus review — sleep regex, monitor cap, non-interactive cleanup

- Fix sleep interception false positive for backgrounded sleep (`sleep 5 &
  echo done`). Remove bare `&` from separator character class so the
  background operator is not treated as a sequential separator.
- Add MAX_CONCURRENT_MONITORS (16) check in MonitorRegistry.register()
  and early rejection in MonitorTool.execute() to prevent unbounded
  process spawning.
- Widen monitorId from 8 to 16 hex chars to reduce birthday collision risk.
- Abort all running monitors in nonInteractiveCli.ts success-path finally
  so piped stdio refs don't keep the Node event loop alive after result
  emission in one-shot (--print) mode.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): abort monitors and background shells on /clear

Without this, long-running monitors from a previous session survive
/clear and continue pushing events into the new session's notification
queue. This enables cross-session prompt injection where a malicious
monitor persists across the user's escape hatch.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): abort monitors on stream-json session shutdown

Call monitorRegistry.abortAll() in both shutdown() and
drainAndShutdown() so detached monitor child processes don't survive
session termination.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(cli): use content event type in stream tests

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): isolate session cleanup on clear and shutdown

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): finalize session cleanup after drain

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): close remaining monitor review gaps

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): preserve shell cwd in virtual permission checks

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): normalize trailing background ampersands

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): align monitor permission and wrapper handling

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(core): make monitor CI assertions cross-platform

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): align monitor wrapper normalization

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): normalize wrapped monitor commands

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): harden monitor headless edge cases

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): preserve monitor spawn errors

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): harden monitor register cleanup

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): parse monitor wrapper script token

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address PR review comments for monitor tool

- Make Bash(...) permission rules cover monitor via toolMatchesRuleToolName,
  so deny rules like Bash(rm *) also block monitor({command: "rm ..."})
- Remove dead `normalizeRuleToolName` mock reference in config.test.ts
- Fix tool description to mention stdout/stderr instead of just stdout
- Export MAX_CONCURRENT_MONITORS from monitorRegistry and use it in
  monitor.ts instead of hardcoded 16
- Rename ambiguous MAX_LINE_LENGTH constants: PARTIAL_LINE_BUFFER_CAP
  (4096, monitor.ts) and EVENT_LINE_TRUNCATE (2000, monitorRegistry.ts)
- Fix schema description text: "Max 80 characters" → "Truncated to 80
  characters in display"
- Add .unref() to SIGTERM→SIGKILL escalation timer to prevent 200ms
  exit delay

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): resolve clear command typecheck issues

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): preserve background tasks across shutdown abort

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): close monitor review gaps

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address latest monitor review comments

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): handle monitors across session switches

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(core): cover aborted monitor startup

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address remaining monitor PR review comments

Adopts four unresolved review threads on PR #3684:

* shell: trim top-level trailing comments before validating sleep
  separator so 'sleep 5 # wait' no longer bypasses
  detectBlockedSleepPattern.
* monitor: add sanitizeMonitorLine to strip C0/C1 control chars
  (except tab) and defang structural envelope tag names with U+200B
  before forwarding output to the model, blocking prompt-injection
  attempts hidden in monitored stdout/stderr.
* monitor: declare line buffers and throttledEmit before abortHandler
  to avoid TDZ on synchronous abort paths, and add
  flushPartialLineBuffers called from both abortHandler (before kill)
  and cleanup (natural exit/error) so partial-line data is no longer
  silently dropped on cancel.
* permissions: document that normalizePermissionContext relies on
  buildPermissionCheckContext to forward monitor's directory as cwd,
  and add regression tests proving relative-path Read(./...) allow
  and deny rules resolve against the monitor's explicit cwd.

* fix(core): abort running monitors in MonitorRegistry.reset()

reset() previously only cleared idle timers and emptied the map without
aborting running monitors' AbortControllers. This could orphan child
processes when reset() was called without a prior abortAll(), e.g. via
useResumeCommand → resetBackgroundStateForSessionSwitch.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fix(core): harden monitor notification XML and displayText

- Extend escapeXml to escape " and ' as defense-in-depth: safe to reuse
  the helper in any future XML attribute context without re-auditing.
- Strip C0 (except tab) and C1 control characters from the displayText
  surface before interpolation, so untrusted child-process output cannot
  leak ANSI escapes / NUL bytes into the operator's terminal even if a
  direct caller of MonitorRegistry.emitEvent skips sanitization.

Adds unit tests for both hardening paths.

* test(core): cover token-bucket throttling and commented-sleep bypass

- Add 4 unit tests for the monitor token-bucket throttle (burst=5,
  1 token/sec refill): burst cap, refill release, long-idle bucket cap,
  and whitespace lines not consuming budget. Uses vi.setSystemTime to
  exercise Date.now() without advancing pending setTimeouts.
- Add an E2E case that feeds 'sleep 5 # wait for db' through the shell
  tool to lock in trimTrailingShellComment behavior end-to-end; the
  unit-level coverage in shell.test.ts remains authoritative but the
  E2E anchor prevents a regression from silently passing unit tests.

* fix(core): address 3 remaining copilot review comments

1. shell.ts sleep interception: strip shell wrapper before detecting the
   blocked sleep pattern so `bash -c 'sleep 5'` / `sh -c ...` cannot
   route around the block. Mirrors every other sensitive check in
   shell.ts, which already normalizes through stripShellWrapper.

2. monitorRegistry.ts emitEvent auto-stop: settle the entry BEFORE
   aborting its controller so that any synchronous abort listener that
   flushes buffered output back through registry.emitEvent() (e.g. the
   Monitor tool's flushPartialLineBuffers) finds status !== 'running'
   and short-circuits instead of overshooting maxEvents and emitting a
   duplicate 'Max events reached' terminal notification.

3. monitorRegistry.ts truncateDescription: cap output at exactly
   MAX_DESCRIPTION_LENGTH by counting the ellipsis against the budget,
   instead of returning MAX_DESCRIPTION_LENGTH + 3 characters.

Each fix is covered by a new unit test.

* fix(core): address review comments — sanitize, notify, kill logging, throttle observability

- Remove double normalize in buildPermissionCheckContext (PM is single source)
- Add {notify:false} to Config.shutdown() and abortTaskRegistries() abortAll
- Swap settle-before-abort in cancel() and resetIdleTimer() to prevent races
- Add stripDisplayControlChars to emitTerminalNotification
- Sanitize monitor description at entry creation via sanitizeMonitorLine
- Surface throttle-dropped line count in terminal notification
- Add .unref() to idle timer to allow clean process exit
- Add error handler + stdio:ignore to Windows taskkill spawn
- Log SIGTERM/SIGKILL kill failures via debugLogger.warn
- Attach early child error handler to cover spawn-to-register window
- Destroy child stdio on register failure to prevent handle leaks
- Improve stripShellWrapper to handle absolute paths, combined flags, env prefix
- Improve SHELL_TOOL_NAMES documentation and toolMatchesRuleToolName clarity

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): resolve monitor tool typecheck errors

- Cast child.stdout/stderr to a minimal { destroy?: () => void } shape so
  the optional destroy() call compiles and still works with test mocks.
- Initialize droppedLines: 0 in MonitorEntry test fixtures that predate
  the field becoming required.

* fix(monitor): add missing stdio option in taskkill test assertions (#3784)

* fix(core): address monitor review feedback

* fix(core): harden monitor command lifecycle

---------

Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
xaelistic pushed a commit to xaelistic/qwen-code that referenced this pull request Jun 7, 2026
…agent transcript) (QwenLM#3471)

* feat(core): task_stop, send_message, and live transcripts for background agents

Add two new tools (task_stop, send_message) and a plain-text transcript
writer so the parent model can control and observe long-running background
subagents. The agent lifecycle is also tightened so every background launch
is paired with exactly one terminal task_notification — including under
cancellation races and pathological tools that swallow AbortSignal.

* feat(core): switch background-agent transcript to ChatRecord JSONL

Replaces the plain-text per-agent transcript writer with one that emits
the same ChatRecord schema as the main session log. Each background
subagent now writes to <projectDir>/subagents/<sessionId>/agent-<id>.jsonl
with a .meta.json sidecar; records carry agentId/agentName/agentColor
and isSidechain so a single parser can reconstruct the parent session
and its subagents as one tree.

A new EXTERNAL_MESSAGE event is emitted when send_message injections are
drained inside agent-core, so each follow-up message is persisted as a
user-role record and the transcript remains a complete view of the run.

read_file's auto-allow set is extended to <projectDir>/subagents/ so the
model can keep polling the transcript path advertised in the launch
response and the completion notification XML.

* feat(core): emit full background agent result in task-notification

Drop the 2000-char truncation on <result> in emitNotification. The agent
output is already a model-generated summary; truncating it strips content
the parent agent specifically asked for. The <output-file> path is still
included for anyone who wants the structured transcript.

* test(cli): add hasUnfinalizedAgents/abortAll to registry mock

The nonInteractiveCli test stub was missing two methods that the
runtime now calls when draining background agents on shutdown,
causing every runNonInteractive test to fail with TypeError.

* test(core): use path.join in agent-transcript path helper assertions

Hard-coded forward slashes in expected paths failed on Windows where
path.join produces backslashes.

* fix(core): thread nested agent identity into sidecar metadata

* feat(agent): improve background agent launch tool result

- Add internal-ID qualifier, anti-duplication clause, and large-file reading strategy to the launch tool-result template, ported from claw-code.
- Rename transcript_file to output_file for consistency.
- Reference read_file and run_shell_command via ToolNames constants instead of raw strings.

* fix(core): rename send_message target field

* fix(core): exclude task_stop and send_message from subagents

These are parent-side control-plane tools for managing background
subagents. Subagents themselves cannot launch background agents
(AGENT is already excluded), so they have no agent IDs to manage
natively, and exposing the tools only widens the surface for
cross-agent interference if an ID leaks via prompt or transcript.

* refactor(core): generalize task_stop and send_message framing to "task"

Today every BackgroundTaskRegistry entry is a subagent, but the
control-plane tools were named and described as agent-only. Generalize
so future task kinds (e.g. backgrounded shells, monitors) can share
the same registry without a model-facing rename.

- task_stop / send_message: descriptions, error messages, and ToolError
  enum values drop the "agent" framing in favor of "task".
- send_message: parameter to -> task_id, matching task_stop for a
  uniform control-plane contract.
- BackgroundTaskRegistry.hasUnfinalizedAgents -> hasUnfinalizedTasks.
- agent-transcript: add a TODO at getSubagentSessionDir flagging that
  <projectDir>/subagents/ is part of the model-facing contract via
  <output-file>; future kinds should migrate to <projectDir>/tasks/.
- Add a test for complete()-after-finalizeCancelled no-op to pin the
  one-notification-per-task SDK contract through the post-notified
  re-entry path.
xaelistic pushed a commit to xaelistic/qwen-code that referenced this pull request Jun 7, 2026
…wenLM#3488)

* feat(cli): background-task UI — pill, combined dialog, detail view

Adds the user-facing surface for background tasks on top of the
model-facing agent control primitives merged in QwenLM#3471. A dedicated
pill in the footer summarises running tasks, ↓ focuses it, and Enter
opens a combined dialog listing every task with a detail view that
shows the original prompt, live stats, and a rolling progress feed
of recent tool invocations.

Also renames BackgroundAgent* to BackgroundTask* for consistency with
the user-facing terminology and the task_* tool family.

* chore: trigger CI
xaelistic pushed a commit to xaelistic/qwen-code that referenced this pull request Jun 7, 2026
…#3642)

* feat(core): managed background shell pool with /bashes command

Replace shell.ts's `&` fork-and-detach background path with a managed
process registry. Background shells now have observable lifecycle, captured
output, and explicit cancellation — matching the pattern used by background
subagents (QwenLM#3076).

Phase B from QwenLM#3634 (background task management roadmap).

What changes
- New `BackgroundShellRegistry` (services/backgroundShellRegistry.ts):
  per-process entry with status (running / completed / failed / cancelled),
  AbortController, output file path. State transitions are one-shot
  (terminal status sticks; late callbacks no-op). Mirrors the lifecycle
  shape of QwenLM#3471's BackgroundTaskRegistry so the two can be unified later.
- `shell.ts` is_background path rewritten as `executeBackground`:
  - Spawns the unwrapped command (no '&', no pgrep envelope)
  - Streams stdout to `<projectDir>/tasks/<sessionId>/shell-<id>.output`
    (path layout aligns with the direction sketched in QwenLM#3471 review)
  - Bridges the external abort signal into the entry's AbortController so
    a single source of truth governs cancellation
  - Returns immediately with id + output path; agent's turn isn't blocked
  - Settles the registry entry asynchronously when ShellExecutionService
    resolves: complete (clean exit) / fail (error) / cancel (aborted)
- Removes ~120 lines of dead bg-specific code from shell.ts:
  pgrep wrapping, '&' appending, Windows ampersand cleanup, Windows
  early-return path, bg PID parsing, tempFile cleanup
- New `/bashes` slash command: lists registered shells with id, status,
  runtime, command, output path. Empty state prints a friendly message.

What this PR doesn't do
- Footer pill / dialog integration — gated on QwenLM#3488 landing
- task_stop / send_message integration — gated on QwenLM#3471 landing
- Auto-backgrounding heuristics for long foreground bash — Phase D

Test plan
- 11 registry unit tests (state machine + idempotent terminal transitions)
- 4 background-path tests in shell.test.ts (spawn no-wrap + complete /
  fail / cancel settle paths)
- 2 /bashes command tests (empty + populated)
- Full core suite: 247 files / 6075 passed (existing tests unaffected)

* fix(core): address PR QwenLM#3642 review feedback

Three [Critical] from the auto review + naming alignment with Claude Code:

- shell.ts settle: non-zero exit code or termination signal now bucket into
  `failed` instead of `completed`. The previous `if (result.error) fail else
  complete()` would misreport `false` / failed `npm test` as success because
  ShellExecutionService surfaces ordinary command failures as a non-zero
  exitCode with `error: null`. Failure reason carries the exit code or signal
  so `/tasks` shows the real cause.

- ShellExecutionService.childProcessFallback: add `streamStdout` mode that
  emits each decoded chunk through the existing onOutputEvent path. The
  default (foreground) path continues to buffer + emit the cleaned final
  blob, so existing in-line shell calls are unaffected. executeBackground
  opts in via `{ streamStdout: true }`, which is what makes the captured
  output file actually useful for long-running processes (dev servers,
  watchers) — without it the file stayed empty until the process exited.

- shell.ts test fixture: cancel-settle test was using `signal: 'SIGTERM'`
  but `ShellExecutionResult.signal` is `number | null`. TS2322 broke the
  build; switched to `signal: null`. Added a test that explicitly covers
  the new "non-zero exit → failed" path so the bucketing change has
  regression coverage.

- shell.ts comment: explicitly document why background shells force
  `shouldUseNodePty=false` (no terminal, no human; node-pty would be dead
  weight for fire-and-forget commands).

- /bashes → /tasks (alias bashes), description "List and manage background
  tasks" — matches Claude Code's command name. Currently lists shells only;
  will surface other task kinds (subagents, monitor) as those registries
  land via QwenLM#3471 / QwenLM#3488.

* fix(core): address PR QwenLM#3642 second-round review feedback

- shellExecutionService streaming: drop stdout/stderr buffer + outputChunks
  accumulation in streaming mode. Each decoded chunk goes straight to
  onOutputEvent and is GC-eligible immediately. Long-running background
  commands (dev servers, watchers) no longer accumulate unbounded memory
  proportional to total output. Buffered (foreground) mode is unchanged.

- shell.ts executeBackground: stripAnsi each chunk before writing to the
  output file. Dev servers / build tools spam color codes and cursor-move
  sequences that would render as garbage in the file the agent reads.

- bashesCommand: command description "List and manage" → "List background
  tasks" — current implementation only supports listing, cancellation
  follows when the unified task_stop tool from QwenLM#3471 is wired in. Replace
  the hand-rolled formatRuntime helper with the shared formatDuration
  utility (uses hideTrailingZeros for parity with the previous output).

- backgroundShellRegistry: add a comment documenting the lack of an
  eviction policy as a known limitation. LRU / age-based / capped-size
  eviction (and on-disk output rotation) is left as a follow-up alongside
  the broader output-file lifecycle story.

* fix(core): address PR QwenLM#3642 third-round review feedback

- shell.ts executeBackground: add 'error' listener on the output write
  stream. fs.createWriteStream surfaces write failures (disk full,
  permission, fs going away) as 'error' events; without a listener Node
  treats it as an uncaught exception and kills the entire CLI session.
  Log + drop is the sane default — the registry still settles via
  resultPromise so /tasks shows the right terminal status.

- shell.ts executeBackground: store the abort handler reference and
  removeEventListener in the settle callback. Background shells outlive
  the turn signal; the dangling listener was keeping `entryAc` (and
  transitively `outputStream`) reachable until the turn signal itself was
  GC'd, which for long sessions would never happen.

- shell.test.ts: extend the createWriteStream mock with an `on` stub so
  the new error-listener wiring doesn't crash the test suite.

* refactor(cli): drop /bashes alias and rename file to tasksCommand

Per follow-up review: the slash command should be exclusively /tasks.
Removes the `bashes` altName, renames `bashesCommand{,.test}.ts` →
`tasksCommand{,.test}.ts`, renames the exported binding `bashesCommand`
→ `tasksCommand`, and cleans up the remaining `/bashes` references in
backgroundShellRegistry.ts comments. No behavior change beyond the
alias removal.

* refactor(cli): finish tasksCommand rename — apply content changes

The previous commit (7b8b73b75) only captured the file rename via
`git mv`; the export name change (`bashesCommand` → `tasksCommand`),
the removal of `altNames: ['bashes']`, the import update in
BuiltinCommandLoader, and the `/bashes` → `/tasks` comments in
backgroundShellRegistry.ts were unstaged when that commit landed.
Squash candidate before merge.

* fix(core): address PR QwenLM#3642 fourth-round review feedback

Four reviewer concerns from @wenshao + @doudouOUC:

- [Critical] Config.shutdown() now also calls
  `backgroundShellRegistry.abortAll()`. Previously only the subagent
  registry was aborted, so a managed background shell could outlive the
  CLI process and orphan its child. Symmetric with how
  `BackgroundTaskRegistry.abortAll()` is wired in.

- [P1] shell.ts executeBackground strips a trailing `&` from the command
  before spawn. The managed path is itself the backgrounding mechanism;
  forwarding `node server.js &` verbatim made bash exit immediately while
  the real child outlived the wrapper, causing the registry to settle as
  `completed` while the shell was still running and chunked output to
  land on a closed stream. Strip + warn.

- [P2] Output file moves under `storage.getProjectTempDir()` (specifically
  `<projectTempDir>/background-shells/<sessionId>/shell-<id>.output`).
  `ReadFileTool` already auto-allows the project temp dir, so the LLM
  can `Read` the captured output without bouncing off a permission
  prompt — important because background-agent contexts can't surface
  interactive prompts.

- [P2] Background shells are no longer killed when the current turn's
  AbortSignal fires. Forwarding the turn signal into the entry's
  AbortController meant a Ctrl+C on the turn would also terminate
  intentionally backgrounded dev servers / watchers, contradicting the
  independent-lifecycle promise. Cancellation now flows only through
  `entryAc` (driven by future `task_stop` integration via QwenLM#3471).

Tests:
- New `abortAll` registry tests cover running / mixed / empty cases.
- `runs background commands as managed pool entries` test stops asserting
  the wrapper-vs-entry signal identity since they're now structurally
  separate (no turn-to-entry forwarding).
- New `does not forward the turn signal into the background shell` test
  pins the new behavior.
- New `strips trailing & from the spawned command` test pins the strip.
- Removed the cancel-via-outer-signal settle test — that path no longer
  exists; cancellation is exercised end-to-end via the registry's own
  `cancel` and `abortAll` tests in `backgroundShellRegistry.test.ts`.

* fix(core): tighten trailing & strip — narrow regex + ReDoS-safe

Two reviewer concerns on the same line of QwenLM#3642 round 4:

- [Critical CodeQL] `\s*&+\s*$` is a polynomial-time regex on
  uncontrolled input (long all-`&` strings backtrack quadratically).
- [P2 doudouOUC] `&+` is too greedy: it also rewrites `npm run dev &&`
  into `npm run dev` (breaks logical AND syntax) and `echo foo \&` into
  `echo foo \` (eats the escaped literal). Only the bare bash background
  operator should be stripped.

Replace the regex with a small linear-time helper
`stripTrailingBackgroundAmp` that explicitly checks for the three
"don't touch" cases (`&&`, `\&`, no trailing `&`). Plain `endsWith` /
`slice` — no regex backtracking, and the intent reads off the page.

Tests:
- Existing strip-trailing-`&` test still passes.
- New `does not strip a trailing &&` test pins the logical-AND case.
- New `does not strip an escaped trailing \\&` test pins the escape case.

* fix(core): keep binary-detection sniff in streaming mode

@doudouOUC noted that `streamStdout` shortcut returned before the
binary-sniff path, so a background command emitting binary bytes
(`cat /bin/ls`, image dump, etc.) would be text-decoded and appended
to the task output file unbounded.

Restructure handleOutput so the sniff-and-cutover logic runs in both
modes:

- Both modes accumulate up to MAX_SNIFF_SIZE for the binary check.
  The accumulator is bounded; once the threshold is reached, it stops
  growing in streaming mode (dropped on binary detection / left
  inert on text confirmation) and continues to accumulate in buffered
  mode (existing foreground behavior).
- Streaming mode emits 'binary_detected' as soon as `isBinary` trips
  so the consumer can stop writing the output file. Up to ~4KB of
  bytes may have been emitted as text chunks before detection — this
  is bounded and acceptable; the unbounded write is the pathology
  reviewers flagged.
- Streaming text mode still emits each decoded chunk immediately and
  does not accumulate stdout/stderr strings, so long-running text
  streams remain GC-friendly.
- Buffered (foreground) behavior is unchanged — the sniff accumulator
  is the same path the existing tests cover.

Tests: 50 shellExecutionService + 11 backgroundShellRegistry + 57
shell.test.ts all pass; no regressions.

* fix(core): tighten streaming sniff bound + Windows rmSync flake

Two unrelated reds on the latest CI run:

1. [P1 doudouOUC] Streaming sniff buffer leaks on small chunks.
   The previous fix recomputed `sniffedBytes` from
   `Buffer.concat(outputChunks.slice(0, 20)).length` on every chunk —
   pinned to the first 20 chunks. If those total under MAX_SNIFF_SIZE
   (line-sized stdout, e.g. dev-server logs) the byte count never grew,
   the sniff branch stayed open forever, and `outputChunks` accumulated
   every later chunk — exactly the leak `streamStdout` was meant to
   prevent.

   Track sniffed bytes by running sum (`sniffedBytes += data.length`)
   so the bound is genuine. When sniff confirms text in streaming mode,
   drop the accumulator immediately so subsequent chunks fall through
   the streaming emit path without ever touching it.

2. file-exporters.test.ts afterEach `fs.rmSync` flaked on Windows
   (ENOTEMPTY: directory not empty). The exporter's underlying write
   stream hasn't always released its handle by the time `rmSync` runs.
   Pass `maxRetries: 5, retryDelay: 50` so the cleanup retries through
   the brief Windows handle-release window instead of failing the test
   on a CI quirk.

---------

Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>
xaelistic pushed a commit to xaelistic/qwen-code that referenced this pull request Jun 7, 2026
* feat(core): wire background shells into the task_stop tool

Phase B follow-up QwenLM#1 from QwenLM#3634, unblocked by QwenLM#3471 (control plane) merging
in. The model can now cancel a managed background shell with the same
`task_stop` tool it uses for subagents — no more falling back to
`kill <pid>` via BashTool.

Lookup order: subagent registry first (existing behavior), then the
background shell registry as a fallback. Agent IDs follow
`<subagentName>-<suffix>` and shell IDs follow `bg_<8 hex chars>`, so the
two namespaces cannot collide in practice; the order is fixed for
determinism (a defensive test pins agent-wins-over-shell).

The shell cancel path resolves through the entry's own AbortController
(which `BackgroundShellRegistry.cancel` triggers); the child process
exit handler then settles the registry to `cancelled` and the on-disk
output file is preserved for inspection via `/tasks` or a direct `Read`.
This matches Phase B's "registry's own AbortController is the
cancellation source of truth" design without needing the in-flight
notification framework that subagents use.

Tests: 7 task-stop tests (was 4) — added cancel-shell happy path,
NOT_RUNNING for already-exited shell, and a defensive
agent-takes-precedence-on-id-collision case.

* fix(core): defer shell terminal transition until spawn handler settles

@doudouOUC noticed that the previous task_stop path called
`BackgroundShellRegistry.cancel(id, Date.now())`, which marked the entry
`cancelled` immediately. The spawn handler's settle path only records
real exit info via cancel/complete/fail when the entry is still
`running`, so the cancel-vs-exit race could permanently hide a real
completed/failed result and `/tasks` would show a terminal endTime
while the process was still draining.

Add a `requestCancel(id)` method to `BackgroundShellRegistry` that
triggers the entry's AbortController only; status stays `running` until
the settle path observes the abort and records the real terminal state.
The immediate-mark `cancel(id, endTime)` is reserved for `abortAll()` /
shutdown, where the CLI process is tearing down anyway and there is no
settle handler to wait for.

Tests updated:
- `task-stop.test.ts` cancel-shell happy path now asserts the entry
  stays `running` with `endTime` undefined post-stop, and the abort
  signal fires (the settle path's contract, not task_stop's, is the
  one that flips status).
- 3 new `requestCancel` tests in `backgroundShellRegistry.test.ts`:
  running → abort+still-running, terminal entry no-op, unknown id no-op.

---------

Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>
xaelistic pushed a commit to xaelistic/qwen-code that referenced this pull request Jun 7, 2026
… C) (QwenLM#3684)

* feat(core): event monitor tool with throttled stdout streaming (Phase C)

Add a new Monitor tool that spawns a long-running shell command and streams
its stdout lines back to the agent as event notifications. This is Phase C
from the background task management roadmap (QwenLM#3634, QwenLM#3666).

What changes:
- New MonitorRegistry (services/monitorRegistry.ts): per-monitor entry with
  lifecycle (running/completed/failed/cancelled), idle timeout auto-stop,
  max events auto-stop, AbortController-based cancellation. Follows the
  same structural pattern as BackgroundTaskRegistry.
- New Monitor tool (tools/monitor.ts): spawns via child_process.spawn with
  independent AbortController (Ctrl+C won't kill monitors), separate
  stdout/stderr line buffers, token-bucket throttling (burst=5, sustain=1/s).
  Returns immediately with monitor ID; events stream as notifications.
- Sleep interception in shell.ts: detectBlockedSleepPattern() blocks
  foreground `sleep N` (N>=2) and guides model to use Monitor or
  is_background instead.
- Config integration: MonitorRegistry instantiation, accessor, shutdown
  cleanup (abortAll), lazy tool registration.
- CLI wiring: notification callbacks in useGeminiStream.ts (interactive)
  and nonInteractiveCli.ts (headless), including hold-back loop abort on
  exit and SIGINT cleanup.

What this PR doesn't do (gated on QwenLM#3471/QwenLM#3488):
- Footer pill / dialog integration
- task_stop / send_message integration

Test plan:
- 21 MonitorRegistry unit tests (lifecycle, idle timeout, max events,
  XML escaping, nonexistent ID guard, callback clearing)
- 20 Monitor tool unit tests (validation, spawn, line buffering, separate
  stdout/stderr buffers, throttling, signal-killed path, turn isolation)
- 7 detectBlockedSleepPattern unit tests
- 2 E2E tests (monitor invocation, sleep interception)
- Full core suite: 248 files / 6151 passed

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): hold-back loop waits for monitors + emit task_started for SDK

Two fixes from Codex review:

P1: The non-interactive hold-back loop now includes monitorRegistry.getRunning()
in its wait condition, so monitors can stream events before the CLI exits.
Previously monitors were aborted immediately after the agent's first reply.

P2: MonitorRegistry gains setRegisterCallback(), and nonInteractiveCli wires
it to emit task_started system messages. Stream-json/SDK consumers now see
a task_started for each monitor, matching the backgroundTaskRegistry contract.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): Windows process kill + pipeline sleep false-positive

Two fixes from Codex review:

P1: Monitor abort handler now uses `taskkill /f /t` on Windows instead
of POSIX-only `process.kill(-pid)`. Follows the existing pattern in
ShellExecutionService.childProcessFallback.

P2: detectBlockedSleepPattern no longer uses splitCommands (which splits
on `|` pipes). Replaced with a regex that only matches sleep followed by
sequential separators (&&, ||, ;, &, newline), not pipes. `sleep 5 | cat`
is now correctly allowed.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(test): resolve TS errors in monitor.test.ts mock types

Use Object.defineProperty for readonly ChildProcess.pid and proper
Readable type for stdout/stderr mocks to satisfy strict tsc builds.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): remove false notification promise + add early-abort guard

P1: Sleep interception guidance no longer promises "completion notification"
for is_background — that wiring doesn't exist yet (follow-up from QwenLM#3642).

P2: Monitor.execute() now checks _signal.aborted before spawning, preventing
a race where cancellation during tool scheduling still launches a monitor.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(test): add getMonitorRegistry mock to useGeminiStream tests

The useGeminiStream hook now calls config.getMonitorRegistry() to wire
up monitor notification callbacks. The test mock config was missing this
method, causing 64 test failures with "config.getMonitorRegistry is not
a function".

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(test): add getMonitorRegistry mock to nonInteractiveCli tests

Same fix as useGeminiStream.test.tsx — the mock config needs
getMonitorRegistry to avoid "is not a function" errors (29 failures).

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address PR review — CORE_TOOLS, directory param, test fix

1. Add 'monitor' to PermissionManager.CORE_TOOLS so coreTools allowlist
   correctly gates the monitor tool (same as run_shell_command).

2. Add optional 'directory' parameter to MonitorTool with workspace
   validation, mirroring ShellTool's directory support for multi-root
   workspaces.

3. Fix sleep-interception E2E test: readToolLogs() doesn't expose
   toolResult, so the old assertion was dead code. Now verifies via
   the model's output text instead.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address MonitorTool review #4186888042

Addresses three [Critical] review comments on packages/core/src/tools/monitor.ts:

1. Partial-line buffer unbounded growth (processLines)
   MAX_LINE_LENGTH was only enforced after a newline, so a command emitting
   a long stream without newlines would grow buffer.value without bound and
   re-split the entire accumulated string on every chunk. Now, when the
   buffer has no newline and exceeds MAX_LINE_LENGTH, we force-emit a single
   truncated event through the throttled path and reset the buffer.

2. Missing type guard on params.command
   validateToolParamValues called params.command.trim() without a typeof
   check. Schema validation normally catches this, but SDK/direct callers
   could bypass it and hit an uncaught TypeError. Added typeof === 'string'
   guard, matching the pattern used for max_events / idle_timeout_ms.

3. Workspace check bypass via raw startsWith
   The directory validator used workspaceDirs.some(d => params.directory
   .startsWith(d)), which allowed prefix collisions (e.g. /tmp/project-evil
   against a /tmp/project workspace) and skipped canonicalisation / symlink
   resolution. Switched to WorkspaceContext.isPathWithinWorkspace, which
   already does fullyResolvedPath + segment-aware isPathWithinRoot matching
   and is the standard used elsewhere in the codebase.

Test coverage: added 6 unit tests covering non-string command guard,
non-absolute directory rejection, prefix-collision rejection, traversal
rejection, workspace acceptance, and partial-line cap behaviour
(including buffer reset). All 26 monitor.test.ts cases pass.

The same startsWith pattern also exists in ShellTool and is tracked as a
separate follow-up to keep this PR focused on Phase C scope.

* fix(core): scope monitor always-allow permissions

Populate Monitor confirmation permissionRules using the same command-rule extraction path as ShellTool, so ProceedAlways persists command-scoped Bash(...) rules instead of a broad monitor-level allow. Also add unit coverage for command-scoped rules, filtering already-allowed subcommands, and extractor fallback behavior.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): decouple monitor permission scope from Bash rules

Remove pm.isCommandAllowed() from MonitorToolInvocation.getConfirmationDetails()
to prevent existing Bash(...) allow rules from shrinking the monitor confirmation
scope. Monitor is a long-running background process with a different risk profile
than one-shot shell execution and should maintain its own permission boundary.
Only AST-based read-only filtering is retained.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): unify monitor error/exit cleanup to prevent resource leaks

Extract a shared cleanup() helper called from both the `exit` and
`error` event handlers. Previously the `error` handler did not flush
buffers, clear buffer values, remove the abort listener, or log
dropped-line stats — causing potential memory leaks when `error` fires
without a subsequent `exit` (e.g. ENOENT for missing commands).

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): add user-skills-directory guard to monitor directory validation

Mirror ShellTool's getUserSkillsDirs() check in MonitorTool's
validateToolParamValues() to prevent monitor commands from running
inside user skills directories.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(core): add Monitor(...) permission namespace for monitor tool (QwenLM#3726)

Introduce a dedicated Monitor(...) permission namespace so monitor and
shell tools have independent permission boundaries. Previously monitor
emitted Bash(...) rules, causing "Always Allow" to fail for future
monitor invocations while unintentionally granting run_shell_command.

Changes:
- rule-parser.ts: add Monitor alias, SHELL_TOOL_NAMES entry,
  CANONICAL_TO_RULE_DISPLAY, DISPLAY_NAME_TO_VERB
- permission-manager.ts: extract SHELL_LIKE_TOOLS set so evaluate(),
  evaluateSingle(), hasRelevantRules(), hasMatchingAskRule() handle
  both run_shell_command and monitor
- monitor.ts: emit Monitor(...) instead of Bash(...) in permissionRules
- Tests: parseRule, matchesRule, cross-tool isolation regression,
  buildPermissionRules, buildHumanReadableRuleLabel for Monitor

Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): decouple headless monitor lifetime from final result

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): stabilize stream-json monitor session shutdown

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): deny monitor in headless approval defaults

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): honor tool aliases in headless allow checks

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address opus review — sleep regex, monitor cap, non-interactive cleanup

- Fix sleep interception false positive for backgrounded sleep (`sleep 5 &
  echo done`). Remove bare `&` from separator character class so the
  background operator is not treated as a sequential separator.
- Add MAX_CONCURRENT_MONITORS (16) check in MonitorRegistry.register()
  and early rejection in MonitorTool.execute() to prevent unbounded
  process spawning.
- Widen monitorId from 8 to 16 hex chars to reduce birthday collision risk.
- Abort all running monitors in nonInteractiveCli.ts success-path finally
  so piped stdio refs don't keep the Node event loop alive after result
  emission in one-shot (--print) mode.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): abort monitors and background shells on /clear

Without this, long-running monitors from a previous session survive
/clear and continue pushing events into the new session's notification
queue. This enables cross-session prompt injection where a malicious
monitor persists across the user's escape hatch.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): abort monitors on stream-json session shutdown

Call monitorRegistry.abortAll() in both shutdown() and
drainAndShutdown() so detached monitor child processes don't survive
session termination.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(cli): use content event type in stream tests

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): isolate session cleanup on clear and shutdown

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): finalize session cleanup after drain

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): close remaining monitor review gaps

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): preserve shell cwd in virtual permission checks

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): normalize trailing background ampersands

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): align monitor permission and wrapper handling

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(core): make monitor CI assertions cross-platform

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): align monitor wrapper normalization

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): normalize wrapped monitor commands

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): harden monitor headless edge cases

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): preserve monitor spawn errors

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): harden monitor register cleanup

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): parse monitor wrapper script token

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address PR review comments for monitor tool

- Make Bash(...) permission rules cover monitor via toolMatchesRuleToolName,
  so deny rules like Bash(rm *) also block monitor({command: "rm ..."})
- Remove dead `normalizeRuleToolName` mock reference in config.test.ts
- Fix tool description to mention stdout/stderr instead of just stdout
- Export MAX_CONCURRENT_MONITORS from monitorRegistry and use it in
  monitor.ts instead of hardcoded 16
- Rename ambiguous MAX_LINE_LENGTH constants: PARTIAL_LINE_BUFFER_CAP
  (4096, monitor.ts) and EVENT_LINE_TRUNCATE (2000, monitorRegistry.ts)
- Fix schema description text: "Max 80 characters" → "Truncated to 80
  characters in display"
- Add .unref() to SIGTERM→SIGKILL escalation timer to prevent 200ms
  exit delay

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): resolve clear command typecheck issues

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): preserve background tasks across shutdown abort

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): close monitor review gaps

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address latest monitor review comments

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): handle monitors across session switches

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(core): cover aborted monitor startup

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): address remaining monitor PR review comments

Adopts four unresolved review threads on PR QwenLM#3684:

* shell: trim top-level trailing comments before validating sleep
  separator so 'sleep 5 # wait' no longer bypasses
  detectBlockedSleepPattern.
* monitor: add sanitizeMonitorLine to strip C0/C1 control chars
  (except tab) and defang structural envelope tag names with U+200B
  before forwarding output to the model, blocking prompt-injection
  attempts hidden in monitored stdout/stderr.
* monitor: declare line buffers and throttledEmit before abortHandler
  to avoid TDZ on synchronous abort paths, and add
  flushPartialLineBuffers called from both abortHandler (before kill)
  and cleanup (natural exit/error) so partial-line data is no longer
  silently dropped on cancel.
* permissions: document that normalizePermissionContext relies on
  buildPermissionCheckContext to forward monitor's directory as cwd,
  and add regression tests proving relative-path Read(./...) allow
  and deny rules resolve against the monitor's explicit cwd.

* fix(core): abort running monitors in MonitorRegistry.reset()

reset() previously only cleared idle timers and emptied the map without
aborting running monitors' AbortControllers. This could orphan child
processes when reset() was called without a prior abortAll(), e.g. via
useResumeCommand → resetBackgroundStateForSessionSwitch.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fix(core): harden monitor notification XML and displayText

- Extend escapeXml to escape " and ' as defense-in-depth: safe to reuse
  the helper in any future XML attribute context without re-auditing.
- Strip C0 (except tab) and C1 control characters from the displayText
  surface before interpolation, so untrusted child-process output cannot
  leak ANSI escapes / NUL bytes into the operator's terminal even if a
  direct caller of MonitorRegistry.emitEvent skips sanitization.

Adds unit tests for both hardening paths.

* test(core): cover token-bucket throttling and commented-sleep bypass

- Add 4 unit tests for the monitor token-bucket throttle (burst=5,
  1 token/sec refill): burst cap, refill release, long-idle bucket cap,
  and whitespace lines not consuming budget. Uses vi.setSystemTime to
  exercise Date.now() without advancing pending setTimeouts.
- Add an E2E case that feeds 'sleep 5 # wait for db' through the shell
  tool to lock in trimTrailingShellComment behavior end-to-end; the
  unit-level coverage in shell.test.ts remains authoritative but the
  E2E anchor prevents a regression from silently passing unit tests.

* fix(core): address 3 remaining copilot review comments

1. shell.ts sleep interception: strip shell wrapper before detecting the
   blocked sleep pattern so `bash -c 'sleep 5'` / `sh -c ...` cannot
   route around the block. Mirrors every other sensitive check in
   shell.ts, which already normalizes through stripShellWrapper.

2. monitorRegistry.ts emitEvent auto-stop: settle the entry BEFORE
   aborting its controller so that any synchronous abort listener that
   flushes buffered output back through registry.emitEvent() (e.g. the
   Monitor tool's flushPartialLineBuffers) finds status !== 'running'
   and short-circuits instead of overshooting maxEvents and emitting a
   duplicate 'Max events reached' terminal notification.

3. monitorRegistry.ts truncateDescription: cap output at exactly
   MAX_DESCRIPTION_LENGTH by counting the ellipsis against the budget,
   instead of returning MAX_DESCRIPTION_LENGTH + 3 characters.

Each fix is covered by a new unit test.

* fix(core): address review comments — sanitize, notify, kill logging, throttle observability

- Remove double normalize in buildPermissionCheckContext (PM is single source)
- Add {notify:false} to Config.shutdown() and abortTaskRegistries() abortAll
- Swap settle-before-abort in cancel() and resetIdleTimer() to prevent races
- Add stripDisplayControlChars to emitTerminalNotification
- Sanitize monitor description at entry creation via sanitizeMonitorLine
- Surface throttle-dropped line count in terminal notification
- Add .unref() to idle timer to allow clean process exit
- Add error handler + stdio:ignore to Windows taskkill spawn
- Log SIGTERM/SIGKILL kill failures via debugLogger.warn
- Attach early child error handler to cover spawn-to-register window
- Destroy child stdio on register failure to prevent handle leaks
- Improve stripShellWrapper to handle absolute paths, combined flags, env prefix
- Improve SHELL_TOOL_NAMES documentation and toolMatchesRuleToolName clarity

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): resolve monitor tool typecheck errors

- Cast child.stdout/stderr to a minimal { destroy?: () => void } shape so
  the optional destroy() call compiles and still works with test mocks.
- Initialize droppedLines: 0 in MonitorEntry test fixtures that predate
  the field becoming required.

* fix(monitor): add missing stdio option in taskkill test assertions (QwenLM#3784)

* fix(core): address monitor review feedback

* fix(core): harden monitor command lifecycle

---------

Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
@tanzhenxin tanzhenxin deleted the feat/background-agent-control branch June 13, 2026 13:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bring subagent system to feature parity with Claude Code

3 participants