Skip to content

feat(skills): add agent reproduction workflows#4118

Merged
DragonnZhang merged 14 commits into
mainfrom
dragon/feat-reproduce-skill
Jun 2, 2026
Merged

feat(skills): add agent reproduction workflows#4118
DragonnZhang merged 14 commits into
mainfrom
dragon/feat-reproduce-skill

Conversation

@DragonnZhang

@DragonnZhang DragonnZhang commented May 13, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • What changed: Added two local agent workflows for reproducing reference-agent behavior and then aligning Qwen Code against the captured baseline.
  • Why it changed: Agent parity work needs a repeatable path for request capture, terminal capture, trace normalization, comparison, and local artifact handling.
  • Reviewer focus: Check that the workflows stay scoped to local reproduction, generated traces remain out of commits, and the helper scripts are safe to run in a developer environment.

Validation

  • Commands run:
    python3 -m py_compile .qwen/skills/agent-reproduce-feature/scripts/llm_dump.py .qwen/skills/agent-reproduce-align/scripts/compare_traces.py .qwen/skills/agent-reproduce-align/scripts/normalize_trace.py
    bash -n .qwen/skills/agent-reproduce-feature/scripts/run_with_mitm.sh .qwen/skills/agent-reproduce-feature/scripts/run_tmux_capture.sh .qwen/skills/agent-reproduce-align/scripts/run_pair_capture.sh
  • Prompts / inputs used: N/A, documentation and local helper script addition only.
  • Expected result: Python helpers compile and shell helpers parse successfully.
  • Observed result: Both commands completed successfully with no output.
  • Quickest reviewer verification path: Run the two commands above from the repository root, then skim the two reproduction workflows.
  • Evidence: Local trace output is ignored so captured request bodies and terminal artifacts stay out of commits by default.

Scope / Risk

  • Main risk or tradeoff: This PR adds workflow guidance and developer-only helper scripts; it does not change product runtime behavior.
  • Not covered / not validated: Full proxy and tmux capture were not run because they depend on local reference-agent credentials and environment setup.
  • Breaking changes / migration notes: None.

Testing Matrix

🍏 🪟 🐧
npm run N/A N/A N/A
npx N/A N/A N/A
Docker N/A N/A N/A
Podman N/A N/A N/A
Seatbelt N/A N/A N/A

Testing matrix notes:

  • This PR only adds local agent workflow guidance and helper scripts. Runtime product test matrices are not applicable.

Linked Issues / Bugs

  • Related to agent parity and reproduction workflows.

@DragonnZhang DragonnZhang force-pushed the dragon/feat-reproduce-skill branch from 3ebded8 to 184a416 Compare May 13, 2026 12:25
@github-actions

github-actions Bot commented May 13, 2026

Copy link
Copy Markdown
Contributor

Code Coverage Summary

Package Lines Statements Functions Branches
CLI 77.36% 77.36% 80.22% 79.9%
Core 80.32% 80.32% 82.57% 83.02%
CLI Package - Full Text Report
-------------------|---------|----------|---------|---------|-------------------
File               | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s 
-------------------|---------|----------|---------|---------|-------------------
All files          |   77.36 |     79.9 |   80.22 |   77.36 |                   
 src               |   75.52 |    68.65 |   78.94 |   75.52 |                   
  gemini.tsx       |   69.03 |    66.15 |   77.77 |   69.03 | ...43,960-963,975 
  ...ractiveCli.ts |   78.73 |    66.89 |   73.33 |   78.73 | ...1284-1285,1321 
  ...liCommands.ts |    74.9 |     75.6 |     100 |    74.9 | ...41-265,290,391 
  ...ActiveAuth.ts |     100 |     87.5 |     100 |     100 | 66-80             
 ...cp-integration |   61.97 |    65.24 |   78.12 |   61.97 |                   
  acpAgent.ts      |   63.32 |    65.35 |   83.05 |   63.32 | ...2112,2126-2134 
  authMethods.ts   |   12.19 |      100 |       0 |   12.19 | 11-31,34-38,41-50 
  errorCodes.ts    |       0 |        0 |       0 |       0 | 1-22              
  ...DirContext.ts |     100 |      100 |     100 |     100 |                   
 ...ration/service |   68.65 |    83.33 |   66.66 |   68.65 |                   
  filesystem.ts    |   68.65 |    83.33 |   66.66 |   68.65 | ...32,77-94,97-98 
 ...ration/session |   75.88 |    72.05 |   86.25 |   75.88 |                   
  ...ryReplayer.ts |   67.34 |     75.6 |   81.81 |   67.34 | ...54-269,282-283 
  Session.ts       |   74.93 |    70.81 |   88.46 |   74.93 | ...2658,2664-2667 
  ...entTracker.ts |   90.85 |    84.84 |      90 |   90.85 | ...35,199,251-260 
  index.ts         |       0 |        0 |       0 |       0 | 1-40              
  ...ssionUtils.ts |   84.21 |    77.77 |     100 |   84.21 | ...37-153,209-211 
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 ...ssion/emitters |   96.01 |    90.75 |    92.3 |   96.01 |                   
  BaseEmitter.ts   |   76.92 |    66.66 |      80 |   76.92 | 23-24,39-40,55-56 
  ...ageEmitter.ts |     100 |    89.47 |     100 |     100 | 109,111           
  PlanEmitter.ts   |     100 |      100 |     100 |     100 |                   
  ...allEmitter.ts |   98.06 |     92.3 |     100 |   98.06 | 227-228,327,335   
  index.ts         |       0 |        0 |       0 |       0 | 1-10              
 ...ession/rewrite |   90.36 |    87.83 |   94.11 |   90.36 |                   
  LlmRewriter.ts   |      81 |       84 |     100 |      81 | ...,88-89,155-159 
  ...Middleware.ts |   95.83 |    85.71 |     100 |   95.83 | 119,127-129       
  TurnBuffer.ts    |     100 |      100 |     100 |     100 |                   
  config.ts        |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 src/commands      |   45.67 |    85.71 |   43.47 |   45.67 |                   
  auth.ts          |     100 |    83.33 |     100 |     100 | 11,14             
  channel.ts       |   56.66 |      100 |       0 |   56.66 | 15-19,27-34       
  extensions.tsx   |   96.55 |      100 |      50 |   96.55 | 37                
  hooks.tsx        |   66.66 |      100 |       0 |   66.66 | 20-24             
  mcp.ts           |   94.73 |      100 |      50 |   94.73 | 28                
  review.ts        |   51.85 |      100 |       0 |   51.85 | 24-35,38          
  serve.ts         |    8.02 |      100 |       0 |    8.02 | ...56-152,154-266 
 ...mmands/channel |   39.25 |    79.45 |      50 |   39.25 |                   
  ...l-registry.ts |    8.57 |      100 |       0 |    8.57 | 6-21,24-42        
  config-utils.ts  |      92 |      100 |   66.66 |      92 | 21-26             
  configure.ts     |    14.7 |      100 |       0 |    14.7 | 18-21,23-84       
  pairing.ts       |   26.31 |      100 |       0 |   26.31 | ...30,40-50,52-65 
  pidfile.ts       |   96.34 |    86.95 |     100 |   96.34 | 49,59,91          
  start.ts         |   30.98 |       52 |   69.23 |   30.98 | ...72-475,484-486 
  status.ts        |   17.85 |      100 |       0 |   17.85 | 15-26,32-76       
  stop.ts          |      20 |      100 |       0 |      20 | 14-48             
 ...nds/extensions |   84.89 |    88.52 |   81.81 |   84.89 |                   
  consent.ts       |   71.65 |    89.28 |   42.85 |   71.65 | ...85-141,156-162 
  disable.ts       |     100 |      100 |     100 |     100 |                   
  enable.ts        |     100 |      100 |     100 |     100 |                   
  install.ts       |    75.6 |    66.66 |   66.66 |    75.6 | ...39-142,145-153 
  link.ts          |     100 |      100 |     100 |     100 |                   
  list.ts          |     100 |      100 |     100 |     100 |                   
  new.ts           |     100 |      100 |     100 |     100 |                   
  settings.ts      |   99.15 |      100 |   83.33 |   99.15 | 151               
  uninstall.ts     |    37.5 |      100 |   33.33 |    37.5 | 23-45,57-64,67-70 
  update.ts        |   96.32 |      100 |     100 |   96.32 | 101-105           
  utils.ts         |   65.06 |    31.25 |     100 |   65.06 | ...85,87-91,93-97 
 ...les/mcp-server |       0 |        0 |       0 |       0 |                   
  example.ts       |       0 |        0 |       0 |       0 | 1-60              
 src/commands/mcp  |   92.29 |    86.08 |   88.88 |   92.29 |                   
  add.ts           |     100 |    98.03 |     100 |     100 | 293               
  list.ts          |   91.22 |    80.76 |      80 |   91.22 | ...19-121,146-147 
  reconnect.ts     |   76.72 |    71.42 |   85.71 |   76.72 | 35-48,153-175     
  remove.ts        |     100 |       80 |     100 |     100 | 21-25             
 ...ommands/review |   11.57 |      100 |       0 |   11.57 |                   
  cleanup.ts       |   17.94 |      100 |       0 |   17.94 | ...01-106,108-109 
  deterministic.ts |   13.75 |      100 |       0 |   13.75 | ...22-738,740-741 
  fetch-pr.ts      |   11.36 |      100 |       0 |   11.36 | ...80-201,203-204 
  load-rules.ts    |   11.32 |      100 |       0 |   11.32 | ...41-153,155-156 
  pr-context.ts    |    6.22 |      100 |       0 |    6.22 | ...97-312,314-315 
  presubmit.ts     |    9.35 |      100 |       0 |    9.35 | ...62-287,289-290 
 ...nds/review/lib |      30 |      100 |       0 |      30 |                   
  gh.ts            |   22.58 |      100 |       0 |   22.58 | ...49,53-54,62-69 
  git.ts           |   22.72 |      100 |       0 |   22.72 | 15-18,29-39,43-44 
  paths.ts         |   52.94 |      100 |       0 |   52.94 | ...26,37-38,42-43 
 src/config        |   92.56 |    84.55 |   89.36 |   92.56 |                   
  auth.ts          |   86.98 |    80.32 |     100 |   86.98 | ...26-227,243-244 
  config.ts        |   86.68 |    83.49 |   81.48 |   86.68 | ...1935,1937-1945 
  keyBindings.ts   |   96.55 |       50 |     100 |   96.55 | 193-196           
  ...ngsAdapter.ts |     100 |    94.11 |     100 |     100 | 64                
  ...idersScope.ts |      92 |       90 |     100 |      92 | 11-12             
  sandboxConfig.ts |   61.64 |    71.87 |   66.66 |   61.64 | ...54-68,73,77-89 
  settings.ts      |   85.76 |    87.25 |   89.18 |   85.76 | ...1148,1153-1156 
  ...ingsSchema.ts |     100 |      100 |     100 |     100 |                   
  ...tedFolders.ts |   96.22 |       94 |     100 |   96.22 | ...88-190,205-206 
 ...nfig/migration |   94.89 |    78.94 |   83.33 |   94.89 |                   
  index.ts         |   94.87 |    88.88 |     100 |   94.87 | 91-92             
  scheduler.ts     |   96.55 |    77.77 |     100 |   96.55 | 19-20             
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 ...ation/versions |   94.74 |       96 |     100 |   94.74 |                   
  ...-v2-shared.ts |     100 |      100 |     100 |     100 |                   
  v1-to-v2.ts      |   81.75 |    90.19 |     100 |   81.75 | ...28-229,231-247 
  v2-to-v3.ts      |     100 |      100 |     100 |     100 |                   
  v3-to-v4.ts      |     100 |      100 |     100 |     100 |                   
 src/core          |     100 |      100 |     100 |     100 |                   
  auth.ts          |     100 |      100 |     100 |     100 |                   
  initializer.ts   |     100 |      100 |     100 |     100 |                   
  theme.ts         |     100 |      100 |     100 |     100 |                   
 src/dualOutput    |   63.09 |    64.51 |   55.55 |   63.09 |                   
  ...tputBridge.ts |   62.94 |    65.51 |   56.25 |   62.94 | ...22-323,331-334 
  ...utContext.tsx |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-8               
 src/export        |       0 |        0 |       0 |       0 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-7               
 src/generated     |     100 |      100 |     100 |     100 |                   
  git-commit.ts    |     100 |      100 |     100 |     100 |                   
 src/i18n          |   81.47 |    75.94 |   65.71 |   81.47 |                   
  index.ts         |   63.68 |    69.56 |   53.84 |   63.68 | ...70-271,281-286 
  languages.ts     |   96.92 |    86.66 |     100 |   96.92 | 134-135,167,184   
  ...nslateKeys.ts |     100 |      100 |     100 |     100 |                   
  ...lationDict.ts |   93.33 |    66.66 |     100 |   93.33 | 15                
 src/i18n/locales  |     100 |      100 |     100 |     100 |                   
  ca.js            |     100 |      100 |     100 |     100 |                   
  de.js            |     100 |      100 |     100 |     100 |                   
  en.js            |     100 |      100 |     100 |     100 |                   
  fr.js            |     100 |      100 |     100 |     100 |                   
  ja.js            |     100 |      100 |     100 |     100 |                   
  pt.js            |     100 |      100 |     100 |     100 |                   
  ru.js            |     100 |      100 |     100 |     100 |                   
  zh-TW.js         |     100 |      100 |     100 |     100 |                   
  zh.js            |     100 |      100 |     100 |     100 |                   
 ...nonInteractive |   72.57 |    71.12 |   74.07 |   72.57 |                   
  session.ts       |   76.64 |     69.4 |   85.71 |   76.64 | ...23-824,833-843 
  types.ts         |    42.5 |      100 |   33.33 |    42.5 | ...90-591,594-595 
 ...active/control |   76.79 |    88.23 |      80 |   76.79 |                   
  ...rolContext.ts |    6.89 |        0 |       0 |    6.89 | 50-86             
  ...Dispatcher.ts |   91.66 |    91.83 |   88.88 |   91.66 | ...54-372,388,391 
  ...rolService.ts |       8 |        0 |       0 |       8 | 46-179            
 ...ol/controllers |   27.25 |    35.71 |   36.66 |   27.25 |                   
  ...Controller.ts |   36.97 |       80 |      80 |   36.97 | ...15-117,127-210 
  ...Controller.ts |       0 |        0 |       0 |       0 | 1-56              
  ...Controller.ts |    33.7 |    34.48 |   44.44 |    33.7 | ...57-466,481-486 
  ...Controller.ts |   14.06 |      100 |       0 |   14.06 | ...82-117,130-133 
  ...Controller.ts |   21.97 |    28.57 |   27.27 |   21.97 | ...39-451,460-489 
 .../control/types |       0 |        0 |       0 |       0 |                   
  serviceAPIs.ts   |       0 |        0 |       0 |       0 | 1                 
 ...Interactive/io |   98.01 |    93.77 |   95.23 |   98.01 |                   
  ...putAdapter.ts |   97.89 |    92.82 |   98.07 |   97.89 | ...1303,1398-1399 
  ...putAdapter.ts |      96 |     90.9 |   85.71 |      96 | 51-52             
  ...nputReader.ts |     100 |    94.73 |     100 |     100 | 67                
  ...putAdapter.ts |   98.38 |      100 |   90.47 |   98.38 | 83-84,124-125     
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/patches       |       0 |        0 |       0 |       0 |                   
  is-in-ci.ts      |       0 |        0 |       0 |       0 | 1-17              
 src/remoteInput   |   86.98 |       75 |   85.71 |   86.98 |                   
  ...utContext.tsx |     100 |      100 |     100 |     100 |                   
  ...putWatcher.ts |   88.12 |    76.08 |   91.66 |   88.12 | ...21-222,233-236 
  index.ts         |       0 |        0 |       0 |       0 | 1-8               
 src/serve         |    79.3 |     78.8 |   92.85 |    79.3 |                   
  auth.ts          |   88.49 |    88.63 |     100 |   88.49 | ...49-150,153-155 
  capabilities.ts  |     100 |     90.9 |     100 |     100 | 264               
  ...usProvider.ts |   67.01 |    51.42 |     100 |   67.01 | ...40-245,278-286 
  debugMode.ts     |     100 |      100 |     100 |     100 |                   
  demo.ts          |     100 |      100 |     100 |     100 |                   
  envSnapshot.ts   |    92.3 |       84 |     100 |    92.3 | 108-111,170-177   
  eventBus.ts      |     100 |      100 |     100 |     100 |                   
  httpAcpBridge.ts |   79.62 |    78.84 |   96.38 |   79.62 | ...4246,4277-4318 
  ...oryChannel.ts |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-106             
  loopbackBinds.ts |     100 |      100 |     100 |     100 |                   
  runQwenServe.ts  |   73.98 |    87.83 |   55.55 |   73.98 | ...94-710,735-737 
  server.ts        |   86.18 |    82.94 |   90.62 |   86.18 | ...2478,2543-2552 
  status.ts        |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
  ...paceAgents.ts |   64.87 |    70.45 |    90.9 |   64.87 | ...1306,1316-1326 
  ...paceMemory.ts |   87.13 |    78.46 |     100 |   87.13 | ...54-361,421-428 
 src/serve/auth    |   86.54 |    78.75 |   93.75 |   86.54 |                   
  deviceFlow.ts    |   96.33 |    79.51 |    97.5 |   96.33 | ...1526,1630,1700 
  ...owProvider.ts |   45.23 |    74.07 |      75 |   45.23 | ...90-359,375,379 
 src/serve/fs      |   84.85 |    79.75 |     100 |   84.85 |                   
  audit.ts         |     100 |    96.15 |     100 |     100 | 201               
  errors.ts        |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  paths.ts         |   77.82 |    77.08 |     100 |   77.82 | ...64,493-497,510 
  policy.ts        |   90.32 |    89.18 |     100 |   90.32 | 142-150           
  ...FileSystem.ts |   83.55 |    76.22 |     100 |   83.55 | ...1859,1886-1887 
 src/serve/routes  |   89.41 |       70 |     100 |   89.41 |                   
  ...ceFileRead.ts |   94.41 |    76.92 |     100 |   94.41 | ...28-329,390-392 
  ...eFileWrite.ts |    82.1 |    60.52 |     100 |    82.1 | ...42-244,247-249 
 src/services      |   91.66 |    91.21 |   97.56 |   91.66 |                   
  ...mandLoader.ts |     100 |    93.75 |     100 |     100 | 92                
  ...killLoader.ts |     100 |    96.15 |     100 |     100 | 47                
  ...andService.ts |    98.7 |      100 |     100 |    98.7 | 107               
  ...mandLoader.ts |   86.83 |    83.87 |     100 |   86.83 | ...30-335,340-345 
  ...omptLoader.ts |   75.84 |    80.64 |   83.33 |   75.84 | ...10-211,277-278 
  ...mandLoader.ts |     100 |      100 |     100 |     100 |                   
  ...nd-factory.ts |   91.42 |    91.66 |     100 |   91.42 | 128,137-144       
  ...ation-tool.ts |     100 |    95.45 |     100 |     100 | 125               
  ...ndMetadata.ts |   98.21 |    96.66 |     100 |   98.21 | 83,87             
  commandUtils.ts  |      96 |     90.9 |     100 |      96 | 48                
  ...and-parser.ts |   90.69 |    85.71 |     100 |   90.69 | 63-66             
  ...ionService.ts |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...ght/generators |    85.9 |    85.61 |   90.47 |    85.9 |                   
  DataProcessor.ts |   85.63 |     85.6 |   92.85 |   85.63 | ...1122,1126-1133 
  ...tGenerator.ts |   98.21 |    85.71 |     100 |   98.21 | 46                
  ...teRenderer.ts |   45.45 |      100 |       0 |   45.45 | 13-51             
 .../insight/types |       0 |       50 |      50 |       0 |                   
  ...sightTypes.ts |       0 |        0 |       0 |       0 |                   
  ...sightTypes.ts |       0 |        0 |       0 |       0 | 1                 
 ...mpt-processors |   97.27 |    94.04 |     100 |   97.27 |                   
  ...tProcessor.ts |     100 |      100 |     100 |     100 |                   
  ...eProcessor.ts |   94.52 |    84.21 |     100 |   94.52 | 46-47,93-94       
  ...tionParser.ts |     100 |      100 |     100 |     100 |                   
  ...lProcessor.ts |   97.41 |    95.65 |     100 |   97.41 | 95-98             
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/services/tips |   97.34 |    84.84 |     100 |   97.34 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  tipHistory.ts    |   92.45 |       70 |     100 |   92.45 | ...22,144,151,160 
  tipRegistry.ts   |     100 |      100 |     100 |     100 |                   
  tipScheduler.ts  |     100 |    91.66 |     100 |     100 | 55                
 src/test-utils    |   93.75 |    83.33 |      80 |   93.75 |                   
  ...omMatchers.ts |   69.69 |       50 |      50 |   69.69 | 32-35,37-39,45-47 
  ...andContext.ts |     100 |      100 |     100 |     100 |                   
  render.tsx       |     100 |      100 |     100 |     100 |                   
 src/ui            |   65.51 |    73.04 |   60.34 |   65.51 |                   
  App.tsx          |     100 |      100 |     100 |     100 |                   
  AppContainer.tsx |   63.66 |     64.7 |      50 |   63.66 | ...3151,3155-3159 
  ...tionNudge.tsx |    9.58 |      100 |       0 |    9.58 | 24-94             
  ...ackDialog.tsx |   29.23 |      100 |       0 |   29.23 | 25-75             
  ...tionNudge.tsx |    7.69 |      100 |       0 |    7.69 | 25-103            
  colors.ts        |      60 |      100 |   35.29 |      60 | ...52,54-55,60-61 
  constants.ts     |     100 |      100 |     100 |     100 |                   
  keyMatchers.ts   |   95.91 |    97.05 |     100 |   95.91 | 25-26             
  ...tic-colors.ts |     100 |      100 |     100 |     100 |                   
  ...inePresets.ts |   98.17 |    88.88 |     100 |   98.17 | ...12,239,387-389 
  textConstants.ts |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/ui/auth       |   52.97 |    51.21 |   42.42 |   52.97 |                   
  AuthDialog.tsx   |   62.87 |     42.1 |   18.18 |   62.87 | ...03,310-332,336 
  ...nProgress.tsx |       0 |        0 |       0 |       0 | 1-64              
  ...etupSteps.tsx |    39.4 |       32 |   38.46 |    39.4 | ...68,471,477,480 
  useAuth.ts       |   94.55 |    73.52 |     100 |   94.55 | ...19-220,239-245 
  ...rSetupFlow.ts |   43.45 |    33.33 |      50 |   43.45 | ...68-389,406-449 
 src/ui/commands   |   75.97 |    81.52 |   83.82 |   75.97 |                   
  aboutCommand.ts  |     100 |      100 |     100 |     100 |                   
  agentsCommand.ts |   83.78 |      100 |      60 |   83.78 | 30-32,42-44       
  ...odeCommand.ts |   89.04 |    81.25 |     100 |   89.04 | 91-92,94-99       
  arenaCommand.ts  |   62.81 |    58.73 |   65.21 |   62.81 | ...91-596,681-689 
  authCommand.ts   |     100 |      100 |     100 |     100 |                   
  branchCommand.ts |     100 |      100 |     100 |     100 |                   
  btwCommand.ts    |   95.59 |    71.42 |     100 |   95.59 | 72,154-159        
  bugCommand.ts    |   81.13 |    71.42 |     100 |   81.13 | 60-69             
  clearCommand.ts  |      92 |    76.47 |     100 |      92 | 43-44,72-73,91-92 
  ...essCommand.ts |    64.7 |       50 |      75 |    64.7 | ...48-149,163-166 
  ...extCommand.ts |   65.06 |    67.24 |   84.61 |   65.06 | ...39-574,585-586 
  copyCommand.ts   |   98.28 |    94.89 |     100 |   98.28 | ...80,280,321,327 
  deleteCommand.ts |     100 |      100 |     100 |     100 |                   
  diffCommand.ts   |     100 |     87.5 |     100 |     100 | ...61,224-225,238 
  ...ryCommand.tsx |   76.87 |    79.03 |   88.88 |   76.87 | ...59-264,318-326 
  docsCommand.ts   |     100 |    88.88 |     100 |     100 | 25                
  doctorCommand.ts |   95.06 |    88.28 |     100 |   95.06 | ...92-293,320-321 
  dreamCommand.ts  |      75 |    66.66 |   66.66 |      75 | 22-27,44-47       
  editorCommand.ts |     100 |      100 |     100 |     100 |                   
  exportCommand.ts |   98.25 |    91.02 |     100 |   98.25 | ...81,198-199,364 
  ...onsCommand.ts |   49.33 |     90.9 |   63.63 |   49.33 | ...06-110,163-215 
  forgetCommand.ts |   26.82 |      100 |      50 |   26.82 | 18-51             
  goalCommand.ts   |   91.41 |    84.44 |      90 |   91.41 | ...86-189,201-204 
  helpCommand.ts   |     100 |      100 |     100 |     100 |                   
  hooksCommand.ts  |    20.4 |       40 |      40 |    20.4 | ...48-180,204-205 
  ideCommand.ts    |   60.75 |    64.28 |   41.17 |   60.75 | ...05-306,310-324 
  initCommand.ts   |   84.33 |    72.72 |     100 |   84.33 | 68,82-87,89-94    
  ...ghtCommand.ts |   74.56 |    68.42 |     100 |   74.56 | ...31-245,250-273 
  ...ageCommand.ts |   92.17 |    82.69 |     100 |   92.17 | ...43,164,173-183 
  lspCommand.ts    |     100 |    86.95 |     100 |     100 | 31,101-102        
  mcpCommand.ts    |     100 |      100 |     100 |     100 |                   
  memoryCommand.ts |     100 |      100 |     100 |     100 |                   
  modelCommand.ts  |   75.09 |    78.18 |      75 |   75.09 | ...20-225,262-267 
  ...onsCommand.ts |     100 |      100 |     100 |     100 |                   
  planCommand.ts   |   78.82 |    76.92 |     100 |   78.82 | 30-35,51-56,68-73 
  quitCommand.ts   |     100 |      100 |     100 |     100 |                   
  recapCommand.ts  |   21.81 |      100 |      50 |   21.81 | 24-73             
  ...berCommand.ts |   32.43 |      100 |      50 |   32.43 | 23-57             
  renameCommand.ts |   85.71 |    86.04 |     100 |   85.71 | ...02-209,216-221 
  ...oreCommand.ts |    92.3 |    87.87 |     100 |    92.3 | ...,83-88,129-130 
  resumeCommand.ts |     100 |      100 |     100 |     100 |                   
  rewindCommand.ts |      80 |      100 |      50 |      80 | 19-21             
  ...ngsCommand.ts |     100 |      100 |     100 |     100 |                   
  ...hubCommand.ts |   81.43 |    65.21 |      80 |   81.43 | ...70-173,176-179 
  skillsCommand.ts |   37.06 |       50 |      50 |   37.06 | ...99-115,118-145 
  statsCommand.ts  |   88.19 |    84.21 |     100 |   88.19 | ...,58-61,143-146 
  ...ineCommand.ts |     100 |      100 |     100 |     100 |                   
  ...aryCommand.ts |    6.46 |      100 |      50 |    6.46 | 31-329            
  tasksCommand.ts  |   77.22 |    72.13 |     100 |   77.22 | ...46-150,172-177 
  ...tupCommand.ts |     100 |      100 |     100 |     100 |                   
  themeCommand.ts  |     100 |      100 |     100 |     100 |                   
  toolsCommand.ts  |     100 |      100 |     100 |     100 |                   
  trustCommand.ts  |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
  vimCommand.ts    |   54.54 |      100 |      50 |   54.54 | 19-29             
 src/ui/components |   62.34 |    75.09 |   64.85 |   62.34 |                   
  AboutBox.tsx     |     100 |      100 |     100 |     100 |                   
  AnsiOutput.tsx   |   65.57 |      100 |      50 |   65.57 | 69-90             
  ApiKeyInput.tsx  |       0 |        0 |       0 |       0 | 1-97              
  AppHeader.tsx    |   89.06 |       75 |     100 |   89.06 | 37,39-44,46       
  ...odeDialog.tsx |     9.7 |      100 |       0 |     9.7 | 35-47,50-182      
  AsciiArt.ts      |     100 |      100 |     100 |     100 |                   
  ...Indicator.tsx |   13.04 |      100 |       0 |   13.04 | 18-61             
  ...TextInput.tsx |   77.01 |       76 |     100 |   77.01 | ...20,234-236,263 
  Composer.tsx     |    81.6 |     64.7 |     100 |    81.6 | ...90,108,160,173 
  ...entPrompt.tsx |     100 |      100 |     100 |     100 |                   
  ...ryDisplay.tsx |   75.89 |    62.06 |     100 |   75.89 | ...,88,93-108,113 
  ...geDisplay.tsx |   68.42 |    57.14 |     100 |   68.42 | 16-17,31-32,42-50 
  ...ification.tsx |   28.57 |      100 |       0 |   28.57 | 16-36             
  ...gProfiler.tsx |       0 |        0 |       0 |       0 | 1-36              
  ...ogManager.tsx |   11.98 |      100 |       0 |   11.98 | 65-508            
  DiffDialog.tsx   |    2.47 |      100 |       0 |    2.47 | 68-732            
  ...ngsDialog.tsx |    8.44 |      100 |       0 |    8.44 | 37-195            
  ExitWarning.tsx  |     100 |      100 |     100 |     100 |                   
  ...hProgress.tsx |    87.8 |    33.33 |     100 |    87.8 | 28-31,56          
  ...ustDialog.tsx |     100 |      100 |     100 |     100 |                   
  Footer.tsx       |   76.59 |    48.64 |     100 |   76.59 | ...35-136,175-180 
  ...ngSpinner.tsx |   68.42 |       80 |      50 |   68.42 | 35-52,73,80-81    
  GoalPill.tsx     |   76.19 |    81.81 |     100 |   76.19 | 24-30,46-50       
  Header.tsx       |   98.62 |    94.28 |     100 |   98.62 | 162,164           
  Help.tsx         |   98.32 |       90 |     100 |   98.32 | ...24,381,447-448 
  ...emDisplay.tsx |    61.7 |       36 |     100 |    61.7 | ...42,345,348-354 
  ...ngeDialog.tsx |     100 |      100 |     100 |     100 |                   
  InputPrompt.tsx  |   80.76 |    79.62 |   83.33 |   80.76 | ...1456,1588,1638 
  ...Shortcuts.tsx |   20.87 |      100 |       0 |   20.87 | ...6,49-51,67-125 
  ...Indicator.tsx |     100 |    91.42 |     100 |     100 | 65,74             
  ...firmation.tsx |   91.42 |      100 |      50 |   91.42 | 26-31             
  MainContent.tsx  |   81.75 |       75 |     100 |   81.75 | ...70-274,282-286 
  MemoryDialog.tsx |    55.1 |    54.54 |   57.14 |    55.1 | ...56,368,381-383 
  ...geDisplay.tsx |       0 |        0 |       0 |       0 | 1-41              
  ModelDialog.tsx  |   80.12 |    63.55 |     100 |   80.12 | ...39-555,612-616 
  ...tsDisplay.tsx |     100 |    97.22 |     100 |     100 | 270               
  ...fications.tsx |   18.18 |      100 |       0 |   18.18 | 15-58             
  ...onsDialog.tsx |    2.13 |      100 |       0 |    2.13 | 62-133,148-1004   
  ...ryDisplay.tsx |     100 |      100 |     100 |     100 |                   
  ...icePrompt.tsx |   92.64 |    85.71 |     100 |   92.64 | 102-106,134-139   
  PrepareLabel.tsx |   91.66 |    77.27 |     100 |   91.66 | 73-75,77-79,110   
  ...atePrompt.tsx |    8.57 |      100 |       0 |    8.57 | 24-55,58-134      
  ...geDisplay.tsx |     100 |      100 |     100 |     100 |                   
  ...ngDisplay.tsx |   21.42 |      100 |       0 |   21.42 | 13-39             
  ...hProgress.tsx |   85.25 |    88.46 |     100 |   85.25 | 121-147           
  ...dSelector.tsx |   41.26 |    61.53 |   71.42 |   41.26 | ...74-472,476-520 
  ...ionPicker.tsx |   83.66 |    72.13 |     100 |   83.66 | ...96,402,444-466 
  ...onPreview.tsx |   92.42 |    84.37 |     100 |   92.42 | ...,70-71,143-145 
  ...ryDisplay.tsx |     100 |      100 |     100 |     100 |                   
  ...putPrompt.tsx |   72.56 |       80 |      40 |   72.56 | ...06-109,114-117 
  ...ngsDialog.tsx |   66.27 |    71.16 |      75 |   66.27 | ...12-820,826-827 
  ...ionDialog.tsx |    87.8 |      100 |   33.33 |    87.8 | 36-39,44-51       
  ...putPrompt.tsx |    15.9 |      100 |       0 |    15.9 | 20-63             
  ...Indicator.tsx |   57.14 |      100 |       0 |   57.14 | 12-15             
  ...MoreLines.tsx |      28 |      100 |       0 |      28 | 18-40             
  ...ionPicker.tsx |   17.59 |      100 |       0 |   17.59 | 55-172            
  StatsDisplay.tsx |     100 |      100 |     100 |     100 |                   
  ...ineDialog.tsx |   93.69 |    83.92 |     100 |   93.69 | ...11,273,293-295 
  ...yTodoList.tsx |   94.17 |       80 |     100 |   94.17 | 56-57,131-134     
  ...nsDisplay.tsx |   87.25 |       64 |     100 |   87.25 | ...47-149,156-158 
  ThemeDialog.tsx  |   89.95 |    46.15 |      75 |   89.95 | ...71-173,243-245 
  Tips.tsx         |   93.54 |       75 |     100 |   93.54 | 39-40             
  TodoDisplay.tsx  |     100 |      100 |     100 |     100 |                   
  ...tsDisplay.tsx |     100 |     87.5 |     100 |     100 | 31-32             
  TrustDialog.tsx  |     100 |    81.81 |     100 |     100 | 71-86             
  ...ification.tsx |   36.36 |      100 |       0 |   36.36 | 15-22             
  ...ackDialog.tsx |    7.84 |      100 |       0 |    7.84 | 24-134            
  ...xitDialog.tsx |   80.36 |    43.47 |      60 |   80.36 | ...24-238,248-251 
 ...nts/agent-view |   38.31 |    70.83 |   36.36 |   38.31 |                   
  ...atContent.tsx |    8.79 |      100 |       0 |    8.79 | 53-265,271-273    
  ...tChatView.tsx |   21.05 |      100 |       0 |   21.05 | 21-39             
  ...tComposer.tsx |   10.28 |      100 |       0 |   10.28 | 58-311            
  AgentFooter.tsx  |   17.07 |      100 |       0 |   17.07 | 28-66             
  AgentHeader.tsx  |   15.38 |      100 |       0 |   15.38 | 27-64             
  AgentTabBar.tsx  |    87.8 |    27.27 |     100 |    87.8 | ...,85,95-103,121 
  ...oryAdapter.ts |     100 |    91.83 |     100 |     100 | 103,109-110,138   
  index.ts         |       0 |        0 |       0 |       0 | 1-12              
 ...mponents/arena |   45.72 |    70.53 |   60.86 |   45.72 |                   
  ArenaCards.tsx   |   73.06 |    71.79 |   85.71 |   73.06 | ...83-185,321-326 
  ...ectDialog.tsx |   83.48 |    69.86 |   88.88 |   83.48 | ...88-392,409-410 
  ...artDialog.tsx |   10.15 |      100 |       0 |   10.15 | 27-161            
  ...tusDialog.tsx |    5.63 |      100 |       0 |    5.63 | 33-75,80-288      
  ...topDialog.tsx |    6.17 |      100 |       0 |    6.17 | 33-213            
 ...ackground-view |    75.6 |    82.66 |   85.29 |    75.6 |                   
  ...sksDialog.tsx |   70.99 |    80.48 |   76.19 |   70.99 | ...1120,1196-1198 
  ...TasksPill.tsx |   63.75 |    86.95 |     100 |   63.75 | 44,86-106,114-122 
  ...gentPanel.tsx |    97.4 |    86.31 |     100 |    97.4 | 123,434-438       
 ...nts/extensions |   45.28 |    33.33 |      60 |   45.28 |                   
  ...gerDialog.tsx |   44.31 |    34.14 |      75 |   44.31 | ...71-480,483-488 
  index.ts         |       0 |        0 |       0 |       0 | 1-9               
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...tensions/steps |   54.88 |    94.23 |   66.66 |   54.88 |                   
  ...ctionStep.tsx |   95.12 |    92.85 |   85.71 |   95.12 | 84-86,89          
  ...etailStep.tsx |    6.18 |      100 |       0 |    6.18 | 20-131            
  ...nListStep.tsx |   88.43 |    94.73 |      80 |   88.43 | 52-53,59-72,106   
  ...electStep.tsx |   13.46 |      100 |       0 |   13.46 | 20-70             
  ...nfirmStep.tsx |   19.56 |      100 |       0 |   19.56 | 23-65             
  index.ts         |     100 |      100 |     100 |     100 |                   
 ...mponents/hooks |   68.67 |    69.07 |   69.56 |   68.67 |                   
  ...etailStep.tsx |   74.68 |    66.66 |   66.66 |   74.68 | ...71-184,188-201 
  ...etailStep.tsx |    87.4 |    73.68 |     100 |    87.4 | 41-42,99-113,119  
  ...abledStep.tsx |     100 |      100 |     100 |     100 |                   
  ...sListStep.tsx |     100 |      100 |     100 |     100 |                   
  ...entDialog.tsx |   34.51 |    47.05 |   42.85 |   34.51 | ...78,482-495,499 
  constants.ts     |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-13              
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...components/mcp |   20.98 |    86.36 |   83.33 |   20.98 |                   
  ...ealthPill.tsx |   68.42 |    85.71 |     100 |   68.42 | 40-46             
  ...entDialog.tsx |    3.64 |      100 |       0 |    3.64 | 41-717            
  constants.ts     |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-30              
  types.ts         |     100 |      100 |     100 |     100 |                   
  utils.ts         |   95.83 |    88.88 |     100 |   95.83 | 16,20,109-110     
 ...ents/mcp/steps |   26.74 |    54.54 |   42.85 |   26.74 |                   
  ...icateStep.tsx |    5.88 |      100 |       0 |    5.88 | 40-55,58-296      
  ...electStep.tsx |   10.95 |      100 |       0 |   10.95 | 16-88             
  ...etailStep.tsx |    5.26 |      100 |       0 |    5.26 | 31-247            
  ...rListStep.tsx |   75.18 |    59.37 |     100 |   75.18 | ...53-158,169-173 
  ...etailStep.tsx |   10.41 |      100 |       0 |   10.41 | ...1,67-79,82-139 
  ToolListStep.tsx |   69.02 |       50 |     100 |   69.02 | ...22,125,134-143 
 ...nents/messages |    83.1 |    80.33 |    75.6 |    83.1 |                   
  ...ionDialog.tsx |   80.84 |     77.6 |    62.5 |   80.84 | ...98,516,534-536 
  BtwMessage.tsx   |     100 |      100 |     100 |     100 |                   
  ...upDisplay.tsx |   97.67 |    83.72 |     100 |   97.67 | 119,142,150       
  ...onMessage.tsx |   91.93 |    82.35 |     100 |   91.93 | 57-59,61,63       
  ...nMessages.tsx |   79.06 |      100 |      70 |   79.06 | ...51-264,268-280 
  DiffRenderer.tsx |   93.19 |    86.17 |     100 |   93.19 | ...09,237-238,304 
  ...tsDisplay.tsx |   97.82 |    77.27 |     100 |   97.82 | 87,89             
  ...usMessage.tsx |   76.31 |     42.1 |   66.66 |   76.31 | ...99,101,124,155 
  ...tsDisplay.tsx |    95.1 |    88.05 |     100 |    95.1 | ...29,131,164-169 
  ...ssMessage.tsx |    12.5 |      100 |       0 |    12.5 | 18-59             
  ...edMessage.tsx |   16.66 |      100 |       0 |   16.66 | 22-38             
  ...sMessages.tsx |   55.67 |       40 |   28.57 |   55.67 | ...20-125,133-145 
  ...ryMessage.tsx |   14.28 |      100 |       0 |   14.28 | 23-62             
  ...onMessage.tsx |   81.02 |    69.23 |   33.33 |   81.02 | ...24-426,433-435 
  ...upMessage.tsx |   82.63 |    92.85 |     100 |   82.63 | ...85-412,434-449 
  ToolMessage.tsx  |   88.84 |    75.71 |    92.3 |   88.84 | ...44-749,776-778 
 ...ponents/shared |   86.23 |    79.33 |   95.94 |   86.23 |                   
  ...ctionList.tsx |   99.03 |    95.65 |     100 |   99.03 | 85                
  ...tonSelect.tsx |     100 |      100 |     100 |     100 |                   
  EnumSelector.tsx |     100 |    96.42 |     100 |     100 | 58                
  MaxSizedBox.tsx  |   83.01 |    86.25 |   88.88 |   83.01 | ...12-513,618-619 
  MultiSelect.tsx  |   84.31 |    74.19 |     100 |   84.31 | ...37,193-195,205 
  ...tonSelect.tsx |     100 |      100 |     100 |     100 |                   
  ...eSelector.tsx |     100 |       60 |     100 |     100 | 40-45             
  TextInput.tsx    |   77.77 |    48.78 |      80 |   77.77 | ...14-218,230-236 
  ...apsedTime.tsx |     100 |      100 |     100 |     100 |                   
  ...Indicator.tsx |     100 |      100 |     100 |     100 |                   
  text-buffer.ts   |   85.38 |       80 |   97.77 |   85.38 | ...2437,2535-2536 
  ...er-actions.ts |   86.71 |    67.79 |     100 |   86.71 | ...07-608,809-811 
 ...ents/subagents |   30.87 |        0 |       0 |   30.87 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-11              
  reducers.tsx     |    12.1 |      100 |       0 |    12.1 | 33-190            
  types.ts         |     100 |      100 |     100 |     100 |                   
  utils.ts         |   10.95 |      100 |       0 |   10.95 | ...1,56-57,60-102 
 ...bagents/create |    9.13 |      100 |       0 |    9.13 |                   
  ...ionWizard.tsx |    7.28 |      100 |       0 |    7.28 | 34-299            
  ...rSelector.tsx |   14.75 |      100 |       0 |   14.75 | 26-85             
  ...onSummary.tsx |    4.26 |      100 |       0 |    4.26 | 27-331            
  ...tionInput.tsx |    8.63 |      100 |       0 |    8.63 | 23-177            
  ...dSelector.tsx |   33.33 |      100 |       0 |   33.33 | 20-21,26-27,36-63 
  ...nSelector.tsx |    37.5 |      100 |       0 |    37.5 | 20-21,26-27,36-58 
  ...EntryStep.tsx |   12.76 |      100 |       0 |   12.76 | 34-78             
  ToolSelector.tsx |    4.16 |      100 |       0 |    4.16 | 31-253            
 ...bagents/manage |   21.51 |    59.52 |   27.27 |   21.51 |                   
  ...ctionStep.tsx |   10.25 |      100 |       0 |   10.25 | 21-103            
  ...eleteStep.tsx |   20.93 |      100 |       0 |   20.93 | 23-62             
  ...tEditStep.tsx |   25.53 |      100 |       0 |   25.53 | ...2,37-38,51-124 
  ...ctionStep.tsx |   35.42 |    59.52 |     100 |   35.42 | ...20-432,437-439 
  ...iewerStep.tsx |   13.72 |      100 |       0 |   13.72 | 18-73             
  ...gerDialog.tsx |    6.74 |      100 |       0 |    6.74 | 35-341            
 ...mponents/views |   70.21 |    67.32 |    64.7 |   70.21 |                   
  ContextUsage.tsx |   70.88 |    63.88 |      80 |   70.88 | ...20-426,463-557 
  DoctorReport.tsx |     9.8 |      100 |       0 |     9.8 | 25-54,57-131      
  ...sionsList.tsx |   87.69 |    73.68 |     100 |   87.69 | 65-72             
  McpStatus.tsx    |   89.53 |    60.52 |     100 |   89.53 | ...72,175-177,262 
  SkillsList.tsx   |   27.27 |      100 |       0 |   27.27 | 18-35             
  ToolsList.tsx    |     100 |      100 |     100 |     100 |                   
 src/ui/contexts   |   77.45 |    77.46 |   80.35 |   77.45 |                   
  ...ewContext.tsx |    64.7 |    85.71 |      50 |    64.7 | ...22-225,231-241 
  AppContext.tsx   |      80 |       50 |     100 |      80 | 19-20             
  ...ewContext.tsx |   93.03 |    64.28 |      50 |   93.03 | ...31-232,259-263 
  ...deContext.tsx |     100 |      100 |     100 |     100 |                   
  ...igContext.tsx |   81.81 |       50 |     100 |   81.81 | 15-16             
  ...ssContext.tsx |   82.31 |    82.84 |     100 |   82.31 | ...1153,1159-1161 
  ...owContext.tsx |   89.28 |       80 |   66.66 |   89.28 | 34,47-48,60-62    
  ...deContext.tsx |     100 |      100 |      50 |     100 |                   
  ...onContext.tsx |   43.28 |     62.5 |    62.5 |   43.28 | ...56-259,263-266 
  ...gsContext.tsx |   83.33 |       50 |     100 |   83.33 | 17-18             
  ...usContext.tsx |     100 |      100 |     100 |     100 |                   
  ...ngContext.tsx |   71.42 |       50 |     100 |   71.42 | 17-20             
  ...utContext.tsx |   85.71 |      100 |   66.66 |   85.71 | 13-14             
  ...nsContext.tsx |   88.23 |       50 |     100 |   88.23 | 118-119           
  ...teContext.tsx |   86.66 |       50 |     100 |   86.66 | 194-195           
  ...deContext.tsx |   76.08 |    72.72 |     100 |   76.08 | 47-48,52-59,77-78 
 src/ui/daemon     |   90.76 |    73.73 |   95.45 |   90.76 |                   
  ...TuiAdapter.ts |   90.76 |    73.73 |   95.45 |   90.76 | ...53,771-772,858 
 src/ui/editors    |   93.33 |    85.71 |   66.66 |   93.33 |                   
  ...ngsManager.ts |   93.33 |    85.71 |   66.66 |   93.33 | 49,63-64          
 src/ui/hooks      |   82.25 |    82.21 |   87.33 |   82.25 |                   
  ...dProcessor.ts |   83.12 |    82.56 |     100 |   83.12 | ...88-389,408-435 
  keyToAnsi.ts     |    3.92 |      100 |       0 |    3.92 | 19-77             
  ...dProcessor.ts |    94.8 |    70.58 |     100 |    94.8 | ...76-277,282-283 
  ...dProcessor.ts |   75.75 |    63.01 |   61.53 |   75.75 | ...84,908,927-931 
  ...amingState.ts |   12.22 |      100 |       0 |   12.22 | 54-157            
  ...agerDialog.ts |   88.23 |      100 |     100 |   88.23 | 20,24             
  ...ationFrame.ts |      32 |       60 |     100 |      32 | 42-44,51-90       
  ...odeCommand.ts |   58.82 |      100 |     100 |   58.82 | 28,33-48          
  ...enaCommand.ts |      85 |      100 |     100 |      85 | 23-24,29          
  ...aInProcess.ts |   19.81 |    66.66 |      25 |   19.81 | 57-175            
  ...Completion.ts |   92.81 |    89.09 |     100 |   92.81 | ...86-187,224-227 
  ...ifications.ts |   92.07 |    96.29 |     100 |   92.07 | 116-124           
  ...tIndicator.ts |   83.49 |    70.96 |     100 |   83.49 | ...60,168,170-178 
  ...waySummary.ts |   96.22 |    69.69 |     100 |   96.22 | 125-127,169       
  ...ndTaskView.ts |   94.21 |    76.08 |     100 |   94.21 | 122-126,213,219   
  ...ketedPaste.ts |    23.8 |      100 |       0 |    23.8 | 19-37             
  ...nchCommand.ts |   94.36 |    74.35 |     100 |   94.36 | ...60,168-169,209 
  ...ompletion.tsx |   95.96 |    83.87 |     100 |   95.96 | ...22-223,225-226 
  ...dMigration.ts |   90.62 |       75 |     100 |   90.62 | 38-40             
  useCompletion.ts |    92.4 |     87.5 |     100 |    92.4 | 68-69,93-94,98-99 
  ...nitMessage.ts |     100 |      100 |     100 |     100 |                   
  ...extualTips.ts |   77.27 |       50 |     100 |   77.27 | ...2,75-79,93-101 
  ...eteCommand.ts |   78.53 |    88.57 |     100 |   78.53 | ...96-104,112-113 
  ...ialogClose.ts |   13.33 |      100 |     100 |   13.33 | 82-173            
  useDiffData.ts   |   11.62 |      100 |       0 |   11.62 | 44-87             
  ...oublePress.ts |   53.12 |       75 |     100 |   53.12 | 33-35,41-54       
  ...orSettings.ts |     100 |      100 |     100 |     100 |                   
  ...Completion.ts |   99.12 |    97.67 |     100 |   99.12 | 182-183           
  ...ionUpdates.ts |   93.45 |     92.3 |     100 |   93.45 | ...83-287,300-306 
  ...agerDialog.ts |   88.88 |      100 |     100 |   88.88 | 21,25             
  ...backDialog.ts |    63.9 |    76.47 |   66.66 |    63.9 | ...66-168,190-191 
  useFocus.ts      |     100 |      100 |     100 |     100 |                   
  ...olderTrust.ts |     100 |      100 |     100 |     100 |                   
  ...ggestions.tsx |   89.15 |     62.5 |      50 |   89.15 | ...22-124,149-150 
  ...miniStream.ts |   78.06 |    75.47 |   91.66 |   78.06 | ...2573,2586-2594 
  ...BranchName.ts |    90.9 |     92.3 |     100 |    90.9 | 19-20,55-58       
  ...oryManager.ts |   93.15 |    93.75 |     100 |   93.15 | 44,107-110        
  ...ooksDialog.ts |    87.5 |      100 |     100 |    87.5 | 19,23             
  ...stListener.ts |     100 |      100 |     100 |     100 |                   
  ...nAuthError.ts |   76.19 |       50 |     100 |   76.19 | 39-40,43-45       
  ...putHistory.ts |   92.59 |    85.71 |     100 |   92.59 | 63-64,72,94-96    
  ...storyStore.ts |     100 |    94.11 |     100 |     100 | 69                
  useKeypress.ts   |     100 |      100 |     100 |     100 |                   
  ...rdProtocol.ts |   36.36 |      100 |       0 |   36.36 | 24-31             
  ...unchEditor.ts |    9.67 |      100 |       0 |    9.67 | 11-32,39-90       
  ...gIndicator.ts |     100 |      100 |     100 |     100 |                   
  useLogger.ts     |   21.05 |      100 |       0 |   21.05 | 15-37             
  useMCPHealth.ts  |   63.15 |       75 |      50 |   63.15 | 42-52,64-67       
  useMcpDialog.ts  |    87.5 |      100 |     100 |    87.5 | 19,23             
  ...moryDialog.ts |    87.5 |      100 |     100 |    87.5 | 19,23             
  ...oryMonitor.ts |     100 |      100 |     100 |     100 |                   
  ...ssageQueue.ts |     100 |      100 |     100 |     100 |                   
  ...delCommand.ts |     100 |       75 |     100 |     100 | 22                
  ...raseCycler.ts |   84.74 |    76.47 |     100 |   84.74 | ...49,52-53,69-71 
  ...rredEditor.ts |   58.33 |    22.22 |     100 |   58.33 | 23-27,29-33       
  ...derUpdates.ts |   86.49 |    77.96 |    90.9 |   86.49 | ...26,288-300,348 
  useQwenAuth.ts   |     100 |      100 |     100 |     100 |                   
  ...lScheduler.ts |    84.7 |    93.33 |     100 |    84.7 | ...71-276,372-382 
  ...oryCommand.ts |       0 |        0 |       0 |       0 | 1-7               
  ...umeCommand.ts |   97.08 |    83.33 |     100 |   97.08 | 103-104,133       
  ...ompletion.tsx |   90.59 |    83.33 |     100 |   90.59 | ...01,104,137-140 
  ...ectionList.ts |   96.98 |    95.69 |     100 |   96.98 | ...83-184,238-241 
  ...sionPicker.ts |   92.87 |    90.35 |     100 |   92.87 | ...99-501,503-505 
  ...earchInput.ts |     100 |      100 |     100 |     100 |                   
  ...ngsCommand.ts |   18.75 |      100 |       0 |   18.75 | 10-25             
  ...ellHistory.ts |   91.74 |    79.41 |     100 |   91.74 | ...74,122-123,133 
  ...oryCommand.ts |       0 |        0 |       0 |       0 | 1-73              
  ...Completion.ts |    82.7 |    85.41 |   94.73 |    82.7 | ...69-671,679-715 
  ...tateAndRef.ts |     100 |      100 |     100 |     100 |                   
  useStatusLine.ts |   96.09 |    90.37 |     100 |   96.09 | ...62-365,450-457 
  ...eateDialog.ts |   88.23 |      100 |     100 |   88.23 | 14,18             
  ...tification.ts |     100 |    85.71 |     100 |     100 | 47                
  ...alProgress.ts |   53.06 |       50 |   66.66 |   53.06 | ...53,61-68,79-85 
  ...rminalSize.ts |   76.19 |      100 |      50 |   76.19 | 21-25             
  ...emeCommand.ts |   67.01 |    29.41 |     100 |   67.01 | ...10-111,115-116 
  useTimer.ts      |   88.09 |    85.71 |     100 |   88.09 | 44-45,51-53       
  ...lMigration.ts |       0 |        0 |       0 |       0 |                   
  ...rustModify.ts |     100 |      100 |     100 |     100 |                   
  useTurnDiffs.ts  |   95.12 |    78.57 |     100 |   95.12 | 133-134,156-157   
  ...elcomeBack.ts |   87.36 |     90.9 |     100 |   87.36 | ...,94-96,114-115 
  ...reeSession.ts |   93.75 |       70 |     100 |   93.75 | 44-45,87          
  vim.ts           |   83.77 |    80.31 |     100 |   83.77 | ...55,759-767,776 
 src/ui/layouts    |   89.72 |     87.5 |     100 |   89.72 |                   
  ...AppLayout.tsx |   89.88 |     87.5 |     100 |   89.88 | 51-53,93-98       
  ...AppLayout.tsx |   89.47 |     87.5 |     100 |   89.47 | 58-63             
 src/ui/models     |   80.24 |    79.16 |   71.42 |   80.24 |                   
  ...ableModels.ts |   80.24 |    79.16 |   71.42 |   80.24 | ...,61-71,123-125 
 ...noninteractive |     100 |      100 |   14.28 |     100 |                   
  ...eractiveUi.ts |     100 |      100 |   14.28 |     100 |                   
 src/ui/state      |   94.91 |    81.81 |     100 |   94.91 |                   
  extensions.ts    |   94.91 |    81.81 |     100 |   94.91 | 68-69,88          
 src/ui/themes     |   98.53 |    70.58 |     100 |   98.53 |                   
  ansi-light.ts    |     100 |      100 |     100 |     100 |                   
  ansi.ts          |     100 |      100 |     100 |     100 |                   
  atom-one-dark.ts |     100 |      100 |     100 |     100 |                   
  ayu-light.ts     |     100 |      100 |     100 |     100 |                   
  ayu.ts           |     100 |      100 |     100 |     100 |                   
  color-utils.ts   |     100 |      100 |     100 |     100 |                   
  default-light.ts |     100 |      100 |     100 |     100 |                   
  default.ts       |     100 |      100 |     100 |     100 |                   
  ...inal-theme.ts |   88.59 |    85.96 |     100 |   88.59 | ...57-261,266-270 
  dracula.ts       |     100 |      100 |     100 |     100 |                   
  github-dark.ts   |     100 |      100 |     100 |     100 |                   
  github-light.ts  |     100 |      100 |     100 |     100 |                   
  googlecode.ts    |     100 |      100 |     100 |     100 |                   
  no-color.ts      |     100 |      100 |     100 |     100 |                   
  qwen-dark.ts     |     100 |      100 |     100 |     100 |                   
  qwen-light.ts    |     100 |      100 |     100 |     100 |                   
  ...tic-tokens.ts |     100 |      100 |     100 |     100 |                   
  ...-of-purple.ts |     100 |      100 |     100 |     100 |                   
  theme-manager.ts |   87.98 |    82.89 |     100 |   87.98 | ...48-357,362-363 
  theme.ts         |     100 |    38.02 |     100 |     100 | ...34-449,457-461 
  xcode.ts         |     100 |      100 |     100 |     100 |                   
 src/ui/utils      |   83.98 |       83 |   92.61 |   83.98 |                   
  ...Colorizer.tsx |   79.53 |    83.78 |     100 |   79.53 | ...51-152,249-275 
  ...nRenderer.tsx |   68.83 |    70.14 |      50 |   68.83 | ...52-254,274-293 
  ...wnDisplay.tsx |   86.01 |    87.41 |     100 |   86.01 | ...87,704,729-754 
  ...idDiagram.tsx |   87.79 |    95.34 |     100 |   87.79 | 156-179           
  ...eRenderer.tsx |   92.08 |    80.45 |      95 |   92.08 | ...76-679,723-728 
  ...dWorkUtils.ts |     100 |      100 |     100 |     100 |                   
  ...boardUtils.ts |   59.61 |    58.82 |     100 |   59.61 | ...,86-88,107-149 
  commandUtils.ts  |    95.9 |    88.42 |     100 |    95.9 | ...62,164-165,289 
  computeStats.ts  |     100 |      100 |     100 |     100 |                   
  customBanner.ts  |   90.68 |    91.22 |     100 |   90.68 | ...13,324-327,334 
  displayUtils.ts  |   88.37 |    72.22 |     100 |   88.37 | 23,25,29,31,33    
  formatters.ts    |   95.23 |    98.27 |     100 |   95.23 | 117-120           
  gradientUtils.ts |     100 |      100 |     100 |     100 |                   
  highlight.ts     |     100 |      100 |     100 |     100 |                   
  ...oryMapping.ts |     100 |    94.28 |     100 |     100 | 35,57             
  historyUtils.ts  |   94.11 |       94 |     100 |   94.11 | 94-97             
  isNarrowWidth.ts |     100 |      100 |     100 |     100 |                   
  ...olDetector.ts |    8.23 |      100 |       0 |    8.23 | ...31-132,135-136 
  latexRenderer.ts |   94.95 |     73.8 |     100 |   94.95 | ...76-178,184-187 
  layoutUtils.ts   |     100 |      100 |     100 |     100 |                   
  ...ightLoader.ts |     100 |    89.47 |     100 |     100 | 81,110            
  ...nUtilities.ts |   69.84 |    85.71 |     100 |   69.84 | 75-91,100-101     
  ...ToolGroups.ts |   98.66 |    96.77 |     100 |   98.66 | 48-49             
  ...geRenderer.ts |   86.23 |    69.06 |   95.12 |   86.23 | ...1284,1324-1330 
  ...alRenderer.ts |   86.69 |     71.9 |     100 |   86.69 | ...1476,1513-1519 
  ...lsBySource.ts |     100 |    95.23 |     100 |     100 | 84                
  osc8.ts          |   94.73 |    87.75 |     100 |   94.73 | ...49,434,438-439 
  ...mConstants.ts |     100 |      100 |     100 |     100 |                   
  restoreGoal.ts   |   98.98 |    97.05 |     100 |   98.98 | 98                
  ...storyUtils.ts |   61.89 |    69.87 |      90 |   61.89 | ...76,424,429-451 
  ...ickerUtils.ts |     100 |      100 |     100 |     100 |                   
  ...izedOutput.ts |   94.94 |      100 |   88.88 |   94.94 | 112-117           
  ...wOptimizer.ts |     100 |    96.77 |     100 |     100 | 69                
  terminalSetup.ts |    4.37 |      100 |       0 |    4.37 | 44-393            
  textUtils.ts     |   97.61 |    94.84 |   92.85 |   97.61 | ...50-251,386-387 
  todoSnapshot.ts  |   89.11 |    93.33 |     100 |   89.11 | ...,66-78,180-181 
  updateCheck.ts   |     100 |    80.95 |     100 |     100 | 30-42             
 ...i/utils/export |   56.77 |     40.8 |   79.41 |   56.77 |                   
  collect.ts       |   55.92 |    50.58 |   86.36 |   55.92 | ...25-640,642-647 
  index.ts         |     100 |      100 |     100 |     100 |                   
  normalize.ts     |   57.47 |    20.51 |      80 |   57.47 | ...09-310,324-359 
  types.ts         |       0 |        0 |       0 |       0 | 1                 
  utils.ts         |      40 |      100 |       0 |      40 | 11-13             
 ...ort/formatters |    3.38 |      100 |       0 |    3.38 |                   
  html.ts          |    9.61 |      100 |       0 |    9.61 | ...28,34-76,82-84 
  json.ts          |      50 |      100 |       0 |      50 | 14-15             
  jsonl.ts         |     3.5 |      100 |       0 |     3.5 | 14-76             
  markdown.ts      |    0.94 |      100 |       0 |    0.94 | 13-295            
 src/utils         |   77.21 |    90.21 |   94.07 |   77.21 |                   
  acpModelUtils.ts |     100 |      100 |     100 |     100 |                   
  apiPreconnect.ts |   96.72 |    97.14 |     100 |   96.72 | 165-168           
  checks.ts        |   33.33 |      100 |       0 |   33.33 | 23-28             
  cleanup.ts       |   84.12 |    93.33 |      80 |   84.12 | 75,106-115        
  commands.ts      |     100 |      100 |     100 |     100 |                   
  commentJson.ts   |   87.17 |     90.9 |     100 |   87.17 | 64-73             
  ...Calculator.ts |     100 |      100 |     100 |     100 |                   
  deepMerge.ts     |     100 |       90 |     100 |     100 | 41-43,49          
  ...ScopeUtils.ts |   97.56 |    88.88 |     100 |   97.56 | 67                
  doctorChecks.ts  |   70.98 |       75 |     100 |   70.98 | ...95-301,325-341 
  ...putCapture.ts |   90.65 |    86.17 |     100 |   90.65 | ...72,370,372-373 
  ...arResolver.ts |   94.28 |       88 |     100 |   94.28 | 28-29,125-126     
  errors.ts        |   90.85 |    96.36 |    92.3 |   90.85 | 69-70,298-310     
  events.ts        |     100 |      100 |     100 |     100 |                   
  gitUtils.ts      |   91.91 |    84.61 |     100 |   91.91 | 78-81,124-127     
  ...AutoUpdate.ts |   90.76 |    93.33 |   88.88 |   90.76 | 103-114           
  ...tyWarnings.ts |     100 |      100 |     100 |     100 |                   
  ...lationInfo.ts |     100 |      100 |     100 |     100 |                   
  languageUtils.ts |   97.89 |    96.42 |     100 |   97.89 | 132-133           
  math.ts          |       0 |        0 |       0 |       0 | 1-15              
  ...iagnostics.ts |   94.57 |    83.01 |   88.88 |   94.57 | ...05,311,315-317 
  ...onfigUtils.ts |     100 |      100 |     100 |     100 |                   
  ...iveHelpers.ts |   96.79 |    93.28 |     100 |   96.79 | ...76-477,575,588 
  osc.ts           |    97.5 |      100 |   88.88 |    97.5 | 195-196           
  package.ts       |   88.88 |       80 |     100 |   88.88 | 33-34             
  processUtils.ts  |     100 |      100 |     100 |     100 |                   
  readStdin.ts     |   79.62 |       90 |      80 |   79.62 | 33-40,52-54       
  relaunch.ts      |   98.07 |    76.92 |     100 |   98.07 | 70                
  resolvePath.ts   |   66.66 |       25 |     100 |   66.66 | 12-13,16,18-19    
  runBudget.ts     |   99.35 |    96.77 |     100 |   99.35 | 119               
  sandbox.ts       |       0 |        0 |       0 |       0 | 1-1047            
  sessionPaths.ts  |   90.84 |    90.56 |     100 |   90.84 | ...81-182,185-186 
  settingsUtils.ts |   82.51 |    91.66 |   89.74 |   82.51 | ...76-694,701-709 
  spawnWrapper.ts  |     100 |      100 |     100 |     100 |                   
  ...upProfiler.ts |   98.46 |    94.52 |     100 |   98.46 | 130-131,305       
  ...upWarnings.ts |     100 |      100 |     100 |     100 |                   
  stdioHelpers.ts  |     100 |       60 |     100 |     100 | 23,32             
  systemInfo.ts    |   95.12 |    89.06 |     100 |   95.12 | ...43-244,249-253 
  ...InfoFields.ts |    87.5 |       65 |     100 |    87.5 | ...24-125,146-147 
  ...iffPreview.ts |   94.11 |    83.33 |     100 |   94.11 | 13                
  ...entEmitter.ts |     100 |      100 |     100 |     100 |                   
  ...upWarnings.ts |   91.17 |    82.35 |     100 |   91.17 | 67-68,73-74,77-78 
  version.ts       |     100 |       50 |     100 |     100 | 11                
  ...ingHandler.ts |     100 |      100 |     100 |     100 |                   
  windowTitle.ts   |     100 |      100 |     100 |     100 |                   
  ...WithBackup.ts |   63.15 |    81.25 |     100 |   63.15 | 93,118-157        
-------------------|---------|----------|---------|---------|-------------------
Core Package - Full Text Report
-------------------|---------|----------|---------|---------|-------------------
File               | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s 
-------------------|---------|----------|---------|---------|-------------------
All files          |   80.32 |    83.02 |   82.57 |   80.32 |                   
 src               |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/__mocks__/fs  |       0 |        0 |       0 |       0 |                   
  promises.ts      |       0 |        0 |       0 |       0 | 1-48              
 src/agents        |   88.06 |    79.77 |   92.13 |   88.06 |                   
  ...transcript.ts |   92.25 |    85.71 |     100 |   92.25 | ...87,306-307,438 
  ...ent-resume.ts |    82.8 |    71.63 |   77.41 |    82.8 | ...1059-1063,1066 
  ...ound-tasks.ts |   95.76 |    87.57 |     100 |   95.76 | ...26-827,898-899 
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/agents/arena  |   76.54 |    66.87 |   78.72 |   76.54 |                   
  ...gentClient.ts |   79.47 |    88.88 |   81.81 |   79.47 | ...68-183,189-204 
  ArenaManager.ts  |   75.37 |    63.37 |   78.26 |   75.37 | ...1860,1866-1867 
  arena-events.ts  |   64.44 |      100 |      50 |   64.44 | ...71-175,178-183 
  diff-summary.ts  |    87.5 |    72.34 |     100 |    87.5 | ...32-133,137-138 
  index.ts         |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...gents/backends |   76.29 |    86.15 |   73.04 |   76.29 |                   
  ITermBackend.ts  |   97.97 |    93.93 |     100 |   97.97 | ...78-180,255,307 
  ...essBackend.ts |   91.25 |    90.62 |   86.66 |   91.25 | ...94,249-269,328 
  TmuxBackend.ts   |    90.7 |    76.55 |   97.36 |    90.7 | ...87,697,743-747 
  detect.ts        |   31.25 |      100 |       0 |   31.25 | 34-88             
  index.ts         |     100 |      100 |     100 |     100 |                   
  iterm-it2.ts     |     100 |     92.1 |     100 |     100 | 37-38,106         
  tmux-commands.ts |    6.64 |      100 |    3.03 |    6.64 | ...93-363,386-503 
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...agents/runtime |   81.78 |    77.83 |   73.04 |   81.78 |                   
  agent-context.ts |     100 |      100 |     100 |     100 |                   
  agent-core.ts    |   76.81 |    72.89 |   63.63 |   76.81 | ...1614,1641-1688 
  agent-events.ts  |     100 |      100 |     100 |     100 |                   
  ...t-headless.ts |   84.48 |    78.04 |   63.63 |   84.48 | ...00-401,404-405 
  ...nteractive.ts |   80.07 |    80.76 |   74.07 |   80.07 | ...53,455,457,460 
  ...statistics.ts |   98.19 |    82.35 |     100 |   98.19 | 127,151,192,225   
  agent-types.ts   |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/agents/tasks  |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/config        |    78.4 |    82.24 |   65.07 |    78.4 |                   
  config.ts        |   76.28 |    81.09 |   60.43 |   76.28 | ...3869,3880-3892 
  constants.ts     |     100 |      100 |     100 |     100 |                   
  models.ts        |     100 |      100 |     100 |     100 |                   
  storage.ts       |   95.01 |     90.9 |   90.47 |   95.01 | ...71-372,375-376 
 ...nfirmation-bus |   98.29 |    97.14 |     100 |   98.29 |                   
  message-bus.ts   |   98.14 |    97.05 |     100 |   98.14 | 42-43             
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/core          |   87.87 |    83.53 |   92.07 |   87.87 |                   
  baseLlmClient.ts |   87.24 |    76.47 |    87.5 |   87.24 | ...82,484-494,503 
  client.ts        |   87.42 |    80.57 |   86.36 |   87.42 | ...2072,2111-2114 
  ...tGenerator.ts |    72.1 |    61.11 |     100 |    72.1 | ...63,365,372-375 
  ...lScheduler.ts |   85.36 |    82.08 |   94.73 |   85.36 | ...3209,3270-3281 
  geminiChat.ts    |   91.04 |    87.25 |   97.22 |   91.04 | ...2703,2770-2771 
  geminiRequest.ts |     100 |      100 |     100 |     100 |                   
  ...htProtocol.ts |    9.09 |      100 |       0 |    9.09 | 34-42,45-49,52-87 
  logger.ts        |   87.33 |    87.02 |     100 |   87.33 | ...61-565,611-625 
  ...tyDefaults.ts |     100 |      100 |     100 |     100 |                   
  ...olExecutor.ts |   92.59 |       75 |      50 |   92.59 | 41-42             
  ...on-helpers.ts |   86.48 |    72.22 |     100 |   86.48 | ...97-198,212-221 
  ...issionFlow.ts |   98.59 |       95 |     100 |   98.59 | 93                
  prompts.ts       |   89.24 |    86.41 |   76.92 |   89.24 | ...-972,1175-1176 
  tokenLimits.ts   |     100 |    89.47 |     100 |     100 | 51-52             
  ...okTriggers.ts |   99.33 |    90.47 |     100 |   99.33 | 156,167           
  turn.ts          |   96.46 |    88.88 |     100 |   96.46 | ...21,434-435,483 
 ...ntentGenerator |   94.93 |    82.59 |   93.87 |   94.93 |                   
  ...tGenerator.ts |   96.49 |    84.28 |   92.59 |   96.49 | ...04,922-926,966 
  converter.ts     |   94.51 |    80.72 |     100 |   94.51 | ...06-607,617,823 
  index.ts         |       0 |        0 |       0 |       0 | 1-21              
  usage.ts         |     100 |      100 |     100 |     100 |                   
 ...ntentGenerator |   91.53 |    71.64 |   93.33 |   91.53 |                   
  ...tGenerator.ts |      90 |    70.96 |   92.85 |      90 | ...80-286,304-305 
  index.ts         |     100 |       80 |     100 |     100 | 50                
 ...ntentGenerator |   93.86 |    82.98 |    90.9 |   93.86 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...tGenerator.ts |   93.72 |    81.27 |   90.32 |   93.72 | ...29,939-940,968 
  ...tDetection.ts |     100 |      100 |     100 |     100 |                   
 ...ntentGenerator |    81.7 |    84.27 |   90.78 |    81.7 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  converter.ts     |   76.88 |    82.25 |    87.5 |   76.88 | ...1589,1610-1616 
  errorHandler.ts  |     100 |      100 |     100 |     100 |                   
  index.ts         |   54.54 |    68.75 |      50 |   54.54 | ...79,87-91,95-99 
  ...tGenerator.ts |    66.4 |    70.58 |   88.88 |    66.4 | ...51-157,168-169 
  pipeline.ts      |   93.69 |     84.9 |     100 |   93.69 | ...61-462,470,538 
  ...ureContext.ts |     100 |      100 |     100 |     100 |                   
  ...ingOptions.ts |       0 |        0 |       0 |       0 | 1                 
  ...CallParser.ts |   90.66 |    88.57 |     100 |   90.66 | ...15-319,349-350 
  ...kingParser.ts |     100 |    96.87 |     100 |     100 | 42                
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 ...rator/provider |   96.67 |    88.42 |   96.07 |   96.67 |                   
  dashscope.ts     |   97.38 |    90.21 |   93.33 |   97.38 | ...93-294,370-371 
  deepseek.ts      |   94.91 |    89.36 |     100 |   94.91 | ...31-132,145-146 
  default.ts       |   95.79 |    89.65 |   88.88 |   95.79 | 122-123,193-195   
  index.ts         |     100 |      100 |     100 |     100 |                   
  mimo.ts          |   94.11 |    66.66 |     100 |   94.11 | 29,52-53          
  minimax.ts       |     100 |      100 |     100 |     100 |                   
  mistral.ts       |   96.07 |    73.33 |     100 |   96.07 | 32-33             
  modelscope.ts    |     100 |      100 |     100 |     100 |                   
  openrouter.ts    |     100 |      100 |     100 |     100 |                   
  types.ts         |       0 |        0 |       0 |       0 |                   
  utils.ts         |     100 |      100 |     100 |     100 |                   
 src/extension     |   62.34 |    79.57 |   80.31 |   62.34 |                   
  ...-converter.ts |   66.21 |    52.45 |     100 |   66.21 | ...85-786,795-827 
  ...ionManager.ts |   47.05 |    82.19 |    65.9 |   47.05 | ...1400,1410-1429 
  ...onSettings.ts |   93.46 |    93.05 |     100 |   93.46 | ...17-221,228-232 
  ...-converter.ts |   54.88 |    94.44 |      60 |   54.88 | ...35-146,158-192 
  github.ts        |   46.41 |     87.3 |   63.63 |   46.41 | ...68-374,413-466 
  index.ts         |     100 |      100 |     100 |     100 |                   
  marketplace.ts   |   97.31 |    93.75 |     100 |   97.31 | ...65,185-186,275 
  npm.ts           |   59.01 |    71.69 |    87.5 |   59.01 | ...23-425,432-436 
  override.ts      |   94.11 |    88.88 |     100 |   94.11 | 63-64,81-82       
  redaction.ts     |     100 |      100 |     100 |     100 |                   
  settings.ts      |   66.26 |      100 |      50 |   66.26 | 81-108,143-149    
  storage.ts       |     100 |      100 |     100 |     100 |                   
  ...ableSchema.ts |     100 |      100 |     100 |     100 |                   
  variables.ts     |   88.75 |    83.33 |     100 |   88.75 | ...28-231,234-237 
 src/followup      |   55.57 |    84.14 |   81.25 |   55.57 |                   
  followupState.ts |      96 |    89.74 |     100 |      96 | 159-161,218-219   
  index.ts         |     100 |      100 |     100 |     100 |                   
  overlayFs.ts     |   95.06 |       84 |     100 |   95.06 | 78,108,122,133    
  speculation.ts   |   13.02 |      100 |   16.66 |   13.02 | 89-464,524-575    
  ...onToolGate.ts |     100 |    96.42 |     100 |     100 | 94                
  ...nGenerator.ts |    71.6 |    72.13 |   83.33 |    71.6 | ...88-246,316-318 
 src/generated     |       0 |        0 |       0 |       0 |                   
  git-commit.ts    |       0 |        0 |       0 |       0 | 1-10              
 src/goals         |   89.57 |    83.45 |   94.44 |   89.57 |                   
  ...eGoalStore.ts |    85.1 |    95.45 |   84.61 |    85.1 | ...63-166,174-182 
  goalHook.ts      |   97.26 |    91.48 |     100 |   97.26 | 100-105           
  goalJudge.ts     |   84.33 |    74.28 |     100 |   84.33 | ...57-358,366-368 
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/hooks         |   83.67 |    85.34 |   86.72 |   83.67 |                   
  ...okRegistry.ts |   86.48 |    77.08 |     100 |   86.48 | ...41-344,362-369 
  ...terpolator.ts |   96.66 |    93.33 |     100 |   96.66 | 66-67             
  ...HookRunner.ts |   96.68 |    87.23 |     100 |   96.68 | 110-112,231-233   
  ...Aggregator.ts |    96.4 |    90.78 |     100 |    96.4 | ...91,293-294,367 
  ...entHandler.ts |    94.6 |    86.07 |   93.33 |    94.6 | ...42,799-800,810 
  hookPlanner.ts   |   88.19 |       85 |    90.9 |   88.19 | ...68-170,188-199 
  hookRegistry.ts  |   90.17 |    83.33 |     100 |   90.17 | ...33,352,356,360 
  hookRunner.ts    |   58.56 |    71.26 |   66.66 |   58.56 | ...48-749,758-759 
  hookSystem.ts    |   84.57 |      100 |   65.85 |   84.57 | ...21-622,628-629 
  ...HookRunner.ts |   75.51 |     61.9 |      80 |   75.51 | ...05-406,424-425 
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...HookRunner.ts |   96.37 |     90.9 |      90 |   96.37 | 342-350,424-425   
  ...SkillHooks.ts |   78.75 |       75 |   66.66 |   78.75 | 62-66,137-152     
  ...oksManager.ts |   96.66 |    91.66 |     100 |   96.66 | ...90,209-210,223 
  ssrfGuard.ts     |   77.22 |    85.36 |     100 |   77.22 | ...57,261-267,273 
  stopHookCap.ts   |     100 |      100 |     100 |     100 |                   
  trustedHooks.ts  |       0 |        0 |       0 |       0 | 1-124             
  types.ts         |   91.21 |    92.13 |   85.71 |   91.21 | ...40-441,501-505 
  urlValidator.ts  |     100 |      100 |     100 |     100 |                   
 src/ide           |   74.28 |    83.39 |   78.33 |   74.28 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  detect-ide.ts    |     100 |      100 |     100 |     100 |                   
  ide-client.ts    |    64.2 |    81.48 |   66.66 |    64.2 | ...9-970,999-1007 
  ide-installer.ts |   89.06 |    79.31 |     100 |   89.06 | ...36,143-147,160 
  ideContext.ts    |     100 |      100 |     100 |     100 |                   
  process-utils.ts |   84.84 |    71.79 |     100 |   84.84 | ...37,151,193-194 
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/lsp           |   41.24 |    52.14 |   51.42 |   41.24 |                   
  ...nfigLoader.ts |   70.27 |    35.89 |   94.73 |   70.27 | ...20-422,426-432 
  ...ionFactory.ts |   42.69 |    79.16 |      50 |   42.69 | ...62-413,419-436 
  ...Normalizer.ts |   23.09 |    13.72 |   30.43 |   23.09 | ...04-905,909-924 
  ...verManager.ts |   25.31 |    62.06 |   41.66 |   25.31 | ...85-704,710-740 
  ...eLspClient.ts |   32.77 |       80 |   17.64 |   32.77 | ...84-288,294-295 
  ...LspService.ts |   48.49 |    67.16 |   65.71 |   48.49 | ...1352,1369-1379 
  constants.ts     |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/mcp           |   78.69 |    75.34 |   75.92 |   78.69 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  ...h-provider.ts |   86.95 |      100 |   33.33 |   86.95 | ...,93,97,101-102 
  ...h-provider.ts |   73.82 |    53.92 |     100 |   73.82 | ...88-895,902-904 
  ...en-storage.ts |   98.62 |    97.72 |     100 |   98.62 | 87-88             
  oauth-utils.ts   |   70.58 |    85.29 |    90.9 |   70.58 | ...70-290,315-344 
  ...n-provider.ts |   89.83 |    95.83 |   45.45 |   89.83 | ...43,147,151-152 
 .../token-storage |   79.52 |    86.66 |   86.36 |   79.52 |                   
  ...en-storage.ts |     100 |      100 |     100 |     100 |                   
  ...en-storage.ts |   82.87 |    82.35 |   92.85 |   82.87 | ...63-173,181-182 
  ...en-storage.ts |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...en-storage.ts |   68.14 |    82.35 |   64.28 |   68.14 | ...81-295,298-314 
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/memory        |   70.97 |    75.71 |    69.1 |   70.97 |                   
  const.ts         |     100 |      100 |     100 |     100 |                   
  dream.ts         |   65.65 |    73.33 |      50 |   65.65 | 50,107-148        
  ...entPlanner.ts |   57.84 |    72.72 |   33.33 |   57.84 | ...35,140-147,152 
  entries.ts       |   63.77 |    79.16 |      50 |   63.77 | ...72-180,183-189 
  extract.ts       |    95.2 |    79.16 |     100 |    95.2 | 81-86,125         
  ...entPlanner.ts |   63.08 |    65.71 |   41.17 |   63.08 | ...17,222-223,332 
  ...ionPlanner.ts |       0 |        0 |       0 |       0 | 1                 
  forget.ts        |    45.8 |    61.53 |   44.44 |    45.8 | ...04,211,214-346 
  indexer.ts       |   83.87 |    45.45 |     100 |   83.87 | ...50,56-57,69-70 
  manager.ts       |   75.31 |    81.04 |    75.6 |   75.31 | ...1278,1291-1293 
  memoryAge.ts     |   90.47 |    77.77 |     100 |   90.47 | 50-51             
  paths.ts         |   55.47 |    89.47 |   85.71 |   55.47 | ...,89-90,106-114 
  prompt.ts        |   93.36 |    71.42 |     100 |   93.36 | ...58,161,228-229 
  recall.ts        |   77.54 |    69.38 |   88.88 |   77.54 | ...53-258,282-293 
  ...ceSelector.ts |   91.86 |    77.27 |     100 |   91.86 | ...15,117-118,126 
  scan.ts          |   87.91 |    68.42 |     100 |   87.91 | ...47-48,58,82-87 
  ...entPlanner.ts |   58.02 |    66.66 |   56.25 |   58.02 | ...47-268,344-389 
  status.ts        |   10.52 |      100 |       0 |   10.52 | 41-98             
  store.ts         |   94.44 |    83.33 |     100 |   94.44 | 56-57,92-93       
  types.ts         |     100 |      100 |     100 |     100 |                   
  ...ontextFile.ts |   79.38 |    81.03 |   81.81 |   79.38 | ...58-272,286-291 
 src/mocks         |       0 |        0 |       0 |       0 |                   
  msw.ts           |       0 |        0 |       0 |       0 | 1-9               
 src/models        |   89.35 |    85.67 |    87.5 |   89.35 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  ...tor-config.ts |   90.24 |    91.42 |     100 |   90.24 | 142,148,151-160   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...nfigErrors.ts |   74.22 |       44 |   84.61 |   74.22 | ...,67-74,106-117 
  ...igResolver.ts |   98.66 |    92.85 |     100 |   98.66 | 162,324,330       
  modelRegistry.ts |     100 |    98.59 |     100 |     100 | 222               
  modelsConfig.ts  |   84.57 |    82.14 |   81.57 |   84.57 | ...1223,1252-1253 
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/output        |     100 |      100 |     100 |     100 |                   
  ...-formatter.ts |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/permissions   |   74.62 |    88.85 |   58.28 |   74.62 |                   
  autoMode.ts      |   61.59 |    93.54 |   83.33 |   61.59 | ...00-238,340-356 
  ...transcript.ts |      98 |       84 |     100 |      98 | 200-201           
  classifier.ts    |   92.89 |     87.5 |     100 |   92.89 | 146-153,333-337   
  ...erousRules.ts |     100 |    89.36 |     100 |     100 | 110,133,147,175   
  ...alTracking.ts |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...on-manager.ts |    78.3 |    85.24 |   82.14 |    78.3 | ...-917,1023-1027 
  rule-parser.ts   |   96.06 |    93.22 |     100 |   96.06 | ...-875,1024-1026 
  ...-semantics.ts |   58.28 |    85.27 |    30.2 |   58.28 | ...1604-1614,1643 
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...sifier-prompts |   98.18 |       90 |     100 |   98.18 |                   
  system-prompt.ts |   98.18 |       90 |     100 |   98.18 | 150               
 src/prompts       |   83.63 |      100 |    87.5 |   83.63 |                   
  mcp-prompts.ts   |   18.18 |      100 |       0 |   18.18 | 11-19             
  ...t-registry.ts |     100 |      100 |     100 |     100 |                   
 src/providers     |   77.46 |    70.94 |   60.71 |   77.46 |                   
  all-providers.ts |      68 |      100 |       0 |      68 | 68-69,73-79,83-89 
  index.ts         |     100 |      100 |     100 |     100 |                   
  install.ts       |   98.87 |    87.27 |     100 |   98.87 | 268-269           
  ...der-config.ts |   66.11 |    55.93 |   63.15 |   66.11 | ...08-409,416-425 
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 ...viders/presets |   97.26 |    86.36 |      50 |   97.26 |                   
  ...oding-plan.ts |   87.17 |      100 |       0 |   87.17 | 81-83,86-88,90-93 
  ...a-standard.ts |     100 |      100 |     100 |     100 |                   
  ...token-plan.ts |     100 |      100 |     100 |     100 |                   
  ...m-provider.ts |   97.01 |    81.25 |      75 |   97.01 | 120-121           
  deepseek.ts      |     100 |      100 |     100 |     100 |                   
  idealab.ts       |     100 |      100 |     100 |     100 |                   
  minimax.ts       |     100 |      100 |     100 |     100 |                   
  modelscope.ts    |     100 |      100 |     100 |     100 |                   
  openrouter.ts    |     100 |      100 |     100 |     100 |                   
  zai.ts           |     100 |      100 |     100 |     100 |                   
 src/qwen          |   83.87 |    77.23 |   95.83 |   83.87 |                   
  ...tGenerator.ts |   98.64 |    98.18 |     100 |   98.64 | 105-106           
  qwenOAuth2.ts    |   80.85 |    70.27 |   90.32 |   80.85 | ...1169-1185,1215 
  ...kenManager.ts |   83.76 |    76.22 |     100 |   83.76 | ...62-767,788-793 
 src/services      |   85.63 |    83.47 |   91.37 |   85.63 |                   
  ...ionTrailer.ts |     100 |      100 |     100 |     100 |                   
  ...llRegistry.ts |   98.44 |    91.83 |     100 |   98.44 | 268-269           
  ...ionService.ts |   98.44 |    97.45 |     100 |   98.44 | 536,538-542       
  ...ingService.ts |   83.88 |    83.33 |   83.33 |   83.88 | ...1266,1283-1284 
  ...ttribution.ts |   91.73 |    87.71 |      90 |   91.73 | ...80-685,826-827 
  ...utSlimming.ts |     100 |    96.77 |     100 |     100 | 139,188           
  cronScheduler.ts |   97.56 |    92.98 |     100 |   97.56 | 62-63,77,155      
  ...eryService.ts |   80.43 |    95.45 |      75 |   80.43 | ...19-134,140-141 
  ...oryService.ts |   86.18 |    76.76 |   91.17 |   86.18 | ...1150,1191-1194 
  fileReadCache.ts |     100 |      100 |     100 |     100 |                   
  ...temService.ts |   91.27 |    82.69 |    90.9 |   91.27 | ...94,196,294-301 
  ...ratedFiles.ts |      96 |    88.23 |     100 |      96 | 119-120,146-147   
  gitInit.ts       |     100 |      100 |     100 |     100 |                   
  gitService.ts    |   68.75 |     92.3 |   55.55 |   68.75 | ...12-122,125-129 
  ...reeService.ts |   73.83 |    69.31 |    97.5 |   73.83 | ...1460,1488-1489 
  ...ionService.ts |   98.13 |     97.8 |   95.45 |   98.13 | ...32-333,380-381 
  ...orRegistry.ts |   96.54 |    91.73 |     100 |   96.54 | ...70-471,622-623 
  sessionRecap.ts  |   12.65 |      100 |       0 |   12.65 | 44-150            
  ...ionService.ts |   90.47 |     79.2 |   96.87 |   90.47 | ...1324,1328-1329 
  sessionTitle.ts  |   93.87 |    71.15 |     100 |   93.87 | ...33-236,267-268 
  ...ionService.ts |   81.07 |    77.92 |   89.28 |   81.07 | ...1923,1929-1934 
  ...Estimation.ts |     100 |      100 |     100 |     100 |                   
  ...UseSummary.ts |   94.63 |    88.46 |     100 |   94.63 | ...62-164,214-215 
  ...reeCleanup.ts |   14.56 |      100 |   33.33 |   14.56 | 58-185            
  ...ionService.ts |   84.21 |    79.41 |     100 |   84.21 | ...22-223,239-240 
 ...icrocompaction |   98.05 |     91.8 |     100 |   98.05 |                   
  microcompact.ts  |   98.05 |     91.8 |     100 |   98.05 | ...19,289,293,391 
 src/skills        |   88.51 |    85.75 |   94.54 |   88.51 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...activation.ts |     100 |     93.1 |     100 |     100 | 93,112            
  skill-load.ts    |      94 |    86.56 |     100 |      94 | ...08,228,240-242 
  skill-manager.ts |   84.26 |    80.87 |   90.32 |   84.26 | ...1155,1162-1166 
  skill-paths.ts   |   89.15 |    86.36 |     100 |   89.15 | ...00-101,106-107 
  symlinkScope.ts  |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/subagents     |   82.61 |    78.89 |   95.23 |   82.61 |                   
  ...tin-agents.ts |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...nt-manager.ts |   77.15 |    71.36 |    93.1 |   77.15 | ...1178,1200-1201 
  types.ts         |     100 |      100 |     100 |     100 |                   
  validation.ts    |   92.46 |    95.18 |     100 |   92.46 | 51-56,69-74,78-83 
 src/telemetry     |   76.84 |     87.3 |   79.86 |   76.84 |                   
  config.ts        |     100 |      100 |     100 |     100 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  ...attributes.ts |   98.13 |       88 |     100 |   98.13 | 185-187           
  ...-exporters.ts |   46.37 |      100 |   44.44 |   46.37 | ...85,88-89,92-93 
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...t.circular.ts |       0 |        0 |       0 |       0 | 1-111             
  ...-processor.ts |   93.93 |    90.21 |   94.11 |   93.93 | ...75-280,299-300 
  ...t.circular.ts |       0 |        0 |       0 |       0 | 1-128             
  loggers.ts       |    51.9 |       64 |   57.77 |    51.9 | ...1214,1231-1251 
  metrics.ts       |   75.03 |    82.95 |   74.54 |   75.03 | ...8-988,991-1002 
  ...attributes.ts |     100 |      100 |     100 |     100 |                   
  sanitize.ts      |      80 |    83.33 |     100 |      80 | 35-36,41-42       
  sdk.ts           |   93.01 |    88.23 |   80.95 |   93.01 | ...58-559,579-583 
  ...on-context.ts |     100 |      100 |     100 |     100 |                   
  ...on-tracing.ts |   92.75 |    88.26 |     100 |   92.75 | ...27-930,934-937 
  ...etry-utils.ts |     100 |      100 |     100 |     100 |                   
  ...l-decision.ts |     100 |      100 |     100 |     100 |                   
  ...e-id-utils.ts |     100 |      100 |     100 |     100 |                   
  tracer.ts        |   98.61 |    89.36 |     100 |   98.61 | 53,108            
  types.ts         |   79.17 |    85.83 |   83.33 |   79.17 | ...1149,1152-1181 
  uiTelemetry.ts   |   92.97 |    96.96 |   81.25 |   92.97 | ...93-194,200-207 
 ...ry/qwen-logger |   68.24 |    79.56 |   64.91 |   68.24 |                   
  event-types.ts   |       0 |        0 |       0 |       0 |                   
  qwen-logger.ts   |   68.24 |    79.34 |   64.28 |   68.24 | ...1055,1093-1094 
 src/test-utils    |   93.16 |    95.91 |   76.47 |   93.16 |                   
  config.ts        |     100 |      100 |     100 |     100 |                   
  ...st-helpers.ts |   94.11 |       90 |     100 |   94.11 | 69-70             
  index.ts         |     100 |      100 |     100 |     100 |                   
  mock-tool.ts     |   91.19 |    97.14 |   72.41 |   91.19 | ...38,202-203,216 
  ...aceContext.ts |     100 |      100 |     100 |     100 |                   
 src/tools         |   78.94 |    81.43 |   85.71 |   78.94 |                   
  ...erQuestion.ts |   88.93 |    76.74 |    90.9 |   88.93 | ...39-340,347-348 
  cron-create.ts   |   88.11 |    88.88 |    62.5 |   88.11 | ...,43-44,165-172 
  cron-delete.ts   |   96.82 |      100 |   83.33 |   96.82 | 26-27             
  cron-list.ts     |   96.66 |      100 |   83.33 |   96.66 | 25-26             
  diffOptions.ts   |     100 |      100 |     100 |     100 |                   
  edit.ts          |   81.02 |    84.07 |      75 |   81.02 | ...15-716,826-876 
  ...r-worktree.ts |   82.95 |    67.56 |    87.5 |   82.95 | ...82-185,276-277 
  exit-worktree.ts |   84.23 |    85.96 |   91.66 |   84.23 | ...92-293,298-312 
  exitPlanMode.ts  |   85.09 |    85.71 |     100 |   85.09 | ...60-163,177-189 
  glob.ts          |   90.63 |    88.33 |   84.61 |   90.63 | ...28,171,302,305 
  grep.ts          |   79.19 |    85.71 |   78.94 |   79.19 | ...20,560,569-576 
  ls.ts            |   96.74 |    90.27 |     100 |   96.74 | 176-181,212,216   
  lsp.ts           |   72.77 |    60.09 |   90.32 |   72.77 | ...1211,1213-1214 
  ...nt-manager.ts |   84.36 |    82.74 |   84.21 |   84.36 | ...2099-2103,2142 
  mcp-client.ts    |   33.18 |    77.65 |   66.66 |   33.18 | ...1490,1494-1497 
  mcp-tool.ts      |   90.98 |    88.88 |   96.42 |   90.98 | ...95-596,646-647 
  memory-config.ts |       0 |        0 |       0 |       0 | 1-47              
  ...iable-tool.ts |     100 |    84.61 |     100 |     100 | 102,109           
  monitor.ts       |   91.36 |    83.94 |   88.46 |   91.36 | ...61,574,770-775 
  notebook-edit.ts |   85.11 |    76.42 |   81.25 |   85.11 | ...54-870,916-917 
  ...nforcement.ts |   82.57 |       90 |     100 |   82.57 | 174-185,234-247   
  read-file.ts     |    95.4 |    90.32 |      90 |    95.4 | ...99,298-301,304 
  ripGrep.ts       |   94.59 |    85.71 |   93.33 |   94.59 | ...60,463,541-542 
  ...-transport.ts |    6.34 |        0 |       0 |    6.34 | 47-145            
  send-message.ts  |   84.68 |    91.66 |    62.5 |   84.68 | ...,82-90,167-170 
  shell.ts         |   73.05 |    79.66 |   91.42 |   73.05 | ...4216,4265-4271 
  skill-utils.ts   |     100 |      100 |     100 |     100 |                   
  skill.ts         |   88.35 |    91.42 |   86.66 |   88.35 | ...12,416,439-461 
  ...eticOutput.ts |   95.12 |      100 |      80 |   95.12 | 87-88             
  task-stop.ts     |   93.14 |    96.15 |   85.71 |   93.14 | 39-40,54-64       
  todoWrite.ts     |   89.17 |    82.05 |   92.85 |   89.17 | ...41-546,568-569 
  tool-error.ts    |     100 |      100 |     100 |     100 |                   
  tool-names.ts    |     100 |      100 |     100 |     100 |                   
  tool-registry.ts |   74.85 |    76.85 |   80.95 |   74.85 | ...30-831,839-840 
  tool-search.ts   |   95.19 |    86.48 |    92.3 |   95.19 | ...47-153,208-213 
  tools.ts         |   90.49 |    90.19 |   84.21 |   90.49 | ...78-479,495-501 
  web-fetch.ts     |   88.84 |       80 |   92.85 |   88.84 | ...12-313,315-316 
  write-file.ts    |   82.65 |    80.45 |   84.61 |   82.65 | ...65-668,696-731 
 src/tools/agent   |   75.07 |    80.49 |   73.61 |   75.07 |                   
  agent.ts         |   75.33 |    80.72 |   74.24 |   75.33 | ...2474,2483-2486 
  fork-subagent.ts |   69.62 |    71.42 |   66.66 |   69.62 | ...04-105,140-151 
 src/utils         |   89.13 |    87.66 |   93.61 |   89.13 |                   
  LruCache.ts      |       0 |        0 |       0 |       0 | 1-41              
  ...Controller.ts |     100 |      100 |     100 |     100 |                   
  ...ssageQueue.ts |     100 |      100 |     100 |     100 |                   
  ...cFileWrite.ts |   77.96 |    80.48 |     100 |   77.96 | ...35,156,173-176 
  bareMode.ts      |   27.27 |      100 |       0 |   27.27 | 9-15,18-19        
  browser.ts       |    7.69 |      100 |       0 |    7.69 | 17-56             
  bundlePaths.ts   |     100 |      100 |     100 |     100 |                   
  ...igResolver.ts |     100 |      100 |     100 |     100 |                   
  ...engthError.ts |   89.11 |     87.5 |     100 |   89.11 | ...28-129,132-133 
  cronDisplay.ts   |   42.85 |    23.07 |     100 |   42.85 | 26-31,33-45,47-54 
  cronParser.ts    |   89.74 |    85.71 |     100 |   89.74 | ...,63-64,183-186 
  debugLogger.ts   |    95.9 |    93.93 |   94.73 |    95.9 | 106-107,214-218   
  editHelper.ts    |   93.63 |    83.52 |     100 |   93.63 | ...28-429,463-464 
  editor.ts        |    97.6 |     95.4 |     100 |    97.6 | ...25-326,328-329 
  ...arResolver.ts |   94.28 |    88.88 |     100 |   94.28 | 28-29,125-126     
  ...entContext.ts |     100 |    95.45 |     100 |     100 | 83                
  errorParsing.ts  |    97.7 |    97.05 |     100 |    97.7 | 72-73             
  ...rReporting.ts |   88.46 |       90 |     100 |   88.46 | 69-74             
  errors.ts        |   70.54 |    79.59 |      50 |   70.54 | ...15-231,235-241 
  fetch.ts         |   70.18 |    71.42 |   71.42 |   70.18 | ...42,148,161,186 
  fileUtils.ts     |    91.5 |    86.25 |   95.23 |    91.5 | ...1191,1195-1201 
  forkedAgent.ts   |   80.68 |    78.12 |   83.33 |   80.68 | ...39-545,550-556 
  formatters.ts    |   81.81 |       75 |     100 |   81.81 | 15-16             
  ...eUtilities.ts |   89.21 |    86.66 |     100 |   89.21 | 16-17,49-55,65-66 
  ...rStructure.ts |   94.36 |    94.28 |     100 |   94.36 | ...17-120,330-335 
  getPty.ts        |    12.5 |      100 |       0 |    12.5 | 21-34             
  gitDiff.ts       |   92.36 |    79.53 |     100 |   92.36 | ...55-856,928-929 
  ...noreParser.ts |    92.3 |    89.36 |     100 |    92.3 | ...15-116,186-187 
  gitUtils.ts      |   73.64 |    90.32 |   83.33 |   73.64 | ...,78-79,103-154 
  iconvHelper.ts   |     100 |      100 |     100 |     100 |                   
  ...rePatterns.ts |     100 |      100 |     100 |     100 |                   
  ...ionManager.ts |     100 |     90.9 |     100 |     100 | 26                
  ...lPromptIds.ts |     100 |      100 |     100 |     100 |                   
  jsonl-utils.ts   |    74.1 |    90.76 |   58.33 |    74.1 | ...23-326,336-342 
  ...-detection.ts |     100 |      100 |     100 |     100 |                   
  ...iagnostics.ts |    96.4 |     90.9 |     100 |    96.4 | ...66,293-294,376 
  ...yDiscovery.ts |   88.27 |    83.87 |     100 |   88.27 | ...76,279,407-410 
  ...tProcessor.ts |    93.2 |    89.18 |     100 |    93.2 | ...82-288,370-371 
  ...Inspectors.ts |   61.53 |      100 |      50 |   61.53 | 18-23             
  modelId.ts       |   98.95 |    98.21 |     100 |   98.95 | 148               
  ...kerChecker.ts |   90.78 |    91.66 |     100 |   90.78 | 73-79             
  notebook.ts      |   94.57 |    89.83 |   95.83 |   94.57 | ...21,333,385-387 
  openaiLogger.ts  |   90.85 |    87.87 |     100 |   90.85 | ...97-199,222-227 
  partUtils.ts     |     100 |    98.61 |     100 |     100 | 206               
  pathReader.ts    |     100 |      100 |     100 |     100 |                   
  paths.ts         |   93.21 |    91.86 |     100 |   93.21 | ...89-390,392-394 
  pdf.ts           |   93.68 |    87.05 |     100 |   93.68 | ...96-297,321-325 
  projectPath.ts   |     100 |      100 |     100 |     100 |                   
  projectRoot.ts   |   71.73 |    78.57 |     100 |   71.73 | 54-66             
  ...ectSummary.ts |   89.39 |    72.41 |     100 |   89.39 | ...37-142,193-196 
  ...tIdContext.ts |     100 |      100 |     100 |     100 |                   
  proxyUtils.ts    |     100 |      100 |     100 |     100 |                   
  ...rDetection.ts |   58.57 |       76 |     100 |   58.57 | ...4,88-89,95-100 
  ...noreParser.ts |   85.45 |    85.18 |     100 |   85.45 | ...59,65-66,72-73 
  rateLimit.ts     |   92.55 |    85.92 |     100 |   92.55 | ...70-272,309-310 
  readManyFiles.ts |   87.59 |       84 |     100 |   87.59 | ...09-211,227-238 
  retry.ts         |   89.81 |    88.05 |     100 |   89.81 | ...29,350,357-358 
  ripgrepUtils.ts  |   46.79 |    84.37 |   66.66 |   46.79 | ...45-246,258-335 
  ...sDiscovery.ts |   97.42 |    92.85 |     100 |   97.42 | ...04,182-183,202 
  ...iagnostics.ts |   83.08 |     67.5 |   92.59 |   83.08 | ...23,543-544,550 
  ...tchOptions.ts |   81.72 |    85.04 |   95.23 |   81.72 | ...18,543,572-581 
  runtimeStatus.ts |    97.5 |    88.57 |     100 |    97.5 | 167-168           
  safeJsonParse.ts |   74.07 |    83.33 |     100 |   74.07 | 40-46             
  ...nStringify.ts |     100 |      100 |     100 |     100 |                   
  ...aConverter.ts |   90.78 |    88.23 |     100 |   90.78 | ...41-42,93,95-96 
  ...aValidator.ts |   94.57 |    80.26 |     100 |   94.57 | ...04,213-216,270 
  ...r-launcher.ts |   76.92 |     91.3 |   66.66 |   76.92 | ...34,136,157-195 
  ...orageUtils.ts |   96.89 |    85.84 |     100 |   96.89 | ...51,367,447,466 
  shell-utils.ts   |   82.93 |    89.89 |     100 |   82.93 | ...1522,1529-1533 
  ...lAstParser.ts |   95.58 |    85.79 |     100 |   95.58 | ...1059-1061,1071 
  ...nlyChecker.ts |   95.75 |    92.39 |     100 |   95.75 | ...00-301,313-314 
  sideQuery.ts     |   98.71 |    97.14 |     100 |   98.71 | 110               
  ...pEventSink.ts |     100 |       80 |     100 |     100 | 61                
  ...tGenerator.ts |     100 |      100 |     100 |     100 |                   
  ...ameContext.ts |     100 |      100 |     100 |     100 |                   
  symlink.ts       |   77.77 |       50 |     100 |   77.77 | 44,54-59          
  ...emEncoding.ts |   96.36 |    91.17 |     100 |   96.36 | 59-60,124-125     
  terminalSafe.ts  |     100 |      100 |     100 |     100 |                   
  ...Serializer.ts |   98.72 |       90 |     100 |   98.72 | 42-43,134,201-203 
  testUtils.ts     |   53.33 |      100 |   33.33 |   53.33 | ...53,59-64,70-72 
  textUtils.ts     |      60 |      100 |   66.66 |      60 | 36-55             
  thoughtUtils.ts  |     100 |    92.85 |     100 |     100 | 71                
  ...-converter.ts |   94.59 |    85.71 |     100 |   94.59 | 35-36             
  tool-utils.ts    |    93.6 |     91.3 |     100 |    93.6 | ...58-159,162-163 
  truncation.ts    |     100 |       92 |     100 |     100 | 52,71             
  windowsPath.ts   |   89.47 |    79.31 |     100 |   89.47 | ...57-58,62,90-91 
  ...aceContext.ts |   93.71 |    89.28 |   93.33 |   93.71 | ...24-225,249-251 
  xml.ts           |     100 |      100 |     100 |     100 |                   
  yaml-parser.ts   |      92 |     84.9 |     100 |      92 | 49-53,65-69       
 ...ils/filesearch |   86.21 |    81.61 |   96.42 |   86.21 |                   
  crawlCache.ts    |     100 |      100 |     100 |     100 |                   
  crawler.ts       |   82.84 |    77.49 |   94.82 |   82.84 | ...1451,1485-1486 
  fileSearch.ts    |   93.58 |    87.32 |     100 |   93.58 | ...46-247,249-250 
  ignore.ts        |     100 |      100 |     100 |     100 |                   
  result-cache.ts  |     100 |     92.3 |     100 |     100 | 46                
 ...uest-tokenizer |   56.63 |    74.52 |   74.19 |   56.63 |                   
  ...eTokenizer.ts |   41.86 |    76.47 |   69.23 |   41.86 | ...70-443,453-507 
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...tTokenizer.ts |   68.39 |    69.49 |    90.9 |   68.39 | ...24-325,327-328 
  ...ageFormats.ts |      76 |      100 |   33.33 |      76 | 45-48,55-56       
  textTokenizer.ts |     100 |      100 |     100 |     100 |                   
  types.ts         |       0 |        0 |       0 |       0 | 1                 
-------------------|---------|----------|---------|---------|-------------------

For detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run.

@DragonnZhang DragonnZhang changed the title [codex] Add Codex reproduce workflow skills feat(skills): add agent reproduction workflows May 13, 2026
@DragonnZhang DragonnZhang marked this pull request as ready for review May 13, 2026 13:25

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical No tests exist for any of the 3 Python scripts or 2 Bash scripts introduced by this PR. The agent-reproduce-feature/SKILL.md Done Criteria require "at least one verification path" and "focused tests or a reproducible smoke command." The core pipeline (HTTP capture → trace normalization → trace comparison) is entirely untested — any regression in normalize_trace.py or compare_traces.py will silently produce misleading parity results.

— DeepSeek/deepseek-v4-pro via Qwen Code /review

Comment thread .qwen/skills/agent-reproduce-feature/scripts/run_with_mitm.sh
Comment thread .qwen/skills/agent-reproduce-align/scripts/normalize_trace.py
Comment thread .qwen/skills/agent-reproduce-align/scripts/run_pair_capture.sh
Comment thread .qwen/skills/agent-reproduce-align/scripts/normalize_trace.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/llm_dump.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/run_with_mitm.sh
Comment thread .qwen/skills/agent-reproduce-feature/scripts/run_tmux_capture.sh
Comment thread .qwen/skills/agent-reproduce-align/scripts/compare_traces.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/run_with_mitm.sh
Comment thread .qwen/skills/agent-reproduce-feature/scripts/run_with_mitm.sh
Comment thread .qwen/skills/agent-reproduce-feature/scripts/llm_dump.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/llm_dump.py Outdated
Comment thread .qwen/skills/agent-reproduce-feature/scripts/llm_dump.py
Comment thread .qwen/skills/agent-reproduce-align/scripts/normalize_trace.py
Comment thread .qwen/skills/agent-reproduce-align/scripts/compare_traces.py
Comment thread .qwen/skills/agent-reproduce-align/scripts/compare_traces.py
Comment thread .qwen/skills/agent-reproduce-align/scripts/compare_traces.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/capture_state.py
@tanzhenxin tanzhenxin added the type/feature-request New feature or enhancement request label May 15, 2026
@pomelo-nwu

Copy link
Copy Markdown
Collaborator

Question: Has this workflow been exercised end-to-end against a real parity case (e.g., reproducing a specific /help or tool-call behavior from Codex or Claude Code and iterating until Qwen Code matches)?

The concern is a tool credibility gap: this skill exists to verify parity, but if its own normalize_trace.py or compare_traces.py has a subtle bug, it could silently produce a false "aligned" result — and we'd ship incorrect parity conclusions with no safety net.

A concrete suggestion: before merging, include at least one recorded parity run (even a small scenario like a single tool-call comparison) with its expected diff output committed as a fixture. This would serve as both:

  1. Proof that the pipeline works end-to-end (capture → normalize → compare → actionable diff)
  2. A regression test — future changes to the scripts can be validated against this known-good output

Without this, reviewers and future users have no way to distinguish "the agents are aligned" from "the comparison tool says they're aligned."

@DragonnZhang DragonnZhang enabled auto-merge (squash) May 18, 2026 05:57
@DragonnZhang

DragonnZhang commented May 18, 2026

Copy link
Copy Markdown
Collaborator Author

@pomelo-nwu Yes, #4120 was implemented using this skill. I only ran the skill once and made a small amount of manual fine-tuning afterwards; it was very useful.

Comment thread .qwen/skills/agent-reproduce-feature/scripts/run_with_mitm.sh
Comment thread .qwen/skills/agent-reproduce-align/scripts/run_pair_capture.sh Outdated
Comment thread .qwen/skills/agent-reproduce-align/scripts/normalize_trace.py Outdated
Comment thread .qwen/skills/agent-reproduce-feature/scripts/llm_dump.py Outdated
Comment thread .qwen/skills/agent-reproduce-feature/scripts/capture_state.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/capture_state.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/llm_dump.py Outdated
Comment thread .qwen/skills/agent-reproduce-align/scripts/normalize_trace.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/capture_state.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/llm_dump.py Outdated
Comment thread .qwen/skills/agent-reproduce-feature/scripts/run_with_mitm.sh Outdated
Comment thread .qwen/skills/agent-reproduce-feature/SKILL.md Outdated
Reviewer wenshao flagged that the documented commands use 'skills/...' while the skill files live under '.qwen/skills/...'. Running the documented examples from the repo root fails with No such file or directory.

Update all script invocations in the two agent-reproduce SKILL.md files and capture-workflow.md to use the correct '.qwen/skills/...' prefix. Documentation only; no code changes.
Reviewer flagged that '--set ssl_insecure=true' silently disables upstream TLS verification. Add a 3-line comment above the mitmdump invocation explaining the trade-off and that it is intended for local dev only.
Comment thread .qwen/skills/agent-reproduce-feature/scripts/llm_dump.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/llm_dump.py Outdated
Comment thread .qwen/skills/agent-reproduce-align/scripts/run_pair_capture.sh
Comment thread .qwen/skills/agent-reproduce-feature/scripts/run_with_mitm.sh
Comment thread .qwen/skills/agent-reproduce-align/scripts/compare_traces.py
Comment thread .qwen/skills/agent-reproduce-align/scripts/compare_traces.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/capture_state.py
Comment thread .qwen/skills/agent-reproduce-align/scripts/compare_traces.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/capture_state.py Outdated
Comment thread .qwen/skills/agent-reproduce-align/scripts/normalize_trace.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/llm_dump.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/llm_dump.py Outdated
Comment thread .qwen/skills/agent-reproduce-align/scripts/run_pair_capture.sh
Comment thread .qwen/skills/agent-reproduce-feature/scripts/run_with_mitm.sh
Comment thread .qwen/skills/agent-reproduce-align/scripts/compare_traces.py
Comment thread .qwen/skills/agent-reproduce-align/scripts/compare_traces.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/capture_state.py
Comment thread .qwen/skills/agent-reproduce-align/scripts/compare_traces.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/capture_state.py Outdated
Comment thread .qwen/skills/agent-reproduce-align/scripts/normalize_trace.py
@DragonnZhang DragonnZhang requested a review from wenshao May 22, 2026 05:18
Comment thread .qwen/skills/agent-reproduce-feature/scripts/capture_state.py Outdated
Comment thread .qwen/skills/agent-reproduce-align/scripts/normalize_trace.py
Comment thread .qwen/skills/agent-reproduce-feature/scripts/run_with_mitm.sh
Comment thread .qwen/skills/agent-reproduce-feature/scripts/run_with_mitm.sh Outdated
Comment thread .qwen/skills/agent-reproduce-feature/scripts/run_with_mitm.sh Outdated
text = content_sample.decode("utf-8")
except UnicodeDecodeError:
if truncated:
text = content_sample.decode("utf-8", errors="ignore")

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] When content is non-UTF-8 binary AND truncated, the code decodes with errors="ignore" and returns kind: "text" with garbled data. When the same content is non-truncated, it correctly returns kind: "base64". This inconsistency means truncated binary content gets a misleading kind label, and downstream normalize_trace.py tries to parse garbled text as JSON.

Suggested change
text = content_sample.decode("utf-8", errors="ignore")
except UnicodeDecodeError:
return {
"kind": "base64",
"base64": base64.b64encode(content_sample).decode("ascii"),
"truncated": truncated,
}

Remove the if truncated: branch — both truncated and non-truncated binary should use base64 encoding.

— qwen-latest-series-invite-beta-v36 via Qwen Code /review

trap cleanup EXIT

sleep "${REPRO_TMUX_SETTLE_SECONDS:-2}"
tmux capture-pane -t "${session}" -p -S - > "${out_dir}/tmux-pane.txt"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The cleanup trap at line 26 kills the tmux session on EXIT by default, but the script prints attach=tmux attach -t ${session} to stdout and tmux-session.txt. By the time the caller reads these instructions, the session is already dead. Users following the printed attach command will get session not found errors.

Suggested change
tmux capture-pane -t "${session}" -p -S - > "${out_dir}/tmux-pane.txt"
{
echo "session=${session}"
if [[ "${REPRO_TMUX_KEEP_SESSION:-0}" == "1" ]]; then
echo "attach=tmux attach -t ${session}"
echo "capture=tmux capture-pane -t ${session} -p -S - > ${out_dir}/tmux-pane.txt"
echo "kill=tmux kill-session -t ${session}"
else
echo "attach=(session killed on exit; re-run with REPRO_TMUX_KEEP_SESSION=1 to keep)"
fi
} > "${out_dir}/tmux-session.txt"

— qwen-latest-series-invite-beta-v36 via Qwen Code /review

Comment thread .qwen/skills/agent-reproduce-align/scripts/compare_traces.py
"openai-organization",
"openai-project",
}
SENSITIVE_KEY_RE = re.compile(

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] SENSITIVE_KEY_RE includes bare terms cookie and session as alternations. Because re.search matches anywhere in the key string, any JSON key containing these substrings — session_id, session_timeout, max_sessions, cookie_name, cookie_policy — is treated as sensitive and its entire value is replaced with "[REDACTED]".

This cascades through _redact_json to destroy tool parameter schemas: a parameter like "session_id": {"type": "string", "description": "..."} has its entire value redacted, making it impossible to compare tool definitions between traces — directly undermining the alignment workflow's purpose.

Suggested change
SENSITIVE_KEY_RE = re.compile(
SENSITIVE_KEY_RE = re.compile(
r"(?i)(api[-_]?key|authorization|password|secret|token|credential|"
r"access[-_]?token|refresh[-_]?token|client[-_]?secret|"
r"session[-_]?(?:id|token|key|secret)|cookie[-_]?(?:id|name|value|token))"
)

Replace bare cookie and session with compound patterns that require a sensitive suffix, so session_id still matches but max_sessions does not.

— qwen-latest-series-invite-beta-v36 via Qwen Code /review

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test coverage gaps — The PR adds 4 Python scripts (~1,120 lines) and 3 shell scripts with zero test files. The most critical untested paths are: (1) _decode() branches in llm_dump.py (binary+truncated silently switches to lossy decode instead of base64); (2) all 9 TOKEN_PATTERNS redaction regexes in llm_dump.py; (3) normalize_schema() recursive handling of nested JSON Schema with $ref, anyOf, additionalProperties; (4) summarize_tool() for OpenAI vs Anthropic vs legacy tool formats. These are the core correctness paths that the entire comparison pipeline depends on.

— qwen3.7-max via Qwen Code /review

return value


def _redact_url(url: str) -> str:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] _redact_url() only redacts sensitive patterns in query parameters. URL path segments and userinfo (embedded credentials) pass through untouched.

  • Path tokens: A URL like /api/token/abc123def/endpoint is written verbatim. Some API designs embed tokens in path segments.
  • Userinfo: urlparse("https://user:sk-xxx@host/path") preserves user:sk-xxx in the netloc. _replace(query=...) does not touch parsed.username/parsed.password.
Suggested change
def _redact_url(url: str) -> str:
def _redact_url(url: str) -> str:
parsed = urlparse(url)
if parsed.username or parsed.password:
netloc = parsed.hostname or ""
if parsed.port:
netloc = f"{netloc}:{parsed.port}"
parsed = parsed._replace(netloc=netloc)
query = []
for key, value in parse_qsl(parsed.query, keep_blank_values=True):
query.append((key, "[REDACTED]" if SENSITIVE_KEY_RE.search(key) else value))
safe_path = _redact_text(parsed.path)
safe_fragment = _redact_text(parsed.fragment)
return urlunparse(parsed._replace(
query=urlencode(query, doseq=True),
path=safe_path,
fragment=safe_fragment,
))

— qwen3.7-max via Qwen Code /review

r"(?i)(api[-_]?key|authorization|cookie|password|secret|token|credential|"
r"access[-_]?token|refresh[-_]?token|client[-_]?secret|session)"
)
TOKEN_PATTERNS = (

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] TOKEN_PATTERNS requires a keyword prefix (Bearer, Basic, Token) for generic tokens. Bare JWTs like eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0.dozjgNryP4J3jVmNHl0w5N_XgL0n3I9PlFUP0THsR8U are not matched. JWTs appear in LLM tool results (auth context, session tokens) and API responses.

Consider adding a JWT-shaped pattern:

Suggested change
TOKEN_PATTERNS = (
TOKEN_PATTERNS = (
(re.compile(r"(?i)\bbearer\s+[a-z0-9._~+/=-]+"), "Bearer [REDACTED]"),
(re.compile(r"(?i)\bbasic\s+[a-z0-9._~+/=-]+"), "Basic [REDACTED]"),
(re.compile(r"(?i)\btoken\s+[a-z0-9._~+/=-]+"), "Token [REDACTED]"),
(re.compile(r"\bsk-[A-Za-z0-9_-]{12,}\b"), "sk-[REDACTED]"),
(re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "AKIA[REDACTED]"),
(re.compile(r"\bAIza[0-9A-Za-z_-]{20,}\b"), "AIza[REDACTED]"),
(re.compile(r"\b(?:ghp|gho|ghu|ghs)_[A-Za-z0-9_]{20,}\b"), "gh_[REDACTED]"),
(re.compile(r"\bgithub_pat_[A-Za-z0-9_]{20,}\b"), "github_pat_[REDACTED]"),
(re.compile(r"\beyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\b"), "eyJ[REDACTED]"),
(
re.compile(
r"-----BEGIN\s+[\w\s]+PRIVATE\s+KEY-----.*?-----END\s+[\w\s]+PRIVATE\s+KEY-----",
re.DOTALL,
),
"-----BEGIN PRIVATE KEY-----[REDACTED]-----END PRIVATE KEY-----",
),
)

— qwen3.7-max via Qwen Code /review

parsed = urlparse(req.get("url", ""))
url_path = parsed.path
if parsed.query:
url_path = f"{url_path}?{parsed.query}"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The normalized url_path now preserves the full query string (fixing the prior "query strings invisible" issue), but this introduces the opposite problem: query parameters commonly contain ephemeral values (deployment versions, request IDs, timestamps, cache-busters) that differ between runs.

compare_traces.py will report spurious request[N].url_path mismatches for otherwise identical calls. The alignment rules explicitly say "Do not chase provider-specific endpoints, model names, IDs, timestamps, token counts, or ephemeral headers."

Since meaningful parameters for LLM endpoints are in the POST body (already compared via body_keys and body_values), consider dropping the query string or making it opt-in:

Suggested change
url_path = f"{url_path}?{parsed.query}"
url_path = parsed.path

— qwen3.7-max via Qwen Code /review

Comment thread .qwen/skills/agent-reproduce-align/scripts/normalize_trace.py Outdated
echo "qwen_stdout=${out_dir}/qwen/command.stdout"
echo "qwen_stderr=${out_dir}/qwen/command.stderr"

if [[ "${reference_status}" -ne 0 || "${qwen_status}" -ne 0 || "${normalize_ref_status}" -ne 0 || "${normalize_qwen_status}" -ne 0 || "${compare_status}" -ne 0 ]]; then

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] compare_traces.py uses a clean three-way exit protocol: 0 = no diffs, 1 = diffs found, 2 = load error. But this guard collapses all non-zero sub-statuses — including compare_status=1 (expected diffs during iteration) — into exit 1.

The calling agent cannot distinguish "expected parity gap, keep iterating" from "infrastructure failure" without fragile stdout parsing. Consider distinct exit codes:

Suggested change
if [[ "${reference_status}" -ne 0 || "${qwen_status}" -ne 0 || "${normalize_ref_status}" -ne 0 || "${normalize_qwen_status}" -ne 0 || "${compare_status}" -ne 0 ]]; then
if [[ "${reference_status}" -ne 0 || "${qwen_status}" -ne 0 ]]; then
echo "FAILED: command execution (ref=${reference_status}, qwen=${qwen_status})" >&2
exit 2
elif [[ "${normalize_ref_status}" -ne 0 || "${normalize_qwen_status}" -ne 0 ]]; then
echo "FAILED: trace normalization" >&2
exit 3
elif [[ "${compare_status}" -eq 2 ]]; then
echo "FAILED: comparison error" >&2
exit 4
elif [[ "${compare_status}" -eq 1 ]]; then
exit 1
fi

— qwen3.7-max via Qwen Code /review

DragonnZhang and others added 4 commits May 26, 2026 14:04
`is_sensitive_path()` previously split rel_path on `[/._ -]+`, which both
let SSH private keys leak (`id_rsa` -> ["id", "rsa"], no match) and hid
benign tooling files from state diffs (`tokenizer.json` -> ["token", ...]
matched `token`).

Match whole `/`-separated segments and check the basename — and its stem
without trailing extension — directly against `SENSITIVE_PATH_PARTS`.
Hidden directories like `.ssh` / `.gnupg` keep matching their non-dot
equivalents so existing directory-level coverage is preserved.
A JSONL trace line that decodes to a non-object (`[]`, `"hello"`, `42`,
`null`) bypassed the `json.JSONDecodeError` handler and crashed the whole
`normalize()` run with an `AttributeError` on `raw.get("request")`.

Add an `isinstance(raw, dict)` guard alongside the existing decode-error
branch — emit the same `Warning: skipping ...` shape on stderr and
continue with the next line so a single odd entry no longer wipes out
the normalized output.
The sed redaction in run_with_mitm.sh covered sk-*, ghp_*, github_pat_*,
and generic api_key/token/secret/credential= patterns, but did not
include AKIA[0-9A-Z]{16} (AWS access keys) or AIza[0-9A-Za-z_-]{20,}
(Google API keys), even though llm_dump.py and capture_state.py both
already handle them. If those credentials were passed as CLI arguments,
they were written in plaintext to env.txt.

Add the two patterns so all three redaction surfaces stay in sync.
run_with_mitm.sh forced NO_PROXY="" and no_proxy="" so that ALL outbound
HTTP/HTTPS traffic (including localhost calls to MCP servers, dev servers,
and health checks) was funneled through the mitm proxy. When the wrapped
command interacted with a localhost service that did not speak HTTP through
the proxy, this surfaced as confusing connection errors that looked like
application bugs rather than wrapper-induced.

Set NO_PROXY="localhost,127.0.0.1" (and the lowercase variant) so localhost
calls bypass the proxy while remote traffic still gets captured. urllib's
proxy_bypass() honors the new value (verified locally: localhost and
127.0.0.1 bypassed, example.com still proxied).

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI failing: Test (windows-latest, Node 22.x).

Additionally, no test files exist for any of the 4 Python scripts (~1,120 lines) or 3 shell scripts. The security-critical redaction logic (is_sensitive_path, TOKEN_PATTERNS, PEM regex, _redact_json) has zero test coverage. Several findings below (F1, F2, F3) would have been caught by simple unit tests with known inputs and expected outputs.


def entry_for_symlink(path: Path) -> dict[str, Any]:
try:
target = os.readlink(path)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Critical] os.readlink(path) target is stored verbatim with zero redaction — no home path replacement, no is_sensitive_path check.

A symlink at a non-sensitive relative path (e.g. projects/link) pointing to /Users/wenshao/.ssh/id_rsa would leak the full absolute target path including the home directory and the existence/location of credential files into state-manifest.json.

Suggested change
target = os.readlink(path)
def entry_for_symlink(path: Path) -> dict[str, Any]:
try:
target = os.readlink(path)
except OSError:
target = None
if target is not None:
home = str(Path.home())
target = target.replace(home, "~")
if is_sensitive_path(target):
return {"kind": "symlink", "target": "<sensitive_path>"}
return {"kind": "symlink", "target": target}

— qwen3.7-max via Qwen Code /review

)
TOKEN_PATTERNS = (
(re.compile(r"(?i)\bbearer\s+[a-z0-9._~+/=-]+"), "Bearer [REDACTED]"),
(re.compile(r"(?i)\bbasic\s+[a-z0-9._~+/=-]+"), "Basic [REDACTED]"),

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Critical] (?i)\bbasic\s+[a-z0-9._~+/=-]+ matches common English phrases — "basic usage", "bearer minimum", "token effort" all match. Since _redact_json applies _redact_text to every JSON string value, user messages, tool descriptions, and assistant responses get corrupted: "Explain the basic usage""Explain the Basic [REDACTED]".

Note: capture_state.py:GENERIC_AUTH_RE at line 118 uses {8,} (8-char minimum) — the correct pattern. This should match.

Suggested change
(re.compile(r"(?i)\bbasic\s+[a-z0-9._~+/=-]+"), "Basic [REDACTED]"),
(re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/=-]{8,}"), "Bearer [REDACTED]"),
(re.compile(r"(?i)\bbasic\s+[A-Za-z0-9._~+/=-]{8,}"), "Basic [REDACTED]"),
(re.compile(r"(?i)\btoken\s+[A-Za-z0-9._~+/=-]{8,}"), "Token [REDACTED]"),

— qwen3.7-max via Qwen Code /review

(re.compile(r"\bgithub_pat_[A-Za-z0-9_]{20,}\b"), "github_pat_[REDACTED]"),
(
re.compile(
r"-----BEGIN\s+[\w\s]+PRIVATE\s+KEY-----.*?-----END\s+[\w\s]+PRIVATE\s+KEY-----",

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Critical] The PEM regex [\w\s]+ between BEGIN and PRIVATE\s+KEY greedily consumes PRIVATE KEY itself, so the PKCS#8 unencrypted format -----BEGIN PRIVATE KEY----- (no qualifier word) fails to match. PKCS#8 is the modern standard — default output of openssl pkcs8, used by Java keystores, Go crypto/x509, and GCP service account keys.

Format Matches?
BEGIN RSA PRIVATE KEY
BEGIN EC PRIVATE KEY
BEGIN ENCRYPTED PRIVATE KEY
BEGIN PRIVATE KEY

Make the qualifier optional:

Suggested change
r"-----BEGIN\s+[\w\s]+PRIVATE\s+KEY-----.*?-----END\s+[\w\s]+PRIVATE\s+KEY-----",
r"-----BEGIN\s+(?:[\w\s]+\s+)?PRIVATE\s+KEY-----.*?-----END\s+(?:[\w\s]+\s+)?PRIVATE\s+KEY-----",

— qwen3.7-max via Qwen Code /review

return digest.hexdigest()


def is_sensitive_path(rel_path: str) -> bool:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Critical] is_sensitive_path misses .env.local, .env.production, .envrc, .credentials.json, .token.json, and .auth.json.

Root cause: rsplit(".", 1)[0] on .env.local yields .env (leading dot kept), which is not in SENSITIVE_PATH_PARTS. The part[1:] check yields env.local (extension not stripped). For .credentials.json, stem is .credentials, part[1:] is credentials.json — neither matches.

These are among the most common files for API keys and secrets (Next.js, Rails, Django, direnv). When not flagged, capture_text reads their full contents with only token-pattern redaction.

Suggested change
def is_sensitive_path(rel_path: str) -> bool:
def is_sensitive_path(rel_path: str) -> bool:
# ... existing checks ...
stem = basename.rsplit(".", 1)[0] if "." in basename else basename
if stem and stem in SENSITIVE_PATH_PARTS:
return True
# Handle hidden files: .credentials.json → stem is ".credentials"
if stem and stem.startswith(".") and stem[1:] in SENSITIVE_PATH_PARTS:
return True
# Check each dot-separated segment
for segment in basename.split("."):
if segment and segment in SENSITIVE_PATH_PARTS:
return True

— qwen3.7-max via Qwen Code /review

SSL_CERT_FILE="${ca_file}" \
REQUESTS_CA_BUNDLE="${ca_file}" \
REPRO_CAPTURE_OUT="${http_out}" \
"$@" >"${out_dir}/command.stdout" 2>"${out_dir}/command.stderr"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Critical] Five output files (env.txt, mitm.log, command.stdout, command.stderr, command.exit) are created with default umask (typically 0o644, world-readable). Only http.jsonl gets chmod 0o600 (in llm_dump.py:141). This is inconsistent — capture_state.py correctly applies 0o600 to all its outputs.

mitm.log may contain proxy debug output with sensitive data. command.stdout/command.stderr capture unfiltered CLI output which may include API keys or tokens printed by the wrapped agent.

Suggested change
"$@" >"${out_dir}/command.stdout" 2>"${out_dir}/command.stderr"
# Before the main redirections, or after:
for f in "${out_dir}/env.txt" "${out_dir}/mitm.log" \
"${out_dir}/command.stdout" "${out_dir}/command.stderr" \
"${out_dir}/command.exit"; do
chmod 600 "${f}" 2>/dev/null || true
done

— qwen3.7-max via Qwen Code /review

"created_at": now_iso(),
"agent": args.agent,
"root": str(root),
"root_exists": root.exists(),

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Critical] When root does not exist, write_snapshot produces a manifest with "root_exists": false and "entries": {}, writes it to disk, and returns 0 with no stderr warning. A caller checking $? sees success and proceeds as if the snapshot were meaningful.

At 3 AM: capture_state.py snapshot --agent codex on a machine where ~/.codex doesn't exist → empty baseline → all files show as "added" in the diff with no explanation.

Suggested change
"root_exists": root.exists(),
if not root.exists():
print(f"WARNING: root does not exist: {root}", file=sys.stderr)

Consider also returning a non-zero exit code.

— qwen3.7-max via Qwen Code /review

-s "${script_dir}/llm_dump.py" \
>"${mitm_log}" 2>&1 &

mitm_pid="$!"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Critical] The startup health check (line 57) verifies mitmdump is alive before the command starts, but there's no equivalent check after the command finishes. If mitmdump is OOM-killed or crashes mid-capture, records stop being written to http.jsonl with no diagnostic. The cleanup trap silently reaps the dead process.

The operator sees a shorter-than-expected http.jsonl with no indication that records were lost — indistinguishable from "the agent only made N requests".

Suggested change
mitm_pid="$!"
"$@" >"${out_dir}/command.stdout" 2>"${out_dir}/command.stderr"
status=$?
if ! kill -0 "${mitm_pid}" >/dev/null 2>&1; then
echo "WARNING: mitmdump exited during capture. Check mitm.log for details." >&2
echo "mitm_log=${mitm_log}" >&2
fi

— qwen3.7-max via Qwen Code /review

)


def response(flow: http.HTTPFlow) -> None:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The mitmproxy addon only implements response(). mitmproxy fires the error() hook (not response()) for connection-level failures: DNS errors, connection refused, TLS handshake failures, upstream timeouts. These failed flows are silently omitted from the JSONL capture.

If a reference agent makes 10 LLM API calls but 3 fail at the connection level, the trace records only 7 requests. The normalized trace reports request_count: 7. Compared against a Qwen Code trace where all 10 succeed, the comparison shows 3 "extra" requests rather than 3 failures — incorrect diagnosis.

Add an error handler:

def error(flow: http.HTTPFlow) -> None:
    record = {
        "ts": time.time(),
        "request": {
            "method": flow.request.method,
            "url": _redact_url(flow.request.pretty_url),
            "headers": _headers(flow.request.headers),
            "body": _decode(flow.request.content),
        },
        "response": None,
        "error": str(flow.error) if flow.error else None,
    }
    _write_record(record)

— qwen3.7-max via Qwen Code /review

return value


def walk_tools(value: Any) -> list[dict[str, Any]]:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] walk_tools() doesn't handle Gemini's functionDeclarations format. Gemini tools use {"tools": [{"functionDeclarations": [{"name": "...", "parameters": {...}}]}]}summarize_tool() receives {"functionDeclarations": [...]} and finds no function, parameters, or input_schema keys, producing an empty summary.

The capture layer is provider-agnostic (mitmproxy intercepts all HTTP), but normalization is blind to Gemini tool definitions. Trace comparison would report "no differences" for tool sets that are completely different.

Suggested change
def walk_tools(value: Any) -> list[dict[str, Any]]:
def walk_tools(value: Any) -> list[dict[str, Any]]:
tools: list[dict[str, Any]] = []
if isinstance(value, dict):
for key in ("tools", "functions"):
if key in value and isinstance(value[key], list):
for item in value[key]:
# Gemini: {"functionDeclarations": [...]}
if "functionDeclarations" in item and isinstance(item["functionDeclarations"], list):
for decl in item["functionDeclarations"]:
tools.append(summarize_tool({"type": "function", "function": decl}))
elif key == "functions":
tools.append(summarize_tool({"type": "function", "function": item}))
else:
tools.append(summarize_tool(item))
return tools

— qwen3.7-max via Qwen Code /review

return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]


def json_body(record: dict[str, Any]) -> Any:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] json_body() can return a list (e.g., OpenAI batch API requests: [{"method": "POST", ...}, ...]). In normalize(), body.keys() is guarded by isinstance(body, dict) (returns []), and summarize_body_values() also guards with if not isinstance(body, dict): return {}. Batch API requests silently produce empty body_keys and body_values — all body-level comparison is skipped.

Consider adding a body_kind field or content hash for non-dict bodies:

if isinstance(body, list):
    requests[-1]["body_kind"] = "list"
    requests[-1]["body_len"] = len(body)
    requests[-1]["body_hash"] = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()[:16]

— qwen3.7-max via Qwen Code /review

dragon added 3 commits May 26, 2026 14:56
The `/I` flag for case-insensitive matching is a GNU sed extension.
On macOS with BSD sed (prior to Sequoia), the flag silently fails
to match — meaning `API_KEY=...`, `Secret=...`, `Credential=...`
in the captured `redacted_command` would not be redacted in
`env.txt` despite the pattern intending to catch them.

Replace the case-insensitive alternation with explicit per-letter
character classes (`[Aa][Pp][Ii][-_]?[Kk][Ee][Yy]`, etc.). Both
BSD and GNU sed accept this form, and the redaction now works
identically across platforms.

Verified locally on BSD sed (macOS):
- `API_KEY=plain` → `API_KEY=<redacted>`
- `Secret=abc` → `Secret=<redacted>`
- `TOKEN=xyz` → `TOKEN=<redacted>`
- `Credential=zzz` → `Credential=<redacted>`
- `MyApiKey=v` → `MyApiKey=<redacted>`

Addresses review feedback on PR #4118.
`summarize_messages` previously checked only `"system"` (Anthropic
Messages API) and `"instructions"` (OpenAI Responses API) as
top-level body keys. Qwen Code talks to Gemini-shaped endpoints
that carry the system prompt under `"systemInstruction"`
(camelCase) — there are 130+ uses of that key inside
packages/core/src alone.

Without this third key, the normalize tool dropped the system
prompt from the summarized message list when the captured trace
came from a Gemini-shaped request, so `compare_traces.py`
reported a spurious `message_roles` mismatch (one side had a
`role: "system"` entry, the other did not) and could not align
the system-prompt content between the two captures.

Add `"systemInstruction"` to the detection tuple and document the
provider conventions inline.

Verified locally with five synthetic bodies:
- Gemini `systemInstruction` (string)
- Gemini `systemInstruction` (`{role, parts}` shape)
- Anthropic `system` + `messages`
- OpenAI Responses `instructions` + `input`
- bare `messages` (no system prompt)

The first two now correctly emit a `role: "system"` summary entry;
the legacy three are byte-identical.

Addresses review feedback on PR #4118.
`compare_request()` in compare_traces.py looped over messages via
`zip(left_messages, right_messages)`, which silently truncates to the
shorter list. If one capture had an extra trailing message — typically
the feature-relevant prompt or a tool-result block — it was never
compared and never reported. The top-level `request_count` check in
`main()` catches differences in HTTP-request counts; the
`message_roles` diff happened to fire when lists differed, but it
only highlighted the role sequence, not which side was longer, by
how much, or what content was lost off the tail.

Add two diagnostics in `compare_request()` after the role check:

1. An explicit `message_count: L != R` line so the discrepancy is
   visible even when the role prefixes happen to match (e.g. both
   sides start `[system, user]` but one has an extra `assistant`).
2. A per-message `missing_in_right` / `extra_in_right` dump for the
   trailing slice, mirroring the existing request-level handling
   (`request[idx].missing_in_right` / `extra_in_right`) so the
   surfaced output is symmetric between the request and message
   layers.

The zip-loop itself stays (it's correct for the overlapping prefix),
and the empty-list / equal-length / identical / equal-length-with-
different-hash paths are byte-identical to before.

Verified locally with five synthetic request pairs:
- right missing trailing message → 3 diffs (roles, count, missing)
- right has extra trailing message → 3 diffs (roles, count, extra)
- equal length, different content_hash → only the hash diff
- identical → 0 diffs
- both empty → 0 diffs

Addresses review feedback on PR #4118.

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 7 review. All three R6 findings addressed cleanly in the incremental diff (44 lines, 3 files):

  • compare_traces.py:51 message-count diagnostic now surfaces an explicit message_count: L != R line and reports the actual content of trailing messages that fell off zip() (both the missing-in-right and extra-in-right paths).
  • normalize_trace.py:138 adds systemInstruction to the Gemini/Qwen Code system-prompt detection tuple, with a concise provider-convention comment.
  • run_with_mitm.sh:95 replaces the GNU-only /I flag with portable per-letter character classes for api_key/token/secret/credential, and documents the BSD-sed rationale above the expression.

CI green 10/10. — qwen3.7-max via Qwen Code /review

@wenshao

wenshao commented May 28, 2026

Copy link
Copy Markdown
Collaborator

Local Verification Report

PR: #4118feat(skills): add agent reproduction workflows
Verified by: Maintainer (local build + script validation + E2E testing)
Date: 2026-05-29


1. Static Validation

Check Files Result
Python compile (py_compile) llm_dump.py, compare_traces.py, normalize_trace.py, capture_state.py ALL PASS (4/4)
Shell syntax (bash -n) run_with_mitm.sh, run_tmux_capture.sh, run_pair_capture.sh ALL PASS (3/3)

2. E2E Functional Tests (tmux)

Script Test Result
normalize_trace.py Synthetic JSONL → normalized JSON with correct model, stream, messages (content hashes), tools (schemas), body_keys PASS
compare_traces.py Two traces with schema diff (extra timeout property) → correctly detects properties and schema differences PASS
capture_state.py snapshot Snapshots temp directory → produces state-manifest.json with file metadata, sha256, redacted text PASS
capture_state.py diff Before/after snapshots → detects added file (config.toml), modified file (settings.json), produces unified diff in markdown PASS
run_tmux_capture.sh Captures tmux session → produces command.txt, tmux-pane.txt, tmux-session.txt with session management info PASS
run_with_mitm.sh Usage guard → correct usage message + exit code 2 PASS
run_pair_capture.sh Usage guard → correct usage message + exit code 2 PASS
.gitignore .repro-runs/ directory → confirmed gitignored (working tree clean after creating test files) PASS

3. Security Review

Area Status Details
Credential redaction (llm_dump.py) OK Redacts Authorization, Cookie, API keys (OpenAI, GitHub, Google, AWS), Bearer tokens, PEM keys in both headers and JSON bodies. Sensitive header set covers 15 header names + regex pattern match.
Credential redaction (capture_state.py) OK Separate redaction for state files — covers Bearer, sk-, gh_, github_pat_, AKIA, AIza*, PEM keys, key-value pairs with sensitive names. Home path replaced with ~.
Credential redaction (run_with_mitm.sh) OK Command line logged with sed-based redaction of sk-, AKIA, AIza*, gh_, github_pat_, and generic API_KEY/TOKEN/SECRET patterns. BSD/GNU sed compatible (uses character classes instead of /I flag).
Sensitive path handling OK capture_state.py skips content capture for 36 sensitive path parts (auth, credentials, ssh, tokens, etc.). Only metadata (kind, size) exposed.
File permissions OK Output files chmod 0o600 (http.jsonl, state-manifest.json, state-diff.json, state-diff.md)
Proxy binding OK mitmdump binds to 127.0.0.1 only (not 0.0.0.0), with explicit NO_PROXY for localhost
SSRF prevention OK No URL interpolation from user input in API calls — scripts only write to local files
Body size limit OK MAX_BODY (500KB default) prevents unbounded memory consumption in llm_dump.py
Cleanup on exit OK All shell scripts use trap cleanup EXIT for proper process/session cleanup

4. Architecture Review

  • Two-skill workflow: agent-reproduce-feature (capture) → agent-reproduce-align (comparison) — clean separation of concerns
  • Reference agent selection: Supports codex and claude-code, with discovery via command -v before assuming paths
  • Trace normalization: Drops timestamps, auth headers, provider IDs — focuses on contracts (tool names, schemas, required fields, message roles)
  • State capture: Read-only filesystem walk with sensitive path filtering and content redaction — no writes outside the output directory
  • Output isolation: All generated artifacts go to .repro-runs/ (gitignored) or user-specified output directories
  • No runtime product changes: This PR only adds developer-only skill definitions and helper scripts under .qwen/skills/
  • .gitignore changes: Adds .repro-runs/, removes stale pr_body.md entry, fixes missing trailing newline

5. File Summary

Path Type Lines Purpose
.qwen/skills/agent-reproduce-feature/SKILL.md Skill def 133 Capture workflow skill definition
.qwen/skills/agent-reproduce-feature/references/capture-workflow.md Doc 160 Capture workflow reference guide
.qwen/skills/agent-reproduce-feature/scripts/capture_state.py Python 594 State snapshot & diff tool
.qwen/skills/agent-reproduce-feature/scripts/llm_dump.py Python 182 mitmproxy addon for HTTP trace capture
.qwen/skills/agent-reproduce-feature/scripts/run_with_mitm.sh Shell 130 Proxy-wrapped command execution
.qwen/skills/agent-reproduce-feature/scripts/run_tmux_capture.sh Shell 44 tmux session capture
.qwen/skills/agent-reproduce-align/SKILL.md Skill def 98 Alignment workflow skill definition
.qwen/skills/agent-reproduce-align/references/alignment-workflow.md Doc 84 Alignment workflow reference guide
.qwen/skills/agent-reproduce-align/scripts/compare_traces.py Python 156 Normalized trace comparison
.qwen/skills/agent-reproduce-align/scripts/normalize_trace.py Python 244 JSONL trace normalization
.qwen/skills/agent-reproduce-align/scripts/run_pair_capture.sh Shell 118 Paired reference+qwen capture
.gitignore Config +2/-1 Add .repro-runs/, fix trailing newline

6. Summary

All 4 Python scripts compile and function correctly with synthetic test data. All 3 shell scripts parse cleanly and show proper usage guards. E2E testing confirms normalize → compare pipeline works, state snapshot/diff produces correct results, tmux capture creates expected output files, and .repro-runs/ is properly gitignored. Security posture is solid — multi-layer credential redaction, sensitive path filtering, restricted file permissions, and localhost-only proxy binding.

Recommendation: Ready to merge

@tanzhenxin tanzhenxin left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the careful iteration here — and thanks to @wenshao for the deep review pass and local verification.

I went through the remaining open threads. Approving — these are opt-in, additive skills with no runtime/product code, and their captures stay local and gitignored, so nothing here is blocking.

A few items are worth a follow-up when convenient (none need to gate this PR):

  • Redaction over-matching — the one I'd suggest doing first. In llm_dump.py the bearer/basic/token patterns have no minimum length, so ordinary text like "basic usage" becomes "Basic [REDACTED]"; and bare session/cookie in the sensitive-key regex redact keys such as session_id, which can blank out the tool-schema fields the align workflow is meant to compare. The {8,}-style patterns already in capture_state.py resolve both.
  • Redaction completeness (nice-to-have): PKCS#8 -----BEGIN PRIVATE KEY----- (no qualifier word) and bare JWTs aren't matched; and a few run_with_mitm.sh outputs are world-readable while http.jsonl is 0600.
  • Normalization edge cases (nice-to-have): the compared path keeps query strings (ephemeral IDs → spurious diffs), Gemini functionDeclarations tools aren't summarized, and batch (list-shaped) request bodies compare as empty.

Merging as-is and tracking these as follow-ups. Thanks again for building this!

@DragonnZhang DragonnZhang merged commit bec97f4 into main Jun 2, 2026
11 checks passed
TaimoorSiddiquiOfficial added a commit to TaimoorSiddiquiOfficial/HopCode that referenced this pull request Jun 2, 2026
* fix(cli): persist /memory toggle state across dialog reopen (#4650)

The Auto-memory / Auto-dream / Auto-skill rows initialized their state
from Config getters, which are frozen at startup and never reflect a
setValue() write. Each /memory reopen re-mounts the dialog and re-reads
that stale snapshot, so a just-flipped toggle appeared to revert. Read the
initial state from the live merged settings instead, matching the existing
write path (bareMode semantics preserved).

Also switch the test's `act` import to `react` — the previously used
@testing-library/react is declared in package.json but not installed, so
the suite could not run — and add a mount/unmount/remount regression test.

* Hide internal docs from docs site (#4357)

* fix(core): preserve uid in atomicWriteFile to avoid breaking shared-write files (#4431)

* fix(core): preserve uid/gid in atomicWriteFile to avoid breaking shared-write files

atomicWriteFile uses write-to-tmp + rename for crash atomicity. POSIX
rename creates a new inode owned by the calling process's euid/egid, so
the rename silently strips the original uid/gid. On shared-write setups
(e.g. a group-writable file owned by another user in a shared workspace
where the current user has group-write access), every Write/Edit/
NotebookEdit through qwen-code would reset ownership to the running
user and effectively revoke write access for the original collaborators.

The fix:

1. If the target exists and is owned by a different uid/gid than the
   process's effective uid/gid (and we are not root), fall back to
   in-place writeFile. This truncates the existing inode in place,
   preserving uid/gid. The trade-off is loss of crash atomicity for
   this specific case — an acceptable trade for not silently breaking
   shared-write file ownership.

2. If running as root, atomic rename is still used, and ownership is
   restored via chown(uid, gid) after the rename. Root can chown back;
   non-root cannot, hence the in-place fallback for non-root.

3. Windows is unaffected (no POSIX ownership semantics).

Tests:

- New: in-place fallback on uid mismatch — verify content updates, mode
  preserved, and inode unchanged (the inode is the signal that the
  fallback path ran rather than rename).
- New: same scenario triggered via gid mismatch.
- New: positive case — ownership matches → atomic rename → inode changes.

Regression: a v0.16.0 user reported "every write turns a world-writable
file into one other users can no longer write." Bisected to #4096 which
introduced atomicWriteFile + write-to-tmp + rename.

* fix(core): route root through in-place fallback + doc/test follow-ups

Review follow-ups on the atomic-write ownership fix:

1. Remove the root-special-case (rename + post-rename chown). chown
   silently fails inside user-namespaced or CAP_CHOWN-stripped Docker
   containers, which re-triggers the original bug for root-in-Docker
   users — exactly the scenario this fix was reported against. Routing
   root through the same in-place fallback as non-root eliminates this
   failure mode and drops an untestable branch (chown-back can't be
   exercised under non-root CI).

2. Document the three properties traded away by the in-place fallback:
   crash atomicity, concurrent-reader isolation, inotify watcher
   semantics (MODIFY vs MOVED_TO).

3. Document that the in-place fallback surfaces EACCES when the file's
   mode forbids the current user from writing — this is correct
   behavior (atomic rename used to silently replace files the user had
   no permission on, which was arguably a privilege issue).

4. Replace the brittle "see step 6 in the function doc" comment with a
   step-number-independent reference.

5. New test covering the EACCES path: chmod 0o444 + mocked geteuid
   triggers the fallback, fallback hits the read-only file, EACCES
   propagates cleanly, original content is preserved.

* fix(core): harden in-place fallback against symlink/unlink/inode races + doc/test follow-ups

Review follow-ups on #4431 ownership-preservation fix:

CRITICAL — in-place fallback security hardening (wenshao review):

The path-based `fs.writeFile(targetPath, ...)` fallback introduced
three races that the prior `rename(tmp, target)` form did not have:

1. Non-regular files (FIFO/socket/device): fs.writeFile calls
   open(O_WRONLY|O_CREAT|O_TRUNC). On a FIFO this blocks forever
   waiting for a reader. On a character/block device it writes to
   the actual device. The rename path replaced these with a
   regular file.

2. Symlink-swap TOCTOU: an attacker with parent-dir write can swap
   targetPath for a symlink between our stat and our writeFile.
   fs.writeFile follows symlinks at the destination; POSIX rename
   does not. In the very "shared-write workspace / Docker bind-mount"
   scenarios this PR targets, this lets a directory-writable
   attacker redirect agent writes elsewhere (e.g. /etc/passwd if
   the agent runs as root).

3. Unlink race: if targetPath is unlinked between stat and write,
   O_CREAT silently recreates it owned by the calling user — the
   exact ownership change the fallback was designed to prevent.
   Silent regression to the pre-fix bug under this race.

Fix: extract the fallback into writeInPlaceWithFdGuards():

  - open(target, O_WRONLY | O_TRUNC | O_NOFOLLOW) — no O_CREAT, so
    unlink-race surfaces ENOENT instead of silently recreating; and
    O_NOFOLLOW rejects symlink-swaps with ELOOP.
  - fstat(fd) verifies the bound inode's uid/gid still match
    existingStat — refuses the write if an inode-swap happened
    between stat and open.
  - Write through the fd (locked to the verified inode), chmod
    through the fd, close.

Caller now gates the fallback on existingStat.isFile() — non-regular
targets fall through to the atomic path which has well-defined
"replace special-file with regular-file" semantics.

DOC / TEST follow-ups:

- Add hardlink-propagation as a 4th trade-off in the in-place
  fallback JSDoc (review comment #4): rename creates a new inode so
  sibling hardlinks keep old content; in-place truncate+write keeps
  the inode so all hardlinks see new content.

- Update atomicWriteJSON JSDoc to note the write is now
  *conditionally* atomic (review comment #5): atomic when uid/gid
  matches the process, in-place when ownership differs. Previously
  the JSDoc still claimed unconditional atomicity.

- Update caller comments at runtimeStatus.ts and
  worktreeSessionService.ts that advertised crash-atomic writes via
  tmp+rename — those guarantees are now conditional (review
  comment #6).

- Add mode + tmp-leftover assertions to the gid-mismatch test to
  match the uid-mismatch test (review comment #2 — test
  consistency). Without these, a gid-fallback regression that
  silently dropped permissions or left a tmp file would not be
  caught.

- New test: FIFO + ownership mismatch must take the atomic path,
  not in-place (verifies the existingStat.isFile() guard works;
  hang on in-place would trip vitest timeout).

- New test: writing through a symlink with ownership mismatch
  exercises the resolve-then-stat-then-open flow and verifies the
  symlink itself is preserved.

Tests: 192/192 pass (atomicFileWrite + write-file + edit +
fileSystemService).

* fix(core): defer O_TRUNC and verify dev+ino in writeInPlaceWithFdGuards

PR #4431 review follow-up (wenshao critical):

The previous form opened with `O_WRONLY | O_TRUNC | O_NOFOLLOW`, which
truncated the bound file *before* the fd-bound fstat verification ran.
If an attacker swapped the path between the caller's stat and our
open, we would truncate the attacker's substituted inode (destroying
unrelated content) before detecting the swap.

Two fixes:

1. Open without O_TRUNC. Verify dev+ino+uid+gid+isFile match
   expectedStat through fh.stat(). Only then call fh.truncate(0)
   through the validated fd.

2. Expand the verification beyond uid+gid to include dev+ino+isFile.
   uid+gid alone misses a same-owner inode swap (attacker replaces
   the path with a different inode they own). dev+ino is the strong
   identity check; isFile catches a swap to FIFO/socket/device after
   the caller's existingStat.isFile() gate.

JSDoc updated to enumerate the four guards (NOFOLLOW, no CREAT, no
TRUNC at open, dev+ino+uid+gid+isFile via fstat) and explain why
truncation must wait until after verification.

192/192 tests pass.

* fix(core): close FIFO swap race with O_NONBLOCK + cover EOWNERSHIP_CHANGED path

PR #4431 review follow-up (deepseek-v4-pro via /review):

CRITICAL — FIFO swap TOCTOU:

The caller's `existingStat.isFile()` gate uses stat data captured
earlier. An attacker with parent-dir write can swap the regular file
for a FIFO between the caller's stat and our open inside
`writeInPlaceWithFdGuards`. The previous `O_WRONLY | O_NOFOLLOW` open
would then block indefinitely waiting for a FIFO reader; O_NOFOLLOW
only catches symlinks.

Fix: add O_NONBLOCK to the open flags. Defense in depth:

- On a reader-less FIFO, `open(O_WRONLY | O_NONBLOCK)` returns ENXIO
  immediately — no hang.
- If the FIFO has a reader (open succeeds), the subsequent fstat
  isFile() check still refuses the write via EOWNERSHIP_CHANGED.
- For regular files, O_NONBLOCK is a no-op.

CRITICAL test gap — EOWNERSHIP_CHANGED branch untested:

The primary TOCTOU defense (fdStat dev/ino/uid/gid/isFile vs
expectedStat) had no coverage. Exported `writeInPlaceWithFdGuards` so
it can be unit-tested directly:

- New test: simulate post-stat inode swap (unlink + recreate at same
  path), call helper with stale stat, assert EOWNERSHIP_CHANGED and
  that the attacker's content survives.
- New test: simulate post-stat regular→FIFO swap, assert open fails
  fast (ENXIO) or fstat catches it — either way no hang, no write.

DOC fix:

JSDoc said "we open read-write without truncating" but the code uses
O_WRONLY. Wording corrected to "write-only".

194/194 tests pass.

* fix(core): fix flaky inode-swap test + apply review follow-ups

PR #4431 review follow-up (glm-5.1 via /review) — 7 suggestions adopted,
1 partially adopted, 0 rejected:

CI FIX (Ubuntu test failure on tmpfs inode reuse):

The EOWNERSHIP_CHANGED inode-swap test used unlink+create to simulate
a post-stat swap. On Linux tmpfs the freshly-freed inode number is
often reused by the immediately-following create, so dev+ino remained
identical and the guard didn't trip (intermittent on Ubuntu CI; macOS
APFS happened to allocate different inodes). Switched to rename(decoy,
target) which moves an existing distinct inode into place, guaranteed
to differ from the original.

CODE:

- Wrap fh.writeFile failure after fh.truncate(0) with
  EINPLACE_WRITE_FAILED + cause, so callers see explicitly that the
  file was truncated and the write didn't complete (otherwise they
  see raw ENOSPC/EIO and may wrongly assume the original is intact
  given this lives in atomicFileWrite.ts).
- Skip fh.chmod when euid is neither root nor expectedStat.uid —
  chmod is guaranteed to fail with EPERM in that case (POSIX requires
  owner or root). Avoids a guaranteed-failing syscall on every call.
- Caller catches ENOENT from writeInPlaceWithFdGuards and falls
  through to atomic rename path. If the file was deleted between
  caller's stat and our open there is no ownership to preserve; the
  rename path correctly creates a new file at targetPath.

DOC:

- Replaced "defends against four races" with "hardened against
  post-stat races" (the bullet list has 5 items, the count was wrong).
- Reworded "non-regular targets must not reach this function" to
  describe defense-in-depth — O_NONBLOCK + !fdStat.isFile() reject
  post-stat regular→FIFO/socket/device swaps. The old wording made
  it look like O_NONBLOCK was redundant.
- Documented the dual chmod behavior (root vs non-root with foreign
  uid) inline.

TESTS:

- Added happy-path test for writeInPlaceWithFdGuards (write succeeds,
  inode preserved, mode preserved).
- Added ENOENT regression test (verifies the missing-O_CREAT
  property — if file unlinked between stat and open, no silent
  recreate with caller's uid).
- Renamed the misleading "O_NOFOLLOW guard" test (it actually tests
  resolve-through-symlink, not O_NOFOLLOW) to reflect what it does,
  and added a direct ELOOP test that drives writeInPlaceWithFdGuards
  with a path whose final component is a symlink — that's the real
  O_NOFOLLOW exercise.
- Fixed the FIFO test to pass a stat captured from the FIFO itself
  (not a stale regular-file stat) so only the FIFO-specific defense
  fires, not the inode/dev mismatch from a different file.

NOT ADOPTED:

- Skip-when-non-root chmod optimization adopted (small, useful), but
  the larger "structured chmod error model" deferred — best-effort
  matches the existing tryChmod pattern at file scope.

197/197 tests pass.

* fix(core): wrap truncate err + post-write nlink check + guard close + chmod sync

PR #4431 review follow-up (qwen-latest-series-invite-beta-v34 via /review)
— 7 of 10 suggestions adopted, 3 deferred:

CODE:

- **EINPLACE_TRUNCATE_FAILED wrap** (review #3291863048): symmetric to
  the existing EINPLACE_WRITE_FAILED — distinguishes "truncate failed,
  original intact" from "write failed post-truncate, original lost".

- **Post-write nlink === 0 check** (review #3291863059):
  EINODE_UNLINKED_DURING_WRITE detects the fstat-to-close window where
  a concurrent rename-over drops our bound inode's link count to zero
  and our write goes to an anonymous inode close will free. Silent
  data loss path now surfaces.

- **fh.close() guarded in finally** (review #3291863044): close failure
  on NFS/FUSE was masking the original try-body exception (including
  the meaningful EOWNERSHIP_CHANGED, EINPLACE_*, EINODE_*). flush:true
  already fsync'd, so close-after-flush is best-effort.

- **fdStat.uid in canChmod** (review #3291863055 part 1): use the
  fd-bound verified value instead of expectedStat.uid. Defense in depth
  — a future weakening of the fstat guard won't silently widen chmod
  privilege.

- **fh.sync() after chmod** (review #3291863053): chmod is metadata,
  not covered by writeFile({ flush: true }). A crash before lazy
  metadata flush would lose the mode restoration (matters for
  setuid/setgid). One extra syscall, best-effort.

- **@remarks freshness contract** (review #3291863051 partial): JSDoc
  now spells out that expectedStat MUST be a fresh stat captured
  immediately before the call. Stale stats nullify every guard.

- **Concurrent-writer limitation noted** (review #3291863061 partial):
  added a "Known limitation — no advisory locking" paragraph to JSDoc
  rather than adopting flock (Linux-specific, NFS issues, scope
  expansion). Callers needing multi-process coordination should layer
  their own lockfile.

- **@throws documentation** (review #3291863051 partial): four
  documented error codes (EOWNERSHIP_CHANGED, EINODE_UNLINKED_DURING_WRITE,
  EINPLACE_TRUNCATE_FAILED, EINPLACE_WRITE_FAILED).

TESTS:

- **EINPLACE_WRITE_FAILED via FileHandle.prototype.writeFile monkey-patch**
  (review #3291863040): triggers the data-loss path, asserts the wrapped
  code + message + cause, and verifies the file is empty (truncate ran).

- **canChmod=false actually skips chmod** (review #3291863055 part 2):
  prior uid-mismatch test had desiredMode === current mode, couldn't
  distinguish "skipped" from "no-op". New test uses desiredMode=0o755
  on a 0o644 file under canChmod=false → asserts mode stays 0o644.

NOT ADOPTED:

- ENOENT/ELOOP/ENXIO catch extension (review #3291863043): keeping the
  strict refusal for swap-to-special-file. Silent fallthrough-to-replace
  was pre-PR atomic-rename behavior, but in shared-write workspaces
  (this PR's target users) a special-file appearing at the target path
  is a signal worth surfacing, not papering over.

- Diagnostic logging (review #3291863049): the function has no logger
  dependency today; adding one is an architecture decision outside
  this PR's scope. The path taken is implied by the side effects
  (inode preserved vs new) but agreed: out-of-band telemetry would
  help ops. Defer to follow-up.

- flock advisory locking (review #3291863061 main): scope expansion;
  Linux-specific semantics, NFS edge cases. Documented as known
  limitation instead.

- Integration test for ENOENT fallthrough at atomicWriteFile level
  (review #3291863043 part 1): ESM module bindings prevent monkey-
  patching writeInPlaceWithFdGuards from outside. The unit test for
  the helper's ENOENT path covers the throwing behavior; the catch is
  3 lines and review-visible. Defer until a refactor opens an
  injection seam.

- Error code string constants export (review #3291863051 part 3): two
  codes don't merit a constant module. Magic strings are fine at this
  size.

199/199 tests pass.

* docs(core): sync writeRuntimeStatus JSDoc with conditional-atomic contract

PR #4431 review follow-up: function-level JSDoc still claimed
unconditional "Atomically write" and "never sees a partially written
file", inconsistent with the module-level docblock updated in earlier
commits. Updated to describe the conditional-atomic behavior (atomic
when uid/gid matches, in-place fallback when ownership differs) and
explicitly note the concurrent-reader visibility trade-off in the
fallback path. Links to atomicWriteJSON for the full contract.

Doc-only change. 199/199 tests pass.

* fix(core): add explicit fh.sync() — FileHandle.writeFile ignores flush option

PR #4431 review follow-up (qwen3.7-max via /review):

CRITICAL — FileHandle.writeFile silently ignores flush:

Node.js FileHandle.writeFile takes an early-return path that bypasses
the flush option entirely (the option is only honored on the
path-based fs.writeFile form). Our previous code passed
{ flush: true } to fh.writeFile and relied on the implicit fsync.
The only explicit fh.sync() was nested in the chmod block guarded by
canChmod — which is FALSE precisely when a non-root group member
writes to a group-writable file they don't own (the exact shared-write
scenario this PR targets). Net effect: in that branch, zero fsync.
Data sits in the kernel page cache; a crash before lazy flush leaves
the file empty (truncate succeeded) or partially written.

Fix:
- Drop flush from the fhWriteOptions object (silently ignored anyway).
- Add an explicit `fh.sync()` after writeFile succeeds, gated on
  options.flush. Runs BEFORE the chmod block so the canChmod=false
  branch also fsyncs.
- The chmod-block fh.sync() becomes metadata-only (covers the mode
  change), as the data is already on disk.

Updated comments to reflect the actual semantics rather than the
incorrect "writeFile({ flush: true }) fsyncs" assumption.

TESTS (partial adoption of review #3293252349):

- EINPLACE_TRUNCATE_FAILED: sibling test to EINPLACE_WRITE_FAILED.
  Monkey-patches FileHandle.prototype.truncate to throw EIO; asserts
  err.code + cause + "original content is intact" message, and
  verifies the file's original bytes are unchanged (truncate didn't
  run).
- Buffer in in-place fallback: locks in binary fidelity (byte-exact
  comparison) so a future encoding-passthrough regression for Buffer
  data would be caught.

NOT ADOPTED in this commit:

- EINODE_UNLINKED_DURING_WRITE test: requires post-write fh.stat()
  mocking with call-count discrimination (first call: real stat for
  verification; second call: nlink=0). The monkey-patch pattern works
  but is fragile; deferred to a follow-up that may also refactor the
  helper to accept an injectable stat fn for cleaner testability.

201/201 tests pass.

* fix: correct stale flush comment + add fh.sync() regression test

- Fix misleading close() comment that said "flush:true already
  fsync'd" — the explicit fh.sync() does the actual fsync, not the
  flush option (which is silently ignored on FileHandle.writeFile).
- Add regression test verifying fh.sync() is called when flush:true
  and skipped when flush is absent, preventing silent removal of the
  core durability fix.

Addresses wenshao review threads from 2026-05-23.

* test: add EINODE_UNLINKED_DURING_WRITE regression test

Monkey-patches FileHandle.stat to return nlink:0 on the post-write
check, verifying the nlink guard throws with the correct error code.
Addresses wenshao review from 2026-05-28.

* simplify: replace writeInPlaceWithFdGuards with plain fs.writeFile

Address yiliang114's review (CHANGES_REQUESTED):

1. [Critical] Remove ~120 lines of fd-level TOCTOU hardening
   (writeInPlaceWithFdGuards) — over-engineering for a local CLI.
   The in-place fallback now uses plain fs.writeFile + tryChmod,
   matching the EXDEV fallback pattern.

2. [Suggestion] Fix macOS GID false-positive: only compare uid in
   ownershipWouldChange(). macOS inherits parent dir GID for new
   files, so egid !== file.gid was a false positive that needlessly
   dropped crash atomicity.

3. [Suggestion] Trim 60+ lines of JSDoc to project style (AGENTS.md:
   "default to none, add only when WHY is non-obvious").

Net: -748 lines. 24 tests pass.

* fix: restore Stats type import (TS2304 build failure)

* docs: narrow scope from uid/gid to uid-only preservation

The gid check is intentionally skipped because macOS inherits the
parent directory's GID for new files, making egid !== file.gid a
false positive. Update comments and PR description to match the
actual implementation scope.

* test: add inode assertion to symlink ownership-mismatch test

Proves the in-place fallback actually ran instead of atomic rename.

* Improve hooks matcher display (#4545)

* feat(cli): improve hooks matcher display

* test(cli): cover hooks navigation levels

* fix(cli): use session channel when closing ACP sessions (#4522)

Detach closeSession/killSession from the session entry's owning channel instead of the current attach target, so the correct channel is decremented and killed during channel overlap (old channel dying while a fresh channel is current). Extracts findChannelInfoForEntry/detachSessionIdFromEntryChannel helpers with unit + integration coverage. Fixes #4325.

* fix(core,cli): replace full-history structuredClone with shallow/tail variants to prevent OOM on resume (#4644)

* fix(core,cli): replace full-history structuredClone with shallow/tail variants to prevent OOM on resume

Several UI and service call sites clone the entire chat history via
structuredClone(getHistory()) every turn. On a resumed session with
thousands of entries, each clone allocates 150-200 MB transiently.
When multiple async side-requests overlap (suggestion generation,
auto-title, checkpointing), multiple clones coexist on the heap,
pushing V8 past its limit within 10 turns (2 GB heap cap).

Changes:
- AppContainer.tsx: use getHistoryTail(40, true) instead of
  getHistory(true) + slice(-40)
- btwCommand.ts: same pattern, use getHistoryTail(40, true)
- sessionTitle.ts: use getHistoryShallow() (read-only filtering)
- sessionRecap.ts: use getHistoryShallow() (read-only filtering)
- useGeminiStream.ts: use getHistoryShallow() for checkpoint
  serialization (only needs to survive JSON.stringify)

Closes #4624

* fix(test): update mocks for getHistoryShallow/getHistoryTail in sessionTitle and btwCommand tests

* fix(cli): migrate remaining getHistory() clone sites to shallow/tail variants

- AppContainer.tsx rewind path: getHistory() → getHistoryShallow()
  (only used read-only by computeApiTruncationIndex)
- Session.ts ACP rewind: getHistory() → getHistoryShallow()
  (only walks entries to compute truncation index)
- Session.ts stop-hook: getHistory() + filter(.model).pop() →
  getLastModelMessageText() (O(1) backward scan, no clone)

* fix(core): use client-level getHistoryShallow with fallback

sessionTitle.ts and sessionRecap.ts were calling
chat.getHistoryShallow() directly, bypassing the client-level
wrapper that provides a getHistory() fallback when the chat
implementation doesn't support shallow reads. Use
geminiClient.getHistoryShallow() instead.

Update test mocks to match the new call site.

* fix(test): add getHistoryShallow and getLastModelMessageText to Session test mocks

Session.ts now calls chat.getHistoryShallow() in rewindToTurn and
chat.getLastModelMessageText() in the Stop hook. Update all mockChat
instances in Session.test.ts to provide these methods.

* feat(cli): add respectUserColors and hideContextIndicator options for statusline (#4670)

* feat(cli): add respectUserColors option to preserve ANSI colors in
     statusline command output

* test(cli): add respectUserColors tests for useStatusLine and Footer

* feat(cli): add hideContextIndicator option to hide built-in context usage in footer

* docs: update statusline configuration docs with respectUserColors and hideContextIndicator

* fix(core): tolerate unsupported Streamable HTTP GET SSE (#4521)

Fixes #4326

* fix(insight): Harden insight facet normalization and empty qualitative handling (#3557)

* Harden insight facet normalization and empty qualitative handling

* feat: enhance AtAGlance component to accept target sections for dynamic rendering

* feat(cli): notify when background shells finish (#4355)

* feat(core): add simplify bundled skill (#3570)

* feat(core): add simplify bundled skill

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(cli): stabilize SettingsDialog restart prompt test

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(skills): use agent tool instead of task in simplify skill

The simplify skill referenced the 'task' tool for launching review passes,
but Qwen Code exposes 'agent' as the callable subagent tool ('task' is only
a legacy permission alias). Using 'task' would cause /simplify to stall when
trying to launch parallel review passes.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* docs: document simplify bundled skill

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* Update packages/core/src/skills/skill-manager.test.ts

Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>

* fix(core): repair simplify skill tests

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* Update packages/core/src/skills/bundled/simplify/SKILL.md

Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>

* fix(skills): address simplify review feedback (read-only passes, gitignore scope, safer dead-code removal)

- drop inert `argument-hint` frontmatter (argumentHint is never parsed or
  rendered anywhere; no other bundled skill uses it)
- mark Step 2 review passes read-only so edits stay isolated to Step 4
- narrow the no-diff fallback to `git ls-files --modified --others
  --exclude-standard` so ignored build output is excluded
- require a repo-wide caller check before removing code
- make the commands.md row state it edits code directly
- assert non-conflicting bundled skills survive cross-level dedup

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

---------

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>
Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>

* feat(skills): add agent reproduction workflows (#4118)

* chore(skills): add codex reproduce workflows

* feat(agent-reproduce): implement agent reproduction workflow and supporting scripts

* feat(skills): capture reference agent state diffs

* feat(cli): virtual viewport for long conversations on ink 7 (#4146)

* chore(deps): re-upgrade ink 6 → 7.0.3 (upstream Static remount fix landed)

PR #3860 first upgraded ink 6 → 7.0.2. PR #4083 reverted because of a
TUI regression: `<Static>` did not re-emit items when its `key` prop
was bumped, so `/clear` / Ctrl+O / refreshStatic left the history area
blank under ink 7.0.2.

ink 7.0.3 (released after #4083) contains the exact fixes:

  - be9f44cda Fix: <Static> remount via key change drops new items (#948)
  - 669c4386c Fix: Drop stale <Static> output from fullStaticOutput on identity change (#950)
  - 7c2267c01 Fix `useBoxMetrics` not accepting ref objects with an initial null value (#945)

Changes:
  - `ink` ^6.2.3 → ^7.0.3 (root hoist + cli direct)
  - `react` ^19.1.0 → ^19.2.4 (cli direct; ink 7.0.3 peerDeps requires >=19.2.0)
  - `react`/`react-dom` overrides ^19.2.4 added so the transitive graph
    stays deduped to a single instance (avoids `Invalid hook call` from
    multiple React copies, the classic ink-upgrade hazard)
  - `wrap-ansi` already on ^10.0.0 from #4083's partial-revert (no change)

Verified:
  - `npm ls ink` → single `ink@7.0.3` across all peer deps
  - `npm ls react` → single `react@19.2.4`
  - `npm run typecheck --workspace=@qwen-code/qwen-code` clean
  - `npm run typecheck --workspace=@qwen-code/qwen-code-core` clean
  - Composer.test.tsx 20/20, MainContent.test.tsx 6/6, TableRenderer.test.tsx
    59/59 + 1 skipped — all key UI components green on the new ink

The Static-remount regression is upstream-fixed in 7.0.3, so the
runtime path is restored without needing #3941's overflowY-self-managed
viewport. #3941 (virtual viewport) remains an opt-in performance
feature on top.

* fix(deps,cli): add @types/react overrides + move refreshStatic out of setCurrentModel updater

Two follow-ups from the multi-round audit of the ink 7.0.3 re-upgrade:

1. @types/react / @types/react-dom now pinned to ^19.2.0 in root
   overrides. packages/web-templates still declares @types/react ^18.2.0
   in its devDeps. Today the CLI build is unaffected (web-templates's
   18.x types are nested in its own node_modules and the React-using
   src/insight and src/export-html files are excluded from its tsconfig
   build), but a future reincludes-or-hoist accident would land
   conflicting global JSX namespaces in the CLI compile graph. Match
   the dep dedup we already enforce for `react` and `react-dom` so the
   type graph stays as deduped as the runtime graph.

2. AppContainer's onModelChange handler was calling refreshStatic() as
   a side-effect inside the setCurrentModel updater. React.StrictMode
   double-invokes state updaters in dev, so model swaps fired two
   clearTerminal writes + two <Static> key bumps. The double work was
   masked under ink 6 (key changes were no-ops on <Static>), but ink
   7.0.3 honors key changes — the doubled work is now potentially
   visible as a faster flash-flash on every model switch.

   Refactor: setCurrentModel becomes a pure setter; refreshStatic
   moves into a useEffect keyed on currentModel with a ref-comparison
   guard so the first render doesn't fire. Single clearTerminal write
   per real model change, even under StrictMode.

Verified: npm ls ink → single 7.0.3, npm ls react → single 19.2.4,
npm ls @types/react → 19.2.10 hoisted (npm flags web-templates's 18.x
constraint as overridden, which is the intended behavior). Typecheck
clean across cli + core workspaces.

* docs(design): virtual viewport on ink 7 — analysis + PR sequence

Captures the architectural analysis of how to thoroughly close the
flicker / refresh-storm class of issues (#2950, #3118, #3007, #3838 UI
side, #3899 follow-on) using a virtualized history viewport.

- Surveys claude-code (forked ink) and gemini-cli (@jrichman/ink +
  ScrollableList + VirtualizedList) reference implementations.
- Confirms ink 7 already exposes the primitives needed
  (`useBoxMetrics`, `measureElement`, `useWindowSize`,
  `useAnimation`) — no fork swap required.
- Picks porting gemini-cli's virtualized list components to ink 7 with
  `ResizeObserver` -> `useBoxMetrics` and a custom `StaticRender`.
- Splits the work into V.0..V.4 PRs with scope, dependencies, risk.
- Lists open questions + 11-item approval checklist that must clear
  before V.0 implementation begins.

This is a docs-only PR per the project's design-first workflow. No
runtime code changes.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(cli): virtual viewport for long conversations on ink 7

Port gemini-cli's VirtualizedList + ScrollableList to stock ink 7,
adapting for ink 7's available primitives:

- `overflowY="hidden"` + `marginTop={-scrollTop}` instead of ink-fork's
  `overflowY="scroll"` (ink 7 has proper clip/unclip in render-node-to-output)
- `useBoxMetrics` inside each VirtualizedListItem (Option A) instead of a
  single ResizeObserver WeakMap; reports height changes via onHeightChange
  callback so the parent can update its heights record
- Custom `StaticRender` as `React.memo` with a reference-equality comparator,
  keyed on `itemKey-static-{width}` to freeze completed conversation items
- Character scrollbar column (`│` track / `█` thumb) since ink 7 has no
  native scrollbar prop
- No ScrollProvider / mouse drag (deferred to a follow-up PR)

Wire into MainContent.tsx behind `ui.useTerminalBuffer` setting (Settings
dialog → UI → Virtualized History; default false — opt-in).

Key bindings: Shift+↑/↓ (line), PgUp/PgDn (page), Ctrl+Home/End (top/bottom).

Re-render optimisations:
- renderItem wrapped in useCallback so renderedItems useMemo only recomputes
  when actual deps change (not on every streaming tick)
- Completed history items passed by original object reference so
  VirtualHistoryItem = memo(HistoryItemDisplay) can bail out on stable props
- estimatedItemHeight / keyExtractor / isStaticItem defined as module-level
  constants with no closure deps

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(cli): add test coverage for virtual viewport scroll bindings and settings

- keyMatchers.test.ts: 6 new test cases for SCROLL_UP/DOWN, PAGE_UP/DOWN,
  SCROLL_HOME/END commands (41 tests total)
- settingsSchema.test.ts: assert ui.useTerminalBuffer is boolean, default false,
  showInDialog true, requiresRestart false

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(cli): use ink 7 native overflow for VP pending items

In VP mode, pending items are rendered inside VirtualizedList's
overflowY="hidden" container, which uses ink 7's native clipping
as the viewport guard. Remove the availableTerminalHeight JS-
truncation bound from pending items in renderVirtualItem:

- JS truncation at terminal height would silently cut off content
  the user could scroll to read within the virtual viewport.
- ink 7 overflowY="hidden" on the VirtualizedList container is the
  correct clip guard — no JS line-counting workaround needed.
- Remove uiState.constrainHeight from renderVirtualItem deps (no
  longer referenced in the VP rendering path).

The legacy <Static> path is unchanged.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* perf(cli): binary-search offsets in virtualized list hot path

Replace linear findLastIndex / findIndex scans on the offsets array with
upperBound. Offsets are monotonic by construction, so the lookups inside
the render body and getAnchorForScrollTop drop from O(n) to O(log n).
Material for thousand-turn sessions where the lookup runs on every frame.

* fix(cli): wire ShowMoreLines + skip clearTerminal in VP mode

Two audit-found bugs in the VP path:

1. `<ShowMoreLines>` was outside the `<OverflowProvider>` that wraps
   `<ScrollableList>` in VP mode. `useOverflowState()` returns
   `undefined` outside the provider, so the component returned `null`
   and the "press ctrl-s to show more lines" affordance silently
   disappeared. Move `<ShowMoreLines>` inside the provider so the hook
   sees the live overflow state, matching the legacy path.

2. `refreshStatic()` and `repaintStaticViewport()` wrote
   `clearTerminal` / `cursorTo+eraseDown` to the host terminal
   unconditionally. In VP mode the React tree owns the visible region
   via ink 7's native `overflowY="hidden"` clipping — the physical
   write is a wasted flash on Ctrl+O / Alt+M / model change / resize.
   Guard both writes on `useTerminalBuffer === false`. The
   `historyRemountKey` bump still fires so the legacy `<Static>`
   fallback would still remount if someone toggled the setting mid-
   session.

Extends the targeted-repaint pattern introduced in #3967 to all
refreshStatic call sites, gated by the VP setting instead of by event
type.

* fix(cli): VP renderItem stability + source-copy offsets + heights GC

Three audit-found regressions tightened, in order of severity:

1. **Source-copy index offsets missing in VP** — legacy `<Static>` path
   threads per-item `sourceCopyIndexOffsets` so `/copy mermaid N` /
   `/copy latex N` hints stay stable across continuation messages. VP
   `renderVirtualItem` was not passing this prop, so the copy hints
   shown under each diagram drifted on every `gemini_content` chunk
   (the clipboard mechanism itself still worked from raw history; only
   the displayed number was wrong). Add two lookup tables —
   identity-keyed for static items, index-keyed for pending — without
   changing the VirtualizedList data signature, and thread offsets in
   both render branches.

2. **`renderVirtualItem` callback invalidated on every streaming tick**
   — its deps included `activePtyId` / `embeddedShellFocused` /
   `isEditorDialogOpen`, all of which flip mid-stream when a shell
   tool runs or a dialog opens. Each flip rebuilt the callback,
   invalidated `VirtualizedList.renderedItems`'s useMemo, and forced
   every static item to re-render through `<StaticRender>` — defeating
   the very memoization the design relies on. Move the three pending-
   only fields into a ref read inside the callback. Static-item closure
   now depends only on inputs that legitimately affect static output
   (terminalWidth, slashCommands, getCompactLabel, …). Pending items
   still re-render correctly because their item identity changes per
   tick, so the callback is called fresh each time and reads the
   latest ref.

3. **`pending` items now honour `constrainHeight`** in VP, matching the
   legacy path. Previously VP unconditionally passed `undefined` for
   `availableTerminalHeight` on pending, relying on the viewport
   `overflowY="hidden"` clip to limit visible size — but that hid the
   `<ShowMoreLines>` affordance from the user. Now that ShowMoreLines
   is correctly wired (previous commit), restore parity.

4. **Heights map memory leak** in `VirtualizedList` — `setHeights` only
   grew. Each `/clear` left orphan `h-N` keys; each pending → completed
   transition left orphan `p-N` keys. Add a `useLayoutEffect` that
   prunes entries whose keys are not in the current `data`. Runs in
   layout phase so the prune commits in the same paint as the data
   change — no stale-offsets frame.

* test+fix(cli): VP path coverage + stabilize absorbedCallIds empty Set

Completion-pass artifacts driven by the multi-agent audit:

- Settings description rewritten to enumerate the symptoms VP fixes so
  users with active flicker reports can find the toggle without reading
  the design doc.
- `absorbedCallIds` returns a module-level constant Set when compact mode
  is off, instead of a fresh `new Set()` per render. Fixes a hidden
  cascade: `activePtyId` flip mid-stream → useMemo runs → returns a new
  empty Set → `isSummaryAbsorbed` rebuilds → `renderVirtualItem`
  rebuilds → `VirtualizedList.renderedItems` recomputes → every static
  item re-renders. With the constant, the cascade dies at the source.
  Helps both VP and legacy paths.
- VP-path unit tests for MainContent (4 cases): ScrollableList mounts
  and Static does not when `useTerminalBuffer: true`; ShowMoreLines is
  reachable in VP mode (regression of the OverflowProvider mis-wrap);
  source-copy index offsets thread into renderItem for static items;
  renderItem callback identity is stable across `activePtyId` flips
  (proves the ref-based read keeps StaticRender memo effective).

* fix(cli): stabilize absorbedCallIds in compact mode + gate heights prune + tighten ShowMoreLines test

Round-2 audit follow-ups. Three real findings addressed; one flagged
false positive documented separately.

1. **absorbedCallIds Set identity now content-stable when compact mode is
   on.** The earlier EMPTY constant only short-circuited the compactMode=
   false path; when compact mode is enabled (some users default-on it),
   activePtyId / embeddedShellFocused flips during streaming still
   produced fresh Sets per render even when membership was unchanged,
   restarting the same cascade the pendingStateRef fix was meant to
   avoid. Compare-and-reuse via a ref: if the new Set has identical
   membership to the previous one, return the previous reference.

2. **`heights` map prune in `VirtualizedList` is gated.** Previously
   every streaming tick rebuilt an N-key Set and walked all heights,
   even on the steady-state path where nothing changes. Now only fires
   when the heights record has clearly outpaced live data
   (`size > max(8, 2 × data.length)`) — covers `/clear` and accumulated
   pending → completed transitions, skips the 30-Hz hot path entirely.

3. **VP ShowMoreLines test now actually verifies overflow connectivity.**
   Previous mock unconditionally rendered "SHOW_MORE", so the test only
   proved the JSX mounted — it would still pass if a future refactor
   moved `<OverflowProvider>` out of the VP tree again. The mock now
   reads `useOverflowState()` and emits "OVERFLOW_DISCONNECTED" when the
   context is missing. The VP test asserts both presence of "SHOW_MORE"
   and absence of the disconnected marker, so the regression is now
   caught.

Not addressed:
- Audit P0-1 claim that `renderMode` (Alt+M) / model-change updates
  don't reach VP static items: false positive. `renderMode` is a React
  Context (`RenderModeContext`), and Context propagation traverses the
  tree past `memo` boundaries — MarkdownDisplay's `useRenderMode()`
  consumer re-renders on context change regardless of whether
  `StaticRender` bails out. Verified by reading
  `packages/cli/src/ui/contexts/RenderModeContext.tsx` and
  `MarkdownDisplay.tsx:172`. No code change.
- Audit P1-2 pendingStateRef write-during-render race: speculative,
  relies on a multi-pass render path React 18+ does not currently use.
  Documented assumption in the existing inline comment.

* fix(cli): isolate renderItem errors + defensive height coerce + compact-mode mergedHistory stability

Round-3 audit follow-ups. Three real findings; the rest verified clean.

1. **`renderItem` errors no longer crash the CLI.** Previously a throw
   inside a per-item render propagated through `VirtualizedList`'s
   useMemo into React's commit phase, tearing down the whole Ink tree —
   one bad history record could nuke the session. Wrap each call in a
   try/catch and substitute a small red `[render error] …` text box on
   failure. The row stays in the viewport so the user can scroll past
   it.

2. **Defensive height coerce in offset accumulation.** A buggy
   `estimatedItemHeight` returning NaN / negative / Infinity would
   poison every downstream offset and break the `upperBound` /
   `findLastLE` binary search (which assumes monotonic offsets). Clamp
   to `Number.isFinite(raw) && raw > 0 ? raw : 0`. No-op for the
   in-tree estimators that return 3; insurance against future
   consumers.

3. **`mergedHistory` is content-stable when compact mode is on.** The
   Round-2 absorbedCallIds stability fix didn't reach this path:
   `mergeCompactToolGroups` always allocates a fresh array, and
   `mergedHistory`'s useMemo lists `activePtyId` / `embeddedShellFocused`
   as deps, so every streaming tick mid-shell-tool produced a new array
   even when items aligned. Cascade went `mergedHistory` → offsets map
   → `renderVirtualItem` → every static item re-rendered. Pair-wise
   compare new vs previous and return the previous reference when items
   align. Restores StaticRender memo effectiveness for compact-mode
   users.

Not addressed (audit findings deemed not worth fixing in this PR):
- `scrollToItem` silently no-ops when item is not in data — no current
  caller checks the return value, low impact.
- `allVirtualItems` array spread is O(n) per streaming tick — real but
  not a crash; revisit in a perf-focused follow-up.
- `itemRefs.current` is dead surface (never read) — cosmetic.
- StrictMode-only-in-DEBUG double-invoke paths verified safe.

* test+chore(cli): VP review round 4 — VirtualizedList/useBatchedScroll coverage + cleanups

Addresses wenshao's CHANGES_REQUESTED review on PR #3941.

- Add focused unit tests for `VirtualizedList` (9 cases) covering empty
  data, `renderStatic` full-render, `initialScrollIndex` with
  `SCROLL_TO_ITEM_END`, `targetScrollIndex` anchoring, imperative
  `scrollToEnd` / `scrollToIndex`, per-item `renderItem` error isolation,
  NaN/negative estimator coercion, and out-of-range `initialScrollIndex`
  clamping.
- Add `useBatchedScroll` unit tests (4 cases) covering initial reads,
  pending-value reads in the same tick, post-commit pending reset, and
  callback identity stability across rerenders.
- Remove dead `itemRefs` / `onSetRef` plumbing (declared, written, never
  read; `useCallback` with empty deps was also a stale-closure trap).
- Remove unused `isStatic?: boolean` from `VirtualizedListProps`
  (only `isStaticItem` is actually consumed).
- Tighten the render-phase setState block: each setter is now guarded
  by an equality check so React bails out of redundant updates, and a
  comment documents that this is the React-endorsed "adjusting state
  while rendering" pattern (the synchronous update avoids a one-frame
  flash at the previous position when `targetScrollIndex` changes).

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* chore(cli): remove dead `dataRef` from VirtualizedList (round-4 followup)

Declared and written in a `useLayoutEffect` on every `data` change but
never read anywhere in the component. Flagged in wenshao's round-4 review
of PR #3941.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): collapse model-change effect back into one batched handler

wenshao's PR #4119 review correctly flagged that splitting the
onModelChange flow into two effects (b25831b0e) reintroduced the
issue #3899 freeze regression on every model switch:

  1. setCurrentModel(model) commits first, with the OLD
     historyRemountKey.
  2. <Static key={`${historyRemountKey}-${currentModel}`}> sees its
     key change (because currentModel did) and remounts immediately.
  3. MainContent's render-phase progressive-replay reset only fires
     when historyRemountKey changes, so replayCount is still the
     full mergedHistory.length from any prior catch-up.
  4. The remounted Static dumps the entire history in one synchronous
     layout pass — exactly the freeze progressive replay was added
     to avoid (#3899). The second effect's refreshStatic() bump
     arrives a render too late.

Fix: do not split. Both side effects (refreshStatic, which writes
clearTerminal + bumps historyRemountKey, and setCurrentModel) live
in the event handler again, with a ref guard for same-model
notifications. The React.StrictMode concern that motivated b25831b0e
is addressed by keeping the side effect OUT of the setState updater
(it now runs once per event-handler invocation, not once per
double-invoked updater call). Both setState calls land in the same
React batch, so historyRemountKey and currentModel update together —
MainContent's render-phase reset sees the new key, replayCount drops
to the first chunk, and Static remounts with chunked replay intact.

Tests:
- AppContainer.test.tsx: 4 new tests covering the synchronous
  refreshStatic side-effect contract, same-model no-op, ref-guarded
  StrictMode double-invoke, and unsubscribe-on-unmount.
- MainContent.test.tsx: new regression guard — when currentModel
  changes but historyRemountKey is held constant, progressive replay
  must NOT reset (pins the MainContent invariant the two-effect
  refactor accidentally relied on).

Verified: vitest packages/cli AppContainer + MainContent green (82/82).
Typecheck clean.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix+docs(cli): VP review round 5 — typecheck, doc drift, scroll keys

PR #4146 review feedback (wenshao + Claude Opus 4.7 audit) addressed:

Code:
- MainContent.test: activePtyId typed as number (was 'pty-xyz' string,
  broke tsc with TS2322 — the test only relies on reference change so
  any number works).
- VirtualizedList: sanitize renderItem error path. Display becomes the
  generic `[render error]` marker; full err goes to debugLogger.debug
  so file paths / partial tool state don't leak to scrollback.
- MainContent: move pendingSourceCopyOffsetsByIndex into a ref so it
  no longer rebuilds renderVirtualItem identity every streaming tick.
  Without this, VirtualizedList.renderedItems useMemo invalidated
  per-tick → JSX rebuilt for every visible item → memo(HistoryItem
  Display) was still bailing but allocations were O(visible) per tick.
- AppContainer: drop the misleading "state-driven scroll reset" claim
  in the VP refreshStatic comment. VP is intentionally near-no-op:
  the React tree owns the visible region, mergedHistory mutation is
  what refreshes the screen, and the remount-key bump is preserved
  only to keep the legacy Static branch in sync if the user toggles
  the flag off mid-session.
- StaticRender: rewrite JSDoc to match reality. The custom React.memo
  is NOT output caching like @jrichman/ink's StaticRender export;
  the comparator rarely matches (parent allocates fresh JSX); the
  real skip happens at memo(HistoryItemDisplay) one level deeper.

Docs:
- docs/design/virtual-viewport: sync file map (drop non-existent
  ScrollProvider.tsx / useAnimatedScrollbar.ts), PR sequence (one PR
  #4146, V.3-V.5 deferred), open-question + checklist resolution for
  #3905 (superseded) and base branch rename.
- docs/users/reference/keyboard-shortcuts: document the 6 VP scroll
  keys (Shift+↑/↓, PgUp/PgDn, Ctrl+Home/End) under a "History
  scrollback (when ui.useTerminalBuffer is on)" section. Previously
  the only discovery path was the Settings dialog description.

Verified: tsc --noEmit -p packages/cli ✓, vitest 160/160 ✓ across
AppContainer / MainContent / VirtualizedList / useBatchedScroll /
keyMatchers / settingsSchema, eslint clean on touched files.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(cli): SGR mouse wheel scroll in VP mode

Recovers the most-felt UX regression vs legacy `<Static>` mode: when
`ui.useTerminalBuffer` is on, legacy users lose mouse wheel as a way
to scroll history (the host terminal stopped seeing the conversation
in its scrollback buffer). This PR enables button-event tracking
(`?1002h`) + SGR coordinates (`?1006h`) while the ScrollableList has
focus, parses wheel events off stdin, and routes them to scrollBy.

Scope kept tight on purpose:
- Wheel only. Hit-testing for scrollbar drag / click-to-position
  needs screen-absolute element coords; stock ink 7's useBoxMetrics
  returns yoga's parent-relative layout. Deferred to V.4 with two
  exit paths (upstream getBoundingBox to ink 7, or local yoga walker).
- Mouse mode is enabled only while ScrollableList is mounted; non-VP
  users never see their terminal flipped into button-event tracking.
- Side effect: native click-and-drag text selection is captured by
  the program. Docs + settings dialog description now spell out the
  Shift / Option (macOS) bypass.

Implementation:
- `ui/utils/mouse.ts` — SGR + X11 parser, ported and trimmed from
  gemini-cli (Google LLC, Apache-2.0). Single-consumer.
- `ui/hooks/useMouseEvents.ts` — enable/parse/disable lifecycle
  hook. Listens on stdin via `useStdin().stdin`, runs handler
  through a ref so callers don't have to memoize.
- `ui/components/shared/ScrollableList.tsx` — subscribe to mouse
  events, route wheel → `scrollBy(±3)`. Also drops a dead outer
  `<Box flexGrow={1}>` wrapper that held an unread containerRef
  and collapsed to zero height in ink-testing-library (the test
  renderer has no flex parent, so flexGrow=1 → 0 height → no items
  ever rendered, which is how this dead code was exposed).

Tests:
- `ui/utils/mouse.test.ts` — 14 cases: SGR parsing (wheel, presses,
  modifiers, move), X11 parsing, fallback chain, incomplete-sequence
  guard (including the >50-byte garbage cap).
- `ui/components/shared/ScrollableList.test.tsx` — 3 cases: wheel
  events shift the rendered window; hasFocus=false makes the mouse
  pipeline inactive (no throw); non-wheel events leave the window
  unchanged. Renders are wrapped in `<KeypressProvider>` (required
  by useKeypress in production but easy to forget in standalone
  tests).

Docs:
- `docs/users/reference/keyboard-shortcuts.md` — adds "Mouse wheel"
  row + the Shift/Option-to-select note.
- `packages/cli/src/config/settingsSchema.ts` — the in-app dialog
  description now mentions mouse wheel and the text-select bypass.
- `docs/design/virtual-viewport/README.md` — §1 status, §5 file map,
  §7 PR sequence all reflect mouse wheel landing in #4146 and the
  V.4–V.7 follow-up split (scrollbar drag / in-app search / alt-
  buffer / host-scrollback dual-write research).

Verified: tsc --noEmit -p packages/cli ✓, vitest 182/182 ✓ across
AppContainer / MainContent / VirtualizedList / ScrollableList /
useBatchedScroll / mouse / keyMatchers / settingsSchema.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(cli): auto-hide animation for VP scrollbar thumb

Pairs with the SGR mouse-wheel work from the previous commit:
when the user actually scrolls, the thumb pops bright; after a
1.5s idle it fades into the dim track so the bar stops competing
with the conversation. The track column itself stays in layout
regardless, so the viewport never reflows mid-flash (which would
trigger per-item re-measure and a visible jitter).

Implementation kept minimal for stock ink 7:
- gemini-cli's `useAnimatedScrollbar` interpolates RGB colors via
  a theme + per-frame setInterval. The terminal can't render
  smooth fades anyway, so this hook collapses the state to a
  binary `isVisible` flag with a single setTimeout. ~75 LoC.
- `VirtualizedList` calls `flashScrollbar()` from a useLayoutEffect
  keyed on `clampedScrollTop`. The very first commit is skipped
  via a ref so initial mount doesn't paint a flash.
- The render switches the thumb glyph (`█` vs `│`) and `dimColor`
  based on `isVisible && inThumb`. Width stays 1 either way.

Tests (6 new):
- initial mount stays hidden (no spurious mount flash)
- flash → visible, hides after idle timeout, successive flashes
  reset the timer (no premature hide), idleHideMs<=0 disables
  auto-hide for tests that want to assert on the visible state,
  unmount cleans up the pending timer.

Doc updates:
- `docs/design/virtual-viewport/README.md` §1 status, §5 file map,
  §7 PR sequence — V.4 row now scopes only the drag/click-jump
  work (still coord-blocked); animated scrollbar moved out of
  deferred and into shipped.
- PR #4146 body — architecture table mentions the auto-hide, new
  files list adds `useAnimatedScrollbar.ts`, test count refreshed
  to 188/188.

Verified: tsc --noEmit -p packages/cli ✓, vitest 188/188 ✓.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): VP review round 6 — ESC bug, CI lint, scope-controlled cleanup

Triage of /review feedback from 2026-05-18 + 2026-05-19. Took the
ones that are real and small; declined the ones that are
false-positive / out-of-scope so this PR stops expanding.

Must-fix:
- CI Lint failure: vscode-ide-companion/schemas/settings.schema.json
  was stale after the keyboard-shortcuts description bump. Regenerated
  via `npm run generate:settings-schema`.
- useMouseEvents.ts had `const ESC = '';` (literal empty string after
  the raw 0x1B byte got stripped somewhere in the source pipeline).
  `buffer.indexOf('', 1) === 1` would have degraded garbage skipping
  to a one-byte scan, and the `else { buffer = ''; break }` branch
  could never run. Fixed by switching to the `'\x1b'` text escape and
  doing the same in `mouse.ts` (which had the raw byte, also fragile).
  Comment explains why.

Small wins (one-liners taken from the review batch):
- ScrollableList: rest-spread separates `hasFocus` from the props
  forwarded to VirtualizedList. Latent collision risk; no behaviour
  change today.
- VirtualizedList: `debugLogger.debug` when isReady=false so blank-
  viewport edge cases (tiny terminal / mid-resize race) become
  diagnosable from the debug log instead of looking like a hang.

Real perf (VP-only):
- MainContent: gated the progressive-Static-replay machinery behind
  `!useVirtualScroll`. The render-phase reset still consumes the
  remount-key bump so flag-off toggles mid-session catch up cleanly,
  but `setReplayCount` and the setImmediate chunking effect are now
  skipped for VP users. Saves ~M/CHUNK_SIZE wasted re-renders per
  Ctrl+O / model change on a 1000-turn session.

Belt-and-braces:
- useMouseEvents: added a `process.on('exit')` handler that writes
  the SGR mouse disable seq again. The React cleanup already covers
  normal unmount, but Ctrl+C / SIGTERM / parent kill bypass it and
  the terminal would otherwise stay in button-event-tracking mode
  after qwen exits.

Explicitly declined / deferred (with reasoning logged on the PR):
- requestAnimationFrame wheel throttle: rAF doesn't exist in Node;
  React 19 already batches state updates within a tick, and the
  renderedItems memo bounds the actual work to visible items. Will
  revisit if profiling shows it.
- Stable pending-item IDs (`p-N` keys shifting on completion): the
  observable jitter is at most one frame of estimated-vs-actual
  height delta. Moderate scope (creation-time ID allocation); fits
  better in a focused follow-up than in this PR.

Verified: tsc --noEmit -p packages/cli ✓, vitest 188/188 ✓ across
the full VP suite.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): scrollBy bottom uses live end anchor in virtualized list

When keyboard scroll reaches the bottom, scrollBy set isStickingToBottom
but anchored via getAnchorForScrollTop(maxScroll), a fixed {index,offset}
pixel anchor. scrollTo/scrollToEnd instead use {index: last, offset:
SCROLL_TO_ITEM_END}, which recomputes the bottom from live item heights
each render. The fixed anchor did not track the last item growing during
streaming, so scroll-to-bottom via keyboard lagged behind new tokens.
Align scrollBy's bottom branch with the sibling methods.

Reported by wenshao in PR review.

* fix(cli): parse mouse events via ink useInput, not a stdin data listener

useMouseEvents attached its own stdin.on('data', ...) listener. Adding a
'data' listener switches stdin into flowing mode, which drains the buffer
before ink's readable + stdin.read() reader (ink App) can consume it, so
all keyboard input routed through useInput was silently starved while
mouse mode was active.

Parse mouse sequences from ink's existing input pipeline via useInput
instead, so there is only one stdin reader. ink captures a full SGR
sequence (ESC [ < .. M/m) as a single CSI event and delivers it with the
leading ESC stripped, so we re-prepend it before parsing. Non-mouse input
does not match and is ignored; ink still routes input to the app's other
useInput handlers, so keyboard navigation keeps working.

Only SGR mode (1006h, which we enable) is parsed via this path; the legacy
X11 encoding is not recoverable through ink's CSI parser, which is the
encoding modern terminals stop emitting once 1006h is set.

Reported by wenshao in PR review.

* fix(cli): parse only SGR in mouse hook to avoid X11 paste misfire

The useInput-based mouse hook called parseMouseEvent, which also tries the
X11 fallback (parseX11MouseEvent). An X11 prefix (ESC [ M + 3 bytes) can
reach the handler via pasted text — ink emits paste content as input when
no paste listener is registered — and would misfire a spurious mouse event.
Call parseSGRMouseEvent directly so only the SGR encoding we enable (1006h)
is parsed, matching the hook's documented contract.

Reported by wenshao in PR review.

* test(cli): assert SGR mouse parser rejects X11 sequences

Locks in the security property behind the parseMouseEvent ->
parseSGRMouseEvent switch in useMouseEvents: an X11 sequence arriving as
pasted text must not misfire a mouse event. Asserts a well-formed X11
sequence is a valid X11 event yet returns null from parseSGRMouseEvent, so
a future revert to parseMouseEvent fails this test.

Reported by wenshao in PR review.

* test(cli): add VP scroll coverage + eslint-disable for useBatchedScroll

Cover keyboard scroll commands (Shift+Up/Down, PageUp/Down, Ctrl+Home/End),
scrollBy/scrollTo imperative API (positive/negative/overflow/clamp), and
auto-scroll-during-streaming state machine (stick-to-bottom, disengage on
user scroll, re-engage on scrollToEnd). Add missing eslint-disable-next-line
for intentionally dep-free useLayoutEffect in useBatchedScroll.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* chore(cli): remove trailing whitespace in useBatchedScroll

The eslint-disable-next-line comment was removed by eslint --fix as an
unused directive (exhaustive-deps does not flag a useLayoutEffect with
no dependency array). Clean up the residual blank line.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

---------

Co-authored-by: 秦奇 <gary.gq@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(cli): background housekeeping for stale file-history dirs (#4414)

PR #4064 introduced ~/.qwen/file-history/{sessionId}/ for /rewind but had
no cross-session cleanup — directories accumulated indefinitely. This adds
a generic background housekeeping framework with file-history cleanup as
its first user.

- 30-day mtime sweep, configurable via general.cleanupPeriodDays
- 10-min startup delay (1-min catch-up if last run >7d ago)
- 24h recurring cadence, idle-gated (defers if user typed in last 1 min)
- O_EXCL lockfile + marker mtime throttle (multi-process safe)
- Current session whitelisted via lazy config.getSessionId() — defends
  against long-idle active sessions and /clear minting a new session
- Negative cleanupPeriodDays values clamp to 1h minimum (defends against
  schema-bypass: a future cutoff would otherwise sweep everything)
- Zero new prod dependencies; ~70 lines of self-written O_EXCL throttle
  primitive in lieu of proper-lockfile (which pulls graceful-fs and
  monkey-patches every fs method on first require)
- All setTimeout(...).unref() — never blocks process exit

Closes #4173.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fix(core): loosen auto-mode classifier timeouts, disable stage-2 thinking (#4680)

* fix(core): loosen auto-mode classifier timeouts, disable stage-2 thinking

The AUTO-mode classifier fails closed on timeout — a timed-out judge call
blocks the action as "unavailable". The tight 3s/10s stage budgets turned
transient slowness (slow network, large transcript, model queueing) into
spurious blocks of otherwise-valid actions. Raise them to 10s/30s so a
slow-but-healthy call is not treated as a hard block.

Also disable thinking in stage 2 (previously the only stage with
includeThoughts: true). This is a latency-sensitive permission gate the
user is actively waiting on; allocating a reasoning budget made the review
path slower and more expensive, which directly worsened the fail-closed
timeout. The model still records its reasoning in the structured
`thinking` output field — it just no longer gets an allocated budget.

Closes #4676

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* docs(core): trim verbose comments in auto-mode classifier

Condense the three comments touched by this change (module docstring
stage-2 note, timeout-budget rationale, stage-2 thinkingConfig) while
keeping the essential "why". No logic changes.

Co-authored-by: Qwen-Coder <noreply@qwenlm.ai>

---------

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <noreply@qwenlm.ai>

* fix(core): coerce hostile-provider usage token counts (#4350 part 1) (#4439)

* fix(core): coerce hostile-provider usage token counts (#4350 part 1)

Hostile providers (broken upstream, OpenAI-compat proxy returning
null/NaN, misconfigured override) can emit non-finite or negative
values for `usageMetadata.{prompt,candidates,cached,total}TokenCount`.
Captured unguarded in `processStreamResponse`, these poison the
compaction gate arithmetic:

- `lastPromptTokenCount + NaN >= hard` is always false → hard-rescue
  is silently disabled, eventually OOMing the V8 heap.
- `Infinity >= hard` is always true → hard-rescue fires every send.

Route the four API capture sites through a `coerceUsageCount` helper
that maps unknown / non-finite / negative to 0. `Number.isFinite(-1)`
is true, so an explicit `>= 0` is needed in addition to `isFinite`.

Part 1 of the hostile-provider hardening from #4350. The companion
`computeThresholds` guard depends on the un-merged three-tier ladder
in #4345 and is deferred until that lands.

Covered by parametrized tests in `geminiChat.test.ts` over NaN,
±Infinity, negative, null, undefined, and string inputs, plus a
fallback test asserting …
TaimoorSiddiquiOfficial added a commit to TaimoorSiddiquiOfficial/HopCode that referenced this pull request Jun 2, 2026
* fix(cli): persist /memory toggle state across dialog reopen (#4650)

The Auto-memory / Auto-dream / Auto-skill rows initialized their state
from Config getters, which are frozen at startup and never reflect a
setValue() write. Each /memory reopen re-mounts the dialog and re-reads
that stale snapshot, so a just-flipped toggle appeared to revert. Read the
initial state from the live merged settings instead, matching the existing
write path (bareMode semantics preserved).

Also switch the test's `act` import to `react` — the previously used
@testing-library/react is declared in package.json but not installed, so
the suite could not run — and add a mount/unmount/remount regression test.

* Hide internal docs from docs site (#4357)

* fix(core): preserve uid in atomicWriteFile to avoid breaking shared-write files (#4431)

* fix(core): preserve uid/gid in atomicWriteFile to avoid breaking shared-write files

atomicWriteFile uses write-to-tmp + rename for crash atomicity. POSIX
rename creates a new inode owned by the calling process's euid/egid, so
the rename silently strips the original uid/gid. On shared-write setups
(e.g. a group-writable file owned by another user in a shared workspace
where the current user has group-write access), every Write/Edit/
NotebookEdit through qwen-code would reset ownership to the running
user and effectively revoke write access for the original collaborators.

The fix:

1. If the target exists and is owned by a different uid/gid than the
   process's effective uid/gid (and we are not root), fall back to
   in-place writeFile. This truncates the existing inode in place,
   preserving uid/gid. The trade-off is loss of crash atomicity for
   this specific case — an acceptable trade for not silently breaking
   shared-write file ownership.

2. If running as root, atomic rename is still used, and ownership is
   restored via chown(uid, gid) after the rename. Root can chown back;
   non-root cannot, hence the in-place fallback for non-root.

3. Windows is unaffected (no POSIX ownership semantics).

Tests:

- New: in-place fallback on uid mismatch — verify content updates, mode
  preserved, and inode unchanged (the inode is the signal that the
  fallback path ran rather than rename).
- New: same scenario triggered via gid mismatch.
- New: positive case — ownership matches → atomic rename → inode changes.

Regression: a v0.16.0 user reported "every write turns a world-writable
file into one other users can no longer write." Bisected to #4096 which
introduced atomicWriteFile + write-to-tmp + rename.

* fix(core): route root through in-place fallback + doc/test follow-ups

Review follow-ups on the atomic-write ownership fix:

1. Remove the root-special-case (rename + post-rename chown). chown
   silently fails inside user-namespaced or CAP_CHOWN-stripped Docker
   containers, which re-triggers the original bug for root-in-Docker
   users — exactly the scenario this fix was reported against. Routing
   root through the same in-place fallback as non-root eliminates this
   failure mode and drops an untestable branch (chown-back can't be
   exercised under non-root CI).

2. Document the three properties traded away by the in-place fallback:
   crash atomicity, concurrent-reader isolation, inotify watcher
   semantics (MODIFY vs MOVED_TO).

3. Document that the in-place fallback surfaces EACCES when the file's
   mode forbids the current user from writing — this is correct
   behavior (atomic rename used to silently replace files the user had
   no permission on, which was arguably a privilege issue).

4. Replace the brittle "see step 6 in the function doc" comment with a
   step-number-independent reference.

5. New test covering the EACCES path: chmod 0o444 + mocked geteuid
   triggers the fallback, fallback hits the read-only file, EACCES
   propagates cleanly, original content is preserved.

* fix(core): harden in-place fallback against symlink/unlink/inode races + doc/test follow-ups

Review follow-ups on #4431 ownership-preservation fix:

CRITICAL — in-place fallback security hardening (wenshao review):

The path-based `fs.writeFile(targetPath, ...)` fallback introduced
three races that the prior `rename(tmp, target)` form did not have:

1. Non-regular files (FIFO/socket/device): fs.writeFile calls
   open(O_WRONLY|O_CREAT|O_TRUNC). On a FIFO this blocks forever
   waiting for a reader. On a character/block device it writes to
   the actual device. The rename path replaced these with a
   regular file.

2. Symlink-swap TOCTOU: an attacker with parent-dir write can swap
   targetPath for a symlink between our stat and our writeFile.
   fs.writeFile follows symlinks at the destination; POSIX rename
   does not. In the very "shared-write workspace / Docker bind-mount"
   scenarios this PR targets, this lets a directory-writable
   attacker redirect agent writes elsewhere (e.g. /etc/passwd if
   the agent runs as root).

3. Unlink race: if targetPath is unlinked between stat and write,
   O_CREAT silently recreates it owned by the calling user — the
   exact ownership change the fallback was designed to prevent.
   Silent regression to the pre-fix bug under this race.

Fix: extract the fallback into writeInPlaceWithFdGuards():

  - open(target, O_WRONLY | O_TRUNC | O_NOFOLLOW) — no O_CREAT, so
    unlink-race surfaces ENOENT instead of silently recreating; and
    O_NOFOLLOW rejects symlink-swaps with ELOOP.
  - fstat(fd) verifies the bound inode's uid/gid still match
    existingStat — refuses the write if an inode-swap happened
    between stat and open.
  - Write through the fd (locked to the verified inode), chmod
    through the fd, close.

Caller now gates the fallback on existingStat.isFile() — non-regular
targets fall through to the atomic path which has well-defined
"replace special-file with regular-file" semantics.

DOC / TEST follow-ups:

- Add hardlink-propagation as a 4th trade-off in the in-place
  fallback JSDoc (review comment #4): rename creates a new inode so
  sibling hardlinks keep old content; in-place truncate+write keeps
  the inode so all hardlinks see new content.

- Update atomicWriteJSON JSDoc to note the write is now
  *conditionally* atomic (review comment #5): atomic when uid/gid
  matches the process, in-place when ownership differs. Previously
  the JSDoc still claimed unconditional atomicity.

- Update caller comments at runtimeStatus.ts and
  worktreeSessionService.ts that advertised crash-atomic writes via
  tmp+rename — those guarantees are now conditional (review
  comment #6).

- Add mode + tmp-leftover assertions to the gid-mismatch test to
  match the uid-mismatch test (review comment #2 — test
  consistency). Without these, a gid-fallback regression that
  silently dropped permissions or left a tmp file would not be
  caught.

- New test: FIFO + ownership mismatch must take the atomic path,
  not in-place (verifies the existingStat.isFile() guard works;
  hang on in-place would trip vitest timeout).

- New test: writing through a symlink with ownership mismatch
  exercises the resolve-then-stat-then-open flow and verifies the
  symlink itself is preserved.

Tests: 192/192 pass (atomicFileWrite + write-file + edit +
fileSystemService).

* fix(core): defer O_TRUNC and verify dev+ino in writeInPlaceWithFdGuards

PR #4431 review follow-up (wenshao critical):

The previous form opened with `O_WRONLY | O_TRUNC | O_NOFOLLOW`, which
truncated the bound file *before* the fd-bound fstat verification ran.
If an attacker swapped the path between the caller's stat and our
open, we would truncate the attacker's substituted inode (destroying
unrelated content) before detecting the swap.

Two fixes:

1. Open without O_TRUNC. Verify dev+ino+uid+gid+isFile match
   expectedStat through fh.stat(). Only then call fh.truncate(0)
   through the validated fd.

2. Expand the verification beyond uid+gid to include dev+ino+isFile.
   uid+gid alone misses a same-owner inode swap (attacker replaces
   the path with a different inode they own). dev+ino is the strong
   identity check; isFile catches a swap to FIFO/socket/device after
   the caller's existingStat.isFile() gate.

JSDoc updated to enumerate the four guards (NOFOLLOW, no CREAT, no
TRUNC at open, dev+ino+uid+gid+isFile via fstat) and explain why
truncation must wait until after verification.

192/192 tests pass.

* fix(core): close FIFO swap race with O_NONBLOCK + cover EOWNERSHIP_CHANGED path

PR #4431 review follow-up (deepseek-v4-pro via /review):

CRITICAL — FIFO swap TOCTOU:

The caller's `existingStat.isFile()` gate uses stat data captured
earlier. An attacker with parent-dir write can swap the regular file
for a FIFO between the caller's stat and our open inside
`writeInPlaceWithFdGuards`. The previous `O_WRONLY | O_NOFOLLOW` open
would then block indefinitely waiting for a FIFO reader; O_NOFOLLOW
only catches symlinks.

Fix: add O_NONBLOCK to the open flags. Defense in depth:

- On a reader-less FIFO, `open(O_WRONLY | O_NONBLOCK)` returns ENXIO
  immediately — no hang.
- If the FIFO has a reader (open succeeds), the subsequent fstat
  isFile() check still refuses the write via EOWNERSHIP_CHANGED.
- For regular files, O_NONBLOCK is a no-op.

CRITICAL test gap — EOWNERSHIP_CHANGED branch untested:

The primary TOCTOU defense (fdStat dev/ino/uid/gid/isFile vs
expectedStat) had no coverage. Exported `writeInPlaceWithFdGuards` so
it can be unit-tested directly:

- New test: simulate post-stat inode swap (unlink + recreate at same
  path), call helper with stale stat, assert EOWNERSHIP_CHANGED and
  that the attacker's content survives.
- New test: simulate post-stat regular→FIFO swap, assert open fails
  fast (ENXIO) or fstat catches it — either way no hang, no write.

DOC fix:

JSDoc said "we open read-write without truncating" but the code uses
O_WRONLY. Wording corrected to "write-only".

194/194 tests pass.

* fix(core): fix flaky inode-swap test + apply review follow-ups

PR #4431 review follow-up (glm-5.1 via /review) — 7 suggestions adopted,
1 partially adopted, 0 rejected:

CI FIX (Ubuntu test failure on tmpfs inode reuse):

The EOWNERSHIP_CHANGED inode-swap test used unlink+create to simulate
a post-stat swap. On Linux tmpfs the freshly-freed inode number is
often reused by the immediately-following create, so dev+ino remained
identical and the guard didn't trip (intermittent on Ubuntu CI; macOS
APFS happened to allocate different inodes). Switched to rename(decoy,
target) which moves an existing distinct inode into place, guaranteed
to differ from the original.

CODE:

- Wrap fh.writeFile failure after fh.truncate(0) with
  EINPLACE_WRITE_FAILED + cause, so callers see explicitly that the
  file was truncated and the write didn't complete (otherwise they
  see raw ENOSPC/EIO and may wrongly assume the original is intact
  given this lives in atomicFileWrite.ts).
- Skip fh.chmod when euid is neither root nor expectedStat.uid —
  chmod is guaranteed to fail with EPERM in that case (POSIX requires
  owner or root). Avoids a guaranteed-failing syscall on every call.
- Caller catches ENOENT from writeInPlaceWithFdGuards and falls
  through to atomic rename path. If the file was deleted between
  caller's stat and our open there is no ownership to preserve; the
  rename path correctly creates a new file at targetPath.

DOC:

- Replaced "defends against four races" with "hardened against
  post-stat races" (the bullet list has 5 items, the count was wrong).
- Reworded "non-regular targets must not reach this function" to
  describe defense-in-depth — O_NONBLOCK + !fdStat.isFile() reject
  post-stat regular→FIFO/socket/device swaps. The old wording made
  it look like O_NONBLOCK was redundant.
- Documented the dual chmod behavior (root vs non-root with foreign
  uid) inline.

TESTS:

- Added happy-path test for writeInPlaceWithFdGuards (write succeeds,
  inode preserved, mode preserved).
- Added ENOENT regression test (verifies the missing-O_CREAT
  property — if file unlinked between stat and open, no silent
  recreate with caller's uid).
- Renamed the misleading "O_NOFOLLOW guard" test (it actually tests
  resolve-through-symlink, not O_NOFOLLOW) to reflect what it does,
  and added a direct ELOOP test that drives writeInPlaceWithFdGuards
  with a path whose final component is a symlink — that's the real
  O_NOFOLLOW exercise.
- Fixed the FIFO test to pass a stat captured from the FIFO itself
  (not a stale regular-file stat) so only the FIFO-specific defense
  fires, not the inode/dev mismatch from a different file.

NOT ADOPTED:

- Skip-when-non-root chmod optimization adopted (small, useful), but
  the larger "structured chmod error model" deferred — best-effort
  matches the existing tryChmod pattern at file scope.

197/197 tests pass.

* fix(core): wrap truncate err + post-write nlink check + guard close + chmod sync

PR #4431 review follow-up (qwen-latest-series-invite-beta-v34 via /review)
— 7 of 10 suggestions adopted, 3 deferred:

CODE:

- **EINPLACE_TRUNCATE_FAILED wrap** (review #3291863048): symmetric to
  the existing EINPLACE_WRITE_FAILED — distinguishes "truncate failed,
  original intact" from "write failed post-truncate, original lost".

- **Post-write nlink === 0 check** (review #3291863059):
  EINODE_UNLINKED_DURING_WRITE detects the fstat-to-close window where
  a concurrent rename-over drops our bound inode's link count to zero
  and our write goes to an anonymous inode close will free. Silent
  data loss path now surfaces.

- **fh.close() guarded in finally** (review #3291863044): close failure
  on NFS/FUSE was masking the original try-body exception (including
  the meaningful EOWNERSHIP_CHANGED, EINPLACE_*, EINODE_*). flush:true
  already fsync'd, so close-after-flush is best-effort.

- **fdStat.uid in canChmod** (review #3291863055 part 1): use the
  fd-bound verified value instead of expectedStat.uid. Defense in depth
  — a future weakening of the fstat guard won't silently widen chmod
  privilege.

- **fh.sync() after chmod** (review #3291863053): chmod is metadata,
  not covered by writeFile({ flush: true }). A crash before lazy
  metadata flush would lose the mode restoration (matters for
  setuid/setgid). One extra syscall, best-effort.

- **@remarks freshness contract** (review #3291863051 partial): JSDoc
  now spells out that expectedStat MUST be a fresh stat captured
  immediately before the call. Stale stats nullify every guard.

- **Concurrent-writer limitation noted** (review #3291863061 partial):
  added a "Known limitation — no advisory locking" paragraph to JSDoc
  rather than adopting flock (Linux-specific, NFS issues, scope
  expansion). Callers needing multi-process coordination should layer
  their own lockfile.

- **@throws documentation** (review #3291863051 partial): four
  documented error codes (EOWNERSHIP_CHANGED, EINODE_UNLINKED_DURING_WRITE,
  EINPLACE_TRUNCATE_FAILED, EINPLACE_WRITE_FAILED).

TESTS:

- **EINPLACE_WRITE_FAILED via FileHandle.prototype.writeFile monkey-patch**
  (review #3291863040): triggers the data-loss path, asserts the wrapped
  code + message + cause, and verifies the file is empty (truncate ran).

- **canChmod=false actually skips chmod** (review #3291863055 part 2):
  prior uid-mismatch test had desiredMode === current mode, couldn't
  distinguish "skipped" from "no-op". New test uses desiredMode=0o755
  on a 0o644 file under canChmod=false → asserts mode stays 0o644.

NOT ADOPTED:

- ENOENT/ELOOP/ENXIO catch extension (review #3291863043): keeping the
  strict refusal for swap-to-special-file. Silent fallthrough-to-replace
  was pre-PR atomic-rename behavior, but in shared-write workspaces
  (this PR's target users) a special-file appearing at the target path
  is a signal worth surfacing, not papering over.

- Diagnostic logging (review #3291863049): the function has no logger
  dependency today; adding one is an architecture decision outside
  this PR's scope. The path taken is implied by the side effects
  (inode preserved vs new) but agreed: out-of-band telemetry would
  help ops. Defer to follow-up.

- flock advisory locking (review #3291863061 main): scope expansion;
  Linux-specific semantics, NFS edge cases. Documented as known
  limitation instead.

- Integration test for ENOENT fallthrough at atomicWriteFile level
  (review #3291863043 part 1): ESM module bindings prevent monkey-
  patching writeInPlaceWithFdGuards from outside. The unit test for
  the helper's ENOENT path covers the throwing behavior; the catch is
  3 lines and review-visible. Defer until a refactor opens an
  injection seam.

- Error code string constants export (review #3291863051 part 3): two
  codes don't merit a constant module. Magic strings are fine at this
  size.

199/199 tests pass.

* docs(core): sync writeRuntimeStatus JSDoc with conditional-atomic contract

PR #4431 review follow-up: function-level JSDoc still claimed
unconditional "Atomically write" and "never sees a partially written
file", inconsistent with the module-level docblock updated in earlier
commits. Updated to describe the conditional-atomic behavior (atomic
when uid/gid matches, in-place fallback when ownership differs) and
explicitly note the concurrent-reader visibility trade-off in the
fallback path. Links to atomicWriteJSON for the full contract.

Doc-only change. 199/199 tests pass.

* fix(core): add explicit fh.sync() — FileHandle.writeFile ignores flush option

PR #4431 review follow-up (qwen3.7-max via /review):

CRITICAL — FileHandle.writeFile silently ignores flush:

Node.js FileHandle.writeFile takes an early-return path that bypasses
the flush option entirely (the option is only honored on the
path-based fs.writeFile form). Our previous code passed
{ flush: true } to fh.writeFile and relied on the implicit fsync.
The only explicit fh.sync() was nested in the chmod block guarded by
canChmod — which is FALSE precisely when a non-root group member
writes to a group-writable file they don't own (the exact shared-write
scenario this PR targets). Net effect: in that branch, zero fsync.
Data sits in the kernel page cache; a crash before lazy flush leaves
the file empty (truncate succeeded) or partially written.

Fix:
- Drop flush from the fhWriteOptions object (silently ignored anyway).
- Add an explicit `fh.sync()` after writeFile succeeds, gated on
  options.flush. Runs BEFORE the chmod block so the canChmod=false
  branch also fsyncs.
- The chmod-block fh.sync() becomes metadata-only (covers the mode
  change), as the data is already on disk.

Updated comments to reflect the actual semantics rather than the
incorrect "writeFile({ flush: true }) fsyncs" assumption.

TESTS (partial adoption of review #3293252349):

- EINPLACE_TRUNCATE_FAILED: sibling test to EINPLACE_WRITE_FAILED.
  Monkey-patches FileHandle.prototype.truncate to throw EIO; asserts
  err.code + cause + "original content is intact" message, and
  verifies the file's original bytes are unchanged (truncate didn't
  run).
- Buffer in in-place fallback: locks in binary fidelity (byte-exact
  comparison) so a future encoding-passthrough regression for Buffer
  data would be caught.

NOT ADOPTED in this commit:

- EINODE_UNLINKED_DURING_WRITE test: requires post-write fh.stat()
  mocking with call-count discrimination (first call: real stat for
  verification; second call: nlink=0). The monkey-patch pattern works
  but is fragile; deferred to a follow-up that may also refactor the
  helper to accept an injectable stat fn for cleaner testability.

201/201 tests pass.

* fix: correct stale flush comment + add fh.sync() regression test

- Fix misleading close() comment that said "flush:true already
  fsync'd" — the explicit fh.sync() does the actual fsync, not the
  flush option (which is silently ignored on FileHandle.writeFile).
- Add regression test verifying fh.sync() is called when flush:true
  and skipped when flush is absent, preventing silent removal of the
  core durability fix.

Addresses wenshao review threads from 2026-05-23.

* test: add EINODE_UNLINKED_DURING_WRITE regression test

Monkey-patches FileHandle.stat to return nlink:0 on the post-write
check, verifying the nlink guard throws with the correct error code.
Addresses wenshao review from 2026-05-28.

* simplify: replace writeInPlaceWithFdGuards with plain fs.writeFile

Address yiliang114's review (CHANGES_REQUESTED):

1. [Critical] Remove ~120 lines of fd-level TOCTOU hardening
   (writeInPlaceWithFdGuards) — over-engineering for a local CLI.
   The in-place fallback now uses plain fs.writeFile + tryChmod,
   matching the EXDEV fallback pattern.

2. [Suggestion] Fix macOS GID false-positive: only compare uid in
   ownershipWouldChange(). macOS inherits parent dir GID for new
   files, so egid !== file.gid was a false positive that needlessly
   dropped crash atomicity.

3. [Suggestion] Trim 60+ lines of JSDoc to project style (AGENTS.md:
   "default to none, add only when WHY is non-obvious").

Net: -748 lines. 24 tests pass.

* fix: restore Stats type import (TS2304 build failure)

* docs: narrow scope from uid/gid to uid-only preservation

The gid check is intentionally skipped because macOS inherits the
parent directory's GID for new files, making egid !== file.gid a
false positive. Update comments and PR description to match the
actual implementation scope.

* test: add inode assertion to symlink ownership-mismatch test

Proves the in-place fallback actually ran instead of atomic rename.

* Improve hooks matcher display (#4545)

* feat(cli): improve hooks matcher display

* test(cli): cover hooks navigation levels

* fix(cli): use session channel when closing ACP sessions (#4522)

Detach closeSession/killSession from the session entry's owning channel instead of the current attach target, so the correct channel is decremented and killed during channel overlap (old channel dying while a fresh channel is current). Extracts findChannelInfoForEntry/detachSessionIdFromEntryChannel helpers with unit + integration coverage. Fixes #4325.

* fix(core,cli): replace full-history structuredClone with shallow/tail variants to prevent OOM on resume (#4644)

* fix(core,cli): replace full-history structuredClone with shallow/tail variants to prevent OOM on resume

Several UI and service call sites clone the entire chat history via
structuredClone(getHistory()) every turn. On a resumed session with
thousands of entries, each clone allocates 150-200 MB transiently.
When multiple async side-requests overlap (suggestion generation,
auto-title, checkpointing), multiple clones coexist on the heap,
pushing V8 past its limit within 10 turns (2 GB heap cap).

Changes:
- AppContainer.tsx: use getHistoryTail(40, true) instead of
  getHistory(true) + slice(-40)
- btwCommand.ts: same pattern, use getHistoryTail(40, true)
- sessionTitle.ts: use getHistoryShallow() (read-only filtering)
- sessionRecap.ts: use getHistoryShallow() (read-only filtering)
- useGeminiStream.ts: use getHistoryShallow() for checkpoint
  serialization (only needs to survive JSON.stringify)

Closes #4624

* fix(test): update mocks for getHistoryShallow/getHistoryTail in sessionTitle and btwCommand tests

* fix(cli): migrate remaining getHistory() clone sites to shallow/tail variants

- AppContainer.tsx rewind path: getHistory() → getHistoryShallow()
  (only used read-only by computeApiTruncationIndex)
- Session.ts ACP rewind: getHistory() → getHistoryShallow()
  (only walks entries to compute truncation index)
- Session.ts stop-hook: getHistory() + filter(.model).pop() →
  getLastModelMessageText() (O(1) backward scan, no clone)

* fix(core): use client-level getHistoryShallow with fallback

sessionTitle.ts and sessionRecap.ts were calling
chat.getHistoryShallow() directly, bypassing the client-level
wrapper that provides a getHistory() fallback when the chat
implementation doesn't support shallow reads. Use
geminiClient.getHistoryShallow() instead.

Update test mocks to match the new call site.

* fix(test): add getHistoryShallow and getLastModelMessageText to Session test mocks

Session.ts now calls chat.getHistoryShallow() in rewindToTurn and
chat.getLastModelMessageText() in the Stop hook. Update all mockChat
instances in Session.test.ts to provide these methods.

* feat(cli): add respectUserColors and hideContextIndicator options for statusline (#4670)

* feat(cli): add respectUserColors option to preserve ANSI colors in
     statusline command output

* test(cli): add respectUserColors tests for useStatusLine and Footer

* feat(cli): add hideContextIndicator option to hide built-in context usage in footer

* docs: update statusline configuration docs with respectUserColors and hideContextIndicator

* fix(core): tolerate unsupported Streamable HTTP GET SSE (#4521)

Fixes #4326

* fix(insight): Harden insight facet normalization and empty qualitative handling (#3557)

* Harden insight facet normalization and empty qualitative handling

* feat: enhance AtAGlance component to accept target sections for dynamic rendering

* feat(cli): notify when background shells finish (#4355)

* feat(core): add simplify bundled skill (#3570)

* feat(core): add simplify bundled skill

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(cli): stabilize SettingsDialog restart prompt test

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(skills): use agent tool instead of task in simplify skill

The simplify skill referenced the 'task' tool for launching review passes,
but Qwen Code exposes 'agent' as the callable subagent tool ('task' is only
a legacy permission alias). Using 'task' would cause /simplify to stall when
trying to launch parallel review passes.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* docs: document simplify bundled skill

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* Update packages/core/src/skills/skill-manager.test.ts

Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>

* fix(core): repair simplify skill tests

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* Update packages/core/src/skills/bundled/simplify/SKILL.md

Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>

* fix(skills): address simplify review feedback (read-only passes, gitignore scope, safer dead-code removal)

- drop inert `argument-hint` frontmatter (argumentHint is never parsed or
  rendered anywhere; no other bundled skill uses it)
- mark Step 2 review passes read-only so edits stay isolated to Step 4
- narrow the no-diff fallback to `git ls-files --modified --others
  --exclude-standard` so ignored build output is excluded
- require a repo-wide caller check before removing code
- make the commands.md row state it edits code directly
- assert non-conflicting bundled skills survive cross-level dedup

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

---------

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>
Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>

* feat(skills): add agent reproduction workflows (#4118)

* chore(skills): add codex reproduce workflows

* feat(agent-reproduce): implement agent reproduction workflow and supporting scripts

* feat(skills): capture reference agent state diffs

* feat(cli): virtual viewport for long conversations on ink 7 (#4146)

* chore(deps): re-upgrade ink 6 → 7.0.3 (upstream Static remount fix landed)

PR #3860 first upgraded ink 6 → 7.0.2. PR #4083 reverted because of a
TUI regression: `<Static>` did not re-emit items when its `key` prop
was bumped, so `/clear` / Ctrl+O / refreshStatic left the history area
blank under ink 7.0.2.

ink 7.0.3 (released after #4083) contains the exact fixes:

  - be9f44cda Fix: <Static> remount via key change drops new items (#948)
  - 669c4386c Fix: Drop stale <Static> output from fullStaticOutput on identity change (#950)
  - 7c2267c01 Fix `useBoxMetrics` not accepting ref objects with an initial null value (#945)

Changes:
  - `ink` ^6.2.3 → ^7.0.3 (root hoist + cli direct)
  - `react` ^19.1.0 → ^19.2.4 (cli direct; ink 7.0.3 peerDeps requires >=19.2.0)
  - `react`/`react-dom` overrides ^19.2.4 added so the transitive graph
    stays deduped to a single instance (avoids `Invalid hook call` from
    multiple React copies, the classic ink-upgrade hazard)
  - `wrap-ansi` already on ^10.0.0 from #4083's partial-revert (no change)

Verified:
  - `npm ls ink` → single `ink@7.0.3` across all peer deps
  - `npm ls react` → single `react@19.2.4`
  - `npm run typecheck --workspace=@qwen-code/qwen-code` clean
  - `npm run typecheck --workspace=@qwen-code/qwen-code-core` clean
  - Composer.test.tsx 20/20, MainContent.test.tsx 6/6, TableRenderer.test.tsx
    59/59 + 1 skipped — all key UI components green on the new ink

The Static-remount regression is upstream-fixed in 7.0.3, so the
runtime path is restored without needing #3941's overflowY-self-managed
viewport. #3941 (virtual viewport) remains an opt-in performance
feature on top.

* fix(deps,cli): add @types/react overrides + move refreshStatic out of setCurrentModel updater

Two follow-ups from the multi-round audit of the ink 7.0.3 re-upgrade:

1. @types/react / @types/react-dom now pinned to ^19.2.0 in root
   overrides. packages/web-templates still declares @types/react ^18.2.0
   in its devDeps. Today the CLI build is unaffected (web-templates's
   18.x types are nested in its own node_modules and the React-using
   src/insight and src/export-html files are excluded from its tsconfig
   build), but a future reincludes-or-hoist accident would land
   conflicting global JSX namespaces in the CLI compile graph. Match
   the dep dedup we already enforce for `react` and `react-dom` so the
   type graph stays as deduped as the runtime graph.

2. AppContainer's onModelChange handler was calling refreshStatic() as
   a side-effect inside the setCurrentModel updater. React.StrictMode
   double-invokes state updaters in dev, so model swaps fired two
   clearTerminal writes + two <Static> key bumps. The double work was
   masked under ink 6 (key changes were no-ops on <Static>), but ink
   7.0.3 honors key changes — the doubled work is now potentially
   visible as a faster flash-flash on every model switch.

   Refactor: setCurrentModel becomes a pure setter; refreshStatic
   moves into a useEffect keyed on currentModel with a ref-comparison
   guard so the first render doesn't fire. Single clearTerminal write
   per real model change, even under StrictMode.

Verified: npm ls ink → single 7.0.3, npm ls react → single 19.2.4,
npm ls @types/react → 19.2.10 hoisted (npm flags web-templates's 18.x
constraint as overridden, which is the intended behavior). Typecheck
clean across cli + core workspaces.

* docs(design): virtual viewport on ink 7 — analysis + PR sequence

Captures the architectural analysis of how to thoroughly close the
flicker / refresh-storm class of issues (#2950, #3118, #3007, #3838 UI
side, #3899 follow-on) using a virtualized history viewport.

- Surveys claude-code (forked ink) and gemini-cli (@jrichman/ink +
  ScrollableList + VirtualizedList) reference implementations.
- Confirms ink 7 already exposes the primitives needed
  (`useBoxMetrics`, `measureElement`, `useWindowSize`,
  `useAnimation`) — no fork swap required.
- Picks porting gemini-cli's virtualized list components to ink 7 with
  `ResizeObserver` -> `useBoxMetrics` and a custom `StaticRender`.
- Splits the work into V.0..V.4 PRs with scope, dependencies, risk.
- Lists open questions + 11-item approval checklist that must clear
  before V.0 implementation begins.

This is a docs-only PR per the project's design-first workflow. No
runtime code changes.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(cli): virtual viewport for long conversations on ink 7

Port gemini-cli's VirtualizedList + ScrollableList to stock ink 7,
adapting for ink 7's available primitives:

- `overflowY="hidden"` + `marginTop={-scrollTop}` instead of ink-fork's
  `overflowY="scroll"` (ink 7 has proper clip/unclip in render-node-to-output)
- `useBoxMetrics` inside each VirtualizedListItem (Option A) instead of a
  single ResizeObserver WeakMap; reports height changes via onHeightChange
  callback so the parent can update its heights record
- Custom `StaticRender` as `React.memo` with a reference-equality comparator,
  keyed on `itemKey-static-{width}` to freeze completed conversation items
- Character scrollbar column (`│` track / `█` thumb) since ink 7 has no
  native scrollbar prop
- No ScrollProvider / mouse drag (deferred to a follow-up PR)

Wire into MainContent.tsx behind `ui.useTerminalBuffer` setting (Settings
dialog → UI → Virtualized History; default false — opt-in).

Key bindings: Shift+↑/↓ (line), PgUp/PgDn (page), Ctrl+Home/End (top/bottom).

Re-render optimisations:
- renderItem wrapped in useCallback so renderedItems useMemo only recomputes
  when actual deps change (not on every streaming tick)
- Completed history items passed by original object reference so
  VirtualHistoryItem = memo(HistoryItemDisplay) can bail out on stable props
- estimatedItemHeight / keyExtractor / isStaticItem defined as module-level
  constants with no closure deps

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(cli): add test coverage for virtual viewport scroll bindings and settings

- keyMatchers.test.ts: 6 new test cases for SCROLL_UP/DOWN, PAGE_UP/DOWN,
  SCROLL_HOME/END commands (41 tests total)
- settingsSchema.test.ts: assert ui.useTerminalBuffer is boolean, default false,
  showInDialog true, requiresRestart false

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(cli): use ink 7 native overflow for VP pending items

In VP mode, pending items are rendered inside VirtualizedList's
overflowY="hidden" container, which uses ink 7's native clipping
as the viewport guard. Remove the availableTerminalHeight JS-
truncation bound from pending items in renderVirtualItem:

- JS truncation at terminal height would silently cut off content
  the user could scroll to read within the virtual viewport.
- ink 7 overflowY="hidden" on the VirtualizedList container is the
  correct clip guard — no JS line-counting workaround needed.
- Remove uiState.constrainHeight from renderVirtualItem deps (no
  longer referenced in the VP rendering path).

The legacy <Static> path is unchanged.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* perf(cli): binary-search offsets in virtualized list hot path

Replace linear findLastIndex / findIndex scans on the offsets array with
upperBound. Offsets are monotonic by construction, so the lookups inside
the render body and getAnchorForScrollTop drop from O(n) to O(log n).
Material for thousand-turn sessions where the lookup runs on every frame.

* fix(cli): wire ShowMoreLines + skip clearTerminal in VP mode

Two audit-found bugs in the VP path:

1. `<ShowMoreLines>` was outside the `<OverflowProvider>` that wraps
   `<ScrollableList>` in VP mode. `useOverflowState()` returns
   `undefined` outside the provider, so the component returned `null`
   and the "press ctrl-s to show more lines" affordance silently
   disappeared. Move `<ShowMoreLines>` inside the provider so the hook
   sees the live overflow state, matching the legacy path.

2. `refreshStatic()` and `repaintStaticViewport()` wrote
   `clearTerminal` / `cursorTo+eraseDown` to the host terminal
   unconditionally. In VP mode the React tree owns the visible region
   via ink 7's native `overflowY="hidden"` clipping — the physical
   write is a wasted flash on Ctrl+O / Alt+M / model change / resize.
   Guard both writes on `useTerminalBuffer === false`. The
   `historyRemountKey` bump still fires so the legacy `<Static>`
   fallback would still remount if someone toggled the setting mid-
   session.

Extends the targeted-repaint pattern introduced in #3967 to all
refreshStatic call sites, gated by the VP setting instead of by event
type.

* fix(cli): VP renderItem stability + source-copy offsets + heights GC

Three audit-found regressions tightened, in order of severity:

1. **Source-copy index offsets missing in VP** — legacy `<Static>` path
   threads per-item `sourceCopyIndexOffsets` so `/copy mermaid N` /
   `/copy latex N` hints stay stable across continuation messages. VP
   `renderVirtualItem` was not passing this prop, so the copy hints
   shown under each diagram drifted on every `gemini_content` chunk
   (the clipboard mechanism itself still worked from raw history; only
   the displayed number was wrong). Add two lookup tables —
   identity-keyed for static items, index-keyed for pending — without
   changing the VirtualizedList data signature, and thread offsets in
   both render branches.

2. **`renderVirtualItem` callback invalidated on every streaming tick**
   — its deps included `activePtyId` / `embeddedShellFocused` /
   `isEditorDialogOpen`, all of which flip mid-stream when a shell
   tool runs or a dialog opens. Each flip rebuilt the callback,
   invalidated `VirtualizedList.renderedItems`'s useMemo, and forced
   every static item to re-render through `<StaticRender>` — defeating
   the very memoization the design relies on. Move the three pending-
   only fields into a ref read inside the callback. Static-item closure
   now depends only on inputs that legitimately affect static output
   (terminalWidth, slashCommands, getCompactLabel, …). Pending items
   still re-render correctly because their item identity changes per
   tick, so the callback is called fresh each time and reads the
   latest ref.

3. **`pending` items now honour `constrainHeight`** in VP, matching the
   legacy path. Previously VP unconditionally passed `undefined` for
   `availableTerminalHeight` on pending, relying on the viewport
   `overflowY="hidden"` clip to limit visible size — but that hid the
   `<ShowMoreLines>` affordance from the user. Now that ShowMoreLines
   is correctly wired (previous commit), restore parity.

4. **Heights map memory leak** in `VirtualizedList` — `setHeights` only
   grew. Each `/clear` left orphan `h-N` keys; each pending → completed
   transition left orphan `p-N` keys. Add a `useLayoutEffect` that
   prunes entries whose keys are not in the current `data`. Runs in
   layout phase so the prune commits in the same paint as the data
   change — no stale-offsets frame.

* test+fix(cli): VP path coverage + stabilize absorbedCallIds empty Set

Completion-pass artifacts driven by the multi-agent audit:

- Settings description rewritten to enumerate the symptoms VP fixes so
  users with active flicker reports can find the toggle without reading
  the design doc.
- `absorbedCallIds` returns a module-level constant Set when compact mode
  is off, instead of a fresh `new Set()` per render. Fixes a hidden
  cascade: `activePtyId` flip mid-stream → useMemo runs → returns a new
  empty Set → `isSummaryAbsorbed` rebuilds → `renderVirtualItem`
  rebuilds → `VirtualizedList.renderedItems` recomputes → every static
  item re-renders. With the constant, the cascade dies at the source.
  Helps both VP and legacy paths.
- VP-path unit tests for MainContent (4 cases): ScrollableList mounts
  and Static does not when `useTerminalBuffer: true`; ShowMoreLines is
  reachable in VP mode (regression of the OverflowProvider mis-wrap);
  source-copy index offsets thread into renderItem for static items;
  renderItem callback identity is stable across `activePtyId` flips
  (proves the ref-based read keeps StaticRender memo effective).

* fix(cli): stabilize absorbedCallIds in compact mode + gate heights prune + tighten ShowMoreLines test

Round-2 audit follow-ups. Three real findings addressed; one flagged
false positive documented separately.

1. **absorbedCallIds Set identity now content-stable when compact mode is
   on.** The earlier EMPTY constant only short-circuited the compactMode=
   false path; when compact mode is enabled (some users default-on it),
   activePtyId / embeddedShellFocused flips during streaming still
   produced fresh Sets per render even when membership was unchanged,
   restarting the same cascade the pendingStateRef fix was meant to
   avoid. Compare-and-reuse via a ref: if the new Set has identical
   membership to the previous one, return the previous reference.

2. **`heights` map prune in `VirtualizedList` is gated.** Previously
   every streaming tick rebuilt an N-key Set and walked all heights,
   even on the steady-state path where nothing changes. Now only fires
   when the heights record has clearly outpaced live data
   (`size > max(8, 2 × data.length)`) — covers `/clear` and accumulated
   pending → completed transitions, skips the 30-Hz hot path entirely.

3. **VP ShowMoreLines test now actually verifies overflow connectivity.**
   Previous mock unconditionally rendered "SHOW_MORE", so the test only
   proved the JSX mounted — it would still pass if a future refactor
   moved `<OverflowProvider>` out of the VP tree again. The mock now
   reads `useOverflowState()` and emits "OVERFLOW_DISCONNECTED" when the
   context is missing. The VP test asserts both presence of "SHOW_MORE"
   and absence of the disconnected marker, so the regression is now
   caught.

Not addressed:
- Audit P0-1 claim that `renderMode` (Alt+M) / model-change updates
  don't reach VP static items: false positive. `renderMode` is a React
  Context (`RenderModeContext`), and Context propagation traverses the
  tree past `memo` boundaries — MarkdownDisplay's `useRenderMode()`
  consumer re-renders on context change regardless of whether
  `StaticRender` bails out. Verified by reading
  `packages/cli/src/ui/contexts/RenderModeContext.tsx` and
  `MarkdownDisplay.tsx:172`. No code change.
- Audit P1-2 pendingStateRef write-during-render race: speculative,
  relies on a multi-pass render path React 18+ does not currently use.
  Documented assumption in the existing inline comment.

* fix(cli): isolate renderItem errors + defensive height coerce + compact-mode mergedHistory stability

Round-3 audit follow-ups. Three real findings; the rest verified clean.

1. **`renderItem` errors no longer crash the CLI.** Previously a throw
   inside a per-item render propagated through `VirtualizedList`'s
   useMemo into React's commit phase, tearing down the whole Ink tree —
   one bad history record could nuke the session. Wrap each call in a
   try/catch and substitute a small red `[render error] …` text box on
   failure. The row stays in the viewport so the user can scroll past
   it.

2. **Defensive height coerce in offset accumulation.** A buggy
   `estimatedItemHeight` returning NaN / negative / Infinity would
   poison every downstream offset and break the `upperBound` /
   `findLastLE` binary search (which assumes monotonic offsets). Clamp
   to `Number.isFinite(raw) && raw > 0 ? raw : 0`. No-op for the
   in-tree estimators that return 3; insurance against future
   consumers.

3. **`mergedHistory` is content-stable when compact mode is on.** The
   Round-2 absorbedCallIds stability fix didn't reach this path:
   `mergeCompactToolGroups` always allocates a fresh array, and
   `mergedHistory`'s useMemo lists `activePtyId` / `embeddedShellFocused`
   as deps, so every streaming tick mid-shell-tool produced a new array
   even when items aligned. Cascade went `mergedHistory` → offsets map
   → `renderVirtualItem` → every static item re-rendered. Pair-wise
   compare new vs previous and return the previous reference when items
   align. Restores StaticRender memo effectiveness for compact-mode
   users.

Not addressed (audit findings deemed not worth fixing in this PR):
- `scrollToItem` silently no-ops when item is not in data — no current
  caller checks the return value, low impact.
- `allVirtualItems` array spread is O(n) per streaming tick — real but
  not a crash; revisit in a perf-focused follow-up.
- `itemRefs.current` is dead surface (never read) — cosmetic.
- StrictMode-only-in-DEBUG double-invoke paths verified safe.

* test+chore(cli): VP review round 4 — VirtualizedList/useBatchedScroll coverage + cleanups

Addresses wenshao's CHANGES_REQUESTED review on PR #3941.

- Add focused unit tests for `VirtualizedList` (9 cases) covering empty
  data, `renderStatic` full-render, `initialScrollIndex` with
  `SCROLL_TO_ITEM_END`, `targetScrollIndex` anchoring, imperative
  `scrollToEnd` / `scrollToIndex`, per-item `renderItem` error isolation,
  NaN/negative estimator coercion, and out-of-range `initialScrollIndex`
  clamping.
- Add `useBatchedScroll` unit tests (4 cases) covering initial reads,
  pending-value reads in the same tick, post-commit pending reset, and
  callback identity stability across rerenders.
- Remove dead `itemRefs` / `onSetRef` plumbing (declared, written, never
  read; `useCallback` with empty deps was also a stale-closure trap).
- Remove unused `isStatic?: boolean` from `VirtualizedListProps`
  (only `isStaticItem` is actually consumed).
- Tighten the render-phase setState block: each setter is now guarded
  by an equality check so React bails out of redundant updates, and a
  comment documents that this is the React-endorsed "adjusting state
  while rendering" pattern (the synchronous update avoids a one-frame
  flash at the previous position when `targetScrollIndex` changes).

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* chore(cli): remove dead `dataRef` from VirtualizedList (round-4 followup)

Declared and written in a `useLayoutEffect` on every `data` change but
never read anywhere in the component. Flagged in wenshao's round-4 review
of PR #3941.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): collapse model-change effect back into one batched handler

wenshao's PR #4119 review correctly flagged that splitting the
onModelChange flow into two effects (b25831b0e) reintroduced the
issue #3899 freeze regression on every model switch:

  1. setCurrentModel(model) commits first, with the OLD
     historyRemountKey.
  2. <Static key={`${historyRemountKey}-${currentModel}`}> sees its
     key change (because currentModel did) and remounts immediately.
  3. MainContent's render-phase progressive-replay reset only fires
     when historyRemountKey changes, so replayCount is still the
     full mergedHistory.length from any prior catch-up.
  4. The remounted Static dumps the entire history in one synchronous
     layout pass — exactly the freeze progressive replay was added
     to avoid (#3899). The second effect's refreshStatic() bump
     arrives a render too late.

Fix: do not split. Both side effects (refreshStatic, which writes
clearTerminal + bumps historyRemountKey, and setCurrentModel) live
in the event handler again, with a ref guard for same-model
notifications. The React.StrictMode concern that motivated b25831b0e
is addressed by keeping the side effect OUT of the setState updater
(it now runs once per event-handler invocation, not once per
double-invoked updater call). Both setState calls land in the same
React batch, so historyRemountKey and currentModel update together —
MainContent's render-phase reset sees the new key, replayCount drops
to the first chunk, and Static remounts with chunked replay intact.

Tests:
- AppContainer.test.tsx: 4 new tests covering the synchronous
  refreshStatic side-effect contract, same-model no-op, ref-guarded
  StrictMode double-invoke, and unsubscribe-on-unmount.
- MainContent.test.tsx: new regression guard — when currentModel
  changes but historyRemountKey is held constant, progressive replay
  must NOT reset (pins the MainContent invariant the two-effect
  refactor accidentally relied on).

Verified: vitest packages/cli AppContainer + MainContent green (82/82).
Typecheck clean.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix+docs(cli): VP review round 5 — typecheck, doc drift, scroll keys

PR #4146 review feedback (wenshao + Claude Opus 4.7 audit) addressed:

Code:
- MainContent.test: activePtyId typed as number (was 'pty-xyz' string,
  broke tsc with TS2322 — the test only relies on reference change so
  any number works).
- VirtualizedList: sanitize renderItem error path. Display becomes the
  generic `[render error]` marker; full err goes to debugLogger.debug
  so file paths / partial tool state don't leak to scrollback.
- MainContent: move pendingSourceCopyOffsetsByIndex into a ref so it
  no longer rebuilds renderVirtualItem identity every streaming tick.
  Without this, VirtualizedList.renderedItems useMemo invalidated
  per-tick → JSX rebuilt for every visible item → memo(HistoryItem
  Display) was still bailing but allocations were O(visible) per tick.
- AppContainer: drop the misleading "state-driven scroll reset" claim
  in the VP refreshStatic comment. VP is intentionally near-no-op:
  the React tree owns the visible region, mergedHistory mutation is
  what refreshes the screen, and the remount-key bump is preserved
  only to keep the legacy Static branch in sync if the user toggles
  the flag off mid-session.
- StaticRender: rewrite JSDoc to match reality. The custom React.memo
  is NOT output caching like @jrichman/ink's StaticRender export;
  the comparator rarely matches (parent allocates fresh JSX); the
  real skip happens at memo(HistoryItemDisplay) one level deeper.

Docs:
- docs/design/virtual-viewport: sync file map (drop non-existent
  ScrollProvider.tsx / useAnimatedScrollbar.ts), PR sequence (one PR
  #4146, V.3-V.5 deferred), open-question + checklist resolution for
  #3905 (superseded) and base branch rename.
- docs/users/reference/keyboard-shortcuts: document the 6 VP scroll
  keys (Shift+↑/↓, PgUp/PgDn, Ctrl+Home/End) under a "History
  scrollback (when ui.useTerminalBuffer is on)" section. Previously
  the only discovery path was the Settings dialog description.

Verified: tsc --noEmit -p packages/cli ✓, vitest 160/160 ✓ across
AppContainer / MainContent / VirtualizedList / useBatchedScroll /
keyMatchers / settingsSchema, eslint clean on touched files.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(cli): SGR mouse wheel scroll in VP mode

Recovers the most-felt UX regression vs legacy `<Static>` mode: when
`ui.useTerminalBuffer` is on, legacy users lose mouse wheel as a way
to scroll history (the host terminal stopped seeing the conversation
in its scrollback buffer). This PR enables button-event tracking
(`?1002h`) + SGR coordinates (`?1006h`) while the ScrollableList has
focus, parses wheel events off stdin, and routes them to scrollBy.

Scope kept tight on purpose:
- Wheel only. Hit-testing for scrollbar drag / click-to-position
  needs screen-absolute element coords; stock ink 7's useBoxMetrics
  returns yoga's parent-relative layout. Deferred to V.4 with two
  exit paths (upstream getBoundingBox to ink 7, or local yoga walker).
- Mouse mode is enabled only while ScrollableList is mounted; non-VP
  users never see their terminal flipped into button-event tracking.
- Side effect: native click-and-drag text selection is captured by
  the program. Docs + settings dialog description now spell out the
  Shift / Option (macOS) bypass.

Implementation:
- `ui/utils/mouse.ts` — SGR + X11 parser, ported and trimmed from
  gemini-cli (Google LLC, Apache-2.0). Single-consumer.
- `ui/hooks/useMouseEvents.ts` — enable/parse/disable lifecycle
  hook. Listens on stdin via `useStdin().stdin`, runs handler
  through a ref so callers don't have to memoize.
- `ui/components/shared/ScrollableList.tsx` — subscribe to mouse
  events, route wheel → `scrollBy(±3)`. Also drops a dead outer
  `<Box flexGrow={1}>` wrapper that held an unread containerRef
  and collapsed to zero height in ink-testing-library (the test
  renderer has no flex parent, so flexGrow=1 → 0 height → no items
  ever rendered, which is how this dead code was exposed).

Tests:
- `ui/utils/mouse.test.ts` — 14 cases: SGR parsing (wheel, presses,
  modifiers, move), X11 parsing, fallback chain, incomplete-sequence
  guard (including the >50-byte garbage cap).
- `ui/components/shared/ScrollableList.test.tsx` — 3 cases: wheel
  events shift the rendered window; hasFocus=false makes the mouse
  pipeline inactive (no throw); non-wheel events leave the window
  unchanged. Renders are wrapped in `<KeypressProvider>` (required
  by useKeypress in production but easy to forget in standalone
  tests).

Docs:
- `docs/users/reference/keyboard-shortcuts.md` — adds "Mouse wheel"
  row + the Shift/Option-to-select note.
- `packages/cli/src/config/settingsSchema.ts` — the in-app dialog
  description now mentions mouse wheel and the text-select bypass.
- `docs/design/virtual-viewport/README.md` — §1 status, §5 file map,
  §7 PR sequence all reflect mouse wheel landing in #4146 and the
  V.4–V.7 follow-up split (scrollbar drag / in-app search / alt-
  buffer / host-scrollback dual-write research).

Verified: tsc --noEmit -p packages/cli ✓, vitest 182/182 ✓ across
AppContainer / MainContent / VirtualizedList / ScrollableList /
useBatchedScroll / mouse / keyMatchers / settingsSchema.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(cli): auto-hide animation for VP scrollbar thumb

Pairs with the SGR mouse-wheel work from the previous commit:
when the user actually scrolls, the thumb pops bright; after a
1.5s idle it fades into the dim track so the bar stops competing
with the conversation. The track column itself stays in layout
regardless, so the viewport never reflows mid-flash (which would
trigger per-item re-measure and a visible jitter).

Implementation kept minimal for stock ink 7:
- gemini-cli's `useAnimatedScrollbar` interpolates RGB colors via
  a theme + per-frame setInterval. The terminal can't render
  smooth fades anyway, so this hook collapses the state to a
  binary `isVisible` flag with a single setTimeout. ~75 LoC.
- `VirtualizedList` calls `flashScrollbar()` from a useLayoutEffect
  keyed on `clampedScrollTop`. The very first commit is skipped
  via a ref so initial mount doesn't paint a flash.
- The render switches the thumb glyph (`█` vs `│`) and `dimColor`
  based on `isVisible && inThumb`. Width stays 1 either way.

Tests (6 new):
- initial mount stays hidden (no spurious mount flash)
- flash → visible, hides after idle timeout, successive flashes
  reset the timer (no premature hide), idleHideMs<=0 disables
  auto-hide for tests that want to assert on the visible state,
  unmount cleans up the pending timer.

Doc updates:
- `docs/design/virtual-viewport/README.md` §1 status, §5 file map,
  §7 PR sequence — V.4 row now scopes only the drag/click-jump
  work (still coord-blocked); animated scrollbar moved out of
  deferred and into shipped.
- PR #4146 body — architecture table mentions the auto-hide, new
  files list adds `useAnimatedScrollbar.ts`, test count refreshed
  to 188/188.

Verified: tsc --noEmit -p packages/cli ✓, vitest 188/188 ✓.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): VP review round 6 — ESC bug, CI lint, scope-controlled cleanup

Triage of /review feedback from 2026-05-18 + 2026-05-19. Took the
ones that are real and small; declined the ones that are
false-positive / out-of-scope so this PR stops expanding.

Must-fix:
- CI Lint failure: vscode-ide-companion/schemas/settings.schema.json
  was stale after the keyboard-shortcuts description bump. Regenerated
  via `npm run generate:settings-schema`.
- useMouseEvents.ts had `const ESC = '';` (literal empty string after
  the raw 0x1B byte got stripped somewhere in the source pipeline).
  `buffer.indexOf('', 1) === 1` would have degraded garbage skipping
  to a one-byte scan, and the `else { buffer = ''; break }` branch
  could never run. Fixed by switching to the `'\x1b'` text escape and
  doing the same in `mouse.ts` (which had the raw byte, also fragile).
  Comment explains why.

Small wins (one-liners taken from the review batch):
- ScrollableList: rest-spread separates `hasFocus` from the props
  forwarded to VirtualizedList. Latent collision risk; no behaviour
  change today.
- VirtualizedList: `debugLogger.debug` when isReady=false so blank-
  viewport edge cases (tiny terminal / mid-resize race) become
  diagnosable from the debug log instead of looking like a hang.

Real perf (VP-only):
- MainContent: gated the progressive-Static-replay machinery behind
  `!useVirtualScroll`. The render-phase reset still consumes the
  remount-key bump so flag-off toggles mid-session catch up cleanly,
  but `setReplayCount` and the setImmediate chunking effect are now
  skipped for VP users. Saves ~M/CHUNK_SIZE wasted re-renders per
  Ctrl+O / model change on a 1000-turn session.

Belt-and-braces:
- useMouseEvents: added a `process.on('exit')` handler that writes
  the SGR mouse disable seq again. The React cleanup already covers
  normal unmount, but Ctrl+C / SIGTERM / parent kill bypass it and
  the terminal would otherwise stay in button-event-tracking mode
  after qwen exits.

Explicitly declined / deferred (with reasoning logged on the PR):
- requestAnimationFrame wheel throttle: rAF doesn't exist in Node;
  React 19 already batches state updates within a tick, and the
  renderedItems memo bounds the actual work to visible items. Will
  revisit if profiling shows it.
- Stable pending-item IDs (`p-N` keys shifting on completion): the
  observable jitter is at most one frame of estimated-vs-actual
  height delta. Moderate scope (creation-time ID allocation); fits
  better in a focused follow-up than in this PR.

Verified: tsc --noEmit -p packages/cli ✓, vitest 188/188 ✓ across
the full VP suite.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): scrollBy bottom uses live end anchor in virtualized list

When keyboard scroll reaches the bottom, scrollBy set isStickingToBottom
but anchored via getAnchorForScrollTop(maxScroll), a fixed {index,offset}
pixel anchor. scrollTo/scrollToEnd instead use {index: last, offset:
SCROLL_TO_ITEM_END}, which recomputes the bottom from live item heights
each render. The fixed anchor did not track the last item growing during
streaming, so scroll-to-bottom via keyboard lagged behind new tokens.
Align scrollBy's bottom branch with the sibling methods.

Reported by wenshao in PR review.

* fix(cli): parse mouse events via ink useInput, not a stdin data listener

useMouseEvents attached its own stdin.on('data', ...) listener. Adding a
'data' listener switches stdin into flowing mode, which drains the buffer
before ink's readable + stdin.read() reader (ink App) can consume it, so
all keyboard input routed through useInput was silently starved while
mouse mode was active.

Parse mouse sequences from ink's existing input pipeline via useInput
instead, so there is only one stdin reader. ink captures a full SGR
sequence (ESC [ < .. M/m) as a single CSI event and delivers it with the
leading ESC stripped, so we re-prepend it before parsing. Non-mouse input
does not match and is ignored; ink still routes input to the app's other
useInput handlers, so keyboard navigation keeps working.

Only SGR mode (1006h, which we enable) is parsed via this path; the legacy
X11 encoding is not recoverable through ink's CSI parser, which is the
encoding modern terminals stop emitting once 1006h is set.

Reported by wenshao in PR review.

* fix(cli): parse only SGR in mouse hook to avoid X11 paste misfire

The useInput-based mouse hook called parseMouseEvent, which also tries the
X11 fallback (parseX11MouseEvent). An X11 prefix (ESC [ M + 3 bytes) can
reach the handler via pasted text — ink emits paste content as input when
no paste listener is registered — and would misfire a spurious mouse event.
Call parseSGRMouseEvent directly so only the SGR encoding we enable (1006h)
is parsed, matching the hook's documented contract.

Reported by wenshao in PR review.

* test(cli): assert SGR mouse parser rejects X11 sequences

Locks in the security property behind the parseMouseEvent ->
parseSGRMouseEvent switch in useMouseEvents: an X11 sequence arriving as
pasted text must not misfire a mouse event. Asserts a well-formed X11
sequence is a valid X11 event yet returns null from parseSGRMouseEvent, so
a future revert to parseMouseEvent fails this test.

Reported by wenshao in PR review.

* test(cli): add VP scroll coverage + eslint-disable for useBatchedScroll

Cover keyboard scroll commands (Shift+Up/Down, PageUp/Down, Ctrl+Home/End),
scrollBy/scrollTo imperative API (positive/negative/overflow/clamp), and
auto-scroll-during-streaming state machine (stick-to-bottom, disengage on
user scroll, re-engage on scrollToEnd). Add missing eslint-disable-next-line
for intentionally dep-free useLayoutEffect in useBatchedScroll.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* chore(cli): remove trailing whitespace in useBatchedScroll

The eslint-disable-next-line comment was removed by eslint --fix as an
unused directive (exhaustive-deps does not flag a useLayoutEffect with
no dependency array). Clean up the residual blank line.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

---------

Co-authored-by: 秦奇 <gary.gq@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(cli): background housekeeping for stale file-history dirs (#4414)

PR #4064 introduced ~/.qwen/file-history/{sessionId}/ for /rewind but had
no cross-session cleanup — directories accumulated indefinitely. This adds
a generic background housekeeping framework with file-history cleanup as
its first user.

- 30-day mtime sweep, configurable via general.cleanupPeriodDays
- 10-min startup delay (1-min catch-up if last run >7d ago)
- 24h recurring cadence, idle-gated (defers if user typed in last 1 min)
- O_EXCL lockfile + marker mtime throttle (multi-process safe)
- Current session whitelisted via lazy config.getSessionId() — defends
  against long-idle active sessions and /clear minting a new session
- Negative cleanupPeriodDays values clamp to 1h minimum (defends against
  schema-bypass: a future cutoff would otherwise sweep everything)
- Zero new prod dependencies; ~70 lines of self-written O_EXCL throttle
  primitive in lieu of proper-lockfile (which pulls graceful-fs and
  monkey-patches every fs method on first require)
- All setTimeout(...).unref() — never blocks process exit

Closes #4173.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fix(core): loosen auto-mode classifier timeouts, disable stage-2 thinking (#4680)

* fix(core): loosen auto-mode classifier timeouts, disable stage-2 thinking

The AUTO-mode classifier fails closed on timeout — a timed-out judge call
blocks the action as "unavailable". The tight 3s/10s stage budgets turned
transient slowness (slow network, large transcript, model queueing) into
spurious blocks of otherwise-valid actions. Raise them to 10s/30s so a
slow-but-healthy call is not treated as a hard block.

Also disable thinking in stage 2 (previously the only stage with
includeThoughts: true). This is a latency-sensitive permission gate the
user is actively waiting on; allocating a reasoning budget made the review
path slower and more expensive, which directly worsened the fail-closed
timeout. The model still records its reasoning in the structured
`thinking` output field — it just no longer gets an allocated budget.

Closes #4676

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* docs(core): trim verbose comments in auto-mode classifier

Condense the three comments touched by this change (module docstring
stage-2 note, timeout-budget rationale, stage-2 thinkingConfig) while
keeping the essential "why". No logic changes.

Co-authored-by: Qwen-Coder <noreply@qwenlm.ai>

---------

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <noreply@qwenlm.ai>

* fix(core): coerce hostile-provider usage token counts (#4350 part 1) (#4439)

* fix(core): coerce hostile-provider usage token counts (#4350 part 1)

Hostile providers (broken upstream, OpenAI-compat proxy returning
null/NaN, misconfigured override) can emit non-finite or negative
values for `usageMetadata.{prompt,candidates,cached,total}TokenCount`.
Captured unguarded in `processStreamResponse`, these poison the
compaction gate arithmetic:

- `lastPromptTokenCount + NaN >= hard` is always false → hard-rescue
  is silently disabled, eventually OOMing the V8 heap.
- `Infinity >= hard` is always true → hard-rescue fires every send.

Route the four API capture sites through a `coerceUsageCount` helper
that maps unknown / non-finite / negative to 0. `Number.isFinite(-1)`
is true, so an explicit `>= 0` is needed in addition to `isFinite`.

Part 1 of the hostile-provider hardening from #4350. The companion
`computeThresholds` guard depends on the un-merged three-tier ladder
in #4345 and is deferred until that lands.

Covered by parametrized tests in `geminiChat.test.ts` over NaN,
±Infinity, negative, null, undefined, and string inputs, plus a
fallback test asserting a…
@DragonnZhang DragonnZhang deleted the dragon/feat-reproduce-skill branch June 4, 2026 14:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type/feature-request New feature or enhancement request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants