feat(core)!: redesign auto-compaction thresholds with three-tier ladder by LaZzyMan · Pull Request #4168 · QwenLM/qwen-code

LaZzyMan · 2026-05-15T07:57:30Z

Summary

What changed: Replaces qwen-code's single 70% proportional auto-compaction threshold with a three-tier ladder (warn / auto / hard) that combines proportional fallback with absolute reservation. Also disables thinking + caps maxOutputTokens on the compression sideQuery, upgrades failure handling from a one-shot lock to a 3-strike circuit breaker, adds a local token estimator for the cheap-gate, plumbs a hard-tier rescue into sendMessageStream, rewires the /context command and tipRegistry tips around the new thresholds, and removes the chatCompression.contextPercentageThreshold setting.
Why it changed: The old 70% formula reserved 30% of the window unconditionally — on a 1M model that's 300K wasted. Aligning with claude-code's absolute-reservation design recovers ~267K on 1M and ~44K on 256K models while keeping proportional behaviour for small windows. Bundled in: failure recovery (1-shot lock made transient errors permanent), first-send / --continue coverage (lastPromptTokenCount = 0 previously bypassed all gates), and predictable buffer math across providers (thinking budget semantics vary).
Reviewer focus:
- packages/core/src/services/chatCompressionService.ts — new computeThresholds(), tier constants, cheap-gate
- packages/core/src/services/tokenEstimation.ts — local char/4 estimator
- packages/core/src/core/geminiChat.ts — hard-tier rescue + consecutiveFailures breaker
- packages/cli/src/ui/commands/contextCommand.ts — /context display
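
For reviewers who want a feel for the cheap-gate before opening `tokenEstimation.ts`, here is a minimal sketch of a char/4 estimator with a first-send fallback. It is not the PR's code: the `PartSketch` type and both function names are hypothetical, and it counts UTF-16 code units via `string.length` rather than bytes. Only the 4:1 ratio and the 1600-token image default are taken from this thread.

```typescript
// Sketch only, not the contents of tokenEstimation.ts.
const BYTES_PER_TOKEN = 4; // ratio named in the review discussion
const DEFAULT_IMAGE_TOKEN_ESTIMATE = 1600; // default cited in the review response

// Hypothetical, simplified message part: either text or an opaque image.
type PartSketch = { text: string } | { image: true };

function estimatePartTokensSketch(part: PartSketch): number {
  if ('text' in part) {
    // string.length counts UTF-16 code units, used here as a cheap proxy.
    return Math.ceil(part.text.length / BYTES_PER_TOKEN);
  }
  return DEFAULT_IMAGE_TOKEN_ESTIMATE;
}

function estimatePromptTokensSketch(
  lastPromptTokenCount: number,
  history: PartSketch[],
  pending: PartSketch[],
): number {
  const sum = (parts: PartSketch[]): number =>
    parts.reduce((total, part) => total + estimatePartTokensSketch(part), 0);
  // First send or --continue: no provider-reported count yet, so estimate
  // the whole history locally instead of treating the prompt as empty.
  const base = lastPromptTokenCount > 0 ? lastPromptTokenCount : sum(history);
  // The reported count lags by one turn, so add the message about to be sent.
  return base + sum(pending);
}
```

The point of the fallback branch is that a zero `lastPromptTokenCount` no longer reads as "the context is empty", which is how the gates were previously bypassed.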

Validation

Commands run:
```bash
npm run typecheck # clean (4 workspaces)
npm run lint # clean (project files; pre-existing e2e-testing/scripts/*.js untouched)
cd packages/core && npx vitest run src/services src/core # 1930/1930 pass
cd packages/cli && npx vitest run # 5995/5995 pass + 9 skipped
```
Quickest reviewer verification path:
1. `computeThresholds(window)` is pure — read it and the 6 unit-test cases in `chatCompressionService.test.ts` (32K / 64K / 128K / 200K / 1M / 10K-extreme) to confirm the math.
2. The cheap-gate / hard-rescue wiring is covered by the new describe blocks in `chatCompressionService.test.ts` ("cheap-gate uses computeThresholds.auto", "computeThresholds") and `geminiChat.test.ts` ("compression failure circuit breaker", "sendMessageStream hard-tier rescue", "sendMessageStream first-turn estimation").
3. /context output covered by new `contextCommand.test.ts` tests (warn / auto / hard / safe tier classification + estimated-path fallback).
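
The tier classification that `/context` reports can be pictured as follows. This is a minimal sketch, assuming inclusive lower bounds and that the highest matching tier wins; the `TierSketch` type and `classifyTierSketch` name are hypothetical and not taken from `contextCommand.ts`.

```typescript
// Sketch only: boundary semantics (>=, highest tier first) are assumptions.
type TierSketch = 'safe' | 'warn' | 'auto' | 'hard';

interface LadderSketch {
  warn: number;
  auto: number;
  hard: number;
}

function classifyTierSketch(promptTokens: number, ladder: LadderSketch): TierSketch {
  // Check the highest tier first so that a degenerate ladder
  // (hard === auto on small windows) reports the stricter tier.
  if (promptTokens >= ladder.hard) return 'hard';
  if (promptTokens >= ladder.auto) return 'auto';
  if (promptTokens >= ladder.warn) return 'warn';
  return 'safe';
}
```

Under these assumptions, a 32K window where `hard` equals `auto` never reports `auto` on its own; whether the real command behaves that way is worth confirming against the tests listed above.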

Evidence:

Threshold table across windows (matches design doc):

| window | warn | auto | hard | dominant |
|---|---|---|---|---|
| 32K | 19.2K (pct) | 22.4K (pct) | 22.4K (degenerate, equals auto) | proportional |
| 128K | 76.8K (pct) | 95K (abs) | 105K (abs) | mixed |
| 200K | 147K (abs) | 167K (abs) | 177K (abs) | absolute |
| 1M | 947K (abs) | 967K (abs) | 977K (abs) | absolute |
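
The rows above are consistent with a `max(proportional, absolute)` rule per tier. The sketch below reproduces them under assumed constants: the 60% warn floor and the 53K / 33K / 23K reservations are inferred from the table, not read from `chatCompressionService.ts`, and "K" is taken to mean 1000. Only the 70% floor corresponds to the `DEFAULT_PCT = 0.7` named in this PR. All identifiers are hypothetical.

```typescript
// Sketch only: constants are back-derived from the threshold table.
interface ThresholdsSketch {
  warn: number;
  auto: number;
  hard: number;
}

const WARN_PCT = 60; // assumed proportional floor for the warn tier
const AUTO_PCT = 70; // corresponds to DEFAULT_PCT = 0.7 in the PR
const WARN_RESERVE = 53_000; // assumed absolute reservations, in tokens
const AUTO_RESERVE = 33_000;
const HARD_RESERVE = 23_000;

function computeThresholdsSketch(windowTokens: number): ThresholdsSketch {
  // Integer percent arithmetic avoids floating-point noise from 0.6 / 0.7.
  const pct = (percent: number): number => Math.floor((windowTokens * percent) / 100);
  const warn = Math.max(pct(WARN_PCT), windowTokens - WARN_RESERVE);
  const auto = Math.max(pct(AUTO_PCT), windowTokens - AUTO_RESERVE);
  // On small windows the absolute term falls below auto, so hard collapses to auto.
  const hard = Math.max(auto, windowTokens - HARD_RESERVE);
  return { warn, auto, hard };
}
```

With these assumed constants a 1M window gives `auto = 967K` against 700K under the old 70% rule, which is the ~267K recovery quoted in the summary.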

Not validated:
- No live model run. All tests are unit/integration with mocked sideQuery; the new `maxOutputTokens=20K` cap and `includeThoughts=false` settings on the compression call were verified to be passed correctly but not yet exercised against a real provider. Worth running a few real auto-compactions on Dashscope / Anthropic / Gemini before merge to confirm summary quality holds.

Scope / Risk

Main risk: Tighter buffer on large windows. On a 1M model auto used to trigger at 700K and now triggers at 967K; that's intentional (recovers 267K of usable context) but compression has less headroom. The `maxOutputTokens=20K` cap mitigates output-side blowups; reactive overflow still latches as a safety net.
Breaking changes / migration notes:
- `chatCompression.contextPercentageThreshold` setting removed. Settings files containing it log a one-line stderr deprecation warning at startup and the value is ignored. The proportional floor (`DEFAULT_PCT = 0.7`) is now an internal constant.
- Public API surface expansion: `@qwen-code/qwen-code-core` now re-exports `computeThresholds` and `CompactionThresholds` (consumed by the CLI `/context` command and tip registry).
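
To illustrate the migration behaviour for the removed setting, here is a minimal sketch of "warn once on stderr, then ignore the value". The function name, its signature, and the warning text are hypothetical; the actual handling lives in the config loading code and may differ.

```typescript
// Sketch only: illustrates the described behaviour, not the real loader.
const REMOVED_KEY = 'contextPercentageThreshold';

function dropRemovedThresholdSketch(
  chatCompression: Record<string, unknown>,
  warn: (message: string) => void = (message) => console.error(message),
): Record<string, unknown> {
  if (!(REMOVED_KEY in chatCompression)) return chatCompression;
  // Copy so the caller's settings object is not mutated.
  const rest: Record<string, unknown> = { ...chatCompression };
  delete rest[REMOVED_KEY];
  // Hypothetical wording; the PR only states that a one-line warning is logged.
  warn(`chatCompression.${REMOVED_KEY} is no longer supported and is ignored.`);
  return rest;
}
```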
Not covered / not validated:
- `packages/cli/src/ui/components/views/ContextUsage.tsx` still consumes `autocompactBuffer` (redefined as `contextWindowSize - thresholds.auto` for compat). Refactoring it to render the three-tier ladder directly is a recommended follow-up.
- Telemetry-based calibration of the 20K `SUMMARY_RESERVE` against qwen workloads is not in scope; claude-code's p99.99 of 17K is the basis. Worth observing once shipped.
- The `COMPACT_MAX_OUTPUT_TOKENS = 20K` cap could clip very long compaction summaries — a `finish_reason === MAX_TOKENS` NOOP guard in `compress()` would be a sensible follow-up.

Testing Matrix

| | 🍏 | 🪟 | 🐧 |
|---|---|---|---|
| npm run | ✅ | ⚠️ | ⚠️ |
| npx | ⚠️ | ⚠️ | ⚠️ |
| Docker | ⚠️ | ⚠️ | ⚠️ |
| Podman | ⚠️ | N/A | N/A |
| Seatbelt | ⚠️ | N/A | N/A |

Testing matrix notes:

Implementation and verification ran on macOS (Apple Silicon). All other rows are unit-test coverage only — no platform-specific code was touched (this PR is pure TypeScript logic + tests), so cross-platform risk is low. Windows / Linux validation welcome before merge.

Design references

Design doc: `docs/design/auto-compaction-threshold-redesign.md`
Implementation plan: `docs/plans/2026-05-14-auto-compaction-threshold-redesign.md`

Both are committed in this PR so the rationale is visible alongside the code.

github-actions · 2026-05-15T07:59:54Z

📋 Review Summary

This PR redesigns qwen-code's auto-compaction threshold system from a single 70% proportional threshold to a three-tier ladder (warn/auto/hard) combining proportional fallback with absolute reservation. The implementation aligns with claude-code's design, recovers significant wasted context on large windows (~267K on 1M models), and bundles several related improvements: 3-strike failure circuit breaker, local token estimation for accurate threshold gating, and predictable output budget control via maxOutputTokens cap with thinking disabled.

🔍 General Feedback

Excellent design documentation: The design doc (docs/design/auto-compaction-threshold-redesign.md) provides clear rationale, mathematical formulas, empirical data tables, and implementation phases. This makes the review significantly easier.
Well-structured implementation: The code changes follow the design spec closely, with constants co-located, pure functions for threshold computation, and clear separation of concerns.
Comprehensive test coverage: New unit tests cover threshold computation across window sizes (32K/64K/128K/200K/1M/extreme), token estimation functions, cheap-gate behavior, and circuit breaker logic.
Thoughtful risk acknowledgment: The PR body explicitly identifies the main risk (tighter buffer on large windows), breaking changes, and what wasn't validated (live model runs).
Cross-file consistency: Constants and formulas are consistent across chatCompressionService.ts, geminiChat.ts, contextCommand.ts, and tipRegistry.ts.

🎯 Specific Feedback

🟡 High Priority Issues

File: packages/core/src/services/chatCompressionService.ts:374-380 - The runSideQuery call sets maxOutputTokens: COMPACT_MAX_OUTPUT_TOKENS and includeThoughts: false, but there's no guard for when finish_reason === 'MAX_TOKENS'. If the summary gets clipped at 20K, the code may persist a truncated summary. The design doc (risk #2) acknowledges this and suggests a follow-up, but given this is a fundamental change to compression behavior, consider adding at least a defensive check now:
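```typescript
// After runSideQuery, check whether the summary was truncated at the cap.
// `>=` rather than `===` so any count at or beyond the cap is caught;
// `?? 0` keeps the comparison well-typed when usage is missing.
if ((summaryResult.usage?.candidatesTokenCount ?? 0) >= COMPACT_MAX_OUTPUT_TOKENS) {
  // Log a warning, or treat as NOOP, to avoid persisting a truncated summary.
  config.getDebugLogger().warn('Compression summary hit maxOutputTokens limit');
}
```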
File: packages/core/src/services/tokenEstimation.ts:34-40 - The estimateContentTokens function uses DEFAULT_IMAGE_TOKEN_ESTIMATE as a fallback when called without precomputed counts. The comment notes this is "a test-friendly default" but production callers "MUST pass precomputedCharCounts". This is a footgun — consider making the parameter required or using a clearer default that won't silently produce wrong estimates if a future caller forgets to pass it.

🟢 Medium Priority Issues

File: packages/core/src/services/chatCompressionService.ts:67-75 - The TOOL_ROUND_RETAIN_COUNT constant is exported but only used internally by findCompressSplitPoint and splitPointRetainingTrailingPairs. Unless there's a planned external consumer, this should be private to avoid polluting the public API surface.
File: packages/cli/src/ui/commands/contextCommand.ts:177-183 - The code still references config.getChatCompression()?.contextPercentageThreshold for backward compatibility, but this PR removes that field. The deprecation warning logic in config.ts should handle this, but the context command should migrate to using computeThresholds() directly for consistency with the new design.
File: packages/core/src/core/geminiChat.ts - The PR description mentions consecutiveFailures counter and hard-tier rescue wiring in sendMessageStream, but the diff shows these changes are in a file that already exists in the repo. Need to verify the consecutiveFailures state is properly initialized and reset across session boundaries (e.g., when chat is restored from disk via --continue).

🔵 Low Priority Suggestions

File: packages/core/src/services/chatCompressionService.ts:89-105 - The computeThresholds function is excellent and well-documented. Consider adding a JSDoc @example showing the threshold table from the design doc (32K/128K/200K/1M rows) so future maintainers can quickly verify the formula behavior without consulting external docs.
File: packages/core/src/services/tokenEstimation.ts - The BYTES_PER_TOKEN = 4 constant is used throughout, but BYTES_PER_TOKEN_JSON = 2 (mentioned in design doc) isn't actually used anywhere in the implementation. Either remove it from the code or add a comment explaining where it would apply (currently only estimateContentChars in compactionInputSlimming.ts uses it for functionCall/functionResponse).
File: packages/cli/src/services/tips/tipRegistry.ts - The PR description mentions rewriting three context-* tips to follow the new thresholds, but this file isn't in the changed files list. Verify the tip registry changes are included before merge.
File: packages/core/src/index.ts - The PR mentions exporting computeThresholds and CompactionThresholds from @qwen-code/qwen-code-core for CLI consumption. Ensure these exports are added to the public index to avoid breaking the CLI's import path.

✅ Highlights

Mathematically sound threshold design: The computeThresholds function elegantly handles both small windows (proportional fallback) and large windows (absolute reservation) via max() formulas. The test cases verify the ladder always satisfies warn <= auto <= hard.
Circuit breaker upgrade: Moving from "1 failure = permanent lock" to "3 consecutive failures = temporary circuit break" is a significant reliability improvement. The implementation correctly excludes force=true calls from the counter.
Token estimation closes critical gaps: The estimatePromptTokens function properly handles both the "lagging by one turn" and "first-turn zero" issues that previously caused threshold bypass scenarios.
Cross-provider consistency: Disabling thinking and capping maxOutputTokens addresses the inconsistent semantics across Anthropic/OpenAI/Gemini providers, making buffer predictions reliable.
Comprehensive validation: All 1930 core tests and 5995 CLI tests pass, plus typecheck and lint are clean. The test matrix in the PR body is transparent about platform coverage.
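
The circuit-breaker behaviour praised above can be sketched in a few lines. This is an illustration under assumptions, not the `GeminiChat` implementation: the class and constant names are hypothetical, forced calls are assumed to neither increment nor reset the counter, and the breaker is assumed to stay open for the lifetime of the instance (a fresh instance starts at zero).

```typescript
// Sketch only: a 3-strike breaker for automatic compression attempts.
const MAX_CONSECUTIVE_FAILURES_SKETCH = 3; // "3-strike" per the PR; name is hypothetical

class CompressionBreakerSketch {
  private consecutiveFailures = 0;

  /** Automatic attempts are skipped once tripped; forced attempts always run. */
  shouldAttempt(force: boolean): boolean {
    return force || this.consecutiveFailures < MAX_CONSECUTIVE_FAILURES_SKETCH;
  }

  /** Record the outcome of an attempt. */
  recordResult(succeeded: boolean, force: boolean): void {
    if (force) return; // assumption: forced calls are excluded from the counter entirely
    // Any automatic success clears the streak; a failure extends it.
    this.consecutiveFailures = succeeded ? 0 : this.consecutiveFailures + 1;
  }
}
```

Compared with a one-shot lock, a single transient error here costs one strike rather than disabling auto-compaction for the rest of the session.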

Adds a defensive guard in ChatCompressionService.compress() that detects when the side-query summary hit COMPACT_MAX_OUTPUT_TOKENS (20K). In that case the summary is likely truncated mid-content, so we drop it and return NOOP rather than persist a half-summary. The next send re-tries; reactive overflow still catches the catastrophic case where the API rejects the next request as too large. Documented in the design doc as risk #2; the bot reviewer on PR #4168 correctly pushed for it to land alongside the threshold redesign rather than as a follow-up since the new 20K cap is what makes truncation likely in the first place. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
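
The guard described in this commit message amounts to the check below. A minimal sketch, assuming the cap is exactly 20,000 tokens and using a hypothetical result type and function name; the real logic sits inside `ChatCompressionService.compress()` and carries more state.

```typescript
// Sketch only: drop a summary that reached the output cap.
const COMPACT_MAX_OUTPUT_TOKENS = 20_000; // "20K" in the PR; exact value assumed

// Hypothetical, simplified outcome type.
type CompressOutcomeSketch =
  | { status: 'NOOP'; reason: string }
  | { status: 'COMPRESSED'; summary: string };

function acceptSummarySketch(
  summary: string,
  compressionOutputTokenCount: number,
): CompressOutcomeSketch {
  // Hitting the cap suggests the summary was cut off mid-content,
  // so discard it rather than persist a partial summary.
  if (compressionOutputTokenCount >= COMPACT_MAX_OUTPUT_TOKENS) {
    return { status: 'NOOP', reason: 'summary reached maxOutputTokens' };
  }
  return { status: 'COMPRESSED', summary };
}
```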

LaZzyMan · 2026-05-15T08:18:19Z

Review response — commit `6ce81e73c`

Thanks for the thorough pass. Triage results per finding:

| # | Outcome | Notes |
|---|---|---|
| 🟡 H1: `MAX_TOKENS` guard | ✅ Fixed | Added a defensive check in `compress()` that NOOPs when `compressionOutputTokenCount >= COMPACT_MAX_OUTPUT_TOKENS` so a truncated summary isn't persisted. Includes a unit test asserting the path. Commit `6ce81e73c`. |
| 🟡 H2: `estimateContentTokens` footgun | ❌ Declined (false-positive) | The "MUST pass `precomputedCharCounts`" warning the comment references is on `findCompressSplitPoint`, not `estimateContentTokens`. The latter's `imageTokenEstimate` parameter has a benign default (`DEFAULT_IMAGE_TOKEN_ESTIMATE = 1600`) that matches the splitter's default, keeping the two estimators in sync. Different functions, different contracts. |
| 🟢 M1: Hide `TOOL_ROUND_RETAIN_COUNT` | ❌ Declined (out of PR scope) | `TOOL_ROUND_RETAIN_COUNT` was already exported in `chatCompressionService.ts` before this PR. Reducing export visibility is a separate cleanup that I don't want to bundle into a threshold-redesign change. |
| 🟢 M2: `contextCommand.ts:177-183` still references the deprecated field | ❌ Declined (stale read) | This was actually rewritten by Task 11 of the redesign: `contextCommand.ts` now imports `computeThresholds` (line 28) and uses `computeThresholds(contextWindowSize)` (line 190). Grep `contextPercentageThreshold` in `packages/cli/src/ui/commands/contextCommand.ts` is empty. |
| 🟢 M3: `consecutiveFailures` across `--continue` | ❌ Declined (works as intended) | `consecutiveFailures` is a `private` field on `GeminiChat` initialized to `0`. `--continue` constructs a fresh `GeminiChat` (history is restored separately), so the counter naturally resets, which is the correct semantics (a restarted session should get a fresh 3-strike budget rather than inheriting a latched breaker from a previous run). |
| 🔵 L1: JSDoc `@example` with threshold table | ❌ Declined (filter 3) | The same table lives in `docs/design/auto-compaction-threshold-redesign.md` (committed in this PR). Duplicating it in JSDoc creates two sources of truth that can drift independently when the constants are tuned. |
| 🔵 L2: Missing `BYTES_PER_TOKEN_JSON = 2` | ❌ Declined (not in code) | `BYTES_PER_TOKEN_JSON` doesn't exist in `tokenEstimation.ts`. The design doc only mentions it as a future possibility for json-dense content; the implementation deliberately uses a single `BYTES_PER_TOKEN = 4` ratio (matching claude-code's approach). |
| 🔵 L3 / L4: Verify files in PR | N/A | Both `packages/cli/src/services/tips/tipRegistry.ts` and `packages/core/src/index.ts` are in this PR (commit `28eb867a8`); please re-check the changed-files view. |

Net: 1 fix accepted, 6 declined, 2 hallucinations dismissed. Force-pushed earlier (rebase onto main + the consecutiveFailures test fixup `d270af030`); this comment lands on top of the new merge-conflict-free branch tip `6ce81e73c`.

🤖 Drafted with Claude Code using the review-response skill.

github-actions · 2026-05-15T08:39:55Z

Code Coverage Summary

| Package | Lines | Statements | Functions | Branches |
|---|---|---|---|---|
| CLI | 77.23% | 77.23% | 79.89% | 79.84% |
| Core | 79.59% | 79.59% | 82.19% | 82.85% |

CLI Package - Full Text Report

-------------------|---------|----------|---------|---------|-------------------
File               | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s 
-------------------|---------|----------|---------|---------|-------------------
All files          |   77.23 |    79.84 |   79.89 |   77.23 |                   
 src               |    75.9 |    69.11 |   80.55 |    75.9 |                   
  gemini.tsx       |   68.53 |     66.4 |   76.47 |   68.53 | ...29,946-949,957 
  ...ractiveCli.ts |   80.23 |     68.3 |   78.57 |   80.23 | ...1054,1092,1195 
  ...liCommands.ts |   74.51 |    73.17 |     100 |   74.51 | ...41-265,290,391 
  ...ActiveAuth.ts |     100 |     87.5 |     100 |     100 | 66-80             
 ...cp-integration |   61.97 |    65.24 |   78.12 |   61.97 |                   
  acpAgent.ts      |   63.32 |    65.35 |   83.05 |   63.32 | ...2112,2126-2134 
  authMethods.ts   |   12.19 |      100 |       0 |   12.19 | 11-31,34-38,41-50 
  errorCodes.ts    |       0 |        0 |       0 |       0 | 1-22              
  ...DirContext.ts |     100 |      100 |     100 |     100 |                   
 ...ration/service |   68.65 |    83.33 |   66.66 |   68.65 |                   
  filesystem.ts    |   68.65 |    83.33 |   66.66 |   68.65 | ...32,77-94,97-98 
 ...ration/session |   75.88 |    72.05 |   86.25 |   75.88 |                   
  ...ryReplayer.ts |   67.34 |     75.6 |   81.81 |   67.34 | ...54-269,282-283 
  Session.ts       |   74.93 |    70.81 |   88.46 |   74.93 | ...2658,2664-2667 
  ...entTracker.ts |   90.85 |    84.84 |      90 |   90.85 | ...35,199,251-260 
  index.ts         |       0 |        0 |       0 |       0 | 1-40              
  ...ssionUtils.ts |   84.21 |    77.77 |     100 |   84.21 | ...37-153,209-211 
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 ...ssion/emitters |   96.01 |    90.75 |    92.3 |   96.01 |                   
  BaseEmitter.ts   |   76.92 |    66.66 |      80 |   76.92 | 23-24,39-40,55-56 
  ...ageEmitter.ts |     100 |    89.47 |     100 |     100 | 109,111           
  PlanEmitter.ts   |     100 |      100 |     100 |     100 |                   
  ...allEmitter.ts |   98.06 |     92.3 |     100 |   98.06 | 227-228,327,335   
  index.ts         |       0 |        0 |       0 |       0 | 1-10              
 ...ession/rewrite |   90.36 |    87.83 |   94.11 |   90.36 |                   
  LlmRewriter.ts   |      81 |       84 |     100 |      81 | ...,88-89,155-159 
  ...Middleware.ts |   95.83 |    85.71 |     100 |   95.83 | 119,127-129       
  TurnBuffer.ts    |     100 |      100 |     100 |     100 |                   
  config.ts        |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 src/auth          |    97.7 |    94.81 |   95.45 |    97.7 |                   
  allProviders.ts  |     100 |      100 |     100 |     100 |                   
  ...iderConfig.ts |    97.6 |    95.04 |     100 |    97.6 | ...61,411,433-434 
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 src/auth/install  |   98.57 |    88.88 |     100 |   98.57 |                   
  ...nstallPlan.ts |   98.57 |    88.88 |     100 |   98.57 | 80,93             
 ...viders/alibaba |   96.96 |    66.66 |   66.66 |   96.96 |                   
  ...baStandard.ts |     100 |      100 |     100 |     100 |                   
  codingPlan.ts    |   93.67 |    66.66 |   66.66 |   93.67 | 83,87-89,94       
  tokenPlan.ts     |     100 |      100 |     100 |     100 |                   
 ...oviders/custom |     100 |      100 |     100 |     100 |                   
  ...omProvider.ts |     100 |      100 |     100 |     100 |                   
 ...roviders/oauth |    91.5 |    77.03 |   97.05 |    91.5 |                   
  openrouter.ts    |   84.37 |    33.33 |     100 |   84.37 | 43-48             
  ...outerOAuth.ts |    91.9 |    79.06 |   96.87 |    91.9 | ...53-655,699-701 
 ...ers/thirdParty |     100 |      100 |     100 |     100 |                   
  deepseek.ts      |     100 |      100 |     100 |     100 |                   
  idealab.ts       |     100 |      100 |     100 |     100 |                   
  minimax.ts       |     100 |      100 |     100 |     100 |                   
  modelscope.ts    |     100 |      100 |     100 |     100 |                   
  zai.ts           |     100 |      100 |     100 |     100 |                   
 src/commands      |   47.93 |    85.71 |   43.47 |   47.93 |                   
  auth.ts          |     100 |    83.33 |     100 |     100 | 11,14             
  channel.ts       |   56.66 |      100 |       0 |   56.66 | 15-19,27-34       
  extensions.tsx   |   96.55 |      100 |      50 |   96.55 | 37                
  hooks.tsx        |   66.66 |      100 |       0 |   66.66 | 20-24             
  mcp.ts           |   94.73 |      100 |      50 |   94.73 | 28                
  review.ts        |   51.85 |      100 |       0 |   51.85 | 24-35,38          
  serve.ts         |    7.74 |      100 |       0 |    7.74 | ...51-147,149-230 
 ...mmands/channel |   39.25 |    79.45 |      50 |   39.25 |                   
  ...l-registry.ts |    8.57 |      100 |       0 |    8.57 | 6-21,24-42        
  config-utils.ts  |      92 |      100 |   66.66 |      92 | 21-26             
  configure.ts     |    14.7 |      100 |       0 |    14.7 | 18-21,23-84       
  pairing.ts       |   26.31 |      100 |       0 |   26.31 | ...30,40-50,52-65 
  pidfile.ts       |   96.34 |    86.95 |     100 |   96.34 | 49,59,91          
  start.ts         |   30.98 |       52 |   69.23 |   30.98 | ...72-475,484-486 
  status.ts        |   17.85 |      100 |       0 |   17.85 | 15-26,32-76       
  stop.ts          |      20 |      100 |       0 |      20 | 14-48             
 ...nds/extensions |    84.5 |    88.95 |   81.81 |    84.5 |                   
  consent.ts       |   71.65 |    89.28 |   42.85 |   71.65 | ...85-141,156-162 
  disable.ts       |     100 |      100 |     100 |     100 |                   
  enable.ts        |     100 |      100 |     100 |     100 |                   
  install.ts       |    75.6 |    66.66 |   66.66 |    75.6 | ...39-142,145-153 
  link.ts          |     100 |      100 |     100 |     100 |                   
  list.ts          |     100 |      100 |     100 |     100 |                   
  new.ts           |     100 |      100 |     100 |     100 |                   
  settings.ts      |   99.15 |      100 |   83.33 |   99.15 | 151               
  uninstall.ts     |    37.5 |      100 |   33.33 |    37.5 | 23-45,57-64,67-70 
  update.ts        |   96.32 |      100 |     100 |   96.32 | 101-105           
  utils.ts         |   60.24 |    28.57 |     100 |   60.24 | ...81,83-87,89-93 
 ...les/mcp-server |       0 |        0 |       0 |       0 |                   
  example.ts       |       0 |        0 |       0 |       0 | 1-60              
 src/commands/mcp  |   92.29 |    86.08 |   88.88 |   92.29 |                   
  add.ts           |     100 |    98.03 |     100 |     100 | 293               
  list.ts          |   91.22 |    80.76 |      80 |   91.22 | ...19-121,146-147 
  reconnect.ts     |   76.72 |    71.42 |   85.71 |   76.72 | 35-48,153-175     
  remove.ts        |     100 |       80 |     100 |     100 | 21-25             
 ...ommands/review |   11.57 |      100 |       0 |   11.57 |                   
  cleanup.ts       |   17.94 |      100 |       0 |   17.94 | ...01-106,108-109 
  deterministic.ts |   13.75 |      100 |       0 |   13.75 | ...22-738,740-741 
  fetch-pr.ts      |   11.36 |      100 |       0 |   11.36 | ...80-201,203-204 
  load-rules.ts    |   11.32 |      100 |       0 |   11.32 | ...41-153,155-156 
  pr-context.ts    |    6.22 |      100 |       0 |    6.22 | ...97-312,314-315 
  presubmit.ts     |    9.35 |      100 |       0 |    9.35 | ...62-287,289-290 
 ...nds/review/lib |      30 |      100 |       0 |      30 |                   
  gh.ts            |   22.58 |      100 |       0 |   22.58 | ...49,53-54,62-69 
  git.ts           |   22.72 |      100 |       0 |   22.72 | 15-18,29-39,43-44 
  paths.ts         |   52.94 |      100 |       0 |   52.94 | ...26,37-38,42-43 
 src/config        |   92.79 |    84.88 |   88.09 |   92.79 |                   
  auth.ts          |   86.98 |    80.32 |     100 |   86.98 | ...26-227,243-244 
  config.ts        |   87.96 |    84.36 |      80 |   87.96 | ...1856,1858-1866 
  keyBindings.ts   |   96.55 |       50 |     100 |   96.55 | 193-196           
  ...idersScope.ts |      92 |       90 |     100 |      92 | 11-12             
  sandboxConfig.ts |   61.64 |    71.87 |   66.66 |   61.64 | ...54-68,73,77-89 
  settings.ts      |   85.76 |    87.25 |   89.18 |   85.76 | ...1148,1153-1156 
  ...ingsSchema.ts |     100 |      100 |     100 |     100 |                   
  ...tedFolders.ts |   96.22 |       94 |     100 |   96.22 | ...88-190,205-206 
 ...nfig/migration |   94.89 |    78.94 |   83.33 |   94.89 |                   
  index.ts         |   94.87 |    88.88 |     100 |   94.87 | 91-92             
  scheduler.ts     |   96.55 |    77.77 |     100 |   96.55 | 19-20             
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 ...ation/versions |   94.74 |       96 |     100 |   94.74 |                   
  ...-v2-shared.ts |     100 |      100 |     100 |     100 |                   
  v1-to-v2.ts      |   81.75 |    90.19 |     100 |   81.75 | ...28-229,231-247 
  v2-to-v3.ts      |     100 |      100 |     100 |     100 |                   
  v3-to-v4.ts      |     100 |      100 |     100 |     100 |                   
 src/core          |     100 |      100 |     100 |     100 |                   
  auth.ts          |     100 |      100 |     100 |     100 |                   
  initializer.ts   |     100 |      100 |     100 |     100 |                   
  theme.ts         |     100 |      100 |     100 |     100 |                   
 src/dualOutput    |   63.09 |    64.51 |   55.55 |   63.09 |                   
  ...tputBridge.ts |   62.94 |    65.51 |   56.25 |   62.94 | ...22-323,331-334 
  ...utContext.tsx |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-8               
 src/export        |       0 |        0 |       0 |       0 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-7               
 src/generated     |     100 |      100 |     100 |     100 |                   
  git-commit.ts    |     100 |      100 |     100 |     100 |                   
 src/i18n          |   81.47 |    75.94 |   65.71 |   81.47 |                   
  index.ts         |   63.68 |    69.56 |   53.84 |   63.68 | ...70-271,281-286 
  languages.ts     |   96.92 |    86.66 |     100 |   96.92 | 134-135,167,184   
  ...nslateKeys.ts |     100 |      100 |     100 |     100 |                   
  ...lationDict.ts |   93.33 |    66.66 |     100 |   93.33 | 15                
 src/i18n/locales  |     100 |      100 |     100 |     100 |                   
  ca.js            |     100 |      100 |     100 |     100 |                   
  de.js            |     100 |      100 |     100 |     100 |                   
  en.js            |     100 |      100 |     100 |     100 |                   
  fr.js            |     100 |      100 |     100 |     100 |                   
  ja.js            |     100 |      100 |     100 |     100 |                   
  pt.js            |     100 |      100 |     100 |     100 |                   
  ru.js            |     100 |      100 |     100 |     100 |                   
  zh-TW.js         |     100 |      100 |     100 |     100 |                   
  zh.js            |     100 |      100 |     100 |     100 |                   
 ...nonInteractive |   72.57 |    71.12 |   74.07 |   72.57 |                   
  session.ts       |   76.64 |     69.4 |   85.71 |   76.64 | ...23-824,833-843 
  types.ts         |    42.5 |      100 |   33.33 |    42.5 | ...80-581,584-585 
 ...active/control |   77.04 |    88.23 |      80 |   77.04 |                   
  ...rolContext.ts |    7.14 |        0 |       0 |    7.14 | 49-84             
  ...Dispatcher.ts |   91.66 |    91.83 |   88.88 |   91.66 | ...54-372,388,391 
  ...rolService.ts |       8 |        0 |       0 |       8 | 46-179            
 ...ol/controllers |    7.03 |       80 |   13.33 |    7.03 |                   
  ...Controller.ts |   19.32 |      100 |      60 |   19.32 | 81-118,127-210    
  ...Controller.ts |       0 |        0 |       0 |       0 | 1-56              
  ...Controller.ts |    3.94 |      100 |   11.11 |    3.94 | ...63-381,391-496 
  ...Controller.ts |   14.06 |      100 |       0 |   14.06 | ...82-117,130-133 
  ...Controller.ts |    5.21 |      100 |       0 |    5.21 | ...21-433,442-471 
 .../control/types |       0 |        0 |       0 |       0 |                   
  serviceAPIs.ts   |       0 |        0 |       0 |       0 | 1                 
 ...Interactive/io |   97.98 |     93.7 |   95.18 |   97.98 |                   
  ...putAdapter.ts |   97.89 |    92.82 |   98.07 |   97.89 | ...1303,1398-1399 
  ...putAdapter.ts |      96 |     90.9 |   85.71 |      96 | 51-52             
  ...nputReader.ts |     100 |    94.73 |     100 |     100 | 67                
  ...putAdapter.ts |   98.28 |      100 |      90 |   98.28 | 81-82,122-123     
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/patches       |       0 |        0 |       0 |       0 |                   
  is-in-ci.ts      |       0 |        0 |       0 |       0 | 1-17              
 src/remoteInput   |   86.98 |       75 |   85.71 |   86.98 |                   
  ...utContext.tsx |     100 |      100 |     100 |     100 |                   
  ...putWatcher.ts |   88.12 |    76.08 |   91.66 |   88.12 | ...21-222,233-236 
  index.ts         |       0 |        0 |       0 |       0 | 1-8               
 src/serve         |    79.3 |     78.8 |   92.85 |    79.3 |                   
  auth.ts          |   88.49 |    88.63 |     100 |   88.49 | ...49-150,153-155 
  capabilities.ts  |     100 |     90.9 |     100 |     100 | 264               
  ...usProvider.ts |   67.01 |    51.42 |     100 |   67.01 | ...40-245,278-286 
  debugMode.ts     |     100 |      100 |     100 |     100 |                   
  demo.ts          |     100 |      100 |     100 |     100 |                   
  envSnapshot.ts   |    92.3 |       84 |     100 |    92.3 | 108-111,170-177   
  eventBus.ts      |     100 |      100 |     100 |     100 |                   
  httpAcpBridge.ts |   79.62 |    78.84 |   96.38 |   79.62 | ...4246,4277-4318 
  ...oryChannel.ts |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-106             
  loopbackBinds.ts |     100 |      100 |     100 |     100 |                   
  runQwenServe.ts  |   73.98 |    87.83 |   55.55 |   73.98 | ...94-710,735-737 
  server.ts        |   86.18 |    82.94 |   90.62 |   86.18 | ...2478,2543-2552 
  status.ts        |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
  ...paceAgents.ts |   64.87 |    70.45 |    90.9 |   64.87 | ...1306,1316-1326 
  ...paceMemory.ts |   87.13 |    78.46 |     100 |   87.13 | ...54-361,421-428 
 src/serve/auth    |   86.54 |    78.75 |   93.75 |   86.54 |                   
  deviceFlow.ts    |   96.33 |    79.51 |    97.5 |   96.33 | ...1526,1630,1700 
  ...owProvider.ts |   45.23 |    74.07 |      75 |   45.23 | ...90-359,375,379 
 src/serve/fs      |   84.85 |    79.75 |     100 |   84.85 |                   
  audit.ts         |     100 |    96.15 |     100 |     100 | 201               
  errors.ts        |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  paths.ts         |   77.82 |    77.08 |     100 |   77.82 | ...64,493-497,510 
  policy.ts        |   90.32 |    89.18 |     100 |   90.32 | 142-150           
  ...FileSystem.ts |   83.55 |    76.22 |     100 |   83.55 | ...1859,1886-1887 
 src/serve/routes  |   89.41 |       70 |     100 |   89.41 |                   
  ...ceFileRead.ts |   94.41 |    76.92 |     100 |   94.41 | ...28-329,390-392 
  ...eFileWrite.ts |    82.1 |    60.52 |     100 |    82.1 | ...42-244,247-249 
 src/services      |   91.67 |    91.21 |   97.56 |   91.67 |                   
  ...mandLoader.ts |     100 |    93.75 |     100 |     100 | 93                
  ...killLoader.ts |     100 |    96.15 |     100 |     100 | 47                
  ...andService.ts |    98.7 |      100 |     100 |    98.7 | 107               
  ...mandLoader.ts |   86.83 |    83.87 |     100 |   86.83 | ...30-335,340-345 
  ...omptLoader.ts |   75.84 |    80.64 |   83.33 |   75.84 | ...10-211,277-278 
  ...mandLoader.ts |     100 |      100 |     100 |     100 |                   
  ...nd-factory.ts |   91.42 |    91.66 |     100 |   91.42 | 128,137-144       
  ...ation-tool.ts |     100 |    95.45 |     100 |     100 | 125               
  ...ndMetadata.ts |   98.21 |    96.66 |     100 |   98.21 | 83,87             
  commandUtils.ts  |      96 |     90.9 |     100 |      96 | 48                
  ...and-parser.ts |   90.69 |    85.71 |     100 |   90.69 | 63-66             
  ...ionService.ts |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...ght/generators |    85.9 |    85.61 |   90.47 |    85.9 |                   
  DataProcessor.ts |   85.63 |     85.6 |   92.85 |   85.63 | ...1122,1126-1133 
  ...tGenerator.ts |   98.21 |    85.71 |     100 |   98.21 | 46                
  ...teRenderer.ts |   45.45 |      100 |       0 |   45.45 | 13-51             
 .../insight/types |       0 |       50 |      50 |       0 |                   
  ...sightTypes.ts |       0 |        0 |       0 |       0 |                   
  ...sightTypes.ts |       0 |        0 |       0 |       0 | 1                 
 ...mpt-processors |   97.27 |    94.04 |     100 |   97.27 |                   
  ...tProcessor.ts |     100 |      100 |     100 |     100 |                   
  ...eProcessor.ts |   94.52 |    84.21 |     100 |   94.52 | 46-47,93-94       
  ...tionParser.ts |     100 |      100 |     100 |     100 |                   
  ...lProcessor.ts |   97.41 |    95.65 |     100 |   97.41 | 95-98             
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/services/tips |   97.35 |    85.29 |     100 |   97.35 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  tipHistory.ts    |   92.45 |       70 |     100 |   92.45 | ...22,144,151,160 
  tipRegistry.ts   |     100 |      100 |     100 |     100 |                   
  tipScheduler.ts  |     100 |    91.66 |     100 |     100 | 55                
 src/test-utils    |   93.75 |    83.33 |      80 |   93.75 |                   
  ...omMatchers.ts |   69.69 |       50 |      50 |   69.69 | 32-35,37-39,45-47 
  ...andContext.ts |     100 |      100 |     100 |     100 |                   
  render.tsx       |     100 |      100 |     100 |     100 |                   
 src/ui            |   65.28 |    73.08 |   60.34 |   65.28 |                   
  App.tsx          |     100 |      100 |     100 |     100 |                   
  AppContainer.tsx |   63.38 |    64.68 |      50 |   63.38 | ...3156,3160-3164 
  ...tionNudge.tsx |    9.58 |      100 |       0 |    9.58 | 24-94             
  ...ackDialog.tsx |   29.23 |      100 |       0 |   29.23 | 25-75             
  ...tionNudge.tsx |    7.69 |      100 |       0 |    7.69 | 25-103            
  colors.ts        |      60 |      100 |   35.29 |      60 | ...52,54-55,60-61 
  constants.ts     |     100 |      100 |     100 |     100 |                   
  keyMatchers.ts   |   95.91 |    97.05 |     100 |   95.91 | 25-26             
  ...tic-colors.ts |     100 |      100 |     100 |     100 |                   
  ...inePresets.ts |   98.17 |    88.88 |     100 |   98.17 | ...12,239,387-389 
  textConstants.ts |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/ui/auth       |   55.06 |    51.13 |   35.48 |   55.06 |                   
  AuthDialog.tsx   |   64.26 |    44.44 |   16.66 |   64.26 | ...59,366-388,392 
  ...nProgress.tsx |       0 |        0 |       0 |       0 | 1-64              
  ...etupSteps.tsx |    39.5 |       32 |   38.46 |    39.5 | ...69,472,478,481 
  useAuth.ts       |   76.63 |    68.29 |     100 |   76.63 | ...48,493-499,560 
  ...rSetupFlow.ts |   44.61 |    33.33 |      50 |   44.61 | ...57-378,395-438 
 src/ui/commands   |   75.19 |    81.23 |   83.08 |   75.19 |                   
  aboutCommand.ts  |     100 |      100 |     100 |     100 |                   
  agentsCommand.ts |   83.78 |      100 |      60 |   83.78 | 30-32,42-44       
  ...odeCommand.ts |   89.04 |    81.25 |     100 |   89.04 | 91-92,94-99       
  arenaCommand.ts  |   62.81 |    58.73 |   65.21 |   62.81 | ...91-596,681-689 
  authCommand.ts   |     100 |      100 |     100 |     100 |                   
  branchCommand.ts |     100 |      100 |     100 |     100 |                   
  btwCommand.ts    |   95.59 |    71.42 |     100 |   95.59 | 72,154-159        
  bugCommand.ts    |   81.13 |    71.42 |     100 |   81.13 | 60-69             
  clearCommand.ts  |      92 |    76.47 |     100 |      92 | 43-44,72-73,91-92 
  ...essCommand.ts |    64.7 |       50 |      75 |    64.7 | ...48-149,163-166 
  ...extCommand.ts |   65.09 |    53.84 |   84.61 |   65.09 | ...66-601,612-613 
  copyCommand.ts   |   98.28 |    94.89 |     100 |   98.28 | ...80,280,321,327 
  deleteCommand.ts |     100 |      100 |     100 |     100 |                   
  diffCommand.ts   |     100 |     87.5 |     100 |     100 | ...61,224-225,238 
  ...ryCommand.tsx |   68.09 |    77.77 |   77.77 |   68.09 | ...56-261,315-323 
  docsCommand.ts   |     100 |    88.88 |     100 |     100 | 25                
  doctorCommand.ts |   95.06 |    88.28 |     100 |   95.06 | ...92-293,320-321 
  dreamCommand.ts  |      75 |    66.66 |   66.66 |      75 | 22-27,44-47       
  editorCommand.ts |     100 |      100 |     100 |     100 |                   
  exportCommand.ts |   98.25 |    91.02 |     100 |   98.25 | ...81,198-199,364 
  ...onsCommand.ts |   48.66 |     90.9 |   63.63 |   48.66 | ...05-109,159-211 
  forgetCommand.ts |   26.82 |      100 |      50 |   26.82 | 18-51             
  goalCommand.ts   |   91.25 |    83.33 |      90 |   91.25 | ...83-186,198-201 
  helpCommand.ts   |     100 |      100 |     100 |     100 |                   
  hooksCommand.ts  |    20.4 |       40 |      40 |    20.4 | ...48-180,204-205 
  ideCommand.ts    |   60.75 |    64.28 |   41.17 |   60.75 | ...05-306,310-324 
  initCommand.ts   |   84.33 |    72.72 |     100 |   84.33 | 68,82-87,89-94    
  ...ghtCommand.ts |   74.56 |    68.42 |     100 |   74.56 | ...31-245,250-273 
  ...ageCommand.ts |   92.17 |    82.69 |     100 |   92.17 | ...43,164,173-183 
  lspCommand.ts    |     100 |    86.95 |     100 |     100 | 31,101-102        
  ...elsCommand.ts |     100 |      100 |     100 |     100 |                   
  mcpCommand.ts    |     100 |      100 |     100 |     100 |                   
  memoryCommand.ts |     100 |      100 |     100 |     100 |                   
  modelCommand.ts  |   75.09 |    78.18 |      75 |   75.09 | ...20-225,262-267 
  ...onsCommand.ts |     100 |      100 |     100 |     100 |                   
  planCommand.ts   |   78.82 |    76.92 |     100 |   78.82 | 30-35,51-56,68-73 
  quitCommand.ts   |     100 |      100 |     100 |     100 |                   
  recapCommand.ts  |   21.81 |      100 |      50 |   21.81 | 24-73             
  ...berCommand.ts |   32.43 |      100 |      50 |   32.43 | 23-57             
  renameCommand.ts |   85.71 |    86.04 |     100 |   85.71 | ...02-209,216-221 
  ...oreCommand.ts |    92.3 |    87.87 |     100 |    92.3 | ...,83-88,129-130 
  resumeCommand.ts |     100 |      100 |     100 |     100 |                   
  rewindCommand.ts |      80 |      100 |      50 |      80 | 19-21             
  ...ngsCommand.ts |     100 |      100 |     100 |     100 |                   
  ...hubCommand.ts |   81.43 |    65.21 |      80 |   81.43 | ...70-173,176-179 
  skillsCommand.ts |   15.04 |      100 |      25 |   15.04 | ...90-106,109-136 
  statsCommand.ts  |   88.19 |    84.21 |     100 |   88.19 | ...,58-61,143-146 
  ...ineCommand.ts |     100 |      100 |     100 |     100 |                   
  ...aryCommand.ts |    6.46 |      100 |      50 |    6.46 | 31-329            
  tasksCommand.ts  |   77.22 |    72.13 |     100 |   77.22 | ...46-150,172-177 
  ...tupCommand.ts |     100 |      100 |     100 |     100 |                   
  themeCommand.ts  |     100 |      100 |     100 |     100 |                   
  toolsCommand.ts  |     100 |      100 |     100 |     100 |                   
  trustCommand.ts  |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
  vimCommand.ts    |   54.54 |      100 |      50 |   54.54 | 19-29             
 src/ui/components |   62.86 |    74.64 |   65.25 |   62.86 |                   
  AboutBox.tsx     |     100 |      100 |     100 |     100 |                   
  AnsiOutput.tsx   |   65.57 |      100 |      50 |   65.57 | 69-90             
  ApiKeyInput.tsx  |       0 |        0 |       0 |       0 | 1-97              
  AppHeader.tsx    |   89.39 |       75 |     100 |   89.39 | 35,37-42,44       
  ...odeDialog.tsx |     9.7 |      100 |       0 |     9.7 | 35-47,50-182      
  AsciiArt.ts      |     100 |      100 |     100 |     100 |                   
  ...Indicator.tsx |   13.04 |      100 |       0 |   13.04 | 18-61             
  ...TextInput.tsx |   77.01 |       76 |     100 |   77.01 | ...20,234-236,263 
  Composer.tsx     |    81.6 |     64.7 |     100 |    81.6 | ...90,108,160,173 
  ...entPrompt.tsx |     100 |      100 |     100 |     100 |                   
  ...ryDisplay.tsx |   75.89 |    62.06 |     100 |   75.89 | ...,88,93-108,113 
  ...geDisplay.tsx |   68.42 |    57.14 |     100 |   68.42 | 16-17,31-32,42-50 
  ...ification.tsx |   28.57 |      100 |       0 |   28.57 | 16-36             
  ...gProfiler.tsx |       0 |        0 |       0 |       0 | 1-36              
  ...ogManager.tsx |   11.99 |      100 |       0 |   11.99 | 66-517            
  DiffDialog.tsx   |    2.47 |      100 |       0 |    2.47 | 68-732            
  ...ngsDialog.tsx |    8.44 |      100 |       0 |    8.44 | 37-195            
  ExitWarning.tsx  |     100 |      100 |     100 |     100 |                   
  ...hProgress.tsx |    87.8 |    33.33 |     100 |    87.8 | 28-31,56          
  ...ustDialog.tsx |     100 |      100 |     100 |     100 |                   
  Footer.tsx       |   76.59 |    48.64 |     100 |   76.59 | ...35-136,175-180 
  ...ngSpinner.tsx |   68.42 |       80 |      50 |   68.42 | 35-52,73,80-81    
  GoalPill.tsx     |   76.19 |    81.81 |     100 |   76.19 | 24-30,46-50       
  Header.tsx       |   98.62 |    94.28 |     100 |   98.62 | 162,164           
  Help.tsx         |   98.32 |    89.88 |     100 |   98.32 | ...24,381,447-448 
  ...emDisplay.tsx |    61.7 |       36 |     100 |    61.7 | ...42,345,348-354 
  ...ngeDialog.tsx |     100 |      100 |     100 |     100 |                   
  InputPrompt.tsx  |   83.01 |    79.78 |   83.33 |   83.01 | ...1399,1531,1581 
  ...Shortcuts.tsx |   20.87 |      100 |       0 |   20.87 | ...6,49-51,67-125 
  ...Indicator.tsx |     100 |    91.42 |     100 |     100 | 65,74             
  ...firmation.tsx |   91.42 |      100 |      50 |   91.42 | 26-31             
  MainContent.tsx  |   81.75 |       75 |     100 |   81.75 | ...70-274,282-286 
  ...elsDialog.tsx |   71.05 |    69.11 |   72.72 |   71.05 | ...77,590,601-603 
  MemoryDialog.tsx |    55.1 |    54.54 |   57.14 |    55.1 | ...56,368,381-383 
  ...geDisplay.tsx |       0 |        0 |       0 |       0 | 1-41              
  ModelDialog.tsx  |   80.12 |    63.55 |     100 |   80.12 | ...39-555,612-616 
  ...tsDisplay.tsx |     100 |    97.22 |     100 |     100 | 270               
  ...fications.tsx |   18.18 |      100 |       0 |   18.18 | 15-58             
  ...onsDialog.tsx |    2.13 |      100 |       0 |    2.13 | 62-133,148-1004   
  ...ryDisplay.tsx |     100 |      100 |     100 |     100 |                   
  ...icePrompt.tsx |   92.64 |    85.71 |     100 |   92.64 | 102-106,134-139   
  PrepareLabel.tsx |   91.66 |    77.27 |     100 |   91.66 | 73-75,77-79,110   
  ...atePrompt.tsx |    8.57 |      100 |       0 |    8.57 | 24-55,58-134      
  ...geDisplay.tsx |     100 |      100 |     100 |     100 |                   
  ...ngDisplay.tsx |   21.42 |      100 |       0 |   21.42 | 13-39             
  ...hProgress.tsx |   85.25 |    88.46 |     100 |   85.25 | 121-147           
  ...dSelector.tsx |   41.26 |    61.53 |   71.42 |   41.26 | ...74-472,476-520 
  ...ionPicker.tsx |   83.66 |    72.13 |     100 |   83.66 | ...96,402,444-466 
  ...onPreview.tsx |   92.42 |    84.37 |     100 |   92.42 | ...,70-71,143-145 
  ...ryDisplay.tsx |     100 |      100 |     100 |     100 |                   
  ...putPrompt.tsx |   72.56 |       80 |      40 |   72.56 | ...06-109,114-117 
  ...ngsDialog.tsx |   66.27 |    71.16 |      75 |   66.27 | ...12-820,826-827 
  ...ionDialog.tsx |    87.8 |      100 |   33.33 |    87.8 | 36-39,44-51       
  ...putPrompt.tsx |    15.9 |      100 |       0 |    15.9 | 20-63             
  ...Indicator.tsx |   57.14 |      100 |       0 |   57.14 | 12-15             
  ...MoreLines.tsx |      28 |      100 |       0 |      28 | 18-40             
  ...ionPicker.tsx |   17.59 |      100 |       0 |   17.59 | 55-172            
  StatsDisplay.tsx |     100 |      100 |     100 |     100 |                   
  ...ineDialog.tsx |   93.69 |    83.92 |     100 |   93.69 | ...11,273,293-295 
  ...yTodoList.tsx |   94.17 |       80 |     100 |   94.17 | 56-57,131-134     
  ...nsDisplay.tsx |   87.25 |       64 |     100 |   87.25 | ...45-147,154-156 
  ThemeDialog.tsx  |   89.95 |    46.15 |      75 |   89.95 | ...71-173,243-245 
  Tips.tsx         |   93.54 |       75 |     100 |   93.54 | 39-40             
  TodoDisplay.tsx  |     100 |      100 |     100 |     100 |                   
  ...tsDisplay.tsx |     100 |     87.5 |     100 |     100 | 31-32             
  TrustDialog.tsx  |     100 |    81.81 |     100 |     100 | 71-86             
  ...ification.tsx |   36.36 |      100 |       0 |   36.36 | 15-22             
  ...ackDialog.tsx |    7.84 |      100 |       0 |    7.84 | 24-134            
  ...xitDialog.tsx |   80.36 |    43.47 |      60 |   80.36 | ...24-238,248-251 
 ...nts/agent-view |   38.33 |    70.83 |   36.36 |   38.33 |                   
  ...atContent.tsx |    8.79 |      100 |       0 |    8.79 | 53-265,271-273    
  ...tChatView.tsx |   21.05 |      100 |       0 |   21.05 | 21-39             
  ...tComposer.tsx |    9.95 |      100 |       0 |    9.95 | 57-308            
  AgentFooter.tsx  |   17.07 |      100 |       0 |   17.07 | 28-66             
  AgentHeader.tsx  |   15.38 |      100 |       0 |   15.38 | 27-64             
  AgentTabBar.tsx  |    87.8 |    27.27 |     100 |    87.8 | ...,85,98-106,124 
  ...oryAdapter.ts |     100 |    91.83 |     100 |     100 | 103,109-110,138   
  index.ts         |       0 |        0 |       0 |       0 | 1-12              
 ...mponents/arena |   45.72 |    70.53 |   60.86 |   45.72 |                   
  ArenaCards.tsx   |   73.06 |    71.79 |   85.71 |   73.06 | ...83-185,321-326 
  ...ectDialog.tsx |   83.48 |    69.86 |   88.88 |   83.48 | ...88-392,409-410 
  ...artDialog.tsx |   10.15 |      100 |       0 |   10.15 | 27-161            
  ...tusDialog.tsx |    5.63 |      100 |       0 |    5.63 | 33-75,80-288      
  ...topDialog.tsx |    6.17 |      100 |       0 |    6.17 | 33-213            
 ...ackground-view |   75.63 |    84.49 |   85.29 |   75.63 |                   
  ...sksDialog.tsx |   70.92 |    80.48 |   76.19 |   70.92 | ...1118,1194-1196 
  ...TasksPill.tsx |   63.75 |    86.95 |     100 |   63.75 | 44,86-106,114-122 
  ...gentPanel.tsx |   99.53 |    93.18 |     100 |   99.53 | 123               
 ...nts/extensions |   45.28 |    33.33 |      60 |   45.28 |                   
  ...gerDialog.tsx |   44.31 |    34.14 |      75 |   44.31 | ...71-480,483-488 
  index.ts         |       0 |        0 |       0 |       0 | 1-9               
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...tensions/steps |   54.88 |    94.23 |   66.66 |   54.88 |                   
  ...ctionStep.tsx |   95.12 |    92.85 |   85.71 |   95.12 | 84-86,89          
  ...etailStep.tsx |    6.18 |      100 |       0 |    6.18 | 17-128            
  ...nListStep.tsx |   88.43 |    94.73 |      80 |   88.43 | 52-53,59-72,106   
  ...electStep.tsx |   13.46 |      100 |       0 |   13.46 | 20-70             
  ...nfirmStep.tsx |   19.56 |      100 |       0 |   19.56 | 23-65             
  index.ts         |     100 |      100 |     100 |     100 |                   
 ...mponents/hooks |   68.67 |    69.07 |   69.56 |   68.67 |                   
  ...etailStep.tsx |   74.68 |    66.66 |   66.66 |   74.68 | ...71-184,188-201 
  ...etailStep.tsx |    87.4 |    73.68 |     100 |    87.4 | 41-42,99-113,119  
  ...abledStep.tsx |     100 |      100 |     100 |     100 |                   
  ...sListStep.tsx |     100 |      100 |     100 |     100 |                   
  ...entDialog.tsx |   34.51 |    47.05 |   42.85 |   34.51 | ...78,482-495,499 
  constants.ts     |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-13              
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...components/mcp |   20.98 |    86.36 |   83.33 |   20.98 |                   
  ...ealthPill.tsx |   68.42 |    85.71 |     100 |   68.42 | 40-46             
  ...entDialog.tsx |    3.64 |      100 |       0 |    3.64 | 41-717            
  constants.ts     |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-30              
  types.ts         |     100 |      100 |     100 |     100 |                   
  utils.ts         |   95.83 |    88.88 |     100 |   95.83 | 16,20,109-110     
 ...ents/mcp/steps |   26.74 |    54.54 |   42.85 |   26.74 |                   
  ...icateStep.tsx |    5.88 |      100 |       0 |    5.88 | 40-55,58-296      
  ...electStep.tsx |   10.95 |      100 |       0 |   10.95 | 16-88             
  ...etailStep.tsx |    5.26 |      100 |       0 |    5.26 | 31-247            
  ...rListStep.tsx |   75.18 |    59.37 |     100 |   75.18 | ...53-158,169-173 
  ...etailStep.tsx |   10.41 |      100 |       0 |   10.41 | ...1,67-79,82-139 
  ToolListStep.tsx |   69.02 |       50 |     100 |   69.02 | ...22,125,134-143 
 ...nents/messages |   82.44 |    79.55 |    72.6 |   82.44 |                   
  ...ionDialog.tsx |   80.84 |     77.6 |    62.5 |   80.84 | ...98,516,534-536 
  BtwMessage.tsx   |     100 |      100 |     100 |     100 |                   
  ...upDisplay.tsx |   97.67 |    83.72 |     100 |   97.67 | 119,142,150       
  ...onMessage.tsx |   91.93 |    82.35 |     100 |   91.93 | 57-59,61,63       
  ...nMessages.tsx |   79.06 |      100 |      70 |   79.06 | ...51-264,268-280 
  DiffRenderer.tsx |   93.19 |    86.17 |     100 |   93.19 | ...09,237-238,304 
  ...tsDisplay.tsx |   97.82 |    77.27 |     100 |   97.82 | 87,89             
  ...usMessage.tsx |   76.31 |     42.1 |   66.66 |   76.31 | ...99,101,124,155 
  ...ssMessage.tsx |    12.5 |      100 |       0 |    12.5 | 18-59             
  ...edMessage.tsx |   16.66 |      100 |       0 |   16.66 | 22-38             
  ...sMessages.tsx |   55.67 |       40 |   28.57 |   55.67 | ...20-125,133-145 
  ...ryMessage.tsx |   14.28 |      100 |       0 |   14.28 | 23-62             
  ...onMessage.tsx |   81.02 |    69.23 |   33.33 |   81.02 | ...24-426,433-435 
  ...upMessage.tsx |      84 |    93.61 |     100 |      84 | ...56-383,405-420 
  ToolMessage.tsx  |   88.84 |    75.71 |    92.3 |   88.84 | ...44-749,776-778 
 ...ponents/shared |   85.36 |    78.48 |   95.77 |   85.36 |                   
  ...ctionList.tsx |   99.03 |    95.65 |     100 |   99.03 | 85                
  ...tonSelect.tsx |     100 |      100 |     100 |     100 |                   
  EnumSelector.tsx |     100 |    96.42 |     100 |     100 | 58                
  MaxSizedBox.tsx  |   83.01 |    86.25 |   88.88 |   83.01 | ...12-513,618-619 
  MultiSelect.tsx  |   84.31 |    74.19 |     100 |   84.31 | ...37,193-195,205 
  ...tonSelect.tsx |     100 |      100 |     100 |     100 |                   
  ...eSelector.tsx |     100 |       60 |     100 |     100 | 40-45             
  TextInput.tsx    |   77.01 |    48.78 |      80 |   77.01 | ...08-212,224-230 
  ...apsedTime.tsx |     100 |      100 |     100 |     100 |                   
  ...Indicator.tsx |     100 |      100 |     100 |     100 |                   
  text-buffer.ts   |   83.68 |    78.55 |   97.61 |   83.68 | ...2270-2272,2368 
  ...er-actions.ts |   86.71 |    67.79 |     100 |   86.71 | ...07-608,809-811 
 ...ents/subagents |   30.87 |        0 |       0 |   30.87 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  index.ts         |       0 |        0 |       0 |       0 | 1-11              
  reducers.tsx     |    12.1 |      100 |       0 |    12.1 | 33-190            
  types.ts         |     100 |      100 |     100 |     100 |                   
  utils.ts         |   10.95 |      100 |       0 |   10.95 | ...1,56-57,60-102 
 ...bagents/create |    9.13 |      100 |       0 |    9.13 |                   
  ...ionWizard.tsx |    7.28 |      100 |       0 |    7.28 | 34-299            
  ...rSelector.tsx |   14.75 |      100 |       0 |   14.75 | 26-85             
  ...onSummary.tsx |    4.26 |      100 |       0 |    4.26 | 27-331            
  ...tionInput.tsx |    8.63 |      100 |       0 |    8.63 | 23-177            
  ...dSelector.tsx |   33.33 |      100 |       0 |   33.33 | 20-21,26-27,36-63 
  ...nSelector.tsx |    37.5 |      100 |       0 |    37.5 | 20-21,26-27,36-58 
  ...EntryStep.tsx |   12.76 |      100 |       0 |   12.76 | 34-78             
  ToolSelector.tsx |    4.16 |      100 |       0 |    4.16 | 31-253            
 ...bagents/manage |   21.51 |    59.52 |   27.27 |   21.51 |                   
  ...ctionStep.tsx |   10.25 |      100 |       0 |   10.25 | 21-103            
  ...eleteStep.tsx |   20.93 |      100 |       0 |   20.93 | 23-62             
  ...tEditStep.tsx |   25.53 |      100 |       0 |   25.53 | ...2,37-38,51-124 
  ...ctionStep.tsx |   35.42 |    59.52 |     100 |   35.42 | ...20-432,437-439 
  ...iewerStep.tsx |   13.72 |      100 |       0 |   13.72 | 18-73             
  ...gerDialog.tsx |    6.74 |      100 |       0 |    6.74 | 35-341            
 ...mponents/views |   70.11 |       68 |    64.7 |   70.11 |                   
  ContextUsage.tsx |   70.67 |    65.71 |      80 |   70.67 | ...16-422,459-553 
  DoctorReport.tsx |     9.8 |      100 |       0 |     9.8 | 25-54,57-131      
  ...sionsList.tsx |   87.69 |    73.68 |     100 |   87.69 | 65-72             
  McpStatus.tsx    |   89.53 |    60.52 |     100 |   89.53 | ...72,175-177,262 
  SkillsList.tsx   |   27.27 |      100 |       0 |   27.27 | 18-35             
  ToolsList.tsx    |     100 |      100 |     100 |     100 |                   
 src/ui/contexts   |   77.34 |    78.06 |   80.35 |   77.34 |                   
  ...ewContext.tsx |    64.7 |    85.71 |      50 |    64.7 | ...22-225,231-241 
  AppContext.tsx   |      80 |       50 |     100 |      80 | 19-20             
  ...ewContext.tsx |   95.18 |    67.56 |      50 |   95.18 | ...94-195,222-226 
  ...deContext.tsx |     100 |      100 |     100 |     100 |                   
  ...igContext.tsx |   81.81 |       50 |     100 |   81.81 | 15-16             
  ...ssContext.tsx |   82.31 |    82.84 |     100 |   82.31 | ...1153,1159-1161 
  ...owContext.tsx |   89.28 |       80 |   66.66 |   89.28 | 34,47-48,60-62    
  ...deContext.tsx |     100 |      100 |      50 |     100 |                   
  ...onContext.tsx |   43.28 |     62.5 |    62.5 |   43.28 | ...56-259,263-266 
  ...gsContext.tsx |   83.33 |       50 |     100 |   83.33 | 17-18             
  ...usContext.tsx |     100 |      100 |     100 |     100 |                   
  ...ngContext.tsx |   71.42 |       50 |     100 |   71.42 | 17-20             
  ...utContext.tsx |   85.71 |      100 |   66.66 |   85.71 | 13-14             
  ...nsContext.tsx |   88.23 |       50 |     100 |   88.23 | 120-121           
  ...teContext.tsx |   86.66 |       50 |     100 |   86.66 | 195-196           
  ...deContext.tsx |   76.08 |    72.72 |     100 |   76.08 | 47-48,52-59,77-78 
 src/ui/daemon     |   90.76 |    73.73 |   95.45 |   90.76 |                   
  ...TuiAdapter.ts |   90.76 |    73.73 |   95.45 |   90.76 | ...53,771-772,858 
 src/ui/editors    |   93.33 |    85.71 |   66.66 |   93.33 |                   
  ...ngsManager.ts |   93.33 |    85.71 |   66.66 |   93.33 | 49,63-64          
 src/ui/hooks      |   82.14 |    82.17 |   86.69 |   82.14 |                   
  ...dProcessor.ts |   83.12 |    82.56 |     100 |   83.12 | ...88-389,408-435 
  keyToAnsi.ts     |    3.92 |      100 |       0 |    3.92 | 19-77             
  ...dProcessor.ts |    94.8 |    70.58 |     100 |    94.8 | ...76-277,282-283 
  ...dProcessor.ts |   75.59 |    62.58 |   61.53 |   75.59 | ...88,912,931-935 
  ...amingState.ts |   12.22 |      100 |       0 |   12.22 | 54-157            
  ...agerDialog.ts |   88.23 |      100 |     100 |   88.23 | 20,24             
  ...ationFrame.ts |      32 |       60 |     100 |      32 | 42-44,51-90       
  ...odeCommand.ts |   58.82 |      100 |     100 |   58.82 | 28,33-48          
  ...enaCommand.ts |      85 |      100 |     100 |      85 | 23-24,29          
  ...aInProcess.ts |   19.81 |    66.66 |      25 |   19.81 | 57-175            
  ...Completion.ts |   92.77 |    89.09 |     100 |   92.77 | ...86-187,220-223 
  ...ifications.ts |   92.07 |    96.29 |     100 |   92.07 | 116-124           
  ...tIndicator.ts |   83.49 |    70.96 |     100 |   83.49 | ...60,168,170-178 
  ...waySummary.ts |   96.22 |    69.69 |     100 |   96.22 | 125-127,169       
  ...ndTaskView.ts |   94.21 |    76.08 |     100 |   94.21 | 122-126,213,219   
  ...ketedPaste.ts |    23.8 |      100 |       0 |    23.8 | 19-37             
  ...nchCommand.ts |   94.36 |    74.35 |     100 |   94.36 | ...60,168-169,209 
  ...ompletion.tsx |   95.95 |    82.75 |     100 |   95.95 | ...22-223,225-226 
  ...dMigration.ts |   90.62 |       75 |     100 |   90.62 | 38-40             
  useCompletion.ts |    92.4 |     87.5 |     100 |    92.4 | 68-69,93-94,98-99 
  ...nitMessage.ts |     100 |      100 |     100 |     100 |                   
  ...extualTips.ts |   77.27 |       50 |     100 |   77.27 | ...2,75-79,93-101 
  ...eteCommand.ts |   78.53 |    88.57 |     100 |   78.53 | ...96-104,112-113 
  ...ialogClose.ts |   13.33 |      100 |     100 |   13.33 | 91-182            
  useDiffData.ts   |   11.62 |      100 |       0 |   11.62 | 44-87             
  ...oublePress.ts |   53.12 |       75 |     100 |   53.12 | 33-35,41-54       
  ...orSettings.ts |     100 |      100 |     100 |     100 |                   
  ...Completion.ts |   99.12 |     97.7 |     100 |   99.12 | 182-183           
  ...ionUpdates.ts |   93.45 |     92.3 |     100 |   93.45 | ...83-287,300-306 
  ...agerDialog.ts |   88.88 |      100 |     100 |   88.88 | 21,25             
  ...backDialog.ts |   54.47 |       50 |   33.33 |   54.47 | ...69-171,193-194 
  useFocus.ts      |     100 |      100 |     100 |     100 |                   
  ...olderTrust.ts |     100 |      100 |     100 |     100 |                   
  ...ggestions.tsx |   89.15 |     62.5 |      50 |   89.15 | ...22-124,149-150 
  ...miniStream.ts |    77.7 |    74.93 |   91.66 |    77.7 | ...2497,2510-2518 
  ...BranchName.ts |    90.9 |     92.3 |     100 |    90.9 | 19-20,55-58       
  ...oryManager.ts |   93.15 |    93.75 |     100 |   93.15 | 44,107-110        
  ...ooksDialog.ts |    87.5 |      100 |     100 |    87.5 | 19,23             
  ...stListener.ts |     100 |      100 |     100 |     100 |                   
  ...nAuthError.ts |   76.19 |       50 |     100 |   76.19 | 39-40,43-45       
  ...putHistory.ts |   92.59 |    85.71 |     100 |   92.59 | 63-64,72,94-96    
  ...storyStore.ts |     100 |    94.11 |     100 |     100 | 69                
  useKeypress.ts   |     100 |      100 |     100 |     100 |                   
  ...rdProtocol.ts |   36.36 |      100 |       0 |   36.36 | 24-31             
  ...unchEditor.ts |    9.67 |      100 |       0 |    9.67 | 11-32,39-90       
  ...gIndicator.ts |     100 |      100 |     100 |     100 |                   
  useLogger.ts     |   21.05 |      100 |       0 |   21.05 | 15-37             
  useMCPHealth.ts  |   63.15 |       75 |      50 |   63.15 | 42-52,64-67       
  ...elsCommand.ts |     100 |      100 |     100 |     100 |                   
  useMcpDialog.ts  |    87.5 |      100 |     100 |    87.5 | 19,23             
  ...moryDialog.ts |    87.5 |      100 |     100 |    87.5 | 19,23             
  ...oryMonitor.ts |     100 |      100 |     100 |     100 |                   
  ...ssageQueue.ts |     100 |      100 |     100 |     100 |                   
  ...delCommand.ts |     100 |       75 |     100 |     100 | 22                
  ...raseCycler.ts |   84.74 |    76.47 |     100 |   84.74 | ...49,52-53,69-71 
  ...derUpdates.ts |   86.38 |    77.19 |     100 |   86.38 | ...22,281-293,341 
  useQwenAuth.ts   |     100 |      100 |     100 |     100 |                   
  ...lScheduler.ts |    84.7 |    93.33 |     100 |    84.7 | ...71-276,372-382 
  ...oryCommand.ts |       0 |        0 |       0 |       0 | 1-7               
  ...umeCommand.ts |   97.08 |    83.33 |     100 |   97.08 | 103-104,133       
  ...ompletion.tsx |   90.59 |    83.33 |     100 |   90.59 | ...01,104,137-140 
  ...ectionList.ts |   96.98 |    95.69 |     100 |   96.98 | ...83-184,238-241 
  ...sionPicker.ts |   92.87 |    90.35 |     100 |   92.87 | ...99-501,503-505 
  ...earchInput.ts |     100 |      100 |     100 |     100 |                   
  ...ngsCommand.ts |   18.75 |      100 |       0 |   18.75 | 10-25             
  ...ellHistory.ts |   91.74 |    79.41 |     100 |   91.74 | ...74,122-123,133 
  ...oryCommand.ts |       0 |        0 |       0 |       0 | 1-73              
  ...Completion.ts |   82.67 |    85.41 |   94.73 |   82.67 | ...68-670,678-714 
  ...tateAndRef.ts |     100 |      100 |     100 |     100 |                   
  useStatusLine.ts |   96.09 |    90.37 |     100 |   96.09 | ...62-365,450-457 
  ...eateDialog.ts |   88.23 |      100 |     100 |   88.23 | 14,18             
  ...tification.ts |     100 |    85.71 |     100 |     100 | 47                
  ...alProgress.ts |   53.06 |       50 |   66.66 |   53.06 | ...53,61-68,79-85 
  ...rminalSize.ts |   76.19 |      100 |      50 |   76.19 | 21-25             
  ...emeCommand.ts |   67.01 |    29.41 |     100 |   67.01 | ...10-111,115-116 
  useTimer.ts      |   88.09 |    85.71 |     100 |   88.09 | 44-45,51-53       
  ...lMigration.ts |       0 |        0 |       0 |       0 |                   
  ...rustModify.ts |     100 |      100 |     100 |     100 |                   
  useTurnDiffs.ts  |   95.12 |    78.57 |     100 |   95.12 | 133-134,156-157   
  ...elcomeBack.ts |   87.36 |     90.9 |     100 |   87.36 | ...,94-96,114-115 
  ...reeSession.ts |   93.75 |       75 |     100 |   93.75 | 44-45,87          
  vim.ts           |   83.77 |    80.31 |     100 |   83.77 | ...55,759-767,776 
 src/ui/layouts    |   89.72 |     87.5 |     100 |   89.72 |                   
  ...AppLayout.tsx |   89.88 |     87.5 |     100 |   89.88 | 51-53,93-98       
  ...AppLayout.tsx |   89.47 |     87.5 |     100 |   89.47 | 58-63             
 ...i/manageModels |   93.61 |       48 |     100 |   93.61 |                   
  manageModels.ts  |   93.61 |       48 |     100 |   93.61 | ...63-166,179,209 
 src/ui/models     |   80.24 |    79.16 |   71.42 |   80.24 |                   
  ...ableModels.ts |   80.24 |    79.16 |   71.42 |   80.24 | ...,61-71,123-125 
 ...noninteractive |     100 |      100 |   14.28 |     100 |                   
  ...eractiveUi.ts |     100 |      100 |   14.28 |     100 |                   
 src/ui/state      |   94.91 |    81.81 |     100 |   94.91 |                   
  extensions.ts    |   94.91 |    81.81 |     100 |   94.91 | 68-69,88          
 src/ui/themes     |   98.53 |    70.58 |     100 |   98.53 |                   
  ansi-light.ts    |     100 |      100 |     100 |     100 |                   
  ansi.ts          |     100 |      100 |     100 |     100 |                   
  atom-one-dark.ts |     100 |      100 |     100 |     100 |                   
  ayu-light.ts     |     100 |      100 |     100 |     100 |                   
  ayu.ts           |     100 |      100 |     100 |     100 |                   
  color-utils.ts   |     100 |      100 |     100 |     100 |                   
  default-light.ts |     100 |      100 |     100 |     100 |                   
  default.ts       |     100 |      100 |     100 |     100 |                   
  ...inal-theme.ts |   88.59 |    85.96 |     100 |   88.59 | ...57-261,266-270 
  dracula.ts       |     100 |      100 |     100 |     100 |                   
  github-dark.ts   |     100 |      100 |     100 |     100 |                   
  github-light.ts  |     100 |      100 |     100 |     100 |                   
  googlecode.ts    |     100 |      100 |     100 |     100 |                   
  no-color.ts      |     100 |      100 |     100 |     100 |                   
  qwen-dark.ts     |     100 |      100 |     100 |     100 |                   
  qwen-light.ts    |     100 |      100 |     100 |     100 |                   
  ...tic-tokens.ts |     100 |      100 |     100 |     100 |                   
  ...-of-purple.ts |     100 |      100 |     100 |     100 |                   
  theme-manager.ts |   87.98 |    82.89 |     100 |   87.98 | ...48-357,362-363 
  theme.ts         |     100 |    38.02 |     100 |     100 | ...34-449,457-461 
  xcode.ts         |     100 |      100 |     100 |     100 |                   
 src/ui/utils      |   83.98 |    82.97 |   92.61 |   83.98 |                   
  ...Colorizer.tsx |   79.53 |    83.78 |     100 |   79.53 | ...51-152,249-275 
  ...nRenderer.tsx |   68.83 |    70.14 |      50 |   68.83 | ...52-254,274-293 
  ...wnDisplay.tsx |   86.01 |    87.41 |     100 |   86.01 | ...87,704,729-754 
  ...idDiagram.tsx |   87.79 |    95.34 |     100 |   87.79 | 156-179           
  ...eRenderer.tsx |   92.08 |    80.45 |      95 |   92.08 | ...76-679,723-728 
  ...dWorkUtils.ts |     100 |      100 |     100 |     100 |                   
  ...boardUtils.ts |   59.61 |    58.82 |     100 |   59.61 | ...,86-88,107-149 
  commandUtils.ts  |    95.9 |    88.42 |     100 |    95.9 | ...62,164-165,289 
  computeStats.ts  |     100 |      100 |     100 |     100 |                   
  customBanner.ts  |   90.68 |    91.22 |     100 |   90.68 | ...13,324-327,334 
  displayUtils.ts  |   88.37 |    72.22 |     100 |   88.37 | 23,25,29,31,33    
  formatters.ts    |   95.23 |    98.27 |     100 |   95.23 | 117-120           
  gradientUtils.ts |     100 |      100 |     100 |     100 |                   
  highlight.ts     |     100 |      100 |     100 |     100 |                   
  ...oryMapping.ts |     100 |    94.28 |     100 |     100 | 35,57             
  historyUtils.ts  |   94.11 |       94 |     100 |   94.11 | 94-97             
  isNarrowWidth.ts |     100 |      100 |     100 |     100 |                   
  ...olDetector.ts |    8.23 |      100 |       0 |    8.23 | ...31-132,135-136 
  latexRenderer.ts |   94.95 |     73.8 |     100 |   94.95 | ...76-178,184-187 
  layoutUtils.ts   |     100 |      100 |     100 |     100 |                   
  ...ightLoader.ts |     100 |    89.47 |     100 |     100 | 81,110            
  ...nUtilities.ts |   69.84 |    85.71 |     100 |   69.84 | 75-91,100-101     
  ...ToolGroups.ts |   98.66 |    96.77 |     100 |   98.66 | 48-49             
  ...geRenderer.ts |   86.23 |    69.06 |   95.12 |   86.23 | ...1284,1324-1330 
  ...alRenderer.ts |   86.69 |     71.9 |     100 |   86.69 | ...1476,1513-1519 
  ...lsBySource.ts |     100 |    95.23 |     100 |     100 | 84                
  osc8.ts          |   94.71 |    87.41 |     100 |   94.71 | ...43,428,432-433 
  ...mConstants.ts |     100 |      100 |     100 |     100 |                   
  restoreGoal.ts   |   98.98 |    97.05 |     100 |   98.98 | 98                
  ...storyUtils.ts |   61.89 |    69.87 |      90 |   61.89 | ...76,424,429-451 
  ...ickerUtils.ts |     100 |      100 |     100 |     100 |                   
  ...izedOutput.ts |   94.94 |      100 |   88.88 |   94.94 | 112-117           
  ...wOptimizer.ts |     100 |    96.77 |     100 |     100 | 69                
  terminalSetup.ts |    4.37 |      100 |       0 |    4.37 | 44-393            
  textUtils.ts     |   97.61 |    94.84 |   92.85 |   97.61 | ...50-251,386-387 
  todoSnapshot.ts  |   89.11 |    93.33 |     100 |   89.11 | ...,66-78,180-181 
  updateCheck.ts   |     100 |    80.95 |     100 |     100 | 30-42             
 ...i/utils/export |   56.77 |     40.8 |   79.41 |   56.77 |                   
  collect.ts       |   55.92 |    50.58 |   86.36 |   55.92 | ...25-640,642-647 
  index.ts         |     100 |      100 |     100 |     100 |                   
  normalize.ts     |   57.47 |    20.51 |      80 |   57.47 | ...09-310,324-359 
  types.ts         |       0 |        0 |       0 |       0 | 1                 
  utils.ts         |      40 |      100 |       0 |      40 | 11-13             
 ...ort/formatters |    3.38 |      100 |       0 |    3.38 |                   
  html.ts          |    9.61 |      100 |       0 |    9.61 | ...28,34-76,82-84 
  json.ts          |      50 |      100 |       0 |      50 | 14-15             
  jsonl.ts         |     3.5 |      100 |       0 |     3.5 | 14-76             
  markdown.ts      |    0.94 |      100 |       0 |    0.94 | 13-295            
 src/utils         |   76.06 |    89.51 |   93.82 |   76.06 |                   
  acpModelUtils.ts |     100 |      100 |     100 |     100 |                   
  apiPreconnect.ts |   96.72 |    97.14 |     100 |   96.72 | 165-168           
  checks.ts        |   33.33 |      100 |       0 |   33.33 | 23-28             
  cleanup.ts       |   84.12 |    93.33 |      80 |   84.12 | 75,106-115        
  commands.ts      |     100 |      100 |     100 |     100 |                   
  commentJson.ts   |   87.17 |     90.9 |     100 |   87.17 | 64-73             
  ...Calculator.ts |     100 |      100 |     100 |     100 |                   
  deepMerge.ts     |     100 |       90 |     100 |     100 | 41-43,49          
  ...ScopeUtils.ts |   97.56 |    88.88 |     100 |   97.56 | 67                
  doctorChecks.ts  |   71.06 |       75 |     100 |   71.06 | ...95-301,325-341 
  ...putCapture.ts |   90.65 |    86.17 |     100 |   90.65 | ...72,370,372-373 
  ...arResolver.ts |   94.28 |       88 |     100 |   94.28 | 28-29,125-126     
  errors.ts        |   98.67 |    96.36 |     100 |   98.67 | 67-68             
  events.ts        |     100 |      100 |     100 |     100 |                   
  gitUtils.ts      |   91.91 |    84.61 |     100 |   91.91 | 78-81,124-127     
  ...AutoUpdate.ts |   90.76 |    93.33 |   88.88 |   90.76 | 103-114           
  ...lationInfo.ts |     100 |      100 |     100 |     100 |                   
  languageUtils.ts |   97.89 |    96.42 |     100 |   97.89 | 132-133           
  math.ts          |       0 |        0 |       0 |       0 | 1-15              
  ...iagnostics.ts |   94.57 |    83.01 |   88.88 |   94.57 | ...05,311,315-317 
  ...onfigUtils.ts |     100 |      100 |     100 |     100 |                   
  ...iveHelpers.ts |   96.79 |    93.28 |     100 |   96.79 | ...76-477,575,588 
  osc.ts           |    97.5 |      100 |   88.88 |    97.5 | 195-196           
  package.ts       |   88.88 |       80 |     100 |   88.88 | 33-34             
  processUtils.ts  |     100 |      100 |     100 |     100 |                   
  readStdin.ts     |   79.62 |       90 |      80 |   79.62 | 33-40,52-54       
  relaunch.ts      |   98.07 |    76.92 |     100 |   98.07 | 70                
  resolvePath.ts   |   66.66 |       25 |     100 |   66.66 | 12-13,16,18-19    
  sandbox.ts       |       0 |        0 |       0 |       0 | 1-1047            
  settingsUtils.ts |   82.89 |    90.67 |   89.47 |   82.89 | ...52-663,670-678 
  spawnWrapper.ts  |     100 |      100 |     100 |     100 |                   
  ...upProfiler.ts |   98.46 |    94.52 |     100 |   98.46 | 130-131,305       
  ...upWarnings.ts |     100 |      100 |     100 |     100 |                   
  stdioHelpers.ts  |     100 |       60 |     100 |     100 | 23,32             
  systemInfo.ts    |   95.12 |    89.06 |     100 |   95.12 | ...43-244,249-253 
  ...InfoFields.ts |   87.61 |       65 |     100 |   87.61 | ...22-123,144-145 
  ...iffPreview.ts |   94.11 |    83.33 |     100 |   94.11 | 13                
  ...entEmitter.ts |     100 |      100 |     100 |     100 |                   
  ...upWarnings.ts |   91.17 |    82.35 |     100 |   91.17 | 67-68,73-74,77-78 
  version.ts       |     100 |       50 |     100 |     100 | 11                
  windowTitle.ts   |     100 |      100 |     100 |     100 |                   
  ...WithBackup.ts |   63.15 |    81.25 |     100 |   63.15 | 93,118-157        
-------------------|---------|----------|---------|---------|-------------------

Core Package - Full Text Report

-------------------|---------|----------|---------|---------|-------------------
File               | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s 
-------------------|---------|----------|---------|---------|-------------------
All files          |   79.59 |    82.85 |   82.19 |   79.59 |                   
 src               |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/__mocks__/fs  |       0 |        0 |       0 |       0 |                   
  promises.ts      |       0 |        0 |       0 |       0 | 1-48              
 src/agents        |   87.58 |    78.93 |   91.76 |   87.58 |                   
  ...transcript.ts |   92.25 |    85.71 |     100 |   92.25 | ...87,306-307,438 
  ...ent-resume.ts |   82.53 |    71.28 |   77.41 |   82.53 | ...1045-1049,1052 
  ...ound-tasks.ts |    95.4 |    86.48 |     100 |    95.4 | ...55-756,827-828 
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/agents/arena  |   76.54 |    66.87 |   78.72 |   76.54 |                   
  ...gentClient.ts |   79.47 |    88.88 |   81.81 |   79.47 | ...68-183,189-204 
  ArenaManager.ts  |   75.37 |    63.37 |   78.26 |   75.37 | ...1860,1866-1867 
  arena-events.ts  |   64.44 |      100 |      50 |   64.44 | ...71-175,178-183 
  diff-summary.ts  |    87.5 |    72.34 |     100 |    87.5 | ...32-133,137-138 
  index.ts         |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...gents/backends |   76.29 |    86.15 |   73.04 |   76.29 |                   
  ITermBackend.ts  |   97.97 |    93.93 |     100 |   97.97 | ...78-180,255,307 
  ...essBackend.ts |   91.25 |    90.62 |   86.66 |   91.25 | ...94,249-269,328 
  TmuxBackend.ts   |    90.7 |    76.55 |   97.36 |    90.7 | ...87,697,743-747 
  detect.ts        |   31.25 |      100 |       0 |   31.25 | 34-88             
  index.ts         |     100 |      100 |     100 |     100 |                   
  iterm-it2.ts     |     100 |     92.1 |     100 |     100 | 37-38,106         
  tmux-commands.ts |    6.64 |      100 |    3.03 |    6.64 | ...93-363,386-503 
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...agents/runtime |   81.14 |     76.7 |   71.42 |   81.14 |                   
  agent-context.ts |     100 |      100 |     100 |     100 |                   
  agent-core.ts    |   76.49 |    72.35 |   60.86 |   76.49 | ...1608,1635-1682 
  agent-events.ts  |     100 |      100 |     100 |     100 |                   
  ...t-headless.ts |   81.19 |    71.73 |   60.86 |   81.19 | ...98-399,402-403 
  ...nteractive.ts |   79.71 |    79.62 |      75 |   79.71 | ...54,456,458,461 
  ...statistics.ts |   98.19 |    82.35 |     100 |   98.19 | 127,151,192,225   
  agent-types.ts   |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/agents/tasks  |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/config        |   78.38 |    82.01 |   65.78 |   78.38 |                   
  config.ts        |   76.22 |     80.8 |   61.16 |   76.22 | ...3748,3759-3771 
  constants.ts     |     100 |      100 |     100 |     100 |                   
  models.ts        |     100 |      100 |     100 |     100 |                   
  storage.ts       |   95.01 |     90.9 |   90.47 |   95.01 | ...71-372,375-376 
 ...nfirmation-bus |   98.29 |    97.14 |     100 |   98.29 |                   
  message-bus.ts   |   98.14 |    97.05 |     100 |   98.14 | 42-43             
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/core          |   86.36 |    83.04 |   90.09 |   86.36 |                   
  baseLlmClient.ts |   87.24 |    76.47 |    87.5 |   87.24 | ...82,484-494,503 
  client.ts        |   87.56 |     81.3 |   86.11 |   87.56 | ...1925,1964-1967 
  ...tGenerator.ts |    72.1 |    61.11 |     100 |    72.1 | ...63,365,372-375 
  ...lScheduler.ts |   80.33 |     80.9 |   93.47 |   80.33 | ...2559,2611-2615 
  geminiChat.ts    |   90.65 |    86.85 |   91.66 |   90.65 | ...1779,1846-1847 
  geminiRequest.ts |     100 |      100 |     100 |     100 |                   
  ...htProtocol.ts |    9.09 |      100 |       0 |    9.09 | 34-42,45-49,52-87 
  logger.ts        |   87.33 |    87.02 |     100 |   87.33 | ...61-565,611-625 
  ...tyDefaults.ts |     100 |      100 |     100 |     100 |                   
  ...olExecutor.ts |   92.59 |       75 |      50 |   92.59 | 41-42             
  ...on-helpers.ts |   85.71 |    70.58 |     100 |   85.71 | ...90-191,205-214 
  ...issionFlow.ts |   98.59 |    94.73 |     100 |   98.59 | 93                
  prompts.ts       |   89.36 |    86.41 |   76.92 |   89.36 | ...-977,1180-1181 
  tokenLimits.ts   |     100 |    89.47 |     100 |     100 | 51-52             
  ...okTriggers.ts |   99.31 |    90.41 |     100 |   99.31 | 124,135           
  turn.ts          |   96.46 |    88.88 |     100 |   96.46 | ...19,432-433,481 
 ...ntentGenerator |   94.92 |    82.59 |   93.87 |   94.92 |                   
  ...tGenerator.ts |   96.48 |    84.28 |   92.59 |   96.48 | ...01,919-923,963 
  converter.ts     |   94.51 |    80.72 |     100 |   94.51 | ...06-607,617,823 
  index.ts         |       0 |        0 |       0 |       0 | 1-21              
  usage.ts         |     100 |      100 |     100 |     100 |                   
 ...ntentGenerator |   91.53 |    71.64 |   93.33 |   91.53 |                   
  ...tGenerator.ts |      90 |    70.96 |   92.85 |      90 | ...80-286,304-305 
  index.ts         |     100 |       80 |     100 |     100 | 50                
 ...ntentGenerator |   93.34 |    80.28 |   90.32 |   93.34 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...tGenerator.ts |   93.32 |    80.28 |   90.32 |   93.32 | ...01,911-912,940 
 ...ntentGenerator |   81.66 |    84.08 |    90.9 |   81.66 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  converter.ts     |   76.88 |    82.25 |    87.5 |   76.88 | ...1589,1610-1616 
  errorHandler.ts  |     100 |      100 |     100 |     100 |                   
  index.ts         |   52.38 |    44.44 |      50 |   52.38 | ...77,81-85,89-93 
  ...tGenerator.ts |    66.4 |    70.58 |   88.88 |    66.4 | ...51-157,168-169 
  pipeline.ts      |   93.67 |     84.9 |     100 |   93.67 | ...80-481,489,554 
  ...ureContext.ts |     100 |      100 |     100 |     100 |                   
  ...ingOptions.ts |       0 |        0 |       0 |       0 | 1                 
  ...CallParser.ts |   90.66 |    88.57 |     100 |   90.66 | ...15-319,349-350 
  ...kingParser.ts |     100 |    96.87 |     100 |     100 | 42                
  types.ts         |       0 |        0 |       0 |       0 | 1                 
 ...rator/provider |   96.83 |    89.55 |   95.65 |   96.83 |                   
  dashscope.ts     |   97.29 |    89.77 |   93.33 |   97.29 | ...81-282,358-359 
  deepseek.ts      |   95.55 |    90.56 |     100 |   95.55 | ...31-132,145-146 
  default.ts       |   95.79 |    89.65 |   88.88 |   95.79 | 122-123,193-195   
  index.ts         |     100 |      100 |     100 |     100 |                   
  minimax.ts       |     100 |      100 |     100 |     100 |                   
  mistral.ts       |   96.07 |    73.33 |     100 |   96.07 | 32-33             
  modelscope.ts    |     100 |      100 |     100 |     100 |                   
  openrouter.ts    |     100 |      100 |     100 |     100 |                   
  types.ts         |       0 |        0 |       0 |       0 |                   
 src/extension     |   60.56 |    79.46 |    78.4 |   60.56 |                   
  ...-converter.ts |   62.35 |    47.82 |      90 |   62.35 | ...90-791,800-832 
  ...ionManager.ts |   47.04 |    82.06 |    65.9 |   47.04 | ...1398,1408-1427 
  ...onSettings.ts |   93.46 |    93.05 |     100 |   93.46 | ...17-221,228-232 
  ...-converter.ts |   54.88 |    94.44 |      60 |   54.88 | ...35-146,158-192 
  github.ts        |   44.94 |    88.52 |      60 |   44.94 | ...53-359,398-451 
  index.ts         |     100 |      100 |     100 |     100 |                   
  marketplace.ts   |   97.29 |    93.75 |     100 |   97.29 | ...64,184-185,274 
  npm.ts           |   48.66 |    76.08 |      75 |   48.66 | ...18-420,427-431 
  override.ts      |   94.11 |    88.88 |     100 |   94.11 | 63-64,81-82       
  settings.ts      |   66.26 |      100 |      50 |   66.26 | 81-108,143-149    
  storage.ts       |     100 |      100 |     100 |     100 |                   
  ...ableSchema.ts |     100 |      100 |     100 |     100 |                   
  variables.ts     |   88.75 |    83.33 |     100 |   88.75 | ...28-231,234-237 
 src/followup      |   55.57 |    84.14 |   81.25 |   55.57 |                   
  followupState.ts |      96 |    89.74 |     100 |      96 | 159-161,218-219   
  index.ts         |     100 |      100 |     100 |     100 |                   
  overlayFs.ts     |   95.06 |       84 |     100 |   95.06 | 78,108,122,133    
  speculation.ts   |   13.02 |      100 |   16.66 |   13.02 | 89-464,524-575    
  ...onToolGate.ts |     100 |    96.42 |     100 |     100 | 94                
  ...nGenerator.ts |    71.6 |    72.13 |   83.33 |    71.6 | ...88-246,316-318 
 src/generated     |       0 |        0 |       0 |       0 |                   
  git-commit.ts    |       0 |        0 |       0 |       0 | 1-10              
 src/goals         |   89.57 |    83.45 |   94.44 |   89.57 |                   
  ...eGoalStore.ts |    85.1 |    95.45 |   84.61 |    85.1 | ...63-166,174-182 
  goalHook.ts      |   97.26 |    91.48 |     100 |   97.26 | 100-105           
  goalJudge.ts     |   84.33 |    74.28 |     100 |   84.33 | ...57-358,366-368 
  index.ts         |     100 |      100 |     100 |     100 |                   
 src/hooks         |   83.48 |    84.87 |   86.83 |   83.48 |                   
  ...okRegistry.ts |   86.48 |    77.08 |     100 |   86.48 | ...41-344,362-369 
  ...bortSignal.ts |     100 |      100 |     100 |     100 |                   
  ...terpolator.ts |   96.66 |    93.33 |     100 |   96.66 | 66-67             
  ...HookRunner.ts |   96.68 |    87.23 |     100 |   96.68 | 110-112,231-233   
  ...Aggregator.ts |    96.4 |    90.78 |     100 |    96.4 | ...91,293-294,367 
  ...entHandler.ts |   94.56 |    83.78 |   93.33 |   94.56 | ...38,795-796,806 
  hookPlanner.ts   |   84.13 |    76.59 |      90 |   84.13 | ...38,144,162-173 
  hookRegistry.ts  |   90.17 |    83.33 |     100 |   90.17 | ...33,352,356,360 
  hookRunner.ts    |   58.56 |    71.26 |   66.66 |   58.56 | ...48-749,758-759 
  hookSystem.ts    |   84.57 |      100 |   65.85 |   84.57 | ...21-622,628-629 
  ...HookRunner.ts |   75.51 |     61.9 |      80 |   75.51 | ...05-406,424-425 
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...HookRunner.ts |   93.63 |    89.47 |      90 |   93.63 | ...45-353,427-428 
  ...SkillHooks.ts |   78.75 |       75 |   66.66 |   78.75 | 62-66,137-152     
  ...oksManager.ts |   96.66 |    91.66 |     100 |   96.66 | ...90,209-210,223 
  ssrfGuard.ts     |   77.22 |    85.36 |     100 |   77.22 | ...57,261-267,273 
  stopHookCap.ts   |     100 |      100 |     100 |     100 |                   
  trustedHooks.ts  |       0 |        0 |       0 |       0 | 1-124             
  types.ts         |   91.21 |    92.04 |   85.71 |   91.21 | ...40-441,501-505 
  urlValidator.ts  |     100 |      100 |     100 |     100 |                   
 src/ide           |   74.28 |    83.39 |   78.33 |   74.28 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  detect-ide.ts    |     100 |      100 |     100 |     100 |                   
  ide-client.ts    |    64.2 |    81.48 |   66.66 |    64.2 | ...9-970,999-1007 
  ide-installer.ts |   89.06 |    79.31 |     100 |   89.06 | ...36,143-147,160 
  ideContext.ts    |     100 |      100 |     100 |     100 |                   
  process-utils.ts |   84.84 |    71.79 |     100 |   84.84 | ...37,151,193-194 
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/lsp           |   41.24 |    52.14 |   51.42 |   41.24 |                   
  ...nfigLoader.ts |   70.27 |    35.89 |   94.73 |   70.27 | ...20-422,426-432 
  ...ionFactory.ts |   42.69 |    79.16 |      50 |   42.69 | ...62-413,419-436 
  ...Normalizer.ts |   23.09 |    13.72 |   30.43 |   23.09 | ...04-905,909-924 
  ...verManager.ts |   25.31 |    62.06 |   41.66 |   25.31 | ...85-704,710-740 
  ...eLspClient.ts |   32.77 |       80 |   17.64 |   32.77 | ...84-288,294-295 
  ...LspService.ts |   48.49 |    67.16 |   65.71 |   48.49 | ...1352,1369-1379 
  constants.ts     |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/mcp           |   78.69 |    75.34 |   75.92 |   78.69 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  ...h-provider.ts |   86.95 |      100 |   33.33 |   86.95 | ...,93,97,101-102 
  ...h-provider.ts |   73.82 |    53.92 |     100 |   73.82 | ...88-895,902-904 
  ...en-storage.ts |   98.62 |    97.72 |     100 |   98.62 | 87-88             
  oauth-utils.ts   |   70.58 |    85.29 |    90.9 |   70.58 | ...70-290,315-344 
  ...n-provider.ts |   89.83 |    95.83 |   45.45 |   89.83 | ...43,147,151-152 
 .../token-storage |   79.52 |    86.66 |   86.36 |   79.52 |                   
  ...en-storage.ts |     100 |      100 |     100 |     100 |                   
  ...en-storage.ts |   82.87 |    82.35 |   92.85 |   82.87 | ...63-173,181-182 
  ...en-storage.ts |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...en-storage.ts |   68.14 |    82.35 |   64.28 |   68.14 | ...81-295,298-314 
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/memory        |      68 |    76.57 |   66.66 |      68 |                   
  const.ts         |     100 |      100 |     100 |     100 |                   
  dream.ts         |   65.65 |    73.33 |      50 |   65.65 | 50,107-148        
  ...entPlanner.ts |   57.84 |    72.72 |   33.33 |   57.84 | ...35,140-147,152 
  entries.ts       |   63.77 |    79.16 |      50 |   63.77 | ...72-180,183-189 
  extract.ts       |    95.2 |    79.16 |     100 |    95.2 | 81-86,125         
  ...entPlanner.ts |   63.08 |    65.71 |   41.17 |   63.08 | ...17,222-223,332 
  ...ionPlanner.ts |       0 |        0 |       0 |       0 | 1                 
  forget.ts        |    45.8 |    61.53 |   44.44 |    45.8 | ...04,211,214-346 
  indexer.ts       |   83.87 |    45.45 |     100 |   83.87 | ...50,56-57,69-70 
  manager.ts       |   75.31 |    81.04 |    75.6 |   75.31 | ...1278,1291-1293 
  memoryAge.ts     |   90.47 |    77.77 |     100 |   90.47 | 50-51             
  paths.ts         |   55.47 |    89.47 |   85.71 |   55.47 | ...,89-90,106-114 
  prompt.ts        |   93.36 |    71.42 |     100 |   93.36 | ...58,161,228-229 
  recall.ts        |   77.54 |    69.38 |   88.88 |   77.54 | ...53-258,282-293 
  ...ceSelector.ts |   91.86 |    77.27 |     100 |   91.86 | ...15,117-118,126 
  scan.ts          |   87.91 |    68.42 |     100 |   87.91 | ...47-48,58,82-87 
  ...entPlanner.ts |    11.5 |      100 |       0 |    11.5 | ...57-192,210-298 
  status.ts        |   10.52 |      100 |       0 |   10.52 | 41-98             
  store.ts         |   94.44 |    83.33 |     100 |   94.44 | 56-57,92-93       
  types.ts         |     100 |      100 |     100 |     100 |                   
  ...ontextFile.ts |   79.38 |    81.03 |   81.81 |   79.38 | ...58-272,286-291 
 src/mocks         |       0 |        0 |       0 |       0 |                   
  msw.ts           |       0 |        0 |       0 |       0 | 1-9               
 src/models        |   89.35 |    85.67 |    87.5 |   89.35 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  ...tor-config.ts |   90.24 |    91.42 |     100 |   90.24 | 142,148,151-160   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...nfigErrors.ts |   74.22 |       44 |   84.61 |   74.22 | ...,67-74,106-117 
  ...igResolver.ts |   98.66 |    92.85 |     100 |   98.66 | 162,324,330       
  modelRegistry.ts |     100 |    98.59 |     100 |     100 | 222               
  modelsConfig.ts  |   84.57 |    82.14 |   81.57 |   84.57 | ...1223,1252-1253 
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/output        |     100 |      100 |     100 |     100 |                   
  ...-formatter.ts |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/permissions   |   74.28 |    88.55 |   57.55 |   74.28 |                   
  autoMode.ts      |   61.59 |    93.54 |   83.33 |   61.59 | ...00-238,340-356 
  ...transcript.ts |      98 |       84 |     100 |      98 | 200-201           
  classifier.ts    |   92.89 |     87.5 |     100 |   92.89 | 146-153,333-337   
  ...erousRules.ts |     100 |    83.87 |     100 |     100 | 101,113,137-143   
  ...alTracking.ts |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...on-manager.ts |   78.26 |    85.24 |   82.14 |   78.26 | ...-916,1022-1026 
  rule-parser.ts   |   95.99 |    93.22 |     100 |   95.99 | ...-864,1013-1015 
  ...-semantics.ts |   58.28 |    85.27 |    30.2 |   58.28 | ...1604-1614,1643 
  types.ts         |     100 |      100 |     100 |     100 |                   
 ...sifier-prompts |   98.18 |       90 |     100 |   98.18 |                   
  system-prompt.ts |   98.18 |       90 |     100 |   98.18 | 150               
 src/prompts       |   83.63 |      100 |    87.5 |   83.63 |                   
  mcp-prompts.ts   |   18.18 |      100 |       0 |   18.18 | 11-19             
  ...t-registry.ts |     100 |      100 |     100 |     100 |                   
 src/qwen          |   83.87 |    77.23 |   95.83 |   83.87 |                   
  ...tGenerator.ts |   98.64 |    98.18 |     100 |   98.64 | 105-106           
  qwenOAuth2.ts    |   80.85 |    70.27 |   90.32 |   80.85 | ...1169-1185,1215 
  ...kenManager.ts |   83.76 |    76.22 |     100 |   83.76 | ...62-767,788-793 
 src/services      |   85.68 |    83.51 |   91.11 |   85.68 |                   
  ...ionTrailer.ts |     100 |      100 |     100 |     100 |                   
  ...llRegistry.ts |   98.44 |    91.83 |     100 |   98.44 | 268-269           
  ...ionService.ts |   98.03 |    96.42 |   85.71 |   98.03 | ...98,700-704,837 
  ...ingService.ts |   83.88 |    83.33 |   83.33 |   83.88 | ...1266,1283-1284 
  ...ttribution.ts |   91.73 |    87.71 |      90 |   91.73 | ...80-685,826-827 
  ...utSlimming.ts |     100 |    96.77 |     100 |     100 | 141,190           
  cronScheduler.ts |   97.56 |    92.98 |     100 |   97.56 | 62-63,77,155      
  ...eryService.ts |   80.43 |    95.45 |      75 |   80.43 | ...19-134,140-141 
  ...oryService.ts |   86.18 |    76.76 |   91.17 |   86.18 | ...1150,1191-1194 
  fileReadCache.ts |     100 |      100 |     100 |     100 |                   
  ...temService.ts |   91.27 |    82.69 |    90.9 |   91.27 | ...94,196,294-301 
  ...ratedFiles.ts |      96 |    88.23 |     100 |      96 | 119-120,146-147   
  gitInit.ts       |     100 |      100 |     100 |     100 |                   
  gitService.ts    |   68.75 |     92.3 |   55.55 |   68.75 | ...12-122,125-129 
  ...reeService.ts |   73.83 |    69.31 |    97.5 |   73.83 | ...1460,1488-1489 
  ...ionService.ts |   98.13 |     97.8 |   95.45 |   98.13 | ...32-333,380-381 
  ...orRegistry.ts |   96.54 |    91.73 |     100 |   96.54 | ...70-471,622-623 
  sessionRecap.ts  |   12.65 |      100 |       0 |   12.65 | 44-150            
  ...ionService.ts |   90.23 |     78.8 |   96.77 |   90.23 | ...1294,1298-1299 
  sessionTitle.ts  |   93.87 |    71.15 |     100 |   93.87 | ...33-236,267-268 
  ...ionService.ts |   81.07 |    77.92 |   89.28 |   81.07 | ...1923,1929-1934 
  ...Estimation.ts |     100 |      100 |     100 |     100 |                   
  ...UseSummary.ts |   94.63 |    88.46 |     100 |   94.63 | ...62-164,214-215 
  ...reeCleanup.ts |   14.56 |      100 |   33.33 |   14.56 | 58-185            
  ...ionService.ts |   84.21 |    79.41 |     100 |   84.21 | ...22-223,239-240 
 ...icrocompaction |   98.05 |     91.8 |     100 |   98.05 |                   
  microcompact.ts  |   98.05 |     91.8 |     100 |   98.05 | ...19,289,293,391 
 src/skills        |    87.5 |    83.86 |   94.23 |    87.5 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...activation.ts |     100 |     93.1 |     100 |     100 | 93,112            
  skill-load.ts    |   92.94 |    81.63 |     100 |   92.94 | ...06,226,238-240 
  skill-manager.ts |   83.31 |    79.66 |   90.32 |   83.31 | ...1120,1127-1131 
  skill-paths.ts   |   86.74 |    77.77 |     100 |   86.74 | ...00-101,106-107 
  symlinkScope.ts  |     100 |      100 |     100 |     100 |                   
  types.ts         |     100 |      100 |     100 |     100 |                   
 src/subagents     |   82.61 |    78.89 |   95.23 |   82.61 |                   
  ...tin-agents.ts |     100 |      100 |     100 |     100 |                   
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...nt-manager.ts |   77.15 |    71.36 |    93.1 |   77.15 | ...1178,1200-1201 
  types.ts         |     100 |      100 |     100 |     100 |                   
  validation.ts    |   92.46 |    95.18 |     100 |   92.46 | 51-56,69-74,78-83 
 src/telemetry     |   74.72 |    86.01 |   78.85 |   74.72 |                   
  config.ts        |     100 |      100 |     100 |     100 |                   
  constants.ts     |     100 |      100 |     100 |     100 |                   
  ...attributes.ts |   98.13 |       88 |     100 |   98.13 | 185-187           
  ...-exporters.ts |   46.37 |      100 |   44.44 |   46.37 | ...85,88-89,92-93 
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...t.circular.ts |       0 |        0 |       0 |       0 | 1-111             
  ...-processor.ts |   93.93 |    90.21 |   94.11 |   93.93 | ...75-280,299-300 
  ...t.circular.ts |       0 |        0 |       0 |       0 | 1-128             
  loggers.ts       |    51.9 |       64 |   57.77 |    51.9 | ...1214,1231-1251 
  metrics.ts       |    74.9 |    82.95 |   74.54 |    74.9 | ...58-978,981-992 
  sanitize.ts      |      80 |    83.33 |     100 |      80 | 35-36,41-42       
  sdk.ts           |   90.45 |    83.56 |   76.92 |   90.45 | ...17-318,338-342 
  ...on-context.ts |     100 |      100 |     100 |     100 |                   
  ...on-tracing.ts |   92.24 |    88.77 |     100 |   92.24 | ...21-424,522-525 
  ...etry-utils.ts |     100 |      100 |     100 |     100 |                   
  ...l-decision.ts |     100 |      100 |     100 |     100 |                   
  ...e-id-utils.ts |     100 |      100 |     100 |     100 |                   
  tracer.ts        |   98.61 |    89.36 |     100 |   98.61 | 53,108            
  types.ts         |   79.17 |    85.83 |   83.33 |   79.17 | ...1149,1152-1181 
  uiTelemetry.ts   |   92.97 |    96.96 |   81.25 |   92.97 | ...93-194,200-207 
 ...ry/qwen-logger |   68.24 |    79.56 |   64.91 |   68.24 |                   
  event-types.ts   |       0 |        0 |       0 |       0 |                   
  qwen-logger.ts   |   68.24 |    79.34 |   64.28 |   68.24 | ...1055,1093-1094 
 src/test-utils    |   93.16 |    95.91 |   76.47 |   93.16 |                   
  config.ts        |     100 |      100 |     100 |     100 |                   
  ...st-helpers.ts |   94.11 |       90 |     100 |   94.11 | 69-70             
  index.ts         |     100 |      100 |     100 |     100 |                   
  mock-tool.ts     |   91.19 |    97.14 |   72.41 |   91.19 | ...38,202-203,216 
  ...aceContext.ts |     100 |      100 |     100 |     100 |                   
 src/tools         |   78.53 |     81.6 |   85.98 |   78.53 |                   
  ...erQuestion.ts |   88.93 |    76.74 |    90.9 |   88.93 | ...39-340,347-348 
  cron-create.ts   |   88.11 |    88.88 |    62.5 |   88.11 | ...,43-44,165-172 
  cron-delete.ts   |   96.82 |      100 |   83.33 |   96.82 | 26-27             
  cron-list.ts     |   96.66 |      100 |   83.33 |   96.66 | 25-26             
  diffOptions.ts   |     100 |      100 |     100 |     100 |                   
  edit.ts          |   81.02 |    84.07 |      75 |   81.02 | ...15-716,826-876 
  ...r-worktree.ts |   82.95 |    67.56 |    87.5 |   82.95 | ...82-185,276-277 
  exit-worktree.ts |   84.23 |    85.96 |   91.66 |   84.23 | ...92-293,298-312 
  exitPlanMode.ts  |   85.09 |    85.71 |     100 |   85.09 | ...60-163,177-189 
  glob.ts          |   90.63 |    88.33 |   84.61 |   90.63 | ...28,171,302,305 
  grep.ts          |   79.19 |    85.71 |   78.94 |   79.19 | ...20,560,569-576 
  ls.ts            |   96.74 |    90.27 |     100 |   96.74 | 176-181,212,216   
  lsp.ts           |   72.77 |    60.09 |   90.32 |   72.77 | ...1211,1213-1214 
  ...nt-manager.ts |   84.36 |    82.74 |   84.21 |   84.36 | ...2099-2103,2142 
  mcp-client.ts    |   33.18 |    77.65 |   66.66 |   33.18 | ...1490,1494-1497 
  mcp-tool.ts      |   90.98 |    88.88 |   96.42 |   90.98 | ...95-596,646-647 
  memory-config.ts |       0 |        0 |       0 |       0 | 1-47              
  ...iable-tool.ts |     100 |    84.61 |     100 |     100 | 102,109           
  monitor.ts       |   91.36 |    83.94 |   88.46 |   91.36 | ...61,574,770-775 
  ...nforcement.ts |   82.44 |       90 |     100 |   82.44 | 174-185,234-247   
  read-file.ts     |   95.09 |    88.75 |      90 |   95.09 | ...99,293-296,299 
  ripGrep.ts       |   94.59 |    85.71 |   93.33 |   94.59 | ...60,463,541-542 
  ...-transport.ts |    6.34 |        0 |       0 |    6.34 | 47-145            
  send-message.ts  |   84.68 |    91.66 |    62.5 |   84.68 | ...,82-90,167-170 
  shell.ts         |   73.05 |    79.66 |   91.42 |   73.05 | ...4216,4265-4271 
  skill-utils.ts   |     100 |      100 |     100 |     100 |                   
  skill.ts         |   88.35 |    91.42 |   86.66 |   88.35 | ...12,416,439-461 
  ...eticOutput.ts |   95.12 |      100 |      80 |   95.12 | 87-88             
  task-stop.ts     |   93.14 |    96.15 |   85.71 |   93.14 | 39-40,54-64       
  todoWrite.ts     |   89.17 |    82.05 |   92.85 |   89.17 | ...41-546,568-569 
  tool-error.ts    |     100 |      100 |     100 |     100 |                   
  tool-names.ts    |     100 |      100 |     100 |     100 |                   
  tool-registry.ts |   74.85 |    76.85 |   80.95 |   74.85 | ...30-831,839-840 
  tool-search.ts   |   95.19 |    86.48 |    92.3 |   95.19 | ...47-153,208-213 
  tools.ts         |   90.49 |    90.19 |   84.21 |   90.49 | ...78-479,495-501 
  web-fetch.ts     |   88.84 |       80 |   92.85 |   88.84 | ...12-313,315-316 
  write-file.ts    |   82.65 |    80.45 |   84.61 |   82.65 | ...65-668,696-731 
 src/tools/agent   |   74.64 |    81.34 |   73.61 |   74.64 |                   
  agent.ts         |    74.9 |     81.6 |   74.24 |    74.9 | ...2390,2399-2402 
  fork-subagent.ts |   69.62 |    71.42 |   66.66 |   69.62 | ...04-105,140-151 
 src/utils         |   88.99 |    87.67 |    93.6 |   88.99 |                   
  LruCache.ts      |       0 |        0 |       0 |       0 | 1-41              
  ...ssageQueue.ts |     100 |      100 |     100 |     100 |                   
  ...cFileWrite.ts |   77.96 |    80.48 |     100 |   77.96 | ...35,156,173-176 
  bareMode.ts      |   27.27 |      100 |       0 |   27.27 | 9-15,18-19        
  browser.ts       |    7.69 |      100 |       0 |    7.69 | 17-56             
  bundlePaths.ts   |     100 |      100 |     100 |     100 |                   
  ...igResolver.ts |     100 |      100 |     100 |     100 |                   
  ...engthError.ts |   89.11 |    87.23 |     100 |   89.11 | ...28-129,132-133 
  cronDisplay.ts   |   42.85 |    23.07 |     100 |   42.85 | 26-31,33-45,47-54 
  cronParser.ts    |   89.74 |    85.71 |     100 |   89.74 | ...,63-64,183-186 
  debugLogger.ts   |    95.9 |    93.84 |   94.73 |    95.9 | 106-107,214-218   
  editHelper.ts    |   93.63 |    83.52 |     100 |   93.63 | ...28-429,463-464 
  editor.ts        |   97.61 |    95.71 |     100 |   97.61 | ...70-271,273-274 
  ...arResolver.ts |   94.28 |    88.88 |     100 |   94.28 | 28-29,125-126     
  ...entContext.ts |     100 |    95.45 |     100 |     100 | 83                
  errorParsing.ts  |    97.7 |    97.05 |     100 |    97.7 | 72-73             
  ...rReporting.ts |   88.46 |       90 |     100 |   88.46 | 69-74             
  errors.ts        |   70.92 |    79.59 |   53.33 |   70.92 | ...03-219,223-229 
  fetch.ts         |   70.18 |    71.42 |   71.42 |   70.18 | ...42,148,161,186 
  fileUtils.ts     |   91.46 |    86.19 |   95.23 |   91.46 | ...1188,1192-1198 
  forkedAgent.ts   |   80.68 |    78.12 |   83.33 |   80.68 | ...39-545,550-556 
  formatters.ts    |   81.81 |       75 |     100 |   81.81 | 15-16             
  ...eUtilities.ts |   89.21 |    86.66 |     100 |   89.21 | 16-17,49-55,65-66 
  ...rStructure.ts |   94.36 |    94.28 |     100 |   94.36 | ...17-120,330-335 
  getPty.ts        |    12.5 |      100 |       0 |    12.5 | 21-34             
  gitDiff.ts       |   92.36 |    79.53 |     100 |   92.36 | ...55-856,928-929 
  ...noreParser.ts |    92.3 |    89.36 |     100 |    92.3 | ...15-116,186-187 
  gitUtils.ts      |   56.66 |    85.71 |      75 |   56.66 | ...2,72-73,97-148 
  iconvHelper.ts   |     100 |      100 |     100 |     100 |                   
  ...rePatterns.ts |     100 |      100 |     100 |     100 |                   
  ...ionManager.ts |     100 |     90.9 |     100 |     100 | 26                
  ...lPromptIds.ts |     100 |      100 |     100 |     100 |                   
  jsonl-utils.ts   |    74.1 |    90.76 |   58.33 |    74.1 | ...23-326,336-342 
  ...-detection.ts |     100 |      100 |     100 |     100 |                   
  ...iagnostics.ts |   96.87 |    91.83 |     100 |   96.87 | 214-219,272       
  ...yDiscovery.ts |    83.9 |    79.36 |     100 |    83.9 | ...16,319,411-414 
  ...tProcessor.ts |   93.63 |       90 |     100 |   93.63 | ...96-302,384-385 
  ...Inspectors.ts |   61.53 |      100 |      50 |   61.53 | 18-23             
  modelId.ts       |   98.95 |    98.18 |     100 |   98.95 | 148               
  ...kerChecker.ts |   88.75 |    85.71 |     100 |   88.75 | 69-70,87-93       
  notebook.ts      |   94.35 |    84.78 |     100 |   94.35 | ...10,122,174-176 
  openaiLogger.ts  |   88.05 |    84.09 |     100 |   88.05 | ...44-146,169-174 
  partUtils.ts     |     100 |    98.61 |     100 |     100 | 206               
  pathReader.ts    |     100 |      100 |     100 |     100 |                   
  paths.ts         |   93.21 |    91.86 |     100 |   93.21 | ...89-390,392-394 
  pdf.ts           |   93.68 |    87.05 |     100 |   93.68 | ...96-297,321-325 
  projectPath.ts   |     100 |      100 |     100 |     100 |                   
  ...ectSummary.ts |   89.39 |    72.41 |     100 |   89.39 | ...37-142,193-196 
  ...tIdContext.ts |     100 |      100 |     100 |     100 |                   
  proxyUtils.ts    |     100 |      100 |     100 |     100 |                   
  ...rDetection.ts |   58.57 |       76 |     100 |   58.57 | ...4,88-89,95-100 
  ...noreParser.ts |   85.45 |    85.18 |     100 |   85.45 | ...59,65-66,72-73 
  rateLimit.ts     |   92.55 |    85.92 |     100 |   92.55 | ...70-272,309-310 
  readManyFiles.ts |   87.96 |    86.95 |     100 |   87.96 | ...05-207,223-234 
  retry.ts         |   89.81 |    88.05 |     100 |   89.81 | ...29,350,357-358 
  ripgrepUtils.ts  |   46.79 |    84.37 |   66.66 |   46.79 | ...45-246,258-335 
  ...sDiscovery.ts |   97.42 |    92.85 |     100 |   97.42 | ...04,182-183,202 
  ...tchOptions.ts |   81.72 |    85.04 |   95.23 |   81.72 | ...11,536,565-574 
  runtimeStatus.ts |    97.5 |    88.57 |     100 |    97.5 | 167-168           
  safeJsonParse.ts |   74.07 |    83.33 |     100 |   74.07 | 40-46             
  ...nStringify.ts |     100 |      100 |     100 |     100 |                   
  ...aConverter.ts |   90.78 |    88.23 |     100 |   90.78 | ...41-42,93,95-96 
  ...aValidator.ts |   94.57 |    80.26 |     100 |   94.57 | ...04,213-216,270 
  ...r-launcher.ts |   76.92 |     91.3 |   66.66 |   76.92 | ...34,136,157-195 
  ...orageUtils.ts |   96.89 |    85.84 |     100 |   96.89 | ...51,367,447,466 
  shell-utils.ts   |   82.93 |    89.89 |     100 |   82.93 | ...1522,1529-1533 
  ...lAstParser.ts |   95.58 |    85.79 |     100 |   95.58 | ...1059-1061,1071 
  ...nlyChecker.ts |   95.75 |    92.39 |     100 |   95.75 | ...00-301,313-314 
  sideQuery.ts     |   98.71 |    97.14 |     100 |   98.71 | 110               
  ...pEventSink.ts |     100 |       80 |     100 |     100 | 61                
  ...tGenerator.ts |     100 |      100 |     100 |     100 |                   
  ...ameContext.ts |     100 |      100 |     100 |     100 |                   
  symlink.ts       |   77.77 |       50 |     100 |   77.77 | 44,54-59          
  ...emEncoding.ts |   96.36 |    91.17 |     100 |   96.36 | 59-60,124-125     
  terminalSafe.ts  |     100 |      100 |     100 |     100 |                   
  ...Serializer.ts |   98.72 |       90 |     100 |   98.72 | 42-43,134,201-203 
  testUtils.ts     |   53.33 |      100 |   33.33 |   53.33 | ...53,59-64,70-72 
  textUtils.ts     |      60 |      100 |   66.66 |      60 | 36-55             
  thoughtUtils.ts  |     100 |    92.85 |     100 |     100 | 71                
  ...-converter.ts |   94.59 |    85.71 |     100 |   94.59 | 35-36             
  tool-utils.ts    |    93.6 |     91.3 |     100 |    93.6 | ...58-159,162-163 
  truncation.ts    |     100 |       92 |     100 |     100 | 52,71             
  windowsPath.ts   |   89.47 |    79.31 |     100 |   89.47 | ...57-58,62,90-91 
  ...aceContext.ts |   93.71 |    89.28 |   93.33 |   93.71 | ...24-225,249-251 
  xml.ts           |     100 |      100 |     100 |     100 |                   
  yaml-parser.ts   |      92 |    84.61 |     100 |      92 | 49-53,65-69       
 ...ils/filesearch |   86.21 |    81.61 |   96.42 |   86.21 |                   
  crawlCache.ts    |     100 |      100 |     100 |     100 |                   
  crawler.ts       |   82.84 |    77.49 |   94.82 |   82.84 | ...1451,1485-1486 
  fileSearch.ts    |   93.58 |    87.32 |     100 |   93.58 | ...46-247,249-250 
  ignore.ts        |     100 |      100 |     100 |     100 |                   
  result-cache.ts  |     100 |     92.3 |     100 |     100 | 46                
 ...uest-tokenizer |   56.63 |    74.52 |   74.19 |   56.63 |                   
  ...eTokenizer.ts |   41.86 |    76.47 |   69.23 |   41.86 | ...70-443,453-507 
  index.ts         |     100 |      100 |     100 |     100 |                   
  ...tTokenizer.ts |   68.39 |    69.49 |    90.9 |   68.39 | ...24-325,327-328 
  ...ageFormats.ts |      76 |      100 |   33.33 |      76 | 45-48,55-56       
  textTokenizer.ts |     100 |      100 |     100 |     100 |                   
  types.ts         |       0 |        0 |       0 |       0 | 1                 
-------------------|---------|----------|---------|---------|-------------------

For detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run.

The Task 11 redesign updated the non-interactive text formatter (formatContextUsageText) but left ContextUsage.tsx, the interactive React component that real /context users see, unchanged. As a result the TUI still showed the old single "Autocompact buffer" line and none of the new warn/auto/hard ladder.

Adds a "Compaction thresholds" section after the per-category breakdown:
- Effective window
- Warn / Auto / Hard threshold rows with a ▶ marker on the row the current usage has crossed
- Current tier label coloured by severity (safe→green, warn/auto→yellow, hard→red)

The existing progress bar legend (Used / Free / Autocompact buffer) is preserved because it's tied to the three-segment progress bar visualisation; the new section adds the absolute numbers + tier badge on top of that.

Caught by the tmux e2e test (PR #4168 ci-monitor follow-up). Pre-fix, the assertion on 'Compaction thresholds' found no match anywhere in the TUI; post-fix, the new section renders correctly for fresh and live sessions on 1M / 200K / 128K windows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

LaZzyMan · 2026-05-15T10:00:05Z

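E2E test report

Following up on review/CI feedback, I ran several more rounds of E2E verification against the core functionality. Final coverage matrix:

✅ Real-model E2E (most important)

Configuration: qwen3.6-plus (1M window) · tmux interactive mode · 3 conversation turns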

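| Stage | Used tokens | Notes |
|---|---|---|
| Auto threshold | 967,000 | `computeThresholds(1M).auto` |
| After turn 1 (reads ~3.5MB of TS source inline) | 908,128 (90.8%) | Close to the threshold but not past it |
| Turn 2 starts (loads more content) | Estimate exceeds 967K → compaction triggered | cheap-gate works correctly |
| TUI compression banner | `compressed from: 908128 to 20742 tokens` | Real reduction |
| `/context` after turn 2 | 20,742 (2.1%) · tier=`safe` | History really did shrink |
| Turn 3, arbitrary prompt | Model answers normally | Conversation is intact after compaction |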

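Measured context reduction: 97.7%. All 6 assertions pass:

- Used crosses the Auto threshold (precondition met)
- The compression event fires (UI banner appears)
- Y < X and Y << X, where X and Y are the token counts before and after compaction (not a fake compaction)
- Used after compaction is far below auto
- tier returns to safe
- Subsequent turns respond normally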
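The 97.7% figure and the two usage percentages in the table follow directly from the token counts in the compression banner. A small sketch of the arithmetic (the helper name `reductionPercent` is illustrative and does not come from the PR):

```typescript
// Token counts reported by the compression banner in the E2E run.
const tokensBefore = 908_128;
const tokensAfter = 20_742;
const windowSize = 1_000_000; // 1M-token window used in the run

// Hypothetical helper: share of the context removed by compaction, in percent.
function reductionPercent(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

const reduction = reductionPercent(tokensBefore, tokensAfter).toFixed(1); // "97.7"
const usedBefore = ((tokensBefore / windowSize) * 100).toFixed(1); // "90.8"
const usedAfter = ((tokensAfter / windowSize) * 100).toFixed(1); // "2.1"
```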

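✅ TUI `/context` three-tier threshold display

The tmux e2e run initially found that the ContextUsage.tsx React component had not been updated together with Task 11: the TUI still showed the old single-line Autocompact buffer. Fixed in commit 378635550. TUI rendering after the fix: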

```
Compaction thresholds
Effective window 980.0k tokens
Warn threshold 947.0k tokens
Auto threshold 967.0k tokens
Hard threshold 977.0k tokens
Current tier safe
```

A ▶ marker flags the threshold that the current usage has crossed, and the tier label is coloured by severity (safe→green / warn, auto→yellow / hard→red).
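To make the tier and colour mapping concrete, here is a minimal sketch. It uses the three threshold values from the TUI output above as plain data and does not reproduce `computeThresholds()`. The names `currentTier` and `tierColor` are illustrative, and treating a tier as crossed when usage is greater than or equal to its threshold is an assumption for the warn and auto rows.

```typescript
type Tier = 'safe' | 'warn' | 'auto' | 'hard';

interface Thresholds {
  warn: number;
  auto: number;
  hard: number;
}

// Values displayed by /context for the 1M window in this report.
const thresholds1M: Thresholds = { warn: 947_000, auto: 967_000, hard: 977_000 };

// Assumption: a tier counts as crossed once usage is >= its threshold.
function currentTier(used: number, t: Thresholds): Tier {
  if (used >= t.hard) return 'hard';
  if (used >= t.auto) return 'auto';
  if (used >= t.warn) return 'warn';
  return 'safe';
}

// Severity colouring as described: safe is green, warn/auto yellow, hard red.
function tierColor(tier: Tier): 'green' | 'yellow' | 'red' {
  switch (tier) {
    case 'safe':
      return 'green';
    case 'warn':
    case 'auto':
      return 'yellow';
    case 'hard':
      return 'red';
  }
}

const tierAfterCompaction = currentTier(20_742, thresholds1M); // 'safe'
const tierAfterTurnOne = currentTier(908_128, thresholds1M); // 'safe' (below warn)
```

Under this reading, the 908,128 tokens used after turn 1 still sit below the 947K warn threshold, which matches the report's note that the run had not yet crossed a threshold at that point.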

✅ Headless smoke

`node dist/cli.js "..." --approval-mode yolo --output-format json` runs normally and emits the complete JSON stream with type: system/assistant/result.

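⚠️ Covered by unit tests only; no real-model E2E run

The following scenarios have no E2E coverage yet because constructing the real trigger conditions is expensive (making the real sideQuery fail repeatedly, deliberately filling the context up to hard, and so on), but unit-test coverage is in place:

- Failure circuit breaker: once consecutiveFailures reaches MAX_CONSECUTIVE_FAILURES=3, the cheap-gate returns NOOP. Three cases in geminiChat.test.ts verify this directly (tolerates 2 failures, forced attempts are not counted, success resets the counter)
- Hard-tier rescue: when effectiveTokens >= hard, compaction runs with force=true and the breaker is reset. Three cases in geminiChat.test.ts (force triggers, counter resets, below-hard does not trigger)
- MAX_TOKENS guard: when the sideQuery output reaches the 20K cap, return NOOP so that a truncated summary is never persisted. chatCompressionService.test.ts asserts the NOOP status + warn log directly
- v4 nested deprecation warning: when a user sets the deprecated field at model.chatCompression.contextPercentageThreshold, a warning is written to stderr at startup. Covered by 3 cases in config.test.ts (present warns / absent no-warn / other field no-warn)

If a reviewer feels a real-scenario E2E is needed, I can add one; but the unit tests already cover these cases strictly and the additional cost/benefit shows diminishing returns.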
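The three breaker behaviours listed above (tolerate two failures, skip counting forced attempts, reset on success) can be sketched as a small state machine. This is a simplified illustration under those stated behaviours, not the code in geminiChat.ts; the class and method names are hypothetical, and only `MAX_CONSECUTIVE_FAILURES` and `consecutiveFailures` are names taken from the report.

```typescript
const MAX_CONSECUTIVE_FAILURES = 3;

// Hypothetical sketch of the consecutive-failure breaker.
class CompactionBreakerSketch {
  private consecutiveFailures = 0;

  // Unforced attempts are skipped once the breaker has tripped;
  // forced attempts (for example a hard-tier rescue) always run.
  shouldAttempt(force: boolean): boolean {
    return force || this.consecutiveFailures < MAX_CONSECUTIVE_FAILURES;
  }

  // Assumption from the unit-test description: forced failures are not counted.
  recordFailure(force: boolean): void {
    if (!force) this.consecutiveFailures += 1;
  }

  // Any successful compaction clears the counter.
  recordSuccess(): void {
    this.consecutiveFailures = 0;
  }
}

const breaker = new CompactionBreakerSketch();
breaker.recordFailure(false);
breaker.recordFailure(false);
const openAfterTwo = breaker.shouldAttempt(false); // true: two failures are tolerated
breaker.recordFailure(false);
const openAfterThree = breaker.shouldAttempt(false); // false: breaker has tripped
const forcedStillRuns = breaker.shouldAttempt(true); // true
breaker.recordSuccess();
const openAfterSuccess = breaker.shouldAttempt(false); // true: counter was reset
```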

🤖 Generated with Claude Code using the `e2e-testing` skill.

wenshao · 2026-05-16T23:12:46Z

+        );
+      return {
+        newHistory: null,
+        info: {


[Suggestion] Truncated summary returns NOOP — circuit breaker never trips

When compression output hits the 20K COMPACT_MAX_OUTPUT_TOKENS cap, compress() returns CompressionStatus.NOOP. Since isCompressionFailureStatus() does not match NOOP, consecutiveFailures is never incremented. If the model consistently produces max-length summaries, every subsequent send wastes an API call on a compaction attempt that will be dropped.

Consider treating the MAX_TOKENS truncation as a recoverable failure (increment the counter without locking) so the breaker can trip after repeated occurrences:

Suggested change

-        info: {
+        config
+          .getDebugLogger()
+          .warn(
+            `[chat-compression] summary output reached the ` +
+              `COMPACT_MAX_OUTPUT_TOKENS cap (${COMPACT_MAX_OUTPUT_TOKENS}); ` +
+              `dropping potentially-truncated result.`,
+          );
+      return {
+        newHistory: null,
+        info: {
+          originalTokenCount,
+          newTokenCount: originalTokenCount,
+          compressionStatus: CompressionStatus.FAILURE,
+        },
+      };

— glm-5.1 via Qwen Code /review

✅ Fixed in 181393c — MAX_TOKENS truncation guard now returns COMPRESSION_FAILED_EMPTY_SUMMARY (a failure status), so the consecutive-failure breaker ticks on repeated truncations instead of wasting an API call per send. Unit test updated to assert the new status.

wenshao · 2026-05-16T23:12:46Z

+                    // suppress further auto-compaction since the chat clearly
+                    // can't shrink — trip the breaker to its NOOP threshold so
+                    // subsequent unforced sends short-circuit at the cheap-gate.
+                    self.consecutiveFailures = MAX_CONSECUTIVE_FAILURES;


[Suggestion] Reactive failure permanently disables auto-compaction on a single transient error

self.consecutiveFailures = MAX_CONSECUTIVE_FAILURES directly sets the counter to the maximum rather than incrementing. A single transient network error during reactive compression permanently disables auto-compaction until the next hard-threshold crossing resets it. While the comment explains the rationale (forced compression already failed), consider incrementing instead so that N distinct failures are required:

Suggested change

-                    self.consecutiveFailures = MAX_CONSECUTIVE_FAILURES;
+                    self.consecutiveFailures += 1;

Or, if the current behavior is intentional, add a short comment noting that hard-tier rescue is the designated recovery path.

— glm-5.1 via Qwen Code /review

✅ Fixed in 181393c — reactive failure now does self.consecutiveFailures += 1 instead of = MAX. Comment notes hard-tier rescue as the designated recovery path. Test marks failed reactive compression attempts... updated to assert counter == 1 (not MAX).

wenshao · 2026-05-16T23:12:46Z

+        DEFAULT_TOKEN_LIMIT;
+      const { hard } = computeThresholds(contextLimit);
+      const effectiveTokens = estimatePromptTokens(
+        this.getHistory(true),


[Suggestion] Redundant deep clones + token estimation on every send

The hard-tier rescue calls estimatePromptTokens(this.getHistory(true), ...) (deep clone + full history walk). Then tryCompress → cheap-gate calls estimatePromptTokens(chat.getHistory(true), ...) again (second deep clone + walk). If the cheap-gate passes, compress() calls chat.getHistory(true) a third time.

Every sendMessageStream pays for 2–3 full-history clones and 2 estimation traversals even when no compaction is needed. Consider computing the effective token count once here and passing it into tryCompress as a pre-computed value:

Suggested change

-        this.getHistory(true),
+      const effectiveTokens = estimatePromptTokens(
+        this.getHistory(true),
+        userContent,
+        this.lastPromptTokenCount,
+        resolveSlimmingConfig(chatCompressionSettings).imageTokenEstimate,
+      );

This also fixes a minor inconsistency: this call uses the default imageTokenEstimate (1600) while the cheap-gate inside tryCompress uses the user's configured value.

— glm-5.1 via Qwen Code /review

✅ Fixed in 181393c — sendMessageStream now computes effectiveTokens once and threads it through TryCompressOptions.precomputedEffectiveTokens; service.compress skips its own estimation pass when supplied. Also uses resolveSlimmingConfig(chatCompressionSettings).imageTokenEstimate so the rescue and cheap-gate paths see the same value. Steady-state path (count>0) skips the costly getHistory(true) clone since estimatePromptTokens only needs the user message in that branch — drops the per-send clone count from 2–3 to 1.
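The compute-once pattern described in this thread can be illustrated with a stripped-down sketch. History is modelled as plain strings and every function name ending in `Sketch` is hypothetical; only the option name `precomputedEffectiveTokens` comes from the reply above. The counter exists purely to show that a single estimation pass is paid per send.

```typescript
interface TryCompressOptionsSketch {
  precomputedEffectiveTokens?: number;
}

let estimationPasses = 0;

// Stand-in for the local estimator: one full walk over the history.
function estimateTokensSketch(history: string[]): number {
  estimationPasses += 1;
  let chars = 0;
  for (const message of history) chars += message.length;
  return Math.ceil(chars / 4);
}

// Cheap-gate: reuse the caller's estimate when supplied, otherwise compute it.
function tryCompressSketch(
  history: string[],
  autoThreshold: number,
  opts: TryCompressOptionsSketch = {},
): boolean {
  const effective =
    opts.precomputedEffectiveTokens ?? estimateTokensSketch(history);
  return effective >= autoThreshold;
}

// Caller: estimate once, use it for the hard-tier check, then thread it through.
function sendSketch(
  history: string[],
  thresholds: { auto: number; hard: number },
): { forced: boolean; compacted: boolean } {
  const effectiveTokens = estimateTokensSketch(history);
  const forced = effectiveTokens >= thresholds.hard;
  const compacted =
    forced ||
    tryCompressSketch(history, thresholds.auto, {
      precomputedEffectiveTokens: effectiveTokens,
    });
  return { forced, compacted };
}

// 400 chars is about 100 tokens: above auto (90), below hard (200).
const sendResult = sendSketch(['a'.repeat(400)], { auto: 90, hard: 200 });
```

Calling `tryCompressSketch` without the option still works and falls back to its own estimation pass, which keeps other callers unaffected.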

wenshao · 2026-05-16T23:12:46Z

+      const pendingUserMessage = opts.pendingUserMessage;
+      const effectiveTokens = pendingUserMessage
+        ? estimatePromptTokens(
+            chat.getHistory(true),


[Suggestion] computeThresholds() called twice per send + redundant getHistory(true)

computeThresholds(contextLimit) is called here in the cheap-gate, and again in geminiChat.ts hard-tier rescue for the same contextLimit. Similarly, estimatePromptTokens(chat.getHistory(true), ...) clones the full history even when originalTokenCount > 0 (only estimateContentTokens([userMessage]) is needed in that case — the history is unused).

Consider:

- Moving computeThresholds into a lazy field on the service (or passing the result from the caller)
- Guarding the getHistory(true) call behind the originalTokenCount === 0 branch to avoid the wasted clone

Suggested change

-            chat.getHistory(true),
+      const thresholds = computeThresholds(contextLimit);
+      const effectiveTokens =
+        originalTokenCount > 0
+          ? estimateContentTokens([pendingUserMessage ?? []])
+          : estimatePromptTokens(
+              chat.getHistory(true),
+              pendingUserMessage,
+              0,
+              slimmingConfig.imageTokenEstimate,
+            );

— glm-5.1 via Qwen Code /review

✅ Fixed in 181393c alongside the geminiChat-side change — precomputedEffectiveTokens lets sendMessageStream pass the value computed there directly, so the service's cheap-gate no longer recomputes computeThresholds + getHistory(true) + estimatePromptTokens. The steady-state branch (lastPromptTokenCount > 0) also skips history cloning entirely now.

wenshao · 2026-05-16T23:23:04Z

    id: 'context-critical',
    content:
-      'Context is almost full! Run /compress now or start /new to continue.',
+      'Context near hard limit — auto-compact will force on next send. Consider /clear if you want to start fresh.',


[Suggestion] Dead code: getContextUsagePercent has zero callers

The three context-* tips were all rewired to use ctx.thresholds directly (as visible here — the getContextUsagePercent(ctx) >= 95 call on the old L39 is replaced by ctx.thresholds.hard). However, getContextUsagePercent itself (defined at L41) is no longer called anywhere in the codebase but remains exported from ./index.ts. Consider removing it.

— DeepSeek/deepseek-v4-pro via Qwen Code /review

✅ Fixed in 181393c — getContextUsagePercent and its index.ts re-export removed. Confirmed no callers in source (grep clean across packages/cli/src and packages/core/src).

wenshao · 2026-05-16T23:23:04Z

+ * `breakdown.thresholds` + `breakdown.currentTier`, which the context command
+ * derives from `computeThresholds()` in core.
+ */
+const CompactionThresholds: React.FC<{


[Suggestion] New CompactionThresholds/ThresholdRow components have no tests

The new ~60 lines of JSX (ThresholdRow at L149, CompactionThresholds at L202, tierColor switch at ~L158) that render the three-tier ladder with color-coded tier labels and isCurrent arrow indicators are untested. No test file exists for ContextUsage.tsx, and this PR adds none. While the underlying data calculations are covered in contextCommand.test.ts, the rendering behavior (tier color mapping, arrow positioning, conditional visibility) is not verified in CI.

— DeepSeek/deepseek-v4-pro via Qwen Code /review

✅ Partially fixed in 181393c — added packages/cli/src/ui/components/views/ContextUsage.test.tsx with 4 ink-testing-library cases covering the new Compaction-thresholds section: header + 4 threshold rows render, ▶ marker placement per current tier (safe/warn/hard), and the colored tier label. Skipped a true snapshot because the precise frame layout drifts with terminal width — string-match assertions on labels/marker presence are more durable.

wenshao

Two additional observations that don't map to specific diff lines:

- Missing test: contextPercentageThreshold setting is silently ignored. The config.ts deprecation warns on stderr, but no test verifies the behavioral change: setting contextPercentageThreshold: 0 (which previously disabled auto-compaction) now has no effect. Consider adding a test pinning this.
- tierTokens = 0 when isEstimated shows misleading "safe" tier. In contextCommand.ts, when no API data exists yet (first render, --continue), tierTokens = 0 makes currentTier always 'safe' even for large inherited history. Consider using estimated overhead as the tier input when isEstimated.

— mimo-v2.5-pro via Qwen Code /review

wenshao · 2026-05-17T05:19:18Z

+ * Average bytes-per-token for char-based token estimation.
+ * Matches claude-code's roughTokenCountEstimation default (tokens.ts).
+ */
+export const BYTES_PER_TOKEN = 4;


[Suggestion] BYTES_PER_TOKEN is a misleading name — the value divides character counts (from estimateContentChars → string.length), not byte counts. For CJK text (3 bytes/char UTF-8), the name actively misleads. The adjacent module compactionInputSlimming.ts correctly names the identical ratio TOKEN_TO_CHAR_RATIO = 4.

Suggested change

-export const BYTES_PER_TOKEN = 4;
+/**
+ * Average characters-per-token for char-based token estimation.
+ * Matches the inverse of TOKEN_TO_CHAR_RATIO in compactionInputSlimming.ts.
+ */
+export const CHARS_PER_TOKEN = 4;

Then update the two usages (Math.ceil(totalChars / CHARS_PER_TOKEN) at line 39 and the return at line 66).

— mimo-v2.5-pro via Qwen Code /review

✅ Fixed in 181393c — renamed BYTES_PER_TOKEN → CHARS_PER_TOKEN (the inputs are character counts from string.length, not byte counts; old name misleads on CJK). Doc updated to reference TOKEN_TO_CHAR_RATIO in compactionInputSlimming.ts as the inverse.
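The distinction behind the rename is easy to see on CJK text, where `string.length` counts UTF-16 code units while UTF-8 uses three bytes per character. In the sketch below, `utf8ByteLength` and `estimateTokensFromChars` are illustrative helpers written for this example and are not part of tokenEstimation.ts.

```typescript
const CHARS_PER_TOKEN = 4;

// Illustrative helper: UTF-8 byte length computed by hand, no runtime APIs needed.
function utf8ByteLength(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const code = s.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff) {
      bytes += 4; // surrogate pair encodes one 4-byte code point
      i++;
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Char-based estimate: divides the character count, not the byte count.
function estimateTokensFromChars(s: string): number {
  return Math.ceil(s.length / CHARS_PER_TOKEN);
}

const sample = '你好世界';
const charCount = sample.length; // 4
const byteCount = utf8ByteLength(sample); // 12
const charBasedEstimate = estimateTokensFromChars(sample); // 1
const byteBasedEstimate = Math.ceil(byteCount / CHARS_PER_TOKEN); // 3
```

A reader who took the old name literally and divided bytes by 4 would get a result three times larger for this string, which is why the character-based name is the less surprising one.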

wenshao · 2026-05-17T05:19:18Z

+    if (
+      !isSummaryEmpty &&
+      typeof compressionOutputTokenCount === 'number' &&
+      compressionOutputTokenCount >= COMPACT_MAX_OUTPUT_TOKENS


[Suggestion] Off-by-one in truncation guard: >= COMPACT_MAX_OUTPUT_TOKENS rejects summaries that are exactly 20,000 tokens. Since maxOutputTokens is set to the same value, a model that produces a valid summary at exactly the cap limit gets dropped. The > operator is more appropriate — it catches outputs that exceeded the cap (impossible with the API budget, but defensive), while allowing outputs that landed exactly at the limit.

Suggested change

-      compressionOutputTokenCount >= COMPACT_MAX_OUTPUT_TOKENS
+      compressionOutputTokenCount > COMPACT_MAX_OUTPUT_TOKENS

Note: the dropped NOOP does NOT trip the circuit breaker (NOOP is not a failure status in isCompressionFailureStatus), so the worst case is a retry loop rather than permanent disablement. Still worth fixing to avoid unnecessary retry cycles.

— mimo-v2.5-pro via Qwen Code /review

❌ Declined — the >= operator is intentionally conservative: with maxOutputTokens=20K, p99.99 of real summaries is ~17K (per claude-code data), so landing exactly at the cap is far more likely truncation than a clean stop. > would make the guard effectively dead (the API can't return more than the cap). With R1.1 now in place (this batch), persistent truncation trips the breaker after MAX_CONSECUTIVE_FAILURES, so the worst-case wasted-retry cost is bounded. The proper long-term fix is plumbing finish_reason through runSideQuery, which is out of scope here.
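The argument for keeping `>=` can be checked with a small predicate that mirrors the condition in the diff hunk above. The function names are hypothetical; the sketch assumes, as the reply states, that the API never returns more output tokens than the cap.

```typescript
const COMPACT_MAX_OUTPUT_TOKENS = 20_000;

// Mirrors the guard condition from the diff: a non-empty summary whose
// output token count reached the cap is treated as likely truncated.
function looksTruncated(
  isSummaryEmpty: boolean,
  outputTokens: number | undefined,
): boolean {
  return (
    !isSummaryEmpty &&
    typeof outputTokens === 'number' &&
    outputTokens >= COMPACT_MAX_OUTPUT_TOKENS
  );
}

// The declined variant with a strict comparison, for contrast.
function looksTruncatedStrict(
  isSummaryEmpty: boolean,
  outputTokens: number | undefined,
): boolean {
  return (
    !isSummaryEmpty &&
    typeof outputTokens === 'number' &&
    outputTokens > COMPACT_MAX_OUTPUT_TOKENS
  );
}

const atCap = looksTruncated(false, 20_000); // true: dropped as likely truncated
const typical = looksTruncated(false, 17_000); // false: kept
const unknownCount = looksTruncated(false, undefined); // false: no usage data
const strictAtCap = looksTruncatedStrict(false, 20_000); // false
```

With output capped at 20,000 tokens, the strict variant can never return true, which is the "effectively dead" guard the reply describes.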

wenshao · 2026-05-17T05:19:18Z

+      estimateContentTokens([userMessage], imageTokenEstimate)
+    );
+  }
+  return estimateContentTokens([...history, userMessage], imageTokenEstimate);


[Suggestion] When lastPromptTokenCount === 0 (first send after --continue or inherited history), the fallback estimates only history + userMessage. It misses system prompt (~8-15K tokens), tool definitions (~5K), skill content, and cached content. This underestimates by ~15-20K tokens, which could cause the hard-tier rescue to not fire when it should.

The docstring correctly warns "using it to SKIP compaction is not [safe]" and the reactive overflow is the safety net, but closing the gap would improve first-send behavior. Consider adding a configurable or estimated overhead baseline:

Suggested change

-  return estimateContentTokens([...history, userMessage], imageTokenEstimate);
+  // Fallback: estimate from history + user message. Note this underestimates
+  // by ~15-20K tokens (system prompt, tool definitions, skills) — the reactive
+  // overflow handler is the safety net if the hard-tier rescue misses.
+  return estimateContentTokens([...history, userMessage], imageTokenEstimate);

— mimo-v2.5-pro via Qwen Code /review

✅ Partially fixed in 181393c — added an explanatory comment on the fallback branch documenting the ~15-20K under-estimate (system prompt + tool definitions + skills + cache headers) and that reactive overflow is the safety net. Skipped adding a magic overhead constant because the actual overhead is per-config (depends on tools loaded, skills active) and a fixed value would be a different kind of guess.
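The two branches under discussion can be sketched for text-only messages. This is a simplification under stated assumptions: the real estimator walks `Content` parts and takes an `imageTokenEstimate`, and the steady-state formula (last API count plus the pending user message) is inferred from the diff fragment and the earlier reply rather than copied from the source. The names `TextMessage`, `estimateTextTokens`, and `estimatePromptTokensSketch` are hypothetical.

```typescript
const CHARS_PER_TOKEN = 4;

// Simplified stand-in for Content: text only, no images or tool parts.
interface TextMessage {
  role: 'user' | 'model';
  text: string;
}

function estimateTextTokens(messages: TextMessage[]): number {
  let chars = 0;
  for (const m of messages) chars += m.text.length;
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

function estimatePromptTokensSketch(
  history: TextMessage[],
  userMessage: TextMessage,
  lastPromptTokenCount: number,
): number {
  if (lastPromptTokenCount > 0) {
    // Steady state (assumed): trust the API's last count, add only the new message.
    return lastPromptTokenCount + estimateTextTokens([userMessage]);
  }
  // Fallback (first send, --continue): walk the whole history. This misses
  // system prompt, tool definitions and skills, so it under-estimates.
  return estimateTextTokens([...history, userMessage]);
}

const inherited: TextMessage[] = [{ role: 'user', text: 'x'.repeat(400_000) }];
const pending: TextMessage = { role: 'user', text: 'hi' };

const firstSend = estimatePromptTokensSketch(inherited, pending, 0); // 100001
const steadyState = estimatePromptTokensSketch(inherited, pending, 5_000); // 5001
```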

wenshao · 2026-05-17T05:19:18Z

    });

-    it('clears hasFailedCompressionAttempt after a forced successful compression', async () => {
+    it('forwards the pending user message to the compression cheap-gate', async () => {


[Suggestion] This test mocks compress entirely, so the real estimatePromptTokens(history, pendingUserMessage, 0) full-history fallback path is never exercised end-to-end. The "first send after --continue / sub-agent inherited history" scenario — where lastPromptTokenCount === 0 and only the full-history estimate can cross the auto threshold — has no integration test.

Consider adding a test that uses the real ChatCompressionService (not a mock) with a GeminiChat seeded with large inherited history and lastPromptTokenCount = 0, then asserts that sendMessageStream triggers compaction.

— mimo-v2.5-pro via Qwen Code /review

✅ Fixed in 181393c — added the test "triggers compaction end-to-end through the real ChatCompressionService when lastPromptTokenCount === 0 and inherited history is large" to geminiChat.test.ts. It uses the real service (no vi.spyOn on compress.prototype), seeds 400K chars of history, sets lastPromptTokenCount=0, mocks runSideQuery at the baseLlmClient layer, and asserts that the stream emits a COMPRESSED event.

wenshao · 2026-05-17T05:19:18Z

+    expect(estimateContentTokens([c], 1600)).toBe(1600);
+  });
+
+  it('estimates functionCall (json-dense) contributes some positive count', () => {


[Suggestion] estimateContentTokens has a test for functionCall but no test for functionResponse, which has a distinct branch in estimateContentChars (nested parts walk, 64-char floor for wrapper metadata). Tool-heavy conversations are the exact scenario where context grows fastest.

Suggested change

-  it('estimates functionCall (json-dense) contributes some positive count', () => {
+  it('estimates functionResponse (json-dense) contributes some positive count', () => {
+    const c: Content = {
+      role: 'user',
+      parts: [{ functionResponse: { name: 'tool', response: { result: 'data'.repeat(100) } } }],
+    };
+    const result = estimateContentTokens([c]);
+    expect(result).toBeGreaterThan(0);
+  });
+
+  it('estimates functionCall (json-dense) contributes some positive count', () => {

— mimo-v2.5-pro via Qwen Code /review

✅ Fixed in 181393c — added the test "estimates functionResponse (nested parts) contributes some positive count" to tokenEstimation.test.ts. Tool-heavy conversations were the gap.

Adds a defensive guard in ChatCompressionService.compress() that detects when the side-query summary hit COMPACT_MAX_OUTPUT_TOKENS (20K). In that case the summary is likely truncated mid-content, so we drop it and return NOOP rather than persist a half-summary. The next send re-tries; reactive overflow still catches the catastrophic case where the API rejects the next request as too large. Documented in the design doc as risk #2; the bot reviewer on PR #4168 correctly pushed for it to land alongside the threshold redesign rather than as a follow-up since the new 20K cap is what makes truncation likely in the first place. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The Task 11 redesign updated the non-interactive text formatter (formatContextUsageText) but left ContextUsage.tsx — the interactive React component that real /context users see — unchanged. As a result the TUI still showed the old single "Autocompact buffer" line and none of the new warn/auto/hard ladder.

Adds a "Compaction thresholds" section after the per-category breakdown:
- Effective window
- Warn / Auto / Hard threshold rows with a ▶ marker on the row the current usage has crossed
- Current tier label coloured by severity (safe→green, warn/auto→yellow, hard→red)

The existing progress bar legend (Used / Free / Autocompact buffer) is preserved because it's tied to the three-segment progress bar visualisation; the new section adds the absolute numbers + tier badge on top of that.

Caught by the tmux e2e test (PR #4168 ci-monitor follow-up). Pre-fix the assertion 'Compaction thresholds' missed completely from the TUI; post-fix the new section renders correctly for fresh and live sessions on 1M / 200K / 128K windows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Behavior fixes:
- MAX_TOKENS truncation guard now returns COMPRESSION_FAILED_EMPTY_SUMMARY instead of NOOP so the consecutive-failure breaker actually trips after repeated max-length summaries (R1.1).
- Reactive overflow failure increments consecutiveFailures by 1 instead of latching to MAX in one shot, so a transient network blip doesn't permanently disable auto-compaction. The hard-tier rescue resets the counter, which remains the designated recovery path (R1.2).
- /context current-tier classification uses rawOverhead (system + tools + memory + skills) as the tier input when API data is not yet available, rather than 0 — large inherited contexts no longer silently show 'safe' (R2.2).

Performance:
- sendMessageStream computes effectiveTokens ONCE and passes it through TryCompressOptions.precomputedEffectiveTokens, so the cheap-gate inside service.compress doesn't redo the estimation. Also fixes the imageTokenEstimate inconsistency between the rescue and cheap-gate paths (R1.3 + R1.4).
- Steady-state path (lastPromptTokenCount > 0) skips the costly getHistory(true) clone — estimatePromptTokens only needs the user message in that branch.

Code hygiene:
- BYTES_PER_TOKEN → CHARS_PER_TOKEN (inputs are char counts, not byte counts; CJK text would mislead under the old name) (R3.1).
- Drop dead getContextUsagePercent helper + index re-export — no callers in source after the threshold rewire (R1.5).
- Add a comment on estimatePromptTokens' first-send fallback documenting the ~15-20K under-estimate (system prompt + tools + skills) and that reactive overflow is the safety net (R3.3).

Tests:
- New CLI ContextUsage.test.tsx exercises the React renderer for the three-tier section: section presence, ▶ marker placement per tier, current-tier label coloring (R1.6).
- New chatCompressionService.test.ts case pins that a stale contextPercentageThreshold: 0 value in user settings no longer short-circuits compaction (R2.1).
- New tokenEstimation.test.ts case covers functionResponse (distinct nested-parts branch from functionCall) (R3.5).
- New geminiChat.test.ts integration test exercises the real ChatCompressionService — not a mock — for the first-send-after-inherited-history scenario where lastPromptTokenCount=0 and only the full-history estimate can cross the auto threshold (R3.4).

Declined: R3.2 (change `>=` to `>` on the MAX_TOKENS guard). The current operator catches the at-cap case as suspicious, which is intentional — landing exactly at the output cap is far more likely truncation than clean stop given p99.99 ≈ 17K. With R1.1 in place, persistent truncations trip the breaker after MAX_CONSECUTIVE_FAILURES so the worst case is bounded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

LaZzyMan · 2026-05-18T02:57:31Z

Review batch 4 — commit `181393c49`

Rebased onto current main (3 conflict files resolved) and addressed the new wenshao-channel review.

Outcomes per finding

#	Outcome	One-line
R1.1 truncation→NOOP doesn't trip breaker	✅ Fixed	guard returns `COMPRESSION_FAILED_EMPTY_SUMMARY` so counter ticks
R1.2 reactive failure latches breaker	✅ Fixed	`consecutiveFailures += 1` instead of `= MAX`
R1.3 redundant deep clones + double estimation	✅ Fixed	`precomputedEffectiveTokens` threaded through opts; steady-state path skips `getHistory(true)`
R1.4 `computeThresholds` × 2 + redundant `getHistory`	✅ Fixed	same change as R1.3 + resolved `imageTokenEstimate` consistency
R1.5 `getContextUsagePercent` dead code	✅ Fixed	removed, no callers in source
R1.6 new TUI components untested	✅ Partially	added `ContextUsage.test.tsx` with 4 `ink-testing-library` cases (header, tier markers, color)
R2.1 missing test for deprecated field ignored	✅ Fixed	added unit test asserting `contextPercentageThreshold: 0` no longer disables compaction
R2.2 isEstimated → misleading 'safe'	✅ Fixed	tier classification uses `rawOverhead` (not 0) when API data absent
R3.1 `BYTES_PER_TOKEN` misleading on CJK	✅ Fixed	renamed to `CHARS_PER_TOKEN`; doc updated
R3.2 `>=` should be `>`	❌ Declined	guard intentionally conservative — at-cap is more often truncation than clean stop; R1.1 bounds worst case
R3.3 first-send fallback underestimates	✅ Partially	added comment documenting the ~15-20K under-estimate + reactive safety net; declined the magic-overhead constant
R3.4 first-send-after-continue not integration-tested	✅ Fixed	added real-service integration test in `geminiChat.test.ts`
R3.5 `functionResponse` not tested	✅ Fixed	added `estimateContentTokens` test for nested-parts branch

Verification

Affected unit/integration tests: 172/172 pass (chatCompressionService.test.ts 68 · geminiChat.test.ts 96 · tokenEstimation.test.ts 8)
New ContextUsage.test.tsx: 4/4 pass
npm run typecheck clean across all workspaces
npm run lint clean (the 11 errors on .claude/skills/e2e-testing/scripts/*.js are pre-existing and unrelated to this PR)
Rebased onto current main (1c529e4f0); 3 conflict files (geminiChat.ts, chatCompressionService.ts/test.ts) resolved to merge main's bypassTokenThreshold heap-pressure path with this PR's consecutiveFailures breaker — both mechanisms now coexist (heap-pressure bypass overrides the breaker carve-out)

11 inline threads have per-thread gh api replies pointing at this commit. Ready for next pass.

🤖 Triaged via the review-response skill.

wenshao

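Overall assessment

Reviewed all 22 changed files, focusing on chatCompressionService.ts / geminiChat.ts / tokenEstimation.ts / contextCommand.ts / tipRegistry.ts / ContextUsage.tsx and their tests and docs.

High quality; ready to merge. computeThresholds is a pure function with unit cases covering six boundary windows (32K / 64K / 128K / 200K / 1M / 10K-extreme), so the math is clear and verifiable. The max(proportional, absolute) combination lets small windows fall back to the proportional branch while large windows are governed entirely by the absolute branch, and the hard = max(rawHard, auto) collapse-to-auto catches the logic error where hard < auto on small windows. The estimatePromptTokens comment is candid about the ~15-20K under-estimate of the first-send fallback and about reactive overflow's role as the safety net, so design, code, and comments agree. In steady state (lastPromptTokenCount > 0), passing [] to estimatePromptTokens to skip the getHistory(true) clone is a reasonable hot-path optimization.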
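The ladder math described in this assessment can be sketched as follows. This is a rough illustration, not the PR's implementation: every numeric knob below is a hypothetical placeholder, `effectiveWindow` is assumed to equal the raw window, and the warn tier is assumed to use the same max(proportional, absolute) form as auto.

```typescript
interface CompactionThresholds {
  effectiveWindow: number;
  warn: number;
  auto: number;
  hard: number;
}

// Hypothetical knobs for illustration; the PR's real constants are not shown here.
interface LadderKnobs {
  warnPercent: number;
  warnReserve: number;
  autoPercent: number;
  autoReserve: number;
  hardReserve: number;
}

// max(proportional, absolute): the proportional term wins on small windows,
// the absolute-reservation term wins on large ones.
function ladderTier(window: number, percent: number, reserve: number): number {
  return Math.max(Math.floor((window * percent) / 100), window - reserve);
}

function computeThresholds(window: number, k: LadderKnobs): CompactionThresholds {
  const warn = ladderTier(window, k.warnPercent, k.warnReserve);
  const auto = ladderTier(window, k.autoPercent, k.autoReserve);
  const rawHard = window - k.hardReserve;
  // Collapse-to-auto: on small windows rawHard can fall below auto.
  const hard = Math.max(rawHard, auto);
  return { effectiveWindow: window, warn, auto, hard };
}

const knobs: LadderKnobs = {
  warnPercent: 60,
  warnReserve: 53000,
  autoPercent: 70,
  autoReserve: 33000,
  hardReserve: 23000,
};
const large = computeThresholds(1000000, knobs); // absolute branch dominates
const small = computeThresholds(10000, knobs); // proportional branch, hard collapses to auto
```

With these placeholder knobs the 1M window yields an auto threshold of 967000 from the absolute branch, while the 10K window yields 7000 from the proportional branch with hard collapsed to the same value.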

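The highlight is R3.4's end-to-end integration test: the real sendMessageStream → tryCompress → real ChatCompressionService → real cheap-gate → splitter → mocked baseLlmClient → persistence chain runs all the way through, directly covering the lastPromptTokenCount === 0 branch, which has historically been the most bug-prone.

Findings (flagged in inline comments)

Suggestion: the tier classification at contextCommand.ts:313 uses rawOverhead, which excludes messagesTokens. A session where --continue restored 100K of history messages still shows "safe" in /context, yet the next send immediately triggers compression at the cheap-gate, so the UI and the runtime disagree. Either fetch the history from the chat and reuse estimatePromptTokens, or narrow the comment's scope.
Suggestion: the >= COMPACT_MAX_OUTPUT_TOKENS truncation check at chatCompressionService.ts:553 is a heuristic; a legitimate summary of exactly 20K tokens would be misclassified as truncated. Reusing COMPRESSION_FAILED_EMPTY_SUMMARY on this path also keeps telemetry from telling prompt-quality problems apart from capacity problems. Suggested: add a TODO(finish_reason) and a COMPRESSION_FAILED_OUTPUT_TRUNCATED sub-status.
Suggestion: the hard-rescue at geminiChat.ts:752 resets the counter before tryCompress(force=true), and the force=true path skips the increment in the service's failure branch. As a result, hard-rescue failures never add to the counter and only reactive overflow backstops them. The semantics are reasonable but the field name is misleading; add a line of explanation to the consecutiveFailures field JSDoc.
Must fix: CompressOptions.hasFailedCompressionAttempt: boolean → consecutiveFailures: number is an SDK breaking change that the PR description / release notes do not currently list.

Risk audit statement

I adversarially audited the following:

Possible infinite loop when hard-rescue fails: ✅ no; after a failure the API request is still sent and reactive overflow takes over;
Possibility that COMPACT_MAX_OUTPUT_TOKENS = 20K can never compress successfully: ⚠️ three false-positive truncations would trip the breaker, see finding #2;
Cross-provider behavioural consistency: ✅ disabling thinking and pinning maxOutputTokens minimizes the uncertainty (Anthropic thinking budget / OpenAI reasoning tokens / Gemini model differences are all sidestepped);
Accumulation boundaries of consecutiveFailures across the force / heap-pressure / reactive paths: ✅ I read the R1.2 / R1.4 comments and the logic is closed, but it needs the comment addition from finding #3;
Whether the /context UI and the cheap-gate share a source: ⚠️ finding #1;
Compatibility: ⚠️ finding #4.

I also noticed docs/plans/2026-05-14-auto-compaction-threshold-redesign.md, at 1752 lines, in the PR. As a style matter, a plan doc of that size is heavy for the tree, but since it does not ship at runtime it does not block the merge.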

wenshao · 2026-05-18T03:56:32Z

+  // should not silently show "safe" just because the API hasn't been hit.
+  // The estimate is a lower bound (excludes message body until first turn)
+  // so the tier may under-classify, but never over-classifies. (R2.2)
+  const tierTokens = isEstimated ? rawOverhead : apiTotalTokens;


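The /context estimation branch can still misreport "safe": the stated goal of this comment is "restored session with 800K of inherited history should not silently show safe", but rawOverhead only covers system prompt / builtin tools / mcp tools / memory / skills and excludes messagesTokens.

Practical consequence: a session where --continue restored 100K of history messages (not memory/skills overhead) still shows safe when /context runs before the first send, yet the next send immediately triggers compression at the cheap-gate, so the UI and the runtime disagree.

The cheap-gate in chatCompressionService.ts only catches this correctly because it uses estimatePromptTokens(history, pendingUserMessage, lastPromptTokenCount=0); the rawOverhead used here is an entirely different data source.

Pick one of two options:

(a) Pass the chat history through to collectContextData and call estimatePromptTokens(history, undefined, 0, imageTokenEstimate) in the estimation branch, so it shares a source with the cheap-gate;

(b) Tighten the comment to state explicitly: "this fix only covers overhead-heavy scenarios (memory / skills / MCP); message-heavy scenarios (--continue restore) are not corrected until the first-send cheap-gate".

R3.4's end-to-end test corroborates this: that case uses chat.setHistory([a 400K-character user message, ...]) + lastPromptTokenCount === 0, and the cheap-gate correctly triggers compression, but at that same moment /context would show safe.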

✅ Partially fixed in 681b705 — went with reviewer's option (b): tightened the inline comment to scope the fix to overhead-heavy sessions, and added a TODO to plumb chat history into collectContextData for same-source-of-truth as the cheap-gate. Option (a) requires adding a parameter to collectContextData (Config doesn't expose the active chat today), which is a bigger plumbing change deferred.

wenshao · 2026-05-18T03:56:32Z

+          // perspective a truncated summary is unusable just like an empty
+          // one. `isCompressionFailureStatus()` returns true for this enum,
+          // so non-force callers will tick the consecutive-failure counter.
+          compressionStatus: CompressionStatus.COMPRESSION_FAILED_EMPTY_SUMMARY,


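The output-truncation guard can be strengthened in two ways:

(1) Base the check on finish_reason rather than >= 20K (L553): compressionOutputTokenCount >= COMPACT_MAX_OUTPUT_TOKENS is a heuristic. A legitimate summary of exactly 20K tokens is also judged truncated, counted as a failure, and trips the breaker after three occurrences. The PR description itself acknowledges "A finish_reason === MAX_TOKENS NOOP guard would be a sensible follow-up". At minimum, add a TODO(finish_reason) comment here to anchor the follow-up:

// TODO: switch to finish_reason === MAX_TOKENS when sideQuery surfaces it
// — the current >= cap heuristic false-positives on legitimate 20K summaries.

(2) Reusing COMPRESSION_FAILED_EMPTY_SUMMARY blurs telemetry (L572): an empty summary (a prompt-quality problem, possibly calling for prompt or split-point tuning) and a truncated one (a capacity problem, calling for cap or splitter tuning) are two entirely different operational signals. Once they are merged into one enum value, logs / telemetry cannot show which kind occurred.

Suggest adding COMPRESSION_FAILED_OUTPUT_TRUNCATED, with isCompressionFailureStatus() likewise returning true for it (equivalent as far as the persistence layer is concerned), so that logs and telemetry can distinguish the two failure modes.

The comment at L562 already says "Reuse the empty-summary status: from the persistence layer's perspective a truncated summary is unusable just like an empty one". That is a reasonable approximation, but it costs observability.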

✅ Fixed in 681b705 (both parts): (1) added TODO(finish_reason) comment at the truncation guard documenting that >= cap is a heuristic awaiting runSideQuery to surface finish_reason; (2) added CompressionStatus.COMPRESSION_FAILED_OUTPUT_TRUNCATED enum value distinct from EMPTY_SUMMARY so telemetry can separate prompt-quality failures from capacity failures. isCompressionFailureStatus() returns true for both, so persistence/breaker behaviour is unchanged. Updated the truncation test to expect the new status.
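A minimal sketch of the status split described in this reply. The member names come from the thread; the string values, the reduced member list, and the `failureKind` telemetry helper are assumptions for illustration (the real enum has more members and may be numeric).

```typescript
// String values are illustrative; only the member names come from the thread.
enum CompressionStatus {
  COMPRESSED = 'COMPRESSED',
  NOOP = 'NOOP',
  COMPRESSION_FAILED_EMPTY_SUMMARY = 'COMPRESSION_FAILED_EMPTY_SUMMARY',
  COMPRESSION_FAILED_OUTPUT_TRUNCATED = 'COMPRESSION_FAILED_OUTPUT_TRUNCATED',
}

// Both failure variants count toward the breaker, so persistence and breaker
// behaviour are the same as when they shared one status.
function isCompressionFailureStatus(status: CompressionStatus): boolean {
  return (
    status === CompressionStatus.COMPRESSION_FAILED_EMPTY_SUMMARY ||
    status === CompressionStatus.COMPRESSION_FAILED_OUTPUT_TRUNCATED
  );
}

// Hypothetical telemetry bucket: the two failures now map to different signals.
function failureKind(
  status: CompressionStatus,
): 'prompt-quality' | 'capacity' | 'none' {
  switch (status) {
    case CompressionStatus.COMPRESSION_FAILED_EMPTY_SUMMARY:
      return 'prompt-quality';
    case CompressionStatus.COMPRESSION_FAILED_OUTPUT_TRUNCATED:
      return 'capacity';
    default:
      return 'none';
  }
}
```

The point of the split is that the breaker sees one family of failures while dashboards see two.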

wenshao · 2026-05-18T03:56:32Z

+      );
+      const shouldForceFromHard = effectiveTokens >= hard;
+      if (shouldForceFromHard) {
+        this.consecutiveFailures = 0;


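The actual semantics of hard-rescue and the breaker counter need to be spelled out: the counter is cleared to 0 here before calling tryCompress(force=true), and the force=true path then skips this.consecutiveFailures += 1 in the failure branch (the if (!force) guard on the chatCompressionService.ts side).

Combined effect: when hard-rescue fails repeatedly, consecutiveFailures never accumulates and the breaker has no effect at all on this path. Only reactive overflow backstops it, and reactive overflow itself adds just 1 per failure (that code is also force=true, so it explicitly does self.consecutiveFailures += 1 as compensation).

This is a reasonable fail-open design (hard has already predicted an overflow, so having the breaker give way makes the most sense), but the literal meaning of the consecutiveFailures field does not match its actual semantics: it is really "the number of consecutive non-force, non-hard-rescue failures", not all failures as the name suggests.

Suggest adding 1-2 lines of comment to the GeminiChat.consecutiveFailures JSDoc (near geminiChat.ts:459) or at the reset here, stating explicitly:

// Hard-rescue is a fail-open exception to the breaker: we reset the counter
// because the runtime decided overflow is imminent regardless of recent
// failure history. Combined with `if (!force)` in the service's failure
// branch, hard-rescue failures never accumulate — reactive overflow is the
// real safety net for this path (it explicitly bumps the counter by 1).
this.consecutiveFailures = 0;

That way, anyone later debugging "why does hard keep firing while the counter is 0" will not be confused.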

✅ Fixed in 681b705 — expanded the consecutiveFailures field JSDoc on GeminiChat to spell out the real semantics: it tracks "non-force, non-hard-rescue consecutive failures". Listed each path's interaction with the counter (auto +1, manual /compress skipped, hard-rescue resets BEFORE force=true), and called out reactive overflow as the actual safety net (it explicitly bumps the counter by +1). Future debug-time confusion about "why is hard-rescue firing but counter is 0" now has a one-line answer in the field doc.
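The per-path counter semantics listed in this reply can be modelled in a few lines. This is a toy model as of this commit, not the GeminiChat code: the `BreakerModel` class and `CompactionPath` names are hypothetical, `MAX_CONSECUTIVE_FAILURES = 3` is assumed from the PR's "3-strike" wording, and resetting on success is an assumption.

```typescript
// Assumed from the PR summary's "3-strike circuit breaker" wording.
const MAX_CONSECUTIVE_FAILURES = 3;

type CompactionPath = 'auto' | 'manual' | 'hard-rescue' | 'reactive';

class BreakerModel {
  consecutiveFailures = 0;

  // Applies the outcome of one compaction attempt on the given path.
  record(path: CompactionPath, succeeded: boolean): void {
    if (path === 'hard-rescue') {
      // The reset happens BEFORE the forced attempt, whatever its outcome.
      this.consecutiveFailures = 0;
    }
    if (succeeded) {
      // Assumption: any successful compaction clears the counter.
      this.consecutiveFailures = 0;
      return;
    }
    if (path === 'auto' || path === 'reactive') {
      // auto: regular increment; reactive: explicit +1 despite force=true.
      this.consecutiveFailures += 1;
    }
    // manual (/compress) and hard-rescue failures are force=true: no increment.
  }

  isTripped(): boolean {
    return this.consecutiveFailures >= MAX_CONSECUTIVE_FAILURES;
  }
}
```

Read this way, the counter is a health signal for the regular path only, which is why a hard-rescue that keeps firing can coexist with a counter of 0.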

wenshao · 2026-05-18T03:56:32Z

+   * force=true call resets it.
   */
-  hasFailedCompressionAttempt: boolean;
+  consecutiveFailures: number;


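The SDK breaking change needs to go into the release notes: hasFailedCompressionAttempt: boolean → consecutiveFailures: number on CompressOptions changes both the field name and its type. CompressOptions is indirectly exposed on the public surface of @qwen-code/qwen-code-core through ChatCompressionService, so downstream SDK code that calls service.compress({ ..., hasFailedCompressionAttempt: true }) directly will get a TS compile error, and the semantics have changed as well (boolean indicator vs cumulative count; the default 0 is not equivalent to false in some edge cases).

The "Breaking changes / migration notes" section of the PR description lists the removal of contextPercentageThreshold but does not mention this change. Suggest adding a line (and syncing it to the release notes / CHANGELOG):

CompressOptions.hasFailedCompressionAttempt: boolean is renamed to consecutiveFailures: number. SDK consumers need to switch from "pass true to indicate a failure has occurred" to "pass the current cumulative failure count (normally maintained by GeminiChat)". Semantic change: true formerly meant "permanently disable auto"; the new equivalent is >= MAX_CONSECUTIVE_FAILURES.

I audited the usage surface of this change: within the repo, GeminiChat.tryCompress is the only caller passing this field, so the internal migration risk is low; but the core package's d.ts is published, so externally it is still breaking.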

✅ Fixed in 681b705 — added a dedicated SDK-Breaking-change subsection to the design doc (docs/design/auto-compaction-threshold-redesign.md) covering the CompressOptions.hasFailedCompressionAttempt: boolean → consecutiveFailures: number rename with a side-by-side semantics table and migration guide (true → MAX_CONSECUTIVE_FAILURES, false → 0). The PR description's release-notes block is sourced from the design doc, so this propagates.
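For downstream callers, the mapping stated in this reply (true → MAX_CONSECUTIVE_FAILURES, false → 0) could be wrapped in a small adapter. The interface and function names below are hypothetical, and the value 3 is assumed from the PR's "3-strike" wording; check the constant exported by the core package before relying on it.

```typescript
// Assumed from the PR's "3-strike" wording; not read from the real package.
const MAX_CONSECUTIVE_FAILURES = 3;

// Hypothetical shapes holding only the field that changed.
interface LegacyCompressOptions {
  hasFailedCompressionAttempt: boolean;
}

interface MigratedCompressOptions {
  consecutiveFailures: number;
}

// true used to mean "auto-compaction is disabled", which now corresponds to a
// counter at the breaker limit; false corresponds to a clean counter.
function migrateCompressOptions(
  legacy: LegacyCompressOptions,
): MigratedCompressOptions {
  return {
    consecutiveFailures: legacy.hasFailedCompressionAttempt
      ? MAX_CONSECUTIVE_FAILURES
      : 0,
  };
}
```

Callers that track failures themselves would pass their running count instead of going through a boolean at all.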

- R5.1: tighten /context tier comment + TODO. The rawOverhead-based fix doesn't cover `--continue` restores with many history messages (since rawOverhead excludes messagesTokens). UI may still show 'safe' for one render until the first send. Documented inline and added a TODO to plumb chat history into collectContextData for same-source-of-truth as the cheap-gate.
- R5.2a: add TODO(finish_reason) at the truncation guard. The `>= cap` heuristic false-positives on legitimate at-cap summaries; the proper signal is finish_reason which runSideQuery doesn't surface today.
- R5.2b: split telemetry — new CompressionStatus.COMPRESSION_FAILED_OUTPUT_TRUNCATED enum value. Distinct from EMPTY_SUMMARY so logs/telemetry can tell prompt-quality failures (tune prompt / splitter) from capacity failures (raise cap / shrink splitter input). isCompressionFailureStatus() treats both as failures so the breaker behavior is unchanged.
- R5.3: expand consecutiveFailures JSDoc to clarify it tracks "non-force, non-hard-rescue consecutive failures" — hard-rescue resets the counter and force=true skips increments, so the counter is the "regular path" health signal only; reactive overflow is the real safety net for the force-only paths.
- R5.4: document the CompressOptions field rename (hasFailedCompressionAttempt: boolean → consecutiveFailures: number) as an SDK breaking change in the design doc with migration guide.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

LaZzyMan · 2026-05-18T04:08:26Z

Review batch 5 — commit `681b70501`

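Thanks for the high-quality review. The "High quality; ready to merge" verdict on batch 4 is a great encouragement. All 4 findings in batch 5 have been addressed (none declined).

#	Outcome	One-line
R5.1 `/context` tier misses messagesTokens	✅ Partial	chose path (b): tightened comment + TODO; path (a) requires changing the Config interface to expose the chat instance, which is a larger change
R5.2a `>=` heuristic should be based on finish_reason	✅ Fixed	added `TODO(finish_reason)` to anchor the follow-up
R5.2b truncation vs empty summary blurs telemetry	✅ Fixed	added `CompressionStatus.COMPRESSION_FAILED_OUTPUT_TRUNCATED` enum value
R5.3 hard-rescue counter semantics inconsistent	✅ Fixed	expanded `consecutiveFailures` JSDoc to list the increment/reset semantics of every path
R5.4 SDK breaking change not listed	✅ Fixed	design doc Breaking-change section now covers `hasFailedCompressionAttempt` → `consecutiveFailures`, with migration guidance

Verification

typecheck clean across 4 workspaces
core tests 172/172 pass (chatCompressionService.test.ts · geminiChat.test.ts · tokenEstimation.test.ts)
cli tests 15/15 pass (contextCommand.test.ts · ContextUsage.test.tsx · tipRegistry.test.ts)
The new COMPRESSION_FAILED_OUTPUT_TRUNCATED enum value is appended after NOOP (does not shift existing enum values; binary compatible)

Response to the reviewer's risk audit

On the ⚠️ items in the reviewer's overall assessment:

Finding #2 (three false-positive truncations tripping the breaker): R5.2b's enum split lets operators distinguish "prompt quality" from "capacity" in telemetry; once the finish_reason path lands, R5.2a's TODO can clear out the false-positives in one pass
Finding #1 (/context UI vs cheap-gate sharing a source): the comment and TODO now state that only overhead-heavy scenarios are covered; message-heavy sessions are corrected at the first-send cheap-gate (≤1 render lag)
Finding #4 (SDK compatibility): the design doc includes a migration guide; within the repo GeminiChat.tryCompress is the only internal caller, so the external impact is small

Plan doc size

The reviewer noted that docs/plans/2026-05-14-auto-compaction-threshold-redesign.md is large at 1752 lines and does not ship at runtime. Agreed that it is a one-off artifact, kept only as the implementation record for this PR. If you think it should move to an archive location such as .github/closed-PRs/, that can be raised as a separate follow-up.

All 5 inline comments have been answered individually in their original threads, pointing at 681b70501.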

🤖 Triaged via the review-response skill.

wenshao

Test coverage gaps (no specific diff line to anchor):

[Suggestion] compress() precomputedEffectiveTokens path (path 1 — skip estimation) has no unit test at the service level. If future refactoring breaks the priority logic, the regression is invisible at this layer. (chatCompressionService.test.ts)
[Suggestion] Hard-tier rescue tests verify force + pendingUserMessage are passed but never assert precomputedEffectiveTokens is forwarded. The estimation-reuse optimization (a core perf win of this PR) is unprotected against regression. (geminiChat.test.ts)
[Suggestion] COMPRESSION_FAILED_OUTPUT_TRUNCATED is included in isCompressionFailureStatus() but never exercised as a failure variant in any geminiChat-level circuit breaker test. Only INFLATED and EMPTY_SUMMARY are tested. (geminiChat.test.ts)
[Suggestion] The originalTokenCount === 0 + pendingUserMessage path (which falls through to the first-send estimation fallback) has no dedicated test case. The existing "estimated tokens exceed threshold" test uses originalTokenCount = 160_000 (non-zero), so the zero-path branch is uncovered at the service level. (chatCompressionService.test.ts)

— DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao · 2026-05-18T06:23:44Z

+      consecutiveFailures >= MAX_CONSECUTIVE_FAILURES &&
+      !force &&
+      !bypassTokenThreshold
    ) {


[Critical] Circuit breaker NOOP is completely silent. When consecutiveFailures >= MAX_CONSECUTIVE_FAILURES causes compress() to return CompressionStatus.NOOP, there is zero log output. The only observable signal that auto-compaction has stopped is the absence of compression — a nightmare to debug at 3 AM when the only visible symptom is "API context overflow."

Suggested change

-    ) {
+    ) {
+      this.config
+        .getDebugLogger()
+        .warn(
+          `Auto-compaction breaker tripped: consecutiveFailures=${opts.consecutiveFailures} >= MAX=${MAX_CONSECUTIVE_FAILURES}. Use /compress to reset.`,
+        );
+      return { compressionStatus: CompressionStatus.NOOP };

— DeepSeek/deepseek-v4-pro via Qwen Code /review

✅ Fixed in 1030557 — added a warn-level log on the breaker NOOP path quoting consecutiveFailures, MAX, and recovery instruction (/compress). Was: silent. Now: shows up in logs so absence-of-compaction is no longer invisible.
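The breaker check plus its new log line could look roughly like the sketch below. The condition and message text follow the diff and suggestion quoted above; the `BreakerInputs` shape, the `breakerBlocks` name, and the injected `warn` callback are assumptions made to keep the example self-contained.

```typescript
// Assumed from the PR's "3-strike" wording.
const MAX_CONSECUTIVE_FAILURES = 3;

// Hypothetical subset of the options that feed the breaker condition.
interface BreakerInputs {
  consecutiveFailures: number;
  force: boolean;
  bypassTokenThreshold: boolean;
}

// Returns true when auto-compaction should be skipped, logging why so the
// skip is visible instead of silent.
function breakerBlocks(
  opts: BreakerInputs,
  warn: (message: string) => void,
): boolean {
  const tripped =
    opts.consecutiveFailures >= MAX_CONSECUTIVE_FAILURES &&
    !opts.force &&
    !opts.bypassTokenThreshold;
  if (tripped) {
    warn(
      `Auto-compaction breaker tripped: consecutiveFailures=${opts.consecutiveFailures} >= MAX=${MAX_CONSECUTIVE_FAILURES}. Use /compress to reset.`,
    );
  }
  return tripped;
}
```

Forced and heap-pressure calls pass straight through, so the log only appears when a regular auto-compaction attempt is actually suppressed.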

wenshao · 2026-05-18T06:23:44Z

+    // trying after MAX_CONSECUTIVE_FAILURES strikes rather than burn an API
+    // call on every send. Reactive overflow still catches the catastrophic
+    // case. See docs/design/auto-compaction-threshold-redesign.md risk #2.
+    //


[Suggestion] The truncation guard compressionOutputTokenCount >= COMPACT_MAX_OUTPUT_TOKENS (>=) can false-positive on a legitimate summary that happens to land exactly at 20K tokens. Each false-positive counts as a compression failure, incrementing the circuit breaker — 3 false-positives permanently disable auto-compaction. The code's own TODO acknowledges the correct approach is finish_reason === 'length' / MAX_TOKENS. Until runSideQuery surfaces finish_reason, consider using > instead of >= to shrink the false-positive window (a model producing exactly 20K is far less likely than one exceeding 20K).

— DeepSeek/deepseek-v4-pro via Qwen Code /review

✅ Fixed in 1030557 — >= changed to >. With R5.2b's new COMPRESSION_FAILED_OUTPUT_TRUNCATED status now counting toward the breaker, the false-positive cost (3 strikes → permanent disable) was too high. The proper finish_reason path is still TODO'd; > will essentially never fire today (API enforces hard cap) but is the right semantics once finish_reason lands.

wenshao · 2026-05-18T06:23:44Z

+        this.lastPromptTokenCount,
+        imageTokenEstimate,
+      );
+      const shouldForceFromHard = effectiveTokens >= hard;


[Suggestion] Hard-tier rescue resets consecutiveFailures = 0 before calling tryCompress(force=true), and force=true causes tryCompress's failure branch to skip the consecutiveFailures += 1 increment. This means repeated hard-rescue compression failures never trip the breaker — each failing send burns one doomed compression API call with no limit. The opposite extreme of the proactive path's breaker; consider either saving/restoring the pre-call counter value on failure, or adding a separate hard-rescue cooldown counter (e.g., skip hard rescue for N turns after M consecutive hard-rescue failures).

— DeepSeek/deepseek-v4-pro via Qwen Code /review

✅ Fixed in 1030557 — added a dedicated hardRescueFailureCount field on GeminiChat, bounded by MAX_CONSECUTIVE_FAILURES. After that many consecutive hard-rescue failures, the rescue stops firing and reactive overflow takes over. Resets on any compression success. Documented in JSDoc + observable via the new warn-log when the rescue fails.
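A minimal model of the bounded hard-rescue described in this reply. The `HardRescueGate` class and its method names are hypothetical, and `MAX_CONSECUTIVE_FAILURES = 3` is assumed; the behaviour (stop rescuing after that many consecutive hard-rescue failures, reset on any compression success) is taken from the reply.

```typescript
// Assumed from the PR's "3-strike" wording.
const MAX_CONSECUTIVE_FAILURES = 3;

class HardRescueGate {
  hardRescueFailureCount = 0;

  // The rescue fires only while usage is at or past the hard tier AND the
  // rescue itself has not failed too many times in a row.
  shouldRescue(effectiveTokens: number, hard: number): boolean {
    return (
      effectiveTokens >= hard &&
      this.hardRescueFailureCount < MAX_CONSECUTIVE_FAILURES
    );
  }

  // Any successful compression resets the bound; only hard-rescue failures
  // count against it.
  onCompressionResult(succeeded: boolean, wasHardRescue: boolean): void {
    if (succeeded) {
      this.hardRescueFailureCount = 0;
      return;
    }
    if (wasHardRescue) {
      this.hardRescueFailureCount += 1;
    }
  }
}
```

Once the gate closes, the request is still sent and reactive overflow handles an actual rejection, so the cost of a persistently failing rescue is bounded.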

wenshao · 2026-05-18T06:23:44Z


+export type ContextTier = 'safe' | 'warn' | 'auto' | 'hard';
+
+export interface ContextThresholds {


[Suggestion] ContextThresholds (CLI package) has the exact same shape as CompactionThresholds (core package) — four identically-typed fields (effectiveWindow, warn, auto, hard). Maintaining a duplicate type definition across packages creates a silent drift risk: if core's shape changes, the CLI type becomes stale with no compiler error (because the runtime value still flows through untyped boundaries). Consider re-exporting CompactionThresholds from core's public API and referencing it here, or using Pick<CompactionThresholds, ...> to make the relationship explicit.

— DeepSeek/deepseek-v4-pro via Qwen Code /review

✅ Fixed in 1030557 — ContextThresholds reduced to a type alias of core's CompactionThresholds (export type ContextThresholds = CompactionThresholds;). The four fields now have a single source of truth; the previously-silent drift risk is gone.
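The alias from this reply, together with a tier classifier that consumes it, can be sketched as below. The interface is redeclared locally so the example stands alone (in the repo it would be imported from core); `classifyTier` is a hypothetical helper, and the inclusive `>=` boundaries are assumed from the `effectiveTokens >= hard` comparison quoted elsewhere in this thread.

```typescript
// Redeclared here for self-containment; in the repo this comes from core.
interface CompactionThresholds {
  effectiveWindow: number;
  warn: number;
  auto: number;
  hard: number;
}

// Single source of truth: the CLI name is just an alias of the core type.
type ContextThresholds = CompactionThresholds;

type ContextTier = 'safe' | 'warn' | 'auto' | 'hard';

// Hypothetical classifier: checks the highest tier first so a value past
// several thresholds reports the most severe one.
function classifyTier(tokens: number, t: ContextThresholds): ContextTier {
  if (tokens >= t.hard) return 'hard';
  if (tokens >= t.auto) return 'auto';
  if (tokens >= t.warn) return 'warn';
  return 'safe';
}
```

With an alias, a field added to or renamed in the core type surfaces as a compile error in the CLI instead of drifting silently.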

wenshao · 2026-05-18T06:23:45Z

-                    self.hasFailedCompressionAttempt = true;
+                    // Reactive compression is force=true so tryCompress's
+                    // failure branch did not increment the counter. Count it
+                    // explicitly as one strike — a single transient error


[Suggestion] The reactive overflow path (self.consecutiveFailures += 1 at L984) shares the same consecutiveFailures counter as the proactive cheap-gate path. This means reactive compression failures (transient model issues producing poor summaries) disable proactive auto-compaction. These two paths have different failure semantics — reactive is a last-resort recovery, proactive is a performance optimization — and sharing a breaker undermines fault isolation. Consider a separate reactiveConsecutiveFailures counter, or at minimum document the coupling in the field's JSDoc.

— DeepSeek/deepseek-v4-pro via Qwen Code /review

❌ Declined — coupling is documented in R5.3 JSDoc and the related R1.2 fix already softened reactive's impact: reactive failure increments by +1, not =MAX, so it takes MAX_CONSECUTIVE_FAILURES reactive failures to disable proactive. That's the correct outcome for a chat where reactive consistently fails — splitting the counter would just delay the bound without changing the steady-state behavior. R6.7 (this batch) additionally fixed reactive's missing increment in the catch block so the bound is now uniformly enforced across both reactive failure modes.

wenshao

Additional observation (not tied to a diff line): docs/users/configuration/settings.md still documents model.chatCompression.contextPercentageThreshold as a live setting (including "Use 0 to disable compression entirely"), but this field has been removed from ChatCompressionSettings and the value is silently ignored. The settings docs should be updated to reflect the removal.

— qwen-latest-series-invite-beta-v28 via Qwen Code /review

wenshao · 2026-05-18T15:10:58Z

+        imageTokenEstimate,
+      );
+      const shouldForceFromHard = effectiveTokens >= hard;
+      if (shouldForceFromHard) {


[Critical] Hard-tier rescue fires with zero logging

When shouldForceFromHard is true, the code silently resets consecutiveFailures = 0 and passes force=true to tryCompress. There is no debugLogger.info or debugLogger.warn at this trigger point.

The JSDoc at L469 says "If you're debugging 'why is hard-rescue firing but the counter is 0', that's by design" — but there's nothing in the log to confirm that IS what's happening. An oncall engineer seeing repeated forced compaction with counter stuck at 0 cannot distinguish "hard-tier rescue by design" from "something is resetting the counter in a loop" without reading source code.

Suggested change

-      if (shouldForceFromHard) {
+      if (shouldForceFromHard) {
+        debugLogger.info(
+          `[compaction] hard-tier rescue: effectiveTokens=${effectiveTokens} >= hard=${hard}, forcing compaction (consecutiveFailures ${this.consecutiveFailures} → 0)`,
+        );
+        this.consecutiveFailures = 0;
+      }

— qwen-latest-series-invite-beta-v28 via Qwen Code /review

✅ Fixed in 1030557 — added a debugLogger.info call at the rescue trigger point capturing effectiveTokens, the hard threshold, and the counter transition (consecutiveFailures N → 0). Now the field JSDoc's claim that "hard-rescue resets the counter by design" has a matching observable signal.

wenshao · 2026-05-18T15:10:58Z

+                    // the intended recovery path. (review #4168 R1.2)
+                    self.consecutiveFailures += 1;
                  }
                } catch (compressionError) {


[Critical] Reactive overflow catch block doesn't increment consecutiveFailures

The status-based failure path at L984 correctly increments self.consecutiveFailures += 1, but thrown exceptions (network errors, model 500s, timeouts) in this catch block bypass the increment entirely. If reactive compression consistently throws rather than returning a failure status, the circuit breaker never trips — the system burns a failed reactive compression API call on every overflow indefinitely.

Suggested change

-                } catch (compressionError) {
+                } catch (compressionError) {
+                  if (
+                    params.config?.abortSignal?.aborted ||
+                    isAbortError(compressionError)
+                  ) {
+                    throw compressionError;
+                  }
+                  debugLogger.warn('Reactive compression failed.', compressionError);
+                  self.consecutiveFailures += 1;

— qwen-latest-series-invite-beta-v28 via Qwen Code /review

✅ Fixed in 1030557 — self.consecutiveFailures += 1 added inside the catch block (was: status-based path only). Thrown exceptions (network errors, model 5xx, timeouts) now also count toward the breaker, matching the status-based pattern.
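The uniform counting across both reactive failure modes can be illustrated with a simplified, synchronous sketch (the real path is asynchronous and lives inside sendMessageStream). `ReactiveState`, `attemptReactiveCompression`, and this `isAbortError` implementation are hypothetical stand-ins.

```typescript
interface ReactiveState {
  consecutiveFailures: number;
}

// Hypothetical abort check; the repo's isAbortError may differ.
function isAbortError(error: unknown): boolean {
  return error instanceof Error && error.name === 'AbortError';
}

// compress() returns false for a status-based failure and throws for
// network errors, 5xx responses, timeouts, or user cancellation.
function attemptReactiveCompression(
  state: ReactiveState,
  compress: () => boolean,
  warn: (message: string) => void,
): void {
  try {
    const succeeded = compress();
    if (!succeeded) {
      // Status-based failure: one strike.
      state.consecutiveFailures += 1;
    }
  } catch (error) {
    if (isAbortError(error)) {
      // Cancellation is not a compression failure; propagate it untouched.
      throw error;
    }
    warn('Reactive compression failed.');
    // Thrown failure: also one strike, matching the status-based path.
    state.consecutiveFailures += 1;
  }
}
```

Rethrowing aborts before incrementing keeps a user pressing cancel from counting toward the breaker.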

wenshao · 2026-05-18T15:10:58Z

    id: 'context-critical',
    content:
-      'Context is almost full! Run /compress now or start /new to continue.',
+      'Context near hard limit — auto-compact will force on next send. Consider /clear if you want to start fresh.',


[Suggestion] Tip text says "will force on next send" but hard-tier rescue already ran on current send

The isRelevant check fires when lastPromptTokenCount >= thresholds.hard — meaning the hard-tier rescue in sendMessageStream has already forced compaction on the send that just produced this response. Users are told compaction "will force on next send" when it already forced on this send.

Suggested change

- 'Context near hard limit — auto-compact will force on next send. Consider /clear if you want to start fresh.',
+ 'Context near hard limit — auto-compact was forced on this turn. Consider /clear if context remains tight.',

— qwen-latest-series-invite-beta-v28 via Qwen Code /review

✅ Fixed in 1030557 — tip text now reads "Context near hard limit — auto-compact was forced on this turn. Consider /clear if context remains tight." Tense matches the actual timing (rescue already ran by the time the tip renders).

wenshao · 2026-05-18T15:10:59Z

+    ) {
+      // eslint-disable-next-line no-console
+      console.warn(
+        '[qwen-code] chatCompression.contextPercentageThreshold has been removed ' +


[Suggestion] contextPercentageThreshold: 0 disable mechanism removed with no replacement

Users who set contextPercentageThreshold: 0 to disable auto-compaction (e.g., for debugging compression issues, or for sessions that must retain full history) now have no migration path. The one-time console.warn at startup is easy to miss, and doesn't inform users that compaction can no longer be disabled at all.

Consider either: (1) adding a chatCompression.enabled boolean as a replacement escape hatch, (2) adding an env var like QWEN_AUTO_COMPACT=false, or (3) at minimum, updating the warning to explicitly state that auto-compaction cannot currently be disabled.

— qwen-latest-series-invite-beta-v28 via Qwen Code /review

✅ Partially fixed in 1030557 — went with reviewer's option (3): the deprecation warning now explicitly states "auto-compaction cannot currently be disabled" and points users to /clear / open an issue if they need a replacement. Did not add a new enabled flag or env var — those would re-introduce the disable mechanism this PR removed. The choice trades one round of forced-on auto-compact for the cleaner threshold semantics; the warning makes that trade-off visible to affected users.

wenshao · 2026-05-18T15:10:59Z

+  label: string;
+  tokens: number;
+  isCurrent?: boolean;
+  hint?: string;


[Suggestion] Dead code: ThresholdRow declares hint?: string prop but no caller passes it

The hint prop is declared, destructured, and rendered, but none of the ThresholdRow invocations in CompactionThresholds ever pass a hint value. Remove the prop until a caller actually needs it.

Suggested change

- hint?: string;
+ const ThresholdRow: React.FC<{
+   label: string;
+   tokens: number;
+   isCurrent?: boolean;
+ }> = ({ label, tokens, isCurrent }) => {

— qwen-latest-series-invite-beta-v28 via Qwen Code /review

✅ Fixed in 1030557 — hint prop removed from ThresholdRow. The only caller that ever passed it was dropped in an earlier round (the "Effective window" hint was moved out to avoid wrapping). Now matches the actual usage.

wenshao · 2026-05-18T15:10:59Z

+  // single render — that resolves the moment any send happens.
+  //
+  // TODO: plumb the chat history into collectContextData and use
+  // estimatePromptTokens(history, undefined, 0, imageTokenEstimate) here


[Suggestion] TODO references a type-incorrect function call

The TODO proposes estimatePromptTokens(history, undefined, 0, imageTokenEstimate), but userMessage: Content is a non-optional parameter in estimatePromptTokens. A future developer following this TODO will hit a compile error.

Update the TODO to show a type-correct call, or note that userMessage needs to be made optional first.

— qwen-latest-series-invite-beta-v28 via Qwen Code /review

✅ Fixed in 1030557 — rewrote the TODO with a 3-step implementation sketch that calls out the prerequisite signature change (estimatePromptTokens.userMessage needs to be made optional first) and the plumbing path (add a chat?: GeminiChat parameter to collectContextData). A future developer following the TODO won't hit the compile error you flagged.

wenshao · 2026-05-18T15:10:59Z

 */
-const DEFAULT_COMPRESSION_THRESHOLD = 0.7;
+function currentTier(
+  tokens: number,


[Suggestion] hard === auto makes the 'auto' tier unreachable for small windows

computeThresholds collapses hard to auto for windows below ~82.5K tokens. Since this function checks >= hard first, when hard === auto the first branch always wins — the function returns 'hard' and the 'auto' tier is never displayed. Users with smaller-window models see the display jump from 'warn' directly to 'hard'.

Consider checking auto before hard when they're equal, or displaying 'auto/hard' when the thresholds are identical.

— qwen-latest-series-invite-beta-v28 via Qwen Code /review

✅ Fixed in 1030557 — currentTier() now only returns 'hard' when thresholds.hard > thresholds.auto. For small windows where computeThresholds collapses them, the function returns 'auto' instead, so the tier label is reachable for all windows.
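A minimal sketch of the guarded classification described in this reply. The `Tier` and `Thresholds` types and the function name `currentTierSketch` are illustrative assumptions, and the threshold numbers in the usage note are made up; the real `currentTier()` may differ in signature and detail.

```typescript
type Tier = 'safe' | 'warn' | 'auto' | 'hard';

// Hypothetical subset of the thresholds object returned by computeThresholds().
interface Thresholds {
  warn: number;
  auto: number;
  hard: number;
}

function currentTierSketch(tokens: number, t: Thresholds): Tier {
  // Only report 'hard' when it is a distinct tier above 'auto'.
  if (t.hard > t.auto && tokens >= t.hard) return 'hard';
  if (tokens >= t.auto) return 'auto';
  if (tokens >= t.warn) return 'warn';
  return 'safe';
}
```

For a collapsed window such as `{ warn: 20_000, auto: 25_000, hard: 25_000 }`, a usage of 26,000 tokens classifies as `'auto'` rather than `'hard'`, so the `'auto'` label stays reachable.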

wenshao · 2026-05-18T15:10:59Z

-        breakdown.autocompactBuffer,
-        contextWindowSize,
-      ),
+      `  Effective window:   ${formatNum(breakdown.thresholds.effectiveWindow)}  (window − 20K reserve)`,


[Suggestion] formatContextUsageText hardcodes English labels and a stale-prone magic constant

The text formatter embeds threshold labels in English ("Warn threshold", "Auto threshold", etc.) and includes a literal "(window − 20K reserve)" string. The interactive ContextUsage.tsx component uses t() for i18n. Additionally, "20K" is a hard-coded reference to SUMMARY_RESERVE — if that constant changes, this string silently goes stale.

Consider using t() for labels and deriving the reserve text from the constant.

— qwen-latest-series-invite-beta-v28 via Qwen Code /review

✅ Fixed in 1030557 — formatContextUsageText labels now go through t() (matching the interactive ContextUsage.tsx). "20K reserve" is now derived from SUMMARY_RESERVE (exported from core) via Math.round(SUMMARY_RESERVE / 1000) + "K", so it stays in sync if the constant ever changes.
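The derivation quoted in this reply can be sketched as follows. The helper name `reserveLabel` is hypothetical, and the constant is redeclared locally with the 20K value cited in the thread rather than imported from core.

```typescript
// Value taken from the "20K reserve" wording in the thread; in the PR the
// constant is exported from core instead of being redeclared here.
const SUMMARY_RESERVE = 20_000;

// Formats a token reserve as a rounded "NK" label.
function reserveLabel(reserve: number): string {
  return Math.round(reserve / 1000) + 'K';
}

const effectiveWindowNote = `(window − ${reserveLabel(SUMMARY_RESERVE)} reserve)`;
```

Because the label is computed from the constant, changing `SUMMARY_RESERVE` updates the displayed text without a second edit.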

wenshao · 2026-05-18T15:10:59Z

+      const effectiveTokens =
+        opts.precomputedEffectiveTokens !== undefined
+          ? opts.precomputedEffectiveTokens
+          : pendingUserMessage


[Suggestion] Cheap-gate path #2 (pendingUserMessage without precomputedEffectiveTokens) is dead code

The cheap-gate has three estimation paths, but no current caller ever passes pendingUserMessage without also passing precomputedEffectiveTokens: sendMessageStream passes both, tryCompressChat (manual /compress) passes neither, and heap-pressure bypass passes neither. This path is the only one that calls chat.getHistory(true) inside the cheap-gate — an expensive clone that is never actually reached.

Removing this branch (or adding an assertion) would simplify the gate and eliminate the latent clone risk if a future caller accidentally hits it.

— qwen-latest-series-invite-beta-v28 via Qwen Code /review

✅ Fixed in 1030557 — removed the dead pendingUserMessage-only branch. The cheap-gate now uses opts.precomputedEffectiveTokens ?? originalTokenCount. Production callers (sendMessageStream) always pass precomputed; direct service callers fall back to originalTokenCount without cloning history. Eliminates the latent double-clone risk.
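The simplified gate input reduces to a single nullish-coalescing expression. In this sketch the `GateOptions` interface and the function name are assumptions; only the `precomputedEffectiveTokens ?? originalTokenCount` expression comes from the reply.

```typescript
// Hypothetical subset of the options object passed to the compression service.
interface GateOptions {
  precomputedEffectiveTokens?: number;
}

// Returns the token count the cheap-gate compares against its threshold.
function effectiveTokensForGate(
  opts: GateOptions,
  originalTokenCount: number,
): number {
  // `??` falls back only on null/undefined, so a precomputed 0 is respected.
  return opts.precomputedEffectiveTokens ?? originalTokenCount;
}
```

Neither branch touches chat history, which is what removes the clone risk the reviewer flagged.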

Adds a defensive guard in ChatCompressionService.compress() that detects when the side-query summary hit COMPACT_MAX_OUTPUT_TOKENS (20K). In that case the summary is likely truncated mid-content, so we drop it and return NOOP rather than persist a half-summary. The next send retries; reactive overflow still catches the catastrophic case where the API rejects the next request as too large.

Documented in the design doc as risk #2; the bot reviewer on PR #4168 correctly pushed for it to land alongside the threshold redesign rather than as a follow-up, since the new 20K cap is what makes truncation likely in the first place.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The Task 11 redesign updated the non-interactive text formatter (formatContextUsageText) but left ContextUsage.tsx (the interactive React component that real /context users see) unchanged. As a result the TUI still showed the old single "Autocompact buffer" line and none of the new warn/auto/hard ladder.

Adds a "Compaction thresholds" section after the per-category breakdown:
- Effective window
- Warn / Auto / Hard threshold rows with a ▶ marker on the row the current usage has crossed
- Current tier label coloured by severity (safe→green, warn/auto→yellow, hard→red)

The existing progress bar legend (Used / Free / Autocompact buffer) is preserved because it's tied to the three-segment progress bar visualisation; the new section adds the absolute numbers + tier badge on top of that.

Caught by the tmux e2e test (PR #4168 ci-monitor follow-up). Pre-fix, the 'Compaction thresholds' assertion failed because the section was missing entirely from the TUI; post-fix, the new section renders correctly for fresh and live sessions on 1M / 200K / 128K windows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Behavior fixes:
- MAX_TOKENS truncation guard now returns COMPRESSION_FAILED_EMPTY_SUMMARY instead of NOOP so the consecutive-failure breaker actually trips after repeated max-length summaries (R1.1).
- Reactive overflow failure increments consecutiveFailures by 1 instead of latching to MAX in one shot, so a transient network blip doesn't permanently disable auto-compaction. The hard-tier rescue resets the counter, which remains the designated recovery path (R1.2).
- /context current-tier classification uses rawOverhead (system + tools + memory + skills) as the tier input when API data is not yet available, rather than 0, so large inherited contexts no longer silently show 'safe' (R2.2).

Performance:
- sendMessageStream computes effectiveTokens ONCE and passes it through TryCompressOptions.precomputedEffectiveTokens, so the cheap-gate inside service.compress doesn't redo the estimation. Also fixes the imageTokenEstimate inconsistency between the rescue and cheap-gate paths (R1.3 + R1.4).
- Steady-state path (lastPromptTokenCount > 0) skips the costly getHistory(true) clone, since estimatePromptTokens only needs the user message in that branch.

Code hygiene:
- BYTES_PER_TOKEN → CHARS_PER_TOKEN (inputs are char counts, not byte counts; CJK text would mislead under the old name) (R3.1).
- Drop dead getContextUsagePercent helper + index re-export: no callers in source after the threshold rewire (R1.5).
- Add a comment on estimatePromptTokens' first-send fallback documenting the ~15-20K under-estimate (system prompt + tools + skills) and that reactive overflow is the safety net (R3.3).

Tests:
- New CLI ContextUsage.test.tsx exercises the React renderer for the three-tier section: section presence, ▶ marker placement per tier, current-tier label coloring (R1.6).
- New chatCompressionService.test.ts case pins that a stale contextPercentageThreshold: 0 value in user settings no longer short-circuits compaction (R2.1).
- New tokenEstimation.test.ts case covers functionResponse (distinct nested-parts branch from functionCall) (R3.5).
- New geminiChat.test.ts integration test exercises the real ChatCompressionService (not a mock) for the first-send-after-inherited-history scenario where lastPromptTokenCount=0 and only the full-history estimate can cross the auto threshold (R3.4).

Declined: R3.2 (change `>=` to `>` on the MAX_TOKENS guard). The current operator catches the at-cap case as suspicious, which is intentional: landing exactly at the output cap is far more likely truncation than clean stop given p99.99 ≈ 17K. With R1.1 in place, persistent truncations trip the breaker after MAX_CONSECUTIVE_FAILURES so the worst case is bounded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- R5.1: tighten /context tier comment + TODO. The rawOverhead-based fix doesn't cover `--continue` restores with many history messages (since rawOverhead excludes messagesTokens). UI may still show 'safe' for one render until the first send. Documented inline and added a TODO to plumb chat history into collectContextData for same-source-of-truth as the cheap-gate.
- R5.2a: add TODO(finish_reason) at the truncation guard. The `>= cap` heuristic false-positives on legitimate at-cap summaries; the proper signal is finish_reason, which runSideQuery doesn't surface today.
- R5.2b: split telemetry with a new CompressionStatus.COMPRESSION_FAILED_OUTPUT_TRUNCATED enum value. Distinct from EMPTY_SUMMARY so logs/telemetry can tell prompt-quality failures (tune prompt / splitter) from capacity failures (raise cap / shrink splitter input). isCompressionFailureStatus() treats both as failures so the breaker behavior is unchanged.
- R5.3: expand consecutiveFailures JSDoc to clarify it tracks "non-force, non-hard-rescue consecutive failures": hard-rescue resets the counter and force=true skips increments, so the counter is the "regular path" health signal only; reactive overflow is the real safety net for the force-only paths.
- R5.4: document the CompressOptions field rename (hasFailedCompressionAttempt: boolean → consecutiveFailures: number) as an SDK breaking change in the design doc with migration guide.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
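The status split in R5.2b can be pictured with a small sketch. The string-union type below stands in for the real `CompressionStatus` enum and lists only the four members named in this thread; the `nextFailureCount` helper, the reset on `COMPRESSED`, and the unchanged counter on `NOOP` are assumptions for illustration.

```typescript
// Stand-in for the real enum; only members named in this thread are listed.
type CompressionStatusSketch =
  | 'COMPRESSED'
  | 'NOOP'
  | 'COMPRESSION_FAILED_EMPTY_SUMMARY'
  | 'COMPRESSION_FAILED_OUTPUT_TRUNCATED';

// Both failure flavours count toward the breaker; telemetry can still tell them apart.
function isCompressionFailureStatusSketch(s: CompressionStatusSketch): boolean {
  return (
    s === 'COMPRESSION_FAILED_EMPTY_SUMMARY' ||
    s === 'COMPRESSION_FAILED_OUTPUT_TRUNCATED'
  );
}

// Hypothetical helper: how a regular-path counter could react to each status.
function nextFailureCount(current: number, s: CompressionStatusSketch): number {
  if (s === 'COMPRESSED') return 0; // assumed: success resets
  return isCompressionFailureStatusSketch(s) ? current + 1 : current;
}
```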

Observability (R6.1 + R6.6):
- chatCompressionService.compress() now warn-logs when the breaker trips the NOOP path; previously the only signal was the absence of compaction
- sendMessageStream info-logs hard-tier rescue trigger + warn-logs on rescue failure so debugging matches the consecutiveFailures JSDoc

Counter accounting (R6.3 + R6.7):
- New `hardRescueFailureCount` field on GeminiChat bounds hard-rescue retries to MAX_CONSECUTIVE_FAILURES. Without it a chat whose history can't shrink would burn an API call per send forever (force=true skipped the regular increment AND the rescue's pre-call reset wiped state). After MAX failures, hard rescue stops firing and reactive overflow takes over as the next defense layer. Reset on any compression success.
- Reactive overflow catch block now increments consecutiveFailures so thrown exceptions (network, 5xx, timeouts) also count toward the breaker; previously only status-based reactive failures incremented.

UI corrections (R6.8 + R6.9 + R6.12):
- context-critical tip: tense corrected from "will force on next send" to "was forced on this turn", since the rescue already ran by the time the tip renders
- Deprecation warning explicitly states auto-compaction can no longer be disabled (no replacement for `contextPercentageThreshold: 0`)
- currentTier() returns 'auto' (not 'hard') when hard collapses to auto on small windows; previously the 'auto' tier was unreachable for those sessions

Code hygiene (R6.2 / R6.4 / R6.10 / R6.11 / R6.13 / R6.14):
- Truncation guard `>=` → `>`: legitimate at-cap summaries no longer treated as truncation (was particularly costly because R5.2b made these count toward the breaker)
- ContextThresholds reduced to a type alias of core's CompactionThresholds to eliminate silent-drift risk
- Removed dead `hint` prop on ThresholdRow (no caller after R5 refactor)
- TODO at contextCommand.ts now shows a type-correct call sketch
- formatContextUsageText uses t() for labels; "20K" derived from SUMMARY_RESERVE constant (exported from core)
- cheap-gate dead branch removed: production callers always pass precomputedEffectiveTokens; direct service callers fall back to originalTokenCount instead of double-cloning history

Tests (R6.15):
- New: COMPRESSION_FAILED_OUTPUT_TRUNCATED counts toward the breaker
- New: precomputedEffectiveTokens path skips estimation work
- New: cheap-gate falls back to originalTokenCount when no precomputed
- Hard-rescue test now asserts precomputedEffectiveTokens is forwarded

Docs (R6.16):
- docs/users/configuration/settings.md table entry for `model.chatCompression.contextPercentageThreshold` updated to mark the field REMOVED with link to PR rationale

Declined: R6.5 (separate reactive/proactive counter). The R5.3 JSDoc already documents the coupling intentionally; R1.2 reduced reactive's weight to +1 (not =MAX), so it takes MAX_CONSECUTIVE_FAILURES reactive failures to disable proactive, which is the correct outcome for a chat where reactive consistently fails. A separate counter would add state without changing observable behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

R7.1 critical (scratchpad data-retention): with includeThoughts=false, the compression model emits its private <scratchpad> reasoning as plain text alongside <state_snapshot>, and the entire concatenation was being persisted as the chat's compressed memory, leaking sensitive tool output (API keys, paths, file fragments) into every subsequent turn. Extract just the <state_snapshot> envelope from the response; surface a no-match as COMPRESSION_FAILED_EMPTY_SUMMARY so the breaker reacts to prompt-format drift.

R7.2 / R7.3 critical (hard-rescue counter accounting): pessimistic increment pattern. The previous post-call accounting silently leaked two failure shapes:
- throw (provider 5xx / abort): post-handler unreachable, counter stuck → infinite re-fire on every send.
- NOOP (history too small to split): neither failure-status nor COMPRESSED branch matched → same infinite re-fire.

Increment hardRescueFailureCount BEFORE tryCompress(force=true); rely on the existing success-branch reset in tryCompress to refund the strike on COMPRESSED. Throws, NOOPs, and failure statuses all keep the strike uniformly.

R7.4 critical (constant coupling): lifted TOKEN_TO_CHAR_RATIO to the single declaration in compactionInputSlimming.ts; tokenEstimation.ts's CHARS_PER_TOKEN is now a re-export. Silent-drift risk between splitter sizing and gate sizing is gone.

R7.5: removed dead `pendingUserMessage` field from CompressOptions / TryCompressOptions, unused since R6.14 collapsed its consumer.

R7.6: breaker-NOOP path returns the caller's `originalTokenCount` rather than 0 so telemetry sees real session token counts on the trip event, not a misleading zero.

R7.7: log at warn level when hard-rescue is skipped due to budget exhaustion (hardRescueFailureCount >= MAX). Closes the "why isn't rescue firing" oncall blind spot.

R7.8: reverted R6.2's `>` back to `>=` on the truncation guard. With the API hard-capping output at COMPACT_MAX_OUTPUT_TOKENS, `>` could never fire, making the guard dead code that silently persisted truncated summaries. `>=` catches exact-at-cap (almost always truncated); the breaker bounds 3 strikes. Declined the reviewer's alternative `>= cap * 0.95` heuristic: it broadens false positives into the p99-realistic range (~19K) without addressing the root cause (finish_reason plumbing, still TODO'd).

R7.9: throttle the breaker warn log via a `breakerWarningEmitted` flag on GeminiChat. Fires once when the breaker first trips, resets when consecutiveFailures returns to 0. Service stays stateless.

R7.10: neutral tip wording. "Run /compress or /clear to free space" is correct whether hard-rescue ran, failed, or was budget-suppressed. Previous past-tense ("was forced on this turn") was wrong in the budget-exhausted case.

R7.11: 4 new test cases pinning the hardRescueFailureCount + reactive overflow counter contracts (budget exhaustion via failures, via NOOPs, via thrown exceptions; reactive throw increments consecutiveFailures).

Tests: packages/core 205 passing in changed files (chatCompression + geminiChat + tokenEstimation + compactionInputSlimming); packages/cli 33 passing (tips + ContextUsage + contextCommand). Pre-existing serve/* breakage and timeout-flaky utils/filesearch tests unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
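The pessimistic increment from R7.2 / R7.3 can be sketched in isolation. `RescueBudgetSketch`, the `Status` union, and the `'SKIPPED'` return value are hypothetical names; the real logic lives in sendMessageStream and tryCompress, and the limit of 3 is assumed from the "3 strikes" wording.

```typescript
type Status = 'COMPRESSED' | 'NOOP' | 'FAILED';

const MAX_RESCUE_FAILURES = 3; // assumed equal to MAX_CONSECUTIVE_FAILURES

class RescueBudgetSketch {
  hardRescueFailureCount = 0;

  // Charges a strike before the attempt and refunds it only on success.
  attempt(run: () => Status): Status | 'SKIPPED' {
    if (this.hardRescueFailureCount >= MAX_RESCUE_FAILURES) {
      return 'SKIPPED'; // budget exhausted: leave recovery to reactive overflow
    }
    this.hardRescueFailureCount += 1; // pessimistic: charge up front
    const status = run(); // if this throws, the strike is already recorded
    if (status === 'COMPRESSED') {
      this.hardRescueFailureCount = 0; // success refunds the strike
    }
    return status;
  }
}
```

Charging first means no post-call handler is needed for the throw case. In this round a NOOP also keeps its strike; R9.2 later changes that to a refund.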

…er + throttle)

The R7.1 <state_snapshot> extraction shipped in round 7 turned out to be incomplete on three fronts that the round-8 review caught:

R8.1 critical (format-violation diagnostic): when the model produced non-empty raw output but no <state_snapshot> tags, the path silently classified as COMPRESSION_FAILED_EMPTY_SUMMARY, indistinguishable from a model that genuinely returned nothing, and three such sends trip the breaker with no actionable signal. Added a warn-level log on the !isRawEmpty && isSummaryEmpty branch that includes length and the first 200 chars of the raw output, so an oncall can distinguish "prompt drift / model misbehaviour" from "provider error".

R8.6 (regex bypass): the non-greedy `<state_snapshot>[\s\S]*?</...>` match captured from the FIRST occurrence of the opening tag. Because the compression prompt instructs the model to "generate the <state_snapshot>", the scratchpad is plausibly going to mention the tag literally, and the match would then start at the scratchpad mention and capture the scratchpad's reasoning through to the real closing tag, defeating the data-retention fix. Anchored on the LAST opening tag via `[\s\S]*<state_snapshot>([\s\S]*?)</state_snapshot>` plus `${`<state_snapshot>${...}</state_snapshot>`}` reconstruction.

R8.7 (token math): the persisted history contains only the snapshot envelope, but newTokenCount used the raw API `candidatesTokenCount`, which counts scratchpad+snapshot. Scaling by `summary.length / rawSummaryText.length` while keeping the API count as the base preserves tokenizer fidelity for the snapshot portion. Test scenario of ~3x scratchpad vs snapshot drops the bookkeeping from 1024 → ~260, which is materially closer to what the next cheap-gate actually sees.

R8.4 (throttle asymmetry): the R7.7 budget-exhausted warn fired on every send when a session stayed above the hard threshold, asymmetric with R7.9's `breakerWarningEmitted`. Added matching `budgetExhaustedWarningEmitted` flag, cleared in the same COMPRESSED success branch as the other resets.

R8.2 / R8.3 / R8.5 (test coverage gaps): added 6 tests pinning contracts the previous rounds left unverified:
- exact-cap (20_000) truncation guard (R7.8 regression guard)
- scratchpad-strip end-to-end persistence assertion (R7.1)
- format-violation EMPTY_SUMMARY + warn (R8.1/R8.3b combined)
- breaker-tripped NOOP returns originalTokenCount (R7.6 telemetry)
- hardRescueFailureCount recovery after COMPRESSED success (R8.5)
- regex-anchor on literal scratchpad mention (R8.6)
- newTokenCount accounts for only persisted snapshot (R8.7)

Phase 5 ordering: R8.1, R8.6, R8.7 were written test-first (RED → fix → GREEN); R8.4 mirrors R7.9 structurally. Phase 6 self-review checklist run and documented in the PR reply. All 2126 core tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
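A minimal sketch of the R8.6 anchoring and the R8.7 scaling, assuming the tag name `state_snapshot` and using hypothetical helper names (`extractSnapshot`, `scaleTokenCount`). The greedy `[\s\S]*` prefix consumes as much as possible, so the capture starts at the last opening tag that is still followed by a closing tag.

```typescript
const TAG = 'state_snapshot';

// Greedy prefix pushes the match to the LAST opening tag; the lazy group
// then captures up to the closing tag that follows it.
const SNAPSHOT_REGEX = new RegExp(`[\\s\\S]*<${TAG}>([\\s\\S]*?)</${TAG}>`);

// Returns only the snapshot envelope, or null when no complete envelope exists.
function extractSnapshot(raw: string): string | null {
  const match = SNAPSHOT_REGEX.exec(raw);
  return match ? `<${TAG}>${match[1]}</${TAG}>` : null;
}

// Scales the API-reported output tokens by the share of characters kept.
function scaleTokenCount(apiOutputTokens: number, snapshot: string, raw: string): number {
  if (raw.length === 0) return 0;
  return Math.round(apiOutputTokens * (snapshot.length / raw.length));
}

const raw =
  '<scratchpad>I will write the <state_snapshot> next. key=SECRET</scratchpad>' +
  '<state_snapshot>goal: ship PR</state_snapshot>';
const snapshot = extractSnapshot(raw);
```

Here `snapshot` holds only the final envelope, so the scratchpad text (including the literal tag mention) is not persisted. With a snapshot that is a quarter of the raw length, `scaleTokenCount(1024, ...)` yields 256, in line with the "1024 → ~260" figure quoted above.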

R9.1 (telemetry assertion): pre-existing breaker-NOOP test only checked status. Added explicit token-count assertions so a regression to `0/0` would surface instead of silently corrupting trip-event telemetry.

R9.2 critical (NOOP refund): the R7.2/R7.3 pessimistic increment was overcautious for the NOOP case. A forced rescue NOOPs when the compressible slice is too small to split this turn, not because the compression mechanism is broken. Refund the strike on NOOP so a session whose first few turns happen to be too small doesn't permanently disable hard-rescue. Throws and failure statuses still cost a strike. Flipped the R7.11 NOOP test to assert the new contract (budget does NOT exhaust on NOOPs).

R9.3 critical (cross-file silent coupling): the `<state_snapshot>` tag name was hard-coded in both `prompts.ts` (literal XML in the template) and `chatCompressionService.ts` (extraction regex). A rename in one without the other was a silent failure mode (every compaction → EMPTY_SUMMARY → breaker trips after 3 sends → auto-compaction permanently off, looking like "model can't follow format"). Lifted `COMPRESSION_SNAPSHOT_TAG = 'state_snapshot'` as a shared constant; prompt template uses it via template literal, regex constructs from it via `new RegExp`.

R9.4 (stale breaker flag): hard-rescue resets `consecutiveFailures = 0` in the pre-call path but pre-R9.4 left `breakerWarningEmitted` true. After a session sequence "breaker trips → warn emitted → hard-rescue resets counter → counter re-trips", the second trip emitted no warn. Clear the flag alongside the counter in the rescue pre-call path.

R9.5 (tip small-window collapse): the `context-critical` tip fired at `>= thresholds.hard`, but on small windows (32K) `computeThresholds` collapses hard to equal auto, so the tip would claim "near hard limit" when there is no distinct hard limit. Mirror the `currentTier` guard (`hard > auto`) so the `context-high` band `[auto, hard)` handles small windows cleanly.

R9.6 declined as filter-1 false-positive: the cited inflation was fixed in R8.7 (current code scales `compressionOutputTokenCount` by the snapshot/raw char ratio). Reviewer was reading a stale snapshot.

R9.7 (preserve valid snapshots): the truncation guard fired whenever `compressionOutputTokenCount >= COMPACT_MAX_OUTPUT_TOKENS` regardless of extraction success. When the model emits a complete `<state_snapshot>...</state_snapshot>` envelope and the cap was consumed by scratchpad, dropping the snapshot throws away a valid result. Gated the guard on `!snapshotMatch` so it now only fires when the envelope is incomplete (no closing tag), which is strong evidence of mid-snapshot truncation. Existing R7.8/R8.2 truncation tests updated to use no-closing-tag mocks (the actual shape of mid-snapshot truncation); added new test for the "complete envelope + cap hit → preserved" contract.

Phase 5 ordering: R9.2 / R9.4 / R9.7 were RED-first (the R7.11 NOOP test flip is the explicit RED for R9.2; R9.4 has a fresh internals-peek test; R9.7 has a fresh test that fails against the pre-R9.7 code which would return TRUNCATED instead of COMPRESSED). R9.3 is a constant-lift with no behavior change. R9.5 has a new small-window-collapse test.

Tests: 2128 core + 24 CLI all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
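The R9.7 gating can be sketched as a small classifier. The function name `classifySummary` and the shortened outcome labels are assumptions, and the real service performs additional checks that are omitted here; the 20,000 cap is the value quoted in the thread.

```typescript
const COMPACT_MAX_OUTPUT_TOKENS = 20_000; // the 20K cap quoted in the thread

type Outcome = 'COMPRESSED' | 'OUTPUT_TRUNCATED' | 'EMPTY_SUMMARY';

// `snapshot` is the extracted envelope, or null when no complete envelope was found.
function classifySummary(outputTokens: number, snapshot: string | null): Outcome {
  if (snapshot !== null) {
    // A complete envelope is kept even when the cap was hit (scratchpad used the budget).
    return 'COMPRESSED';
  }
  if (outputTokens >= COMPACT_MAX_OUTPUT_TOKENS) {
    // No closing tag and at the cap: most likely cut off mid-snapshot.
    return 'OUTPUT_TRUNCATED';
  }
  // Below the cap with no envelope: format drift or an empty reply.
  return 'EMPTY_SUMMARY';
}
```

The ordering is the point of the change: extraction success is checked before the cap, so the truncation guard can no longer discard a valid snapshot.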

R11.1 critical (NaN propagation): R10.1's `?? 0` only catches null/undefined; NaN passes through and poisons every subsequent `lastPromptTokenCount + NaN + ...` arithmetic: `NaN >= hard` is always false, silently disabling hard-tier rescue for the session. Guard with `Number.isFinite` so NaN / Infinity / non-numbers coerce to 0. RED-first via hostile-NaN-payload test.

R11.2 (self-inflicted regression from R9.5): adding `hard > auto` to context-critical left context-high's `[auto, hard)` band empty when hard === auto (small windows 32K/64K). Users at the auto threshold lost ALL contextual tips. Accept `>= auto` in context-high when hard === auto so there's always exactly one tip in the high-utilization range. RED-first via collapsed-window test.

R11.3 critical (per-strike observability): pre-R11.3, proactive auto-compaction failures produced ZERO logs until the breaker tripped on strike 3. An oncall investigating "auto-compaction stopped" couldn't distinguish EMPTY_SUMMARY / OUTPUT_TRUNCATED / INFLATED / TOKEN_COUNT_ERROR without source-diving. Added info-level per-strike log citing status and strike-of-MAX. Declined the second half of the suggestion (promote breaker/budget warns to console.warn for user visibility): that's UI noise; users without DEBUG=QWEN_CODE_CHAT enabled see reactive overflow recovery working, which is the intended UX.

R11.4 critical (disable escape hatch restored): the removal of `contextPercentageThreshold: 0` was scope-collateral, not intent. Users with compliance / debugging / audit-trail needs require a way to opt out of auto-compaction entirely. Added `chatCompression.disabled: boolean` field. Service-level cheap-gate gates `!force && !bypassTokenThreshold` (proactive only); hard-rescue gated at SOURCE in sendMessageStream since force=true would bypass the service gate. Manual /compress (user-initiated force=true via tryCompressChat) and reactive overflow (API-layer safety net) remain active, matching the old contextPercentageThreshold=0 semantics that only gated the proactive path.

R11.5 declined-design: the counter asymmetry between `consecutiveFailures` (proactive cheap-gate health) and `hardRescueFailureCount` (rescue-budget pessimistic) is intentional and documented in the JSDoc: they track different mechanisms with legitimately different reset semantics. The "regular breaker reports healthy while every compression fails" scenario the reviewer describes IS the design: a flaky hard-rescue eventually exhausts its own budget, then the proactive cheap-gate accumulates strikes, then the cheap-gate breaker latches. Reactive overflow catches the actual API failure throughout. The save/restore pattern suggested would complicate the state machine without changing the recovery shape.

R11.6 (sensitive content in warn log): R8.1's `slice(0, 200)` of raw model output captured exactly the window where scratchpad's sensitive content (quoted API keys, paths from tool output) is most likely to appear. Length-only message preserves the operationally actionable distinction ("model returned content but no tags" vs "model returned nothing") without the leak risk. Actual content is recoverable from provider-side logging.

R11.7 (regex hoist): the snapshot extraction regex depends only on the immutable `COMPRESSION_SNAPSHOT_TAG` constant. Hoisted to module-scope `SNAPSHOT_REGEX`, which removes per-call `new RegExp()` overhead and signals to readers that the pattern is a fixed contract, not parameterised.

R11.8 (i18n hygiene): `breakdown.currentTier` value was interpolated raw at 2 sites (contextCommand text formatter + ContextUsage Ink component). Wrapped in `t()` so non-English locales don't see mixed-language output. Sibling sweep via grep confirmed exactly 2 unwrapped render sites; the other `currentTier` references are code comparisons against tier-name string literals (not user-facing strings).

2361 core + 35 CLI tests passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

R12.1 critical (sibling sweep of R11.1): R11.1 added Number.isFinite to `lastCandidatesTokenCount`, but `lastPromptTokenCount` (assigned 3 lines above) and `cachedContentTokenCount` had no guard. Also, Number.isFinite(-1) is true: a negative value would still poison arithmetic. Factored `coerceUsageCount(value)` enforcing (finite ∧ >= 0) and routed all 4 API-value capture sites through it. RED-first via Infinity/NaN/-1/-1e9 injection test.

R12.2 critical (computeThresholds NaN propagation): a provider returning `"context_window": null` surfaces as `contextWindowSize: NaN`. Pre-fix, NaN propagated to all 4 thresholds, every downstream `tokens >= NaN` comparison evaluated false, and the entire three-tier gate silently disabled. Guard with `!Number.isFinite || <= 0` → return Infinity thresholds (gate falls through to NOOP) + 0 effectiveWindow. RED-first against NaN/0/-1/-Inf inputs.

R12.3 critical (R8.7 self-inflicted undercount): pure scaling collapses on extreme scratchpad/snapshot ratios. Example: 200K scratchpad + 5K snapshot with 15K API tokens scaled to ~375 tokens. Floor by `estimateContentTokens` on the persisted summary: `Math.max(scaledApi, charBased)` keeps API tokenizer fidelity when scratchpad is reasonable, clamps when it isn't. RED-first via 200K/5K extreme test.

R12.4 critical (disabled NOOP observability): the R11.4 disable-knob NOOP returned silently, leaving oncall unable to distinguish "user disabled" from "system broken". Added once-per-process warn (module-level flag because `ChatCompressionService` is per-call). Symmetric with R7.9 `breakerWarningEmitted` / R8.4 `budgetExhaustedWarningEmitted`.

R12.5 critical (test gap for R11.4 source gate): R11.4's hard-rescue source-level disable check had no regression guard. Added test mocking `getChatCompression: { disabled: true }` + lastPromptTokenCount above hard threshold; asserts no force=true call to tryCompress. Test passes against current code; it pins the contract against future refactor removing the source gate.

R12.6 (deprecation text contradiction): the R11.4 commit added `disabled: true` but left the deprecation warning saying "auto-compaction cannot currently be disabled". Updated to mention the new field.

R12.7 declined-design: `imageTokenEstimate: 0` silently clamping to 100 violates user intent on a user-configurable knob. The reviewer's concern (user accidentally disabling image weight) is real but the fix is wrong shape: silent override of explicit values is filter-5 defensive bloat. Users explicitly setting 0 are signaling intent; config-validation warnings at load are a future enhancement if real-world complaints surface.

R12.8 (locale baseline): the 8+ new t() keys in /context output (`Compaction thresholds`, `Effective window`, `Warn/Auto/Hard threshold`, `Current tier`, tier names, `window − {{reserve}} reserve`) had no entries in en.js. Added as baseline; other locales fall back to the literal key (existing Used/Free behavior). Not flagged in mustTranslateKeys.ts; doing so would force breaking-CI on locale maintainers; same precedent as existing Used/Free which also aren't flagged.

R12.9 + R12.10 (discoverability): added `model.chatCompression.disabled` and `model.chatCompression.imageTokenEstimate` rows to settings.md; updated the REMOVED row for `contextPercentageThreshold` to mention the new `disabled: true` migration path per gpt-5.5's exact suggested text. Schema entry in settingsSchema.ts deliberately NOT changed: adding nested sub-properties for chatCompression would require rewriting the schema design for ALL existing sub-fields (imageTokenEstimate) and is out of scope for this round; TypeScript's ChatCompressionSettings interface already provides IDE-side autocomplete.

2405 core + 43 CLI tests in touched files passing. Pre-existing serve/* import resolution failures in CLI workspace unaffected.

wenshao · 2026-05-20T03:30:17Z

+    ) {
+      // eslint-disable-next-line no-console
+      console.warn(
+        '[qwen-code] chatCompression.contextPercentageThreshold has been removed ' +


[Critical] Deprecation warning contradicts the PR's own replacement mechanism

The console.warn message says "auto-compaction cannot currently be disabled" and tells users to open an issue "so we can consider a replacement." But this same PR adds ChatCompressionSettings.disabled as the first-class replacement for the removed contextPercentageThreshold: 0 escape hatch. Users who see this warning will be misled into thinking no disable mechanism exists and may file unnecessary issues or resort to unsafe workarounds.

Suggested change

'[qwen-code] chatCompression.contextPercentageThreshold has been removed ' +

'Note: the "contextPercentageThreshold" setting is removed. ' +

'To disable auto-compaction, set "chatCompression": { "disabled": true } instead.'

— DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao · 2026-05-20T03:30:18Z

+      // disable gate catches the proactive cheap-gate (force=false);
+      // hard-rescue uses force=true to bypass the breaker, so it would
+      // otherwise sidestep that gate. Skip at the source.
+      const autoCompactionDisabled =


[Critical] No test coverage for hard-rescue suppression when chatCompression.disabled === true

The sendMessageStream hard-rescue gate checks !autoCompactionDisabled before computing wantHardRescue. This is the sole protection against hard-rescue (which uses force=true) bypassing the service-layer disabled gate. Only the service-layer disable check is tested (chatCompressionService.test.ts "honors chatCompression.disabled"); the geminiChat guard has zero coverage. A regression that drops this guard would cause hard-rescue to silently ignore disabled: true.

Suggested change

const autoCompactionDisabled =

// Add a test in the "sendMessageStream hard-tier rescue" describe block:

// set mockConfig.getChatCompression().mockReturnValue({ disabled: true }),

// seed lastPromptTokenCount above hard threshold,

// assert compressSpy.mock.calls[0][1].force is false

— DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao · 2026-05-20T03:30:18Z

+  imageTokenEstimate: number = DEFAULT_IMAGE_TOKEN_ESTIMATE,
+): number {
+  let totalChars = 0;
+  for (const content of contents) {


[Critical] char/4 token estimation severely underestimates for CJK text (2-4x error)

The code comment claims "char/4 is a conservative lower bound (real tokenizers vary ±30%)," but for Chinese/Japanese/Korean the opposite is true — most tokenizers encode CJK at 1-2 chars per token, not 4. This means for CJK-heavy conversations the cheap-gate triggers much later than it should, hard-tier rescue misses the window before API overflow, and the /context display shows misleadingly low token counts. For CJK users this is a systematic correctness issue that silently undermines the entire three-tier ladder.

Consider applying a language-aware correction factor, or at minimum lowering the ratio for safety. The current CHARS_PER_TOKEN=4 is only "conservative" for Latin-alphabet languages.

— DeepSeek/deepseek-v4-pro via Qwen Code /review
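
A minimal sketch of the "language-aware correction factor" idea raised above. Everything here is an assumption for illustration: the function name `estimateTextTokens`, the character ranges, and the two ratios (4 characters per token for non-CJK text, 1 for CJK) are hypothetical and have not been measured against any tokenizer or taken from `tokenEstimation.ts`.

```typescript
// Hypothetical ratios, for illustration only; real tokenizers differ.
const NON_CJK_CHARS_PER_TOKEN = 4;
const CJK_CHARS_PER_TOKEN = 1;

// BMP ranges: Hiragana + Katakana, CJK Extension A, CJK Unified Ideographs,
// Hangul Syllables. Rarer blocks are deliberately ignored in this sketch.
const CJK_CHARS = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af]/g;

function estimateTextTokens(text: string): number {
  const cjkCount = (text.match(CJK_CHARS) ?? []).length;
  // text.length counts UTF-16 code units, which is close enough for an estimate.
  const otherCount = text.length - cjkCount;
  return Math.ceil(
    cjkCount / CJK_CHARS_PER_TOKEN + otherCount / NON_CJK_CHARS_PER_TOKEN,
  );
}
```

Under these assumed ratios, four Han characters are estimated at 4 tokens rather than the 1 token a flat char/4 rule would give, so the cheap-gate would err toward firing earlier on CJK-heavy conversations.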

wenshao · 2026-05-20T03:30:18Z

+    // the API layer is still the last-ditch safety net. Replaces the
+    // removed `contextPercentageThreshold: 0` escape hatch.
+    if (chatCompressionSettings?.disabled && !force && !bypassTokenThreshold) {
+      return {


[Critical] The disabled NOOP path returns with zero observability

When chatCompressionSettings.disabled === true, the early-return NOOP is entirely silent — no debug log, no warn, no telemetry event. This replaces the old contextPercentageThreshold: 0 escape hatch, but unlike the old path which at least produced NOOP telemetry, this leaves zero audit trail. An oncall investigating "why isn't compression running?" has no signal to find the disabled flag short of inspecting runtime config.

Suggested change

return {

if (chatCompressionSettings?.disabled && !force && !bypassTokenThreshold) {

debugLogger.debug(

'[compaction] NOOP: chatCompression.disabled is true',

);

return { ... };

}

— DeepSeek/deepseek-v4-pro via Qwen Code /review
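
One way to give the disabled path an audit trail without logging on every send is a process-wide latch, which is the shape the R12.4 commit note describes. The sketch below is an assumption, not the PR's code: `noteCompactionDisabled`, `disabledNoticeEmitted` and the injected `warn` callback are hypothetical names; only the message text is taken from the suggestion above.

```typescript
type WarnFn = (message: string) => void;

// Module-level latch: it outlives any per-call service instance, so the
// notice is emitted at most once per process.
let disabledNoticeEmitted = false;

function noteCompactionDisabled(warn: WarnFn): void {
  if (disabledNoticeEmitted) return;
  disabledNoticeEmitted = true;
  warn('[compaction] NOOP: chatCompression.disabled is true');
}
```

Passing the logger in as a parameter keeps the sketch testable; a real call site would presumably pass its existing debug logger.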

wenshao · 2026-05-20T03:30:18Z

+ * (autoCompact.ts:30) which is based on p99.99 of real compaction outputs.
+ */
+export const COMPACT_MAX_OUTPUT_TOKENS = 20_000;
+


[Suggestion] SUMMARY_RESERVE and COMPACT_MAX_OUTPUT_TOKENS are both hardcoded to 20_000 with a prose comment requiring them to stay equal — but there is no compile-time enforcement

If someone changes one constant without the other, computeThresholds computes effectiveWindow = window - SUMMARY_RESERVE while the actual output cap is COMPACT_MAX_OUTPUT_TOKENS. The threshold ladder silently drifts from reality. This is a classic "works by coincidence" coupling with no compiler error, no runtime exception, and no test failure to catch the drift.

Suggested change

export const COMPACT_MAX_OUTPUT_TOKENS = 20_000;

export const SUMMARY_RESERVE = COMPACT_MAX_OUTPUT_TOKENS;

— DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao · 2026-05-20T03:30:18Z

+  /** Token count at which UI warn tier triggers. */
+  readonly warn: number;
+  /** Token count at which auto-compaction triggers. */
+  readonly auto: number;


[Suggestion] computeThresholds(window) has no input validation — NaN/Infinity input silently disables all compression

If called with NaN (e.g., contextWindowSize misread from config), NaN - SUMMARY_RESERVE = NaN, Math.max(NaN, 0) = NaN, and all thresholds become NaN. Since NaN >= anything is always false, the cheap-gate always passes (NOOP) and hard-rescue never fires. Compression is silently disabled with no error. Extremely unlikely but catastrophic when it happens.

Suggested change

readonly auto: number;

if (!Number.isFinite(window) || window <= 0) {

const infinite = Number.MAX_SAFE_INTEGER;

return { warn: infinite, auto: infinite, hard: infinite };

}

— DeepSeek/deepseek-v4-pro via Qwen Code /review
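
The failure mode and the guard can be sketched independently of the real threshold formula. In the sketch below `guardedThresholds` and `LadderThresholds` are hypothetical names, and the `compute` callback stands in for whatever `computeThresholds` actually does. It returns `Infinity` thresholds, as the R12.2 commit note describes, rather than the `Number.MAX_SAFE_INTEGER` used in the suggestion above.

```typescript
interface LadderThresholds {
  warn: number;
  auto: number;
  hard: number;
}

function guardedThresholds(
  contextWindow: number,
  compute: (validWindow: number) => LadderThresholds,
): LadderThresholds {
  // NaN, +/-Infinity, zero and negative windows never reach the formula.
  if (!Number.isFinite(contextWindow) || contextWindow <= 0) {
    // Every `tokens >= threshold` check is false, but explicitly so,
    // instead of by accident of NaN comparison semantics.
    return { warn: Infinity, auto: Infinity, hard: Infinity };
  }
  return compute(contextWindow);
}
```

The observable behaviour for a bad window is still "no proactive compaction"; the difference is that the caller can now detect the sentinel and log it.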

wenshao · 2026-05-20T03:30:18Z

+ * the steady-state estimate lags by one response (typically 500–5000
+ * tokens) and the hard-tier rescue (which sits only HARD_BUFFER ≈ 3K
+ * from the window edge) fires late, costing a doomed API round-trip
+ * before reactive recovery catches the overflow.


[Suggestion] estimatePromptTokens accepts a history parameter that is unused in the steady-state path — leaky abstraction

The primary caller (sendMessageStream) passes [] for history to avoid a getHistory(true) clone. In the steady-state branch (lastPromptTokenCount > 0), the history argument is never read. This optimization leaks into the callee's API surface: future callers may not realize they can pass [], or may mistakenly pass [] on the cold-start path, silently getting a severe underestimate.

Consider splitting into two functions or using an options object so the cold-start case is explicitly opt-in:

Suggested change

* before reactive recovery catches the overflow.

export function estimateSteadyStateTokens(

lastPromptTokenCount: number,

lastCandidatesTokenCount: number,

userMessage: Content,

imageTokenEstimate?: number,

): number { ... }

export function estimateColdStartTokens(

history: Content[],

userMessage: Content,

imageTokenEstimate?: number,

): number { ... }

— DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao · 2026-05-20T03:30:18Z

+  /**
+   * Number of consecutive auto-compaction failures for this chat. The
+   * cheap-gate NOOPs once this reaches MAX_CONSECUTIVE_FAILURES (default 3)
+   * until a successful compress (forced or not) resets it to 0. Replaces the


[Suggestion] Inconsistent naming between two adjacent failure counters: consecutiveFailures vs hardRescueFailureCount

One uses bare "Failures" without a count suffix, the other uses "FailureCount". Additionally, consecutiveFailures doesn't specify what fails — consecutive what? Without reading the JSDoc, it's ambiguous. Both track related-but-different failure domains within the same class.

Consider renaming both for symmetry and clarity, e.g., autoCompactionConsecutiveFailures and hardRescueConsecutiveFailures.

— DeepSeek/deepseek-v4-pro via Qwen Code /review
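
The counter semantics quoted in the JSDoc excerpt above (NOOP at three strikes, reset on any successful compress) can be sketched as a small class. This is an illustration only: `CompactionBreaker`, `MAX_STRIKES` and the method names are hypothetical, and the rule that forced attempts do not add strikes follows the R5.3 note in this thread rather than the actual `GeminiChat` implementation.

```typescript
// Mirrors the documented default of MAX_CONSECUTIVE_FAILURES (3).
const MAX_STRIKES = 3;

class CompactionBreaker {
  private strikes = 0;

  /** True while proactive (non-forced) attempts should be skipped. */
  isOpen(): boolean {
    return this.strikes >= MAX_STRIKES;
  }

  /** Forced attempts are assumed not to count toward the breaker. */
  recordFailure(forced: boolean): void {
    if (!forced) this.strikes += 1;
  }

  /** Any successful compress, forced or not, closes the breaker. */
  recordSuccess(): void {
    this.strikes = 0;
  }
}
```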

wenshao · 2026-05-20T03:30:18Z

+ */
+function tierColor(tier: ContextTier): string {
+  switch (tier) {
+    case 'safe':


[Suggestion] warn and auto tiers share the same color (theme.status.warning), making them visually indistinguishable

The three-tier ladder has four states (safe, warn, auto, hard) but only three distinct colors — the warn-to-auto escalation has no visual signal. A user seeing /context output cannot tell from color alone whether they're in the "fire a tip" tier (warn) or the "actually compacting" tier (auto).

Use a distinct color for auto (e.g., a lighter orange distinct from the warn yellow) so each of the four states has a visually unique badge.

— DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao

⚠️ CI note: Lint check is failing and Windows test is still pending on this commit.

wenshao · 2026-05-20T03:56:28Z

+        '[qwen-code] chatCompression.contextPercentageThreshold has been removed ' +
+          'and is now controlled by built-in thresholds. Setting will be ignored. ' +
+          'Note: auto-compaction cannot currently be disabled — the old ' +
+          '"set threshold to 0 to disable" escape hatch is gone. If you need ' +


[Critical] Deprecation warning contradicts disabled field

This warning states "auto-compaction cannot currently be disabled" but the same PR introduces chatCompression.disabled: boolean (config.ts:281) which does exactly that. Users migrating from contextPercentageThreshold: 0 will believe there is no replacement.

The same false claim appears in docs/users/configuration/settings.md:148 ("There is currently no replacement to disable auto-compaction").

Suggested change

'"set threshold to 0 to disable" escape hatch is gone. If you need ' +

'Note: to disable auto-compaction, set chatCompression.disabled to true instead. ' +

'If you need to retain full history for other reasons, use /clear between conversations.',

— qwen-latest-series-invite-beta-v34 via Qwen Code /review

wenshao · 2026-05-20T03:56:28Z

+          // would silently disable hard-tier rescue for the rest of
+          // the session.
+          this.lastCandidatesTokenCount = Number.isFinite(
+            usageMetadata.candidatesTokenCount,


[Critical] Negative candidatesTokenCount not clamped

The R11.1 Number.isFinite guard catches NaN and Infinity but NOT negative finite values. Number.isFinite(-100_000) returns true, so a hostile/buggy provider reporting candidatesTokenCount: -100000 passes through and is added to the estimate in estimatePromptTokens, effectively subtracting from it.

Concretely: lastPromptTokenCount=176_999 + lastCandidatesTokenCount=-100_000 + small user message → effectiveTokens ≈ 77K, far below any hard threshold. Hard-tier rescue never fires regardless of actual prompt size.

Suggested change

usageMetadata.candidatesTokenCount,

this.lastCandidatesTokenCount = Number.isFinite(

usageMetadata.candidatesTokenCount,

)

? Math.max(0, usageMetadata.candidatesTokenCount as number)

: 0;

— qwen-latest-series-invite-beta-v34 via Qwen Code /review
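
A single helper that enforces "finite and non-negative" covers NaN, both infinities, negatives and non-numbers in one place. The name `coerceUsageCount` comes from the R12.1 commit note earlier in this thread; the body below is an assumed reconstruction, not the code that landed.

```typescript
// Assumed shape: anything that is not a finite, non-negative number becomes 0.
function coerceUsageCount(value: unknown): number {
  return typeof value === 'number' && Number.isFinite(value) && value >= 0
    ? value
    : 0;
}
```

Routing every provider-supplied count through one function avoids the sibling-field gap where one field is guarded and the adjacent one is not.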

wenshao · 2026-05-20T03:56:28Z

@@ -1375,6 +1667,23 @@ export class GeminiChat {
          // Always update the per-chat counter so this chat (including


[Critical] totalTokenCount fallback double-counts candidates tokens

When promptTokenCount is falsy (0 or undefined), this falls back to totalTokenCount which already includes candidates tokens. Then estimatePromptTokens (tokenEstimation.ts:82-84) adds lastCandidatesTokenCount on top, producing promptTokens + 2×candidatesTokens.

On providers that omit promptTokenCount (some OpenAI-compatible endpoints), this over-estimates by 500–5000 tokens, causing false-positive hard-tier rescue triggers (the HARD_BUFFER is only 3K from the window edge).

Suggested change

// Always update the per-chat counter so this chat (including

usageMetadata.promptTokenCount ?? usageMetadata.totalTokenCount;

Additionally, estimatePromptTokens should only add lastCandidatesTokenCount when lastPromptTokenCount was sourced from promptTokenCount (not totalTokenCount). Consider exposing a flag or restructuring the estimator.

— qwen-latest-series-invite-beta-v34 via Qwen Code /review
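
The double count is easiest to see with numbers. The sketch below is a hypothetical restructuring along the lines of the "expose a flag" idea above: `UsageSample`, `promptBaseline` and `estimateNextPrompt` are invented names, and the token figures in the usage note are made up for the example.

```typescript
interface UsageSample {
  promptTokenCount?: number;
  candidatesTokenCount?: number;
  totalTokenCount?: number;
}

interface PromptBaseline {
  tokens: number;
  // True when the baseline came from totalTokenCount, which already
  // contains the candidates tokens.
  includesCandidates: boolean;
}

function promptBaseline(usage: UsageSample): PromptBaseline {
  if (typeof usage.promptTokenCount === 'number' && usage.promptTokenCount > 0) {
    return { tokens: usage.promptTokenCount, includesCandidates: false };
  }
  return { tokens: usage.totalTokenCount ?? 0, includesCandidates: true };
}

function estimateNextPrompt(usage: UsageSample, userMessageTokens: number): number {
  const base = promptBaseline(usage);
  // Add the previous response only if the baseline does not already hold it.
  const candidates = base.includesCandidates ? 0 : (usage.candidatesTokenCount ?? 0);
  return base.tokens + candidates + userMessageTokens;
}
```

With a 100,000-token prompt, a 2,000-token response and a 500-token user message, both a provider that reports `promptTokenCount` and one that reports only `totalTokenCount: 102_000` yield 102,500. Adding the candidates unconditionally would give 104,500 in the second case.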

wenshao · 2026-05-20T03:56:28Z

+          // sent on THIS turn.
+          //
+          // R11.1: use Number.isFinite so a hostile / buggy provider
+          // payload (NaN, Infinity, non-number) coerces to 0 instead


[Critical] lastPromptTokenCount missing Number.isFinite guard

Unlike lastCandidatesTokenCount (guarded 18 lines below with the R11.1 Number.isFinite check), lastPromptTokenCount is assigned directly without validation. The R11.1 comment explicitly warns that "NaN >= hard is always false" would "silently disable hard-tier rescue" — yet the same class of bug exists on the far more impactful primary field.

A provider returning -Infinity for promptTokenCount passes the if (lastPromptTokenCount) truthiness check and is stored. -Infinity >= auto is always false → cheap-gate never fires → auto-compaction silently disabled.

Suggested change

// payload (NaN, Infinity, non-number) coerces to 0 instead

const rawPromptTokens =

usageMetadata.promptTokenCount ?? usageMetadata.totalTokenCount;

if (rawPromptTokens) {

this.lastPromptTokenCount = Number.isFinite(rawPromptTokens)

? (rawPromptTokens as number)

: this.lastPromptTokenCount;

— qwen-latest-series-invite-beta-v34 via Qwen Code /review

wenshao · 2026-05-20T03:56:28Z

+      // hard-rescue uses force=true to bypass the breaker, so it would
+      // otherwise sidestep that gate. Skip at the source.
+      const autoCompactionDisabled =
+        this.config.getChatCompression()?.disabled === true;


[Critical] autoCompactionDisabled hard-rescue guard has no integration test

The guard suppressing hard-tier rescue when chatCompression.disabled === true has no test at the sendMessageStream level. The service-level cheap-gate NOOP test exists (chatCompressionService.test.ts:675), but nothing verifies that shouldForceFromHard stays false when disabled.

Without this test, a regression removing !autoCompactionDisabled from the wantHardRescue expression would silently force-compress sessions whose user explicitly opted out.

Suggested test: set mockConfig.getChatCompression to return { disabled: true }, set lastPromptTokenCount above the hard threshold, send a message, and assert compressSpy.mock.calls[0][1].force === false.

— qwen-latest-series-invite-beta-v34 via Qwen Code /review

wenshao · 2026-05-20T03:56:28Z

 // Services
 // ============================================================================

+export {


[Critical] MAX_CONSECUTIVE_FAILURES not exported from barrel

The design doc's migration guide instructs SDK consumers to "change true to MAX_CONSECUTIVE_FAILURES", but this constant is not re-exported from the package barrel. Consumers following the documented path cannot import { MAX_CONSECUTIVE_FAILURES } from '@qwen-code/qwen-code-core'.

Suggested change

export {

export {

computeThresholds,

MAX_CONSECUTIVE_FAILURES,

SUMMARY_RESERVE,

type CompactionThresholds,

} from './services/chatCompressionService.js';

— qwen-latest-series-invite-beta-v34 via Qwen Code /review

wenshao · 2026-05-20T03:56:28Z

+            persistedOutputTokens,
        );
      }
    }


[Suggestion] Hook contract change: firePostCompactEvent now receives XML-wrapped summary

The summary parameter passed to hooks is now <state_snapshot>...</state_snapshot> (constructed at line 598), whereas previously it was raw model text (thoughts filtered). External hook consumers that parse, display, or store compact_summary will now see XML envelope tags they did not expect.

Consider passing snapshotMatch[1] (the inner content without wrapper tags), or documenting the format change in the hook-event type definition.

— qwen-latest-series-invite-beta-v34 via Qwen Code /review

wenshao · 2026-05-20T03:56:28Z

+    );
+    lines.push('');
+    lines.push(`**${t('Compaction thresholds')}**`);
+    // R6.13: i18n the labels + derive the reserve hint from the


[Suggestion] Inconsistent i18n: new labels use t(), existing category labels do not

New threshold labels are wrapped in t() (e.g., t('Effective window'), t('Warn threshold')), but adjacent category labels remain raw strings: fmtCategoryRow('System prompt', ...), fmtCategoryRow('Built-in tools', ...), etc.

Non-English locales will see a mixed-language render. Either wrap all labels in t() or add a // TODO(i18n): wrap remaining category labels comment to track the gap.

— qwen-latest-series-invite-beta-v34 via Qwen Code /review

wenshao · 2026-05-20T03:56:29Z

+ * runtime would treat the session as sitting in.
 */
-const DEFAULT_COMPRESSION_THRESHOLD = 0.7;
+function currentTier(


[Suggestion] currentTier small-window degradation branch untested

When thresholds.hard === thresholds.auto (e.g., 32K window: both are 22,400), the guard thresholds.hard > thresholds.auto prevents misclassifying the tier as 'hard'. All /context tests use a 200K window where hard > auto, so this branch is never exercised.

Suggested test: use makeMockConfig(32_000) with lastPromptTokenCount = 25_000, assert currentTier === 'auto' (not 'hard').

— qwen-latest-series-invite-beta-v34 via Qwen Code /review
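
The degraded-window branch can be sketched as a pure classifier. This is an assumed shape, not the PR's `currentTier`: the names `classifyTier` and `TierThresholds` are hypothetical, and only the `hard > auto` guard and the 22,400 figure for a 32K window come from the comment above.

```typescript
type Tier = 'safe' | 'warn' | 'auto' | 'hard';

interface TierThresholds {
  warn: number;
  auto: number;
  hard: number;
}

function classifyTier(tokens: number, thresholds: TierThresholds): Tier {
  // 'hard' is only a distinct tier when it sits strictly above 'auto';
  // on small windows the two collapse and 'auto' wins.
  if (thresholds.hard > thresholds.auto && tokens >= thresholds.hard) return 'hard';
  if (tokens >= thresholds.auto) return 'auto';
  if (tokens >= thresholds.warn) return 'warn';
  return 'safe';
}
```

With `auto` and `hard` both at 22,400, a 25,000-token session classifies as `'auto'`, which is the assertion the suggested test would pin.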

wenshao · 2026-05-20T03:56:29Z

+ * design discussion.
+ */
+const SNAPSHOT_REGEX = new RegExp(
+  `[\\s\\S]*<${COMPRESSION_SNAPSHOT_TAG}>([\\s\\S]*?)</${COMPRESSION_SNAPSHOT_TAG}>`,


[Suggestion] Truncation guard false-positive on partial closing tag

SNAPSHOT_REGEX requires the full closing </state_snapshot> tag. When the model writes a complete snapshot body but hits COMPACT_MAX_OUTPUT_TOKENS mid-closing-tag (e.g., emits </state_snap), the regex returns no match and a semantically-complete summary is discarded as COMPRESSION_FAILED_OUTPUT_TRUNCATED.

After 3 such events the breaker trips and auto-compaction stops. Consider accepting partial closing tags or verifying the captured content contains expected child tags (<overall_goal>, <current_plan>) as a more robust completeness check.

— qwen-latest-series-invite-beta-v34 via Qwen Code /review
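
A sketch of how an extractor could distinguish "closing tag cut off mid-way" from "no closing tag at all". The regex is the one quoted in the diff above, with the tag name taken as `state_snapshot` from the prose; `extractSnapshot`, the `Extraction` type and the partial-close heuristic are hypothetical. Whether a partially closed snapshot should then be accepted is a separate design decision that this sketch does not make.

```typescript
const SNAPSHOT_TAG = 'state_snapshot';
const COMPLETE_SNAPSHOT = new RegExp(
  `[\\s\\S]*<${SNAPSHOT_TAG}>([\\s\\S]*?)</${SNAPSHOT_TAG}>`,
);

type Extraction =
  | { kind: 'complete'; body: string }
  | { kind: 'partial-close'; body: string }
  | { kind: 'missing' };

function extractSnapshot(output: string): Extraction {
  const complete = COMPLETE_SNAPSHOT.exec(output);
  if (complete) return { kind: 'complete', body: complete[1] };

  const openTag = `<${SNAPSHOT_TAG}>`;
  const closeTag = `</${SNAPSHOT_TAG}>`;
  const openAt = output.lastIndexOf(openTag);
  if (openAt === -1) return { kind: 'missing' };

  const rest = output.slice(openAt + openTag.length);
  // Does the output end with a proper prefix of the closing tag
  // (at least "</")? Longest candidate prefix is tried first.
  for (let len = closeTag.length - 1; len >= 2; len--) {
    if (rest.slice(-len) === closeTag.slice(0, len)) {
      return { kind: 'partial-close', body: rest.slice(0, rest.length - len) };
    }
  }
  return { kind: 'missing' };
}
```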

LaZzyMan · 2026-05-20T04:07:29Z

Closing in favor of #4345 — a clean re-cut of the original spec + early-discovered real bugs (R1-R5, 6 commits, 4288 LOC, no AI-driven scope creep).

This PR ran 12 review rounds and accumulated ~2700 LOC of review-driven additions, much of it self-inflicted regression chains and AI-reviewer scope creep. The PR became un-reviewable.

Substantive review-driven refinements (hard-rescue counter, scratchpad envelope extraction, disable escape hatch, hostile-provider hardening, etc.) are being split into focused follow-up issues for independent review.

Branch state preserved at tag pr-4168-archive-pre-revert (already pushed to origin) for archaeology.

The triage retrospective that drove this revert is also being folded into the project's pr-triage skill — adds a round-weighted decline bar (defaults Suggestions/test gaps/docs/observability to overthinking from round 5+) which review-response lacked.

…istoryShallow)

Main landed #4286 (replace structuredClone with shallow copy) which:
- Reverted #4186's heap-pressure auto-compaction safety net (#4286 removed HEAP_PRESSURE_COMPRESSION_RATIO because the underlying OOM cause was fixed by the shallow-copy refactor)
- Reverted #4168's consecutiveFailures ladder back to single-shot hasFailedCompressionAttempt
- Introduced getHistoryShallow() / peekLastHistoryEntry() to replace structuredClone-based history access
- Added a Chinese-language design doc draft for this exact redesign

Resolution strategy:
- Take OUR redesign everywhere it conflicts: three-tier threshold ladder, consecutiveFailures circuit breaker, hard-rescue, token estimator, hard-rescue debug log, CompressOptions plumbing for pendingUserMessage / precomputedEffectiveTokens / trigger.
- DROP all bypassTokenThreshold / heapPressureCompressionCooldownUntil / HEAP_PRESSURE_* / mockGetHeapStatistics / mockHeapPressure code (heap-pressure mechanism is gone on main; we're not reviving it).
- Use main's new getHistoryShallow(true) in chatCompressionService and in the hard-tier rescue estimator path (was getHistory(true) before main's refactor; the shallow path is what other compaction call sites now use).
- For chatCompressionService.test.ts inline mockChat objects, alias getHistoryShallow to the same vi.fn() as getHistory so existing .mockReturnValue() calls drive both methods.
- For the design doc, keep our resolved Open Question 2 closure rationale and prepend the round-2 blockquote clarifying the Background section describes pre-redesign behavior; take main's slightly more thorough SUMMARY_RESERVE paragraph where it explains both with/without-thinking cases.
- Replace the round-2 test that asserted "hard-rescue forwards consecutiveFailures=3" with a test compatible with the post-merge history-access shape (now using getHistoryShallow).

346 core tests passing; CLI typecheck clean for affected files. Pre-existing provider-config typecheck errors from main's #4287 refactor are unrelated to this PR and not touched here.

…er (#4345)

* feat(core)!: redesign auto-compaction thresholds with three-tier ladder

Replaces the single 70% proportional threshold with a three-tier ladder (warn/auto/hard) that combines proportional fallback with absolute reservation. Large-window models (>=128K) now reserve ~33K instead of 30% of the window, freeing tens of thousands of context tokens that the old formula wasted.

Other improvements bundled in the same redesign:
- Compression sideQuery now disables thinking and caps maxOutputTokens at 20K, matching claude-code so the buffer math is predictable across providers (Anthropic/OpenAI/Gemini handle thinking budgets inconsistently)
- Failure handling upgraded from one-shot permanent lock to a 3-strike circuit breaker; reactive overflow still latches immediately
- New estimatePromptTokens helper closes the lag-by-one-turn and first-send-is-0 gaps in lastPromptTokenCount
- Hard-tier rescue pulls reactive overflow recovery forward to before the API call, saving an oversized round-trip
- /context command displays the three-tier ladder + current tier
- tipRegistry's context-* tips track the new thresholds instead of fixed 50/80/95 percentages

BREAKING CHANGE: chatCompression.contextPercentageThreshold setting is removed. Settings files containing the field log a one-line deprecation warning at startup and the value is ignored; behaviour is now controlled by built-in thresholds via the new computeThresholds() function.

Design: docs/design/auto-compaction-threshold-redesign.md
Plan: docs/plans/2026-05-14-auto-compaction-threshold-redesign.md

* test(core): fix leftover hasFailedCompressionAttempt option in compress test

A pre-existing test case at chatCompressionService.test.ts:678 still passed `hasFailedCompressionAttempt: false` in the CompressOptions shape; rebasing onto current main surfaced this as a typecheck error because the field was renamed to `consecutiveFailures` (Task 7 of the three-tier ladder migration). Update to `consecutiveFailures: 0`: semantically equivalent, the test asserts the side-query is called when `force: true`, no other behaviour change.

* fix(core): drop compaction summary when output hits maxOutputTokens cap

Adds a defensive guard in ChatCompressionService.compress() that detects when the side-query summary hit COMPACT_MAX_OUTPUT_TOKENS (20K). In that case the summary is likely truncated mid-content, so we drop it and return NOOP rather than persist a half-summary. The next send re-tries; reactive overflow still catches the catastrophic case where the API rejects the next request as too large.

Documented in the design doc as risk #2; the bot reviewer on PR #4168 correctly pushed for it to land alongside the threshold redesign rather than as a follow-up since the new 20K cap is what makes truncation likely in the first place.

* fix(cli): render three-tier thresholds in /context TUI view

The Task 11 redesign updated the non-interactive text formatter (formatContextUsageText) but left ContextUsage.tsx (the interactive React component that real /context users see) unchanged. As a result the TUI still showed the old single "Autocompact buffer" line and none of the new warn/auto/hard ladder.

Adds a "Compaction thresholds" section after the per-category breakdown:
- Effective window
- Warn / Auto / Hard threshold rows with a ▶ marker on the row the current usage has crossed
- Current tier label coloured by severity (safe→green, warn/auto→yellow, hard→red)

The existing progress bar legend (Used / Free / Autocompact buffer) is preserved because it's tied to the three-segment progress bar visualisation; the new section adds the absolute numbers + tier badge on top of that.

Caught by the tmux e2e test (PR #4168 ci-monitor follow-up). Pre-fix the assertion 'Compaction thresholds' missed completely from the TUI; post-fix the new section renders correctly for fresh and live sessions on 1M / 200K / 128K windows.
* fix(core,cli): address PR #4168 review batch 4 Behavior fixes: - MAX_TOKENS truncation guard now returns COMPRESSION_FAILED_EMPTY_SUMMARY instead of NOOP so the consecutive-failure breaker actually trips after repeated max-length summaries (R1.1). - Reactive overflow failure increments consecutiveFailures by 1 instead of latching to MAX in one shot, so a transient network blip doesn't permanently disable auto-compaction. The hard-tier rescue resets the counter, which remains the designated recovery path (R1.2). - /context current-tier classification uses rawOverhead (system + tools + memory + skills) as the tier input when API data is not yet available, rather than 0 — large inherited contexts no longer silently show 'safe' (R2.2). Performance: - sendMessageStream computes effectiveTokens ONCE and passes it through TryCompressOptions.precomputedEffectiveTokens, so the cheap-gate inside service.compress doesn't redo the estimation. Also fixes the imageTokenEstimate inconsistency between the rescue and cheap-gate paths (R1.3 + R1.4). - Steady-state path (lastPromptTokenCount > 0) skips the costly getHistory(true) clone — estimatePromptTokens only needs the user message in that branch. Code hygiene: - BYTES_PER_TOKEN → CHARS_PER_TOKEN (inputs are char counts, not byte counts; CJK text would mislead under the old name) (R3.1). - Drop dead getContextUsagePercent helper + index re-export — no callers in source after the threshold rewire (R1.5). - Add a comment on estimatePromptTokens' first-send fallback documenting the ~15-20K under-estimate (system prompt + tools + skills) and that reactive overflow is the safety net (R3.3). Tests: - New CLI ContextUsage.test.tsx exercises the React renderer for the three-tier section: section presence, ▶ marker placement per tier, current-tier label coloring (R1.6). - New chatCompressionService.test.ts case pins that a stale contextPercentageThreshold: 0 value in user settings no longer short-circuits compaction (R2.1). 
- New tokenEstimation.test.ts case covers functionResponse (distinct nested-parts branch from functionCall) (R3.5).
- New geminiChat.test.ts integration test exercises the real ChatCompressionService — not a mock — for the first-send-after-inherited-history scenario where lastPromptTokenCount=0 and only the full-history estimate can cross the auto threshold (R3.4).

Declined: R3.2 (change `>=` to `>` on the MAX_TOKENS guard). The current operator catches the at-cap case as suspicious, which is intentional — landing exactly at the output cap is far more likely truncation than clean stop given p99.99 ≈ 17K. With R1.1 in place, persistent truncations trip the breaker after MAX_CONSECUTIVE_FAILURES so the worst case is bounded.

* fix(core,cli): address PR #4168 review batch 5

- R5.1: tighten /context tier comment + TODO. The rawOverhead-based fix doesn't cover `--continue` restores with many history messages (since rawOverhead excludes messagesTokens). UI may still show 'safe' for one render until the first send. Documented inline and added a TODO to plumb chat history into collectContextData for same-source-of-truth as the cheap-gate.
- R5.2a: add TODO(finish_reason) at the truncation guard. The `>= cap` heuristic false-positives on legitimate at-cap summaries; the proper signal is finish_reason which runSideQuery doesn't surface today.
- R5.2b: split telemetry — new CompressionStatus.COMPRESSION_FAILED_OUTPUT_TRUNCATED enum value. Distinct from EMPTY_SUMMARY so logs/telemetry can tell prompt-quality failures (tune prompt / splitter) from capacity failures (raise cap / shrink splitter input). isCompressionFailureStatus() treats both as failures so the breaker behavior is unchanged.
- R5.3: expand consecutiveFailures JSDoc to clarify it tracks "non-force, non-hard-rescue consecutive failures" — hard-rescue resets the counter and force=true skips increments, so the counter is the "regular path" health signal only; reactive overflow is the real safety net for the force-only paths.
- R5.4: document the CompressOptions field rename (hasFailedCompressionAttempt: boolean → consecutiveFailures: number) as an SDK breaking change in the design doc with migration guide.

* fix(core): disambiguate hard-rescue from manual /compress orphan-strip

Self-review (dual reviewer / pr-triage round 1) caught a correctness regression in the hard-rescue path: `sendMessageStream` calls `tryCompress(force=true)` from inside the pre-push window when `effectiveTokens >= hard`. The service's orphan-strip predicate at `chatCompressionService.ts:426-429` gated on `force` alone, which conflated two distinct call shapes:
- manual `/compress` (force=true, trigger='manual'): user-initiated between turns; trailing model funcCall IS orphaned because no funcResponse is coming
- hard-rescue (force=true, trigger='auto'): automatic mid-turn; trailing model funcCall is ACTIVE because its matching funcResponse is sitting in the pending `userContent` waiting to be pushed

The strip fired for both, so a hard-rescue triggered mid tool-use loop would drop the active funcCall. After compression returned and `userContent` (the funcResponse) was pushed, the next API request carried tool_result with no matching tool_use → provider validation error.

The in-code comment at L422-424 already documented this exact constraint for the auto-compress case (`force=false`), but reusing `force=true` for hard-rescue silently violated the same constraint.

Fix:
- Gate `hasOrphanedFuncCall` on `compactTrigger === 'manual'` instead of `force`. The trigger field already disambiguates intent.
- `sendMessageStream` hard-rescue now passes `trigger: 'auto'` explicitly (without it, `force=true` defaults to `trigger='manual'` via the `?? (force ? 'manual' : 'auto')` resolver).

Sibling audit for "force=true non-manual callsites":
- `GeminiClient.tryCompressChat` (manual /compress): correct — manual
- `sendMessageStream` hard-rescue: fixed in this commit
- `sendMessageStream` reactive overflow catch: already passes trigger='auto'; runs AFTER API call (userContent in history), so if it observes a trailing funcCall it IS orphaned but findCompressSplitPoint handles the case without needing the strip

RED-first regression test added: `preserves trailing model+funcCall under hard-rescue (force=true + trigger=auto)` in `chatCompressionService.test.ts`. Failed against pre-fix code (the strip dropped the funcCall); passes against the fix.

Adjacent fixes from the same triage round:
- `docs/users/configuration/settings.md`: the `chatCompression.contextPercentageThreshold` row still said "use 0 to disable compression entirely" — code has ignored the value since the removal commit. Marked the row REMOVED with migration guidance pointing at the design doc.
- `packages/core/src/config/config.ts`: the deprecation warning now tells users how to silence it (remove the key) and where to read current behavior, instead of just announcing the removal.
- `docs/design/auto-compaction-threshold-redesign.md`: closed Open Question 2 (small-window hard/auto collapse) — decision is to NOT annotate `/context`, with rationale on file.

Tests: 2395 core tests passing, typecheck clean.

* docs(core): fix tier-collapse direction in auto-compaction design doc

Self-review on the 50bac97 commit caught a direction error in the M2a Open Question 2 closure note: said `currentTier` skips `'hard'` and goes to `'auto'` on collapsed windows, which is backwards.
`contextCommand.ts:43-44` checks `tokens >= thresholds.hard` first (no `hard > auto` guard — that fix lives in a separate follow-up), so when `hard === auto` the `'hard'` branch matches first and the `'auto'` band is the empty one. Updated the rationale to describe the actual collapse direction and cite the source-of-truth file:line. Conclusion of the open question (don't annotate `/context`) is unchanged — only the explanation is corrected.

* refactor(core): extract shared in-flight funcCall fixture in compression tests

The auto-compress and hard-rescue tests for "trailing funcCall is active, not orphaned" shared a byte-identical 4-message history and mock setup. Pull both into setupInFlightFuncCallFixture() inside the describe block so each test only contains the scenario name, the compress() call shape, and its own assertions. Net -29 LOC, no behavior change.

* fix(core,cli): address PR #4345 round-2 review feedback

- geminiChat: remove pre-call consecutiveFailures reset in hard-rescue. force=true already bypasses the breaker check in chatCompressionService; the pre-reset was redundant on success (post-call L614 already handles it) and *broke* the breaker on failure paths — hard-rescue failures don't increment via tryCompress (force=true skips that branch), only the reactive overflow path at L992 explicitly increments. With the pre-reset the counter oscillated 0↔1 every send and MAX_CONSECUTIVE_FAILURES=3 was unreachable. Wrote a RED test asserting the forwarded counter is the latched value, not zero; the test failed against the old code and passes with the reset removed.
- geminiChat: log hard-tier-rescue triggers via debugLogger.warn including effectiveTokens, hard, and the current consecutiveFailures so operators debugging "compaction stopped working" have a breadcrumb.
- chatCompressionService: clamp effectiveWindow to >= 0 in computeThresholds so the value surfaced in /context stays meaningful for tiny windows (window < SUMMARY_RESERVE).
  auto/warn/hard outputs are unaffected because each is Math.max(proportional, absolute) and the proportional branch dominates whenever the absolute branch goes negative.
- turn.ts: rewrite COMPRESSION_FAILED_OUTPUT_TRUNCATED docstring. Drop the misleading "compression succeeded" framing (the summary is dropped and isCompressionFailureStatus returns true) and reference the full enum name COMPRESSION_FAILED_EMPTY_SUMMARY instead of the abbreviation.
- contextCommand.test.ts: reword the no-API-data-session test comment. collectContextData classifies estimated sessions against rawOverhead; with default fixtures rawOverhead lands in `safe`, but heavy system-prompt / skill / MCP loads can push it into warn/auto/hard.
- design doc Background: prepend a blockquote clarifying the section describes pre-redesign behavior and that the inline file:line references point at code before PR #4345 (which removes them).
- ui/types: replace the duplicated ContextThresholds interface with a type alias to the core's CompactionThresholds. Field-by-field copy in contextCommand.ts becomes a direct spread. ContextUsage.tsx keeps its CompactionThresholds React component name — the alias avoids the collision a direct import would have caused.
- contextCommand: interpolate the actual reserve value into the "(window − 20K reserve)" annotation so SUMMARY_RESERVE retuning doesn't leave the text stale.

* fix(core): address PR #4345 round-3 + round-4 review feedback

R3-1: rewrite the stale "Hard-tier rescue resets the counter" comment in the reactive-overflow path. The R2 commit removed the pre-call reset from hard-rescue; the only counter-reset path is now the post-call COMPRESSED branch in tryCompress. Two contradicting comments in the same file would mislead a future maintainer tracing the lifecycle.

R3-2: rewrite the JSDoc on CompactionThresholds.hard.
The "(resets failure counter)" phrasing was true under the pre-R2 design; after R2 the hard threshold force-triggers compaction and bypasses the breaker, but does not reset the counter (which only happens on COMPRESSED success via the post-call branch). The type is consumed by both geminiChat and the CLI UI (via ContextThresholds alias), so the authoritative description had to match the actual contract.

R3-3: add a Step 3 to the hard-rescue regression test. The test title claims "success recovers via the post-call branch" but the original Steps 1-2 only verified the latched counter was forwarded INTO the call. Step 3 follows up with a below-hard send and asserts the forwarded counter is 0 — proving geminiChat.ts:614 ran on the COMPRESSED result.

R3-4: assert effectiveWindow === 0 on the existing extreme-small-window test and add a separate zero-window edge case. The Math.max(0, ...) clamp from R2 was previously unasserted; a regression that removed the clamp would go undetected.

R4-1: forward originalTokenCount on the breaker-NOOP path in chatCompressionService.compress() to match the adjacent threshold-NOOP path (L368-369). Returning {originalTokenCount: 0, newTokenCount: 0} masked "breaker tripped at N tokens" as "empty session" in telemetry dashboards.

R4-2a: add debugLogger.warn at the two consecutiveFailures increment sites (cheap-gate path L586 and reactive-overflow path L955) when the counter reaches MAX_CONSECUTIVE_FAILURES. The breaker is one of the PR's headline safety features but, prior to this round, had zero observability when it tripped. Required importing MAX_CONSECUTIVE_FAILURES into geminiChat.ts.

R4-3: programmatically link tokenEstimation.ts's CHARS_PER_TOKEN to compactionInputSlimming.ts's TOKEN_TO_CHAR_RATIO. Both are 4 today and represent the same generic char/token conversion. Exporting from compactionInputSlimming and aliasing in tokenEstimation eliminates the silent-drift hazard the JSDoc already warned about.
Declined (round-weighted bar at round 4):
- R3-5: debugLogger test for hard-rescue trigger — observability test coverage is overthinking at round 3+; the log is informational.
- R4-2b: expose breaker state in /context — new feature; out of scope.
- R4-4: render test for auto-tier marker — test coverage gap on working code, defer to follow-up PR per round-weighted bar.
- R4-5a: extract makeFakeChat/makeFakeConfig shared factory — pure test refactor at round 4, not a fix.
- R4-5b: direct unit test for precomputedEffectiveTokens — exercised indirectly via hard-rescue path tests in geminiChat.test.ts.
- R4-6: truncation-guard fallback test for missing candidatesTokenCount — code already has a TODO acknowledging the heuristic is imperfect (chatCompressionService.ts:549-553); defer.

* fix(core): address PR #4345 round-5 review feedback

R5-1: assert breaker-NOOP forwards originalTokenCount. R4-1 changed the breaker-NOOP return from `{0, 0}` to `{originalTokenCount, originalTokenCount}` so telemetry can distinguish "breaker tripped at N tokens" from "empty session", but the existing test only checked compressionStatus and newHistory. Now seeds a non-zero originalTokenCount (120K) and asserts both fields forward it.

R5-2: forward originalTokenCount on the empty-history NOOP. This was sibling drift on R4-1 — I fixed the cited breaker-NOOP site but missed the empty-history NOOP. Of 5 NOOP return sites in chatCompressionService, 4 now forward originalTokenCount (breaker, threshold-gate, post-split, min-compression-fraction) and 1 (this one) was still returning `{0, 0}`, breaking the project-wide invariant. Now consistent.

R5-3: replace 10 stale line-number references with semantic anchors. After the R3+R4 push, the line refs in my R2/R3 comments (`geminiChat.ts:614`, `chatCompressionService.ts:339`, `line 992`, `L627`, `line 944`) no longer pointed at their original targets — `geminiChat.ts:614` now points at `setSystemInstruction`'s body, completely unrelated to compaction.
The pattern itself is fragile; semantic phrasing ("the post-call reset in tryCompress's COMPRESSED handler") doesn't drift when lines shift.

347/347 affected core tests passing locally; typecheck clean.

* fix(core): address PR #4345 round-6 review feedback (R6 sweep)

R6-1: rewrite the stale JSDoc bullet on `consecutiveFailures` (the "Hard-tier rescue failures" bullet). The old wording said "the counter is reset to 0 BEFORE the rescue call" — that contradicted R2 which explicitly removed the pre-call reset. Now the bullet matches the actual behavior: counter is NOT pre-reset, force=true bypasses the breaker, post-call COMPRESSED handler resets on success, reactive overflow is the explicit-increment safety net. My R5 stale-comment sweep only grep'd inline `//` comments; this JSDoc on the field declaration slipped through. Re-audited "reset to 0 BEFORE" / "pre-reset" across both packages — single site remaining.

R6-7: assert `passedOpts.trigger === 'auto'` in the hard-rescue test. This field is the orphan-strip safety wire added by the C1 fix (the service's `compactTrigger === 'manual'` check would otherwise strip the trailing active funcCall mid tool-loop). The test asserted force and pendingUserMessage but not the trigger; a refactor dropping the 'auto' from `trigger: shouldForceFromHard ? 'auto' : undefined` would silently break orphan-strip safety. Now regression-guarded with a single-line expect.

164/164 affected core tests passing locally.
Declined per round-weighted bar (round 6 defaults Suggestion / Test coverage / Style to overthinking):
- R6-2/3/6: test-coverage gaps on working code — defer to follow-up
- R6-4: redundant truthy guard on always-set fields — style nit
- R6-5: text-vs-UI inconsistency on /context — existing test enforces current behavior; treat as design decision (offer follow-up if reviewer escalates)
- R6-8 (tipRegistry small-window context-high): explicitly closed in design doc's Open Question 2 — small windows have empty context-high band by design; UI work is out-of-scope for this PR
- R6-9: wasted clone on rare fallback path — Suggestion-level perf
- R6-10 (CompressionMessage missing case): file not in this PR's diff; reviewer themselves proposed it as follow-up
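The orphan-strip fix described in the commits above can be sketched as follows. This is a minimal illustration, not the service's code: the type and function names are hypothetical, and only the `?? (force ? 'manual' : 'auto')` resolver and the `compactTrigger === 'manual'` gate are taken from the commit message.

```typescript
// Hypothetical names; only the resolver expression and the 'manual' gate
// are quoted from the commit message.
type CompactTrigger = 'manual' | 'auto';

interface CompressCallShape {
  force: boolean;
  trigger?: CompactTrigger;
}

// Default resolver: an omitted trigger follows `force`.
function resolveTrigger(opts: CompressCallShape): CompactTrigger {
  return opts.trigger ?? (opts.force ? 'manual' : 'auto');
}

// Pre-fix predicate: gated on `force` alone, so hard-rescue looked like /compress.
function shouldStripOld(opts: CompressCallShape, endsWithFuncCall: boolean): boolean {
  return opts.force && endsWithFuncCall;
}

// Post-fix predicate: only a manual trigger treats the trailing funcCall as orphaned.
function shouldStripNew(opts: CompressCallShape, endsWithFuncCall: boolean): boolean {
  return resolveTrigger(opts) === 'manual' && endsWithFuncCall;
}

const manualCompress: CompressCallShape = { force: true };
const hardRescue: CompressCallShape = { force: true, trigger: 'auto' };
const autoCompress: CompressCallShape = { force: false };
```

With these call shapes, `shouldStripNew(hardRescue, true)` is false while `shouldStripOld(hardRescue, true)` is true, which is the regression the commit describes. Dropping the explicit `trigger: 'auto'` from the hard-rescue call would make the resolver fall back to `'manual'` and bring the strip back.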

LaZzyMan force-pushed the lazzy/trusting-grothendieck-8a8501 branch from 1dcef8c to d270af0 May 15, 2026 08:09

LaZzyMan marked this pull request as ready for review May 15, 2026 09:59

This was referenced May 15, 2026

OOM in long sessions: V8 heap pressure can exceed limit before token-based compaction runs #4185

Closed

fix(core): add heap-pressure auto-compaction safety net #4186

Merged

wenshao reviewed May 16, 2026

This was referenced May 17, 2026

📊 AI CLI Tools Community Daily Report 2026-05-17 zx0828/big_model_radar#60

Open

📊 AI CLI Tools Community Daily Report 2026-05-17 gsscsd/big_model_radar#356

Open

wenshao reviewed May 17, 2026

LaZzyMan force-pushed the lazzy/trusting-grothendieck-8a8501 branch from 3786355 to 181393c May 18, 2026 02:55

wenshao reviewed May 18, 2026

wenshao mentioned this pull request May 18, 2026

feat(serve): auth device-flow route (#4175 Wave 4 PR 21) #4255

Merged

18 tasks

wenshao requested changes May 18, 2026

LaZzyMan force-pushed the lazzy/trusting-grothendieck-8a8501 branch from 9cbcfd2 to 27200f7 May 20, 2026 02:34

wenshao requested changes May 20, 2026

wenshao reviewed May 20, 2026

Comment thread docs/users/configuration/settings.md Outdated

LaZzyMan force-pushed the lazzy/trusting-grothendieck-8a8501 branch from 27200f7 to e861d07 May 20, 2026 03:11

wenshao requested changes May 20, 2026

github-actions Bot mentioned this pull request May 20, 2026

📊 AI CLI Tools Community Daily Report 2026-05-20 ivanweng2077/big_model_radar#64

Open

wenshao requested changes May 20, 2026

LaZzyMan mentioned this pull request May 20, 2026

feat(core)!: redesign auto-compaction thresholds with three-tier ladder #4345

Merged

4 tasks

LaZzyMan closed this May 20, 2026

yiliang114 mentioned this pull request May 20, 2026

fix(core): replace structuredClone with shallow copy to prevent OOM in long sessions #4286

Merged

LaZzyMan mentioned this pull request May 25, 2026

Follow-up from PR #4386 R5 review: 3 polish items (warning width chrome, audit-log callId, ACP audit-log sibling) #4509

Closed

This was referenced May 26, 2026

feat(worktree): Phase D — startup --worktree flag + symlinkDirectories + PR refs #4381

Merged

fix(core): truncate model-facing tool output #4520

Open

-        info: {
+      config
+        .getDebugLogger()
+        .warn(
+          `[chat-compression] summary output reached the ` +
+            `COMPACT_MAX_OUTPUT_TOKENS cap (${COMPACT_MAX_OUTPUT_TOKENS}); ` +
+            `dropping potentially-truncated result.`,
+        );
+      return {
+        newHistory: null,
+        info: {
+          originalTokenCount,
+          newTokenCount: originalTokenCount,
+          compressionStatus: CompressionStatus.COMPRESSION_FAILED_EMPTY_SUMMARY,
+        },
+      };

-	self.consecutiveFailures = MAX_CONSECUTIVE_FAILURES;
+	self.consecutiveFailures += 1;

-            chat.getHistory(true),
+    const thresholds = computeThresholds(contextLimit);
+    const effectiveTokens =
+      originalTokenCount > 0
+        ? estimateContentTokens([pendingUserMessage ?? []])
+        : estimatePromptTokens(
+            chat.getHistory(true),
+            pendingUserMessage,
+            slimmingConfig.imageTokenEstimate,
+          );

-export const BYTES_PER_TOKEN = 4;
+/**
+ * Average characters-per-token for char-based token estimation.
+ * Matches the inverse of TOKEN_TO_CHAR_RATIO in compactionInputSlimming.ts.
+ */
+export const CHARS_PER_TOKEN = 4;
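A minimal sketch of a char/4 estimator in the spirit of the comment above, assuming a simplified content shape. The interfaces, the function name, and the default `imageTokenEstimate` of 1,000 are hypothetical; only `CHARS_PER_TOKEN = 4` and the idea of counting characters (not bytes) come from the PR.

```typescript
const CHARS_PER_TOKEN = 4; // value from the PR

// Hypothetical, simplified stand-ins for the real Content/Part types.
interface SketchPart {
  text?: string;
  functionCall?: unknown;
  functionResponse?: unknown;
  inlineData?: unknown;
}
interface SketchContent {
  role: string;
  parts: SketchPart[];
}

// Counts UTF-16 code units of text and JSON payloads, then divides by 4.
// Inline data (e.g. images) is charged a flat, assumed per-item estimate.
function estimateContentTokensSketch(
  contents: SketchContent[],
  imageTokenEstimate = 1_000, // hypothetical default
): number {
  let chars = 0;
  let imageTokens = 0;
  for (const content of contents) {
    for (const part of content.parts) {
      if (typeof part.text === 'string') chars += part.text.length;
      if (part.functionCall !== undefined) {
        chars += (JSON.stringify(part.functionCall) ?? '').length;
      }
      if (part.functionResponse !== undefined) {
        chars += (JSON.stringify(part.functionResponse) ?? '').length;
      }
      if (part.inlineData !== undefined) imageTokens += imageTokenEstimate;
    }
  }
  return Math.ceil(chars / CHARS_PER_TOKEN) + imageTokens;
}
```

Because the input is a character count, a four-character CJK string is charged one token here even though it occupies twelve bytes in UTF-8, which is the naming concern behind the BYTES_PER_TOKEN rename.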

-	compressionOutputTokenCount >= COMPACT_MAX_OUTPUT_TOKENS
+	compressionOutputTokenCount > COMPACT_MAX_OUTPUT_TOKENS
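The difference between the two comparisons above only shows at exactly the cap. A small sketch, with hypothetical helper names and the 20,000 cap taken from the PR:

```typescript
const COMPACT_MAX_OUTPUT_TOKENS = 20_000; // value from the PR

// Hypothetical helper mirroring the `>=` guard: an at-cap summary is treated
// as probably truncated.
function looksTruncatedInclusive(outputTokens: number): boolean {
  return outputTokens >= COMPACT_MAX_OUTPUT_TOKENS;
}

// Hypothetical helper mirroring the `>` alternative: an at-cap summary passes.
function looksTruncatedStrict(outputTokens: number): boolean {
  return outputTokens > COMPACT_MAX_OUTPUT_TOKENS;
}
```

If the provider never emits more than the cap, the strict form would never fire, so the inclusive form is the one that can actually detect a summary that stopped at the limit. The commit text notes the inclusive form can also flag a legitimate at-cap summary, which is why it carries a TODO about `finish_reason`.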

-  return estimateContentTokens([...history, userMessage], imageTokenEstimate);
+  // Fallback: estimate from history + user message. Note this underestimates
+  // by ~15-20K tokens (system prompt, tool definitions, skills) — the reactive
+  // overflow handler is the safety net if the hard-tier rescue misses.
+  return estimateContentTokens([...history, userMessage], imageTokenEstimate);

-  it('estimates functionCall (json-dense) contributes some positive count', () => {
+  it('estimates functionResponse (json-dense) contributes some positive count', () => {
+    const c: Content = {
+      role: 'user',
+      parts: [{ functionResponse: { name: 'tool', response: { result: 'data'.repeat(100) } } }],
+    };
+    const result = estimateContentTokens([c]);
+    expect(result).toBeGreaterThan(0);
+  });
+  it('estimates functionCall (json-dense) contributes some positive count', () => {


		export type ContextTier = 'safe' | 'warn' | 'auto' | 'hard';

		export interface ContextThresholds {
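The tier-collapse direction discussed in the commit messages can be illustrated with a small classifier. This is a sketch under assumptions: the function name and the threshold numbers are hypothetical, and only the check order (`tokens >= thresholds.hard` first) is taken from the commit text.

```typescript
type ContextTier = 'safe' | 'warn' | 'auto' | 'hard';

interface ThresholdTriple {
  warn: number;
  auto: number;
  hard: number;
}

// Checks the highest tier first, as the commit message says contextCommand.ts does.
function classifyTier(tokens: number, t: ThresholdTriple): ContextTier {
  if (tokens >= t.hard) return 'hard';
  if (tokens >= t.auto) return 'auto';
  if (tokens >= t.warn) return 'warn';
  return 'safe';
}

// Hypothetical small-window thresholds where hard has collapsed onto auto.
const collapsed: ThresholdTriple = { warn: 6_000, auto: 7_000, hard: 7_000 };
```

With `hard === auto`, `classifyTier(7_000, collapsed)` returns `'hard'`, and no token count maps to `'auto'`: the `'auto'` band is the empty one, matching the corrected rationale.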

-                } catch (compressionError) {
+                } catch (compressionError) {
+                  if (
+                    params.config?.abortSignal?.aborted ||
+                    isAbortError(compressionError)
+                  ) {
+                    throw compressionError;
+                  }
+                  debugLogger.warn('Reactive compression failed.', compressionError);
+                  self.consecutiveFailures += 1;
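The counter lifecycle that the later review rounds settle on can be sketched as a small state holder. The class and method names are hypothetical; `MAX_CONSECUTIVE_FAILURES = 3`, the force bypass, the reset on COMPRESSED, and the explicit increment on reactive overflow are taken from the commit messages.

```typescript
const MAX_CONSECUTIVE_FAILURES = 3; // value stated in the commit message

type SketchOutcome = 'COMPRESSED' | 'FAILED' | 'NOOP';

// Hypothetical holder for the breaker state kept on the chat object.
class CompactionBreakerSketch {
  consecutiveFailures = 0;

  // force=true bypasses the breaker check; the regular path is blocked once tripped.
  shouldAttempt(force: boolean): boolean {
    return force || this.consecutiveFailures < MAX_CONSECUTIVE_FAILURES;
  }

  // Post-call bookkeeping: success resets, non-forced failure increments,
  // forced failure leaves the counter untouched.
  record(outcome: SketchOutcome, force: boolean): void {
    if (outcome === 'COMPRESSED') {
      this.consecutiveFailures = 0;
    } else if (outcome === 'FAILED' && !force) {
      this.consecutiveFailures += 1;
    }
  }

  // Reactive overflow failure: increment by one instead of latching to MAX.
  recordReactiveOverflowFailure(): void {
    this.consecutiveFailures += 1;
  }
}
```

Three regular failures trip the breaker for non-forced calls, while a forced hard-rescue can still run and, if it returns COMPRESSED, brings the counter back to zero.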

	'Context near hard limit — auto-compact will force on next send. Consider /clear if you want to start fresh.',
	'Context near hard limit — auto-compact was forced on this turn. Consider /clear if context remains tight.',

	'[qwen-code] chatCompression.contextPercentageThreshold has been removed ' +
	'Note: the "contextPercentageThreshold" setting is removed. ' +
	'To disable auto-compaction, set "chatCompression": { "disabled": true } instead.'

-      const autoCompactionDisabled =
+// Add a test in the "sendMessageStream hard-tier rescue" describe block:
+// set mockConfig.getChatCompression().mockReturnValue({ disabled: true }),
+// seed lastPromptTokenCount above hard threshold,
+// assert compressSpy.mock.calls[0][1].force is false

-      return {
+if (chatCompressionSettings?.disabled && !force && !bypassTokenThreshold) {
+  debugLogger.debug(
+    '[compaction] NOOP: chatCompression.disabled is true',
+  );
+  return { ... };
+}


	export const COMPACT_MAX_OUTPUT_TOKENS = 20_000;
	export const SUMMARY_RESERVE = COMPACT_MAX_OUTPUT_TOKENS;

-  readonly auto: number;
+if (!Number.isFinite(window) || window <= 0) {
+  const infinite = Number.MAX_SAFE_INTEGER;
+  return { warn: infinite, auto: infinite, hard: infinite };
+}
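A rough sketch of how a three-tier ladder could combine a proportional fallback with an absolute reservation, including the non-finite guard above and the `effectiveWindow` clamp. The per-tier percentages and buffers are placeholders, not the PR's constants; the auto buffer of 13,000 was picked only because it reproduces the roughly 267K recovery on a 1M window that the PR summary cites. `SUMMARY_RESERVE = 20_000` and the `Math.max(proportional, absolute)` shape are from the source.

```typescript
const SUMMARY_RESERVE = 20_000; // from the PR (equals COMPACT_MAX_OUTPUT_TOKENS)

interface ThresholdsSketch {
  effectiveWindow: number;
  warn: number;
  auto: number;
  hard: number;
}

// Hypothetical per-tier settings: proportional fallback (percent of window)
// and absolute buffer subtracted from the window after the summary reserve.
const TIER_SETTINGS = {
  warn: { percent: 60, buffer: 33_000 },
  auto: { percent: 70, buffer: 13_000 },
  hard: { percent: 70, buffer: 3_000 },
} as const;

function tierThreshold(window: number, percent: number, buffer: number): number {
  const proportional = Math.floor((window * percent) / 100);
  const absolute = window - SUMMARY_RESERVE - buffer; // negative on tiny windows
  return Math.max(proportional, absolute);
}

function computeThresholdsSketch(window: number): ThresholdsSketch {
  if (!Number.isFinite(window) || window <= 0) {
    const infinite = Number.MAX_SAFE_INTEGER;
    return { effectiveWindow: 0, warn: infinite, auto: infinite, hard: infinite };
  }
  return {
    // Clamp so tiny windows (window < SUMMARY_RESERVE) still report a sane value.
    effectiveWindow: Math.max(0, window - SUMMARY_RESERVE),
    warn: tierThreshold(window, TIER_SETTINGS.warn.percent, TIER_SETTINGS.warn.buffer),
    auto: tierThreshold(window, TIER_SETTINGS.auto.percent, TIER_SETTINGS.auto.buffer),
    hard: tierThreshold(window, TIER_SETTINGS.hard.percent, TIER_SETTINGS.hard.buffer),
  };
}
```

Under these placeholder numbers a 1,000,000-token window yields `auto = 967_000`, which is 267,000 above the old 70% line, while a 10,000-token window falls back to the proportional branch and `hard` equals `auto`.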

- * before reactive recovery catches the overflow.
+export function estimateSteadyStateTokens(
+  lastPromptTokenCount: number,
+  lastCandidatesTokenCount: number,
+  userMessage: Content,
+  imageTokenEstimate?: number,
+): number { ... }
+export function estimateColdStartTokens(
+  history: Content[],
+  userMessage: Content,
+  imageTokenEstimate?: number,
+): number { ... }

	'"set threshold to 0 to disable" escape hatch is gone. If you need ' +
	'Note: to disable auto-compaction, set chatCompression.disabled to true instead. ' +
	'If you need to retain full history for other reasons, use /clear between conversations.',

		@@ -1375,6 +1667,23 @@ export class GeminiChat {
		// Always update the per-chat counter so this chat (including

	// Always update the per-chat counter so this chat (including
	usageMetadata.promptTokenCount ?? usageMetadata.totalTokenCount;

-          // payload (NaN, Infinity, non-number) coerces to 0 instead
+        const rawPromptTokens =
+          usageMetadata.promptTokenCount ?? usageMetadata.totalTokenCount;
+        if (rawPromptTokens) {
+          this.lastPromptTokenCount = Number.isFinite(rawPromptTokens)
+            ? (rawPromptTokens as number)
+            : this.lastPromptTokenCount;
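The fragment above keeps the previous counter when the reported value is unusable. A self-contained sketch of that idea, with a hypothetical function name and a loosely typed usage object; treating a reported 0 as "keep the previous value" mirrors the truthy check in the fragment and is otherwise an assumption.

```typescript
// Hypothetical, loosely typed view of the usage metadata fields used here.
interface UsageSketch {
  promptTokenCount?: unknown;
  totalTokenCount?: unknown;
}

// Returns the new prompt-token counter, falling back to the previous value
// when the payload is missing, zero, non-numeric, NaN, or infinite.
function nextPromptTokenCount(previous: number, usage: UsageSketch): number {
  const raw = usage.promptTokenCount ?? usage.totalTokenCount;
  if (typeof raw === 'number' && Number.isFinite(raw) && raw > 0) {
    return raw;
  }
  return previous;
}
```

For example, `nextPromptTokenCount(1_000, { promptTokenCount: Number.NaN })` keeps 1,000, so a malformed payload cannot silently zero the counter that the threshold gates depend on.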

Conversation

LaZzyMan commented May 15, 2026 • edited

Summary

Validation

Scope / Risk

Testing Matrix

Design references

github-actions Bot commented May 15, 2026

📋 Review Summary

🔍 General Feedback

🎯 Specific Feedback

🟡 High Priority Issues

🟢 Medium Priority Issues

🔵 Low Priority Suggestions

✅ Highlights

LaZzyMan commented May 15, 2026

Review response — commit 6ce81e73c

github-actions Bot commented May 15, 2026 • edited

Code Coverage Summary

LaZzyMan commented May 15, 2026

E2E test report

✅ Real-model E2E (most important)

✅ TUI /context three-tier threshold display

✅ Headless smoke

⚠️ Covered by unit tests only; real-model E2E not run

wenshao left a comment

LaZzyMan commented May 18, 2026

Review batch 4 — commit `181393c49`

Review batch 5 — commit `681b70501`

-export {
+export {
+  computeThresholds,
+  MAX_CONSECUTIVE_FAILURES,
+  SUMMARY_RESERVE,
+  type CompactionThresholds,
+} from './services/chatCompressionService.js';