Skip to content

fix(core): F2 cleanup PR B — self-heal observability (W133-a + W134)#4460

Merged
wenshao merged 3 commits into
daemon_mode_b_mainfrom
f2-cleanup-pr-b-self-heal-observability
May 23, 2026
Merged

fix(core): F2 cleanup PR B — self-heal observability (W133-a + W134)#4460
wenshao merged 3 commits into
daemon_mode_b_mainfrom
f2-cleanup-pr-b-self-heal-observability

Conversation

@doudouOUC

@doudouOUC doudouOUC commented May 23, 2026

Copy link
Copy Markdown
Collaborator

Summary

PR B of the F2 (#4336) post-merge cleanup bucket — issue #4175 item 7. Two reviewer-filed observability gaps adopted; one (W93) source-verified non-repro and skipped.

  • W133-aMcpClient.onerror's upstream error (EPIPE / OAuth 401 / server-crash) is now threaded through to the silent-drop 'failed' event's lastError string, so operators triaging a 'failed' event see the actual cause without grepping --debug logs out of band. Preserves the literal "silent transport drop" substring for log-grep backward compat.
  • W134void this.sweepAndDisconnect('silent_drop') no longer swallows pid-sweep failures. sweepAndDisconnect now returns a structured SweepResult; the silent-drop chain inspects it and emits a structured warn log when either listDescendantPids threw OR sigtermPids killed fewer descendants than discovered. Surfaces orphan-process pressure to operators without inflating PR scope (no new SSE event or SDK reducer state — deferred to W134-followup if maintainers want metrics).
  • W93 — declined; source-verified that the W1 fix in F2 commit 6 (mcp-transport-pool.ts:1031-1035) already added entry.forceShutdown('manual') to spawnEntry's catch path, which runs the full cleanup table (status listener removal, timer clear, subscriber detach, sweep+disconnect, onClosed eviction). Post-W1 the "partial cleanup" claim doesn't reproduce.

Changes

File Δ What
packages/core/src/tools/mcp-client.ts +40 / 0 lastTransportError?: Error private field; populate in onerror BEFORE updateStatus(DISCONNECTED) cascade; reset at top of connect(); new getLastTransportError() getter
packages/core/src/tools/mcp-pool-entry.ts +99 / -2 SweepResult interface (file-private); sweepAndDisconnect returns it instead of void; W120 silent-drop block reads getLastTransportError() and appends : <message> to lastError; silent-drop fire-and-forget chain inspects result and debugLogger.warns on pidSweepError OR partial-signal
packages/core/src/tools/mcp-transport-pool.test.ts +242 / -3 4 new tests + 2 module-mocks (pid-descendants.js, debugLogger.js singleton stub)

Test plan

  • npx vitest run --root packages/core src/tools/mcp-transport-pool.test.ts32/32 pass (28 pre-existing + 4 new)
  • npx vitest run --root packages/core src/tools/mcp-client.test.ts46/46 pass (no behavior change beyond the new private field)
  • npx tsc --noEmit -p packages/core/tsconfig.json — clean
  • npx tsc --noEmit -p packages/cli/tsconfig.json — clean (pre-existing server.ts errors on baseline unchanged)
  • npx eslint --max-warnings 0 clean on the 3 touched files

Behavior changes (zero wire/SDK breaking):

  • PoolEvent.failed.lastError carries the upstream error message in a : <msg> suffix when McpClient.onerror captured one before the W120 listener fired. Pre-fix marker substring "silent transport drop" preserved.
  • sweepAndDisconnect private return type changes from Promise<void> to Promise<SweepResult>. All 4 callers internal to mcp-pool-entry.ts; forceShutdown / doRestart ignore the return (JS implicit-void at await).
  • New debugLogger.warn line on silent-drop sweep observability — surface only at --debug warn+, no other side effect.

Audit notes (covered in plan, captured here for reviewers):

  • The second mcpClient.onerror at mcp-client.ts:706 lives in the legacy non-class connectAndDiscover flow on a raw SDK Client — pool mode never reaches it.
  • Manager-side readResource self-heal at mcp-client-manager.ts:2390 checks getStatus(), doesn't read lastError. No interaction.
  • Silent drop emits 'failed' directly per JSDoc at mcp-pool-events.ts:40-41; no 'disconnected' precedes it. Targeting the 'failed' emit is correct.

Follow-ups (out of scope)

  • W133-c (#4175 item 7 PR C): add source: 'reconnect_budget' | 'silent_drop' discriminator field to PoolEvent.failed. Requires SDK semver minor bump.
  • W134 metrics surface: if maintainers want a daemon-wide orphan-process pressure metric (counter on GET /workspace/mcp or new mcp_sweep_warning SSE event), file as W134-followup. This PR's structured warn log is the minimum viable signal.

🤖 Generated with Qwen Code


📝 描述准确性更新(2026-05-31,作者自查)

注明:Changes 表行数偏旧(如 mcp-pool-entry 实为 +113/-5)。

W93 declined as already satisfied by W1 fix in #4336 commit 6
(spawnEntry's catch already calls forceShutdown which runs the full
cleanup table — listener removal, timer clear, subscriber detach,
sweep+disconnect, onClosed eviction). Source-verified non-repro.

W133-a: McpClient.onerror now captures the error in a private
`lastTransportError` field (reset at each connect()); the W120
silent-drop block at mcp-pool-entry.ts:346 reads it via the new
`getLastTransportError()` getter and appends `: <error.message>` to
the lastError string on the emitted 'failed' event. Preserves the
literal "silent transport drop" prefix invariant for log-grep
backward compat — pre-fix marker stays a substring.

W134: sweepAndDisconnect now returns SweepResult instead of void —
{ pidSweepError?, disconnectError?, descendantsFound?,
descendantsSignaled? }. The silent-drop fire-and-forget caller chains
to inspect the result and emits a structured warn log when either
pid-sweep threw OR sigtermPids partially signaled (signaled < found)
— surfaces orphan-process pressure without inflating PR scope (no
new SSE event or SDK reducer state; deferred to W134-followup if
maintainers want metrics).

forceShutdown / doRestart sweep callers ignore the return value (JS
implicit-void at await sites preserves behavior).

4 new tests in mcp-transport-pool.test.ts covering W133-a happy path
+ fallback (no prior onerror) + W134 pidSweepError + W134
partial-signal failure modes. Module-mocks pid-descendants.js for
controllable sweep behavior, and debugLogger.js to observe warn
calls (production logger is session-gated and a no-op in tests).
Singleton-stub debugLogger mock so production module-load
`createDebugLogger('McpPool:Entry')` and the test's retrieval get
the same vi.fn instances.

Verification:
- tsc clean: packages/core, packages/cli (server.ts pre-existing
  errors unchanged)
- F2 transport-pool: 32/32 pass (28 pre-existing + 4 new)
- mcp-client: 46/46 pass
- eslint --max-warnings 0 clean on 3 touched files

Part of #4175 #4336 follow-up bucket.
@github-actions

Copy link
Copy Markdown
Contributor

📋 Review Summary

This PR (W133-a + W134) addresses two observability gaps in MCP server error handling by threading upstream transport errors through to the 'failed' event's lastError field and surfacing orphan-process pressure via structured warn logs. The implementation is surgical, well-tested, and preserves backward compatibility. All test cases pass and the changes are non-breaking.

🔍 General Feedback

  • Code quality is excellent: Comments are thorough with cross-references to issue numbers, commit hashes, and line numbers. The JSDoc annotations explain the why behind each change.
  • Backward compatibility preserved: The synthetic "silent transport drop" substring is retained for existing log-grep tooling.
  • Test coverage is comprehensive: 4 new tests cover both W133-a (error threading) and W134 (sweep observability) scenarios, including edge cases like partial signals and fallback behavior.
  • TypeScript discipline: The new SweepResult interface is appropriately scoped as file-private (interface not export interface), reflecting its internal-only usage.
  • Positive pattern: The PR demonstrates good judgment in scoping—deferring W93 as already-fixed and keeping W134's warn-log-only approach (no new SSE events or SDK state changes).

🎯 Specific Feedback

🔵 Low

  • File: mcp-client.ts:130-137 — The comment for setting lastTransportError mentions it happens "BEFORE the synchronous updateStatus(DISCONNECTED) cascade" to guarantee population before the W120 listener fires. Consider adding a brief inline assertion or TypeScript type annotation that makes this ordering more explicit for future maintainers (e.g., a comment noting that updateStatus is synchronous and doesn't yield to the event loop).

  • File: mcp-pool-entry.ts:373-383 — The lastError construction uses a ternary that appends : ${upstreamError.message}. Consider extracting this into a small helper function (e.g., composeLastError(upstreamError?: Error): string) to make the intent clearer and reduce line length. This is a minor readability enhancement.

  • File: mcp-pool-entry.ts:425-463 — The .then((result) => { ... }) chain inside the silent-drop block is quite dense. Consider extracting the warn-log decision logic into a private method (e.g., _logSweepObservabilityIfNeeded(result: SweepResult)) to improve testability and reduce the cognitive load of the main chain. This would also make it easier to add future observability enhancements without bloating this callback.

  • File: mcp-transport-pool.test.ts:54-74 — The mock for debugLogger.js includes a comment explaining the singleton stub pattern. Consider adding a small utility function in the test file (e.g., getDebugLoggerStub()) that encapsulates this pattern, so future test authors don't accidentally reinstantiate a fresh stub and break the singleton invariant.

  • File: mcp-pool-entry.ts:107-136 — The SweepResult interface tracks both pidSweepError and disconnectError, but only pidSweepError is used in the warn-log decision at lines 445-460. The disconnectError field appears unused in the current implementation. If it's intentionally tracked for future use, add a comment noting this. If it's genuinely unused, consider removing it to reduce confusion.

✅ Highlights

  • Excellent test design: The W133-a fallback test (falling back to synthetic-only marker when no upstream error was captured) demonstrates deep understanding of the race condition being defended against.
  • Defensive coding: The sigtermPids partial-signal detection (signaled < found) is a clever way to surface orphan-process pressure without requiring new infrastructure.
  • Clear documentation: The PR description's audit notes (e.g., explaining why readResource self-heal and the legacy connectAndDiscover flow are unaffected) show thorough cross-file analysis.
  • Appropriate scoping: The decision to keep SweepResult internal and not thread it through SSE events demonstrates good judgment about PR scope and follow-up work.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves MCP transport-pool observability in “silent transport drop” scenarios by (1) threading the underlying SDK transport error into the emitted 'failed' event and (2) surfacing pid-sweep anomalies (enumeration failure or partial SIGTERM) via a structured warn log.

Changes:

  • Capture and expose the most recent SDK Client.onerror error in McpClient, and append it to the silent-drop 'failed' event lastError.
  • Make PoolEntry.sweepAndDisconnect() return a structured SweepResult and use it to emit a warn log when pid sweeping fails or only partially signals descendants.
  • Add/extend Vitest coverage for the new observability behavior (including mocks for pid-sweep and debug logging).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
packages/core/src/tools/mcp-client.ts Adds lastTransportError tracking + getter; clears on connect() so silent-drop can include upstream cause in failure events.
packages/core/src/tools/mcp-pool-entry.ts Uses getLastTransportError() when emitting silent-drop 'failed'; changes sweepAndDisconnect() to return SweepResult and logs warnings on sweep anomalies.
packages/core/src/tools/mcp-transport-pool.test.ts Adds targeted tests for error-threading and pid-sweep warning behavior; introduces module mocks to make these assertions possible.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread packages/core/src/tools/mcp-pool-entry.ts Outdated
Comment thread packages/core/src/tools/mcp-pool-entry.ts Outdated
Comment thread packages/core/src/tools/mcp-client.ts
Comment thread packages/core/src/tools/mcp-transport-pool.test.ts
T1 [copilot mcp-pool-entry.ts:116 — stale line ref in SweepResult JSDoc]:
  replaced `mcp-pool-entry.ts:383` with stable method-anchor reference
  to the W120 silent-drop block inside `statusChangeListener`. Line
  numbers drift on every edit; method names don't.

T2 [copilot mcp-pool-entry.ts:453 — `?? 0` ambiguous in warn payload]:
  silent-drop warn log now prints `descendantsFound=unknown` and
  `descendantsSignaled=unknown` when the values are undefined (only
  reachable in the pidSweepError branch — sweep threw before
  assignment). Operators triaging the warn can now distinguish
  "sweep succeeded but found 0 descendants" from "sweep itself
  threw, count is genuinely unmeasured". Locked in via a new
  assertion in the W134 pidSweepError test.

T3 [copilot mcp-client.ts:116 — brittle line refs in lastTransportError
  JSDoc]: replaced `mcp-pool-entry.ts:346` and `mcp-client.ts:130`
  with stable method/block names (the `statusChangeListener` silent-
  drop block; the `client.onerror` arrow inside connect()). Same
  fix applied to the parallel comment in mcp-transport-pool.test.ts:730
  for consistency.

T4 [copilot mcp-transport-pool.test.ts:797 — singleton-stub mock comment
  contradictory]: rewrote the comment to unambiguously describe what
  the mock DOES (factory body runs once; inner arrow returns the same
  object on every call) instead of the prior hypothetical phrasing
  ("Returning a fresh object would have...") which read as a
  description of current behavior at first glance.

All 4 are doc/comment fixes — zero behavior change apart from the
T2 string format ('unknown' instead of '0'). Verified:
- 32/32 mcp-transport-pool.test.ts pass
- tsc clean on packages/core
- eslint --max-warnings 0 clean on 3 touched files
Comment thread packages/core/src/tools/mcp-pool-entry.ts Outdated
…Error field

T5 [wenshao mcp-pool-entry.ts:134 — `disconnectError` is dead data]:
  glm-5.1 review caught that the field was populated when
  `client.disconnect()` threw (line 844) but no consumer ever read
  it — the silent-drop `.then()` handler gated only on
  `pidSweepError` and partial-signal; `forceShutdown` and `doRestart`
  ignore the return; no test asserted on it.

Removed the field from `SweepResult` and the assignment in the
disconnect catch. The pre-existing `debugLogger.error(`client.disconnect
failed for ...`)` inside `sweepAndDisconnect` already gives operators
the signal — adding it to the outer silent-drop warn would have been
duplicate noise. If a future consumer needs to gate logic on disconnect
failures, re-add the field + reader at that point.

Verification: 32/32 mcp-transport-pool.test.ts pass; tsc + eslint
clean on the touched file.

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found. LGTM! ✅ — DeepSeek/deepseek-v4-pro via Qwen Code /review

@doudouOUC doudouOUC requested review from chiga0 and yiliang114 May 23, 2026 13:46
@wenshao

wenshao commented May 23, 2026

Copy link
Copy Markdown
Collaborator

Local maintainer validation — all PR-relevant gates green ✅

Reviewed at head 450e301c8 (against base 57d04786d on daemon_mode_b_main) in a dedicated tmux session (pr4460, 4 windows) under git worktree /.qwen/tmp/review-pr-4460. The PR has 3 author commits — the main F2 cleanup + 2 round fold-ins from copilot/doc review.

Note: This PR targets daemon_mode_b_main, not main. No CI runs exist on this branch (the review-pr jobs are the only checks visible and both show SKIPPED).

Environment

  • macOS 26.4.1 (Darwin 25.4.0 arm64), Node 22.17.0, npm 11.8.0
  • Fresh npm ci (1453 packages)
  • Repo version 0.15.11

Results

Stage Command Result
Install npm ci ✅ exit 0
Build npm run build ✅ exit 0 (only pre-existing curly warnings in vscode-ide-companion)
PR-specific tests npx vitest run src/tools/mcp-transport-pool.test.ts src/tools/mcp-client.test.ts 78/78 (mcp-transport-pool 32 including 4 new, mcp-client 46)
Full packages/core suite cd packages/core && npx vitest run ⚠️ 316/318 files passed / 8696 passed from 87023 failures all pre-existing / environmental (see Triage)

Triage of the 3 core failures (NOT caused by PR 4460)

File Fails Cause Verified
src/skills/skill-manager.test.ts 2 .qwen path fixture bug Reproduces on all .qwen/tmp/ worktrees across 7+ PR validations
src/core/anthropicContentGenerator/anthropicContentGenerator.test.ts 1 Claude Code UA injection (claude-cli/1.2.3 overrides QwenCode/1.2.3) Reproduces on qwen-code-x3/main and all prior PR validations

Zero MCP-related test failures — the only files touched by this PR are clean.

Changes summary

File Δ Purpose
packages/core/src/tools/mcp-client.ts +42 / 0 lastTransportError private field + getLastTransportError() getter; populated in client.onerror before updateStatus(DISCONNECTED) cascades; reset at connect() start
packages/core/src/tools/mcp-pool-entry.ts +113 / -5 SweepResult interface; sweepAndDisconnect returns structured result instead of void; W120 silent-drop block threads lastTransportError into lastError; silent-drop chain inspects sweep result + warns on orphan-process pressure
packages/core/src/tools/mcp-transport-pool.test.ts +250 / 0 4 new tests (W133-a silent-drop transport error threading, W134 sweep-result observability) + pid-descendants / debugLogger module mocks

Behavioral observations

W133-a (transport error threading): Before this PR, when an MCP transport silently drops (EPIPE, OAuth 401, server crash), the pool entry emitted a 'failed' event with lastError: "transport disconnected (silent transport drop)" — generic, no root cause. After this PR, McpClient.onerror captures the upstream error into lastTransportError BEFORE updateStatus(DISCONNECTED) cascades to the pool entry listener. The W120 silent-drop block reads it via getLastTransportError() and appends : <error.message> to the lastError string. Successful reconnects clear the stale error via the reset in connect().

Key design detail: the error is captured BEFORE the updateStatus call because updateStatus synchronously fires the pool entry's statusChangeListener, which reads lastTransportError. This ordering is explicitly called out in both the implementation comments and the copilot review feedback.

W134 (orphan-process observability): sweepAndDisconnect now returns a SweepResult with pidSweepError, descendantsFound, and descendantsSignaled. The silent-drop fire-and-forget chain inspects this and emits a structured debugLogger.warn when either listDescendantPids threw OR sigtermPids signaled fewer descendants than discovered. This surfaces orphan-process pressure to operators without inflating PR scope (no new SSE event or SDK reducer state — deferred to follow-up). forceShutdown and doRestart callers ignore the return (JS implicit-void).

W93 (declined): Source-verified that the W1 fix in F2 commit 6 already added entry.forceShutdown('manual') to spawnEntry's catch path at mcp-transport-pool.ts:1031-1035, which runs the full cleanup table. Post-W1 the "partial cleanup" claim doesn't reproduce.

Audit notes:

  • The second mcpClient.onerror at mcp-client.ts:706 lives in the legacy non-class connectAndDiscover flow — pool mode never reaches it, so it doesn't need the W133-a treatment
  • Manager-side readResource self-heal at mcp-client-manager.ts:2390 checks getStatus(), doesn't read lastError — no interaction
  • Silent drop emits 'failed' directly (per JSDoc at mcp-pool-events.ts:40-41); no 'disconnected' precedes it — targeting 'failed' is correct

Scope / risk

  • Diff: +405 / -5 across 3 files (2 source + 1 test), 3 focused commits
  • Zero wire/SDK breaking changes — PoolEvent.failed.lastError string format gains an optional suffix but preserves the pre-fix "silent transport drop" substring for log-grep backward compat
  • sweepAndDisconnect return type change (Promise<void>Promise<SweepResult>) is internal — all 4 callers are in the same file; JS implicit-void absorbs the return at call sites that ignore it
  • Only new side effect is one debugLogger.warn line on silent-drop sweep observability — surface only at --debug warn+

Reviewer recommendation

Safe to merge into daemon_mode_b_main.

  • All 78 PR-specific tests pass (32 mcp-transport-pool + 46 mcp-client)
  • Zero MCP-related failures in the full core suite (8696 passed)
  • Design is well-documented with cross-file call-site annotations (every lastTransportError mutation/read site has a comment pointing to the W120 consumer)
  • Audit notes cover all three potential interaction points (legacy flow, manager self-heal, event ordering)
  • W93 declined with source-level evidence — reduces PR scope, no loss
  • The 3 pre-existing core failures are the same environmental issues observed across all PR validations

— Maintainer local validation, run on 450e301c8 from upstream pull/4460/head.

@wenshao wenshao merged commit 0c04309 into daemon_mode_b_main May 23, 2026
4 checks passed
@doudouOUC doudouOUC deleted the f2-cleanup-pr-b-self-heal-observability branch May 23, 2026 15:24

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found. LGTM! ✅ — qwen3.7-max via Qwen Code /review

// The shared mock factory returns the SAME stub for the same call
// shape — re-invoke createDebugLogger to grab the warn vi.fn that
// mcp-pool-entry.ts captured at module load. (The mock's factory
// returns a NEW object each call, so we have to assert via the

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] This inline comment contradicts the mock definition above (and the top-of-file comment that was already fixed in f3157ec). The parenthetical says "The mock's factory returns a NEW object each call" but the actual mock at line 51 uses singleton-stub pattern (const stub = {...}; return { createDebugLogger: () => stub }) — the SAME object every time.

Suggested change
// returns a NEW object each call, so we have to assert via the
// The mock factory returns a singleton stub — re-invoke
// createDebugLogger to get a handle on the same warn vi.fn
// that mcp-pool-entry.ts captured at module load.

— qwen3.7-max via Qwen Code /review

* `client.onerror` arrow inside `connect()` (this file). Consumer:
* the W120 silent-drop block inside `PoolEntry.statusChangeListener`.
*/
getLastTransportError(): Error | undefined {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] This 12-line JSDoc largely duplicates the field-level doc on lastTransportError (lines 106–118) — same F2/W133-a provenance, same consumer (W120 silent-drop block), same reset semantics. Two copies that say the same thing in slightly different words will drift.

Suggested change
getLastTransportError(): Error | undefined {
/**
* Most recent SDK Client `onerror` payload, or `undefined` if none
* since the last `connect()`. See `lastTransportError` field docs.
*/
getLastTransportError(): Error | undefined {

— qwen3.7-max via Qwen Code /review

// doesn't gate the outer warn on it (the inner error log
// already gives operators the signal), and forceShutdown /
// doRestart callers ignore the return entirely. Per wenshao
// review #4460: was previously stored on `SweepResult.disconnectError`

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] This comment narrates review history referencing a field (SweepResult.disconnectError) that no longer exists in the codebase. Once merged, the only way to discover this history is git blame — which is the appropriate place for it. Inline comments referencing removed entities become confusing landmines for future readers.

Suggested change
// review #4460: was previously stored on `SweepResult.disconnectError`
// Disconnect failure is rare (usually a transport bug). Not surfaced
// via SweepResult — the inner error log gives operators the signal,
// and forceShutdown / doRestart callers ignore the return.

— qwen3.7-max via Qwen Code /review

generation: this._generation,
lastError: 'transport disconnected (silent transport drop)',
lastError: upstreamError
? `transport disconnected (silent transport drop): ${upstreamError.message}`

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Nice to have] When upstreamError is a truthy Error with an empty .message (rare but possible — new Error('')), the ternary produces "transport disconnected (silent transport drop): " with a dangling colon-space. Guarding on the message string rather than the object avoids this:

Suggested change
? `transport disconnected (silent transport drop): ${upstreamError.message}`
lastError: upstreamError?.message
? `transport disconnected (silent transport drop): ${upstreamError.message}`
: 'transport disconnected (silent transport drop)',

— qwen3.7-max via Qwen Code /review

}
}
} catch (err) {
result.pidSweepError =

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] pidSweepError appears to be dead code in production. listDescendantPids (pid-descendants.ts:77-91) wraps its entire body in a try/catch that returns [] on any failure — it never rejects. sigtermPids (pid-descendants.ts:370-393) wraps each process.kill in an individual try/catch and never throws. So this catch (err) block that sets result.pidSweepError is unreachable under listDescendantPids's current contract.

The test at mcp-transport-pool.test.ts:788 uses vi.mocked(listDescendantPids).mockRejectedValueOnce(sweepError) which bypasses the function's internal error handling entirely — it exercises a code path that cannot occur outside of tests.

Two options:

  • (a) If this is defense-in-depth against a future listDescendantPids change, add a comment explicitly stating that.
  • (b) If not, remove the pidSweepError field and the first try/catch entirely (the disconnect try/catch below is still needed), and adjust the test to validate through listDescendantPids's actual error contract.

— qwen3.7-max via Qwen Code /review

result.descendantsSignaled = sigtermPids(descendants);
debugLogger.debug(
`Sent SIGTERM to ${signaled}/${descendants.length} descendants ` +
`Sent SIGTERM to ${result.descendantsSignaled}/${descendants.length} descendants ` +

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The debug log mixes a SweepResult field (result.descendantsSignaled) with the local variable descendants.length in the same template string. Since result.descendantsFound = descendants.length is assigned two lines above, they are always equal here — but using one field and one local forces the reader to verify the equivalence.

Suggested change
`Sent SIGTERM to ${result.descendantsSignaled}/${descendants.length} descendants ` +
`Sent SIGTERM to ${result.descendantsSignaled}/${result.descendantsFound} descendants ` +

— qwen3.7-max via Qwen Code /review

doudouOUC added a commit that referenced this pull request May 24, 2026
…4353) + F2 cleanup (#4411, #4460) + F1 test split (#4445) + main sync

# Conflicts:
#	packages/cli/src/acp-integration/acpAgent.worktree.test.ts
#	packages/core/src/tools/notebook-edit.test.ts
#	packages/core/src/utils/notebook.test.ts
#	packages/core/src/utils/notebook.ts
doudouOUC added a commit that referenced this pull request May 27, 2026
Squashed feature work from daemon_mode_b_main branch, rebased onto
latest main to establish proper merge-base and clean PR diff.

Original commits:
- perf(core): F2 cleanup PR A — R9/W11/W12/R10 (post-merge follow-ups) (#4411)
- refactor(acp-bridge): F1 test split — lift bridge.test.ts (6861 LOC) to acp-bridge (#4445)
- fix(core): F2 cleanup PR B — self-heal observability (W133-a + W134) (#4460)
- feat(sdk/daemon-ui): unified completeness follow-up to #4328 (#4353)
- docs(serve): v0.16-alpha known limits + SDK QWEN_SERVER_TOKEN env fallback (PR 27) (#4473)
- docs(deploy): local launch templates for v0.16-alpha (PR 30a) (#4483)
- feat(daemon+sdk): cross-client real-time sync completeness (#4484)
- feat(serve): add POST /session/:id/recap (#4504)
- feat(daemon): add voterClientId to permission_resolved (A4) (#4539)
- feat(serve): --allow-origin <pattern> CORS allowlist (T2.4 #4514) (#4527)
- feat(daemon): in-session model switch reaches the bus (A1) (#4546)
- feat(serve): prompt absolute deadline + SSE writer idle timeout (#4514 T2.9) (#4530)
- Feat/daemon react cli (#4380)
doudouOUC added a commit that referenced this pull request May 27, 2026
…4460)

* fix(core): F2 cleanup PR B — self-heal observability (W133-a + W134)

W93 declined as already satisfied by W1 fix in #4336 commit 6
(spawnEntry's catch already calls forceShutdown which runs the full
cleanup table — listener removal, timer clear, subscriber detach,
sweep+disconnect, onClosed eviction). Source-verified non-repro.

W133-a: McpClient.onerror now captures the error in a private
`lastTransportError` field (reset at each connect()); the W120
silent-drop block at mcp-pool-entry.ts:346 reads it via the new
`getLastTransportError()` getter and appends `: <error.message>` to
the lastError string on the emitted 'failed' event. Preserves the
literal "silent transport drop" prefix invariant for log-grep
backward compat — pre-fix marker stays a substring.

W134: sweepAndDisconnect now returns SweepResult instead of void —
{ pidSweepError?, disconnectError?, descendantsFound?,
descendantsSignaled? }. The silent-drop fire-and-forget caller chains
to inspect the result and emits a structured warn log when either
pid-sweep threw OR sigtermPids partially signaled (signaled < found)
— surfaces orphan-process pressure without inflating PR scope (no
new SSE event or SDK reducer state; deferred to W134-followup if
maintainers want metrics).

forceShutdown / doRestart sweep callers ignore the return value (JS
implicit-void at await sites preserves behavior).

4 new tests in mcp-transport-pool.test.ts covering W133-a happy path
+ fallback (no prior onerror) + W134 pidSweepError + W134
partial-signal failure modes. Module-mocks pid-descendants.js for
controllable sweep behavior, and debugLogger.js to observe warn
calls (production logger is session-gated and a no-op in tests).
Singleton-stub debugLogger mock so production module-load
`createDebugLogger('McpPool:Entry')` and the test's retrieval get
the same vi.fn instances.

Verification:
- tsc clean: packages/core, packages/cli (server.ts pre-existing
  errors unchanged)
- F2 transport-pool: 32/32 pass (28 pre-existing + 4 new)
- mcp-client: 46/46 pass
- eslint --max-warnings 0 clean on 3 touched files

Part of #4175 #4336 follow-up bucket.

* fix(core): #4460 round 1 fold-in — 4 copilot doc/comment threads adopted

T1 [copilot mcp-pool-entry.ts:116 — stale line ref in SweepResult JSDoc]:
  replaced `mcp-pool-entry.ts:383` with stable method-anchor reference
  to the W120 silent-drop block inside `statusChangeListener`. Line
  numbers drift on every edit; method names don't.

T2 [copilot mcp-pool-entry.ts:453 — `?? 0` ambiguous in warn payload]:
  silent-drop warn log now prints `descendantsFound=unknown` and
  `descendantsSignaled=unknown` when the values are undefined (only
  reachable in the pidSweepError branch — sweep threw before
  assignment). Operators triaging the warn can now distinguish
  "sweep succeeded but found 0 descendants" from "sweep itself
  threw, count is genuinely unmeasured". Locked in via a new
  assertion in the W134 pidSweepError test.

T3 [copilot mcp-client.ts:116 — brittle line refs in lastTransportError
  JSDoc]: replaced `mcp-pool-entry.ts:346` and `mcp-client.ts:130`
  with stable method/block names (the `statusChangeListener` silent-
  drop block; the `client.onerror` arrow inside connect()). Same
  fix applied to the parallel comment in mcp-transport-pool.test.ts:730
  for consistency.

T4 [copilot mcp-transport-pool.test.ts:797 — singleton-stub mock comment
  contradictory]: rewrote the comment to unambiguously describe what
  the mock DOES (factory body runs once; inner arrow returns the same
  object on every call) instead of the prior hypothetical phrasing
  ("Returning a fresh object would have...") which read as a
  description of current behavior at first glance.

All 4 are doc/comment fixes — zero behavior change apart from the
T2 string format ('unknown' instead of '0'). Verified:
- 32/32 mcp-transport-pool.test.ts pass
- tsc clean on packages/core
- eslint --max-warnings 0 clean on 3 touched files

* fix(core): #4460 round 2 fold-in — remove dead SweepResult.disconnectError field

T5 [wenshao mcp-pool-entry.ts:134 — `disconnectError` is dead data]:
  glm-5.1 review caught that the field was populated when
  `client.disconnect()` threw (line 844) but no consumer ever read
  it — the silent-drop `.then()` handler gated only on
  `pidSweepError` and partial-signal; `forceShutdown` and `doRestart`
  ignore the return; no test asserted on it.

Removed the field from `SweepResult` and the assignment in the
disconnect catch. The pre-existing `debugLogger.error(`client.disconnect
failed for ...`)` inside `sweepAndDisconnect` already gives operators
the signal — adding it to the outer silent-drop warn would have been
duplicate noise. If a future consumer needs to gate logic on disconnect
failures, re-add the field + reader at that point.

Verification: 32/32 mcp-transport-pool.test.ts pass; tsc + eslint
clean on the touched file.
xaelistic pushed a commit to xaelistic/qwen-code that referenced this pull request Jun 7, 2026
doudouOUC added a commit that referenced this pull request Jun 11, 2026
* perf(core): F2 cleanup PR A — R9/W11/W12/R10 (post-merge follow-ups) (#4411)

* refactor(core): F2 PR A R9 — McpClientManager options-object ctor

R9 (filed as F2 follow-up from #4336 review): 7 positional ctor args
collapse to (config, toolRegistry, options?: McpClientManagerOptions).
The trailing 5 (eventEmitter, sendSdkMcpMessage, healthConfig,
budgetConfig, pool) become named fields on `McpClientManagerOptions`.
Test factory `mkManager(overrides?)` introduced at the top of
`mcp-client-manager.test.ts` so each of the prior 80 inline
constructions becomes a single line naming only the field(s) the test
overrides; the 4 `undefined` sentinels each test threaded through to
reach the trailing `pool` arg are gone.

Net: 113 LOC removed (test) + 35 LOC added (src exposes interface +
mkManager factory + tool-registry call site update). Behavior
unchanged — same field assignments, same downgrade-enforce-without-
budget breadcrumb, same budget event wiring.

Filed bucket: F2 perf / cleanup PR A (R9 + W11 + W12 + R10/R23 T7),
see issue #4175 item 7 "F2 post-merge cleanup PRs". This is the first
of the 4 fixes in PR A; W11/W12/R10 follow as separate commits.

Test sweep: 84/84 mcp-client-manager.test.ts pass; typecheck clean.

* refactor(core): F2 PR A W11 — extract attachPooledSession + rollbackReservationOnSpawnFailure

W11 (filed as F2 follow-up from #4336 review): two private helpers
on `McpTransportPool` to eliminate inline duplication in `acquire()`:

  - `attachPooledSession(entry, id, serverName, cfg, sessionId,
    toolReg, promptReg)`: builds `SessionMcpView` + `entry.attach`
    with the standard pool release callback. Used by both the
    fast-path attach (existing entry) and the post-spawn attach
    (after `await inFlight`). NOT used by `createUnpooledConnection`
    — its release callback runs `entry.forceShutdown('manual')` +
    `indexDetach` directly (no pool refcount accounting since
    unpooled entries are per-session).

  - `rollbackReservationOnSpawnFailure(reservationResult, serverName)`:
    R24 T17 contract — only release the budget slot if THIS acquire
    actually reserved a new slot (`'reserved'`); `'already_held'`
    skips because the sibling owns it. Used by both the unpooled
    catch and the pooled spawn-in-flight catch.

Race-window invariants (W10 / W77 / W90 / W111 / W125 / R24 T17)
stay at the call sites because they describe the SURROUNDING
ordering, not the helpers themselves. Helpers are documented to
defer those decisions back to callers.

Behavior unchanged. Filed bucket: F2 perf cleanup PR A (R9 done /
W11 this commit / W12 + R10 to follow).

Test sweep: 28/28 mcp-transport-pool.test.ts pass; typecheck clean.

* refactor(core): F2 PR A W12 — SessionMcpView precompute filter Sets

W12 (filed as F2 follow-up from #4336 review): `applyTools` /
`applyPrompts` precompute `excludeSet` + `includeSet` once per pass
instead of scanning `cfg.includeTools` / `cfg.excludeTools` arrays
inside every per-tool iteration.

Pre-fix the per-tool predicate (`passesSessionFilter`) walked both
arrays for every snapshot entry → O(M × N) per `applyTools` call.
With M tools × N filter entries, typical M=5-20 / N=2-5 case
finishes in microseconds either way; the win is data-structure
correctness and code clarity, not perceived perf.

`passesSessionFilter` / `passesSessionPromptFilter` (the array-
based predicates) stay exported and unchanged for unit tests + any
caller wanting to test a single name without paying Set construction.
The bulk path uses two new private helpers `compileNameFilter` +
`compiledFilterAccepts` whose Sets live on the `applyTools` /
`applyPrompts` stack frame.

Same semantics: `excludeTools` is direct-equality match (no parens
strip — pre-F2 behavior preserved); `includeTools` strips the first
`(...)` suffix so `toolName(args)` matches `toolName`.

Filed bucket: F2 perf cleanup PR A (R9 + W11 done / W12 this commit
/ R10 to follow).

Test sweep: 13/13 session-mcp-view.test.ts pass; typecheck clean.

* perf(core): F2 PR A R10 / R23 T7 — pid-descendants ps snapshot + pgrep fallback

R10 / R23 T7 (filed as F2 follow-up from #4336 review): the Linux
/ macOS pid-descendant enumeration moves from per-pid `pgrep -P
<pid>` BFS (one subprocess fork per node visited) to a single
`ps -A -o pid=,ppid=` snapshot followed by an in-memory tree walk
over `Map<ppid, pid[]>`. Windows analog: single `Get-CimInstance
Win32_Process | ConvertTo-Csv` snapshot of all `(ProcessId,
ParentProcessId)` rows replaces per-pid
`Get-CimInstance -Filter "ParentProcessId=$p"` BFS.

Two motivations:
  1. **Fork count**: typical `npx → tool` / `uvx → tool` wrapper
     trees are 2-3 levels deep with B=1-3 children per node →
     pre-fix BFS forked ~5-10 subprocesses per pool-shutdown call.
     Post-fix: exactly 1 fork regardless of tree depth.
  2. **Snapshot consistency**: pre-fix BFS walked the table level
     by level; a child that forked between two adjacent BFS levels
     could be missed (we'd see the child but query its
     descendants AFTER the new fork). The snapshot path captures
     the table at one instant; new descendants forked after the
     snapshot are tolerated by the existing ESRCH-tolerant
     SIGTERM loop.

Caveats:
  - `ps -A -o pid=,ppid=` is POSIX standard (macOS / Linux /
    *BSD), but BusyBox `ps` <v1.28 (2018) doesn't support `-o`.
    Distroless containers may not have `ps` at all. To preserve
    behavior on those edge platforms, the legacy per-pid `pgrep`
    BFS is retained as a fallback (`listDescendantPidsUnixPgrepFallback`).
    Same retention on Windows for the per-pid filter path.
  - Snapshot path uses `maxBuffer: 8MB` to cover ~250k-process
    pathological hosts. Default 1MB would clip at ~30k processes.
  - `MAX_DESCENDANTS = 256` / `MAX_DEPTH = 8` caps preserved on
    both snapshot + fallback paths.
  - Snapshot scans the entire host process table (not just the
    target subtree). On the typical 200-500 process developer
    machine this parses in <10ms; the win over BFS is real but
    not order-of-magnitude — ~2x improvement, not 100x. PR A's
    motivation framing is "fork hygiene + consistency", not raw
    perf.

Empty-result detection: snapshot path tracks `parsedRows`. If the
ps/CIM tool runs successfully but produces 0 parseable rows
(BusyBox without `-o` echoing usage, AppLocker truncating CIM
output, etc.), we throw — the outer catch falls back to the
per-pid path. A genuine "root has no children" case parses many
rows and just returns empty from the walk. So the
"no-children-found" semantics are preserved across both paths.

Test gate update: pre-fix `integration: spawn-and-enumerate` test
skipped on `CI === '1'` because pgrep wasn't available on
minimal CI runners. Post-fix `ps -A` is universally available on
non-distroless Linux/macOS — only the Windows skip remains.
6/6 pid-descendants tests pass including the now-active
integration spawn test.

Design doc (`docs/design/f2-mcp-transport-pool.md` §6.4 + the F2
follow-up table at lines 82-85) updated to reflect the snapshot
+ fallback shape, and to mark W11 / W12 / R9 / R10 as ✅ Done in
PR A with the per-fix commit refs.

This commit completes F2 cleanup PR A. Filed bucket order:
R9 (commit 0cb1eaa27) → W11 (commit 2d546efca) → W12 (commit
a4a855ab3) → R10 (this commit). Issue #4175 item 7 "F2 post-
merge cleanup PRs": PR A done; PR B (W93 + W133-a + W134) and
PR C (W133-c SDK breaking) to follow as separate clusters.

Test sweep: 287/287 F2 + cli pass; ESLint clean; typecheck clean
(core + cli). Integration test on macOS local runs the new
snapshot path successfully.

* refactor(core): F2 PR A R2 — wenshao followup (visited set + dedup predicate)

Two Suggestions from wenshao's first PR #4411 review pass (07:15Z),
both small and worth folding before merge:

PR-A-R2 #1 (pid-descendants.ts:309 — walkDescendants visited set):
  `walkDescendants`'s BFS lacked a `visited` set. If the snapshot
  captures a PID-reuse cycle — rare but possible on busy hosts with
  rapid pid churn between `ps -A`'s start and parse, where Linux
  wraparound can show a freed pid in a different parent's children
  list creating an A→B / B→A cycle — pre-fix BFS would revisit nodes
  and fill the MAX_DESCENDANTS=256 quota with duplicate entries,
  starving legitimate descendants. Pre-PR-A the per-pid `pgrep` BFS
  had the same theoretical issue but was less exposed (each
  `pgrep -P pid` call returns only DIRECT children; snapshot captures
  the whole tree at once, making cycles instantly visible).

  Fix: 3-LOC `Set<number>` add. `root` seeded into `visited` so a
  malformed snapshot listing root as a descendant of its own child
  doesn't re-enqueue root either.

PR-A-R2 #2 (session-mcp-view.ts:117 — predicate dedup):
  After W12, the exported `passesSessionFilter` /
  `passesSessionPromptFilter` still called `passesNameFilter` (the
  pre-W12 array-based implementation), while `applyTools` /
  `applyPrompts` used `compiledFilterAccepts(compileNameFilter(...))`.
  Two parallel implementations of the same predicate — future change
  to one without the other would silently diverge:
    - the exported function's tests (passesSessionFilter unit tests)
      would still pass
    - the production filter path in applyTools/applyPrompts would
      behave differently

  Reviewer also noted `passesSessionPromptFilter` had zero callers
  in production code or tests after W12 — `applyPrompts` no longer
  references it. Kept the export rather than deleting it (matches
  the `passesSessionFilter` shape for symmetry + the F3 audit-path
  comment block earmarks both as the replay predicates), but routed
  both through `compiledFilterAccepts(compileNameFilter(...))` so
  there is a single source of truth. Set construction is per-call
  for these exports (negligible for unit-test / one-off probes);
  the bulk paths in `applyTools` / `applyPrompts` still construct
  ONE filter per pass via the original W12 code path.

`passesNameFilter` (the standalone array-based helper) deleted —
its only callers were the two exports, which now use the compiled
path. Public-API surface unchanged: the two exported functions
keep their signatures and semantics.

Test sweep: 19/19 pid-descendants + session-mcp-view tests pass;
typecheck + ESLint clean.

Continues commit chain: f05917071 (R9) → 20d2f1b90 (W11) →
6cf18f641 (W12) → 2a41c6fae (R10) → this (R2 followups).

* fix(core): F2 PR A R3 T3 — Windows CSV delimiter locale fix

`ConvertTo-Csv -NoTypeInformation` honors the system locale's list
separator on PowerShell 5.1. On German / French / Dutch / Italian /
... locales the separator is `;` not `,`, so the regex
`^"(\d+)","(\d+)"$` in `snapshotProcessTreeWin` never matched →
`parsedRows === 0` → snapshot threw → fell back to the per-pid CIM
filter path with ~0.5-1s extra PowerShell startup latency per
descendant on every pool shutdown.

Fix: 1-LOC `-Delimiter ","` on `ConvertTo-Csv`. Forces comma
regardless of locale or PowerShell version. PowerShell 7+ defaults
to comma already; 5.1 (the Windows-bundled version most users have
without explicit upgrade) honored locale. The explicit delimiter
makes both consistent.

Skipped wenshao's companion Suggestion T4 (test coverage for
walkDescendants MAX_DESCENDANTS / MAX_DEPTH caps) as F2 hardening
follow-up — the caps are simple 2-line guards exercisable by
inspection; ~50 LOC of mock infrastructure isn't commensurate
with the regression risk on currently-stable defensive code,
and (per the issue #4175 follow-up bucket) we keep dedicated
test-coverage work out of perf-cleanup PRs.

Continues commit chain: f05917071 (R9) → 20d2f1b90 (W11) →
6cf18f641 (W12) → 2a41c6fae (R10) → ced5d62b0 (R2) → this (R3 T3).

Test sweep: 6/6 pid-descendants tests pass; typecheck + ESLint clean.

* refactor(acp-bridge): F1 test split — lift bridge.test.ts (6861 LOC) to acp-bridge (#4445)

* refactor(acp-bridge): rename httpAcpBridge.test.ts -> bridge.test.ts (git mv)

Pure file rename; zero content change. Follow-up commits will:
- extract FakeAgent + makeChannel + makeBridge into testUtils.ts
- split 4 daemon-host integration tests back to cli/daemonStatusProvider.test.ts

Part of #4175 F1 test split (deferred from #4334).

* refactor(acp-bridge): extract testUtils + split daemon-host tests to cli (#4175 F1)

Net mechanical extraction following commit 2aff1a4d1 (pure git mv of
httpAcpBridge.test.ts -> bridge.test.ts). After this commit
`@qwen-code/acp-bridge` owns the bulk of the lifted bridge test
suite, and cli keeps only the 4 daemon-host integration tests that
need to wire `createDaemonStatusProvider()`.

Changes:

1. New `packages/acp-bridge/src/internal/testUtils.ts` (~280 LOC):
   FakeAgent, FakeAgentOpts, ChannelHandle, makeChannel, makeBridge
   (no statusProvider default — acp-bridge tests exercise the
   no-provider fallback path), WS_A/WS_B/SESS_A constants. Marked
   @internal; lives under `internal/` matching the existing
   `stderrLine.ts` package-private convention. Exposed via new
   `./internal/testUtils` subpath in package.json exports.

2. `packages/acp-bridge/src/bridge.test.ts` shrinks from 6861 ->
   ~6400 LOC: fixtures replaced with named imports from
   `./internal/testUtils.js`; cross-package import
   `from './daemonStatusProvider.js'` removed (4 daemon-host tests
   moved out); ACP SDK + bridgeErrors / workspacePaths / bridge /
   channel / bridgeTypes imports split into multiple statements
   reflecting actual post-F1 provenance.

3. New `packages/cli/src/serve/daemonStatusProvider.test.ts`
   (~240 LOC, 4 tests): wires real `createDaemonStatusProvider()`
   through a cli-side `makeBridge` wrapper to assert end-to-end
   daemon env / preflight cells. Imports
   `createHttpAcpBridge` via the `./httpAcpBridge.js` re-export
   shim — doubles as a shim surface smoke check.

Verification:
- acp-bridge: 291/291 tests pass (177 in bridge.test.ts).
- cli: daemonStatusProvider.test.ts 4/4 pass; full cli suite 6742/6767
  green (16 pre-existing failures in AuthDialog / memoryDiagnostics /
  useAtCompletion — all on `daemon_mode_b_main` baseline, last
  modified by commits predating this branch).
- Tests counts pre-split: 181 in httpAcpBridge.test.ts;
  post-split: 177 in bridge.test.ts + 4 in daemonStatusProvider.test.ts
  = 181 (parity preserved).

Part of #4175 F1 test split (deferred from #4334).

* refactor(acp-bridge): self-review round 1 — vitest alias + doc/comment polish

Five code-reviewer findings folded in on top of e97282f30:

S1 [Suggestion] — Test-utils ships to npm + cli reads stale dist.
  Added `packages/cli/vitest.config.ts:resolve.alias` mapping
  `@qwen-code/acp-bridge/internal/testUtils` → the .ts source. The
  package subpath export is RETAINED (required for TypeScript
  `nodenext` to resolve types — it won't fall back to tsconfig
  paths once exports rejects a subpath). Dual-channel approach
  documented in the testUtils JSDoc, including the alpha-stage 0.0.1
  tradeoff that the file still ships in dist (stripInternal /
  .npmignore deferred).

S2 [Suggestion] — Stale wording "two tests" in narrative comment.
  bridge.test.ts split-marker now correctly says "4 fallback tests"
  (no-provider × 2 surfaces + throwing-provider × 2 surfaces).

S3 [Suggestion] — "Shim smoke check" only half-applied.
  daemonStatusProvider.test.ts now routes `BridgeOptions` and
  `HttpAcpBridge` types through `./httpAcpBridge.js` shim too
  (alongside `createHttpAcpBridge`), so the entire factory surface
  the cli tests rely on flows through the F1 re-export shim.

N1 [Nit] — Asymmetric split-marker phrasing.
  Both markers now describe the 4 moved tests by surface
  (env real / preflight idle / preflight merged-live /
  preflight extMethod-throws) rather than "1 of" + "3 more".

N2 [Nit] — testUtils "the suite" ambiguity.
  makeChannel JSDoc now references `bridge.test.ts` explicitly
  instead of "the suite" (which was unambiguous pre-split when
  helpers + 10 createInMemoryChannel sites lived in the same file).

Verification: 291/291 acp-bridge tests pass; 4/4 cli daemon
integration tests pass; tsc clean on both packages (pre-existing
server.ts errors on baseline unchanged); eslint --max-warnings 0
clean on all 4 touched files.

* docs(cli): self-review round 2 — fix stale vitest.config.ts alias comment

Round 2 reviewer caught a 3-way contradiction in the round 1 docs:
- vitest.config.ts said: alias replaces the export, internal/* stays
  unpublished (matches stderrLine convention).
- package.json: subpath export IS declared.
- testUtils.ts JSDoc: both channels intentionally retained,
  testUtils ships in dist.

Round 1 explicitly chose to retain the export because TS `nodenext`
won't fall back to tsconfig `paths` once `exports` rejects a
subpath; the alias only serves to short-circuit *runtime* resolution
so cli reads src/ not dist/. Rewriting the vitest.config.ts comment
to reflect that dual-channel reality (and pointing readers at
testUtils.ts for the full rationale).

* fix(acp-bridge): #4445 round 3 fold-in — 4 of 7 reviewer threads adopted

PR #4445 review pass — 4 adopt + 3 decline (declines replied
inline; not folded here):

ADOPTED:

T1 [copilot daemonStatusProvider.test.ts:136 — bridge.shutdown
   missing]: added `await bridge.shutdown()` to test 2 (preflight
   idle). Three of four tests already shut down; symmetry +
   future-proof if `createHttpAcpBridge` gains background work
   even when no channel was spawned.

T5 [wenshao testUtils.ts:92 — makeBridge naming collision]: cli-
   side helper renamed `makeBridge` -> `makeBridgeWithDaemonStatusProvider`
   (4 call sites in daemonStatusProvider.test.ts), JSDoc updated to
   reference the wenshao thread. testUtils.makeBridge stays as the
   canonical name used by ~100 tests in bridge.test.ts. A future
   contributor can no longer pick the wrong helper by accident.

T6 [wenshao testUtils.ts:32 — JSDoc mis-claims @internal tag matches
   stderrLine.ts convention]: fixed wording. stderrLine.ts uses prose
   only; @internal is an additional package-private signal, not a
   convention match. Also restructured the npm-leak paragraph to
   describe the new .npmignore-via-files-negation enforcement (T7).

T7 [wenshao package.json:70 — testUtils ships to npm]: switched
   `files: ["dist"]` -> `files: ["dist", "!dist/internal/testUtils.*",
   "!dist/**/*.test.*"]`. Wenshao's suggested `"test"` exports
   condition wasn't viable: vitest sets `vitest` not `test`, and
   gating on `vitest` would hide types from the cli's tsc compile.
   The negation-pattern files-field excludes the built testUtils
   from the publish surface while keeping the subpath export entry
   that TypeScript `nodenext` needs to resolve types. Verified via
   `npm pack --dry-run`: dist/internal/stderrLine.* still ships
   (production internal helper); dist/internal/testUtils.* +
   dist/**/*.test.* are excluded.

DECLINED (replied on PR threads, not folded here):

T2/T3 [copilot — `handles` array unused in tests 3/4]: bookkeeping
   matches the pre-split bridge.test.ts verbatim; cleanup is scope
   creep on this rename PR.

T4 [copilot — testUtils eager-imports createHttpAcpBridge,
   cross-copy identity risk]: cli daemonStatusProvider.test.ts uses
   its OWN local `makeBridgeWithDaemonStatusProvider` and never
   imports testUtils.makeBridge — the cross-copy concern isn't
   triggered. Premature abstraction on a test-only fixture.

Verification: 291/291 acp-bridge tests pass; 4/4 cli daemon tests
pass; tsc clean both packages; eslint --max-warnings 0 clean on
2 touched .ts files; `npm pack --dry-run` confirms publish-surface
exclusions.

* fix(core): F2 cleanup PR B — self-heal observability (W133-a + W134) (#4460)

* fix(core): F2 cleanup PR B — self-heal observability (W133-a + W134)

W93 declined as already satisfied by W1 fix in #4336 commit 6
(spawnEntry's catch already calls forceShutdown which runs the full
cleanup table — listener removal, timer clear, subscriber detach,
sweep+disconnect, onClosed eviction). Source-verified non-repro.

W133-a: McpClient.onerror now captures the error in a private
`lastTransportError` field (reset at each connect()); the W120
silent-drop block at mcp-pool-entry.ts:346 reads it via the new
`getLastTransportError()` getter and appends `: <error.message>` to
the lastError string on the emitted 'failed' event. Preserves the
literal "silent transport drop" prefix invariant for log-grep
backward compat — pre-fix marker stays a substring.

W134: sweepAndDisconnect now returns SweepResult instead of void —
{ pidSweepError?, disconnectError?, descendantsFound?,
descendantsSignaled? }. The silent-drop fire-and-forget caller chains
to inspect the result and emits a structured warn log when either
pid-sweep threw OR sigtermPids partially signaled (signaled < found)
— surfaces orphan-process pressure without inflating PR scope (no
new SSE event or SDK reducer state; deferred to W134-followup if
maintainers want metrics).

forceShutdown / doRestart sweep callers ignore the return value (JS
implicit-void at await sites preserves behavior).

4 new tests in mcp-transport-pool.test.ts covering W133-a happy path
+ fallback (no prior onerror) + W134 pidSweepError + W134
partial-signal failure modes. Module-mocks pid-descendants.js for
controllable sweep behavior, and debugLogger.js to observe warn
calls (production logger is session-gated and a no-op in tests).
Singleton-stub debugLogger mock so production module-load
`createDebugLogger('McpPool:Entry')` and the test's retrieval get
the same vi.fn instances.

Verification:
- tsc clean: packages/core, packages/cli (server.ts pre-existing
  errors unchanged)
- F2 transport-pool: 32/32 pass (28 pre-existing + 4 new)
- mcp-client: 46/46 pass
- eslint --max-warnings 0 clean on 3 touched files

Part of #4175 #4336 follow-up bucket.

* fix(core): #4460 round 1 fold-in — 4 copilot doc/comment threads adopted

T1 [copilot mcp-pool-entry.ts:116 — stale line ref in SweepResult JSDoc]:
  replaced `mcp-pool-entry.ts:383` with stable method-anchor reference
  to the W120 silent-drop block inside `statusChangeListener`. Line
  numbers drift on every edit; method names don't.

T2 [copilot mcp-pool-entry.ts:453 — `?? 0` ambiguous in warn payload]:
  silent-drop warn log now prints `descendantsFound=unknown` and
  `descendantsSignaled=unknown` when the values are undefined (only
  reachable in the pidSweepError branch — sweep threw before
  assignment). Operators triaging the warn can now distinguish
  "sweep succeeded but found 0 descendants" from "sweep itself
  threw, count is genuinely unmeasured". Locked in via a new
  assertion in the W134 pidSweepError test.

T3 [copilot mcp-client.ts:116 — brittle line refs in lastTransportError
  JSDoc]: replaced `mcp-pool-entry.ts:346` and `mcp-client.ts:130`
  with stable method/block names (the `statusChangeListener` silent-
  drop block; the `client.onerror` arrow inside connect()). Same
  fix applied to the parallel comment in mcp-transport-pool.test.ts:730
  for consistency.

T4 [copilot mcp-transport-pool.test.ts:797 — singleton-stub mock comment
  contradictory]: rewrote the comment to unambiguously describe what
  the mock DOES (factory body runs once; inner arrow returns the same
  object on every call) instead of the prior hypothetical phrasing
  ("Returning a fresh object would have...") which read as a
  description of current behavior at first glance.

All 4 are doc/comment fixes — zero behavior change apart from the
T2 string format ('unknown' instead of '0'). Verified:
- 32/32 mcp-transport-pool.test.ts pass
- tsc clean on packages/core
- eslint --max-warnings 0 clean on 3 touched files

* fix(core): #4460 round 2 fold-in — remove dead SweepResult.disconnectError field

T5 [wenshao mcp-pool-entry.ts:134 — `disconnectError` is dead data]:
  glm-5.1 review caught that the field was populated when
  `client.disconnect()` threw (line 844) but no consumer ever read
  it — the silent-drop `.then()` handler gated only on
  `pidSweepError` and partial-signal; `forceShutdown` and `doRestart`
  ignore the return; no test asserted on it.

Removed the field from `SweepResult` and the assignment in the
disconnect catch. The pre-existing `debugLogger.error(`client.disconnect
failed for ...`)` inside `sweepAndDisconnect` already gives operators
the signal — adding it to the outer silent-drop warn would have been
duplicate noise. If a future consumer needs to gate logic on disconnect
failures, re-add the field + reader at that point.

Verification: 32/32 mcp-transport-pool.test.ts pass; tsc + eslint
clean on the touched file.

* feat(sdk/daemon-ui): unified completeness follow-up to #4328 (#4353)

* feat(sdk/daemon-ui): expand event coverage to 28+ daemon event types (PR-A)

Closes the "12+ daemon events fall through to debug" gap surfaced in the PR
the daemon currently emits (Stage 1 + Wave 3-4), so renderers stop having
to peek at `rawEvent.data` for known event categories.

Session-meta:
- session.metadata.changed (from session_metadata_updated)
- session.approval_mode.changed (from approval_mode_changed)
- session.available_commands (from available_commands_update; upgraded
  from a status-text fallback to a typed event carrying the command list)

Workspace state (Wave 3-4):
- workspace.memory.changed
- workspace.agent.changed
- workspace.tool.toggled
- workspace.initialized
- workspace.mcp.budget_warning
- workspace.mcp.child_refused
- workspace.mcp.server_restarted
- workspace.mcp.server_restart_refused

Auth device-flow (Wave 4 OAuth, RFC 8628):
- auth.device_flow.started
- auth.device_flow.throttled
- auth.device_flow.authorized
- auth.device_flow.failed (carries DaemonAuthDeviceFlowSdkErrorKind)
- auth.device_flow.cancelled

- `DaemonUiErrorEvent.errorKind?: DaemonErrorKind` — closed-enum error
  category propagated from daemon's typed-error taxonomy. Renderers can
  branch on errorKind for "retry auth" vs "check file path" affordances
  instead of regex-matching `text`.
- `DaemonUiToolUpdateEvent.provenance?: DaemonUiToolProvenance` +
  `.serverId?` — closed enum ('builtin' | 'mcp' | 'subagent' | 'unknown').
  Falls back to the `mcp__<server>__<tool>` naming heuristic when the
  daemon doesn't stamp provenance explicitly. Unblocks UI namespace
  dispatch without string-matching toolName.

Session-meta / workspace / auth events do NOT push transcript blocks.
They are intentional sidechannel observations: `lastEventId` advances
(monotonic invariant preserved), but the chat-stream transcript stays
focused on user/assistant/tool/shell/permission content. Renderers
consume them via selectors (introduced in follow-up PRs).

All new event types produce short structured lines in
`daemonUiEventToTerminalText` for tail-style debug consumers. Web/IDE
renderers should consume the typed events directly via subscription.

40/40 tests pass. New tests verify:
- All 16 new event types normalize correctly
- Malformed payloads fall back to debug without leaking raw data
  (`secret` field never appears in fallback text)
- MCP tool provenance heuristic (`mcp__github__create_issue` →
  provenance='mcp', serverId='github')
- errorKind propagation on session_died / stream_error
- Reducer is no-op on new event types; lastEventId still advances

This is PR-A of the unified-renderer-layer follow-up series:
- PR-A (this commit) — event coverage + closed-enum schema
- PR-B — server-side timestamps + ordering refactor
- PR-C — multimodal content + tool preview taxonomy
- PR-D — render contract (toMarkdown / toHtml / toPlainText) + adapter
  conformance test framework
- PR-E — reducer state machine (subagent / progress / current tool /
  cancellation propagation)

See https://github.com/QwenLM/qwen-code/pull/4328#issuecomment-4494179724
for the full proposal.

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat(sdk/daemon-ui): server timestamps + event-id-based ordering (PR-B)

Closes the "时间定义不标准" gap surfaced in the PR #4328 review:
- Client-side `Date.now()` drifts across clients
- No daemon-authoritative timestamp propagated to UI
- Out-of-order replay events get fresher `state.now` than originals,
  breaking `createdAt` ordering

- `DaemonUiEventBase.serverTimestamp?: number` — daemon-authoritative
  wall-clock timestamp extracted from envelope.
- `DaemonTranscriptBlockBase.serverTimestamp?: number` + `clientReceivedAt: number`.
- `createdAt` preserved as `@deprecated` alias for `clientReceivedAt`
  (backward compat for code written before this PR).

`extractServerTimestamp` looks at three candidate envelope locations:

1. `event.serverTimestamp` (preferred when daemon adds it)
2. `event._meta.serverTimestamp` (Anthropic-style metadata convention)
3. `event.data._meta.serverTimestamp` (sessionUpdate nested location)

The SDK is ready to consume serverTimestamp WHEN daemon emits it, without
requiring a coordinated SDK release. Undefined when daemon doesn't emit
(current state) — graceful degradation to client-clock ordering.

`selectTranscriptBlocksOrderedByEventId(state)` — returns blocks sorted by:

1. `eventId` (daemon-monotonic SSE cursor) — primary key
2. `serverTimestamp` (daemon wall clock) — fallback for synthetic frames
3. `clientReceivedAt` (local clock) — last resort

Use this when displaying long sessions where event id 5 may arrive AFTER
event id 7 (typical in SSE replay-after-reconnect).

`formatBlockTimestamp(block, opts)` — formats the most authoritative
timestamp on a block using `Intl.DateTimeFormat`. Prefers
`serverTimestamp` over `clientReceivedAt` for cross-client consistency.
Accepts locale / timeZone / dateStyle / timeStyle.

Daemon needs to stamp `_meta.serverTimestamp` on every SSE envelope. This
SDK PR is ready to consume it the moment the daemon ships the field; no
coordination needed.

- serverTimestamp extraction from all three envelope locations
- Defaults undefined when envelope has none
- `selectTranscriptBlocksOrderedByEventId` sorts mixed-arrival events by
  eventId (replay scenario)
- `formatBlockTimestamp` prefers serverTimestamp; returns localized string

PR-B of the unified follow-up to PR #4328 (PR-A + PR-B + PR-C + PR-D +
PR-E in one branch).

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat(sdk/daemon-ui): reducer state machine — currentTool / approvalMode / cancellation propagation (PR-E)

Closes the "reducer state machine 设计缺漏" gap surfaced in the PR #4328 review:
- No `currentTool` — UI scans `blocks[]` to find the running tool
- No mirrored approval mode — UI walks events to badge "plan"/"yolo"
- Cancellation does not propagate — in-flight tool blocks stuck at
  'in_progress' forever when the parent prompt is cancelled

## State additions (sidechannel, no transcript blocks)

`DaemonTranscriptSidechannelState`:
- `currentToolCallId?: string` — toolCallId of the in-flight tool
- `approvalMode?: string` — mirrored from session.approval_mode.changed
- `toolProgress: Record<string, { ratio?, step? }>` — per-tool progress
  shape (daemon-side emission of `tool.progress` events pending)

## Reducer behavior

### `tool.update` events

`IN_FLIGHT_TOOL_STATUSES` = { pending, confirming, running, in_progress }
`TERMINAL_TOOL_STATUSES` = { completed, success, failed, error, canceled, cancelled }

- Tool enters in-flight: set `currentToolCallId = event.toolCallId`
- Tool enters terminal: clear `currentToolCallId` if it matches
- Unknown status (forward-compat): leave pointer untouched

This avoids the failure mode where a future daemon-emitted status like
`'paused'` would silently mark unknown states as either in-flight or
terminal incorrectly.

### `session.approval_mode.changed`

Mirror `event.next` onto `state.approvalMode`. Renderers can render a
mode badge ("plan" / "default" / "auto-edit" / "yolo") with a single
selector call, no event-stream walking.

### `assistant.done` with `reason === 'cancelled'`

`propagateCancellationToInFlightTools` walks every tool block whose
status is still in-flight and force-sets it to 'cancelled'. The daemon
does not guarantee terminal `tool_call_update` for every in-flight tool
when the parent prompt is cancelled, so this propagation prevents UI
spinners from spinning forever.

`currentToolCallId` is also cleared in the same call.

Non-cancellation `assistant.done` (e.g., `reason: 'end_turn'`) does NOT
propagate — in-flight tools remain in-flight until the daemon emits
their terminal update naturally.

## Selectors

- `selectCurrentTool(state)` — returns the running tool block, or undefined
- `selectApprovalMode(state)` — returns the mirrored approval mode
- `selectToolProgress(state, toolCallId)` — per-tool progress query

All exported from `@qwen-code/sdk/daemon`.

## Scope deliberately deferred

Subagent nesting (`parentBlockId` / `delegationId` / `DaemonSubagentTranscriptBlock`)
is NOT in this PR. The shape needs design discussion (how to project nested
events; whether to bake delegation tracking into transcript or sidechannel).
PR-D / PR-F follow-up.

## Test coverage (51/51 pass)

- currentToolCallId set on enter, cleared on terminal
- approvalMode mirrors changes
- Cancellation marks in-flight tools 'cancelled', leaves completed alone
- Unknown status does NOT clear currentToolCallId (forward-compat)
- Non-cancellation `assistant.done` does NOT propagate

## Roadmap

PR-E of the unified follow-up to PR #4328 (PR-A + PR-B + PR-E in this
branch; PR-C / PR-D pending).

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat(sdk/daemon-ui): tool preview taxonomy + multimodal content extraction (PR-C)

Closes two related gaps surfaced in the PR #4328 review:
- `DaemonToolPreview` had only 4 kinds — UI fell back to `key_value` /
  `generic` for tools that deserved structured display
- `getTextContent` silently dropped non-text content (image / audio /
  resource), so multimodal conversations vanished from the UI

`DaemonToolPreview` extends from 4 to 8 variants:

- `file_diff` — `{ path, oldText?, newText?, patch? }` — file edit tools
  (Anthropic-style `oldText/newText`, aider-style `patch`, write-style
  `newText` alone)
- `file_read` — `{ path, range?: [start, end] }` — file read tools, with
  range extracted from `lineRange` tuple OR `offset/limit` pair
- `web_fetch` — `{ url, method? }` — HTTP fetch tools (requires URL
  with scheme to avoid false positives on relative paths)
- `mcp_invocation` — `{ serverId, toolName, argsSummary? }` — MCP server
  tool calls, identified via `mcp__<server>__<tool>` naming convention
  (same heuristic as PR-A `DaemonUiToolUpdateEvent.provenance`)

Detector order matters — MCP wins first (most specific), then file_diff,
file_read, web_fetch, then the existing command / key_value fallbacks.

New helper `extractContentPart(value): DaemonUiContentPart | undefined`
returns a discriminated union:

```ts
type DaemonUiContentPart =
  | { kind: 'text'; text: string }
  | { kind: 'image'; mediaType: string; source: { url?, data? } }
  | { kind: 'audio'; mediaType: string; source: { url?, data? } }
  | { kind: 'resource'; uri: string; mediaType?, description? };
```

The existing `getTextContent` is preserved for backward compat. Renderers
that need to surface non-text content (web UI thumbnails, IDE attachment
chips) now have a typed shape to consume.

- Wiring `extractContentPart` into the normalizer / reducer so text
  blocks accumulate `parts: DaemonUiContentPart[]` alongside `text`
  (additive shape change requires render contract coordination — PR-D).
- 5 additional tool preview kinds (image_generation / code_block /
  tabular / subagent_delegation / search) — useful but not urgent;
  current 8 kinds cover the typical agent flows.

- file_diff detection from Anthropic / aider / write shapes
- file_read with lineRange tuple AND offset+limit pair
- web_fetch with method, REJECTS relative paths (no scheme)
- mcp_invocation with serverId + toolName extraction
- Detector priority: MCP wins over file_diff on conflicting shapes
- extractContentPart for text / image (url) / audio (data) / resource
- Unknown content type returns undefined (skip rather than synthesize)
- Image without source returns undefined (defensive)

PR-C of the unified follow-up to PR #4328 (PR-A + PR-B + PR-E + PR-C in
this branch; PR-D render contract pending).

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat(sdk/daemon-ui): render contract — markdown / HTML / plain text helpers (PR-D)

Closes the "render 契约只覆盖 terminal" gap surfaced in the PR #4328 review:

> PR ships `daemonUiEventToTerminalText` for terminal. Web/IDE/channel
> adapters each roll their own projection. No shared contract → adapter
> divergence is inevitable.

## New helpers

```ts
daemonBlockToMarkdown(block, opts?): string  // GFM-compatible
daemonBlockToHtml(block, opts?): string      // conservatively escaped HTML
daemonBlockToPlainText(block, opts?): string // for copy-paste / logs
daemonToolPreviewToMarkdown(preview, opts?): string
```

All three respect the same `kind` discrimination so adapters can switch
between them without touching call sites.

## Per-kind projection

For each `DaemonTranscriptBlock['kind']`:

- `user` / `assistant` / `thought` — plain text with role labels
- `tool` — header with toolName + structured preview + status badge
- `shell` — fenced code block, stream-discriminated (stdout vs stderr)
- `permission` — title + options list + resolved/pending indicator
- `status` / `debug` / `error` — semantic class / role (error → role=alert)

For each `DaemonToolPreview['kind']`:

- `ask_user_question` — question + options as bullet list
- `command` — fenced bash with optional cwd comment
- `file_diff` — unified diff in fenced code block (oldText/newText OR patch)
- `file_read` — `path (lines N-M)` line
- `web_fetch` — `METHOD url` line
- `mcp_invocation` — `serverId::toolName` with args summary
- `key_value` — bullet list
- `generic` — emphasized summary

## Security

- Default HTML sanitizer escapes `<`, `>`, `&`, `"`, `'` and FIRST strips
  ANSI/control sequences via `sanitizeTerminalText` (defense against
  agent-emitted escape codes in HTML output).
- Custom sanitizer hook for consumers wanting markdown→HTML pipelines
  (markdown-it + DOMPurify, etc.).
- `sanitizeUrls` option strips token-like query params (`token=`, `key=`,
  `x-amz-`, etc.) from URLs in `web_fetch` previews.
- `maxFieldLength` truncation defaults 8192, prevents pathological
  rendering on huge content.

## Adapter conformance (out of scope for this commit)

The conformance test framework (fixture corpus + `runAdapterConformanceSuite`)
mentioned in PR-D scope is deferred to a follow-up. The render helpers
here are the precondition — once stable, the conformance framework can
use them as the reference projection.

## Test coverage (77/77 pass)

- All 9 block kinds render in markdown (verified for user/assistant/tool/
  shell/permission/error specifically)
- file_diff renders as unified diff with old/new lines
- mcp_invocation renders as `server::tool` format
- HTML escapes XSS (`<script>` → `&lt;script&gt;`)
- HTML strips terminal escape sequences before escaping
- Error blocks emit `role="alert"` for screen readers
- plain text drops markdown delimiters
- maxFieldLength truncates with ellipsis
- sanitizeUrls strips token query params
- Custom sanitizer hook works

## Roadmap

PR-D of the unified follow-up to PR #4328 — completes the 5-PR series
(A: event coverage, B: time schema, E: state machine, C: tool preview +
content extraction, D: render contract).

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat(sdk/daemon-ui): 5 additional tool preview kinds — taxonomy complete (PR-F)

Closes the "5 additional preview kinds" item in PR #4353's TODO §A
(SDK-only work).

## New preview kinds (8 → 13)

- `code_block` — `{ language?, code, origin? }` — REPL / formatter /
  generator output, fenced as `\`\`\`<language>` in markdown
- `search` — `{ query, resultCount?, top? }` — grep / ripgrep / find /
  glob results with up to 5 top hits
- `tabular` — `{ columns, rows, totalRows? }` — structured table output
  (50-row cap with `totalRows` truncation indicator); supports both
  `columns: string[] + rows: unknown[][]` explicit shape and legacy
  `data: Array<Record<>>` shape (auto-infers columns from first row)
- `image_generation` — `{ prompt, thumbnailUrl?, model? }` — dall-e /
  diffusion / imagen / flux / sora style tools
- `subagent_delegation` — `{ agentName, task, parentDelegationId? }` —
  Anthropic-style Task tool and similar sub-agent dispatchers

## Detector priority

Order matters — most specific wins. New detectors slot in between
`mcp_invocation` and `file_diff`:

```
mcp_invocation > subagent_delegation > search > image_generation
  > file_diff > file_read > web_fetch > code_block > tabular
  > command > key_value > generic
```

Rationale: subagent / search / image generation are most discriminable
(distinct toolName patterns); file ops next; code_block / tabular last
because their shapes (`code:`, `columns:`) can appear in other tools.

## Render projections

Both `daemonToolPreviewToMarkdown` and the plain-text rendering paths
extended with cases for all 5 new kinds:

- code_block: fenced markdown code block with language tag
- search: bold header + GFM bullet list of top results
- tabular: GFM pipe table with header / separator / body / truncation hint
- image_generation: bold header + blockquoted prompt + embedded markdown
  image (URL sanitization respected via `sanitizeUrls` opt)
- subagent_delegation: bold delegate-arrow header + blockquoted task +
  optional parent delegation reference

## Test coverage (91/91 pass, +14 new)

- Each detector with positive case
- Detector priority verified: subagent_delegation wins over file_diff
  when toolName='Task' has both subagent + file-edit fields
- Tabular row cap (50) + totalRows stamping for truncated data
- Legacy data: Array<Record<>> auto-column inference
- Each render projection with structural assertions (markdown table
  format, image embed, bullet lists)

## Roadmap

PR-F of the unified follow-up to PR #4328. Brings the preview taxonomy
to 13 kinds covering: file ops (3), web (1), code/data (2), media (1),
agent control (2 — ask_user_question + subagent_delegation), MCP (1),
search (1), generic fallbacks (2).

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat(sdk/daemon-ui): adapter conformance framework + fixture corpus (PR-G)

Closes the "Adapter conformance test framework" item in PR #4353's TODO §A.
Lets any daemon-ui adapter (TUI / web / IDE / channel / mobile) validate
that it projects a fixed corpus of daemon SSE event streams to the same
semantic shape — catches projection drift before it reaches users.

## API surface

```ts
interface DaemonUiAdapterUnderTest {
  reduce(events: readonly DaemonUiEvent[]): unknown;
  renderToText(state: unknown): string;
}

interface DaemonUiConformanceFixture {
  name: string;
  description: string;
  envelopes: DaemonEvent[];           // raw daemon envelopes
  expectedContains: string[];          // phrases the rendered text MUST contain
  expectedAbsent?: string[];           // phrases that MUST NOT appear
  normalizeOptions?: { ... };          // forward-compat normalize opts
}

runAdapterConformanceSuite(adapter, opts?): ConformanceSuiteResult
DAEMON_UI_CONFORMANCE_FIXTURES: ReadonlyArray<DaemonUiConformanceFixture>
```

## Design

**Format-agnostic assertion**: adapters can render to ANSI / HTML /
markdown / JSX — the framework only inspects plain text via
`renderToText`. Catches semantic divergence (missing user message,
wrong tool status, leaked secret) without forcing identical formatting.

**Embedded fixture corpus** (no fs reads — works in browser bundle):
- `simple-chat` — user/assistant streaming flow
- `tool-call-lifecycle` — running → completed transition
- `file-edit-diff` — file_diff preview surfacing
- `mcp-invocation` — MCP serverId/toolName extraction via heuristic
- `permission-lifecycle` — request + resolved with outcome
- `mcp-budget-warning` — Wave 3 event (adapter must observe but rendering
  is its choice)
- `cancellation-propagates` — tool block status flows
- `malformed-payload-redaction` — uses `includeRawEvent: true` to verify
  even a debug-mode adapter doesn't leak `token: secret-do-not-leak`
- `auth-device-flow-success` — Wave 4 OAuth events
- `available-commands-typed-event` — PR-A upgrade from status text

Per-fixture `expectedContains` and `expectedAbsent` describe the
content contract independently of format.

## Suite result

```ts
{
  passed: number,
  failed: ConformanceFailure[],   // each carries missing + leaked + excerpt
  total: number,
}
```

**Does not throw** — caller asserts on `result.failed` so adapter test
suites can produce per-fixture diagnostics rather than a single opaque
exception.

## Filter options

`only` / `skip` allow targeted runs during adapter development:

```ts
runAdapterConformanceSuite(myAdapter, { only: ['simple-chat'] });
runAdapterConformanceSuite(myAdapter, { skip: ['cancellation-propagates'] });
```

## Test coverage (97/97 pass, +6 new)

- SDK reference adapter (reducer + markdown render) passes all fixtures
- SDK reference adapter (reducer + plainText render) also passes
- Buggy adapter (empty string output) fails every fixture with non-empty
  `expectedContains`
- Buggy adapter (raw event dump via JSON.stringify) caught by redaction
  fixture's `expectedAbsent`
- `only` filter narrows to a single fixture
- `skip` filter excludes named fixtures from the corpus

## Usage from adapter authors

```ts
// In your adapter's test file
import { runAdapterConformanceSuite } from '@qwen-code/sdk/daemon';
import { reduceForTui, renderTuiState } from './my-tui-adapter';

it('TUI adapter conforms to daemon UI corpus', () => {
  const result = runAdapterConformanceSuite({
    reduce: reduceForTui,
    renderToText: renderTuiState,
  });
  expect(result.failed).toEqual([]);
});
```

## Roadmap

PR-G of the unified follow-up to PR #4328. The corpus is intentionally
small (10 fixtures) but extensible — adapter authors can submit new
fixtures via additions to `DAEMON_UI_CONFORMANCE_FIXTURES` to lock in
regression coverage for edge cases their adapter encountered.

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat(webui+sdk/daemon-ui): wire transcriptAdapter to SDK render contract (PR-H)

Closes the "WebUI transcriptAdapter migration" item in PR #4353's TODO §A.
Validates the PR-D render contract end-to-end on the real WebUI consumer.

`daemonTranscriptToUnifiedMessages(blocks, options?)` gains a new options
parameter:

```ts
interface DaemonTranscriptAdapterOptions {
  useMarkdown?: boolean;                  // default: false
  enrichToolDetailsWithPreview?: boolean; // default: false
}
```

Defaults preserve legacy behavior — existing callers see no change.

For `user` / `assistant` / `thought` blocks, content is projected via
SDK's `daemonBlockToMarkdown` instead of raw sanitized text. The WebUI's
markdown renderer (markdown-it) then gets:

- `**You**\n\n<content>` for user blocks (bold "You" label)
- Raw text for assistant blocks (markdown formatting in agent output
  passes through cleanly)
- `> *thought:* <text>` blockquote for thought blocks

For `tool` blocks, `rawOutput` is replaced with `daemonToolPreviewToMarkdown(block.preview)`.
This lets WebUI surfaces without per-preview-kind React components still
display:

- `file_diff` as a fenced unified diff
- `mcp_invocation` as `server::tool` with args summary
- `tabular` as GFM pipe table
- `search` as bullet list with match count
- `image_generation` as embedded markdown image
- `subagent_delegation` as delegate arrow + task quote

Renderers with per-kind components should leave this opt-out.

`packages/sdk-typescript/src/daemon/index.ts` was missing exports for
PR-D / PR-F / PR-G / PR-B / PR-E surface — WebUI's `@qwen-code/sdk/daemon`
import path uses the daemon root, not the ui/ sub-index. Added 15+
re-exports so consumers don't need to use the longer
`@qwen-code/sdk/daemon/ui/index.js` path.

Now exported from `@qwen-code/sdk/daemon` root:
- `daemonBlockToMarkdown` / `daemonBlockToHtml` / `daemonBlockToPlainText`
- `daemonToolPreviewToMarkdown`
- `extractContentPart` + `DaemonUiContentPart` type
- `formatBlockTimestamp` + `selectTranscriptBlocksOrderedByEventId`
- `selectCurrentTool` / `selectApprovalMode` / `selectToolProgress`
- `runAdapterConformanceSuite` + `DAEMON_UI_CONFORMANCE_FIXTURES`
- All associated types

`webui/src/daemon/transcriptAdapter.test.ts` mock blocks updated to include
`clientReceivedAt` (required field added in PR-B). Mechanical change —
every `createdAt: N` test fixture gets a matching `clientReceivedAt: N`.

- WebUI `npm run typecheck` — clean
- SDK `npm run typecheck` — clean
- SDK `vitest run test/unit/daemonUi.test.ts` — 97/97 pass
- WebUI transcriptAdapter test fixtures typecheck against updated
  DaemonTranscriptBlockBase schema

PR-H of the unified follow-up to PR #4328. Closes the WebUI migration
gap in TODO §A.

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* docs(daemon-ui): add developer guide + migration cookbook (PR-I)

Closes the final "Documentation" item in PR #4353's TODO §A. Brings the
unified daemon UI surface to ~95% SDK-side completion.

## Files added

- `docs/developers/daemon-ui/README.md` — full API reference
  - Three-layer model (normalizer → reducer → render helpers)
  - Quick start with idiomatic event-loop pattern
  - Event taxonomy (28+ types categorized: chat-stream / session-meta /
    workspace / auth device-flow)
  - Render contract cookbook (markdown / HTML / plainText)
  - Tool preview taxonomy (13 kinds with use cases)
  - State selectors (currentTool / approvalMode / toolProgress / ordering)
  - Cancellation propagation explanation
  - Time semantics (eventId > serverTimestamp > clientReceivedAt
    precedence)
  - Adapter conformance usage
  - ErrorKind dispatch pattern
  - Tool provenance dispatch pattern
  - Forward-compat principles

- `docs/developers/daemon-ui/MIGRATION.md` — adapter author migration
  cookbook
  - Step-by-step recommended adoption order (9 steps, value-ranked)
  - Before/after code examples for each step
  - Backward-compat checklist (everything is additive — no breaking
    changes)
  - Cross-references to PR-A through PR-H commits

## Roadmap

PR-I of the unified follow-up to PR #4328. Documentation-only — no
code changes; no tests affected.

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): address review feedback

* fix(daemon-ui): address review hardening feedback

* fix(daemon-ui): handle resync-required events

* feat(sdk/daemon-ui): consume daemon-side subagent nesting context (PR-K)

Closes the SDK-side gap for §B1 in PR #4353's TODO list. PR-E originally
deferred subagent nesting because daemon-side parent-context wasn't yet
stamped on tool_call events. After the rebase onto current
daemon_mode_b_main, source verification confirms the daemon now emits
`tool_call._meta.parentToolCallId` + `tool_call._meta.subagentType` via
`SubAgentTracker.getSubagentMeta()` (core), so the SDK side is unblocked.

## Schema additions (additive, forward-compat-safe)

`DaemonUiToolUpdateEvent`:
  - parentToolCallId?: string  — toolCallId of the parent Task / delegation
  - subagentType?: string      — sub-agent type label (e.g. 'code-reviewer')

`DaemonToolTranscriptBlock`:
  - parentToolCallId?: string  — mirror of event field
  - subagentType?: string      — mirror of event field
  - parentBlockId?: string     — pre-resolved by reducer when parent already
                                 in state, so renderers don't re-correlate

## Normalizer wiring

`normalizeToolUpdate` checks both top-level and `_meta` for parentToolCallId
+ subagentType (fallback chain mirrors how provenance/serverId are read).
Top-level tool calls without sub-agent context omit the fields cleanly.

## Reducer behavior

- New tool block: resolves `parentBlockId` from `toolBlockByCallId` at
  create time. Out-of-order arrival (child before parent) leaves
  `parentBlockId` undefined — selectors fall back to `parentToolCallId`
  lookup.
- Existing tool block update: adopts parent context if not yet
  correlated, never overwrites established correlation (handles the
  flow where SubAgentTracker activates after the initial tool_call).

## New public selectors

- selectSubagentChildBlocks(state, parentToolCallId): returns the
  array of tool blocks invoked inside a given parent delegation
- isSubagentChildBlock(block): type guard for "this tool block came
  from a sub-agent"

Both exported from @qwen-code/sdk/daemon root + ui/index.

## Forward-compat properties

- Top-level tool calls (no sub-agent) work identically as before
- Trimmed parent blocks: child fallback to undefined parentBlockId
- Daemon emits both fields together; SDK reads independently to tolerate
  partial future stamping

## Test coverage (129/129 pass, +5 new tests)

- Extract parentToolCallId + subagentType from `_meta`
- Top-level tool calls have undefined parent fields (forward-compat)
- Reducer correlates parentBlockId at create time
- Reducer adopts parent context on later update (out-of-order arrival)
- isSubagentChildBlock discriminator

## Roadmap

PR-K of the unified follow-up to PR #4353. Closes §B1 (subagent nesting)
in the TODO declaration; daemon-side already shipped on
`daemon_mode_b_main` via SubAgentTracker (core).

Remaining TODO §B / §D items still depend on further daemon/Core work:
- §B2 `tool.progress` event type (daemon emit pending)
- §D MessageEmitter multimodal echo + HistoryReplayer inlineData/fileData
  (core change pending)

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): PR-K self-review hardening — back-fill / trim / self-ref / docs

Multi-round self-review of PR-K (d8375fe46) surfaced two real bugs, a
few defensive gaps, and missing docs/fixture coverage. All addressed
in one commit.

## Bugs fixed

### Bug 1 — `parentBlockId` never back-filled for out-of-order arrival

Original PR-K resolved `parentBlockId` only at child create time, which
broke this flow:

  1. Child arrives WITH parent stamp → block created with
     `parentToolCallId` set, `parentBlockId` undefined (parent not in
     state yet)
  2. Parent arrives later → block created, `toolBlockByCallId` indexed
  3. Subsequent child updates: existing-block branch only ran the
     back-fill inside `!existing.parentToolCallId`, which is false (we
     already adopted the stamp in step 1). `parentBlockId` stayed
     undefined forever.

Fix: separate the two correlations.
  - existing-block update: independently back-fill `parentBlockId`
    whenever `parentToolCallId` is set and `parentBlockId` is missing
  - new-block create: scan existing children whose `parentToolCallId`
    matches the new block's `toolCallId` and back-fill their
    `parentBlockId`. Cheap O(n) over current blocks.

### Bug 2 — dangling `parentBlockId` after trim

`trimTranscriptState` reset `toolBlockByCallId[id]` to the trimmed
sentinel for evicted blocks but did NOT walk surviving children to
null their `parentBlockId` references. Renderers walking
`blockIndexById.get(parentBlockId)` would get undefined, with no
"why" signal.

Fix: post-trim, walk remaining tool blocks; if `parentBlockId`
references an id not in `keptIds`, null it. `parentToolCallId` stays
(survives trimming so selector-keyed queries still work).

## Defensive hardening

- **Self-reference guard** (normalizer): drop
  `parentToolCallId === toolCallId` before it reaches the reducer.
  Daemon should never emit this, but defending costs nothing.
- **Selector docstring**: clarify `selectSubagentChildBlocks` returns
  **direct** children only; document cycle / depth-cap responsibility
  for renderers walking up the chain.
- **Cosmetic**: remove redundant `as DaemonToolTranscriptBlock` cast
  in `isSubagentChildBlock` (TypeScript already narrows after
  `block.kind === 'tool'` on the discriminated union).
- **Alphabetical**: move `isSubagentChildBlock` re-export to correct
  position in both `daemon/index.ts` and `daemon/ui/index.ts`.

## Docs + conformance gaps closed

- `README.md` — new "Sub-agent nesting (PR-K)" section with full
  reducer behavior, out-of-order handling note, recursive walk example,
  cycle-defense note.
- `MIGRATION.md` — new step 8a with before/after for nested rendering.
- `conformance.ts` — new `subagent-nesting` fixture covering parent +
  nested child via `tool_call._meta`. Markdown-safe phrases chosen
  (markdown escapes `-` so titles cannot be substring-matched as-is).

## Test coverage (+5 tests, 134/134 pass)

- Self-reference dropped in normalizer
- Back-fill on out-of-order parent arrival (child first, parent after)
- Back-fill on later child update when parent now exists
- Dangling `parentBlockId` nulled after parent trimmed
- New `subagent-nesting` conformance fixture passes SDK reference adapter

## Side-effect verification

Verified no regressions:
- Cancellation propagation still cancels parent + children together
  (iterates `toolBlockByCallId`, which includes both)
- Render contract unchanged (`daemonBlockToMarkdown` etc. project per
  block, no nested awareness required)
- No serializer to update
- `selectTranscriptBlocksOrderedByEventId` unaffected (parent-agnostic)

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): permission block trim contract — wenshao review

Addresses both items from wenshao's review on PR #4353:

## Critical — resolvePermissionBlock missing TRIMMED guard

The sibling `upsertPermissionBlock` (transcript.ts:544) correctly returns
early when `existingId === TRIMMED_PERMISSION_BLOCK_ID`, but
`resolvePermissionBlock` (transcript.ts:581) had no such guard. When
`maxBlocks` trimming evicted a pending permission request, a subsequent
`permission.resolved` event would:

1. Fail the `getWritableBlockById` lookup (sentinel is not a real block id)
2. Fall through and create a brand-new orphan resolution block

This wasted a block slot, accelerated further trimming, and silently
broke the trimmed-block contract that the request-side guard establishes.

Fix: mirror the request-side guard. Read the index entry up front,
return early on the sentinel.

## Suggestion — permissionBlockByRequestId grows unboundedly

`trimTranscriptState` writes `TRIMMED_PERMISSION_BLOCK_ID` for evicted
permission requests but never deletes those entries. Unlike the tool
side (which calls `pruneTrimmedToolIndexes` post-trim), the permission
index grew without bound in long sessions.

Fix: add `pruneTrimmedPermissionIndexes` analogous to the tool-side
helper. Caps the sentinel set at `maxBlocks` entries; older entries are
deleted (any later resolution event still drops cleanly via the new
Critical guard).

## Tests

- Updated existing `keeps orphan permission resolutions visible after
  request trimming` test to encode the corrected contract (drops silently
  instead of creating an orphan). Test rename: "drops resolution for
  trimmed permission requests (wenshao Critical)".
- New `Suggestion: pruneTrimmedPermissionIndexes caps the trimmed
  sentinel set` test verifies the cap.

Total: 136/136 tests pass, SDK + WebUI typecheck green.

## Side-effect verification

- `upsertPermissionBlock` already had the equivalent guard — no
  asymmetry remains.
- `pruneTrimmedPermissionIndexes` only touches entries holding the
  sentinel; live permission blocks are unaffected.
- Selectors over `state.blocks` (e.g. `selectPendingPermissionBlocks`)
  iterate the block array, not the index — unaffected by cap.

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): address wenshao + doudouOUC inline reviews (2026-05-23)

Addresses the 13 inline review comments from wenshao (6) and doudouOUC
(7, one overlap) on the 2026-05-23 review round.

## Critical / Important

### sanitizeUrls not threaded through HTML preview path (doudouOUC)

`daemonBlockToHtml` for tool blocks called `daemonToolPreviewToPlainText`
which didn't accept `opts` — when callers set `sanitizeUrls: true`, the
markdown path stripped auth tokens but the HTML path leaked them into
the DOM. Now: helper accepts opts, threads through `web_fetch.url` and
`image_generation.thumbnailUrl`.

### enrichToolDetailsWithPreview overwrote rawOutput (doudouOUC)

The webui adapter replaced structured `rawOutput` with a markdown
summary string when `enrichDetails: true`. Downstream `ToolCallData`
consumers may branch on the shape (object vs string) and break. Plus
the actual tool output was silently dropped.

Fix: keep `rawOutput` verbatim, surface markdown via a new optional
`previewMarkdown` field added to `ToolCallData`.

### transcriptBlockToTerminalText zero test coverage (wenshao)

Added 12 tests covering each `switch` branch (user / assistant / thought
/ tool / shell stdout+stderr / permission unresolved+resolved / status /
debug / error) plus the unknown-kind degradation path. Verified
`assertNever` returns a graceful error line (does NOT throw) — wenshao's
reviewer was slightly wrong on the throw claim but coverage gap was
real.

### selectTranscriptBlocksOrderedByEventId no memoization (wenshao)

Selector was called from React `useSyncExternalStore` and re-sorted on
every dispatch — including sidechannel-only events that don't touch
blocks. Added WeakMap cache keyed on `state.blocks` reference; the
reducer preserves the same array reference for non-block-mutating
events, so the cache hits across renders.

### selectSubagentChildBlocks O(n) per call (wenshao)

Naive `state.blocks.filter()` was O(n) per call; rendering a tree with
m parents made it O(n*m). Built a memoized reverse index keyed on
`state.blocks` reference (WeakMap of parentToolCallId →
DaemonToolTranscriptBlock[]). Each lookup now O(1) after first call.

### Test file TS errors at root tsc (wenshao)

Fixed multiple TS errors in `daemonUi.test.ts` flagged by root
`tsc --noEmit`:
- Added `DaemonTranscriptState` + `DaemonUiEvent` imports
- `block.content` access via `as Array<Record<string, unknown>>` cast
- `delete` on globalThis property via narrower interface cast
- `debug?.text` via `DaemonUiEvent & { text: string }` narrowing (Extract on
  union with `'status' | 'debug'` literal would resolve to never)
- 6 occurrences of index-signature access via bracket notation
- `raw: null` added to 3 `DaemonUiPermissionOption` literals (required field)
- Explicit type annotations on conformance-suite `renderToText` params

Note: `webui/src/daemon/transcriptAdapter.test.ts` shows residual
"clientReceivedAt does not exist" errors at root tsc, but this is
environmental — the resolution trace shows `@qwen-code/sdk/daemon`
crossing into a sibling worktree's stale dist via shared workspace
node_modules. In a single-worktree CI checkout this resolves cleanly.

## Suggestions (cleanups)

### Hoist asDaemonErrorKind double-eval (doudouOUC)

`session_died` + `stream_error` cases each computed `asDaemonErrorKind`
twice in the conditional spread (predicate + value). Hoisted to const,
no functional change.

### renderToolHeader bypassed opts (doudouOUC)

Forwarded `opts` so `maxFieldLength` is honored for tool title /
toolName / toolKind.

### isSensitiveKey duplicates (doudouOUC)

Removed duplicate `endsWith('accesskey')` / `endsWith('secretkey')`
checks and the redundant exact-match `privatekey` (already covered by
`endsWith`).

### propagateCancellationToInFlightTools iterated trimmed (wenshao)

Filter `TRIMMED_TOOL_BLOCK_ID` sentinels up front. Avoids redundant
index dereferences in long sessions with many historical tools.

### toolProgress shallow clone (doudouOUC + wenshao)

`cloneTranscriptState` outer `...state` spread shared inner
`{ ratio?, step? }` references between snapshots. Once `tool.progress`
event handlers start mutating in place, the prior snapshot would leak.
Deep-clone the inner records now (cost bounded by in-flight tools,
small).

### isDeviceFlowErrorKind closed set (wenshao + doudouOUC)

Both reviewers suggested strict validation. We INTENTIONALLY kept
lenient pass-through — the public type
`DaemonAuthDeviceFlowSdkErrorKind` explicitly includes `(string & {})`
as a forward-compat escape hatch (existing test `keeps future
auth_device_flow_failed errorKind values observable` enforces this).
Now expose `KNOWN_DEVICE_FLOW_ERROR_KINDS` as documentation and
explain the design in the JSDoc.

## Validation

| | |
|---|---|
| SDK tests | 148/148 pass (+12 terminal coverage + assorted hardening) |
| SDK typecheck | clean |
| WebUI typecheck | clean |

## Side-effect verification

- WeakMap memos invalidate correctly: reducer creates a fresh
  `state.blocks` reference only on block-mutating events. Sidechannel
  events reuse t…
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants