Skip to content

feat(serve): add POST /session/:id/recap#4504

Merged
doudouOUC merged 4 commits into
QwenLM:daemon_mode_b_mainfrom
doudouOUC:feat/daemon-session-recap
May 26, 2026
Merged

feat(serve): add POST /session/:id/recap#4504
doudouOUC merged 4 commits into
QwenLM:daemon_mode_b_mainfrom
doudouOUC:feat/daemon-session-recap

Conversation

@doudouOUC

@doudouOUC doudouOUC commented May 25, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Expose generateSessionRecap (packages/core/src/services/sessionRecap.ts) to daemon clients via a new POST /session/:id/recap route, so SDK / web UI / IDE-plugin callers can fetch a one-sentence "where did I leave off" summary without driving the agent through a full prompt turn.
  • Closes the pre-release: fix ci #1 entry in the "core features missing from daemon" inventory under proposal(serve): Mode B feature-priority roadmap toward v0.16 production-ready #4175/recap is already used by the TUI and useAwaySummary, but daemon clients had no access path.
  • v1 is manual trigger only: no SSE push event, no auto-firing on idle. Follow-ups (auto-trigger, request-id-based cancel, structured /stats / /export ports) are tracked separately.

What's added

Layer Surface
HTTP route POST /session/:id/recap — non-strict gate (parity with /prompt), X-Qwen-Client-Id honored, no body required
Capability tag session_recap (always-on, slotted next to workspace_mcp_restart)
ACP control ext-method qwen/control/session/recap
Bridge interface HttpAcpBridge.generateSessionRecap(sessionId, context?)
SDK DaemonClient.recapSession(sessionId, opts?) + DaemonSessionClient.recap(opts?) + DaemonSessionRecapResult type re-exported from the top-level barrel

Result shape:

{ "sessionId": "sess:42", "recap": "Debugging the auth retry race. Next: add deterministic timing." }

recap is null (a 200, not an error) when history is too short or the side-query fails — preserving core's documented best-effort contract.

Architecture

Reuses the existing ext-method roundtrip pattern from setSessionApprovalMode (so no new infrastructure):

HTTP route (server.ts)
  → bridge.generateSessionRecap(...)           (acp-bridge/bridge.ts)
    → entry.connection.extMethod('qwen/control/session/recap', {sessionId})
      → acpAgent handles ext-method            (acpAgent.ts)
        → session.getConfig() + generateSessionRecap(config, signal)

Non-strict gate — same posture as /session/:id/prompt: the call costs tokens but mutates no state. Operators that want it locked down configure --token, which gates everything.

Cancellation is best-effort at v1: client disconnect aborts the bridge-side wait, but the LLM call in the child runs to completion (recap is short — single-attempt side-query, ~1–5s typical, maxOutputTokens: 300). A 60s backstop timeout guards a wedged ACP channel. A future request-id-based cancel ext-method can plumb full end-to-end cancellation if it ever becomes worth the bandwidth cost.

Test plan

  • packages/sdk-typescriptDaemonClient.test.ts (+6: empty-body POST, null-recap passthrough, URL encoding, X-Qwen-Client-Id header, AbortSignal propagation, 404 throw); DaemonSessionClient.test.ts (+1: session-bound wrapper); daemon-public-surface.test.ts (+1: type-lock for DaemonSessionRecapResult). 643 / 643 ✓
  • packages/acp-bridgebridge.test.ts (+3: ext-method forwarding, null-recap preservation, SessionNotFoundError on unknown id). 180 / 180 ✓
  • packages/cliserver.test.ts (+6: 200 happy path, 200 null-recap, client-id context, 404, malformed client-id, non-strict-gate posture pin) plus capability registry ordering update. 261 / 261 ✓
  • npm run build clean (0 errors)

End-to-end smoke (manual, against a freshly built daemon):

# Terminal A
mkdir -p /tmp/qwen-recap && cd /tmp/qwen-recap
node $WT/dist/cli.js serve

# Terminal B — after a short conversation through /session/:id/prompt
curl -sX POST http://127.0.0.1:4170/session/$SID/recap | jq .
# → {"sessionId":"…","recap":"…one sentence…"}

# Capability advertisement
curl -s http://127.0.0.1:4170/capabilities | jq '.features | map(select(. == "session_recap"))'
# → ["session_recap"]

# Empty-history null path (call before any prompt)
curl -sX POST http://127.0.0.1:4170/session/$SID/recap | jq .
# → {"sessionId":"…","recap":null}

Out of scope (follow-ups)

  • chat.mjs demo /recap command — separate commit, three lines.
  • Auto-trigger / away-detection on the daemon side — needs idle timer + new SSE event + per-session config knob; PR would double in size.
  • Streaming the recap — single-shot is fine at ~300 output tokens.
  • prompt_suggestion SSE event (NES) and other Top-5 ROI ports (/stats structured, /export structured, /goal CRUD) — same proposal(serve): Mode B feature-priority roadmap toward v0.16 production-ready #4175 sub-goal, separate PRs.

Docs

  • docs/developers/qwen-serve-protocol.md — new ### POST /session/:id/recap section (request/response, capability tag, error envelope, cancellation semantics).
  • docs/users/qwen-serve.md — feature-list bullet pointing at the SDK helper.
  • docs/design/session-recap/session-recap-design.md — added the daemon HTTP path to the triggers table + a "Daemon access path" subsection covering the non-strict gate decision and v1 cancellation caveat.

🤖 Generated with Qwen Code


📝 描述准确性更新(2026-05-31,作者自查)

更正:v1 实际无取消(与本 PR docs 一致)——Architecture 中 client disconnect aborts the bridge-side wait 不成立(route 未监听 close、未线程化 AbortSignal)。

Wraps generateSessionRecap (core/services/sessionRecap.ts) so daemon
clients can fetch a one-sentence "where did I leave off" summary
without driving the agent through a full prompt turn. Mirrors the
ext-method roundtrip used by /session/:id/approval-mode — bridge
forwards `qwen/control/session/recap` to the ACP child, which calls
the existing core helper against the per-session GeminiClient history.

- Route: non-strict mutation gate (parity with /prompt — costs tokens
  but mutates no state)
- Capability tag: `session_recap`
- SDK: `client.recapSession(sessionId, opts)` +
  `session.recap(opts)` convenience wrapper
- 60s bridge-side backstop timeout; client-disconnect aborts the
  HTTP wait (LLM call in the child still completes — recap is short)
- Recap is best-effort: short history / transient model failure
  surfaces as 200 with `recap: null`, not an error

Tests cover the route (200 happy path, 200 null recap, client-id
context, 404 on unknown session, malformed client-id, non-strict gate
posture), the bridge ext-method roundtrip (success, null recap,
SessionNotFoundError), the SDK client + session-client wrappers
(URL encoding, body, headers, signal propagation, 404 throw), and a
public-surface type lock for `DaemonSessionRecapResult`.

Closes part of QwenLM#4175 (Top 5 ROI port QwenLM#1 from the daemon coverage gap
inventory). Targets daemon_mode_b_main integration branch.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
Copilot AI review requested due to automatic review settings May 25, 2026 09:32
@doudouOUC doudouOUC requested review from chiga0 and wenshao May 25, 2026 09:34
@doudouOUC doudouOUC self-assigned this May 25, 2026
@github-actions

Copy link
Copy Markdown
Contributor

📋 Review Summary

This PR exposes the existing generateSessionRecap functionality to daemon clients via a new POST /session/:id/recap HTTP route, completing issue #4175's entry point for daemon session recaps. The implementation follows established patterns (ext-method roundtrip via ACP bridge) and includes comprehensive test coverage across all affected packages. The change is well-documented with clear contracts around the best-effort nature of the recap feature.

🔍 General Feedback

  • Architecture consistency: The implementation correctly reuses the existing ext-method pattern established by setSessionApprovalMode, maintaining architectural coherence across the daemon control surface.
  • Test coverage: Excellent test coverage across all layers (SDK, bridge, CLI server) with appropriate edge cases (null recap, session not found, malformed headers, abort signal propagation).
  • Documentation quality: Design doc, protocol docs, and user-facing docs are all updated with clear contracts, error conditions, and cancellation semantics.
  • Type safety: New DaemonSessionRecapResult type is properly threaded through all barrel exports with a type-lock test to prevent future regressions.
  • Cancellation handling: The v1 best-effort cancellation approach is well-documented with clear rationale for why the LLM call runs to completion (short duration, single-attempt side-query).

🎯 Specific Feedback

🟢 Medium

  • packages/cli/src/serve/server.ts:1488 - The route handler uses mutate() without explicit options, relying on the default non-strict behavior. While the comment explains this mirrors /prompt, consider adding an explicit { strict: false } to make the gate posture self-documenting and catch any future default changes:

    app.post('/session/:id/recap', mutate({ strict: false }), async (req, res) => {
  • packages/acp-bridge/src/bridge.ts:2887 - The SESSION_RECAP_TIMEOUT_MS constant (60s) is well-justified in the JSDoc, but consider whether this should be configurable via BridgeOptions for operators dealing with particularly slow model providers, similar to how initTimeoutMs is configurable.

  • packages/sdk-typescript/src/daemon/DaemonClient.ts:997 - The method bypasses fetchTimeoutMs with a comment explaining why, but the SDK doesn't expose a way for callers to set a custom timeout on the underlying fetch. Consider whether the opts parameter should accept a timeoutMs override for callers who want tighter control.

🔵 Low

  • packages/cli/src/acp-integration/acpAgent.ts:2267 - The ext-method handler creates a new AbortController().signal but immediately discards the controller reference. The comment acknowledges v1 has no cross-process abort plumbing. Consider naming this unusedSignal or adding a // v1: no cancellation plumbing inline comment to make the intentional discard more obvious to future readers.

  • docs/design/session-recap/session-recap-design.md:44 - The design doc mentions "A future request-id-based cancel ext-method can plumb full end-to-end cancellation if needed" — consider filing a follow-up issue (proposal(serve): Mode B feature-priority roadmap toward v0.16 production-ready #4175 tracking issue?) to track this enhancement so it doesn't get lost.

  • packages/acp-bridge/src/bridgeTypes.ts:338 - The JSDoc for generateSessionRecap mentions BridgeClientRequestContext but doesn't explain what fields it contains. Consider adding a brief note or type reference (e.g., "currently carries clientId for future event fan-out").

✅ Highlights

  • Excellent error handling: The null-recap passthrough (200 with recap: null) is correctly implemented and tested across all layers, preserving the core helper's best-effort contract without forcing clients to special-case "no recap yet" as an error.
  • Comprehensive test matrix: The server tests cover all critical paths including the non-strict gate posture pin, which prevents future refactors from accidentally tightening the gate.
  • Clean SDK surface: The DaemonSessionClient.recap() wrapper correctly forwards the bound clientId and optional signal, making it ergonomic for session-scoped callers.
  • Capability advertising: The session_recap tag is properly registered in SERVE_CAPABILITY_REGISTRY with clear documentation, enabling clients to pre-flight support before calling.

@doudouOUC

Copy link
Copy Markdown
Collaborator Author

Thanks for the review — won't take these in this PR.

Medium 1 (explicit mutate({ strict: false })): the established convention in server.ts is bare mutate() for non-strict (/prompt, /cancel, /heartbeat, /model, /permission/*) and mutate({ strict: true }) for strict. Matching that convention beats self-documentation here — divergence would imply the route's gate posture is different from its siblings.

Medium 2 (configurable SESSION_RECAP_TIMEOUT_MS): YAGNI. 60s is already 6–12x typical 1–5s recap latency. initTimeoutMs is configurable because session-creation latency varies wildly with cold spawn + MCP discovery; a single-attempt side-query with maxOutputTokens: 300 doesn't have that variance. Happy to add a knob if a real operator hits the cap.

Medium 3 (SDK timeoutMs override): redundant with the existing signal parameter — callers wanting a deadline can pass AbortSignal.timeout(N). Two ways to express the same constraint would muddy the API surface.

Low 1 (rename to unusedSignal): factual misread — there is no named variable to rename. The code is new AbortController().signal as an inline argument; the inline expression is its own breadcrumb, paired with the JSDoc immediately above explaining the v1 cancellation contract.

Low 2 (file follow-up issue for request-id-based cancel): already listed in the PR description's "Out of scope" section; tracking issue inflation isn't needed.

Low 3 (expand JSDoc on BridgeClientRequestContext): the type definition is the source of truth and is one grep away; restating its shape at each consumer would rot the moment a field is added.

Highlights noted, thanks.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a daemon-accessible session recap feature by exposing core’s generateSessionRecap through the serve HTTP API, ACP control ext-methods, the ACP bridge, and the TypeScript SDK (with docs + tests to lock the contract).

Changes:

  • Add POST /session/:id/recap to qwen serve, advertise it via the new session_recap capability, and plumb it through ACP (qwen/control/session/recap) + HttpAcpBridge.
  • Add SDK surface: DaemonClient.recapSession(...), DaemonSessionClient.recap(...), and the DaemonSessionRecapResult type (re-exported from the top-level barrel).
  • Add/extend unit tests and protocol/user/design documentation for the recap route, response shape, and best-effort semantics.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
packages/sdk-typescript/test/unit/DaemonSessionClient.test.ts Tests session-bound wrapper forwards clientId + AbortSignal.
packages/sdk-typescript/test/unit/DaemonClient.test.ts Tests new recapSession request shape, encoding, headers, signal forwarding, and error handling.
packages/sdk-typescript/test/unit/daemon-public-surface.test.ts Type-lock to ensure DaemonSessionRecapResult remains publicly exported.
packages/sdk-typescript/src/index.ts Re-export DaemonSessionRecapResult from the SDK root barrel.
packages/sdk-typescript/src/daemon/types.ts Define DaemonSessionRecapResult and document best-effort recap: null contract.
packages/sdk-typescript/src/daemon/index.ts Re-export DaemonSessionRecapResult from the daemon sub-barrel.
packages/sdk-typescript/src/daemon/DaemonSessionClient.ts Add recap() wrapper that delegates to DaemonClient.recapSession.
packages/sdk-typescript/src/daemon/DaemonClient.ts Implement recapSession(sessionId, opts?) calling POST /session/:id/recap.
packages/cli/src/serve/server.ts Add the POST /session/:id/recap HTTP route calling bridge.generateSessionRecap.
packages/cli/src/serve/server.test.ts Add route tests + ensure capability list includes session_recap.
packages/cli/src/serve/capabilities.ts Add session_recap to SERVE_CAPABILITY_REGISTRY.
packages/cli/src/acp-integration/acpAgent.ts Handle ACP ext-method qwen/control/session/recap by calling generateSessionRecap.
packages/acp-bridge/src/status.ts Add SERVE_CONTROL_EXT_METHODS.sessionRecap constant.
packages/acp-bridge/src/bridgeTypes.ts Extend HttpAcpBridge interface with generateSessionRecap(...).
packages/acp-bridge/src/bridge.ts Implement generateSessionRecap with a 60s backstop timeout and ACP ext-method forwarding.
packages/acp-bridge/src/bridge.test.ts Add bridge tests for forwarding, null recap preservation, and missing-session errors.
docs/users/qwen-serve.md Document the new “Session recap” feature at a user level.
docs/developers/qwen-serve-protocol.md Specify the wire contract for POST /session/:id/recap (capability, response, errors, cancellation semantics).
docs/design/session-recap/session-recap-design.md Update triggers/design to include the daemon HTTP path and its posture/cancellation notes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1489 to +1503
const clientId = parseClientIdHeader(req, res);
if (clientId === null) return;
try {
const response = await bridge.generateSessionRecap(
sessionId,
clientId !== undefined ? { clientId } : undefined,
);
res.status(200).json(response);
} catch (err) {
sendBridgeError(res, err, {
route: 'POST /session/:id/recap',
sessionId,
});
}
});

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks — addressed in commit 058bde70f per chiga0's option 1: reconciled the route comment + protocol doc + design doc to match actual v1 behavior (no res.once('close'), no AbortSignal threaded through the bridge, only the 60s SESSION_RECAP_TIMEOUT_MS backstop + getTransportClosedReject race). Going with the docs-match-reality path rather than wiring the AbortController, because the ACP child handler also passes a never-aborting AbortController().signal to generateSessionRecap — an HTTP-side controller in isolation would be cosmetic, the LLM call still runs to completion either way. A request-id-based cancel ext-method can plumb full e2e cancellation in a follow-up PR.

const current = config.getApprovalMode();
return { previous, current };
}
case SERVE_CONTROL_EXT_METHODS.sessionRecap: {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The ~30-line sessionRecap ext-method handler has no test coverage. acpAgent.test.ts tests other ext-method handlers (rewindSession, renameSession, workspaceMcp, etc.) but sessionRecap is absent. The uncovered paths include: RequestError.invalidParams on missing/invalid sessionId, sessionOrThrow for unknown sessions, generateSessionRecap(config, signal) return value wrapping, and recap: null pass-through.

Consider adding a describe block mirroring the existing renameSession test pattern to pin the routing, parameter validation, and response shape.

— qwen3.7-max via Qwen Code /review

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks — agreed, will add. Deferring to a follow-up PR to keep this one focused on chiga0's doc-only gate. The follow-up will add a describe('SERVE_CONTROL_EXT_METHODS.sessionRecap') block in acpAgent.test.ts mirroring the renameSession pattern, covering: success path, recap: null pass-through, RequestError.invalidParams on missing/invalid sessionId, SESSION_ID_RE rejection (will also be added — sibling handlers enforce it but recap currently doesn't), and sessionOrThrow on unknown session.

sessionId: string,
opts?: { signal?: AbortSignal; clientId?: string },
): Promise<DaemonSessionRecapResult> {
const res = await this._fetch(

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] recapSession calls this._fetch() directly, bypassing fetchWithTimeout. Every other non-streaming method (heartbeat, cancel, restartMcpServer, initWorkspace, setWorkspaceToolEnabled) uses fetchWithTimeout. When the caller omits signal (the common path), the SDK request has no client-side timeout — a wedged daemon causes await client.recapSession(id) to hang indefinitely.

Consider using fetchWithTimeout with a per-call timeout above the bridge's 60s backstop (e.g. 90s), similar to how restartMcpServer handles its long-running nature:

return await this.fetchWithTimeout(
  `${this.baseUrl}/session/${encodeURIComponent(sessionId)}/recap`,
  { method: 'POST', headers: ..., body: '{}', signal: opts?.signal },
  async (res) => {
    if (!res.ok) throw await this.failOnError(res, 'POST /session/:id/recap');
    return (await res.json()) as DaemonSessionRecapResult;
  },
  RECAP_DEFAULT_TIMEOUT_MS, // e.g. 90_000
);

— qwen3.7-max via Qwen Code /review

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks — agreed, will fix in the follow-up PR. The bridge's 60s SESSION_RECAP_TIMEOUT_MS is a server-side backstop and doesn't protect against TCP half-open / DNS stall on the client. Switching to fetchWithTimeout with a RECAP_DEFAULT_TIMEOUT_MS=90_000 (slightly above the bridge ceiling so a healthy slow recap isn't cut off client-side) gives the SDK consumer the same network-level safety net every other non-streaming method has. JSDoc will be updated to drop the "bypasses fetchTimeoutMs" claim — that wasn't even accurate, raw _fetch just has no timeout at all. Holding for the follow-up to keep this PR minimal-doc-only per chiga0's gate.

@qwen-code-ci-bot

Copy link
Copy Markdown
Collaborator

Qwen Code Review (DEEP)

CI-safe profile adapted from bundled /review; focused passes: correctness/security, test coverage, maintainability/performance, and undirected audit.

Correctness / Security

  • P2 packages/cli/src/serve/server.ts:1470 — Route comment and PR description claim "client disconnect aborts the bridge-side wait," but no req.on('close') / res.on('close') handler is wired. The async handler awaits bridge.generateSessionRecap() to completion regardless of client disconnect. The bridge's getTransportClosedReject races the ACP transport lifecycle, not the HTTP client socket. Net effect: on client disconnect, the bridge still blocks on the ACP roundtrip for up to 60s, and the LLM side-query in the child always runs to completion. The code behavior is acceptable for v1 (the design doc and PR body both acknowledge the LLM call runs to completion), but the route-level comment ("disconnect is detected via res.once('close') for the bridge-side cancellation") describes a mechanism that does not exist. Fix: either implement the disconnect listener (create an AbortController, abort on 'close', pass signal to bridge) or correct the comment to match actual behavior — misleading comments in security-relevant access-control code will confuse the next maintainer who needs to reason about cancellation.

  • P2 packages/sdk-typescript/src/daemon/DaemonClient.ts:993-1010recapSession calls this._fetch (raw fetch, no timeout wrapper), intentionally bypassing fetchTimeoutMs per the JSDoc. The optional signal parameter is the only timeout mechanism. A caller who invokes client.recapSession(sessionId) without { signal: AbortSignal.timeout(N) } has no default protection — if the daemon TCP-accepts but never responds (half-open connection, daemon wedged after accept), the returned promise hangs indefinitely. Every other DaemonClient method applies fetchTimeoutMs as a backstop. Fix: either apply fetchTimeoutMs as a race (letting signal override for shorter deadlines), or document in the JSDoc that callers must pass AbortSignal.timeout() to avoid unbounded hangs, with a code example. The current JSDoc mentions cancellation via signal but does not warn that omitting it removes all timeout protection.

Needs Verification

  • Whether this._fetch in DaemonClient applies any internal timeout wrapping that would mitigate the bypass in recapSession. The diff only shows _fetch being called directly — if the private implementation has its own timeout logic, the P2 above is a false positive.
  • Whether other non-strict session routes (/prompt, /cancel) in server.ts implement res.once('close') abort plumbing. If none do, the misleading comment is a pattern-level documentation debt rather than a this-PR-specific omission; if /prompt does, this route should match.

Validation Evidence

PRESENT — PR body includes exact curl commands with expected JSON responses for happy path, null-recap, and capability advertisement; test plan lists pass counts for all three affected packages (643/643, 180/180, 261/261); design doc updated with daemon access path and cancellation caveat.

This is an automated, advisory, comment-only review — it never approves or requests changes.


Test Coverage

  • P2 packages/acp-bridge/src/bridge.ts:2878-2879 — The info.isDying branch is untested. When a session's ACP channel is in the process of shutting down (isDying === true), generateSessionRecap throws SessionNotFoundError. No test exercises this concurrent-shutdown path. Other bridge methods (e.g., setSessionApprovalMode) likely have the same gap, but since this is new code, a test using the existing FakeAgent/channel patterns with a dying channel would verify the guard works as intended — otherwise a recap call racing a session close could produce an unclear transport error instead of the documented 404.

  • P2 packages/acp-bridge/src/bridge.ts:2883-2889 — The 60s backstop timeout (SESSION_RECAP_TIMEOUT_MS) is untested. If the ACP child never responds to qwen/control/session/recap, the withTimeout race should reject with a timeout error. A test using a recapFactory that never resolves its promise would confirm the backstop fires and surfaces a meaningful error rather than hanging the HTTP route indefinitely. The existing setApprovalMode tests may not cover timeout races either, but the recap timeout is a new constant with new semantics (60s vs 10s default).

  • P3 packages/acp-bridge/src/bridge.ts:2893 — The defensive response.recap ?? null normalization is untested. If the ACP child returns { sessionId, recap: undefined } (or omits recap entirely), the bridge normalizes it to null. A test where the fake agent returns recap: undefined would pin this contract — minor because the core helper is documented to return string | null, but the normalization exists precisely because the author doesn't fully trust that invariant.

Needs Verification

  • packages/cli/src/acp-integration/acpAgent.ts:2257-2260 — The RequestError.invalidParams guard for missing/empty sessionId in the ext-method handler is not directly unit-tested. The bridge tests use a FakeAgent stub that always receives a valid sessionId from the bridge layer (which has already validated the session exists). A malformed ext-method call directly to the agent is not exercised. Whether this matters depends on whether other ext-method handlers in acpAgent.ts have direct unit tests — if the established pattern is to only test via bridge roundtrips, this is consistent and lower-priority.

Validation Evidence

PRESENT — The PR body includes exact curl commands against a freshly built daemon (POST /session/:id/recap, capability check via GET /capabilities, empty-history null-recap path) with observed JSON output. Test reports are cited with pass counts (643/643, 180/180, 261/261). The manual smoke commands demonstrate the full HTTP→bridge→ACP→core roundtrip including the recap: null best-effort path.

This is an automated, advisory, comment-only review — it never approves or requests changes.


Maintainability / Performance

  • P2 packages/sdk-typescript/src/daemon/DaemonClient.ts:986 — JSDoc claims the method "Bypasses fetchTimeoutMs because the underlying side-query can take longer than the default 30s budget," but the implementation does nothing to bypass it. opts?.signal is only set when the caller provides one; when omitted, _fetch receives no signal and — if it applies fetchTimeoutMs internally — would enforce the default 30s timeout on a call documented to need up to 60s. Compare with setApprovalMode in the same file, which also passes opts?.signal but makes no bypass claim. Impact: a daemon client calling recapSession() without an explicit signal may hit an unexpected timeout on slow model responses, contradicting the documented contract. Fix: verify _fetch's timeout behavior when no signal is passed; if fetchTimeoutMs applies, either pass an explicit longer AbortSignal.timeout(SESSION_RECAP_TIMEOUT_MS) or remove the bypass claim from the JSDoc.

  • P2 packages/cli/src/serve/server.ts:1481 — Route comment states "disconnect is detected via res.once('close') for the bridge-side cancellation," but the handler registers no close listener on the response. The bridge's generateSessionRecap races extMethod against getTransportClosedReject (ACP transport close), not HTTP response close — client disconnect does not propagate to the bridge-side wait. Only the 60s backstop timeout provides a ceiling. Impact: future maintainers debugging client-disconnect scenarios will believe the abort path exists at the route layer when it does not; the 60s backstop is the only safety net for abandoned requests. Fix: either implement the disconnect listener (e.g., res.once('close', () => controller.abort()) with a bridge-level abort signal) or correct the comment to match the actual v1 behavior ("client disconnect does not abort the bridge-side wait; 60s backstop timeout is the ceiling").

  • P3 packages/acp-bridge/src/bridge.ts:2892extMethod response is cast via as { sessionId: string; recap: string | null } with no runtime shape check. If the ACP child returns an unexpected structure (e.g., after a child-side refactor), response.recap ?? null silently produces recap: null — indistinguishable from a legitimate best-effort null. Impact: silent data corruption path masked by the null-tolerant contract. Fix: add a lightweight typeof guard: if (typeof response?.recap !== 'string' && response?.recap !== null) throw ... before returning, or narrow with a type predicate.

Needs Verification

  • Whether DaemonClient._fetch applies fetchTimeoutMs when no signal is passed — determines if the P2 JSDoc/code mismatch in recapSession is a latent runtime bug or only a documentation error. Requires reading _fetch implementation (not in diff).

Undirected Audit

  • P2 packages/sdk-typescript/src/daemon/DaemonClient.ts:998–1012recapSession bypasses fetchTimeoutMs entirely (JSDoc states this is intentional), and signal is optional. A caller who omits signal gets no client-side timeout — the bridge has a 60 s backstop, but network-level hangs (TCP half-open, DNS stall) are invisible to the bridge and would strand the caller indefinitely. Every other session-scoped SDK method presumably applies fetchTimeoutMs as a safety net; this is the outlier.

    • Impact: SDK consumer footgun — the method signature doesn't hint that omitting signal removes all timeout protection.
    • Suggested fix: Apply a sensible backstop via AbortSignal.any([opts?.signal, AbortSignal.timeout(120_000)]). (or document the bypass prominently in the method signature, e.g. signal required or a dedicated timeoutMs that defaults to a safe ceiling).
  • P2 packages/cli/src/acp-integration/acpAgent.ts:2253–2258 — The sessionRecap ext-method handler validates typeof sessionId !== 'string' || sessionId.length === 0 but does not apply the SESSION_ID_RE format check that sibling handlers (deleteSession, setSessionApprovalMode) use. If those handlers reject malformed IDs before touching session state, this handler should match — even though recap is read-only, the inconsistency means a future refactor that adds write-side-effects to the recap path inherits no input validation.

    • Impact: Inconsistent defense-in-depth; future-maintainer landmine.
    • Suggested fix: Add !SESSION_ID_RE.test(sessionId) to the guard, matching deleteSession.

Needs Verification

  • sessionOrThrow may already enforce SESSION_ID_RE. If this.sessionOrThrow(sessionId) internally validates the format (or can only return sessions whose IDs matched the regex at creation time), the acpAgent finding above is a false positive. The diff does not include sessionOrThrow's definition, so this needs a codebase check.

  • Other SDK session methods and fetchTimeoutMs. The recapSession bypass is only a consistency concern if sibling methods (prompt, setApprovalMode, etc.) apply fetchTimeoutMs. If they also bypass it for similar latency reasons, the pattern is established and this is not an outlier — verify against DaemonClient.ts methods not in the diff.

Validation Evidence

PRESENT — PR body includes a manual end-to-end smoke test with exact curl commands and observed JSON output for the happy path, the null-recap path, and capability advertisement. Unit tests cover all three layers (server route, bridge, SDK) with 15 new test cases documented with pass counts.

This is an automated, advisory, comment-only review — it never approves or requests changes.


Qwen Code /review · qwen3.7-max · tier DEEP

@wenshao

wenshao commented May 25, 2026

Copy link
Copy Markdown
Collaborator

本地验证报告(maintainer review)

把本 PR 的单 commit 01323fda7 cherry-pick 到当前 origin/daemon_mode_b_main81b46c203,落后 PR base 74c5d4505 3 个 commits)做"合并后状态"模拟。Cherry-pick 干净,3 个文件 auto-merge 成功,diff 仍然 19 文件 +584/-11,与 PR 描述完全一致。

验证矩阵(merged 状态)

检查 结果 备注
packages/acp-bridge npm test 300 / 300 pass(7 文件) 含本 PR 新增的 3 个 generateSessionRecap 用例
packages/cli vitest src/serve/server.test.ts 261 / 261 pass 含本 PR 新增的 6 个 /session/:id/recap 路由用例 + 容量注册顺序更新
packages/sdk-typescript npm test 648 / 648 pass(14 文件) recapSession / DaemonSessionRecapResult 等新增用例
4 包 tsc --noEmit(acp-bridge / cli / core / sdk-typescript) 0 errors
仓库根 npm run lint 0 errors 全部干净
仓库根 npm install(含 prepare → build succeed 1372 deps 装齐,全 workspace dist 产物正常

E2E smoke test(真机:boot daemon + curl)

启动 node packages/cli/dist/index.js serve --port 4170(loopback,无 token),逐项验证 PR 文档契约:

场景 期望(PR 文档) 实际 结果
GET /capabilities 中是否广告 session_recap feature 已注册 "features":[..., "session_recap", ...]
POST /session/<unknown-id>/recap 404 404 {"error":"No session with id ...", "sessionId":"..."}
POST /session/<id>/recap 携带非法 X-Qwen-Client-Id: bad@id 400 {code:'invalid_client_id'} 400 {"error":"...","code":"invalid_client_id"}
POST /session/<id>/recap 空 body(GET-style) 200 + {sessionId, recap} 200 {"sessionId":"...","recap":"<sentence>"}
POST /session/<id>/recap Content-Type: application/json body {} 同上 200 同 shape
真实 prompt 一轮后 recap 200 + 非 null 单句 200 + 单句 recap
非阻塞 SSE:路由不影响主流 不发 SSE 事件 (文档承诺,未在本次量化验证)

实际响应样例(已隐去具体 sessionId):

{"sessionId":"","recap":"Setting up context for the Qwen Code project in /tmp/pr4504-merged. Next: specify your development task or question to begin."}

代码评审要点

  • 架构复用:完全沿用 setSessionApprovalMode 已有 ext-method 路径(HTTP route → bridge.generateSessionRecapextMethod('qwen/control/session/recap')acpAgentcore/services/sessionRecap.ts#generateSessionRecap),不引入新基础设施。
  • 网关姿态:route 选择 non-strict gate — 与 /session/:id/prompt 一致(耗 token、不变更状态)。运维方仍可通过 --token 全局锁定。
  • 取消语义诚实:v1 是 best-effort — client 断开会终止 bridge 端等待,但 ACP child 内的 LLM 调用会跑完(侧查询限 maxOutputTokens: 300,1–5s 典型)。60s backstop 防 ACP 通道卡死。这一点 PR 文档与实现都明确标注,没有误导性的"完整取消"承诺。
  • null 不抛错:保持 core helper 既有 best-effort 契约 —— 历史不足/侧查询失败/底层 model 错都返回 200 + recap:null,便于客户端去模板化处理。
  • 类型出口DaemonSessionRecapResult 在 sdk-typescript 顶层 barrel re-export,public-surface 测试有 type-lock 用例固化。

风险与遗留

  • ✅ 无 settings/migration;不影响主 chat history、不发 SSE 事件 —— 路由本身对系统状态完全无副作用(除 token 计费)。
  • ✅ Auto-trigger / SSE push / request-id-based cancel 等 v2 工作 PR 描述已明确划入 follow-up,不阻塞本 PR。
  • ⚠️ 仅在 macOS 验证;route 实现是字符串/JSON 处理,无平台 syscall 依赖,跨平台风险低。
  • ⚠️ PR base 落后 daemon_mode_b_main 3 个 commit(74c5d4505 → 81b46c203),但 cherry-pick 干净,GitHub merge 时会自动重算。无需作者手工 rebase。

结论

建议合并到 daemon_mode_b_main。所有可归因于本 PR 的检查(unit / typecheck / lint / build / 真机 E2E)全部通过;契约(route shape、HTTP 状态码、capability 注册、null 容忍)与 PR 文档完全一致。


验证环境:macOS Darwin 25.4.0,Node v22.17.0,npm 11.8.0。tmux 多 worktree 并行(PR-as-is + cherry-picked-on-daemon_mode_b_main 对照),单 commit cherry-pick 无冲突。

@chiga0

chiga0 commented May 25, 2026

Copy link
Copy Markdown
Collaborator

Review: feat(serve): add POST /session/:id/recap (#4504)

Reviewed against daemon_mode_b_main 46f8d48f1. Clean, well-tested PR that reuses the existing ext-method roundtrip — no new infra. One correctness-of-documentation issue worth fixing before merge, plus a couple of minor notes.

Overview

Exposes generateSessionRecap via POST /session/:id/recapbridge.generateSessionRecapqwen/control/session/recap ext-method → acpAgent → core helper. Non-strict gate (parity with /prompt), best-effort recap: null surfaced as 200, capability tag session_recap, SDK recapSession / recap helpers. Good.

🔴 Should fix — cancellation claim vs. implementation mismatch

The PR body says "client disconnect aborts the bridge-side wait", and the route comment in server.ts says "disconnect is detected via res.once('close') for the bridge-side cancellation". Neither is implemented.

Compare the /prompt route (server.ts:1248-1280), which the PR claims to mirror:

const abort = new AbortController();
const onResClose = () => { if (!res.writableEnded) abort.abort(); };
res.once('close', onResClose);
...
await bridge.sendPrompt(sessionId, {...}, abort.signal, ...);

The new /recap route has no AbortController, no res.once('close'), and passes no signal — and HttpAcpBridge.generateSessionRecap(sessionId, context?) has no signal parameter to receive one anyway. So the only thing bounding a wedged call is the 60s SESSION_RECAP_TIMEOUT_MS backstop + the getTransportClosedReject race. Client disconnect does not abort the bridge-side wait.

That's a defensible v1 posture (recap is short, single-shot), but the comment + PR description are inaccurate. Two options:

  1. Fix the docs — drop the res.once('close') sentence from the route comment and reword the PR/qwen-serve-protocol.md cancellation note to: "no disconnect-driven cancellation in v1; bounded only by the 60s backstop + transport-closed race." (lowest effort, matches reality)
  2. Wire it for real — mirror /prompt: add res.once('close') + thread an AbortSignal param through generateSessionRecapwithTimeout/extMethod. More work, and the child-side generateSessionRecap(config, new AbortController().signal) in acpAgent.ts would still run to completion, so end-to-end cancel isn't achieved regardless.

Given the child-side honestly documents "v1: no cross-process abort plumbing", option 1 is the consistent choice — make the route comment + protocol doc say the same thing the child says.

🟡 Minor

  • acpAgent.ts passes new AbortController().signal (never aborts). Honest per the inline comment, but consider passing the ambient request/turn signal if one is reachable on session, so a session teardown mid-recap doesn't leave a dangling side-query. Not blocking.
  • Token-cost amplification. /recap is a non-strict-gated pure-cost endpoint (every call is an LLM side-query, no state benefit) and the daemon has no per-route rate limit today. On a no-token loopback default a buggy client can spam it to burn tokens. Consistent with the daemon's current posture, but worth a one-line "known limit" in qwen-serve.md next to the SDK-helper bullet.
  • Concurrent recaps on one session — two simultaneous /recap calls run two side-queries against the same GeminiClient history. Assuming the side-query is a read-only fork this is fine; worth a one-line confirmation that generateSessionRecap doesn't mutate shared chat state.

✅ Strengths

  • Reuses the setSessionApprovalMode ext-method pattern verbatim — zero new infrastructure.
  • Returns canonical entry.sessionId rather than the child-echoed value — resists body/echo spoofing.
  • Best-effort recap: null contract preserved and tested end-to-end (bridge + server + SDK).
  • Strong coverage: 15 new tests incl. the non-strict-gate pin test (guards against a future mass strict-flip regression), malformed X-Qwen-Client-Id → 400, URL encoding, AbortSignal propagation, 404.
  • Capability registry ordering + server.test.ts baseline updated together.
  • SESSION_RECAP_TIMEOUT_MS rationale (why 60s, why not inherit initTimeoutMs) is well documented.

Verdict

Approve once the cancellation comment + protocol-doc wording are reconciled with the actual implementation (option 1). Everything else is solid.

Per chiga0's review on QwenLM#4504 (option 1 — match docs to reality rather
than wire up cosmetic AbortController plumbing). The route, design doc,
and protocol reference all claimed "client disconnect aborts the
bridge-side wait" via `res.once('close')`, but the route has no such
listener and the bridge accepts no `AbortSignal`. The only ceilings
are the 60s `SESSION_RECAP_TIMEOUT_MS` backstop and the transport-
closed race against ACP channel death.

Wiring an HTTP-side AbortController in isolation would be cosmetic
because the ACP child handler also passes a never-aborting
`AbortController().signal` to the core helper (no cross-process abort
plumbing yet) — e2e cancel needs both layers. Recap is short (~1–5s,
`maxOutputTokens: 300`), so the absent cancellation is acceptable for
v1; a request-id-based cancel ext-method can land in a follow-up.

Also adds two known-limit bullets to the user guide per chiga0's other
minor notes: token-cost amplification on no-token loopback (no
per-route rate limit) and concurrent-recap safety (side-query reads
chat history via `GeminiClient.getChat().getHistory()` snapshot and
runs through a separate `BaseLlmClient`, never mutating the session's
`GeminiChat`).

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
@doudouOUC

Copy link
Copy Markdown
Collaborator Author

Thanks @chiga0 — went with option 1 in commit 058bde70f.

Reconciled three locations to match actual v1 behavior:

  • packages/cli/src/serve/server.ts — route comment now spells out: no res.once('close'), no AbortSignal threaded into the bridge, only the 60s SESSION_RECAP_TIMEOUT_MS backstop + getTransportClosedReject race. Explicit reasoning that wiring an HTTP-side controller in isolation would be cosmetic since the child-side generateSessionRecap(config, new AbortController().signal) runs to completion anyway.
  • docs/developers/qwen-serve-protocol.md — "Cancellation: none in v1" with the same explanation.
  • docs/design/session-recap/session-recap-design.md — daemon access path section now says "Cancellation is absent in v1" instead of the misleading "best-effort".

Also picked up your two minor doc requests in docs/users/qwen-serve.md:

  • Token-cost amplification known limit — non-strict gate + no per-route rate limit in v1; configure --token (and optionally --require-auth) on shared hosts.
  • Concurrent recap safety — verified runSideQuery calls a separate BaseLlmClient.generateText against a getChat().getHistory() snapshot; the session's GeminiChat is never mutated. Safe to call from multiple clients without coordination.

Left as-is for follow-up (per your "not blocking" labels):

  • Threading session-teardown signal into acpAgent.ts's new AbortController().signal — that needs the cross-process abort plumbing to be useful, deferred with the request-id-based cancel ext-method.
  • ci-bot's additional P2/P3 items (isDying-branch test, 60s-timeout race test, runtime guard on the as cast, SESSION_ID_RE check in the ext-method handler) — will land in a separate small PR once this merges to daemon_mode_b_main.

Tests + typecheck + lint all still green after the doc edits.

@doudouOUC

Copy link
Copy Markdown
Collaborator Author

Thanks @wenshao — appreciate the cherry-pick validation. Just landed commit 058bde70f taking chiga0's option 1 on the cancellation docs (no code change to the route itself; just reconciled the comments + protocol doc + design doc to match the actual v1 behavior of "60s backstop only, no disconnect-driven cancel").

For your two earlier inline [Suggestion]s — both deferred to a follow-up PR after this merges:

  • acpAgent.ts:2247 test coverage for the new ext-method handler — will add a describe block mirroring renameSession's pattern.
  • DaemonClient.ts:1001 fetchWithTimeout wrapping — agreed the raw _fetch is the outlier; will switch to fetchWithTimeout with a RECAP_DEFAULT_TIMEOUT_MS (90s) backstop so omitting signal no longer leaves callers exposed to unbounded hangs.

Holding both for the follow-up PR to keep this one minimal-doc-fix-only per chiga0's gate.

@doudouOUC

Copy link
Copy Markdown
Collaborator Author

Thanks for the deep review — partially taken in commit 058bde70f, rest deferred.

Taken (commit 058bde70f, doc-only per maintainer chiga0's recommendation):

  • Cancellation comment/doc mismatch (your P2 mentioned in correctness, maintainability, and undirected-audit tiers, 3×) — reconciled server.ts route comment, qwen-serve-protocol.md, and session-recap-design.md to match actual v1 behavior. Going with the "match docs to reality" path rather than wiring an HTTP-side AbortController, because the child-side generateSessionRecap(config, new AbortController().signal) still runs to completion (no cross-process abort plumbing), so a route-only controller would be cosmetic.

Deferred to a small follow-up PR after this merges:

  • recapSession SDK timeout (your P2 in maintainability + undirected-audit) — agreed _fetch direct call is the outlier; switching to fetchWithTimeout with RECAP_DEFAULT_TIMEOUT_MS=90_000 is the right move. wenshao raised the same point.
  • info.isDying branch test (P2) — will add using FakeAgent/channel pattern with a dying channel.
  • 60s SESSION_RECAP_TIMEOUT_MS race test (P2) — will add a never-resolving recapFactory to confirm the backstop fires.
  • extMethod response shape runtime guard (P3) — will add typeof response?.recap !== 'string' && response?.recap !== null throw.
  • SESSION_ID_RE check in the ext-method handler (P2, needs-verify) — confirmed sessionOrThrow only does Map.get and does NOT enforce the regex; sibling handlers deleteSession/renameSession do. Will add for consistency.
  • recap ?? null normalization test (P3) — cheap one-line test.

Not taken:

  • "Pass ambient session-teardown signal into acpAgent.ts's new AbortController().signal" — chiga0 also flagged this and labeled it not-blocking; deferred with the request-id-based cancel ext-method when there's a real e2e abort story.

Tests + typecheck + lint all still green.

@qwen-code-ci-bot

Copy link
Copy Markdown
Collaborator

Qwen Code Review (DEEP)

CI-safe profile adapted from bundled /review; focused passes: correctness/security, test coverage, maintainability/performance, and undirected audit.

I need to examine the actual code to verify correctness and security claims. Let me read the key files.

Now let me verify the fetchWithTimeout pattern used by adjacent SDK methods versus the raw _fetch used by the new recapSession:

Correctness / Security

  • P2 packages/sdk-typescript/src/daemon/DaemonClient.ts (new recapSession method)recapSession calls this._fetch directly, bypassing fetchWithTimeout. Every other non-streaming method in the class (42 call sites) uses fetchWithTimeout; only subscribeEvents (long-lived SSE) and prompt (LLM calls that can take minutes) legitimately bypass it. Recap is maxOutputTokens: 300, single-attempt, ~1–5s typical — the prompt justification doesn't apply. A caller that omits signal (optional parameter) has no client-side timeout ceiling: if the daemon hangs or the network stalls, the promise never settles. The bridge has its own 60s backstop (SESSION_RECAP_TIMEOUT_MS), but that only covers the bridge→ACP leg; the SDK→daemon HTTP connection is unprotected. Suggested fix: switch to fetchWithTimeout with the consume callback pattern used by heartbeat, cancel, etc., or at minimum wrap with a RECAP_DEFAULT_TIMEOUT_MS (e.g. 90s) backstop. Author acknowledged in PR comment, deferred to follow-up.

  • P2 packages/acp-bridge/src/bridge.ts:2897 (new generateSessionRecap)as { sessionId: string; recap: string | null } is an unguarded type assertion on the ext-method response. If the ACP child returns an unexpected shape (e.g. { sessionId: "x" } with recap absent, or recap: {} from a future refactoring), response.recap ?? null silently yields null — indistinguishable from the legitimate best-effort null. No runtime validation of typeof response?.recap. Low probability given the controlled protocol, but a one-line guard (if (typeof response?.recap !== 'string' && response?.recap !== null) throw) would prevent silent data loss. Author acknowledged as P3, deferred.

  • P3 packages/cli/src/acp-integration/acpAgent.ts (new sessionRecap case) — The ext-method handler validates typeof sessionId !== 'string' || sessionId.length === 0 but does not apply SESSION_ID_RE (/^[0-9a-fA-F-]{32,36}$/). Sibling handlers deleteSession and renameSession enforce the regex. sessionApprovalMode also skips it, so the new handler matches the closest analogue, but the inconsistency means a malformed sessionId like "foo bar" passes initial validation and only fails at sessionOrThrow (Map miss) — the error message differs from what the regex-rejected handlers produce. Low severity since sessionOrThrow still catches it, but a uniform validation layer would prevent divergent error shapes. Author acknowledged, deferred.

Needs Verification

  • info.isDying branch in bridge.generateSessionRecap — The diff shows if (!info || info.isDying) throw new SessionNotFoundError(sessionId), but channelInfoForEntry and isDying are not in the diff context. Cannot verify from the supplied diff whether isDying is the correct field name or whether the SessionNotFoundError mapping is appropriate (vs. a transport-specific error). No test covers this branch. Author acknowledged the missing test, deferred.

Validation Evidence

PRESENT — PR body includes manual curl smoke test (exact commands, expected JSON output for happy path, empty-history null, and capability advertisement), plus per-package unit test counts (643/643, 180/180, 261/261). Sufficient for an API-addition PR; no TUI/visual surface to validate.

This is an automated, advisory, comment-only review — it never approves or requests changes.


Test Coverage

  • P1 packages/cli/src/acp-integration/acpAgent.ts:2247-2275 — The new case SERVE_CONTROL_EXT_METHODS.sessionRecap: ext-method handler (~30 lines, 4 branches) has zero test coverage in acpAgent.test.ts. Every sibling ext-method handler added in recent PRs (renameSession at line 2254, rewindSession at line 1695, restoreSessionHistory at line 1814) has a dedicated describe block with happy-path, error-path, and edge-case tests. The new handler's untested branches are:

    1. Invalid/missing sessionIdRequestError.invalidParams (params validation)
    2. Unknown sessionsessionOrThrow (map miss)
    3. Happy pathgenerateSessionRecap(config, signal) returning a recap string
    4. Null recap passthrough → core helper returns null, handler surfaces it verbatim

    The bridge-side test (bridge.test.ts) and HTTP route test (server.test.ts) exercise the full roundtrip through a FakeAgent, but they mock at the ext-method transport layer — they never validate that the real QwenAgent.extMethod handler correctly extracts sessionId from params, calls sessionOrThrow, threads the never-aborting signal, or maps the core helper's return value. A regression in the handler (e.g., wrong param key, missing typeof guard, swapped return fields) would pass the existing test suite.

    Suggested test: add a describe('QwenAgent extMethod sessionRecap', ...) block in acpAgent.test.ts mirroring the renameSession pattern — mock generateSessionRecap via the @qwen-code/qwen-code-core mock, exercise all four branches. The author acknowledged this deferral in a PR comment.

  • P2 packages/acp-bridge/src/bridge.ts:2886if (!info || info.isDying) throw new SessionNotFoundError(sessionId). The isDying channel branch is unreachable in the existing test suite: bridge.test.ts covers the byId.get miss (unknown sessionId) but never constructs a dying-channel entry. A regression that drops this guard (e.g., sending an ext-method into a torn-down ACP channel) would silently hang until the 60s backstop fires. Suggested test: add a case using FakeAgent with a pre-closed channel, call generateSessionRecap, assert SessionNotFoundError.

  • P2 packages/acp-bridge/src/bridge.ts:2888-2895 — The Promise.race([withTimeout(…, SESSION_RECAP_TIMEOUT_MS, …), getTransportClosedReject(entry)]) has two non-happy paths (backstop timeout fires, transport closes mid-flight) — neither is tested. The 60s timeout in particular is a safety-critical guard for a wedged ACP channel; without a test, a refactor that breaks withTimeout's rejection semantics would go unnoticed. Suggested test: a recapFactory that never resolves, assert the bridge rejects within ~60s (use vi.useFakeTimers to avoid wall-clock wait).

  • P3 packages/acp-bridge/src/bridge.ts:2897recap: response.recap ?? null normalizes undefined to null. The existing bridge tests always return {recap: string} or {recap: null} — the undefinednull normalization (defensive against a future ext-method response shape change) is untested. One-line test: have the factory return {sessionId: '...'} with no recap field, assert result.recap is null.

Needs Verification

  • No cross-file concerns beyond what the diff and existing test files already confirm.

Validation Evidence

PRESENT — The PR body includes an end-to-end smoke transcript (curl -sX POST http://127.0.0.1:4170/session/$SID/recap showing both the recap-string and null-recap responses, plus capability advertisement verification). Unit test counts are reported per package (643/643 sdk, 180/180 bridge, 261/261 cli). The author also posted a response to reviewer chiga0 confirming tests + typecheck + lint all green after the doc-reconciliation commit.

This is an automated, advisory, comment-only review — it never approves or requests changes.

The P1 finding maps to a specific diff line. Posting inline comment:

Test Coverage

  • P1 packages/cli/src/acp-integration/acpAgent.ts:2247 — The new case SERVE_CONTROL_EXT_METHODS.sessionRecap: ext-method handler (~30 lines, 4 branches) has zero test coverage in acpAgent.test.ts. Every sibling ext-method handler added in recent PRs (renameSession — 4 tests at line 2254, rewindSession — 4 tests at line 1695, restoreSessionHistory — 4 tests at line 1814) has a dedicated describe block. The new handler's untested branches:

    1. Invalid/missing sessionIdRequestError.invalidParams (params validation)
    2. Unknown sessionsessionOrThrow (map miss)
    3. Happy pathgenerateSessionRecap(config, signal) returns a recap string
    4. Null recap passthrough → core helper returns null, surfaced verbatim

    The bridge test (bridge.test.ts) and HTTP route test (server.test.ts) mock at the ext-method transport layer — they never validate the real QwenAgent.extMethod handler's param extraction, session lookup, signal threading, or return-value mapping. A regression in the handler (wrong param key, missing typeof guard, swapped return fields) passes the full test suite silently.

    Suggested fix: add describe('QwenAgent extMethod sessionRecap', ...) in acpAgent.test.ts mirroring the renameSession pattern — mock generateSessionRecap via the existing @qwen-code/qwen-code-core vi.mock, cover all four branches. Author acknowledged this deferral in PR comments.

  • P2 packages/acp-bridge/src/bridge.ts:2886if (!info || info.isDying) throw new SessionNotFoundError(sessionId): the isDying channel branch is unreachable in existing tests. bridge.test.ts covers the byId.get miss but never constructs a dying-channel entry. A regression that drops this guard would send ext-methods into a torn-down ACP channel, silently hanging until the 60s backstop. Suggested test: FakeAgent with a pre-closed channel → assert SessionNotFoundError.

  • P2 packages/acp-bridge/src/bridge.ts:2888-2895Promise.race([withTimeout(…, SESSION_RECAP_TIMEOUT_MS), getTransportClosedReject(entry)]): neither the backstop-timeout path nor the transport-closed path is tested. The 60s timeout is a safety-critical guard for wedged ACP channels; without a test, a refactor breaking withTimeout's rejection semantics goes unnoticed. Suggested test: never-resolving recapFactory + vi.useFakeTimers() to advance past 60s, assert rejection.

  • P3 packages/acp-bridge/src/bridge.ts:2897recap: response.recap ?? null normalizes undefinednull. Existing tests always return {recap: string} or {recap: null} — the undefined normalization (defensive against future ext-method shape changes) is untested. Suggested test: factory returns {sessionId: '...'} with no recap field, assert result.recap is null.

Needs Verification

  • No cross-file concerns beyond what the diff and existing test files confirm.

Validation Evidence

PRESENT — PR body includes an end-to-end smoke transcript (curl against a live daemon showing recap-string and null-recap responses, plus GET /capabilities verification). Per-package unit test counts reported (643/643 sdk, 180/180 bridge, 261/261 cli). Author confirmed tests + typecheck + lint green after the doc-reconciliation commit 058bde70f.

This is an automated, advisory, comment-only review — it never approves or requests changes.


Maintainability / Performance

  • P2 packages/sdk-typescript/src/daemon/DaemonClient.ts:~997recapSession calls this._fetch directly, bypassing fetchWithTimeout. In the existing codebase, _fetch is reserved for two justified cases: prompt() (model turns can take minutes) and subscribeEvents (indefinite SSE stream). Every other SDK method — including the sibling cancel() which is also a short fire-and-forget POST — goes through fetchWithTimeout. The JSDoc justification ("can take longer than the default 30s budget under a slow model") overstates the risk: the core helper has maxOutputTokens: 300 with 1–5s typical latency, and the bridge itself enforces a 60s SESSION_RECAP_TIMEOUT_MS backstop. A caller who omits signal has no client-side ceiling and will hang indefinitely if the daemon is unresponsive (bridge timeout fires server-side but the SDK's TCP socket stays open). Author has acknowledged and committed to switching to fetchWithTimeout with a RECAP_DEFAULT_TIMEOUT_MS = 90_000 in a follow-up — flagging here so it doesn't get lost.

    Suggested fix:

    async recapSession(
      sessionId: string,
      opts?: { signal?: AbortSignal; clientId?: string },
    ): Promise<DaemonSessionRecapResult> {
      return await this.fetchWithTimeout(
        `${this.baseUrl}/session/${encodeURIComponent(sessionId)}/recap`,
        {
          method: 'POST',
          headers: this.headers(
            { 'Content-Type': 'application/json' },
            opts?.clientId,
          ),
          body: '{}',
          signal: opts?.signal,
        },
        async (res) => {
          if (!res.ok) throw await this.failOnError(res, 'POST /session/:id/recap');
          return (await res.json()) as DaemonSessionRecapResult;
        },
      );
    }

Needs Verification

  • No concerns requiring code outside the supplied context. The route, bridge, ACP handler, and SDK layers are all present in the diff and follow established sibling patterns (/session/:id/prompt, /session/:id/approval-mode, cancel). Route-level error handling via sendBridgeError, parseClientIdHeader, and the mutate() gate all match existing conventions. The Promise.race([withTimeout(...), getTransportClosedReject(...)]) pattern in the bridge is consistent with other ext-method roundtrips.

Validation Evidence

PRESENT — PR body includes an end-to-end manual smoke test transcript (curl commands against a running daemon showing 200 with recap text, 200 with recap: null on empty history, and capability advertisement). Unit test counts reported: sdk-typescript 643/643, acp-bridge 180/180, cli 261/261, build clean.

This is an automated, advisory, comment-only review — it never approves or requests changes.


Undirected Audit

  • P2 packages/cli/src/serve/server.ts:~1496 (via sendBridgeError at line 2520) — When the bridge's 60s SESSION_RECAP_TIMEOUT_MS backstop fires, withTimeout throws BridgeTimeoutError. sendBridgeError has no specific branch for this error type — it falls through to the generic 500 handler, which writes a free-text error message with no code field. Every other route that can time out (session init via initTimeoutMs, MCP restart) either has dedicated error mapping or a structured errorKind. The recap protocol doc (qwen-serve-protocol.md) lists 400 and 404 as the error envelope but doesn't mention the 500 paths (bridge timeout, transport death). SDK consumers branching on response.code or error.status can't distinguish "bridge timeout on a wedged ACP channel" from "unexpected daemon crash." Impact: automated tooling that retries on 5xx will retry a wedged-channel timeout (which will just time out again), while one that treats all 500s as fatal will tear down a session that's still alive behind a slow model. Suggested fix: add a BridgeTimeoutError branch in sendBridgeError → 504 with code: 'bridge_timeout', and document the 5xx paths in the protocol reference's error section.

  • P2 packages/cli/src/acp-integration/acpAgent.ts:~2254 — The new sessionRecap ext-method handler validates typeof sessionId !== 'string' || sessionId.length === 0 but omits the SESSION_ID_RE.test(sessionId) check that every sibling handler in the same switch block applies (deleteSession:1629, renameSession:1648, closeSession:1696, and one more at 1729). The author acknowledged this in PR comments as a deferred item, but flagging here because the inconsistency is a cross-file coupling issue that focused per-file passes may underweight: a future handler copied from this one inherits the weaker validation, widening the gap. Impact: low immediate risk (the bridge's byId.get catches unknown IDs first), but it creates a "validation debt" that compounds with each new ext-method. Suggested fix: add !SESSION_ID_RE.test(sessionId) to the guard, matching the sibling pattern.

  • P3 packages/cli/src/serve/server.ts:~1487 — The route comment block is now a 12-line prose essay explaining what the route does not do (no disconnect listener, no AbortSignal, no cross-process abort). While accurate and well-reasoned, this is a future-maintainer trap: the comment is the only guardrail preventing someone from "fixing" the absent cancellation by wiring res.once('close') without also plumbing the cross-process abort on the ACP child side (which would be cosmetic, as the comment explains). If the cross-process abort plumbing lands in a follow-up, this comment must be updated in lockstep or it becomes actively misleading. Suggested fix: extract the v1 cancellation contract into a single named constant or function (e.g., // RECAP_V1_NO_CANCEL — see design doc §Daemon access path) so the invariant has one update point.

Needs Verification

  • The diff references packages/acp-bridge/src/bridge.ts which doesn't exist on the current working tree — the actual bridge implementation lives in packages/cli/src/serve/httpAcpBridge.ts. This suggests the PR targets a branch where the bridge has been extracted into its own package (acp-bridge), but the extraction isn't visible on this branch. The generateSessionRecap bridge method, withTimeout utility, getTransportClosedReject, and channelInfoForEntry all need to be verified against the actual target-branch bridge file to confirm the diff applies cleanly and the SESSION_RECAP_TIMEOUT_MS constant is placed in the correct scope.

  • The generateSessionRecap core helper reads geminiClient.getChat().getHistory() — the PR docs claim this is a snapshot that doesn't mutate the session's GeminiChat. This is plausible from the code (getHistory() typically returns a copy), but should be verified against the GeminiClient implementation to confirm no shared-reference mutation is possible under concurrent recap calls.

Validation Evidence

PRESENT — The PR includes: (1) command transcripts (curl examples in the PR body showing happy path, null-recap, and capability advertisement), (2) test reports (643/643 SDK, 180/180 bridge, 261/261 CLI), (3) build verification (npm run build clean). The evidence covers the core wire contract and error paths. Missing: no before/after evidence of the 60s timeout actually firing (the backstop path is untested — acknowledged by the author as a deferred item).

This is an automated, advisory, comment-only review — it never approves or requests changes.


Qwen Code /review · qwen3.7-max · tier DEEP

@QwenLM QwenLM deleted a comment from qwen-code-ci-bot May 25, 2026
@qwen-code-ci-bot

Copy link
Copy Markdown
Collaborator

Qwen Code Review (DEEP)

CI-safe profile adapted from bundled /review; focused passes: correctness/security, test coverage, maintainability/performance, and undirected audit.

Correctness / Security

No correctness or security issues found.

The implementation is well-structured and follows established codebase patterns at every layer:

  • Bridge generateSessionRecap mirrors the existing sendSessionExtMethod helper pattern (byId.getchannelInfoForEntryisDying check → Promise.race with withTimeout + getTransportClosedReject), with a custom 60s backstop instead of the default initTimeoutMs. The isDying → SessionNotFoundError mapping matches setSessionApprovalMode (line 3368 of httpAcpBridge.ts). Using entry.sessionId instead of response.sessionId in the return avoids trusting the ACP child's response identity. response.recap ?? null normalizes undefined from a degenerate child response.

  • acpAgent.ts ext-method handler uses the same typeof sessionId !== 'string' || sessionId.length === 0 validation as its closest sibling sessionApprovalMode (line 1588), both being SERVE_CONTROL_EXT_METHODS handlers. The deleteSession/renameSession handlers that use SESSION_ID_RE are a different class of ext-method (session lifecycle, not serve-control). sessionOrThrow does Map.get so invalid IDs produce a clean invalidParams error regardless.

  • HTTP route uses bare mutate() (non-strict), matching /session/:id/prompt, /session/:id/heartbeat, /session/:id/load, /session/:id/resume, and /session (create). parseClientIdHeader + sendBridgeError follow the established error-mapping contract.

  • SDK recapSession uses raw this._fetch without fetchWithTimeout, matching the prompt method (line 1045) — both are non-strict, session-scoped POST routes where the caller controls cancellation via signal. encodeURIComponent(sessionId) prevents path traversal.

  • Core generateSessionRecap reads geminiClient.getChat().getHistory() (snapshot, no mutation) and routes through runSideQuery → separate BaseLlmClient.generateText — concurrent recap calls are safe as documented.

  • Second commit correctly reconciles the cancellation documentation to match actual v1 behavior (no HTTP disconnect listener, no AbortSignal plumbing, 60s backstop only), removing the misleading "client disconnect aborts the bridge-side wait" claim from three locations.

Needs Verification

No concerns requiring verification outside the supplied context.

Validation Evidence

PRESENT — the PR body includes:

  • Per-package test counts: sdk-typescript 643/643 ✓, acp-bridge 180/180 ✓, cli 261/261 ✓, npm run build clean (0 errors).
  • End-to-end smoke transcript: curl -sX POST http://127.0.0.1:4170/session/$SID/recap | jq . showing both populated-recap and null-recap (empty-history) responses, plus capability advertisement probe.
  • Test plan enumerates specific test counts per package and per describe-block (+6 SDK, +3 bridge, +6 server route, +1 public-surface type lock).

This is an automated, advisory, comment-only review — it never approves or requests changes.


Test Coverage

  • P1 packages/cli/src/acp-integration/acpAgent.ts:2244–2275 — The new SERVE_CONTROL_EXT_METHODS.sessionRecap ext-method handler has zero test coverage. This is 30 lines of production code in the ACP child process with three distinct branches:

    1. Invalid/missing sessionId validation → RequestError.invalidParams
    2. sessionOrThrow(sessionId) → unknown-session error path
    3. Happy path calling generateSessionRecap(config, new AbortController().signal) and returning {sessionId, recap}

    Every other SERVE_CONTROL_EXT_METHODS handler in the codebase has at least one test via the extMethod test harness (renameSession has a full describe block at line 2254, sessionApprovalMode has the approval-mode roundtrip). The new handler is the only one with none. The author explicitly acknowledged this gap in a PR comment ("will add a describe block mirroring renameSession's pattern") and deferred it. Impact: a regression in sessionId extraction, sessionOrThrow semantics, or the return-shape contract would go undetected. Suggested test: add a describe('QwenAgent extMethod sessionRecap') block with at minimum (a) happy path returning the recap, (b) missing sessionId throwing invalidParams, (c) unknown sessionId via sessionOrThrow.

  • P2 packages/acp-bridge/src/bridge.ts (generateSessionRecapisDying branch)if (!info || info.isDying) throw new SessionNotFoundError(sessionId) is untested. The three bridge tests cover: success, null-recap passthrough, and unknown sessionId via byId.get miss. The isDying path (session exists but channel is dying) is a distinct code path that could regress silently. Author acknowledged and deferred.

  • P2 packages/acp-bridge/src/bridge.ts (generateSessionRecap — 60s timeout race)Promise.race([withTimeout(..., SESSION_RECAP_TIMEOUT_MS, ...), getTransportClosedReject(entry)]) has no test confirming the backstop fires on a wedged ACP channel. The withTimeout utility is presumably tested in isolation, but the integration — confirming the recap call actually rejects with the timeout error when the ext-method never resolves — is uncovered. Author acknowledged and deferred.

  • P3 packages/acp-bridge/src/bridge.ts (recap ?? null normalization)recap: response.recap ?? null defensively coerces undefined to null, but no test exercises the case where the ext-method response omits the recap field entirely. Minor — one-line test, author acknowledged.

Needs Verification

  • The acpAgent.ts test file on the PR branch (daemon_mode_b_main) may have additional test infrastructure for ext-method handlers beyond what's visible on the current branch. The acpAgent.test.ts on the current branch has no sessionApprovalMode handler tests either, so it's possible the testing convention for control ext-methods differs from what's observable here. Verify against the PR branch whether an acpAgent-level test exists for sessionRecap that wasn't included in the diff.
  • The core generateSessionRecap function (packages/core/src/services/sessionRecap.ts) is not modified by this PR and presumably has its own test coverage — the gap is only at the daemon integration layers.

Validation Evidence

PRESENT — PR body includes: (1) exact test counts per package (643/643, 180/180, 261/261) with pass confirmation, (2) npm run build clean, (3) end-to-end smoke commands (curl against live daemon for happy path, empty-history null path, and capability advertisement), (4) author PR comments explicitly enumerate deferred test items with rationale.

This is an automated, advisory, comment-only review — it never approves or requests changes.


Maintainability / Performance

No maintainability or performance issues found.

The material concerns in this diff — recapSession bypassing fetchWithTimeout (leaving callers without a default timeout), the absent SESSION_ID_RE validation in the ACP ext-method handler, the unguarded as cast on the extMethod response, and missing test coverage for the isDying branch and 60s backstop timeout — are all already raised and acknowledged in existing PR comments, with the author committing to a follow-up PR for each. Re-flagging them would be noise.

The code follows established codebase conventions: bare mutate() for non-strict gating (parity with /prompt, /cancel, /heartbeat, /model), parseClientIdHeader null/undefined protocol, encodeURIComponent on SDK path segments, Promise.race against withTimeout + getTransportClosedReject in the bridge (same shape as setSessionApprovalMode). Comment density is high but matches surrounding files.

Needs Verification

No concerns requiring additional context.

Validation Evidence

PRESENT — the PR body includes an end-to-end smoke transcript (curl against a freshly built daemon: happy-path recap, null-recap on empty history, capability advertisement), plus per-package test counts (643/643 SDK, 180/180 bridge, 261/261 CLI) with specific test descriptions.

This is an automated, advisory, comment-only review — it never approves or requests changes.


Undirected Audit

  • P2 packages/sdk-typescript/src/daemon/DaemonClient.ts:~1001recapSession calls this._fetch(...) directly, bypassing fetchWithTimeout. Every other non-streaming method in DaemonClient routes through fetchWithTimeout, which layers an AbortSignal.timeout(fetchTimeoutMs) on top of the caller's signal. When a caller omits signal (the common case for a one-shot recap fetch), the only ceiling is the daemon-side 60s SESSION_RECAP_TIMEOUT_MS plus TCP-level timeouts. If the daemon accepts the connection but stalls before responding (e.g., bridge wedged on a different ext-method serializing the ACP channel), the SDK call hangs indefinitely. The author has acknowledged this in PR comments and deferred to a follow-up with fetchWithTimeout + RECAP_DEFAULT_TIMEOUT_MS. Impact: SDK consumers without an explicit AbortSignal.timeout() have no client-side backstop — the call can pin a socket past any reasonable expectation. Suggested fix: switch to fetchWithTimeout with a dedicated recap timeout constant (90s proposed), or at minimum wrap in Promise.race against a local backstop when opts?.signal is absent. Tracked as follow-up.

  • P3 packages/cli/src/acp-integration/acpAgent.ts:~2259 — The new sessionRecap ext-method handler validates typeof sessionId !== 'string' || sessionId.length === 0 but does not apply the SESSION_ID_RE regex that sibling handlers (deleteSession, renameSession, closeSession, setSessionApprovalMode) all enforce. sessionOrThrow only does a Map.get — no format guard. An arbitrary string reaches Map.get (safe, no injection) but diverges from the defense-in-depth posture of its neighbors. Impact: low — the bridge layer already keys sessions by valid IDs, so an invalid format reaching the child is unlikely in practice. But a future caller that bypasses the bridge (direct ACP test harness, fuzzing) would see inconsistent validation. Suggested fix: add if (!sessionId || !SESSION_ID_RE.test(sessionId)) to match the four sibling handlers. Author has acknowledged and deferred to follow-up.

No additional undirected-audit issues found.

Needs Verification

No concerns requiring code outside the supplied context.

Validation Evidence

PRESENT — PR body includes:

  • Exact commands: curl -sX POST http://127.0.0.1:4170/session/$SID/recap | jq . with observed JSON output (happy path and null-recap path)
  • Capability advertisement check: curl -s http://127.0.0.1:4170/capabilities | jq '.features | ...'
  • Per-package unit test counts: sdk-typescript 643/643 ✓, acp-bridge 180/180 ✓, cli 261/261 ✓
  • npm run build clean (0 errors)

This is an automated, advisory, comment-only review — it never approves or requests changes.


Qwen Code /review · qwen3.7-max · tier DEEP

}
const session = this.sessionOrThrow(sessionId);
const config = session.getConfig();
// v1: no cross-process abort plumbing. Client disconnect aborts

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The reconciliation in commit 058bde70f corrects the cancellation narrative in 3 docs + the server.ts route comment, but this inline comment still says "Client disconnect aborts the bridge-side wait" — the exact false claim the PR disproves everywhere else.

The new server.ts comment explicitly says "This matches the ACP child's acpAgent.ts handler, which also passes a never-aborting AbortController().signal" — pointing readers here for corroboration, but the comment they land on says the opposite.

Suggested change
// v1: no cross-process abort plumbing. Client disconnect aborts
// v1: no cross-process abort plumbing. The bridge does not listen
// for HTTP client disconnect and no AbortSignal is threaded through
// the ext-method, so the LLM call in this child always runs to
// completion. The only ceilings are the bridge's 60s
// `SESSION_RECAP_TIMEOUT_MS` backstop and the transport-closed race
// against ACP channel death. Acceptable because recap is short
// (single-attempt side-query, `maxOutputTokens: 300`). A future
// request-id-based cancel ext-method can plumb a real signal
// end-to-end if the bandwidth cost ever becomes an issue.

— qwen3.7-max via Qwen Code /review

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks — good catch, applied verbatim in commit 40c23eed8. You're right that 058bde7's server.ts comment pointed here for corroboration but landed readers on the opposite claim. The new comment now mirrors the route's wording ("no AbortSignal threaded through the ext-method, only 60s SESSION_RECAP_TIMEOUT_MS backstop + transport-closed race") so the cross-reference is consistent end-to-end.

…-method

The previous commit (058bde7) reconciled the cancellation narrative
in 3 doc files + the route comment in server.ts, but missed the inline
comment inside the ACP child's `SERVE_CONTROL_EXT_METHODS.sessionRecap`
handler. That comment still claimed "Client disconnect aborts the
bridge-side wait" — the exact false statement 058bde7 was meant to
remove from the codebase. Worse, the new server.ts comment from 058bde7
points readers at this handler for corroboration ("This matches the ACP
child's `acpAgent.ts` handler ..."), so a reader following that crumb
would land on a comment saying the opposite.

Per @wenshao's `[Suggestion]` review on QwenLM#4504, applying his suggested
replacement verbatim. Comment-only change; no behavior delta.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
@doudouOUC

Copy link
Copy Markdown
Collaborator Author

Thanks — all three items in this review are already tracked and deferred to the follow-up PR after this merges. See the breakdown in #issuecomment-4534337349:

  • recapSession fetchWithTimeout switch (your P2) — tracked.
  • Bridge as cast runtime guard (your P2) — tracked.
  • SESSION_ID_RE consistency in the ext-method handler (your P3) — tracked. Note: the v2 review further down correctly observes this matches sessionApprovalMode's posture rather than deleteSession/renameSession, so it's lower-priority than originally classified, but still on the list.

For info.isDying test coverage (your "Needs Verification") — also already on the follow-up list as a bridge-test addition using the existing FakeAgent channel pattern.

@doudouOUC

Copy link
Copy Markdown
Collaborator Author

Thanks — clean bill on correctness/security noted. The P1 test coverage finding (acpAgent sessionRecap ext-method handler describe block) is already tracked in #issuecomment-4534337349 — wenshao raised the same concern as a [Suggestion] and it's first on the follow-up PR's list along with fetchWithTimeout, runtime guard, and the SESSION_ID_RE consistency check.

Appreciate the explicit confirmation that runSideQuery doesn't mutate GeminiChat and that entry.sessionId (vs child-echoed) avoids identity spoofing — those were exactly the contracts the user-guide bullet was meant to surface.

Comment thread packages/acp-bridge/src/bridge.ts Outdated
* provider hiccups without inheriting the 10s `initTimeoutMs` default
* (which would false-fire on any GPT-style slow start). The race is a
* safety net against a wedged ACP channel — actual cancellation on
* client disconnect is handled at the HTTP route layer.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The JSDoc for SESSION_RECAP_TIMEOUT_MS states "actual cancellation on client disconnect is handled at the HTTP route layer," but the server.ts route comment (line 1486) explicitly says the opposite: "v1 cancellation: NONE on the route side. There is intentionally no res.once('close') listener and no AbortSignal plumbed into bridge.generateSessionRecap."

The reconciliation in commits 058bde70f and 40c23eed8 corrected 3 doc files, the server.ts route comment, and the acpAgent.ts ext-method handler comment — but this bridge-layer JSDoc was missed. A reader following the server.ts cross-reference would land on a claim that contradicts the route's own documentation.

Suggested change
* client disconnect is handled at the HTTP route layer.
* safety net against a wedged ACP channel there is no HTTP-side
* disconnect cancellation in v1 (see server.ts route comment).

— qwen3.7-max via Qwen Code /review

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks — applied verbatim in commit 0a1931113. Did a wider sweep this time and caught two more stale "cancellation via signal" claims in the SDK JSDocs (DaemonClient.recapSession + DaemonSessionClient.recap); both now spell out that the signal aborts only the local HTTP fetch and the daemon/child layers ignore it. Hopefully the cancellation narrative is now consistent end-to-end across all six locations.

… SDK JSDocs

Third pass on the same task. wenshao caught one more spot in
`bridge.ts:330` (JSDoc for `SESSION_RECAP_TIMEOUT_MS` claimed "actual
cancellation on client disconnect is handled at the HTTP route layer"
— the exact opposite of what the route comment + protocol doc + design
doc + acpAgent comment all now say).

Pre-empting another round-trip by sweeping the rest of the codebase
and fixing the two remaining misleading SDK JSDocs in the same go:

- `DaemonClient.recapSession`: previously said "cancellation is via
  the optional signal" without qualifying that the signal aborts ONLY
  the local HTTP fetch. The daemon-side wait + the child-side LLM call
  both ignore it. Spelled out the layered reality: signal → fetch
  cancellation only; bridge → 60s backstop; ACP child → always runs to
  completion. Also corrected the "bypasses fetchTimeoutMs" claim — the
  raw `_fetch` simply doesn't go through that wrapper at all.
- `DaemonSessionClient.recap`: same clarification on the wrapper that
  delegates to `recapSession`.

Comment-only changes; no behavior delta.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 4 at 0a1931113: R3 Suggestion (bridge.ts:330 comment contradiction) addressed, plus two additional stale SDK cancellation JSDoc claims fixed. Full re-review of the 19-file diff (9 agents, 2 reverse audit rounds) found 0 new issues. Build clean, 1083/1084 tests pass (1 pre-existing flaky test outside PR scope). All prior suggestions resolved. — qwen3.7-max via Qwen Code /review

@wenshao

wenshao commented May 26, 2026

Copy link
Copy Markdown
Collaborator

维护者本地真实测试验证报告 (PR #4504)

把 PR HEAD 0a191311 合到 latest daemon_mode_b_main(81b46c20)后,使用 tmux session pr4504 跑了完整构建 / 类型检查 / lint / 文件级单测 / 真实 qwen serve 端到端冒烟。结论是 可以合并 — 所有测试通过,recap 端点在真实 daemon 上 HTTP 行为与 PR 描述完全一致,non-strict gate / 404 / invalid_client_id 400 等边界用例都符合预期。

合并 / 构建

步骤 结果
Base daemon_mode_b_main @ 81b46c20
PR HEAD 0a191311
合并提交 4163a47f (merge clean, 无冲突,自动 merge 了 bridge.test.ts/bridge.ts/sdk-typescript/src/daemon/index.ts)
npm ci EXIT 0
npm run build EXIT 0 (0 errors,15 个与本 PR 无关的 lint warning)
npm run typecheck (5 个 workspace) EXIT 0
聚焦 lint(本 PR 改动的 11 个 .ts) EXIT 0,无 warning/error

文件级单元测试

PR 描述里给出的就是按文件计数,这里也按相同口径复核(全部 vitest run --coverage=false):

测试文件 PR 描述 实测
sdk-typescript: DaemonClient.test.ts + DaemonSessionClient.test.ts + daemon-public-surface.test.ts 3 files,155 passed
acp-bridge/src/bridge.test.ts 180 / 180 1 file,186 passed
cli/src/serve/server.test.ts 261 / 261 1 file,261 passed(完全吻合)

为了确认本 PR 没拖累相邻代码,顺便跑了三个 package 各自的全量套件:

passed
sdk-typescript 全量 648 / 648
acp-bridge 全量 300 / 300
cli/src/serve/ 全量 651 / 651

端到端冒烟(真实 qwen serve daemon + curl)

# 启动 daemon(loopback,默认无 token,ephemeral port)
node /private/tmp/pr4504-merged/dist/cli.js serve --port 0 --hostname 127.0.0.1
# qwen serve listening on http://127.0.0.1:61409 (mode=http-bridge, workspace=/private/tmp/qwen-recap-ws)
# qwen serve: bearer auth disabled (loopback default). Set QWEN_SERVER_TOKEN to enable.

1. Capability 暴露 & 排序

curl -s $URL/capabilities | jq '.features | map(select(. == "session_recap"))'
# → [ "session_recap" ]

features 列表里的位置正好紧挨 workspace_mcp_restart,与 PR "slotted next to workspace_mcp_restart" 完全一致:

"workspace_init",
"workspace_mcp_restart",
"session_recap",        ← 新增,顺序符合预期
"mcp_workspace_pool",

2. 创建 session(无 body)

curl -sX POST $URL/session -H 'Content-Type: application/json' -d '{}' | jq .
# {
#   "sessionId": "a0f845a6-ae6a-4fe4-9afd-5845a4325fe2",
#   "workspaceCwd": "/private/tmp/qwen-recap-ws",
#   "attached": false,
#   "clientId": "client_1bcc0be6-...",
#   "createdAt": "2026-05-26T01:27:26.746Z"
# }

3. POST /session/$SID/recap — happy path(空 body)

curl -sX POST $URL/session/$SID/recap -H 'Content-Type: application/json' | jq .
# {
#   "sessionId": "a0f845a6-ae6a-4fe4-9afd-5845a4325fe2",
#   "recap": "Setting up the Qwen Code context in an empty workspace. Next: specify the project or task to begin."
# }

注:由于本地 daemon 已配好 LLM 凭证,即使一条 prompt 都没发,因为 daemon 启动时往 history 里塞了 qwen.md 上下文,generateSessionRecap 真的发起了一次 side-query(单 attempt,maxOutputTokens: 300),~3s 返回了一句真实的 recap。这反过来正面验证了:路由 → bridge → ACP child → core generateSessionRecap → LLM 整条链路在 daemon 进程模型下确实工作,不只是 mock 通过。Result shape 与 PR 描述一致({ sessionId, recap })。

4. X-Qwen-Client-Id header 传播

curl -sX POST $URL/session/$SID/recap -H 'X-Qwen-Client-Id: my-client-42' | jq .
# 200 + 正常 recap;non-strict gate 下 unknown client 也允许调用(与 server.test.ts:2666 注释一致)

5. 404 — unknown session

curl -sX POST $URL/session/sess:does-not-exist/recap -w "\nHTTP_STATUS=%{http_code}\n"
# {"error":"No session with id \"sess:does-not-exist\"","sessionId":"sess:does-not-exist"}
# HTTP_STATUS=404

6. 400 invalid_client_id — malformed header

curl -sX POST $URL/session/$SID/recap -H 'X-Qwen-Client-Id: bad client id with spaces' -w "\nHTTP_STATUS=%{http_code}\n"
# {"error":"`X-Qwen-Client-Id` must be a non-empty token of 128 characters or fewer","code":"invalid_client_id"}
# HTTP_STATUS=400

(空字符串 X-Qwen-Client-Id: 被视作 "no header" 而非 malformed,返回 200 — 与 parseClientIdHeader 实现一致,server.test.ts:1718 也用 'bad client id' 这种带空格的值作为 malformed 触发条件,空 header 不算 malformed。)

期望矩阵 vs 实测

检查项 期望 实测
features 包含 session_recap true true
session_recap 紧邻 workspace_mcp_restart true true
POST /session/:id/recap 200 + {sessionId, recap} shape
Unknown session → 404 404 404
Malformed X-Qwen-Client-Id → 400 invalid_client_id 400 400
Non-strict gate(未知 client 不被拒) 200 200
路由 → bridge → ACP child → LLM 真实闭环 LLM 真发出 LLM 返回真实 recap,~3s
bridge.test.ts 文件级测试 180 / 180 186 / 186
server.test.ts 文件级测试 261 / 261 261 / 261

复现

git fetch origin pull/4504/head:pr-4504-test
git fetch origin daemon_mode_b_main
git worktree add -b pr4504-merge-test /tmp/pr4504-merged origin/daemon_mode_b_main
cd /tmp/pr4504-merged && git merge pr-4504-test --no-edit
npm ci && npm run build

# 单元测试
(cd packages/sdk-typescript && npx vitest run --coverage=false \
  test/unit/DaemonClient.test.ts \
  test/unit/DaemonSessionClient.test.ts \
  test/unit/daemon-public-surface.test.ts)
(cd packages/acp-bridge && npx vitest run --coverage=false src/bridge.test.ts)
(cd packages/cli && npx vitest run --coverage=false src/serve/server.test.ts)

# E2E:启动 + curl
node dist/cli.js serve --port 0 --hostname 127.0.0.1
# 在另一个 shell 拿 URL 后跑 curl(见上文具体命令)

结论

  • ✅ 与 latest base 干净合并,无冲突
  • ✅ build / typecheck / lint 全绿
  • ✅ 文件级单元测试与 PR 描述完全吻合(其中 server.test.ts: 261/261 精确匹配,bridge.test.ts: 186 略多于 PR 时点的 180,但全部通过)
  • ✅ 真实 qwen serve daemon 端到端冒烟:capability / happy path / 404 / 400 / non-strict gate 全部符合 PR 描述
  • ✅ recap 真实 LLM 调用闭环走通(~3s,单 attempt),证明不只是 mock 通过
  • 同意合并。

@doudouOUC doudouOUC merged commit 3363f9d into QwenLM:daemon_mode_b_main May 26, 2026
4 checks passed
@doudouOUC doudouOUC deleted the feat/daemon-session-recap branch May 26, 2026 07:09
doudouOUC added a commit that referenced this pull request May 27, 2026
Squashed feature work from daemon_mode_b_main branch, rebased onto
latest main to establish proper merge-base and clean PR diff.

Original commits:
- perf(core): F2 cleanup PR A — R9/W11/W12/R10 (post-merge follow-ups) (#4411)
- refactor(acp-bridge): F1 test split — lift bridge.test.ts (6861 LOC) to acp-bridge (#4445)
- fix(core): F2 cleanup PR B — self-heal observability (W133-a + W134) (#4460)
- feat(sdk/daemon-ui): unified completeness follow-up to #4328 (#4353)
- docs(serve): v0.16-alpha known limits + SDK QWEN_SERVER_TOKEN env fallback (PR 27) (#4473)
- docs(deploy): local launch templates for v0.16-alpha (PR 30a) (#4483)
- feat(daemon+sdk): cross-client real-time sync completeness (#4484)
- feat(serve): add POST /session/:id/recap (#4504)
- feat(daemon): add voterClientId to permission_resolved (A4) (#4539)
- feat(serve): --allow-origin <pattern> CORS allowlist (T2.4 #4514) (#4527)
- feat(daemon): in-session model switch reaches the bus (A1) (#4546)
- feat(serve): prompt absolute deadline + SSE writer idle timeout (#4514 T2.9) (#4530)
- Feat/daemon react cli (#4380)
doudouOUC added a commit that referenced this pull request May 27, 2026
* feat(serve): add POST /session/:id/recap

Wraps generateSessionRecap (core/services/sessionRecap.ts) so daemon
clients can fetch a one-sentence "where did I leave off" summary
without driving the agent through a full prompt turn. Mirrors the
ext-method roundtrip used by /session/:id/approval-mode — bridge
forwards `qwen/control/session/recap` to the ACP child, which calls
the existing core helper against the per-session GeminiClient history.

- Route: non-strict mutation gate (parity with /prompt — costs tokens
  but mutates no state)
- Capability tag: `session_recap`
- SDK: `client.recapSession(sessionId, opts)` +
  `session.recap(opts)` convenience wrapper
- 60s bridge-side backstop timeout; client-disconnect aborts the
  HTTP wait (LLM call in the child still completes — recap is short)
- Recap is best-effort: short history / transient model failure
  surfaces as 200 with `recap: null`, not an error

Tests cover the route (200 happy path, 200 null recap, client-id
context, 404 on unknown session, malformed client-id, non-strict gate
posture), the bridge ext-method roundtrip (success, null recap,
SessionNotFoundError), the SDK client + session-client wrappers
(URL encoding, body, headers, signal propagation, 404 throw), and a
public-surface type lock for `DaemonSessionRecapResult`.

Closes part of #4175 (Top 5 ROI port #1 from the daemon coverage gap
inventory). Targets daemon_mode_b_main integration branch.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* docs(serve): reconcile recap cancellation docs with actual v1 behavior

Per chiga0's review on #4504 (option 1 — match docs to reality rather
than wire up cosmetic AbortController plumbing). The route, design doc,
and protocol reference all claimed "client disconnect aborts the
bridge-side wait" via `res.once('close')`, but the route has no such
listener and the bridge accepts no `AbortSignal`. The only ceilings
are the 60s `SESSION_RECAP_TIMEOUT_MS` backstop and the transport-
closed race against ACP channel death.

Wiring an HTTP-side AbortController in isolation would be cosmetic
because the ACP child handler also passes a never-aborting
`AbortController().signal` to the core helper (no cross-process abort
plumbing yet) — e2e cancel needs both layers. Recap is short (~1–5s,
`maxOutputTokens: 300`), so the absent cancellation is acceptable for
v1; a request-id-based cancel ext-method can land in a follow-up.

Also adds two known-limit bullets to the user guide per chiga0's other
minor notes: token-cost amplification on no-token loopback (no
per-route rate limit) and concurrent-recap safety (side-query reads
chat history via `GeminiClient.getChat().getHistory()` snapshot and
runs through a separate `BaseLlmClient`, never mutating the session's
`GeminiChat`).

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* docs(serve): finish recap cancellation reconciliation in acpAgent ext-method

The previous commit (058bde7) reconciled the cancellation narrative
in 3 doc files + the route comment in server.ts, but missed the inline
comment inside the ACP child's `SERVE_CONTROL_EXT_METHODS.sessionRecap`
handler. That comment still claimed "Client disconnect aborts the
bridge-side wait" — the exact false statement 058bde7 was meant to
remove from the codebase. Worse, the new server.ts comment from 058bde7
points readers at this handler for corroboration ("This matches the ACP
child's `acpAgent.ts` handler ..."), so a reader following that crumb
would land on a comment saying the opposite.

Per @wenshao's `[Suggestion]` review on #4504, applying his suggested
replacement verbatim. Comment-only change; no behavior delta.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* docs(serve): finish recap cancellation reconciliation across bridge + SDK JSDocs

Third pass on the same task. wenshao caught one more spot in
`bridge.ts:330` (JSDoc for `SESSION_RECAP_TIMEOUT_MS` claimed "actual
cancellation on client disconnect is handled at the HTTP route layer"
— the exact opposite of what the route comment + protocol doc + design
doc + acpAgent comment all now say).

Pre-empting another round-trip by sweeping the rest of the codebase
and fixing the two remaining misleading SDK JSDocs in the same go:

- `DaemonClient.recapSession`: previously said "cancellation is via
  the optional signal" without qualifying that the signal aborts ONLY
  the local HTTP fetch. The daemon-side wait + the child-side LLM call
  both ignore it. Spelled out the layered reality: signal → fetch
  cancellation only; bridge → 60s backstop; ACP child → always runs to
  completion. Also corrected the "bypasses fetchTimeoutMs" claim — the
  raw `_fetch` simply doesn't go through that wrapper at all.
- `DaemonSessionClient.recap`: same clarification on the wrapper that
  delegates to `recapSession`.

Comment-only changes; no behavior delta.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants