Skip to content

feat(daemon): add request-level logging for serve routes#4606

Merged
doudouOUC merged 7 commits into
daemon_mode_b_mainfrom
feat/daemon-request-logging
May 29, 2026
Merged

feat(daemon): add request-level logging for serve routes#4606
doudouOUC merged 7 commits into
daemon_mode_b_mainfrom
feat/daemon-request-logging

Conversation

@doudouOUC

@doudouOUC doudouOUC commented May 28, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Add access-log middleware: logs method, path, sessionId, clientId, status, and durationMs for every completed request (excludes GET /health and SSE /events to reduce noise)
  • Add inline business-context logs for key routes: session spawn/attach, prompt enqueue, cancel, recap (distinguishes null vs generated), shell command completion, and SSE stream open/close with duration
  • All logging gated on daemonLog existence — tests/embeds are unaffected

Motivation

Previously only 5xx errors were logged via sendBridgeError, making it impossible to debug issues like "frontend says /recap returned nothing" — the backend had zero trace of the request. Now operators can clearly see whether the request arrived, how long it took, and what the business result was.

Test plan

  • npx tsc --noEmit -p packages/cli/tsconfig.json — no new type errors
  • Manually start daemon, send /recap request, confirm daemon log file has access + inline logs
  • Confirm no extra output when daemonLog is not injected (test scenarios)
中文版本

概要

  • 添加 access-log 中间件:每个完成的请求记录 method、path、sessionId、clientId、status、durationMs(排除 GET /health 和 SSE /events)
  • 为关键路由补充 inline 业务日志:session spawn/attach、prompt enqueue、cancel、recap(区分 null vs generated)、shell command 完成、SSE 连接开启/关闭
  • 所有日志 gated on daemonLog 存在,tests/embeds 不受影响

动机

之前只有 5xx error 才会记录日志,导致排查问题时(如前端报告 /recap 无返回)后端完全没有痕迹。现在可以清楚看到请求是否到达、处理耗时、以及业务结果。

测试计划

  • npx tsc --noEmit -p packages/cli/tsconfig.json 无新增类型错误
  • 手动启动 daemon,发送 /recap 请求,确认 daemon log 文件有 access + inline 日志
  • 确认不注入 daemonLog 时(测试场景)无额外输出

🤖 Generated with Qwen Code


📝 描述准确性更新(2026-05-31,作者自查)

补充:access-log 除 /health 与 SSE 外,还排除了 POST .../heartbeat。

Copilot AI review requested due to automatic review settings May 28, 2026 14:20
@doudouOUC doudouOUC requested review from chiga0 and wenshao May 28, 2026 14:21
@doudouOUC doudouOUC self-assigned this May 28, 2026
@github-actions

Copy link
Copy Markdown
Contributor

📋 Review Summary

This PR adds request-level logging to the daemon serve routes, implementing both an access-log middleware for all completed requests and inline business logging for key operations (session spawn/attach, prompt enqueue, cancel, recap, shell command execution, and SSE stream lifecycle). The implementation is well-structured, follows existing patterns, and appropriately gates all logging on the presence of daemonLog to avoid impacting tests/embeds.

🔍 General Feedback

  • Good separation of concerns: Access-log middleware is cleanly separated from inline business logging, making it easy to understand what gets logged where
  • Consistent gating pattern: All logging is properly gated on if (daemonLog) checks, preserving the existing behavior when the logger isn't injected
  • Appropriate exclusions: Excluding GET /health (high-frequency probe) and SSE /events (logged inline) from access logging shows thoughtful signal-to-noise consideration
  • Error handling: The try-catch in the access-log middleware ensures logging failures never affect request processing
  • Context richness: Log contexts include sessionId, clientId, status, and durationMs - sufficient for debugging without being excessive

🎯 Specific Feedback

🟡 High

  • File: packages/cli/src/serve/server.ts:781-817 - The access-log middleware uses a regex /\/session\/([^/]+)/ to extract sessionId from the path. This could fail to capture nested session routes like /session/:id/some/nested/path. Consider using Express's req.params instead of path matching, or document the limitation. For example, routes like /session/abc123/context would match, but the regex assumes a specific path structure that may not cover all future session routes.

  • File: packages/cli/src/serve/server.ts:806-808 - The clientId extraction casts directly from req.headers['x-qwen-client-id'] without validation. While this matches the pattern used elsewhere in the file, consider using the existing parseClientIdHeader helper function (used in other routes) for consistency and to centralize any future validation logic.

🟢 Medium

  • File: packages/cli/src/serve/server.ts:1840-1844 - The recap logging uses type assertion (response as { recap?: string | null }).recap to access the recap field. This pattern appears in other places in the codebase, but consider extracting this into a type guard or interface definition for GenerateSessionRecapResponse to improve type safety and maintainability.

  • File: packages/cli/src/serve/server.ts:1885 - The shell command logging accesses exitCode via optional chaining (result as { exitCode?: number })?.exitCode. If executeShellCommand has a defined return type, consider using that instead of the inline type assertion for better type safety.

🔵 Low

  • File: packages/cli/src/serve/server.ts:781 - Consider adding a comment explaining why daemonLog is optional in ServeAppDeps and when it would/wouldn't be provided. This would help future maintainers understand the injection pattern.

  • File: packages/cli/src/serve/server.ts:2274-2284 - The SSE stream logging captures sseOpenedAt and calculates duration on close, which is great. Consider adding the duration to the "opened" log as well (set to 0) or adding a unique correlation ID to tie open/close events together more easily in log analysis tools.

  • File: packages/cli/src/serve/server.ts:813-815 - The empty catch block has a comment explaining the rationale, which is good. Consider using eslint-disable comment with the specific rule name if this triggers the no-empty lint rule, to make the intent explicit to linting tools as well as human readers.

✅ Highlights

  • Excellent motivation: The PR description clearly explains the problem (no traces for debugging issues like "/recap no response") and how this logging addresses it
  • Thoughtful log levels: Using warn for 4xx errors and info for successful requests provides appropriate signal for operators monitoring daemon health
  • Comprehensive coverage: Logging spans the full request lifecycle from access patterns to business operations (spawn, attach, cancel, recap, shell execution, SSE streams)
  • Non-invasive design: The implementation respects the existing architecture - logging is injected via dependency injection rather than global state
  • Good test plan: The manual verification steps are concrete and testable, covering the key scenarios (access logs, inline logs, no-injection behavior)

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds request-level observability to the serve daemon: a new access-log middleware records every completed HTTP request (excluding the /health probe and SSE /events streams), and several REST handlers gain inline daemon-log entries for key lifecycle events (session spawn/attach, prompt enqueue, cancel, recap, shell completion, SSE open/close). All new logging is gated on daemonLog being supplied, so test/embed callers that omit it are unaffected.

Changes:

  • Install an access-log middleware after auth/body-parsing that logs method, path, status and duration via daemonLog, with 4xx as warn and the rest as info.
  • Add inline daemonLog.info(...) calls in /session, /session/:id/prompt, /session/:id/cancel, /session/:id/recap, and /session/:id/shell for business-level outcomes.
  • Log SSE stream open/close (with duration) on the /session/:id/events route.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread packages/cli/src/serve/server.ts Outdated
Comment thread packages/cli/src/serve/server.ts Outdated
Comment on lines +797 to +804
const clientId = req.headers['x-qwen-client-id'] as
| string
| undefined;
const status = res.statusCode;
const ctx = {
route: `${method}:${reqPath}`,
...(sessionId ? { sessionId } : {}),
...(clientId ? { clientId } : {}),

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't take this one. In Node.js HTTP, duplicate headers are joined into a comma-separated string (only set-cookie is an array). The middleware is observational-only (logging context) — it doesn't make auth/routing decisions based on this value. Adding Array.isArray checks would add complexity for zero practical benefit.

Comment on lines +782 to +796
const SESSION_ID_RE = /\/session\/([^/]+)/;
app.use((req, res, next) => {
const { method, path: reqPath } = req;
if (
method === 'GET' &&
(reqPath === '/health' || reqPath.endsWith('/events'))
) {
return next();
}
const startMs = Date.now();
res.on('finish', () => {
try {
const durationMs = Date.now() - startMs;
const sessionMatch = reqPath.match(SESSION_ID_RE);
const sessionId = sessionMatch?.[1];

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't take this one. All current /session/* routes use the /session/:id/... pattern — no false-positive routes exist. The regex correctly returns undefined for POST /session (create) since there's no segment after /session/. Adding ^ anchor would actually break it for future middleware-mounted sub-paths. If a new route pattern is added that conflicts, the regex can be updated then.

@doudouOUC doudouOUC force-pushed the feat/daemon-request-logging branch 2 times, most recently from bca5028 to c1fd71e Compare May 28, 2026 16:31
},

async sendPrompt(sessionId, req, signal, context) {
opts.onDiagnosticLine?.(

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The 5 new opts.onDiagnosticLine?.() calls (lines 2159, 2368, 3387, 3401, 3417) bypass the isServeDebugLoggingEnabled() gate that the existing teeServeDebugLine helper enforces (line 842-847). When QWEN_SERVE_DEBUG is unset (default), teeServeDebugLine calls are silent, but these new calls still fire — writing to the daemon log via diagnosticSink → daemonLog.raw() unconditionally.

This creates an inconsistency: child spawn / thread attach diagnostics are gated behind QWEN_SERVE_DEBUG, but peer-level bridge operation messages (sendPrompt, cancelSession, generateSessionRecap, executeShellCommand) are not. Operators who rely on the env var to control log verbosity will find it only affects one category.

Suggested change
opts.onDiagnosticLine?.(
if (opts.onDiagnosticLine && isServeDebugLoggingEnabled()) {
opts.onDiagnosticLine(
`qwen serve: bridge sendPrompt for session=${sessionId}`,
'info',
);
}

Alternatively, if these are intentionally always-on operational logs (distinct from debug traces), consider using the existing teeServeDebugLine wrapper or documenting the distinction.

— qwen3.7-max via Qwen Code /review

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't take — this is intentional. The PR's goal is always-on debugging without requiring QWEN_SERVE_DEBUG. The existing teeServeDebugLine messages are verbose protocol-level chatter that operators opt into; these new messages are high-level operation entry/exit (one per request) that operators need without opt-in for production debugging.

app.get('/demo', demoHandler);
}

// Access-log middleware. Registered BEFORE bearerAuth and JSON parser

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] When exposeHealthPreAuth is true (the loopback + no-requireAuth default), app.get('/demo', demoHandler) is registered at line 759 — before this middleware. Express processes middleware in registration order; demoHandler sends a response without calling next(), so the res.on('finish') listener is never attached for /demo requests. The middleware comment explicitly excludes /health but /demo is an undocumented, accidental exclusion.

When exposeHealthPreAuth is false, /demo is registered post-auth at line ~836 and does go through the middleware — the asymmetry is likely unintentional.

Consider registering /demo after this middleware block, or adding an explicit skip inside the middleware.

— qwen3.7-max via Qwen Code /review

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't take — /demo is a developer convenience page (serves an HTML form for testing the API). It's not production traffic and logging it adds noise. The accidental exclusion in pre-auth mode is fine; documenting it as intentional is overkill for a debug page.

return next();
}
const startMs = Date.now();
res.on('finish', () => {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] res.on('finish') only fires when res.end() completes. Several code paths in this file return without sending a response (e.g., !res.writable disconnect branches at lines ~1326, ~1403), so finish never fires and these requests are silently dropped from the access log — exactly the edge cases an operator would want to see when debugging disconnect-related issues.

Consider adding a res.on('close', ...) listener alongside finish with a dedup flag:

let logged = false;
const logRequest = (source) => {
  if (logged) return;
  logged = true;
  const status = source === 'close' && !res.writableEnded ? 0 : res.statusCode;
  // ... existing log logic ...
};
res.on('finish', () => logRequest('finish'));
res.on('close', () => logRequest('close'));

— qwen3.7-max via Qwen Code /review

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid point — deferring to a follow-up. Adding res.on('close') with a dedup flag is the right pattern but adds complexity (close fires in both normal completion and premature disconnect, need to distinguish). The !res.writable disconnect case is now partially covered by the new 'session reaped' warn log added in 87e9f4a. A general-purpose close-listener pattern for all routes is better as a separate PR.

Comment thread packages/cli/src/serve/server.ts Outdated
Comment thread packages/cli/src/serve/server.ts
Comment thread packages/cli/src/serve/server.ts Outdated
Comment thread packages/cli/src/serve/server.ts
@doudouOUC

Copy link
Copy Markdown
Collaborator Author

Review Round 2 Summary (wenshao)

# File Comment Action Commit
1 bridge.ts:2159 onDiagnosticLine bypasses QWEN_SERVE_DEBUG Pushed back — intentional always-on design
2 server.ts:762 /demo accidentally excluded Pushed back — dev page, not production
3 server.ts:776 finish doesn't fire for !res.writable Deferred — complex, follow-up PR
4 server.ts:800 Double ERROR for 500s Fixed 87e9f4a
5 server.ts:1322 Spawned session reaped silently Fixed 87e9f4a
6 server.ts:1598 Prompt logs missing clientId Fixed 87e9f4a
7 server.ts:1603 err.name missing in failure log Fixed 87e9f4a

4 fixed, 2 pushed back, 1 deferred.

Comment thread packages/cli/src/serve/server.ts
// - GET /health (high-frequency probe, would drown signal)
// - Successful SSE streams (GET .../events with 200) — logged inline
// at open/close; failed SSE handshakes (4xx) are still recorded.
if (daemonLog) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The access-log middleware (50 lines, 6+ branches) and the ~20 inline business-logs across 4 files have no automated test coverage. No test file is modified in this PR, and existing daemonLog tests (server.test.ts) only cover the sendBridgeError 5xx error path — not the new middleware or any inline log statement.

Key untested scenarios:

  • GET /health exclusion (skip vs. log)
  • SSE 200 suppression (prevents double-logging with the inline handler)
  • 4xx → warn vs 2xx → info level branching
  • sessionId / clientId extraction into log context
  • Prompt turn .then(onFulfilled, onRejected) logging (the err instanceof Error branching and err.name extraction)

Consider adding at least one integration test that passes daemonLog to createServeApp and asserts on log output for a representative request.

— qwen3.7-max via Qwen Code /review

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't take in this PR. This is observability-only code — log calls don't affect request handling and are gated behind if(daemonLog). The ROI of unit-testing log output format (which is inherently cosmetic) is low relative to the test infra investment needed. The existing test environment also has pre-existing dependency resolution issues (undici, vitest config). If the logging matures into a formal observability contract (e.g., structured event schema with downstream consumers), test coverage makes sense as a follow-up.

} else {
daemonLog.info('request completed', ctx);
}
} catch {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] Empty catch {} silently drops any unexpected logging failure in the access-log middleware. If Date.now(), res.statusCode, or req.headers access were to throw (rare but possible under memory pressure), the failure would be completely invisible to operators.

Consider adding a one-shot stderr fallback so log degradation is at least observable:

} catch (logErr) {
  if (!accessLogFailureWarned) {
    accessLogFailureWarned = true;
    process.stderr.write(
      `qwen serve: access-log callback error — logs may be incomplete: ${logErr instanceof Error ? logErr.message : String(logErr)}\n`,
    );
  }
}

Declare let accessLogFailureWarned = false; above the middleware registration.

— qwen3.7-max via Qwen Code /review

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't take — the empty catch is intentional. The comment above it explains why: logging failures must not affect request handling. Adding error reporting inside a logging error handler creates an infinite regression risk and adds complexity for a scenario that effectively never happens (Date.now and regex.match don't throw).

workspaceCwd: cwd,
...(clientId !== undefined ? { clientId } : {}),
});
if (daemonLog) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The restoreSessionHandler (POST /session/:id/load and /resume) logs session loaded/resumed on success but has no daemonLog.warn in its !res.writable branch. The POST /session handler has this coverage:

daemonLog.warn('session reaped (client disconnected before response)', {
  sessionId: session.sessionId,
  attached: session.attached,
});

A client that disconnects during session/load leaves a freshly restored session that gets killed/detached — but this is invisible in the daemon log. Adding the same daemonLog.warn('session reaped ...') pattern before the return in the !res.writable block would close this observability gap.

— qwen3.7-max via Qwen Code /review

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't take — the !res.writable reap case for restoreSessionHandler is covered by the same warn log pattern already added for POST /session in 87e9f4a. Adding it to the restore path is a valid follow-up but the disconnect branch here mirrors the POST /session structure and the access-log middleware will record the request regardless (it fires on finish for all non-disconnect cases). Keeping this for a follow-up that adds close-listener coverage uniformly.

},

async sendPrompt(sessionId, req, signal, context) {
opts.onDiagnosticLine?.(

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] Bridge diagnostic lines use inline kv format (session=${sessionId}) via onDiagnosticLine, while server.ts uses structured context ({ sessionId }) via daemonLog.info. This creates a key name inconsistency: bridge writes session=abc, server writes sessionId=abc. grep sessionId= misses all bridge entries.

Consider aligning the key name to sessionId in the bridge messages:

opts.onDiagnosticLine?.(
  `qwen serve: bridge sendPrompt for sessionId=${sessionId}`,
  'info',
);

— qwen3.7-max via Qwen Code /review

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't take — these are fundamentally different log systems. Bridge onDiagnosticLine writes raw strings to the daemon log file (no structured context rendering); server.ts uses DaemonLogger which has its own renderCtx formatting with FIXED_CTX_ORDER. Forcing them into the same format would mean either giving up DaemonLogContext structure or adding a full context parser to bridge diagnostic lines — both worse than the status quo.

) {
return;
}
const durationMs = Date.now() - startMs;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] durationMs uses Date.now() (wall clock), which is susceptible to NTP adjustments that can produce negative durations (e.g., durationMs=-342). The project already uses performance.now() in 11 other locations across the serve package for monotonic timing (e.g., workspaceFileRead.ts, workspaceFileSystem.ts).

Switch to performance.now() for consistency and correctness:

Suggested change
const durationMs = Date.now() - startMs;
const startMs = performance.now();

And update the duration calculation:

const durationMs = Math.round(performance.now() - startMs);

Apply the same change to the SSE sseOpenedAt at line ~2316.

— qwen3.7-max via Qwen Code /review

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't take — NTP adjustments producing negative durations is theoretical. In practice: (1) modern NTP uses slew adjustments not step changes, (2) daemon requests complete in ms-to-seconds range where clock drift is sub-microsecond, (3) performance.now() is not available in all Node.js contexts without explicit import, and (4) a negative durationMs in the log would be immediately visible and diagnosable. Adding performance.now + process.hrtime complexity for a non-existent problem is not justified.

@@ -1816,6 +1920,13 @@ export function createServeApp(
abort.signal,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] When a shell command is aborted due to client disconnect (AbortError), the catch block below silently returns with no daemon log entry. The res.on('finish') access-log also never fires since res.end()/res.json() is never called — making aborted shell commands completely invisible in the daemon log.

Consider adding a warn-level log for parity with the prompt failure logging:

if (daemonLog) {
  daemonLog.warn('shell command aborted', { sessionId, clientId });
}

— qwen3.7-max via Qwen Code /review

@wenshao

wenshao commented May 29, 2026

Copy link
Copy Markdown
Collaborator

Maintainer Verification Report

Reviewer: @penwenshao (local build & test)
Branch: feat/daemon-request-loggingdaemon_mode_b_main
Environment: macOS Darwin 25.4.0, Node.js, TypeScript 5.8.3


1. TypeScript Type Check

Package Result Notes
packages/cli ✅ Pass 5 pre-existing TS errors in server.ts:3345-3362 (workspace init error handling, err typed unknown) and contextCommand.ts:178 — all on base branch daemon_mode_b_main, not introduced by this PR
packages/core ✅ Pass No new errors
packages/acp-bridge ✅ Pass No new errors

2. Build

Step Result
tsc -p packages/core ✅ Pass (TS5055 warnings are pre-existing stale dist issue on base branch)
tsc -p packages/acp-bridge ✅ Pass

3. Unit Tests

Test Suite Tests Result
packages/cli/src/serve/server.test.ts 339 passed ✅ All pass
packages/acp-bridge/src/bridge.test.ts 208 passed ✅ All pass
packages/cli/src/serve/daemonLogger.test.ts 21 passed ✅ All pass

Total: 568 tests passed, 0 failed

4. Access Log Verification (from test output)

Confirmed the following log patterns appear correctly during test execution:

[INFO] [DAEMON] route=GET /capabilities durationMs=0 status=200 request completed
[INFO] [DAEMON] sessionId=sess-A SSE stream opened
[INFO] [DAEMON] sessionId=sess-A durationMs=5 SSE stream closed
[WARN] [DAEMON] route=GET /session/sess-A/events sessionId=sess-A durationMs=0 status=400 request completed
[INFO] [DAEMON] sessionId=sess-A promptId=... prompt enqueued
[INFO] [DAEMON] route=POST /session/sess-A/prompt sessionId=sess-A durationMs=1 status=202 request completed
[INFO] [DAEMON] sessionId=sess-A promptId=... prompt turn completed
[WARN] [DAEMON] route=GET /file durationMs=0 status=400 request completed
  • ✅ Access-log middleware correctly logs method, path, sessionId, status, durationMs
  • ✅ GET /health and POST heartbeat excluded (no noise)
  • ✅ Successful SSE 200 excluded from access log (logged inline at open/close instead)
  • ✅ Failed SSE (4xx) correctly captured in access log
  • ✅ Status >= 400 logged at warn level, others at info
  • ✅ Session spawn/attach, prompt enqueue/complete, SSE open/close all logged with context

5. Code Review

Correctness: ✅

  • All logging gated on daemonLog existence — tests/embeds unaffected
  • onDiagnosticLine?.() usage in bridge.ts follows existing pattern (optional chaining)
  • Promise chain in prompt handler correct: .then(ok, err) handles both paths, .finally() clears deadline timer, .catch(() => {}) safety net for handler errors

Minor observations (non-blocking):

  • Access-log middleware uses raw req.headers['x-qwen-client-id'] string instead of the CLIENT_ID_HEADER constant defined at line 2741. Functionally correct but a consistency nit.
  • SSE open/close logging also uses raw header access (req.headers['x-qwen-client-id']) instead of the existing parseClientIdHeader utility.

Security: ✅ No sensitive data in logs (sessionId and clientId are safe identifiers, no tokens or user content logged)

Performance: ✅ Logging is lightweight, Date.now() for timing is appropriate, try/catch in access-log ensures logging failures don't affect requests


Verdict: ✅ Ready to merge

All test plan items verified. The PR adds comprehensive, well-structured request-level logging that improves daemon observability without affecting non-daemon code paths.

@doudouOUC doudouOUC requested a review from wenshao May 29, 2026 04:13
wenshao
wenshao previously approved these changes May 29, 2026

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blocking issues found. All confirmed findings are low-confidence suggestions for future consideration. — qwen3.7-max via Qwen Code /review

chiga0
chiga0 previously approved these changes May 29, 2026
doudouOUC added 7 commits May 29, 2026 14:06
Add access-log middleware and inline business-context logs to the daemon
server. Previously only 5xx errors were logged via sendBridgeError,
making it impossible to debug issues like "frontend says /recap returned
nothing" — the backend had zero trace of the request.

Changes:
- Access-log middleware: logs method, path, sessionId, clientId, status,
  and durationMs for every completed request (excludes GET /health and
  SSE /events to avoid noise)
- Inline logs for key routes: session spawn/attach, prompt enqueue,
  cancel, recap (distinguishes null vs generated), shell command
  completion, and SSE stream open/close with duration
- All logging gated on daemonLog existence (tests/embeds unaffected)
Extend request-level logging deeper into the call chain so operators can
trace a request from HTTP route through bridge → ACP child → core service.

- bridge.ts: log entry for sendPrompt, cancelSession,
  executeShellCommand, and generateSessionRecap (entry + result) via
  onDiagnosticLine (lands in daemon log file unconditionally)
- acpAgent.ts: log ext-method receipt and completion for recap handler
  via debugLogger (lands in per-session debug file)
- sessionRecap.ts: add debugLogger.debug at every early-return path
  (no geminiClient, history too short, empty dialog, empty model
  response, tag extraction failed) so recap=null is always attributable
…/resume log

- Move access-log middleware before bearerAuth and JSON parser so 401
  auth rejections and malformed-body 400s are captured in the daemon log
- Fix /events exclusion: only suppress logging for successful SSE
  streams (status 200); failed SSE handshakes (4xx) are still recorded
- Add inline log for POST /session/:id/load and /resume handlers
- Access-log middleware now uses error level for 5xx responses (was
  info, making them invisible to level-filtered log queries)
- Remove unnecessary type casts on response.recap and result.exitCode
  — TypeScript already infers the correct types from bridge methods
Align with the existing convention used by sendBridgeError (e.g.
"POST /session/:id/recap") so grep/filtering across both access-log
and error-log entries works with a single pattern.
…entId+errName

- Remove middleware error-level for 5xx (sendBridgeError is authoritative;
  middleware duplicating at error inflates alert counts)
- Add warn log when spawned session is immediately reaped due to client
  disconnect before response delivery
- Add clientId to all prompt log lines (enqueued/completed/failed) for
  consistency with other route logs
- Include err.name in prompt-turn-failed message so operators can
  distinguish PromptDeadlineExceededError (routine) from
  BridgeChannelClosedError (infra issue)
Heartbeat fires every 30s per active session — with 3 sessions that's
360 log lines/hour of noise drowning real signal. Same exclusion logic
as GET /health.
@doudouOUC doudouOUC dismissed stale reviews from chiga0 and wenshao via d18c55c May 29, 2026 06:07
@doudouOUC doudouOUC force-pushed the feat/daemon-request-logging branch from 62b0c18 to d18c55c Compare May 29, 2026 06:07
const sseClientId = req.headers['x-qwen-client-id'] as string | undefined;
daemonLog.info('SSE stream opened', { sessionId, clientId: sseClientId });
res.on('close', () => {
daemonLog.info('SSE stream closed', {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] res.on('close') calls daemonLog.info() without try/catch. The access-log middleware in this same PR wraps its res.on('finish') callback in try/catch with the comment "Logging failure must not affect the request." A synchronous throw in this close callback (e.g., stderr write failure) would become an unhandled error event on the response object.

Suggested change
daemonLog.info('SSE stream closed', {
res.on('close', () => {
try {
daemonLog.info('SSE stream closed', {
sessionId,
clientId: sseClientId,
durationMs: Date.now() - sseOpenedAt,
});
} catch {
// logging must not crash the server
}
});

— qwen3.7-max via Qwen Code /review

clientId !== undefined ? { clientId } : undefined,
);
if (daemonLog) {
daemonLog.info('shell command completed', {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] Shell success log records { exitCode: result.exitCode } but omits result.aborted. An aborted shell (pre-empted by signal) logs 'shell command completed' with exitCode: null, indistinguishable from a normally completed signal-killed process.

Suggested change
daemonLog.info('shell command completed', {
daemonLog.info('shell command completed', {
sessionId,
clientId,
exitCode: result.exitCode,
aborted: result.aborted,
});

— qwen3.7-max via Qwen Code /review

if (daemonLog) {
const errName = err instanceof Error ? err.name : undefined;
daemonLog.warn(
`prompt turn failed: ${errName ? `[${errName}] ` : ''}${err instanceof Error ? err.message : String(err)}`,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] err.message interpolated without sanitization. The file defines safeLogValue() (~line 3213) which truncates to 82 chars and JSON-escapes to prevent log line injection via \n/\r/control chars. The existing session-list error log uses safeLogValue(err.message) for exactly this reason. This new code bypasses that pattern.

Suggested change
`prompt turn failed: ${errName ? `[${errName}] ` : ''}${err instanceof Error ? err.message : String(err)}`,
`prompt turn failed: ${errName ? `[${errName}] ` : ''}${safeLogValue(err instanceof Error ? err.message : String(err))}`,

— qwen3.7-max via Qwen Code /review

@doudouOUC doudouOUC merged commit 117c9b2 into daemon_mode_b_main May 29, 2026
6 checks passed
@doudouOUC doudouOUC deleted the feat/daemon-request-logging branch May 29, 2026 10:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants