Skip to content

feat(telemetry): add tool spans and session.id to daemon/ACP path#4630

Merged
doudouOUC merged 6 commits into
daemon_mode_b_mainfrom
feat/daemon-acp-tool-spans
May 30, 2026
Merged

feat(telemetry): add tool spans and session.id to daemon/ACP path#4630
doudouOUC merged 6 commits into
daemon_mode_b_mainfrom
feat/daemon-acp-tool-spans

Conversation

@doudouOUC

@doudouOUC doudouOUC commented May 29, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Add session.id attribute to llm_request, tool, and tool.execution spans in session-tracing.ts — makes all daemon spans queryable by session in ARMS
  • Wrap Session.ts runTool() with startToolSpan / runInToolSpanContext / endToolSpan + startToolExecutionSpan / endToolExecutionSpan around invocation.execute()
  • Emit logConversationFinishedEvent at turn end (inside withInteractionSpan, after #handleStopHookLoop)
  • Wrap #executeCronPrompt body in withInteractionSpan so cron tool calls also get proper trace hierarchy

Closes #4602 (Milestone 1 + Milestone 2 middle-ground approach)

Cherry-picked from PR #4608 (which targeted the wrong base branch).

Resulting trace hierarchy

[Daemon] qwen-code.daemon.request (route span)
  └── qwen-code.daemon.bridge (prompt.dispatch)
        │
        ▼ (cross-process, W3C traceparent in _meta)
[ACP Child] qwen-code.interaction (session.id=xxx, turn_status=ok)
  ├── qwen-code.llm_request (session.id=xxx, context=interaction)
  ├── qwen-code.tool (session.id=xxx, tool.name=Read, tool.call_id=...)
  │     └── qwen-code.tool.execution (session.id=xxx)
  ├── qwen-code.tool (session.id=xxx, tool.name=Bash, tool.call_id=...)
  │     └── qwen-code.tool.execution (session.id=xxx)
  ├── qwen-code.llm_request (second LLM call in same turn)
  └── conversation_finished (log event)

Test plan

  • packages/core typecheck passes
  • packages/cli typecheck passes (0 new errors; pre-existing acpAgent.ts errors from unmerged status types)
  • npx vitest run src/telemetry/session-tracing.test.ts — 71 tests pass
  • npx vitest run src/acp-integration/session/Session.test.ts — 75 tests pass
  • Verify with live OTLP backend that tool spans appear under interaction span
  • Verify session.id filterable in ARMS for daemon traces

🤖 Generated with Qwen Code

Add interaction-level and tool-level OTel spans to the daemon's ACP
Session.ts, closing the observability gap described in #4602.

Changes:
- session-tracing.ts: emit session.id on llm_request, tool, and
  tool.execution spans via getCurrentSessionId()
- Session.ts runTool(): wrap tool lifecycle in startToolSpan /
  runInToolSpanContext / endToolSpan; wrap invocation.execute() in
  startToolExecutionSpan / endToolExecutionSpan
- Session.ts #executePrompt: emit logConversationFinishedEvent at
  turn end (inside withInteractionSpan, after #handleStopHookLoop)
- Session.ts #executeCronPrompt: wrap body in withInteractionSpan
  so cron tool calls also get proper trace hierarchy
Copilot AI review requested due to automatic review settings May 29, 2026 15:07
@github-actions

Copy link
Copy Markdown
Contributor

📋 Review Summary

This PR adds comprehensive OpenTelemetry tracing for daemon/ACP tool execution paths, enabling session-based querying in ARMS. The implementation wraps tool calls with proper span hierarchies (tooltool.execution), adds session.id attributes to all relevant spans, and emits conversation_finished events at turn completion. The changes are well-structured and align with the existing core tracing implementation.

🔍 General Feedback

  • Strong alignment with core: The implementation closely mirrors the tracing patterns from coreToolScheduler.ts, maintaining consistency across CLI and ACP paths.
  • Proper span hierarchy: The nested span structure (tooltool.execution) correctly captures both the permission/approval lifecycle and the actual tool execution.
  • Good error handling: Span error states are properly tracked via spanSuccess/spanError flags and set in the finally block.
  • Comprehensive coverage: Tool spans, execution spans, LLM request spans, and interaction spans all now carry session.id for unified querying.
  • Test coverage: Existing test suites pass (71 tests in session-tracing, 75 in Session tests).

🎯 Specific Feedback

🟡 High

  • File: packages/cli/src/acp-integration/session/Session.ts:1804-2500 - The runTool method refactoring introduces significant nesting (try-catch inside runInToolSpanContext inside the main try block). While the span wrapping is correct, consider extracting the inner tool execution logic into a separate private method (e.g., #executeToolWithSpan) to reduce cyclomatic complexity and improve readability. The current ~700-line method with 4+ levels of nesting is difficult to follow.

  • File: packages/cli/src/acp-integration/session/Session.ts:974-994 - The turnCount variable is incremented inside the while loop but the logConversationFinishedEvent is called in the finally block of the try-catch wrapping #handleStopHookLoop. This means the turn count will be accurate only if the loop completes normally. If an exception is thrown during tool execution (before reaching #handleStopHookLoop), the event won't fire. Verify this is the intended behavior—should conversation_finished fire for aborted turns?

🟢 Medium

  • File: packages/cli/src/acp-integration/session/Session.ts:1896-1898 - The spanError variable is declared at line 1977 (per diff) but initialized as undefined. The early error response functions (errorResponse and earlyErrorResponse) now set spanError, but there's a risk of stale values if multiple error paths are hit. Consider using a more explicit state management pattern (e.g., an object { error?: string } passed by reference) to avoid potential closure issues.

  • File: packages/core/src/telemetry/session-tracing.ts:707-725 - In startToolExecutionSpan, the sessionId is captured and added to attributes, but the span context object at line 721-725 stores attributes: sessionId ? { 'session.id': sessionId } : {}. This duplicates the attribute storage (once on the span, once in the context object). Verify this is intentional for downstream consumers of activeSpans, or consolidate to avoid confusion.

🔵 Low

  • File: packages/cli/src/acp-integration/session/Session.ts:1561-1650 - The #executeCronPrompt method now wraps its body in withInteractionSpan, which is excellent. However, the error handling sets cronHadError = true and returns early without explicitly ending the interaction span. The withInteractionSpan helper should handle this, but adding a comment clarifying that the span lifecycle is managed by the wrapper would help future maintainers.

  • File: packages/core/src/telemetry/session-tracing.ts:441-445 - In startLLMRequestSpan, the conditional spread ...(sessionId ? { 'session.id': sessionId } : {}) is used. Consider using a consistent pattern across all span-start functions (e.g., always setting the attribute, even if undefined, and letting the OTel SDK handle it) to reduce cognitive load when comparing implementations.

✅ Highlights

  • Excellent trace hierarchy documentation: The PR description clearly shows the expected span structure, making it easy to verify the implementation against the design.
  • Proper span context propagation: Using runInToolSpanContext ensures all nested operations (hooks, sub-agent execution) automatically inherit the tool span context.
  • Defensive error handling: The startToolExecutionSpan try-catch around invocation.execute() ensures span completion even if tool execution throws unexpectedly.
  • Consistent attribute naming: Using standard session.id across all span types enables straightforward ARMS queries.
  • Good test coverage: The fact that 146 existing tests pass without modification indicates the changes are backward-compatible and well-integrated.

…n cancellation

- Cron path getResultStatus now checks ac.signal.aborted so aborted
  cron runs record turn_status='cancelled' instead of 'ok'
- Tool execution span success path now checks abortSignal.aborted,
  aligning with coreToolScheduler cancellation semantics
@doudouOUC doudouOUC force-pushed the feat/daemon-acp-tool-spans branch from 1df350f to a814e95 Compare May 29, 2026 15:12

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves observability for the daemon/ACP execution path by aligning its OpenTelemetry span hierarchy and key attributes (notably session.id) with the existing CLI path, so traces can be queried and understood at the session/turn/tool level.

Changes:

  • Add session.id attributes to llm_request, tool, and tool.execution spans in core session tracing.
  • Wrap ACP Session.runTool() with tool span + tool execution sub-span context so tool work is properly nested under the interaction span.
  • Emit conversation_finished at the end of ACP turns and wrap cron prompt execution in an interaction span for consistent trace structure.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File Description
packages/core/src/telemetry/session-tracing.ts Adds optional session.id propagation to LLM/tool/tool.execution spans.
packages/cli/src/acp-integration/session/Session.ts Adds interaction/tool/tool.execution spans for ACP execution, logs conversation-finished at turn end, and wraps cron prompts in interaction spans.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread packages/cli/src/acp-integration/session/Session.ts
Comment thread packages/cli/src/acp-integration/session/Session.ts
Comment thread packages/cli/src/acp-integration/session/Session.ts
Comment thread packages/cli/src/acp-integration/session/Session.ts
Comment thread packages/cli/src/acp-integration/session/Session.ts
Comment thread packages/cli/src/acp-integration/session/Session.ts Outdated
Comment thread packages/cli/src/acp-integration/session/Session.ts

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A note that doesn't map to a single changed line:

  • No test coverage for the new telemetry. The new session.id stamping on tool/tool.execution/llm_request spans and the conversation_finished emission are not asserted anywhere. session-tracing.test.ts only asserts session.id on the interaction span (which already carried it pre-PR), and Session.test.ts references none of the new span symbols. The existing suites pass without exercising the new branches, so a regression in exactly these paths (including the session.id issue flagged inline) would pass CI silently. Worth adding focused assertions for the success/error/cancel span outcomes and for session.id matching the owning session.

— claude-opus-4-8 via Claude Code /qreview

Comment thread packages/core/src/telemetry/session-tracing.ts Outdated
Comment thread packages/cli/src/acp-integration/session/Session.ts
Comment thread packages/cli/src/acp-integration/session/Session.ts
Comment thread packages/cli/src/acp-integration/session/Session.ts Outdated
doudouOUC added 2 commits May 30, 2026 21:32
…hed coverage

Address wenshao + Copilot review on the daemon/ACP telemetry path.

- session-tracing.ts: derive session.id for llm_request/tool/tool.execution
  spans from the per-session parent span context (resolveSessionId) instead of
  the process-global getCurrentSessionId(). A daemon hosts many sessions in one
  process, so the global cross-stamped child spans with whichever session last
  initialized telemetry while the interaction span carried the correct id.
  Falls back to the global for the single-session CLI path. [wenshao Critical]

- Session.ts #executePrompt: move logConversationFinishedEvent into a finally
  wrapping the whole turn loop so cancelled / no-stream / API-error / rate-limit
  terminal paths also emit (previously only the clean stop-hook path did).
  Emitted for all approval modes — an intentional divergence from the CLI's
  YOLO-only gating, since daemon turns run autonomously regardless of mode.
  [wenshao Critical + Suggestion, Copilot turnCount]

- Session.ts #executeCronPrompt: emit conversation_finished on every terminal
  cron path (clean / abort / caught error). [wenshao Critical]

- Session.ts runTool success path: reflect toolResult.error and cancellation in
  logToolCall / recordToolResult / the tool span instead of hardcoding success,
  so soft tool failures are no longer mislabeled as successful. [Copilot]

- Session.ts tool-confirmation Cancel: route through earlyErrorResponse so
  spanError carries the cancellation reason (was the generic 'tool error') and
  the declined call is recorded. [wenshao Suggestion]

- Session.ts startToolSpan: dual-emit the legacy call_id alias like
  CoreToolScheduler for backwards-compat dashboards. [Copilot]
… tool outcome

Address wenshao's no-test-coverage CHANGES_REQUESTED on the daemon/ACP
telemetry changes.

- session-tracing.test.ts: assert tool / llm_request / tool.execution spans
  derive session.id from the owning interaction context (not the process-
  global) including a multi-session isolation case and the CLI global
  fallback.
- Session.test.ts: assert conversation_finished is emitted on the normal turn
  AND on the error/throw path (the path that previously dropped it), and that
  a soft tool failure (toolResult.error) is recorded with status 'error'.
@doudouOUC

Copy link
Copy Markdown
Collaborator Author

Review round addressed — wenshao + Copilot

Pushed two commits:

  • 5712e69 — fixes
  • 6ec8bd2 — tests (addresses the "no test coverage" CHANGES_REQUESTED)

Both packages typecheck clean; session-tracing.test.ts (79) and Session.test.ts (84) pass, including 8 new focused tests.

# Reviewer Item Verdict
session-tracing.ts wenshao [Critical] session.id read from process-global in multi-session daemon FixedresolveSessionId() derives it from the per-session parent span context; global only as CLI fallback
Session.ts:993 wenshao [Critical] conversation_finished only on the clean stop-hook path Fixed — moved into a finally around the whole turn loop; cron path too
Session.ts:993 wenshao [Suggestion] YOLO gating divergence from CLI Diverging intentionally (daemon runs all modes autonomously) — documented + tested; left this thread open for your call on strict CLI parity
Session.ts:2396 wenshao [Suggestion] Cancel case left spanError unset Fixed — routed through earlyErrorResponse
Session.ts Copilot tool soft-failure (toolResult.error) recorded as success Fixed — status/spanSuccess/spanError now reflect the real outcome
Session.ts Copilot missing call_id legacy alias Fixed — dual-emitted
Session.ts Copilot cron error → cancelled status Partial — true error status needs widening the shared InteractionSpanResultStatus type (CLI-path impact); out of scope
Session.ts Copilot turnCount semantics vs CLI Won't take — loop-iteration count is the meaningful per-prompt value on the daemon path
Session.ts Copilot cancellation modeled as error / short error codes Won't take — deliberate; first-class cancelled span status is a broader telemetry-model change

New tests:

  • session.id derives from the owning session (tool / llm_request / tool.execution), multi-session isolation, and CLI global fallback.
  • conversation_finished emitted on the normal turn AND the error/throw path.
  • soft tool failure recorded with status: 'error'.

One open thread (YOLO gating) — happy to switch to strict CLI parity if you'd prefer; otherwise the divergence is intentional and documented.

@wenshao

wenshao commented May 30, 2026

Copy link
Copy Markdown
Collaborator

Verification Report — PR #4630

Commit: 6ec8bd296 (feat(telemetry): add tool spans and session.id to daemon/ACP path)
Base: daemon_mode_b_main
Tester: wenshao
Date: 2026-05-30


Test Results

Check Result Details
packages/core typecheck PASS 0 errors
packages/cli typecheck PASS 0 new errors from this PR (see note below)
session-tracing.test.ts (vitest) PASS 1 file, 79 tests passed
Session.test.ts (vitest) PASS 1 file, 84 tests passed

Test Details

File Tests Time
src/telemetry/session-tracing.test.ts 79 19ms
src/acp-integration/session/Session.test.ts 84 359ms

TypeScript Error Analysis

Branch TS Errors in packages/cli
daemon_mode_b_main (base) 102
feat/daemon-acp-tool-spans (PR) 103
Net difference +1

The +1 difference is Session.ts(1002): Parameter 'result' implicitly has an 'any' type — a trivial implicit-any in the new tool span callback. All other "new" lines in the diff are pre-existing errors that shifted line numbers due to acpAgent.ts / server.ts refactoring. The PR body correctly notes these as "pre-existing acpAgent.ts errors from unmerged status types".

Items Not Verified (require live infrastructure)

  • Live OTLP backend: tool spans appear under interaction span
  • session.id filterable in ARMS for daemon traces

Verdict

All 4 automatable checks pass. 163 tests total, core typecheck clean, CLI typecheck has only pre-existing errors from unmerged upstream types. Safe to merge pending live OTLP verification.

…plicit any

The interaction-span getResultStatus callback relied on generic inference
of T from the turn-loop return type; under the branch's pre-existing
upstream type errors that inference degrades to any, surfacing a
noImplicitAny error at the callback. Annotate the param explicitly so it
no longer depends on inference. (wenshao verification report)
@doudouOUC

Copy link
Copy Markdown
Collaborator Author

Thanks for the verification run, @wenshao. Fixed the one concrete finding:

  • Session.ts:1002 implicit-any on the interaction-span getResultStatus callback → annotated the result param explicitly in 2da89e5. It was relying on generic inference of the turn-loop return type, which degrades to any under the branch's pre-existing upstream type errors; the explicit annotation removes that dependency.

The rest of your analysis notes line up with the review round I just pushed (5712e69 fixes + 6ec8bd2 tests): conversation_finished now fires on all terminal paths incl. abort/error (and cron), and the soft-tool-failure / spanError / session.id items are addressed. Net TS delta should be 0 now.

Two open threads I'd still like your call on:

  • conversation_finished YOLO gating — currently emitting for all daemon approval modes (intentional divergence from the CLI); left that review thread open.
  • The bot's "extract #executeToolWithSpan" complexity suggestion — happy to do it as a follow-up if you want it in-scope here.

Live OTLP/ARMS verification still pending on my side.

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] InteractionSpanResultStatus at session-tracing.ts:50 only has 'ok' | 'cancelled' — no 'error' value. This constrains the cron path (Session.ts:1704) which maps caught API errors to 'cancelled', and will constrain any future caller that needs to distinguish errors from cancellations. Consider extending to 'ok' | 'error' | 'cancelled' and handling 'error' in withInteractionSpan's finally to set SpanStatusCode.ERROR.

— qwen3.7-max via Qwen Code /review

Comment thread packages/cli/src/acp-integration/session/Session.ts Outdated
Comment thread packages/cli/src/acp-integration/session/Session.ts
Comment thread packages/cli/src/acp-integration/session/Session.ts Outdated
…outcomes

Address wenshao review round 2 (telemetry-accuracy suggestions).

- session-tracing.ts: extend InteractionSpanResultStatus to 'ok' | 'error' |
  'cancelled' and have withInteractionSpan's finally set SpanStatusCode.ERROR
  when getResultStatus reports 'error' on a non-throwing path. Guarded so a
  thrown error's specific message is not overwritten by the generic one.
- Session.ts #executeCronPrompt: map caught cron errors to 'error' (was
  'cancelled'), so turn_status dashboards no longer miss cron API failures.
- Session.ts runTool success path: compute aborted/status/succeeded once
  before emitResult so the client-facing success flag matches telemetry on
  abort-induced cancellation (previously emitResult used !toolResult.error).
- Session.ts runTool error paths: errorResponse and the catch-block
  recordToolResult now label aborted calls 'cancelled' instead of 'error'.

Tests: withInteractionSpan 'error' status -> span ERROR, and thrown-error
message preserved.
@doudouOUC

Copy link
Copy Markdown
Collaborator Author

Review round 2 addressed — wenshao (telemetry accuracy)

Pushed 1737887 (+ 2da89e5 earlier for the implicit-any from your verification report).

Item Verdict
InteractionSpanResultStatus lacks 'error' (review note + cron Session.ts:1704) Fixed — extended to 'ok' | 'error' | 'cancelled'; withInteractionSpan finally now sets SpanStatusCode.ERROR for 'error' (guarded so a thrown error's message isn't overwritten); cron maps cronHadError → 'error'. This reverses my earlier "out of scope" call to the bot — your explicit ask made it worth doing.
emitResult not abort-aware (Session.ts:2603) Fixed — compute aborted/status/succeeded once before the emit branch; emitResult uses success: succeeded. Client UI and telemetry now agree on cancellation.
catch-path recordToolResult hardcodes 'error' (Session.ts:2709) Fixed — catch recordToolResult and errorResponse's logToolCall now label aborted calls 'cancelled'.

Tests added: withInteractionSpan 'error' → span ERROR, and a guard test that a thrown error's message survives. Full run: session-tracing 81 + Session 84 = 165 pass; both packages typecheck clean.

Still open for your call: the YOLO-gating thread (conversation_finished emitted for all daemon approval modes — intentional divergence). Everything else from rounds 1 & 2 is resolved.

@doudouOUC doudouOUC requested a review from chiga0 May 30, 2026 16:13
@doudouOUC doudouOUC self-assigned this May 30, 2026

@chiga0 chiga0 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QoderWork Code Review — PR #4630

Decision: APPROVED

Summary

feat(telemetry): add tool spans and session.id to daemon/ACP path — closes #4602.

Adds interaction/tool/tool.execution span hierarchy and session.id propagation to the daemon/ACP execution path, bringing parity with the CLI telemetry model.

Review Notes

  • Multi-session session.id fix: resolveSessionId correctly derives from parent span context instead of the process-global — critical for daemon processes hosting many sessions. Fallback to getCurrentSessionId() preserves the single-session CLI path.
  • Span lifecycle: Clean startToolSpanrunInToolSpanContextendToolSpan wrapping with tool.execution sub-spans. spanError captured on all paths (early permission denials, soft failures, catch-block exceptions, abort-induced cancellations).
  • Error/cancel semantics: errorStatusSet guard in withInteractionSpan prevents thrown error message overwrite. 'error' | 'cancelled' | 'ok' mapping correct for both cron and tool paths.
  • conversation_finished placement: Correctly moved to finally wrapping the entire turn loop — covers all terminal paths (clean stop, abort, API error, rate limit). Intentional divergence from CLI's YOLO-only gating (daemon runs all modes autonomously).
  • Test coverage: 10 new tests — session.id multi-session isolation, interaction span error status, thrown error message preservation, conversation_finished on normal + error paths, soft tool failure recording.

No findings. LGTM.


This review was generated by QoderWork AI

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants