feat(telemetry): add tool spans and session.id to daemon/ACP path#4630
Conversation
Add interaction-level and tool-level OTel spans to the daemon's ACP Session.ts, closing the observability gap described in #4602. Changes: - session-tracing.ts: emit session.id on llm_request, tool, and tool.execution spans via getCurrentSessionId() - Session.ts runTool(): wrap tool lifecycle in startToolSpan / runInToolSpanContext / endToolSpan; wrap invocation.execute() in startToolExecutionSpan / endToolExecutionSpan - Session.ts #executePrompt: emit logConversationFinishedEvent at turn end (inside withInteractionSpan, after #handleStopHookLoop) - Session.ts #executeCronPrompt: wrap body in withInteractionSpan so cron tool calls also get proper trace hierarchy
📋 Review SummaryThis PR adds comprehensive OpenTelemetry tracing for daemon/ACP tool execution paths, enabling session-based querying in ARMS. The implementation wraps tool calls with proper span hierarchies ( 🔍 General Feedback
🎯 Specific Feedback🟡 High
🟢 Medium
🔵 Low
✅ Highlights
|
…n cancellation - Cron path getResultStatus now checks ac.signal.aborted so aborted cron runs record turn_status='cancelled' instead of 'ok' - Tool execution span success path now checks abortSignal.aborted, aligning with coreToolScheduler cancellation semantics
1df350f to
a814e95
Compare
There was a problem hiding this comment.
Pull request overview
This PR improves observability for the daemon/ACP execution path by aligning its OpenTelemetry span hierarchy and key attributes (notably session.id) with the existing CLI path, so traces can be queried and understood at the session/turn/tool level.
Changes:
- Add
session.idattributes tollm_request,tool, andtool.executionspans in core session tracing. - Wrap ACP
Session.runTool()with tool span + tool execution sub-span context so tool work is properly nested under the interaction span. - Emit
conversation_finishedat the end of ACP turns and wrap cron prompt execution in an interaction span for consistent trace structure.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| packages/core/src/telemetry/session-tracing.ts | Adds optional session.id propagation to LLM/tool/tool.execution spans. |
| packages/cli/src/acp-integration/session/Session.ts | Adds interaction/tool/tool.execution spans for ACP execution, logs conversation-finished at turn end, and wraps cron prompts in interaction spans. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
wenshao
left a comment
There was a problem hiding this comment.
A note that doesn't map to a single changed line:
- No test coverage for the new telemetry. The new
session.idstamping ontool/tool.execution/llm_requestspans and theconversation_finishedemission are not asserted anywhere.session-tracing.test.tsonly assertssession.idon the interaction span (which already carried it pre-PR), andSession.test.tsreferences none of the new span symbols. The existing suites pass without exercising the new branches, so a regression in exactly these paths (including thesession.idissue flagged inline) would pass CI silently. Worth adding focused assertions for the success/error/cancel span outcomes and forsession.idmatching the owning session.
— claude-opus-4-8 via Claude Code /qreview
…hed coverage Address wenshao + Copilot review on the daemon/ACP telemetry path. - session-tracing.ts: derive session.id for llm_request/tool/tool.execution spans from the per-session parent span context (resolveSessionId) instead of the process-global getCurrentSessionId(). A daemon hosts many sessions in one process, so the global cross-stamped child spans with whichever session last initialized telemetry while the interaction span carried the correct id. Falls back to the global for the single-session CLI path. [wenshao Critical] - Session.ts #executePrompt: move logConversationFinishedEvent into a finally wrapping the whole turn loop so cancelled / no-stream / API-error / rate-limit terminal paths also emit (previously only the clean stop-hook path did). Emitted for all approval modes — an intentional divergence from the CLI's YOLO-only gating, since daemon turns run autonomously regardless of mode. [wenshao Critical + Suggestion, Copilot turnCount] - Session.ts #executeCronPrompt: emit conversation_finished on every terminal cron path (clean / abort / caught error). [wenshao Critical] - Session.ts runTool success path: reflect toolResult.error and cancellation in logToolCall / recordToolResult / the tool span instead of hardcoding success, so soft tool failures are no longer mislabeled as successful. [Copilot] - Session.ts tool-confirmation Cancel: route through earlyErrorResponse so spanError carries the cancellation reason (was the generic 'tool error') and the declined call is recorded. [wenshao Suggestion] - Session.ts startToolSpan: dual-emit the legacy call_id alias like CoreToolScheduler for backwards-compat dashboards. [Copilot]
… tool outcome Address wenshao's no-test-coverage CHANGES_REQUESTED on the daemon/ACP telemetry changes. - session-tracing.test.ts: assert tool / llm_request / tool.execution spans derive session.id from the owning interaction context (not the process- global) including a multi-session isolation case and the CLI global fallback. - Session.test.ts: assert conversation_finished is emitted on the normal turn AND on the error/throw path (the path that previously dropped it), and that a soft tool failure (toolResult.error) is recorded with status 'error'.
Review round addressed — wenshao + CopilotPushed two commits:
Both packages typecheck clean;
New tests:
One open thread (YOLO gating) — happy to switch to strict CLI parity if you'd prefer; otherwise the divergence is intentional and documented. |
Verification Report — PR #4630Commit: Test Results
Test Details
TypeScript Error Analysis
The +1 difference is Items Not Verified (require live infrastructure)
Verdict✅ All 4 automatable checks pass. 163 tests total, core typecheck clean, CLI typecheck has only pre-existing errors from unmerged upstream types. Safe to merge pending live OTLP verification. |
…plicit any The interaction-span getResultStatus callback relied on generic inference of T from the turn-loop return type; under the branch's pre-existing upstream type errors that inference degrades to any, surfacing a noImplicitAny error at the callback. Annotate the param explicitly so it no longer depends on inference. (wenshao verification report)
|
Thanks for the verification run, @wenshao. Fixed the one concrete finding:
The rest of your analysis notes line up with the review round I just pushed ( Two open threads I'd still like your call on:
Live OTLP/ARMS verification still pending on my side. |
wenshao
left a comment
There was a problem hiding this comment.
[Suggestion] InteractionSpanResultStatus at session-tracing.ts:50 only has 'ok' | 'cancelled' — no 'error' value. This constrains the cron path (Session.ts:1704) which maps caught API errors to 'cancelled', and will constrain any future caller that needs to distinguish errors from cancellations. Consider extending to 'ok' | 'error' | 'cancelled' and handling 'error' in withInteractionSpan's finally to set SpanStatusCode.ERROR.
— qwen3.7-max via Qwen Code /review
…outcomes Address wenshao review round 2 (telemetry-accuracy suggestions). - session-tracing.ts: extend InteractionSpanResultStatus to 'ok' | 'error' | 'cancelled' and have withInteractionSpan's finally set SpanStatusCode.ERROR when getResultStatus reports 'error' on a non-throwing path. Guarded so a thrown error's specific message is not overwritten by the generic one. - Session.ts #executeCronPrompt: map caught cron errors to 'error' (was 'cancelled'), so turn_status dashboards no longer miss cron API failures. - Session.ts runTool success path: compute aborted/status/succeeded once before emitResult so the client-facing success flag matches telemetry on abort-induced cancellation (previously emitResult used !toolResult.error). - Session.ts runTool error paths: errorResponse and the catch-block recordToolResult now label aborted calls 'cancelled' instead of 'error'. Tests: withInteractionSpan 'error' status -> span ERROR, and thrown-error message preserved.
Review round 2 addressed — wenshao (telemetry accuracy)Pushed
Tests added: Still open for your call: the YOLO-gating thread ( |
chiga0
left a comment
There was a problem hiding this comment.
QoderWork Code Review — PR #4630
Decision: APPROVED ✅
Summary
feat(telemetry): add tool spans and session.id to daemon/ACP path — closes #4602.
Adds interaction/tool/tool.execution span hierarchy and session.id propagation to the daemon/ACP execution path, bringing parity with the CLI telemetry model.
Review Notes
- Multi-session
session.idfix:resolveSessionIdcorrectly derives from parent span context instead of the process-global — critical for daemon processes hosting many sessions. Fallback togetCurrentSessionId()preserves the single-session CLI path. - Span lifecycle: Clean
startToolSpan→runInToolSpanContext→endToolSpanwrapping withtool.executionsub-spans.spanErrorcaptured on all paths (early permission denials, soft failures, catch-block exceptions, abort-induced cancellations). - Error/cancel semantics:
errorStatusSetguard inwithInteractionSpanprevents thrown error message overwrite.'error' | 'cancelled' | 'ok'mapping correct for both cron and tool paths. conversation_finishedplacement: Correctly moved tofinallywrapping the entire turn loop — covers all terminal paths (clean stop, abort, API error, rate limit). Intentional divergence from CLI's YOLO-only gating (daemon runs all modes autonomously).- Test coverage: 10 new tests — session.id multi-session isolation, interaction span error status, thrown error message preservation, conversation_finished on normal + error paths, soft tool failure recording.
No findings. LGTM.
This review was generated by QoderWork AI
Summary
session.idattribute tollm_request,tool, andtool.executionspans insession-tracing.ts— makes all daemon spans queryable by session in ARMSSession.ts runTool()withstartToolSpan/runInToolSpanContext/endToolSpan+startToolExecutionSpan/endToolExecutionSpanaroundinvocation.execute()logConversationFinishedEventat turn end (insidewithInteractionSpan, after#handleStopHookLoop)#executeCronPromptbody inwithInteractionSpanso cron tool calls also get proper trace hierarchyCloses #4602 (Milestone 1 + Milestone 2 middle-ground approach)
Cherry-picked from PR #4608 (which targeted the wrong base branch).
Resulting trace hierarchy
Test plan
packages/coretypecheck passespackages/clitypecheck passes (0 new errors; pre-existing acpAgent.ts errors from unmerged status types)npx vitest run src/telemetry/session-tracing.test.ts— 71 tests passnpx vitest run src/acp-integration/session/Session.test.ts— 75 tests passsession.idfilterable in ARMS for daemon traces🤖 Generated with Qwen Code