Current status: implementation-complete on main for the originally requested daemon OpenTelemetry path, including a local live OTLP smoke validation on 2026-06-13.
The follow-up change on main:
writes qwen-code.prompt_id to the active daemon HTTP request span for POST /session/:id/prompt;
derives the daemon-to-ACP traceparent from the active bridge span, so ACP child propagation no longer depends on the daemon's no-op outbound propagator;
attributes deferred blocked-on-user and hook spans to the owning session from their logical parent context, avoiding the stale daemon session fallback in multi-session cases;
adds regression coverage for daemon trace metadata injection, bridge prompt dispatch span ordering, prompt-id span enrichment, and multi-session deferred-span attribution.
Acceptance status
POST /session/:id/prompt can be represented as daemon HTTP request span -> bridge prompt dispatch span -> ACP child interaction span -> LLM/tool spans where applicable.
Daemon request failures and bridge failures have correlated span/log coverage with route/session/prompt context where known.
Session lifecycle, prompt dispatch, bridge, child-process, route, metrics, and structured-log telemetry landed through the daemon batch and follow-ups.
Trace context is propagated across the daemon-to-ACP boundary without relying on the global outbound propagator.
Live backend smoke: local OTLP HTTP collector smoke verified on 2026-06-13. The exported trace contained qwen-code.daemon.request (POST /session/:id/prompt) -> qwen-code.daemon.bridge (prompt.dispatch) -> qwen-code.interaction -> qwen-code.llm_request, plus qwen-code.tool (read_file) -> qwen-code.tool.execution; the daemon HTTP span carried qwen-code.prompt_id and the ACP child spans shared the same trace id.
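If the smoke validation is ever made repeatable, the parent chain and shared trace id could be checked mechanically. The following is only a sketch: the `ExportedSpan` shape, the helper names, and all ids are hypothetical, and the placement of the tool span under the interaction span is an assumption (the smoke result above does not state its parent).

```typescript
// Simplified, hypothetical shape of an exported span (real OTLP spans carry more fields).
interface ExportedSpan {
  name: string;
  traceId: string;
  spanId: string;
  parentSpanId?: string;
}

// Walks parent links from the first span named `leafName` and returns span names root-first.
function ancestry(spans: ExportedSpan[], leafName: string): string[] {
  const byId = new Map<string, ExportedSpan>();
  for (const span of spans) byId.set(span.spanId, span);

  const chain: string[] = [];
  const seen = new Set<string>(); // guards against malformed parent cycles
  let current = spans.find((span) => span.name === leafName);
  while (current !== undefined && !seen.has(current.spanId)) {
    seen.add(current.spanId);
    chain.unshift(current.name);
    current = current.parentSpanId !== undefined ? byId.get(current.parentSpanId) : undefined;
  }
  return chain;
}

// True when every span in a non-empty list carries the same trace id.
function sharesOneTrace(spans: ExportedSpan[]): boolean {
  return spans.length > 0 && new Set(spans.map((span) => span.traceId)).size === 1;
}

// Made-up ids; the span names are the ones reported by the smoke run.
const smokeTraceId = "4bf92f3577b34da6a3ce929d0e0e4736";
const smokeSpans: ExportedSpan[] = [
  { name: "qwen-code.daemon.request", traceId: smokeTraceId, spanId: "a1" },
  { name: "qwen-code.daemon.bridge", traceId: smokeTraceId, spanId: "b2", parentSpanId: "a1" },
  { name: "qwen-code.interaction", traceId: smokeTraceId, spanId: "c3", parentSpanId: "b2" },
  { name: "qwen-code.llm_request", traceId: smokeTraceId, spanId: "d4", parentSpanId: "c3" },
  // Assumption: the tool span hangs off the interaction span.
  { name: "qwen-code.tool", traceId: smokeTraceId, spanId: "e5", parentSpanId: "c3" },
  { name: "qwen-code.tool.execution", traceId: smokeTraceId, spanId: "f6", parentSpanId: "e5" },
];

console.log(ancestry(smokeSpans, "qwen-code.llm_request").join(" -> "));
console.log(sharesOneTrace(smokeSpans));
```

A CI job or manual script could feed real collector output through checks like these instead of the hand-written sample.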
Remaining
No remaining implementation or validation item for the original daemon end-to-end OpenTelemetry scope.
Optional follow-up: add a checked-in/manual smoke artifact or CI job if maintainers want the local OTLP collector validation to be repeatable.
Background
Qwen Code's OpenTelemetry implementation is increasingly complete for the interactive/runtime path, but qwen serve still has a daemon-specific observability gap.
Today the serve daemon process handles HTTP routing, session lifecycle APIs, bridge queueing, ACP child process management, prompt dispatch, cancellation, SSE/EventBus fan-out, and bridge error translation. Most of those daemon-layer operations are not represented in OpenTelemetry. The ACP child can initialize telemetry after loadCliConfig(...) and may emit agent-internal model/tool logs or spans, but that does not cover the full daemon path from HTTP request to bridge to child to response/events.
Current findings:
qwen serve starts from packages/cli/src/commands/serve.ts and packages/cli/src/serve/runQwenServe.ts; it calls the serve runner directly and does not construct a Config for the daemon process, so initializeTelemetry(...) is not run in the daemon itself.
Config initializes telemetry from packages/core/src/config/config.ts, so telemetry exists mainly in paths that build a normal runtime config.
ACP sessions call loadCliConfig(...) in packages/cli/src/acp-integration/acpAgent.ts, so child processes can have telemetry if settings enable it.
The ACP session path logs user prompts/tool calls, but it does not currently provide the same top-level interaction span coverage as the interactive client.ts path.
sendBridgeError(...) and the bridge lifecycle are primarily observable through daemon stderr today, not OTel traces/logs.
This issue is narrower than #3731 and #4548: make daemon-mode execution reconstructable as a coherent OpenTelemetry trace/log/metric story.
Problem
When a daemon client sees an error such as POST /session/:id/prompt returning HTTP 500, operators cannot reconstruct the complete path from telemetry alone:
inbound HTTP request to the daemon
route validation and client/session lookup
bridge channel selection or child spawn/reuse
prompt queue wait and dispatch
ACP child prompt handling
model request and tool execution
SSE/EventBus output fan-out
cancellation, close, child exit, and error translation
Some lower-level model/tool telemetry may exist in the child, but the parent daemon span, bridge span, queue timing, lifecycle events, and error mapping are missing. This leaves gaps between client-visible HTTP failures and agent-internal telemetry.
There is also a multi-session concern: the current telemetry SDK is process-level, while daemon mode may serve multiple sessions over time. Any daemon/ACP telemetry work must avoid stale session root context and must attribute spans/logs to the correct workspace, session, prompt, and client.
Proposal
Add OpenTelemetry coverage for the qwen serve daemon path.
Suggested scope:
1. Initialize telemetry in the daemon process
Initialize OTel before the HTTP server starts when telemetry is enabled for the daemon workspace/config.
Reuse existing exporter, shutdown, diagnostic suppression, resource-attribute, and bounded flush semantics from the core telemetry SDK.
Ensure the daemon process does not emit exporter diagnostics to stdout/stderr in structured/non-interactive contexts.
Flush/shutdown telemetry during serve shutdown/drain.
2. Add daemon HTTP/request spans
Create a span per relevant daemon request, using route templates rather than raw URLs. At minimum cover:
POST /session
POST /session/:id/load
POST /session/:id/prompt
POST /session/:id/cancel
DELETE /session/:id
GET /workspace/:id/sessions
SSE/EventBus subscription routes if applicable
Recommended attributes:
HTTP method, route template, status code
workspace id/path hash where safe
session id when known
prompt id when known
client id when known
request duration
error code/type and sanitized error message for failures
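As a rough illustration of "route templates rather than raw URLs", the sketch below maps a raw request path onto one of the routes listed above and builds span attributes from it. The matcher, the `unmatched` placeholder, and the function names are assumptions for illustration, not the daemon's actual router; the `http.*` keys follow the OpenTelemetry HTTP semantic conventions, and `session.id` is the attribute name used elsewhere in this issue.

```typescript
// Routes from the proposal; the matching logic itself is an illustrative assumption.
const DAEMON_ROUTES: Array<{ method: string; template: string }> = [
  { method: "POST", template: "/session" },
  { method: "POST", template: "/session/:id/load" },
  { method: "POST", template: "/session/:id/prompt" },
  { method: "POST", template: "/session/:id/cancel" },
  { method: "DELETE", template: "/session/:id" },
  { method: "GET", template: "/workspace/:id/sessions" },
];

interface RouteMatch {
  template: string;
  id: string | undefined;
}

function splitPath(path: string): string[] {
  return path.split("/").filter((segment) => segment.length > 0);
}

// Resolves a raw path (query string ignored) to a route template, or undefined if none fits.
function matchRoute(method: string, rawPath: string): RouteMatch | undefined {
  const actual = splitPath(rawPath.split("?")[0] ?? "");
  for (const route of DAEMON_ROUTES) {
    if (route.method !== method.toUpperCase()) continue;
    const expected = splitPath(route.template);
    if (expected.length !== actual.length) continue;
    let id: string | undefined;
    let ok = true;
    for (let i = 0; i < expected.length; i++) {
      if (expected[i] === ":id") {
        id = actual[i];
      } else if (expected[i] !== actual[i]) {
        ok = false;
        break;
      }
    }
    if (ok) return { template: route.template, id };
  }
  return undefined;
}

// Builds span attributes without ever copying the raw URL into an attribute value.
function requestSpanAttributes(
  method: string,
  rawPath: string,
  statusCode: number,
): Record<string, string | number> {
  const match = matchRoute(method, rawPath);
  const attributes: Record<string, string | number> = {
    "http.request.method": method.toUpperCase(),
    "http.route": match !== undefined ? match.template : "unmatched",
    "http.response.status_code": statusCode,
  };
  // The :id in /workspace/:id/sessions is a workspace id, so only /session/ routes set session.id.
  if (match !== undefined && match.id !== undefined && match.template.startsWith("/session/")) {
    attributes["session.id"] = match.id;
  }
  return attributes;
}

console.log(requestSpanAttributes("post", "/session/abc123/prompt?stream=1", 500));
```

Keeping the id out of `http.route` is what keeps the route dimension low-cardinality; the id still travels as its own attribute when it is known.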
3. Add bridge and child-process spans/events
Instrument the daemon bridge around operations that are invisible from ACP child telemetry:
session create/load/close/cancel
child process spawn/reuse/exit
bridge channel lookup
prompt queue wait time
prompt dispatch duration
cancel propagation to ACP child
pending permission cancellation
EventBus/SSE publish/fan-out failures
bridge transport close/errors
This should make a prompt trace show where time was spent before the ACP child began model/tool work.
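To make the "where was time spent" goal concrete, here is a minimal sketch of separating queue wait from dispatch duration. `PromptTimingTracker` and its methods are hypothetical names, not part of the bridge; the clock is injected so the arithmetic can be tested, and a real implementation might prefer a monotonic clock.

```typescript
type Clock = () => number;

interface PromptTimings {
  queueWaitMs: number; // time between enqueue and dispatch
  dispatchMs: number; // time between dispatch and completion
}

// Hypothetical helper: records enqueue/dispatch timestamps per prompt id.
class PromptTimingTracker {
  private readonly now: Clock;
  private readonly enqueuedAt = new Map<string, number>();
  private readonly dispatchedAt = new Map<string, number>();

  constructor(now: Clock = () => Date.now()) {
    this.now = now;
  }

  enqueue(promptId: string): void {
    this.enqueuedAt.set(promptId, this.now());
  }

  dispatch(promptId: string): void {
    this.dispatchedAt.set(promptId, this.now());
  }

  // Returns both durations and forgets the prompt; undefined if a step was never recorded.
  complete(promptId: string): PromptTimings | undefined {
    const enqueued = this.enqueuedAt.get(promptId);
    const dispatched = this.dispatchedAt.get(promptId);
    this.enqueuedAt.delete(promptId);
    this.dispatchedAt.delete(promptId);
    if (enqueued === undefined || dispatched === undefined) return undefined;
    return { queueWaitMs: dispatched - enqueued, dispatchMs: this.now() - dispatched };
  }
}

// Usage with a fake clock so the numbers are deterministic.
let fakeNow = 0;
const tracker = new PromptTimingTracker(() => fakeNow);
tracker.enqueue("prompt-1");
fakeNow = 40;
tracker.dispatch("prompt-1");
fakeNow = 100;
console.log(tracker.complete("prompt-1"));
```

The two durations could be recorded as span attributes or events on the bridge span, or as histogram samples for the queue-wait metric proposed later in this issue.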
4. Propagate trace context across daemon and ACP child
Define a W3C trace context boundary between daemon request spans and ACP child work.
Possible approaches:
pass traceparent/tracestate through an ACP request metadata field if the protocol allows it;
pass a daemon-generated trace context in an internal envelope field that is not exposed as user prompt content;
fall back to OTel links if strict parent-child context is unsafe for queued or long-lived work.
The child-side prompt/interaction span should be parented to, or linked from, the daemon prompt/bridge span so the trace is navigable end to end.
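The first two approaches both come down to serializing the daemon-side span context as a W3C `traceparent` value and parsing it in the child. The sketch below shows only that encoding (`version-traceid-parentid-flags`, version `00`); the `SpanContextLike` type and the `meta` envelope field are hypothetical, and whether ACP offers such a metadata slot is an open question in the text above.

```typescript
// Minimal stand-in for a span context; not the OpenTelemetry API type.
interface SpanContextLike {
  traceId: string; // 32 lowercase hex characters
  spanId: string; // 16 lowercase hex characters
  sampled: boolean;
}

// Serializes a span context as a version-00 traceparent value.
function formatTraceparent(ctx: SpanContextLike): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`;
}

const TRACEPARENT_PATTERN = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Parses a version-00 traceparent; returns undefined for malformed or all-zero ids.
function parseTraceparent(value: string): SpanContextLike | undefined {
  const match = TRACEPARENT_PATTERN.exec(value.trim());
  if (match === null) return undefined;
  const traceId = match[1] ?? "";
  const spanId = match[2] ?? "";
  const flags = match[3] ?? "00";
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Hypothetical internal envelope: trace context rides next to the prompt, never inside it.
interface PromptEnvelope {
  prompt: string;
  meta: { traceparent: string };
}

const bridgeSpan: SpanContextLike = {
  traceId: "4bf92f3577b34da6a3ce929d0e0e4736",
  spanId: "00f067aa0ba902b7",
  sampled: true,
};
const envelope: PromptEnvelope = {
  prompt: "list files",
  meta: { traceparent: formatTraceparent(bridgeSpan) },
};
console.log(envelope.meta.traceparent);
console.log(parseTraceparent(envelope.meta.traceparent));
```

Deriving the value explicitly from the active bridge span, rather than from a global propagator, keeps the boundary working even when the daemon's outbound propagator is a no-op.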
5. Align ACP session tracing with interactive tracing
Bring ACP prompt handling closer to the interactive client.ts trace tree:
create a top-level interaction/prompt span for each ACP prompt;
ensure child LLM spans and tool spans attach under the correct prompt span;
preserve existing prompt/tool log events;
avoid global session-root leakage across multiple sessions in one long-lived process.
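The session-root leakage concern can be shown with a small sketch. It contrasts reading a process-level "current session" at emit time with capturing the owning context when deferred work is scheduled. All names here (`PromptContext`, `deferSpan`, the `hook` span name) are hypothetical and only illustrate the idea; they are not the telemetry SDK's API.

```typescript
interface PromptContext {
  sessionId: string;
  promptId: string;
}

interface AttributedSpan {
  name: string;
  sessionId: string;
  promptId: string;
}

// Process-level "current session": shared state that can go stale with several sessions.
let activeSession: PromptContext | undefined;

function setActiveSession(ctx: PromptContext): void {
  activeSession = ctx;
}

// Leak-prone: attributes the span to whichever session is active when it is finally emitted.
function emitFromGlobal(name: string): AttributedSpan | undefined {
  return activeSession === undefined
    ? undefined
    : { name, sessionId: activeSession.sessionId, promptId: activeSession.promptId };
}

// Safer: capture the owning context at scheduling time and reuse it at emit time.
function deferSpan(name: string, owner: PromptContext): () => AttributedSpan {
  const captured: PromptContext = { ...owner };
  return () => ({ name, sessionId: captured.sessionId, promptId: captured.promptId });
}

// Session A schedules a deferred hook span, then session B becomes active before it is emitted.
const sessionA: PromptContext = { sessionId: "session-a", promptId: "prompt-1" };
setActiveSession(sessionA);
const emitHookSpanForA = deferSpan("hook", sessionA);
setActiveSession({ sessionId: "session-b", promptId: "prompt-7" });

console.log(emitFromGlobal("hook")); // attributed to session-b, which is wrong for A's hook
console.log(emitHookSpanForA()); // attributed to session-a
```

The same reasoning applies to blocked-on-user spans: anything that finishes later than it starts should carry its owner with it instead of looking it up again.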
6. Add daemon metrics/log records where useful
Metrics/logs should complement traces without creating high-cardinality explosions.
Useful low-cardinality metrics may include:
request count/latency by route and status class
active sessions by workspace
prompt queue wait duration
child process spawn/restart count
bridge error count by error code/type
cancellation/close count
Log records should include trace/span ids where possible, especially for bridge errors and child stderr correlation.
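One way to keep request metrics low-cardinality is to key them by route template and status class rather than by raw path and exact status code. The counter below is a plain in-memory sketch with hypothetical names; a real implementation would presumably record the same two dimensions as attributes on an OpenTelemetry counter.

```typescript
// Collapses a status code into its class ("2xx", "5xx", ...); out-of-range values map to "unknown".
function statusClass(statusCode: number): string {
  if (!Number.isInteger(statusCode) || statusCode < 100 || statusCode > 599) return "unknown";
  return `${Math.floor(statusCode / 100)}xx`;
}

// Hypothetical in-memory counter keyed by (route template, status class).
class RequestCounter {
  private readonly counts = new Map<string, number>();

  record(routeTemplate: string, statusCode: number): void {
    const key = `${routeTemplate}|${statusClass(statusCode)}`;
    this.counts.set(key, (this.counts.get(key) ?? 0) + 1);
  }

  get(routeTemplate: string, cls: string): number {
    return this.counts.get(`${routeTemplate}|${cls}`) ?? 0;
  }

  // Number of distinct series, i.e. the cardinality this counter would export.
  seriesCount(): number {
    return this.counts.size;
  }
}

const requestCounter = new RequestCounter();
requestCounter.record("/session/:id/prompt", 200);
requestCounter.record("/session/:id/prompt", 500);
requestCounter.record("/session/:id/prompt", 502);
console.log(requestCounter.get("/session/:id/prompt", "5xx"), requestCounter.seriesCount());
```

Session ids, prompt ids, and client ids stay on spans and log records, where per-request detail is expected, and out of metric dimensions.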
Acceptance criteria
With telemetry enabled, POST /session/:id/prompt produces a trace that starts at the daemon HTTP route and continues through bridge dispatch into ACP child prompt handling, LLM requests, and tool execution where applicable.
A generic daemon 500 is marked on the relevant span and emits a correlated log record with route, session id, prompt id if known, and sanitized error details.
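For the second criterion, a correlated log record might look roughly like the sketch below. The record shape, the helper names, and the sanitization rules (collapse whitespace, redact multi-segment filesystem paths, truncate) are all assumptions chosen for illustration; the issue does not define a sanitization policy. `error.type` follows the OpenTelemetry semantic conventions and `qwen-code.prompt_id` is the attribute name used earlier in this issue.

```typescript
// Illustrative sanitizer: single line, filesystem paths redacted, bounded length.
function sanitizeErrorMessage(message: string, maxLength: number = 200): string {
  const singleLine = message.replace(/\s+/g, " ").trim();
  const redacted = singleLine.replace(/(\/[\w.-]+){2,}/g, "<path>");
  return redacted.length > maxLength ? `${redacted.slice(0, maxLength)}...` : redacted;
}

interface FailureContext {
  route: string; // route template, not the raw URL
  sessionId: string;
  promptId: string | undefined; // only set when known
  traceId: string;
  spanId: string;
}

// Hypothetical log record shape; not the OpenTelemetry Logs API type.
interface LogRecordLike {
  severity: "ERROR";
  body: string;
  traceId: string;
  spanId: string;
  attributes: Record<string, string>;
}

// Builds a log record that can be joined to the failing span via trace and span ids.
function bridgeFailureLog(error: Error, ctx: FailureContext): LogRecordLike {
  const attributes: Record<string, string> = {
    "http.route": ctx.route,
    "session.id": ctx.sessionId,
    "error.type": error.name,
  };
  if (ctx.promptId !== undefined) attributes["qwen-code.prompt_id"] = ctx.promptId;
  return {
    severity: "ERROR",
    body: sanitizeErrorMessage(error.message),
    traceId: ctx.traceId,
    spanId: ctx.spanId,
    attributes,
  };
}

const failure = bridgeFailureLog(new Error("ENOENT: open /home/alice/project/secret.txt\n    at x"), {
  route: "/session/:id/prompt",
  sessionId: "abc123",
  promptId: "prompt-1",
  traceId: "4bf92f3577b34da6a3ce929d0e0e4736",
  spanId: "00f067aa0ba902b7",
});
console.log(failure);
```

Because the record carries the same trace and span ids as the failing span, an operator can move from the HTTP 500 to the log line and back without relying on daemon stderr.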
Status update (2026-06-13)
Landed
The daemon batch is on main. Its telemetry slice includes daemon prompt lifecycle spans (#4556), daemon/ACP tool spans and session.id attribution (#4630), per-prompt trace ids (#4661), expanded daemon route coverage (#4682), and daemon OTel metrics / structured log records (#4749).
session.id remains the cross-prompt correlation key.
Related but distinct work:
… traceparent + session-id propagation.
Acceptance criteria
… (create, load, cancel, close, list) emit useful spans/events or metrics.
Non-goals
Open questions
… qwen serve has not created a normal session Config yet?