bug(telemetry): qwen-code.interaction span has wrong trace id (escapes session root context) #4486

@doudouOUC

Description

TL;DR

startInteractionSpan in packages/core/src/telemetry/session-tracing.ts:143 creates the qwen-code.interaction span without passing the session root context as parent, so the OTel SDK assigns it a fresh random trace id. The interaction span ends up in its own one-span trace, while its semantic children (llm_request, tool, tool.execution) — which DO inherit the session context — land in the sessionId-derived trace. The hierarchy breaks across two unrelated trace ids.

Repro (real production data)

Session 33878ff9-1467-43ef-a692-634a21bbb1bf (cn-beijing, 2026-05-25 09:53–09:56) produced two distinct trace ids for the same session:

| Trace ID | Span count | What it contains |
| --- | --- | --- |
| 4acca441dba8071e51177f7c82fe0d07 | 101 spans | log-bridge events + llm_request + tool spans (uses sha256(sessionId)[:32]) |
| 54dbac5a0e1437165971820716388282 | 1 span | the qwen-code.interaction span only (uses OTel-generated random trace id) |

Both traces carry the same session.id tag, but they share no OTel trace context. Any trace viewer (ARMS / Jaeger / Zipkin) renders them as two disconnected items.
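A quick way to spot this condition in exported data is to group trace ids by session.id and flag any session that maps to more than one. The sketch below is illustrative only: the `SpanRecord` shape and the `findSplitSessions` helper are hypothetical, and the sample rows merely mirror the two trace ids from the table rather than reproducing the real export.

```typescript
// Hypothetical shape of an exported span record; the field names are assumptions.
interface SpanRecord {
  traceId: string;
  name: string;
  sessionId: string; // value of the session.id tag
}

// Collect the distinct trace ids seen per session and keep only the
// sessions whose spans are spread over more than one trace.
function findSplitSessions(spans: SpanRecord[]): Map<string, string[]> {
  const bySession = new Map<string, Set<string>>();
  for (const span of spans) {
    const ids = bySession.get(span.sessionId) ?? new Set<string>();
    ids.add(span.traceId);
    bySession.set(span.sessionId, ids);
  }
  const split = new Map<string, string[]>();
  bySession.forEach((ids, sessionId) => {
    if (ids.size > 1) split.set(sessionId, Array.from(ids));
  });
  return split;
}

// Illustrative rows modeled on the repro above (not the actual 102 spans).
const reproSession = '33878ff9-1467-43ef-a692-634a21bbb1bf';
const sampleSpans: SpanRecord[] = [
  { traceId: '4acca441dba8071e51177f7c82fe0d07', name: 'qwen-code.llm_request', sessionId: reproSession },
  { traceId: '4acca441dba8071e51177f7c82fe0d07', name: 'qwen-code.tool', sessionId: reproSession },
  { traceId: '54dbac5a0e1437165971820716388282', name: 'qwen-code.interaction', sessionId: reproSession },
];

const splitSessions = findSplitSessions(sampleSpans);
console.log(splitSessions);
```

A healthy session should not appear in the returned map at all; in the repro, the session maps to both trace ids.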

Root cause

session-tracing.ts has 4 startSpan call sites; only startInteractionSpan is missing the ctx parameter:

// L143 — qwen-code.interaction  ❌ MISSING ctx
const span = getTracer().startSpan(SPAN_INTERACTION, {
  kind: SpanKind.INTERNAL,
  attributes,
});

// L215 — qwen-code.llm_request  ✅
const span = getTracer().startSpan(
  SPAN_LLM_REQUEST,
  { kind: SpanKind.INTERNAL, attributes },
  ctx,
);

// L311 — qwen-code.tool  ✅
const span = getTracer().startSpan(
  SPAN_TOOL,
  { kind: SpanKind.INTERNAL, attributes },
  ctx,
);

// L416 — qwen-code.tool.execution  ✅
const span = getTracer().startSpan(
  SPAN_TOOL_EXECUTION,
  { kind: SpanKind.INTERNAL },
  ctx,
);

The intent that all session spans share one trace id is documented in tracer.ts:252-254:

All spans created within this context will share the same trace id, consistent with LogToSpanProcessor.
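The asymmetry between the four call sites can be illustrated with a toy model of parent-context inheritance. This is a deliberately simplified stand-in, not the OTel SDK: `ToyContext`, `ToySpan`, and `toyStartSpan` are hypothetical names, and the empty default context models the case where the active context carries no span (in the real API, omitting the third argument falls back to the active context).

```typescript
import { randomBytes } from 'node:crypto';

// Toy stand-ins for OTel concepts; none of these come from @opentelemetry/api.
interface ToyContext {
  parentTraceId?: string;
}
interface ToySpan {
  name: string;
  traceId: string;
}

// Models an active context that carries no span.
const EMPTY_CONTEXT: ToyContext = {};

// Inherit the trace id from the context when it has one;
// otherwise start a new trace with a random 16-byte (32 hex char) id.
function toyStartSpan(name: string, ctx: ToyContext = EMPTY_CONTEXT): ToySpan {
  const traceId = ctx.parentTraceId ?? randomBytes(16).toString('hex');
  return { name, traceId };
}

// Session root context carrying the sessionId-derived trace id from the repro.
const sessionCtx: ToyContext = { parentTraceId: '4acca441dba8071e51177f7c82fe0d07' };

const buggyInteraction = toyStartSpan('qwen-code.interaction'); // ctx omitted, as at L143
const llmRequest = toyStartSpan('qwen-code.llm_request', sessionCtx); // ctx passed, as at L215
const fixedInteraction = toyStartSpan('qwen-code.interaction', sessionCtx);

console.log(buggyInteraction.traceId === llmRequest.traceId); // false: separate traces
console.log(fixedInteraction.traceId === llmRequest.traceId); // true: same trace
```

Under this model the only difference between the broken and the working call is whether the session context is handed to the span constructor.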

Why #4126 didn't catch this

PR #4126 (feat(telemetry): unify span creation paths for hierarchical trace tree) did most of the heavy lifting: it wired llm_request / tool / tool.execution to inherit the parent interaction context. However, it changed only the children's context propagation, not how the interaction span itself is created. The result is a half-fixed hierarchy: the children correctly chain to the sessionId-derived trace id, but the parent does not.

Impact

  1. Hierarchy is broken in any trace viewer. A user looking at trace 4acca441… sees llm_request, tool, api_response spans as orphaned siblings under the synthetic session root, with no wrapping interaction span to group them by user turn.
  2. session.id → traceId derivation lookups silently miss data. Any tool that takes a sessionId and computes traceId = sha256(sessionId)[:32] then calls GetTrace will only fetch the log-bridge trace, completely missing every native interaction span (one per turn). This is the bug pattern the qwen-trace-query skill recently switched away from after hitting it in practice.
  3. interaction.sequence counter becomes confusing. Each interaction span sits in its own trace, so the sequence numbers (1, 2, 3, …) appear across N unrelated traces rather than as ordered children of one session trace.
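The lookup failure in item 2 can be sketched as follows. The helper names are hypothetical, and the code assumes that "sha256(sessionId)[:32]" means the first 32 hex characters of the SHA-256 hex digest (16 bytes, the width of an OTel trace id); the actual derivation in tracer.ts may differ in encoding details.

```typescript
import { createHash } from 'node:crypto';

// Assumption: the trace id is the first 32 hex chars of sha256(sessionId).
function deriveTraceIdSketch(sessionId: string): string {
  return createHash('sha256').update(sessionId).digest('hex').slice(0, 32);
}

// Given the trace ids actually observed for a session, return those that a
// lookup by derived trace id would never fetch.
function traceIdsMissedByDerivedLookup(sessionId: string, observedTraceIds: string[]): string[] {
  const derived = deriveTraceIdSketch(sessionId);
  return observedTraceIds.filter((id) => id !== derived);
}

const lookupSessionId = '33878ff9-1467-43ef-a692-634a21bbb1bf';
const derivedTraceId = deriveTraceIdSketch(lookupSessionId);

// One derived id plus one random id per interaction span, as in the bug.
const missed = traceIdsMissedByDerivedLookup(lookupSessionId, [
  derivedTraceId,
  '54dbac5a0e1437165971820716388282',
]);
console.log(missed); // the interaction span's trace id is not reachable from the session id
```

Once the interaction span is parented to the session context, every observed trace id should equal the derived one and the missed list should be empty.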

Suggested fix

A three-line change at session-tracing.ts:143:

const sessionCtx = getSessionContext() ?? otelContext.active();
const span = getTracer().startSpan(SPAN_INTERACTION, {
  kind: SpanKind.INTERNAL,
  attributes,
}, sessionCtx);

Alternatively, call the existing startSpanWithContext helper at tracer.ts:198, which already wraps this pattern.

A regression test would assert that interaction.spanContext().traceId === deriveTraceId(sessionId), matching the other three span types.

Affected version

Reproduced on origin/main @ 84f408017 (latest as of this writing).

Related

  • PR #4126, feat(telemetry): unify span creation paths for hierarchical trace tree (the work that addressed the children but missed the parent)
  • packages/core/src/telemetry/session-tracing.ts:143 — bug location
  • packages/core/src/telemetry/tracer.ts:252-266, createSessionRootContext (the context that should be the interaction's parent)

Metadata

Labels

  • category/telemetry: Telemetry and analytics
  • priority/P2: Medium - Moderately impactful, noticeable problem
  • status/needs-triage: Issue needs to be triaged and labeled
  • type/bug: Something isn't working as expected
