feat(telemetry): inject traceId/spanId into debug log files for OTel correlation#3847
Conversation
There was a problem hiding this comment.
Pull request overview
Adds OpenTelemetry correlation metadata to debug logging and introduces helper spans around API/tool execution so telemetry and local debug logs can be tied together by trace IDs.
Changes:
- Injects
trace_id/span_idinto debug log lines, preferring the active OTel span and falling back to session-derived IDs. - Introduces shared telemetry helpers (
trace-id-utils,session-context,tracer) for deterministic trace IDs and session-root span context. - Wraps content generation and tool execution paths in spans, and updates related tests/utilities.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
packages/core/src/utils/debugLogger.ts |
Adds trace/span context lookup and injects it into debug log lines. |
packages/core/src/utils/debugLogger.test.ts |
Expands logger tests for trace-context formatting and fallback behavior. |
packages/core/src/telemetry/tracer.ts |
New helper module for withSpan, manual span context, and session-root context creation. |
packages/core/src/telemetry/trace-id-utils.ts |
New shared utilities for deterministic trace IDs and random hex/span IDs. |
packages/core/src/telemetry/trace-id-utils.test.ts |
Adds tests for deterministic and random ID helpers. |
packages/core/src/telemetry/session-context.ts |
Adds singleton storage for the session root OTel context. |
packages/core/src/telemetry/sdk.ts |
Seeds and clears session telemetry context during SDK init/shutdown. |
packages/core/src/telemetry/log-to-span-processor.ts |
Reuses the new shared trace ID utilities. |
packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts |
Wraps request and streaming generation paths in telemetry spans. |
packages/core/src/core/coreToolScheduler.ts |
Wraps tool execution in spans and sets status on failure/denial paths. |
packages/core/src/core/client.ts |
Wraps client generateContent calls in a higher-level span. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
wenshao
left a comment
There was a problem hiding this comment.
Test coverage gaps
The new telemetry wrapping layer introduced by this PR has zero behavioral test coverage:
tracer.ts(new):withSpan,startSpanWithContext,createSessionRootContext— 0 testsclient.test.ts:generateContentnow wrapped inwithSpan— no span assertions addedcoreToolScheduler.test.ts:executeSingleToolCallnow wrapped inwithSpan, 5 error/cancel branches — no span assertionsloggingContentGenerator.test.ts:generateContent+generateContentStreamspan wrapping — no span assertionssdk.test.ts:setSessionContextcalls ininitializeTelemetry/shutdownTelemetrynot verified
The withSpan status-overwriting bug (inline comment at tracer.ts) would have been caught by a unit test. Consider adding tracer.test.ts covering: withSpan success/error paths, startSpanWithContext+runInContext, and createSessionRootContext determinism.
Other issues
log-to-span-processor.ts:deriveTraceId()called on everyonEmitwithout caching (redundant SHA-256) —debugLogger.tsalready caches it.coreToolScheduler.ts: hookblockReason/ tool error messages written directly intospan.setStatus()message field and exported via OTel — may contain file paths or command output.
wenshao
left a comment
There was a problem hiding this comment.
Posted Codex qreview findings from the latest review pass.
Code Coverage Summary
CLI Package - Full Text ReportCore Package - Full Text ReportFor detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
wenshao
left a comment
There was a problem hiding this comment.
Review summary
Nice piece of work — the trace-context priority (active span → session-derived fallback → omit) is exactly right, the SHA-256 caching avoids per-log-line hashing in both LogToSpanProcessor and debugLogger, and wasStatusSet() correctly preserves caller-set ERROR statuses across both throw and non-throw paths. Extracting trace-id-utils.ts is a clean dedupe.
A few concerns left as inline comments. Highlights:
- Streaming span lifetime —
loggingStreamWrapper'sfinally { span?.end() }only fires when the generator naturally completes or.return()/.throw()is called. Manual.next()consumers that abandon iteration will leak the span. runInSpancoverage gap — the for-await loop body itself isn't wrapped, only the trailing_logApiResponse/_logApiErrorcalls. Works today (no logs in the loop) but fragile.SpanStatusCode.UNSETas sentinel incoreToolScheduleris a code smell — UNSET is the OTel default and a reader might delete the call thinking it's a no-op. Cleaner to add a{ autoOkOnSuccess: false }option towithSpan.- Mutating the input span in
wrapSpanWithStatusTrackingworks but a Proxy/wrapper would be more hygienic.
Other points (not worth blocking inline comments)
log-to-span-processor.tsline 125 still hardcodestraceFlags: 1, // SAMPLED. Newtracer.tscode usesTraceFlags.SAMPLED; please align the constant for consistency.- The OTel resource's
session.idinsdk.ts:143is set once at init and stays stale after/clear. Per-recordsession.id(viagetCommonAttributesinloggers.ts) is correctly dynamic so traces and logs DO still correlate via the bridge — but backends that inspect resource-levelsession.idwill see the old value. This is pre-existing, not introduced here, but worth a follow-up. - Test coverage gaps: no test for
generateContentStreamspan lifecycle (success / consumer-break / iteration error), and no coverage of the 5 distinct status-setting paths incoreToolScheduler.attemptToolExecution(hook denial / postHook stop / success / toolResult.error / exception / abort). With a 350-line method body that's the riskiest part of the PR. tracer.test.tsmocks@opentelemetry/apiheavily — the parent-context priority logic isn't exercised against the real OTel API, so wiring bugs ingetParentContextwouldn't be caught here.
Recommendations
- Split out the unrelated whitespace/style churn (
monitorRegistry.ts,skill-manager.ts,skill-manager.test.ts,symlinkScope.test.ts) into a separate commit. - Add a streaming-span-lifecycle test and tool-span-status tests for the major paths.
- Decide on the
UNSETsentinel approach — either replace with an explicit option, or add a code comment explaining whysetStatus({ code: UNSET })is load-bearing here.
68b5a90 to
18fdaeb
Compare
…correlation Closes #3846 Add trace context injection into ~/.qwen/debug/{sessionId}.txt so every log line carries [trace_id=xxx span_id=yyy], enabling correlation between debug logs and OpenTelemetry traces in backends (e.g. Alibaba Cloud SLS). New modules: - trace-id-utils.ts: shared deriveTraceId (SHA-256), randomSpanId, randomHexString — single source of truth for ID generation used by both LogToSpanProcessor and debugLogger - session-context.ts: neutral singleton to break sdk.ts ↔ tracer.ts circular dependency - tracer.ts: withSpan (auto-lifecycle), startSpanWithContext (manual lifecycle for async generators), createSessionRootContext Key behaviour: - Priority: active OTel span > session-derived fallback > omit - Fallback traceId is SHA-256(sessionId), consistent with LogToSpanProcessor - Fallback spanId is stable per session (not random per line) - generateContentStream span covers full stream lifetime via startSpanWithContext + runInContext - Tool failure paths (hook denial, toolResult.error, execution exception) now correctly set SpanStatusCode.ERROR on the tool span 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
1142aa0 to
3202bfa
Compare
- Fix withSpan overwriting caller-set ERROR status with OK by tracking whether the callback has already called setStatus - Fix stream span context not active during async generator iteration by passing spanContext and using context.with() in loggingStreamWrapper - Refresh session root context on session change (/clear, /resume) so new spans inherit the correct traceId - Add span.setStatus(ERROR) on both cancellation paths in executeSingleToolCall so cancelled executions don't appear as OK spans - Add tracer.test.ts with 10 unit tests covering withSpan success/error/ handled-failure paths, startSpanWithContext, and createSessionRootContext 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
Avoid redundant SHA-256 computation on every onEmit call by caching the derived traceId per session ID. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
- Change cancellation span status from ERROR to UNSET (cancellation is not a fault per OTel semantic conventions) - Rename misleading "deterministic" test for createSessionRootContext since randomSpanId() makes each call non-deterministic - Add regression tests for refreshSessionContext verifying it updates session context when telemetry is initialized and is a no-op otherwise 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
- Wrap getTraceContext in try-catch so OTel failures never crash debugLogger call sites - Guard span.setStatus/span.end in catch blocks of loggingContentGenerator to prevent swallowing original API errors - Use ROOT_CONTEXT instead of context.active() in createSessionRootContext to ensure session roots are always top-level - Add load-bearing comments on UNSET setStatus calls in coreToolScheduler cancellation paths Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…y lint rule Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Code ReviewOverviewPR adds OTel trace context to
The motivation, design, and shared-trace-id approach are all sound. The implementation is well structured and reasonably tested. Strengths
Issues1.
|
| Risk | Severity |
|---|---|
UNSET cancellation pattern locks maintainers into the wrapper's tracking trick |
Medium |
Span attr prompt_id='' for the common case (no override) |
Low — easy fix, real correlation loss |
| Missing scheduler tests for the new span-status branches | Medium |
wrapSpanWithStatusTracking mutates passed-in span |
Low |
Defensive catch blocks around OTel calls |
Low |
Overall the design is good and the change does what it claims. I'd ask for items 1, 3, and 5 before merge; the rest are polish.
Activate the stream span while providers eagerly create HTTP requests, and resolve client prompt IDs before creating the generateContent span. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
|
@wenshao 已在
补充测试覆盖了:使用 验证:
另外, |
wenshao
left a comment
There was a problem hiding this comment.
Second-opinion review with glm-5.1 (prior review used gpt-5.5). No Critical issues found. 4 Suggestion-level findings:
tracer.ts:206TraceFlags.SAMPLED — already flagged in the open comment above; the code still usesSAMPLEDdespite a prior reply claiming it was changed toNONE. Reiterating for visibility.- Duplicated
API_CALL_FAILED_SPAN_STATUS_MESSAGE— identical constant defined independently inclient.tsandloggingContentGenerator.ts. - Three inconsistent safe-span-status patterns —
tracer.tshassafeSetStatus(),client.tsuses inlinetry/catch,coreToolScheduler.tshas local helpers. Worth consolidating. warnTelemetryOperationFailedrate-limiting untested — the 30s suppression window and suppressed-count logic have no test coverage.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Fix misleading test name: clarify that only traceId is deterministic (spanId is random) in createSessionRootContext test - Use TraceFlags.SAMPLED constant instead of hardcoded 1 in mock span - Add regression test verifying Config.startNewSession() calls refreshSessionContext with the new session ID Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
deriveSpanStatus now checks for the OTel semantic convention `error.message` attribute in addition to `error`, `error_message`, and `error_type` when determining whether a bridged span should be marked ERROR. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
wenshao
left a comment
There was a problem hiding this comment.
No issues found. LGTM! ✅ — gpt-5.5 via Qwen Code /review
PR #4058 Review Follow-up — All ResolvedFollow-up PR: #4058 All unresolved review feedback has been addressed: Must Address1. TraceFlags.SAMPLED hardcoding makes OTEL_TRACES_SAMPLER ineffective (raised by @wenshao 3×, Copilot)
2. OTel Resource session.id stale after /clear or /resume (raised by @wenshao 2×)
Optional3. Consumer-side stream logs outside span context (raised by @wenshao)
4. setStatus(UNSET) load-bearing hack (raised by @wenshao)
5. Abandoned generator span leak (raised by @wenshao)
6. Loop body not in runInSpan (raised by @wenshao)
|
|
Re: TraceFlags.SAMPLED hardcoding — Fixed in #4058. |
|
Re: OTel Resource session.id stale after session change — Fixed in #4058. Added |
|
Re: Span lifetime concern — abandoned generator — Fixed in #4058. Added defensive 5-minute timeout ( |
|
Re: runInSpan only wraps trailing calls, not loop body — Fixed in #4058. |
|
Re: Still open — TraceFlags.SAMPLED makes sampler ineffective — Fixed in #4058. |
|
Re: setToolSpanCancelled sets UNSET which is the default initial state — Fixed in #4058. Added |
|
Re: Partial fix — consumer-side stream logs still outside span — Fixed in #4058. |
|
Re: Critical — createSessionRootContext with NONE breaks default sampler — Fixed in #4058. |
…ion (#4058) * fix(telemetry): address PR #3847 review follow-ups for trace correlation Addresses unresolved review feedback from PR #3847: - Respect OTEL_TRACES_SAMPLER env var when setting TraceFlags on the synthetic session root, so custom samplers (e.g. traceidratio) are not bypassed by forced SAMPLED flag - Store current session ID in session-context.ts and use it as a fallback in LogToSpanProcessor when the OTel Resource session.id attribute is stale after /clear or /resume - Wrap for-await loop body in runInSpan() so debug logs emitted during stream iteration see the stream span as active - Add autoOkOnSuccess option to withSpan, eliminating the need for the load-bearing setStatus(UNSET) hack in cancellation paths - Add defensive 5-minute timeout for stream spans to prevent leaks from abandoned generators 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(telemetry): address review issues in PR #4058 - Fix sdk.test.ts assertion to match new two-arg setSessionContext call - Add spanEnded guard to prevent double span.end() when timeout fires - Add .trim() to OTEL_TRACES_SAMPLER env var for robustness - Pass sessionId in debugLogger.test.ts for signature completeness - Clarify log-to-span-processor comment: fallback covers "missing" not "stale" Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(telemetry): adopt review feedback for sampler and idle timeout - Fix shouldForceSampled to force SAMPLED for all parentbased_* samplers, not just parentbased_always_on — parentbased samplers delegate to localParentNotSampled (default AlwaysOff) when parent has NONE, which silently drops all traces - Convert stream span timeout from fixed wall-clock to idle timeout: reset timer on each chunk so legitimately long streams are never affected; timeout only fires when no chunks arrive for 5 minutes - Add test for parentbased_traceidratio → TraceFlags.SAMPLED * fix(telemetry): guard resetSpanTimeout against already-ended span Prevent zombie timer accumulation when chunks arrive after idle timeout has already ended the span. * fix(telemetry): handle parentbased_always_off sampler and fill test gaps - shouldForceSampled() now returns false for parentbased_always_off, preventing silent over-sampling that contradicts user intent. - Updated JSDoc to document the always_on exception for non-parentbased samplers. - Added test for parentbased_always_off → TraceFlags.NONE. - Added test for success path resilience when safeSetStatus throws in coreToolScheduler. * fix(telemetry): harden span timeout ordering and session ID fallback - Swap spanEnded/span.end() order so finally block can retry if end() throws. - Clear idle timeout immediately after for-await loop exits, before post-loop processing. - Use || instead of ?? for session.id fallback to handle empty strings. * fix(telemetry): address review round 4 — docs, cleanup, and test gaps - Expand shouldForceSampled JSDoc: document env-var assumption, parentbased_traceidratio 100% sampling semantics. - Remove unnecessary runInSpan wrapper around chunk field assignments. - Add stream.timed_out span attribute when idle timeout fires. - Deduplicate config.getSessionId() call in initializeTelemetry. - Add tests for always_on and always_off sampler values. * fix(telemetry): harden timeout callback ordering and add || comment - Move safeSetStatus before setAttribute in timeout callback so ERROR status is set even if setAttribute throws. - Set spanEnded=true before span.end() so finally block never overwrites ERROR with OK if end() throws. - Add comment explaining || vs ?? choice for session.id fallback. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…rrelation (QwenLM#4058) - Respect OTEL_TRACES_SAMPLER env var when setting TraceFlags - Store current session ID in session-context.ts as fallback in LogToSpanProcessor - Wrap for-await loop body in runInSpan() for debug logs during stream iteration - Add autoOkOnSuccess option to withSpan - Add defensive 5-minute idle timeout for stream spans to prevent leaks Cherry-picked from upstream QwenLM/qwen-code commit 4bb8dc8. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Closes #3846
Summary
[trace_id=xxx span_id=yyy]into~/.qwen/debug/{sessionId}.txtlog lines only when an active OTel span or initialized telemetry session context is available, enabling correlation with OpenTelemetry backends without emitting synthetic trace IDs when telemetry is offNew files
trace-id-utils.tssession-context.tstracer.tstrace-id-utils.test.tstracer.test.tssession-context.test.tsKey design decisions
SHA-256(sessionId)[0:32]API call failed/API call aborted; detailed upstream errors stay in provider-level telemetry and local reporting人工验证
运行:



日志包含traceId和spanId:
span有记录
Test plan
Input/output
Input:
Output:
Input/output
Input:
Output:
Input/output
Input:
Output:
Input/output
Input:
Output:
~/.qwen/debug/latestcontainstrace_id=andspan_id=fieldsInput/output
Input:
Output:
Verification input:
Verification output:
QWEN_TELEMETRY_OTLP_ENDPOINTwith a local OTLP HTTP receiver and confirm debug log traceIds match exported span/log traceIdsInput/output
Input:
CLI output:
OTLP receiver output:
Debug log traceId:
🤖 Generated with Qwen Code
Input/output
Input:
Output: