
fix(telemetry): add bounded shutdown timeout and fix service.version resource attribute #3811

@doudouOUC

Description

What would you like to be added?

Two production-critical fixes for the telemetry subsystem:

  1. Add a bounded timeout to shutdownTelemetry(). Currently, await sdk.shutdown() has no timeout, so shutdown hangs indefinitely if the OTLP endpoint is unreachable.
  2. Fix the service.version resource attribute. It is currently set to process.version (the Node.js version, e.g. v20.11.0) instead of the application version.

Why is this needed?

Bounded shutdown timeout

In packages/core/src/telemetry/sdk.ts, shutdownTelemetry() awaits sdk.shutdown() directly, with no time bound. When the configured OTLP endpoint is unreachable or slow, this blocks the CLI from exiting indefinitely. The LogToSpanProcessor already has a 30-second export timeout, but the SDK-level shutdown has no equivalent protection.

Expected behavior: shutdown should fail-open after a reasonable timeout (e.g. 10s) rather than hanging the process.

service.version bug

In packages/core/src/telemetry/sdk.ts, the resource is constructed with:

[SemanticResourceAttributes.SERVICE_VERSION]: process.version,

This sets service.version to the Node.js runtime version (e.g. v20.11.0) instead of the Qwen Code package version. All spans and metrics emitted carry the wrong version, making it impossible for backends to aggregate or filter by application version.

Suggested implementation

Shutdown timeout

  • Wrap sdk.shutdown() in Promise.race with a timeout promise (e.g. 10s)
  • On timeout, log a warning and resolve (fail-open) rather than hanging
  • Ensure telemetryInitialized and sdk are cleaned up even on timeout
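
The three bullets above could be sketched roughly as follows. This is a minimal illustration, not the actual sdk.ts code: ShutdownCapable, shutdownWithTimeout, shutdownTelemetrySketch, the state object, and the 10-second default are all hypothetical names and values chosen for the example. The only thing assumed about the real SDK is that shutdown() returns a promise.

```typescript
// Hypothetical stand-in for the SDK object; only shutdown() is assumed.
interface ShutdownCapable {
  shutdown(): Promise<void>;
}

type ShutdownOutcome = 'completed' | 'timed_out' | 'failed';

// Assumed default, taken from the "e.g. 10s" suggestion in this issue.
const SHUTDOWN_TIMEOUT_MS = 10_000;

// Races sdk.shutdown() against a timer and never rejects (fail-open).
async function shutdownWithTimeout(
  sdk: ShutdownCapable,
  timeoutMs: number = SHUTDOWN_TIMEOUT_MS,
): Promise<ShutdownOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<'timed_out'>((resolve) => {
    timer = setTimeout(() => resolve('timed_out'), timeoutMs);
  });
  try {
    const outcome = await Promise.race([
      sdk.shutdown().then(() => 'completed' as const),
      timeout,
    ]);
    if (outcome === 'timed_out') {
      console.warn(`telemetry shutdown timed out after ${timeoutMs}ms`);
    }
    return outcome;
  } catch (err) {
    // A rejected shutdown is logged and swallowed so the CLI can still exit.
    console.warn('telemetry shutdown failed:', err);
    return 'failed';
  } finally {
    // Clear the timer so it cannot keep the process alive after shutdown.
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Hypothetical holder for the module-level state named in this issue.
interface TelemetryState {
  sdk?: ShutdownCapable;
  telemetryInitialized: boolean;
}

// Resets the state whether shutdown completed, timed out, or failed.
async function shutdownTelemetrySketch(
  state: TelemetryState,
  timeoutMs: number = SHUTDOWN_TIMEOUT_MS,
): Promise<void> {
  if (!state.telemetryInitialized || !state.sdk) return;
  try {
    await shutdownWithTimeout(state.sdk, timeoutMs);
  } finally {
    state.sdk = undefined;
    state.telemetryInitialized = false;
  }
}
```

Clearing the timer in the finally block matters here: an uncleared 10-second timer would itself delay process exit after a fast, successful shutdown.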

service.version fix

  • Import or read the package version from package.json (or a constant derived at build time)
  • Replace process.version with the actual application version
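
One possible shape for the version fix, as a sketch under assumptions: parsePackageVersion and buildResourceAttributes are hypothetical helpers, the 'unknown' fallback is an assumption, and how the package.json text is obtained (file read, bundler import, or build-time constant) is left open. The 'service.version' key is the standard OpenTelemetry attribute name that SemanticResourceAttributes.SERVICE_VERSION refers to.

```typescript
// Standard OpenTelemetry semantic-convention key for the application version.
const SERVICE_VERSION_KEY = 'service.version';

// Assumed fallback when the version cannot be determined.
const UNKNOWN_VERSION = 'unknown';

// Extracts the "version" field from package.json text.
// Returns the fallback instead of throwing, so telemetry setup cannot fail here.
function parsePackageVersion(packageJsonText: string): string {
  try {
    const parsed: unknown = JSON.parse(packageJsonText);
    const version = (parsed as { version?: unknown } | null)?.version;
    return typeof version === 'string' && version.length > 0
      ? version
      : UNKNOWN_VERSION;
  } catch {
    return UNKNOWN_VERSION;
  }
}

// Builds the resource attributes with the application version,
// where the current code passes process.version.
function buildResourceAttributes(appVersion: string): Record<string, string> {
  return { [SERVICE_VERSION_KEY]: appVersion };
}

// Example usage with inline package.json text (illustrative values only).
const attrs = buildResourceAttributes(
  parsePackageVersion('{"name": "example-cli", "version": "1.2.3"}'),
);
console.log(attrs);
```

Keeping the parsing separate from the file access makes the helper easy to unit-test and lets a build-time constant replace it without touching the resource construction.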

Acceptance criteria

  • shutdownTelemetry() completes within a bounded timeout even when the OTLP backend is unreachable
  • service.version resource attribute reflects the Qwen Code application version, not the Node.js version
  • Existing tests continue to pass
  • New tests cover the bounded timeout behavior

Parent issue: #3731
