Feature Request: Add child spans for detailed OTEL traces #62964

@shaohq

Problem

Currently, OpenClaw's OTEL instrumentation only provides coarse-grained spans. For example, openclaw.message.processed is a single span that encompasses the entire message processing time (~228 seconds), but there is no breakdown of where that time is spent.

Current Spans Observed

  • openclaw.message.processed - entire message processing (no child spans)
  • openclaw.model.usage - model API call with attributes but no sub-steps
  • openclaw.session.stuck - session stuck detection

Desired Behavior

Add child spans under openclaw.message.processed to break down:

  1. Tool calls - time spent in each tool execution
  2. Model API latency - time for API request/response round-trip
  3. Tokenization - time spent calculating/counting tokens
  4. Response building - time spent constructing the final response
  5. Other sub-operations - any significant internal steps

Example Use Case

When debugging slow responses, developers need to understand where time is spent:

  • Is it waiting on an LLM API?
  • Is it running tool executions?
  • Is it processing tokens?
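As a rough illustration of the kind of answer child spans would make possible, the sketch below sums child span durations under a parent and reports the time no child accounts for. The `SpanRecord` shape, the `breakdown` helper, the `openclaw.model.request` span name, and the child durations are all hypothetical; only the ~228 second parent duration comes from the trace described above. It also assumes child spans do not overlap in time.

```typescript
// Hypothetical, simplified span record; real exported spans carry many more fields.
interface SpanRecord {
  id: string;
  parentId?: string;
  name: string;
  durationMs: number;
}

// Sum the durations of direct children per span name, and report how much
// of the parent's duration is not covered by any child span.
// Assumes children run sequentially (no overlap).
function breakdown(parent: SpanRecord, spans: SpanRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  let covered = 0;
  for (const s of spans) {
    if (s.parentId !== parent.id) continue;
    totals.set(s.name, (totals.get(s.name) ?? 0) + s.durationMs);
    covered += s.durationMs;
  }
  totals.set("(unattributed)", Math.max(0, parent.durationMs - covered));
  return totals;
}

// Illustrative numbers only: the child durations below are made up.
const parent: SpanRecord = { id: "p1", name: "openclaw.message.processed", durationMs: 228000 };
const children: SpanRecord[] = [
  { id: "c1", parentId: "p1", name: "openclaw.model.request", durationMs: 150000 },
  { id: "c2", parentId: "p1", name: "openclaw.tool.execution", durationMs: 40000 },
  { id: "c3", parentId: "p1", name: "openclaw.tool.execution", durationMs: 20000 },
];
const result = breakdown(parent, children);
```

With only the single coarse span available today, the whole duration would land in the "(unattributed)" bucket, which is the gap this request is about.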

Proposed Implementation

Wrap significant internal operations with child spans:

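import { context, trace } from "@opentelemetry/api";

const tracer = trace.getTracer("openclaw"); // tracer name is an assumption
const parentSpan = tracer.startSpan("openclaw.message.processed");
// Store the parent span in a context so child spans are linked to it.
const parentCtx = trace.setSpan(context.active(), parentSpan);
// ...
const toolSpan = tracer.startSpan("openclaw.tool.execution", {}, parentCtx);
// tool work
toolSpan.end();
// ...
parentSpan.end();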
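One caveat with manual start/end pairs like the above is that a span stays open if the wrapped work throws. A possible way to handle that is a small helper that ends the span in a `finally` block. The sketch below uses hypothetical `SpanLike` and `TracerLike` stand-ins rather than the real OpenTelemetry types so that it runs on its own; `withSpan` and `fakeTracer` are illustrative names, not part of OpenClaw or OTEL. It is synchronous only; asynchronous work would need to be awaited inside the `try`.

```typescript
// Minimal stand-ins for the span and tracer shapes (hypothetical, not the OTEL API types).
interface SpanLike {
  setAttribute(key: string, value: string): void;
  end(): void;
}
interface TracerLike {
  startSpan(name: string): SpanLike;
}

// Run `work` inside a span that is always ended, even if `work` throws.
function withSpan<T>(tracer: TracerLike, name: string, work: (span: SpanLike) => T): T {
  const span = tracer.startSpan(name);
  try {
    return work(span);
  } catch (err) {
    // Record the failure on the span, then let the caller handle it.
    span.setAttribute("error", String(err));
    throw err;
  } finally {
    span.end();
  }
}

// In-memory tracer used only to demonstrate the helper.
const ended: string[] = [];
const fakeTracer: TracerLike = {
  startSpan: (name) => ({
    setAttribute: () => {},
    end: () => { ended.push(name); },
  }),
};

const value = withSpan(fakeTracer, "openclaw.tool.execution", () => 42);
```

With the real `@opentelemetry/api` package, `tracer.startActiveSpan` offers a similar callback style and makes the new span the active one for nested calls.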

Environment

  • OpenClaw Version: 2026.4.5
  • OTLP Backend: Jaeger
  • Protocol: http/protobuf
