
feat: OpenTelemetry trace export (closes #362) #383

Merged: spboyer merged 2 commits into main from spboyer-issue-362-otel-export on Jun 28, 2026

@spboyer spboyer commented Jun 28, 2026

Member

Closes #362.

Adds opt-in OpenTelemetry trace export for waza eval runs. Off by default — when --otel-exporter is not set the OTel SDK is never initialized.

Span hierarchy

graph TD
  Eval["eval"] --> Task["task"]
  Task --> Turn["turn (kind=initial)"]
  Turn --> Tool["tool_call"]
  Turn --> Model["model_call"]

Attributes follow the OpenTelemetry GenAI semantic conventions (gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.tool.name, gen_ai.tool.call.id, …) plus a small set of waza.* keys for eval/task identity.

New flags

| Flag | Description |
| --- | --- |
| --otel-exporter | otlp, stdout, or file (default: off) |
| --otel-endpoint | OTLP endpoint (host:port or URL) |
| --otel-headers | Comma-separated k=v headers |
| --otel-file | Output path for file exporter |
| --otel-include-payloads | Include prompt/tool/completion content (default: redacted to sha256+length) |
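
For illustration, here is a minimal sketch of how a comma-separated k=v value such as the one passed to --otel-headers could be parsed. The function name parseHeaders and its handling of whitespace, empty input, and values containing '=' are assumptions; the PR's actual parser in internal/telemetry/config.go may behave differently.

```go
package main

import (
	"fmt"
	"strings"
)

// parseHeaders splits a comma-separated list of k=v pairs into a map.
// Hypothetical helper: the real parser may treat whitespace, duplicate
// keys, or quoting differently.
func parseHeaders(raw string) (map[string]string, error) {
	headers := make(map[string]string)
	if strings.TrimSpace(raw) == "" {
		return headers, nil
	}
	for _, pair := range strings.Split(raw, ",") {
		// Cut at the first '=' so that values may themselves contain '='.
		key, value, found := strings.Cut(pair, "=")
		key = strings.TrimSpace(key)
		if !found || key == "" {
			return nil, fmt.Errorf("invalid header %q: want k=v", pair)
		}
		headers[key] = strings.TrimSpace(value)
	}
	return headers, nil
}

func main() {
	headers, err := parseHeaders("authorization=Bearer abc123, x-team=evals")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(headers["authorization"], headers["x-team"])
}
```

Splitting on the first '=' only is a deliberate choice in this sketch, since auth tokens are often base64-padded and end in '='.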

Implementation notes

  • internal/telemetry/ is the new self-contained package. Provider returns a no-op tracer when disabled so callers always wrap work in spans without conditionals.
  • internal/orchestration/runner.go is wrapped (not refactored) — RunBenchmark/runTest/executeRun open the eval/task/turn spans; internal/orchestration/telemetry.go emits child tool_call/model_call spans from ExecutionResponse.ToolCalls and Usage.ModelMetrics.
  • Best-effort trace ID propagation: provider.SetGlobal() registers the configured tracer provider so engines or downstream HTTP clients that read the global provider attach to the same trace. No Copilot SDK changes required.
  • Default redaction emits only sha256 + byte length for prompts, tool args, tool results, and completions. --otel-include-payloads opts into full content for private debugging.
  • The new OTel deps (go.opentelemetry.io/otel/sdk, exporters/otlp/otlptrace/otlptracehttp, exporters/stdout/stdouttrace) are only used when the exporter is enabled — zero runtime cost for users who never opt in.
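
The default redaction described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in internal/telemetry/spans.go: the function name redactedAttrs and the "waza.<slot>.*" attribute keys are hypothetical, since the PR does not list the real key names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// redactedAttrs returns the attributes recorded for one payload slot
// (for example "prompt" or "tool_result"). The "waza.<slot>.*" key names
// are hypothetical placeholders.
func redactedAttrs(slot, payload string, includePayloads bool) map[string]any {
	sum := sha256.Sum256([]byte(payload))
	attrs := map[string]any{
		"waza." + slot + ".sha256": hex.EncodeToString(sum[:]),
		"waza." + slot + ".length": len(payload), // byte length, not rune count
	}
	// Full content is attached only when the caller explicitly opts in.
	if includePayloads {
		attrs["waza."+slot+".content"] = payload
	}
	return attrs
}

func main() {
	for key, value := range redactedAttrs("prompt", "hello", false) {
		fmt.Printf("%s=%v\n", key, value)
	}
}
```

Keying the attributes by slot keeps a span that carries several payloads (say, tool arguments and a tool result) from overwriting one hash with another.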

Tests

  • internal/telemetry/spans_test.go covers span hierarchy, GenAI attribute names, default redaction, --otel-include-payloads, no-op behavior when disabled, header parsing, and config validation — all without requiring a live backend (uses tracetest.InMemoryExporter with WithSyncer).
  • Full go test ./... and golangci-lint run ./... pass locally.

Docs

  • New guide: site/src/content/docs/guides/otel.mdx — span hierarchy diagram, local OTel Collector docker-compose example, redaction notes, auth headers.
  • site/src/content/docs/reference/cli.mdx and README.md flag tables updated.
  • Sidebar entry added in site/astro.config.mjs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add opt-in OpenTelemetry trace export for waza eval runs. Span hierarchy
is eval → task → turn → tool_call/model_call, with attributes following
the OpenTelemetry GenAI semantic conventions (gen_ai.system,
gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.tool.name, …).

Off by default — when --otel-exporter is not set the OTel SDK is not
initialized and there is no runtime overhead. Three exporters are
supported:

  --otel-exporter otlp   --otel-endpoint host:port [--otel-headers …]
  --otel-exporter stdout
  --otel-exporter file   --otel-file path
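
As a rough sketch of the validation these flag combinations imply (type, field, and method names are hypothetical; treating --otel-endpoint as mandatory for otlp is an assumption read off the usage lines above, where it is not bracketed):

```go
package main

import (
	"errors"
	"fmt"
)

// Config mirrors the --otel-* flags. Type and field names are hypothetical.
type Config struct {
	Exporter string // "", "otlp", "stdout", or "file"
	Endpoint string // value of --otel-endpoint
	File     string // value of --otel-file
}

// Enabled reports whether any exporter was requested.
func (c Config) Enabled() bool { return c.Exporter != "" }

// Validate checks that each exporter has the flag it depends on.
func (c Config) Validate() error {
	switch c.Exporter {
	case "", "stdout":
		return nil
	case "otlp":
		// Assumption: an explicit endpoint is required for otlp.
		if c.Endpoint == "" {
			return errors.New("--otel-exporter otlp requires --otel-endpoint")
		}
		return nil
	case "file":
		if c.File == "" {
			return errors.New("--otel-exporter file requires --otel-file")
		}
		return nil
	default:
		return fmt.Errorf("unknown --otel-exporter %q (want otlp, stdout, or file)", c.Exporter)
	}
}

func main() {
	for _, cfg := range []Config{
		{},
		{Exporter: "otlp"},
		{Exporter: "file", File: "traces.ndjson"},
	} {
		fmt.Printf("%+v enabled=%v err=%v\n", cfg, cfg.Enabled(), cfg.Validate())
	}
}
```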

Default redaction strips prompt/tool-arg/tool-result/completion content;
only sha256 and byte length are emitted. Pass --otel-include-payloads to
record full content for private debugging.

The implementation is a thin wrapper around the existing event emission
in internal/orchestration/runner.go; nothing about the runner's control
flow changed. internal/telemetry encapsulates Provider/Config/spans and
exposes no-op tracers when disabled, so callers can always wrap work in
spans without conditionals.
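
The no-op pattern can be illustrated with a stdlib-only sketch. The Span and Tracer interfaces below are simplified stand-ins, not the OpenTelemetry Go API, and every name here is hypothetical; the point is only that the call site in runTask is identical whether tracing is on or off.

```go
package main

import "fmt"

// Span and Tracer are simplified stand-ins for the OpenTelemetry interfaces.
type Span interface{ End() }
type Tracer interface{ Start(name string) Span }

// noopTracer hands out spans that do nothing.
type noopSpan struct{}

func (noopSpan) End() {}

type noopTracer struct{}

func (noopTracer) Start(string) Span { return noopSpan{} }

// recordingTracer remembers the names of ended spans, standing in for a
// tracer that is backed by a real exporter.
type recordingTracer struct{ ended []string }

type recordingSpan struct {
	tracer *recordingTracer
	name   string
}

func (t *recordingTracer) Start(name string) Span { return &recordingSpan{t, name} }
func (s *recordingSpan) End()                     { s.tracer.ended = append(s.tracer.ended, s.name) }

// newTracer returns the no-op tracer unless an exporter was configured.
func newTracer(exporter string) Tracer {
	if exporter == "" {
		return noopTracer{}
	}
	return &recordingTracer{}
}

// runTask wraps work in a span unconditionally; it never checks whether
// telemetry is enabled.
func runTask(tr Tracer, work func()) {
	span := tr.Start("task")
	defer span.End()
	work()
}

func main() {
	runTask(newTracer(""), func() { fmt.Println("ran with tracing disabled") })
	runTask(newTracer("stdout"), func() { fmt.Println("ran with tracing enabled") })
}
```

In the OpenTelemetry Go API, a comparable role is played by the provider in go.opentelemetry.io/otel/trace/noop; whether this PR uses that package is not stated here.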

Tests cover span hierarchy, GenAI attribute names, redaction behavior,
header parsing, and the disabled-provider path without requiring a live
backend. Adds a docs/guides/otel.mdx walkthrough with a local OTel
Collector example and updates the CLI reference + README.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 28, 2026 11:15

Copilot AI left a comment

Contributor


Pull request overview

Adds an opt-in OpenTelemetry tracing pipeline to waza eval runs, aiming to export spans following the GenAI semantic conventions and document the new CLI flags and usage.

Changes:

  • Introduces internal/telemetry/ (config/provider + span helpers) and wires it into waza run and the eval runner.
  • Emits eval/task/turn spans plus after-the-fact tool_call/model_call spans from ExecutionResponse.
  • Updates docs site + README with new --otel-* flags and an OpenTelemetry guide.
Summary per file

| File | Description |
| --- | --- |
| site/src/content/docs/reference/cli.mdx | Documents new --otel-* flags in CLI reference. |
| site/src/content/docs/guides/otel.mdx | Adds an OpenTelemetry tracing guide (hierarchy, collector example, redaction, headers). |
| site/astro.config.mjs | Adds the new guide to the docs sidebar. |
| README.md | Adds --otel-* flags to the README flag table. |
| internal/telemetry/config.go | Defines telemetry config, validation, and header parsing. |
| internal/telemetry/provider.go | Implements tracer provider initialization and OTLP/stdout/file exporters. |
| internal/telemetry/semconv.go | Centralizes GenAI and waza.* attribute keys. |
| internal/telemetry/spans.go | Implements span creation/recording helpers + payload redaction. |
| internal/telemetry/spans_test.go | Adds unit tests for hierarchy, attributes, redaction, and parsing/validation. |
| internal/orchestration/telemetry.go | Emits tool_call/model_call spans from execution responses. |
| internal/orchestration/runner.go | Wraps eval/task/turn execution in spans and links child spans under turns. |
| cmd/waza/cmd_run.go | Adds --otel-* CLI flags and initializes/shuts down telemetry provider. |
| go.mod | Adds OpenTelemetry-related dependencies. |
| go.sum | Records checksums for new dependencies. |

Review details

  • Files reviewed: 13/14 changed files
  • Comments generated: 4
  • Review effort level: Low

Comment thread internal/telemetry/spans.go Outdated
Comment thread internal/orchestration/telemetry.go
Comment thread internal/orchestration/runner.go
Comment thread internal/telemetry/provider.go
- Per-slot payload redaction keys (no collisions on multi-payload spans)
- Thread ToolCall.ID into gen_ai.tool.call.id
- Wrap every turn (initial, follow-ups, responder replies) in a turn span
- Emit NDJSON (one span per line) from stdout/file exporters

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
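
The NDJSON output mentioned in the commit above (one span per line) might look roughly like this. The spanRecord shape and its JSON field names are hypothetical, not the exporter's real schema; the sketch relies only on the fact that json.Encoder.Encode writes a trailing newline after each value.

```go
package main

import (
	"encoding/json"
	"io"
	"os"
	"strings"
)

// spanRecord is a hypothetical, simplified span shape for illustration.
type spanRecord struct {
	Name     string `json:"name"`
	TraceID  string `json:"trace_id"`
	ParentID string `json:"parent_id,omitempty"`
}

// writeNDJSON encodes each span as one JSON object per line.
func writeNDJSON(w io.Writer, spans []spanRecord) error {
	enc := json.NewEncoder(w) // Encode appends '\n' after every value
	for _, s := range spans {
		if err := enc.Encode(s); err != nil {
			return err
		}
	}
	return nil
}

// encodeNDJSON is a convenience wrapper that returns the output as a string.
func encodeNDJSON(spans []spanRecord) (string, error) {
	var b strings.Builder
	err := writeNDJSON(&b, spans)
	return b.String(), err
}

func main() {
	spans := []spanRecord{
		{Name: "eval", TraceID: "t1"},
		{Name: "task", TraceID: "t1", ParentID: "s1"},
	}
	if err := writeNDJSON(os.Stdout, spans); err != nil {
		os.Exit(1)
	}
}
```

One object per line lets the file be tailed, appended to, and processed with line-oriented tools without parsing a surrounding JSON array.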
@spboyer spboyer merged commit e43ae9f into main Jun 28, 2026
10 checks passed
@spboyer spboyer deleted the spboyer-issue-362-otel-export branch June 28, 2026 11:31
Development

Successfully merging this pull request may close these issues.

feat: OpenTelemetry trace export for agent runs (agentic-first observability)

3 participants