feat: OpenTelemetry trace export (closes #362) #383
Merged
Add opt-in OpenTelemetry trace export for waza eval runs. Span hierarchy is eval → task → turn → tool_call/model_call, with attributes following the OpenTelemetry GenAI semantic conventions (gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.tool.name, …). Off by default: when --otel-exporter is not set the OTel SDK is not initialized and there is no runtime overhead.

Three exporters are supported:

- --otel-exporter otlp --otel-endpoint host:port [--otel-headers …]
- --otel-exporter stdout
- --otel-exporter file --otel-file path

Default redaction strips prompt/tool-arg/tool-result/completion content; only sha256 and byte length are emitted. Pass --otel-include-payloads to record full content for private debugging.

The implementation is a thin wrapper around the existing event emission in internal/orchestration/runner.go; nothing about the runner's control flow changed. internal/telemetry encapsulates Provider/Config/spans and exposes no-op tracers when disabled, so callers can always wrap work in spans without conditionals.

Tests cover span hierarchy, GenAI attribute names, redaction behavior, header parsing, and the disabled-provider path without requiring a live backend.

Adds a docs/guides/otel.mdx walkthrough with a local OTel Collector example and updates the CLI reference + README.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
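The default redaction described above (digest and byte length only, full content on opt-in) can be sketched roughly as follows. This is a minimal illustration using only the Go standard library; the type RedactedPayload and the function redact are hypothetical names for this sketch and are not taken from the PR's code.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// RedactedPayload is a hypothetical type for this sketch; the PR's actual
// representation (span attributes) may differ.
type RedactedPayload struct {
	SHA256  string // hex-encoded SHA-256 digest of the payload
	Length  int    // payload size in bytes
	Content string // set only when payloads are explicitly included
}

// redact always records the digest and byte length. The raw content is kept
// only when includePayloads is true, mirroring --otel-include-payloads.
func redact(payload string, includePayloads bool) RedactedPayload {
	sum := sha256.Sum256([]byte(payload))
	r := RedactedPayload{
		SHA256: hex.EncodeToString(sum[:]),
		Length: len(payload),
	}
	if includePayloads {
		r.Content = payload
	}
	return r
}

func main() {
	r := redact("hello", false)
	fmt.Printf("sha256=%s length=%d content=%q\n", r.SHA256, r.Length, r.Content)
}
```

The digest still lets two runs be compared for identical prompts or tool results without the content itself leaving the machine.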
Contributor
Pull request overview
Adds an opt-in OpenTelemetry tracing pipeline to waza eval runs, aiming to export spans following the GenAI semantic conventions and document the new CLI flags and usage.
Changes:
- Introduces internal/telemetry/ (config/provider + span helpers) and wires it into waza run and the eval runner.
- Emits eval/task/turn spans plus after-the-fact tool_call/model_call spans from ExecutionResponse.
- Updates docs site + README with new --otel-* flags and an OpenTelemetry guide.
Show a summary per file
| File | Description |
|---|---|
| site/src/content/docs/reference/cli.mdx | Documents new --otel-* flags in CLI reference. |
| site/src/content/docs/guides/otel.mdx | Adds an OpenTelemetry tracing guide (hierarchy, collector example, redaction, headers). |
| site/astro.config.mjs | Adds the new guide to the docs sidebar. |
| README.md | Adds --otel-* flags to the README flag table. |
| internal/telemetry/config.go | Defines telemetry config, validation, and header parsing. |
| internal/telemetry/provider.go | Implements tracer provider initialization and OTLP/stdout/file exporters. |
| internal/telemetry/semconv.go | Centralizes GenAI and waza.* attribute keys. |
| internal/telemetry/spans.go | Implements span creation/recording helpers + payload redaction. |
| internal/telemetry/spans_test.go | Adds unit tests for hierarchy, attributes, redaction, and parsing/validation. |
| internal/orchestration/telemetry.go | Emits tool_call/model_call spans from execution responses. |
| internal/orchestration/runner.go | Wraps eval/task/turn execution in spans and links child spans under turns. |
| cmd/waza/cmd_run.go | Adds --otel-* CLI flags and initializes/shuts down telemetry provider. |
| go.mod | Adds OpenTelemetry-related dependencies. |
| go.sum | Records checksums for new dependencies. |
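
The config validation and header parsing that the table attributes to internal/telemetry/config.go might look roughly like the sketch below. Everything here is an assumption made for illustration: the names config, validate, and parseHeaders are hypothetical, the comma separator between k=v pairs is assumed, and so is the rule that the otlp and file exporters require an endpoint and a path respectively.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// config is a hypothetical stand-in for the PR's telemetry config.
type config struct {
	Exporter string // "", "otlp", "stdout", or "file"
	Endpoint string // host:port, used by the otlp exporter
	File     string // output path, used by the file exporter
}

// validate rejects unknown exporters and (by assumption) missing companions.
func (c config) validate() error {
	switch c.Exporter {
	case "", "stdout": // "" means telemetry is disabled
		return nil
	case "otlp":
		if c.Endpoint == "" {
			return errors.New("otlp exporter requires --otel-endpoint")
		}
		return nil
	case "file":
		if c.File == "" {
			return errors.New("file exporter requires --otel-file")
		}
		return nil
	default:
		return fmt.Errorf("unknown exporter %q", c.Exporter)
	}
}

// parseHeaders parses "k1=v1,k2=v2" into a map. Splitting on the first "="
// keeps values that themselves contain "=" (for example base64 padding).
func parseHeaders(raw string) (map[string]string, error) {
	headers := map[string]string{}
	if strings.TrimSpace(raw) == "" {
		return headers, nil
	}
	for _, pair := range strings.Split(raw, ",") {
		k, v, ok := strings.Cut(pair, "=")
		k = strings.TrimSpace(k)
		if !ok || k == "" {
			return nil, fmt.Errorf("invalid header %q: want k=v", pair)
		}
		headers[k] = strings.TrimSpace(v)
	}
	return headers, nil
}

func main() {
	h, err := parseHeaders("Authorization=Bearer abc==, X-Team=evals")
	fmt.Println(h, err)
	fmt.Println(config{Exporter: "otlp"}.validate())
}
```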
Review details
- Files reviewed: 13/14 changed files
- Comments generated: 4
- Review effort level: Low
- Per-slot payload redaction keys (no collisions on multi-payload spans)
- Thread ToolCall.ID into gen_ai.tool.call.id
- Wrap every turn (initial, follow-ups, responder replies) in a turn span
- Emit NDJSON (one span per line) from stdout/file exporters

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Closes #362.
Adds opt-in OpenTelemetry trace export for waza eval runs. Off by default: when --otel-exporter is not set the OTel SDK is never initialized.

Span hierarchy

eval → task → turn → tool_call/model_call
Attributes follow the OpenTelemetry GenAI semantic conventions (gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.tool.name, gen_ai.tool.call.id, …) plus a small set of waza.* keys for eval/task identity.

New flags
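| Flag | Description |
|---|---|
| --otel-exporter | otlp, stdout, or file (default: off) |
| --otel-endpoint | Endpoint (host:port) for the otlp exporter |
| --otel-headers | k=v headers |
| --otel-file | Output path for the file exporter |
| --otel-include-payloads | Record full payload content (default: sha256 + length only) |

Implementation notes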
- internal/telemetry/ is the new self-contained package. Provider returns a no-op tracer when disabled so callers always wrap work in spans without conditionals.
- internal/orchestration/runner.go is wrapped (not refactored): RunBenchmark/runTest/executeRun open the eval/task/turn spans; internal/orchestration/telemetry.go emits child tool_call/model_call spans from ExecutionResponse.ToolCalls and Usage.ModelMetrics.
- provider.SetGlobal() registers the configured tracer provider so engines or downstream HTTP clients that read the global provider attach to the same trace. No Copilot SDK changes required.
- Default redaction emits only sha256 + byte length for prompts, tool args, tool results, and completions. --otel-include-payloads opts into full content for private debugging.
- The new dependencies (go.opentelemetry.io/otel/sdk, exporters/otlp/otlptrace/otlptracehttp, exporters/stdout/stdouttrace) are only used when the exporter is enabled, so there is zero runtime cost for users who never opt in.

Tests
- internal/telemetry/spans_test.go covers span hierarchy, GenAI attribute names, default redaction, --otel-include-payloads, no-op behavior when disabled, header parsing, and config validation, all without requiring a live backend (uses tracetest.InMemoryExporter with WithSyncer).
- go test ./... and golangci-lint run ./... pass locally.

Docs
- site/src/content/docs/guides/otel.mdx: span hierarchy diagram, local OTel Collector docker-compose example, redaction notes, auth headers.
- site/src/content/docs/reference/cli.mdx and README.md flag tables updated.
- Sidebar entry added in site/astro.config.mjs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
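
The no-op behavior mentioned in the implementation notes (callers wrap work in spans without checking whether tracing is enabled) can be illustrated with a small standalone sketch. The Span and Tracer interfaces below are hypothetical simplifications, not the OpenTelemetry API and not the PR's Provider; newTracer and runTurn are likewise invented for this example.

```go
package main

import "fmt"

// Span and Tracer are hypothetical interfaces for this sketch only.
type Span interface{ End() }
type Tracer interface{ Start(name string) Span }

// noopTracer hands out spans that do nothing, so disabled tracing costs
// little more than an interface call.
type noopSpan struct{}

func (noopSpan) End() {}

type noopTracer struct{}

func (noopTracer) Start(string) Span { return noopSpan{} }

// recordingTracer stands in for a real exporter-backed tracer.
type recordingTracer struct{ ended []string }

type recordingSpan struct {
	t    *recordingTracer
	name string
}

func (s recordingSpan) End() { s.t.ended = append(s.t.ended, s.name) }

func (t *recordingTracer) Start(name string) Span {
	return recordingSpan{t: t, name: name}
}

// newTracer returns the no-op tracer when no exporter is configured.
func newTracer(exporter string) Tracer {
	if exporter == "" {
		return noopTracer{}
	}
	return &recordingTracer{}
}

// runTurn wraps work in a span with no "is tracing enabled?" branch.
func runTurn(t Tracer, work func()) {
	span := t.Start("turn")
	defer span.End()
	work()
}

func main() {
	for _, exporter := range []string{"", "stdout"} {
		t := newTracer(exporter)
		runTurn(t, func() {})
		if rec, ok := t.(*recordingTracer); ok {
			fmt.Println("recorded spans:", rec.ended)
		} else {
			fmt.Println("tracing disabled: no spans recorded")
		}
	}
}
```

The design choice is that the enabled/disabled decision is made once, at construction, and call sites such as runTurn stay identical in both modes.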