feat: add tracing using otel#740
Conversation
|
There was a problem hiding this comment.
Pull request overview
This PR adds OpenTelemetry support to Sablier, enabling distributed tracing and metrics collection through OTLP gRPC exporters. The implementation includes configuration options via YAML, environment variables, and CLI flags, along with comprehensive documentation.
- Introduces a new
pkg/tracingpackage with telemetry initialization and metrics definitions - Adds configuration support for enabling/disabling tracing and setting OTLP endpoint
- Integrates HTTP request tracing via otelgin middleware
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| sablier.sample.yaml | Adds tracing configuration section with enabled flag and endpoint |
| pkg/tracing/tracing.go | Implements OpenTelemetry initialization and shutdown for traces and metrics |
| pkg/tracing/metrics.go | Defines metrics and recording functions for sessions, instances, and requests |
| pkg/sabliercmd/start.go | Initializes tracing on application startup with graceful shutdown |
| pkg/sabliercmd/root.go | Adds CLI flags for tracing configuration |
| pkg/config/tracing.go | Defines tracing configuration struct with default values |
| pkg/config/configuration.go | Integrates tracing config into main configuration |
| internal/server/server.go | Adds otelgin middleware for HTTP request instrumentation |
| go.mod, go.sum | Adds OpenTelemetry dependencies at version 1.38.0 |
| docs/tracing.md | Comprehensive documentation on using OpenTelemetry with Sablier |
| docs/_sidebar.md | Adds tracing documentation to sidebar navigation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| var err error | ||
| if t.tracerProvider != nil { | ||
| if shutdownErr := t.tracerProvider.Shutdown(ctx); shutdownErr != nil { | ||
| err = shutdownErr | ||
| t.logger.Error("failed to shutdown tracer provider", "error", err) | ||
| } | ||
| } | ||
|
|
||
| if t.meterProvider != nil { | ||
| if shutdownErr := t.meterProvider.Shutdown(ctx); shutdownErr != nil { | ||
| err = shutdownErr | ||
| t.logger.Error("failed to shutdown meter provider", "error", err) | ||
| } | ||
| } |
There was a problem hiding this comment.
The Shutdown method only returns the last error encountered, discarding any previous errors. If the tracer provider fails to shut down (line 107-109), that error will be lost if the meter provider subsequently fails (line 114-115). Use a multierror pattern or errors.Join to preserve all shutdown errors.
| requestsDuration metric.Float64Histogram | ||
| } | ||
|
|
||
| func InitMetrics() (*Metrics, error) { |
There was a problem hiding this comment.
The InitMetrics function is defined but never called anywhere in the codebase. The metrics defined here (sessions.active, sessions.total, instances.started, etc.) will never be initialized or recorded, making the metrics functionality non-operational. InitMetrics should be called during application startup in pkg/sabliercmd/start.go after tracing initialization.
| traceExporter, err := otlptrace.New(ctx, otlptracegrpc.NewClient( | ||
| otlptracegrpc.WithEndpoint(cfg.Endpoint), | ||
| otlptracegrpc.WithInsecure(), | ||
| )) |
There was a problem hiding this comment.
The OTLP trace exporter is configured with WithInsecure(), disabling TLS encryption for all connections. This exposes telemetry data (which may contain sensitive information like instance names, strategies, and request details) to interception. Consider adding a configuration option to enable/disable TLS and default to secure connections, or at minimum, add a warning in the documentation about the security implications.
| metricExporter, err := otlpmetricgrpc.New(ctx, | ||
| otlpmetricgrpc.WithEndpoint(cfg.Endpoint), | ||
| otlpmetricgrpc.WithInsecure(), | ||
| ) |
There was a problem hiding this comment.
The OTLP metric exporter is configured with WithInsecure(), disabling TLS encryption. This has the same security concerns as the trace exporter - metrics data will be transmitted unencrypted. Add a TLS configuration option or default to secure connections.
| - **Provider operations** - Instance start, stop, inspect, list, and group operations | ||
| - **Session management** - Session creation and lifecycle |
There was a problem hiding this comment.
The documentation claims automatic instrumentation for 'Provider operations' and 'Session management', but the code changes only add the otelgin middleware for HTTP request tracing. There is no tracing instrumentation added to provider operations (start/stop/inspect/list) or session lifecycle code. Either add this instrumentation or update the documentation to accurately reflect what is currently instrumented.
| - **Provider operations** - Instance start, stop, inspect, list, and group operations | |
| - **Session management** - Session creation and lifecycle |
❌ 6 Tests Failed:
View the top 3 failed tests by shortest run time
📣 Thoughts on this report? Let Codecov know! | Powered by Codecov |
❌ 6 Tests Failed:
View the top 3 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
|



No description provided.