Fix stale status code reporting on self-observability metrics#8226
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #8226 +/- ##
=====================================
Coverage 82.6% 82.6%
=====================================
Files 310 310
Lines 24647 24648 +1
=====================================
+ Hits 20370 20381 +11
+ Misses 3897 3890 -7
+ Partials 380 377 -3
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR fixes incorrect self-observability metric attribution in the OTLP/HTTP trace and log exporters by ensuring the recorded HTTP status code reflects the final request attempt (or is omitted when no response was received), addressing issue #8218.
Changes:
- Reset per-attempt
statusCodeinside OTLP/HTTP retry loops to avoid stale status reporting. - Omit
http.response.status_codeattribute from self-observability metrics when no HTTP response was received (status == 0). - Add regression tests for the stale-status scenario in both trace and log HTTP exporters.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| exporters/otlp/otlptrace/otlptracehttp/client.go | Resets statusCode per retry attempt to prevent stale values leaking into metrics. |
| exporters/otlp/otlptrace/otlptracehttp/internal/observ/instrumentation.go | Avoids emitting http.response.status_code when no response status is known. |
| exporters/otlp/otlptrace/otlptracehttp/client_test.go | Adds regression test for stale status code in trace exporter self-observability metrics. |
| exporters/otlp/otlplog/otlploghttp/client.go | Resets statusCode per retry attempt to prevent stale values leaking into metrics. |
| exporters/otlp/otlplog/otlploghttp/internal/observ/instrumentation.go | Avoids emitting http.response.status_code when no response status is known. |
| exporters/otlp/otlplog/otlploghttp/client_test.go | Adds regression test for stale status code in log exporter self-observability metrics. |
| CHANGELOG.md | Notes the bug fix in the Unreleased “Fixed” section. |
b67a737 to
bb61d7d
Compare
|
Do we need observ to be implemented first for metrics, or can this change be applied to |
Co-authored-by: Flc゛ <i@flc.io>
…elemetry#8226) Fixes open-telemetry#8218 This now omits the status code attribute when a request fails before getting a response, and doesn't use the previous response's status code when reporting self-observability metrics.
### Added - Add `ByteSlice` and `ByteSliceValue` functions for new `BYTESLICE` attribute type in `go.opentelemetry.io/otel/attribute`. (#7948) - Apply attribute value limit to the `KindBytes` attribute type in `go.opentelemetry.io/otel/sdk/log`. (#7990) - Apply attribute value limit to the `BYTESLICE` attribute type in `go.opentelemetry.io/otel/sdk/trace`. (#7990) - Support `BYTESLICE` attributes in `go.opentelemetry.io/otel/trace`. (#8153) - Support `BYTESLICE` attributes in `go.opentelemetry.io/otel/exporters/otlp/otlptrace`. (#8153) - Support `BYTESLICE` attributes in `go.opentelemetry.io/otel/exporters/otlp/otlplog`. (#8153) - Support `BYTESLICE` attributes in `go.opentelemetry.io/otel/exporters/otlp/otlpmetric`. (#8153) - Support `BYTESLICE` attributes in `go.opentelemetry.io/otel/exporters/zipkin`. (#8153) - Add `String` method for `Value` type in `go.opentelemetry.io/otel/attribute`. (#8142) - Add `Slice` and `SliceValue` functions for new `SLICE` attribute type in `go.opentelemetry.io/otel/attribute`. (#8166) - Support `SLICE` attributes in `go.opentelemetry.io/otel/exporters/otlp/otlptrace`. (#8216) - Support `SLICE` attributes in `go.opentelemetry.io/otel/exporters/otlp/otlplog`. (#8216) - Support `SLICE` attributes in `go.opentelemetry.io/otel/exporters/otlp/otlpmetric`. (#8216) - Support `SLICE` attributes in `go.opentelemetry.io/otel/exporters/zipkin`. (#8216) - Apply `AttributeValueLengthLimit` to `attribute.SLICE` type attribute values in `go.opentelemetry.io/otel/sdk/trace`, recursively truncating contained string values. (#8217) - Add `Error` field on `Record` type in `go.opentelemetry.io/otel/log/logtest`. (#8148) - Add `WithMaxRequestSize` option in `go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc`. (#8157) - Add `WithMaxRequestSize` option in `go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp`. (#8157) - Add `WithMaxRequestSize` option in `go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc`. (#8157) - Add `WithMaxRequestSize` option in `go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp`. (#8157) - Add `WithMaxRequestSize` option in `go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc`. (#8157) - Add `WithMaxRequestSize` option in `go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp`. (#8157) - Add `Settable` to `go.opentelemetry.io/otel/metric/x` to allow reusing attribute options. (#8178) - Add experimental support for splitting metric data across multiple batches in `go.opentelemetry.io/otel/sdk/metric`. Set `OTEL_GO_X_METRIC_EXPORT_BATCH_SIZE=<max_size>` to enable for all periodic readers. See `go.opentelemetry.io/otel/sdk/metric/internal/x` for feature documentation. (#8071) - Add experimental self-observability metrics in `go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc`. Enable with `OTEL_GO_X_SELF_OBSERVABILITY=true` environment variable. See `go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc/internal/x` for feature documentation. (#8192) - Add experimental self-observability metrics in `go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp`. Enable with `OTEL_GO_X_SELF_OBSERVABILITY=true` environment variable. See `go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp/internal/x` for feature documentation. (#8194) - Add experimental self-observability metrics in `go.opentelemetry.io/otel/exporters/stdout/stdoutlog`. Enable with `OTEL_GO_X_SELF_OBSERVABILITY=true` environment variable. See `go.opentelemetry.io/otel/stdout/stdoutlog/internal/x` for feature documentation. (#8263) - Add `WithDefaultAttributes` to `go.opentelemetry.io/otel/metric/x` to support setting default attributes on instruments. (#8135) - Add `go.opentelemetry.io/otel/semconv/v1.41.0` package. The package contains semantic conventions from the `v1.41.0` version of the OpenTelemetry Semantic Conventions. See the [migration documentation](./semconv/v1.41.0/MIGRATION.md) for information on how to upgrade from `go.opentelemetry.io/otel/semconv/v1.40.0`. (#8324) - Add Observable variants of instruments to `go.opentelemetry.io/otel/semconv/v1.41.0` package. (#8350) - Generate explicit histogram bucket boundaries from weaver configuration for HTTP and RPC duration instruments in `go.opentelemetry.io/otel/semconv/v1.41.0`. (#8002) ### Changed -⚠️ **Breaking Change:** `go.opentelemetry.io/otel/sdk/metric` now applies a default cardinality limit of 2000 to comply with the Metrics SDK specification recommendation. New attribute sets are dropped when the cardinality limit is reached. The measurement of these sets are aggregated into a special attribute set containing `attribute.Bool("otel.metric.overflow", true)`. This can break users who relied on the previous unlimited default. Set `WithCardinalityLimit(0)` or the deprecated `OTEL_GO_X_CARDINALITY_LIMIT=0` environment variable to preserve unlimited cardinality. Note that support for `OTEL_GO_X_CARDINALITY_LIMIT` may be removed in a future release. (#8247) - `ErrorType` in `go.opentelemetry.io/otel/semconv` now unwraps errors created with `fmt.Errorf` when deriving the `error.type` attribute. (#8133) - `go.opentelemetry.io/otel/sdk/log` now unwraps error chains created with `fmt.Errorf` when deriving the `error.type` attribute from errors on log records. (#8133) - `Set.MarshalLog` method in `go.opentelemetry.io/otel/attribute` now uses `Value.String` formatting following the [OpenTelemetry AnyValue representation for non-OTLP protocols](https://opentelemetry.io/docs/specs/otel/common/#anyvalue). (#8169) - Optimize `go.opentelemetry.io/otel/sdk/metric` to return a drop reservoir and short-circuit `Offer` calls to the exemplar reservoir when `exemplar.AlwaysOffFilter` is configured. (#8211) (#8267) - Optimize `go.opentelemetry.io/otel/sdk/metric` to return a drop reservoir for asynchronous instruments when `exemplar.TraceBasedFilter` is configured. (#8286) ### Deprecated - Deprecate `Value.Emit` method in `go.opentelemetry.io/otel/attribute`. Use `Value.String` instead. (#8176) ### Fixed - Limit OTLP request size to 64 MiB by default in `go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc`. The limit applies before compression, oversized requests are treated as non-retryable errors, and the limit can be configured with the new `WithMaxRequestSize` option. (#8157, #8365) - Limit OTLP request size to 64 MiB by default in `go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp`. The limit applies before compression, oversized requests are treated as non-retryable errors, and the limit can be configured with the new `WithMaxRequestSize` option. (#8157, #8365) - Limit OTLP request size to 64 MiB by default in `go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc`. The limit applies before compression, oversized requests are treated as non-retryable errors, and the limit can be configured with the new `WithMaxRequestSize` option. (#8157, #8365) - Limit OTLP request size to 64 MiB by default in `go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp`. The limit applies before compression, oversized requests are treated as non-retryable errors, and the limit can be configured with the new `WithMaxRequestSize` option. (#8157, #8365) - Limit OTLP request size to 64 MiB by default in `go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc`. The limit applies before compression, oversized requests are treated as non-retryable errors, and the limit can be configured with the new `WithMaxRequestSize` option. (#8157, #8365) - Limit OTLP request size to 64 MiB by default in `go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp`. The limit applies before compression, oversized requests are treated as non-retryable errors, and the limit can be configured with the new `WithMaxRequestSize` option. (#8157, #8365) - Fix gzipped request body replay on redirect in `go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp`. (#8135) - Fix gzipped request body replay on redirect in `go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp`. (#8152) - `go.opentelemetry.io/otel/exporters/prometheus` now uses `Value.String` formatting for label values following the [OpenTelemetry AnyValue representation for non-OTLP protocols](https://opentelemetry.io/docs/specs/otel/common/#anyvalue). (#8170) - Propagate errors from the exporter when calling `Shutdown` on `BatchSpanProcessor` in `go.opentelemetry.io/otel/sdk/trace`. (#8197) - Fix stale status code reporting on self-observability metrics in `go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp` and `go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp`. (#8226) - Fix a concurrent `Collect` data race and potential panic in `go.opentelemetry.io/otel/exporters/prometheus` when `WithResourceAsConstantLabels` option is used. (#8227) - Fix race condition in `FixedSizeReservoir` in `go.opentelemetry.io/otel/sdk/metric/exemplar` by reverting #7447. (#8249) - Fix `FixedSizeReservoir` in `go.opentelemetry.io/otel/sdk/metric/exemplar` to safely handle zero size. A capacity check in the constructor initializes the reservoir safely and skips initialization for zero-cap; early returns in `Offer()` and `Collect()` ensure no-op behavior. (#8295) - Fix counting of spans and logs in self-observability metrics in `go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc`, `go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp`, `go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc`, and `go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp`. (#8254) - Drop conflicting scope attributes named `name`, `version`, or `schema_url` from metric labels in `go.opentelemetry.io/otel/exporters/prometheus`, preserving the dedicated `otel_scope_name`, `otel_scope_version`, and `otel_scope_schema_url` labels. (#8264) - Close schema files opened by `ParseFile` in `go.opentelemetry.io/otel/schema/v1.0` and `go.opentelemetry.io/otel/schema/v1.1`. ([GHSA-995v-fvrw-c78m](GHSA-995v-fvrw-c78m)) - Enforce the 8192-byte baggage size limit during extraction/parsing, changing behavior when the limit is exceeded in `go.opentelemetry.io/otel/baggage` and `go.opentelemetry.io/otel/propagation`. (#8222) - Fix `go.opentelemetry.io/otel/semconv/v1.41.0` to include `Attr*` helper methods for required attributes on observable instruments. (#8361) - Limit baggage extraction error reporting in `go.opentelemetry.io/otel/propagation` to prevent malformed or oversized baggage headers from flooding logs. ([GHSA-5wrp-cwcj-q835](GHSA-5wrp-cwcj-q835))
Fixes #8218
This now omits the status code attribute when a request fails before getting a response, and doesn't use the previous response's status code when reporting self-observability metrics.
Unfortunately, it is a bit hard to test the fix since it depends on retrying http requests.