OTLP to directly write to an interface which can hide storage details#16951
OTLP to directly write to an interface which can hide storage details#16951
Conversation
Signed-off-by: David Ashpole <dashpole@google.com>
Signed-off-by: David Ashpole <dashpole@google.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
to also check metadata. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
also fix inconsistent classic histogram. Count was lower than sum of bucket values. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Defer converting to whatever actual model/labels representation is used by storage. Prometheus will use whatever is set as build label. Downstream might keep slices for example when converting to something like remote write 1. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
85ac11c to
1c029e4
Compare
https://github.com/prometheus/prometheus/pull/16855/files#r2206815844 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
https://github.com/prometheus/prometheus/pull/16855/files#r2237079758 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
|
Benchmark at main(e35c09d) vs pr (4b89cd5) |
storage/remote/otlptranslator/prometheusremotewrite/metrics_to_prw_test.go
Outdated
Show resolved
Hide resolved
# Conflicts: # storage/remote/otlptranslator/prometheusremotewrite/helper.go
…ints We check buckets in the similar test for histograms. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
storage/remote/otlptranslator/prometheusremotewrite/combined_appender.go
Show resolved
Hide resolved
storage/remote/otlptranslator/prometheusremotewrite/combined_appender.go
Outdated
Show resolved
Hide resolved
Prepare for the rewrite of the OTLP translator upstream in prometheus/prometheus#16951 by adding an end-to-end test to check for regressions. Related to #12152. Also switch from raw JSON in the test to building the payload in code and marshaling to protobuf. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
aknuds1
left a comment
There was a problem hiding this comment.
LGTM AFAICT at this point. I trust you've made sure it integrates well with Mimir :)
To be more inline with future direction in #17104 Also use some helper variables in appender tests to make it more readable in shorter. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Thank you, I'll update the Mimir PR on Monday and test one more time. @bwplotka Can I get an approve from you as well? I'd like to merge this. We can later switch to whatever we agree on in #17104. But I don't want to be blocked on that and I've made the interface here as close to that as possible for now. |
To get prometheus/prometheus#16951 at b03fd1e7 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Test build in: grafana/mimir#12652 |
I've checked out Mimir and ran I see data (floats, native histograms) and metadata as well. Resolving comments and merging. |
To get prometheus/prometheus#16951 at b03fd1e7 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
To get prometheus/prometheus#16951 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Prepare for the rewrite of the OTLP translator upstream in prometheus/prometheus#16951 by adding an end-to-end test to check for regressions. Related to #12152. Also switch from raw JSON in the test to building the payload in code and marshaling to protobuf. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
To get prometheus/prometheus#16951 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Prepare for the rewrite of the OTLP translator upstream in prometheus/prometheus#16951 by adding an end-to-end test to check for regressions. Related to #12152. Also switch from raw JSON in the test to building the payload in code and marshaling to protobuf. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
…#12652) This PR updates mimir-prometheus to [`c2f03d6d14d2`.](grafana/mimir-prometheus@c2f03d6) Brings in prometheus/prometheus#16951. Changelog: OTLP: native support for OpenTelemetry metric start time to Prometheus metric created timestamp conversion, instead of converting to QuietZeroNaNs introduced in #10238. The configuration parameter `-distributor.otel-start-time-quiet-zero` is therefore deprecated and will be removed. Now supports start time for exponential histograms. This is a major rewrite of the endpoint in upstream Prometheus and Mimir. #12652 Implementation does away with the generated otlp translation code as we don't need to rewrite it for `mimirpb` anymore. The storage logic is behind an interface. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
…prometheus#16951) * OTLP writer writes directly to appender Do not convert to Remote-Write 1.0 protocol. Convert to TSDB Appender interface instead. For downstream projects that still convert OTLP to something else (e.g. Mimir using its own RW 1.0+2.0 compatible protocol), introduce a compatibility layer between OTLP decoding and TSDB Appender. This is the CombinedAppender that hides the implementation. Name is subject to change. --------- Signed-off-by: David Ashpole <dashpole@google.com> Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com> Co-authored-by: David Ashpole <dashpole@google.com> Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com> Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> feat(histogram): Refactor ConvertNHCBToClassicHistogram and add unit tests Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> feat(histogram): Add ConvertNHCBToClassic flag and refactor related types and tests Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> Replace gopkg.in/yaml.v2 with go.yaml.in/yaml/v2 (prometheus#17151) * Replace gopkg.in/yaml.v2 with go.yaml.in/yaml/v2 * Upgrade to client_golang@v1.23.2 --------- Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> OTLP to directly write to an interface which can hide storage details (prometheus#16951) * OTLP writer writes directly to appender Do not convert to Remote-Write 1.0 protocol. Convert to TSDB Appender interface instead. For downstream projects that still convert OTLP to something else (e.g. Mimir using its own RW 1.0+2.0 compatible protocol), introduce a compatibility layer between OTLP decoding and TSDB Appender. This is the CombinedAppender that hides the implementation. Name is subject to change. --------- Signed-off-by: David Ashpole <dashpole@google.com> Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com> Co-authored-by: David Ashpole <dashpole@google.com> Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com> Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com> fix(textparse/protobuf): metric family name corrupted by NHCB parser (prometheus#17156) * fix(textparse): implement NHCB parsing in ProtoBuf parser directly The NHCB conversion does some validation, but we can only return error from Parser.Next() not Parser.Histogram(). So the conversion needs to happen in Next(). There are 2 cases: 1. "always_scrape_classic_histograms" is enabled, in which case we convert after returning the classic series. This is to be consistent with the PromParser text parser, which collects NHCB while spitting out classic series; then returns the NHCB. 2. "always_scrape_classic_histograms" is disabled. In which case we never return the classic series. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> * refactor(textparse): skip classic series instead of adding NHCB around Do not return the first classic series from the EntryType state, switch to EntrySeries. This means we need to start the histogram field state from -3 , not -2. In EntrySeries, skip classic series if needed. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> * reuse nhcb converter Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> * test(textparse/nhcb): test corrupting metric family name NHCB parse doesn't always copy the metric name from the underlying parser. When called via HELP, UNIT, the string is directly referenced which means that the read-ahead of NHCB can corrupt it. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Handle error gracefully for the `desymbolizeLabels` function in prompb/io/prometheus/write/v2/symbols.go (prometheus#17160) Signed-off-by: pipiland <user123@Minhs-Macbook.local> --------- Signed-off-by: pipiland <user123@Minhs-Macbook.local> Co-authored-by: pipiland <user123@Minhs-Macbook.local> feat(histogram): Refactor NHCB to classic conversion with enhanced validation and update tests Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> feat(histogram): Update NHCB to classic conversion logic and remove tests Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> refactor(histogram): Update ConvertNHCBToClassicHistogram Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> test cases for new conversion Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> refactor(histogram): Simplify label building in ConvertNHCBToClassicHistogram Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

Original PR: #16855
Original Title: OTLP to directly writes to storage.Appender
Main change by @dashpole . @krajorama adopting the PR while @dashpole is on vacation.
Remove intermediary OTLP -> PRW 1.0 step before writing to the appender. It also adds a CombinedAppender interface, which consolidates error handling, and makes it possible to re-use the library in Mimir or other storage backends, which currently uses it to translate to PRW 1.0.
Similar to #16827, but maintains existing code structure and unit tests.
Alternative to #16784.
Notable Changes:
otlp_without_metadata_appended_samples_totalinstead ofremote_write_without_metadata_appended_samples_total(this still exists just won't be used here).otlp_out_of_order_exemplars_totalreplaces "Error on ingesting out-of-order exemplars" warning log.appender.UpdateMetadatawith metadata. The unit is converted using theotlptranslatorlibrary.appender.AppendHistogramCTZeroSampleandappender.AppendCTZeroSamplewhen the created-timestamp-zero-ingestion feature is enabled.Benchmark summary: 36% less CPU, 38% fewer bytes, 19% fewer allocations
Which issue(s) does the PR fix:
Does this PR introduce a user-facing change?