fix(textparse/protobuf): metric family name corrupted by NHCB parser#17156
Merged
fix(textparse/protobuf): metric family name corrupted by NHCB parser#17156
Conversation
krajorama
commented
Sep 6, 2025
The NHCB conversion does some validation, but we can only return error from Parser.Next() not Parser.Histogram(). So the conversion needs to happen in Next(). There are 2 cases: 1. "always_scrape_classic_histograms" is enabled, in which case we convert after returning the classic series. This is to be consistent with the PromParser text parser, which collects NHCB while spitting out classic series; then returns the NHCB. 2. "always_scrape_classic_histograms" is disabled. In which case we never return the classic series. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Do not return the first classic series from the EntryType state, switch to EntrySeries. This means we need to start the histogram field state from -3 , not -2. In EntrySeries, skip classic series if needed. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
66edf6f to
9a047ee
Compare
krajorama
commented
Sep 6, 2025
krajorama
commented
Sep 6, 2025
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
NHCB parse doesn't always copy the metric name from the underlying parser. When called via HELP, UNIT, the string is directly referenced which means that the read-ahead of NHCB can corrupt it. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
6a2f59b to
eab7605
Compare
Member
|
Do you mind changing the changelog entry?
This bug is affecting metadata records in general, not only remote write (it also affects RW 1.0 and metadata API). What about
|
bwplotka
approved these changes
Sep 8, 2025
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Naman-B-Parlecha
added a commit
to Naman-B-Parlecha/prometheus
that referenced
this pull request
Sep 17, 2025
Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> feat(histogram): Refactor ConvertNHCBToClassicHistogram and add unit tests Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> feat(histogram): Add ConvertNHCBToClassic flag and refactor related types and tests Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> Replace gopkg.in/yaml.v2 with go.yaml.in/yaml/v2 (prometheus#17151) * Replace gopkg.in/yaml.v2 with go.yaml.in/yaml/v2 * Upgrade to client_golang@v1.23.2 --------- Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> OTLP to directly write to an interface which can hide storage details (prometheus#16951) * OTLP writer writes directly to appender Do not convert to Remote-Write 1.0 protocol. Convert to TSDB Appender interface instead. For downstream projects that still convert OTLP to something else (e.g. Mimir using its own RW 1.0+2.0 compatible protocol), introduce a compatibility layer between OTLP decoding and TSDB Appender. This is the CombinedAppender that hides the implementation. Name is subject to change. --------- Signed-off-by: David Ashpole <dashpole@google.com> Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com> Co-authored-by: David Ashpole <dashpole@google.com> Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com> Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com> fix(textparse/protobuf): metric family name corrupted by NHCB parser (prometheus#17156) * fix(textparse): implement NHCB parsing in ProtoBuf parser directly The NHCB conversion does some validation, but we can only return error from Parser.Next() not Parser.Histogram(). So the conversion needs to happen in Next(). There are 2 cases: 1. "always_scrape_classic_histograms" is enabled, in which case we convert after returning the classic series. This is to be consistent with the PromParser text parser, which collects NHCB while spitting out classic series; then returns the NHCB. 2. "always_scrape_classic_histograms" is disabled. In which case we never return the classic series. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> * refactor(textparse): skip classic series instead of adding NHCB around Do not return the first classic series from the EntryType state, switch to EntrySeries. This means we need to start the histogram field state from -3 , not -2. In EntrySeries, skip classic series if needed. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> * reuse nhcb converter Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> * test(textparse/nhcb): test corrupting metric family name NHCB parse doesn't always copy the metric name from the underlying parser. When called via HELP, UNIT, the string is directly referenced which means that the read-ahead of NHCB can corrupt it. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Handle error gracefully for the `desymbolizeLabels` function in prompb/io/prometheus/write/v2/symbols.go (prometheus#17160) Signed-off-by: pipiland <user123@Minhs-Macbook.local> --------- Signed-off-by: pipiland <user123@Minhs-Macbook.local> Co-authored-by: pipiland <user123@Minhs-Macbook.local> feat(histogram): Refactor NHCB to classic conversion with enhanced validation and update tests Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> feat(histogram): Update NHCB to classic conversion logic and remove tests Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> refactor(histogram): Update ConvertNHCBToClassicHistogram Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> test cases for new conversion Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> refactor(histogram): Simplify label building in ConvertNHCBToClassicHistogram Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
When PrometheusProto scrape protocol is used to scrape a target and there is an NHCB conversion, the wrong metadata is written into the WAL. This means the Remote-Write 2.0 will send empty metadata with NHCB samples.
The metric family name is returned by a reference by the NHCB parser , directly from the ProtoBuf parser. However after it is returned , the NHCB parser reads ahead and the slice is overwritten while still being referenced in scrape. See #17075 (comment)
We could copy the data, but we wanted to implement direct NHCB conversion in the ProtoBuf parser anyway, so
this PR does that.
NHCB generic parser is no longer needed on top of the ProtoBuf parser.
The implementation adds the conversion after classic series are returned by the parser to be consistent with how the generic NHCB parser works. The general parser emits classic series while collecting data for NHCB, hence the order is classic series first, NHCB next.
However this means that the code needs to be able to skip classic series if NHCB conversion is enabled, but classic
histograms are not needed. According to #13532.
Which issue(s) does the PR fix:
Related to: #17075
Does this PR introduce a user-facing change?