Skip to content

feat: normalize "le" and "quantile" labels values upon ingestion#15164

Merged
machine424 merged 1 commit intoprometheus:mainfrom
machine424:quantile
Oct 19, 2024
Merged

feat: normalize "le" and "quantile" labels values upon ingestion#15164
machine424 merged 1 commit intoprometheus:mainfrom
machine424:quantile

Conversation

@machine424
Copy link
Member

@machine424 machine424 commented Oct 15, 2024

fixes #14485

@machine424 machine424 marked this pull request as draft October 15, 2024 14:25
f, err := strconv.ParseFloat(value, 64)
if err == nil {
// XXX: maybe use f even in case of error? (zero value)
value = formatOpenMetricsFloat(f)
Copy link
Member Author

@machine424 machine424 Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with the optimization in formatOpenMetricsFloat, I cannot do better than

                                                     │   v0.txt    │               v1.txt               │
                                                     │   sec/op    │   sec/op     vs base               │
Parse/data=promtestdata.txt/parser=promtext-2          127.4µ ± 4%   151.4µ ± 6%  +18.78% (p=0.002 n=6)
Parse/data=promtestdata.nometa.txt/parser=promtext-2   108.8µ ± 4%   115.9µ ± 8%   +6.52% (p=0.026 n=6)
geomean                                                117.7µ        132.4µ       +12.48%

                                                     │    v0.txt    │               v1.txt                │
                                                     │     B/s      │     B/s       vs base               │
Parse/data=promtestdata.txt/parser=promtext-2          249.7Mi ± 4%   210.2Mi ± 5%  -15.81% (p=0.002 n=6)
Parse/data=promtestdata.nometa.txt/parser=promtext-2   221.6Mi ± 3%   208.0Mi ± 7%   -6.14% (p=0.026 n=6)
geomean                                                235.2Mi        209.1Mi       -11.10%

                                                     │    v0.txt    │               v1.txt               │
                                                     │     B/op     │     B/op      vs base              │
Parse/data=promtestdata.txt/parser=promtext-2          57.07Ki ± 0%   57.89Ki ± 0%  +1.44% (p=0.002 n=6)
Parse/data=promtestdata.nometa.txt/parser=promtext-2   57.07Ki ± 0%   57.07Ki ± 0%       ~ (p=1.000 n=6)
geomean                                                57.07Ki        57.48Ki       +0.72%

                                                     │   v0.txt   │                v1.txt                │
                                                     │ allocs/op  │  allocs/op   vs base                 │
Parse/data=promtestdata.txt/parser=promtext-2          828.0 ± 0%   1038.0 ± 0%  +25.36% (p=0.002 n=6)
Parse/data=promtestdata.nometa.txt/parser=promtext-2   828.0 ± 0%    828.0 ± 0%        ~ (p=1.000 n=6) ¹
geomean                                                828.0         927.1       +11.97%
¹ all samples are equal

on BenchmarkParse which makes use of "a lot" of Summaries.
(v1.txt is the PR)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that there's been 15-20k iterations in 2s

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(opened #15167 for the formatOpenMetricsFloat optimization)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't worry too much. Looking at this through the lens of a micro benchmark will of course flag a huge performance regression, but that's because you compare a naturally very cheap operation (take a string as is) with a somewhat involved operation (a parse-format roundtrip). I don't think this will matter a lot in the big picture.

In the unlikely case that this is a noticeable performance problem, we could play some tricks (like having a list of typical label values which we just pass through (0.9, 0.95, 0.99, 1.0, 0.5), simply adding .0 to plain integers, tests for some simple cases (few decimal places) where we can normalize directly or detect that the number is canonical already, … But as said, I don't think we have to prematurely do this now.

Copy link
Member Author

@machine424 machine424 Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I was talking about performance from the BenchmarkParse perspective, which covers the whole parsing operation. I expect the added "parse-format roundtrip" to be a small part of this, which is why the overhead caught my attention.

But as Bryan mentioned, BenchmarkParse is missing the scrape's cache that will help Prometheus avoid the .Metric() path in most cases, so yes we should be expecting less overhead in Prometheus real scraping path.

@machine424
Copy link
Member Author

@beorn7, I'd like to know what you think would be an acceptable overhead of this string -> float -> string transformation (the second part being the more demanding) for the BenchmarkParse use case.

@machine424 machine424 marked this pull request as ready for review October 15, 2024 14:37
@machine424 machine424 requested a review from beorn7 October 16, 2024 09:40
@krajorama krajorama self-requested a review October 16, 2024 12:39
Copy link
Member

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here my comments. Executive summary:

  • Don't worry too much about performance at this point.
  • Normalize in the parser (as the approach is different depending on text vs. protobuf).
  • Always normalize, even for OpenMetrics.

f, err := strconv.ParseFloat(value, 64)
if err == nil {
// XXX: maybe use f even in case of error? (zero value)
value = formatOpenMetricsFloat(f)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't worry too much. Looking at this through the lens of a micro benchmark will of course flag a huge performance regression, but that's because you compare a naturally very cheap operation (take a string as is) with a somewhat involved operation (a parse-format roundtrip). I don't think this will matter a lot in the big picture.

In the unlikely case that this is a noticeable performance problem, we could play some tricks (like having a list of typical label values which we just pass through (0.9, 0.95, 0.99, 1.0, 0.5), simply adding .0 to plain integers, tests for some simple cases (few decimal places) where we can normalize directly or detect that the number is canonical already, … But as said, I don't think we have to prematurely do this now.

Copy link
Member

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addendum

@bboreham
Copy link
Member

Re the overhead, for prom and openmetrics this operation should only be done on initial scrape of a series, I believe.
After that, the post-normalization version should be looked up from the cache, so long as the target is still emitting the same exact bytes for all labels.


// normalize ensures that values of the "le" labels of classic histograms and "quantile" labels
// of summaries follow OpenMetrics formatting rules.
func normalize(t model.MetricType, l, v string) string {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming?

which are always ingested. To keep the classic histograms as well, enable
`scrape_classic_histograms` in the scrape job.

_Note about the format of `le` and `quantile` label values:_
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe, in addition to the release notes and migration guide, we could mention the normalization in https://prometheus.io/docs/concepts/data_model/?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@beorn7 beorn7 Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to keep in mind that all the instructions here, being deleted in this PR, should make it into the migration guide, in adjusted form (essentially to tell the user how to retain the old formatting if they want to). /cc @jan--f

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'll open the needed PR against prometheus/docs)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

closing the loop prometheus/docs#2572

@machine424 machine424 marked this pull request as draft October 18, 2024 12:30
@machine424 machine424 marked this pull request as ready for review October 18, 2024 13:19
@machine424 machine424 requested a review from beorn7 October 18, 2024 13:19
@machine424
Copy link
Member Author

machine424 commented Oct 18, 2024

I'll need to rebase after #15167 to adjust the added test.

beorn7
beorn7 previously approved these changes Oct 18, 2024
Copy link
Member

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an idea about the function name, feel free to disregard. Looks good.

which are always ingested. To keep the classic histograms as well, enable
`scrape_classic_histograms` in the scrape job.

_Note about the format of `le` and `quantile` label values:_
Copy link
Member

@beorn7 beorn7 Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to keep in mind that all the instructions here, being deleted in this PR, should make it into the migration guide, in adjusted form (essentially to tell the user how to retain the old formatting if they want to). /cc @jan--f


// normalize ensures that values of the "le" labels of classic histograms and "quantile" labels
// of summaries follow OpenMetrics formatting rules.
func normalize(t model.MetricType, l, v string) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe normalizeFloatsInLabelValues ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

every other name would be better than normalize :)

Signed-off-by: machine424 <ayoubmrini424@gmail.com>

Co-authored-by: beorn7 <beorn@grafana.com>
@machine424 machine424 merged commit d8c1605 into prometheus:main Oct 19, 2024
krajorama added a commit that referenced this pull request Oct 21, 2024
Update test cases

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
krajorama added a commit that referenced this pull request Oct 21, 2024
Scrape test for NHCB modified.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
@krajorama krajorama mentioned this pull request Oct 22, 2024
KeyOfSpectator added a commit to AliyunContainerService/prometheus that referenced this pull request Oct 23, 2024
* master: (667 commits)
  NHCB scrape: refactor state handling and speed up scrape test (prometheus#15193)
  feat(tools): add debug printouts to rules unit testing (prometheus#15196)
  docs: add keep_firing_for in alerting rules
  api: Add rule group pagination to list rules api (prometheus#14017)
  Update scrape/scrape.go
  benchmark, rename parser omtext_with_nhcb
  goimports run
  Better docstring on test
  Remove omcounterdata.txt as redundant
  Fix failing benchmarks
  Add unit test to show that current wrapper is sub-optimal
  Rename convert_classic_histograms to convert_classic_histograms_to_nhcb
  More followup to prometheus#15164
  Follow up prometheus#15178
  Followup to prometheus#15164
  test(cmd/prometheus): speed up test execution by t.Parallel() when possible
  feat: normalize "le" and "quantile" labels values upon ingestion
  scrape: provide a fallback format (prometheus#15136)
  Disallowing configure AM with the v1 api (prometheus#13883)
  feat: ProtobufParse.formatOpenMetricsFloat: improve float formatting by using strconv.AppendFloat instead of fmt.Sprint
  ...

# Conflicts:
#	go.mod
#	go.sum
julienduchesne pushed a commit to julienduchesne/prometheus that referenced this pull request Dec 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

scrape: Normalize le and quantile labels upon ingestion in v3

5 participants