feat: normalize "le" and "quantile" label values upon ingestion (#15164)
machine424 merged 1 commit into prometheus:main
Conversation
model/textparse/promparse.go
Outdated
    f, err := strconv.ParseFloat(value, 64)
    if err == nil {
        // XXX: maybe use f even in case of error? (zero value)
        value = formatOpenMetricsFloat(f)
Even with the optimization in formatOpenMetricsFloat, I cannot do better than
│ v0.txt │ v1.txt │
│ sec/op │ sec/op vs base │
Parse/data=promtestdata.txt/parser=promtext-2 127.4µ ± 4% 151.4µ ± 6% +18.78% (p=0.002 n=6)
Parse/data=promtestdata.nometa.txt/parser=promtext-2 108.8µ ± 4% 115.9µ ± 8% +6.52% (p=0.026 n=6)
geomean 117.7µ 132.4µ +12.48%
│ v0.txt │ v1.txt │
│ B/s │ B/s vs base │
Parse/data=promtestdata.txt/parser=promtext-2 249.7Mi ± 4% 210.2Mi ± 5% -15.81% (p=0.002 n=6)
Parse/data=promtestdata.nometa.txt/parser=promtext-2 221.6Mi ± 3% 208.0Mi ± 7% -6.14% (p=0.026 n=6)
geomean 235.2Mi 209.1Mi -11.10%
│ v0.txt │ v1.txt │
│ B/op │ B/op vs base │
Parse/data=promtestdata.txt/parser=promtext-2 57.07Ki ± 0% 57.89Ki ± 0% +1.44% (p=0.002 n=6)
Parse/data=promtestdata.nometa.txt/parser=promtext-2 57.07Ki ± 0% 57.07Ki ± 0% ~ (p=1.000 n=6)
geomean 57.07Ki 57.48Ki +0.72%
│ v0.txt │ v1.txt │
│ allocs/op │ allocs/op vs base │
Parse/data=promtestdata.txt/parser=promtext-2 828.0 ± 0% 1038.0 ± 0% +25.36% (p=0.002 n=6)
Parse/data=promtestdata.nometa.txt/parser=promtext-2 828.0 ± 0% 828.0 ± 0% ~ (p=1.000 n=6) ¹
geomean 828.0 927.1 +11.97%
¹ all samples are equal
on BenchmarkParse, which makes use of "a lot" of Summaries.
(v1.txt is the PR.)
Note that there were 15-20k iterations in the 2s benchmark time.
(opened #15167 for the formatOpenMetricsFloat optimization)
I wouldn't worry too much. Looking at this through the lens of a micro benchmark will of course flag a huge performance regression, but that's because you compare a naturally very cheap operation (take a string as is) with a somewhat involved operation (a parse-format roundtrip). I don't think this will matter a lot in the big picture.
In the unlikely case that this is a noticeable performance problem, we could play some tricks (like having a list of typical label values which we just pass through (0.9, 0.95, 0.99, 1.0, 0.5), simply adding .0 to plain integers, or tests for some simple cases (few decimal places) where we can normalize directly or detect that the number is already canonical, …). But as said, I don't think we have to do this prematurely now.
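To illustrate the tricks mentioned above, here is a minimal sketch of such a fast path. This is not the PR's code: the `canonical` pass-through set and the `normalizeFloat` helper name are made up for illustration, and the slow path approximates what a `formatOpenMetricsFloat`-style roundtrip does.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// canonical holds label values that are already in OpenMetrics form and can
// be passed through without a parse-format roundtrip. (Hypothetical fast
// path; the PR itself always does the roundtrip.)
var canonical = map[string]struct{}{
	"0.5": {}, "0.9": {}, "0.95": {}, "0.99": {}, "1.0": {}, "+Inf": {},
}

// normalizeFloat returns v formatted per OpenMetrics float rules, taking a
// fast path for common values and plain integers before falling back to a
// full parse-format roundtrip.
func normalizeFloat(v string) string {
	if _, ok := canonical[v]; ok {
		return v // already canonical, skip the roundtrip
	}
	// Plain integers just need a trailing ".0".
	if i, err := strconv.ParseInt(v, 10, 64); err == nil {
		return strconv.FormatInt(i, 10) + ".0"
	}
	// Slow path: parse and reformat.
	f, err := strconv.ParseFloat(v, 64)
	if err != nil {
		return v // not a float; leave untouched
	}
	s := strconv.FormatFloat(f, 'g', -1, 64)
	if !strings.ContainsAny(s, ".eEIN") {
		s += ".0" // OpenMetrics floats carry a decimal point or exponent
	}
	return s
}

func main() {
	for _, v := range []string{"0.99", "1", "0.250", "+Inf"} {
		fmt.Printf("%s -> %s\n", v, normalizeFloat(v))
	}
}
```

The fast paths avoid any float formatting for the values that dominate real summaries and histograms, which is exactly why they could claw back most of the micro-benchmark regression.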
Yep, I was talking about performance from the BenchmarkParse perspective, which covers the whole parsing operation. I expected the added "parse-format roundtrip" to be a small part of this, which is why the overhead caught my attention.
But as Bryan mentioned, BenchmarkParse is missing the scrape cache that helps Prometheus avoid the .Metric() path in most cases, so yes, we should expect less overhead in Prometheus's real scraping path.
@beorn7, I'd like to know what you think would be an acceptable overhead of this.
beorn7
left a comment
Here my comments. Executive summary:
- Don't worry too much about performance at this point.
- Normalize in the parser (as the approach is different depending on text vs. protobuf).
- Always normalize, even for OpenMetrics.
model/textparse/promparse.go
Outdated
    f, err := strconv.ParseFloat(value, 64)
    if err == nil {
        // XXX: maybe use f even in case of error? (zero value)
        value = formatOpenMetricsFloat(f)
Re the overhead, for prom and openmetrics this operation should only be done on initial scrape of a series, I believe.
model/textparse/promparse.go
Outdated
    // normalize ensures that values of the "le" labels of classic histograms and "quantile" labels
    // of summaries follow OpenMetrics formatting rules.
    func normalize(t model.MetricType, l, v string) string {
    which are always ingested. To keep the classic histograms as well, enable
    `scrape_classic_histograms` in the scrape job.

    _Note about the format of `le` and `quantile` label values:_
maybe, in addition to the release notes and migration guide, we could mention the normalization in https://prometheus.io/docs/concepts/data_model/?
And maybe also https://prometheus.io/docs/concepts/metric_types/ .
We have to keep in mind that all the instructions here, being deleted in this PR, should make it into the migration guide, in adjusted form (essentially to tell the user how to retain the old formatting if they want to). /cc @jan--f
(I'll open the needed PR against prometheus/docs)
beorn7
left a comment
Just an idea about the function name, feel free to disregard. Looks good.
model/textparse/openmetricsparse.go
Outdated
    // normalize ensures that values of the "le" labels of classic histograms and "quantile" labels
    // of summaries follow OpenMetrics formatting rules.
    func normalize(t model.MetricType, l, v string) string {
Maybe normalizeFloatsInLabelValues ?
every other name would be better than normalize :)
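For the curious, the function discussed above can be sketched as follows under the suggested name. This is an illustration, not the merged code: the `MetricType` stand-in replaces `github.com/prometheus/common/model` so the snippet is self-contained, and the formatting step approximates the OpenMetrics float rules.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type MetricType string

const (
	MetricTypeHistogram MetricType = "histogram"
	MetricTypeSummary   MetricType = "summary"
)

// normalizeFloatsInLabelValues rewrites v only when l is the "le" label of a
// classic histogram or the "quantile" label of a summary; any other label
// value passes through untouched. (Sketch based on the quoted diff.)
func normalizeFloatsInLabelValues(t MetricType, l, v string) string {
	if (t == MetricTypeHistogram && l == "le") || (t == MetricTypeSummary && l == "quantile") {
		if f, err := strconv.ParseFloat(v, 64); err == nil {
			s := strconv.FormatFloat(f, 'g', -1, 64)
			if !strings.ContainsAny(s, ".eEIN") {
				s += ".0" // OpenMetrics floats keep a decimal point or exponent
			}
			return s
		}
	}
	return v
}

func main() {
	fmt.Println(normalizeFloatsInLabelValues(MetricTypeHistogram, "le", "0.100000000000000000001")) // 0.1
	fmt.Println(normalizeFloatsInLabelValues(MetricTypeSummary, "quantile", "1"))                   // 1.0
	fmt.Println(normalizeFloatsInLabelValues(MetricTypeHistogram, "job", "1"))                      // unchanged: 1
}
```

Gating on the metric type and label name is what keeps the parse-format roundtrip off the hot path for every other label in the scrape.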
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
Co-authored-by: beorn7 <beorn@grafana.com>
Update test cases Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Scrape test for NHCB modified. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* master: (667 commits)
  - NHCB scrape: refactor state handling and speed up scrape test (prometheus#15193)
  - feat(tools): add debug printouts to rules unit testing (prometheus#15196)
  - docs: add keep_firing_for in alerting rules
  - api: Add rule group pagination to list rules api (prometheus#14017)
  - Update scrape/scrape.go
  - benchmark, rename parser omtext_with_nhcb
  - goimports run
  - Better docstring on test
  - Remove omcounterdata.txt as redundant
  - Fix failing benchmarks
  - Add unit test to show that current wrapper is sub-optimal
  - Rename convert_classic_histograms to convert_classic_histograms_to_nhcb
  - More followup to prometheus#15164
  - Follow up prometheus#15178
  - Followup to prometheus#15164
  - test(cmd/prometheus): speed up test execution by t.Parallel() when possible
  - feat: normalize "le" and "quantile" labels values upon ingestion
  - scrape: provide a fallback format (prometheus#15136)
  - Disallowing configure AM with the v1 api (prometheus#13883)
  - feat: ProtobufParse.formatOpenMetricsFloat: improve float formatting by using strconv.AppendFloat instead of fmt.Sprint
  - ...

# Conflicts:
#   go.mod
#   go.sum
fixes #14485