promql(native histograms): Introduce exponential interpolation by beorn7 · Pull Request #14677 · prometheus/prometheus

beorn7 · 2024-08-15T12:51:02Z

The linear interpolation (assuming that observations are uniformly
distributed within a bucket) is a solid and simple assumption in lack
of any other information. However, the exponential bucketing used by
standard schemas of native histograms has been chosen to cover the
whole range of observations in a way that bucket populations are
spread out over buckets in a reasonably way for typical distributions
encountered in real-world scenarios.

This is the origin of the idea implemented here: If we divide a given
bucket into two (or more) smaller exponential buckets, we "most
naturally" expect that the samples in the original buckets will split
among those smaller buckets in a more or less uniform fashion. With
this assumption, we end up with an "exponential interpolation", which
therefore appears to be a better match for histograms with exponential
bucketing.

This commit leaves the linear interpolation in place for NHCB, but
changes the interpolation for exponential native histograms to
exponential. This affects histogram_quantile and
histogram_fraction (because the latter is more or less the inverse
of the former).

The zero bucket has to be treated specially because the assumption
above would lead to an "interpolation to zero" (the bucket density
approaches infinity around zero, and with the postulated uniform usage
of buckets, we would end up with an estimate of zero for all quantiles
ending up in the zero bucket). We simply fall back to linear
interpolation within the zero bucket.

At the same time, this commit makes the call to stick with the
assumption that the zero bucket only contains positive observations
for native histograms without negative buckets (and vice versa). (This
is an assumption relevant for interpolation. It is a mostly academic
point, as the zero bucket is supposed to be very small anyway.
However, in cases where it is relevantly broad, the assumption helps
a lot in practice.)

This commit also updates and completes the documentation to match both
details about interpolation.

As a more high level note: The approach here attempts to strike a
balance between a more simplistic approach without any assumption, and
a more involved approach with more sophisticated assumptions. I will
shortly describe both for reference:

The "zero assumption" approach would be to not interpolate at all, but
always return the harmonic mean of the bucket boundaries of the
bucket the quantile ends up in. This has the advantage of minimizing
the maximum possible relative error of the quantile estimation.
(Depending on the exact definition of the relative error of an
estimation, there is also an argument to return the arithmetic mean of
the bucket boundaries.) While limiting the maximum possible relative
error is a good property, this approach would throw away the
information if a quantile is closer to the upper or lower end of the
population within a bucket. This can be valuable trending information
in a dashboard. With any kind of interpolation, the maximum possible
error of a quantile estimation increases to the full width of a bucket
(i.e. it more than doubles for the harmonic mean approach, and
precisely doubles for the arithmetic mean approach). However, in
return the expectation value of the error decreases. The increase of
the theoretical maximum only has practical relevance for pathologic
distributions. For example, if there are thousand observations within
a bucket, they could all be at the upper bound of the bucket. If the
quantile calculation picks the 1st observation in the bucket as the
relevant one, an interpolation will yield a value close to the lower
bucket boundary, while the true quantile value is close to the upper
boundary.

The "fancy interpolation" approach would be one that analyses the
actual distribution of samples in the histogram. A lot of statistics
could be applied based on the information we have available in the
histogram. This would include the population of neighboring (or even
all) buckets in the histogram. In general, the resolution of a native
histogram should be quite high, and therefore, those "fancy"
approaches would increase the computational cost quite a bit with very
little practical benefits (i.e. just tiny corrections of the estimated
quantile value). The results are also much harder to reason with.

beorn7 · 2024-08-15T13:15:55Z

@suntala @zenador maybe this is of interest for you (and then you might want to help with the review).

@suntala for context, this is one of the topics you considered working on.

suntala · 2024-08-23T13:54:44Z

Thanks for pinging me on this! Curious to see how this was implemented. It might take a week or two, but I'll definitely take a look.

promql/quantile.go

beorn7 · 2024-08-28T12:15:24Z

Commit with new math pushed. Also updated commit and PR description (because we have to revert to linear interpolation for the zero bucket).

Overall, I think the math is even easier to understand. I thought my previous approach is "simpler" but completely missed that I maneuvered myself into a corner where I have only treated the special case of schema 0 properly (and came up with a nonsensical treatment of the zero bucket).

suntala

The explanation in the doc comment about the quantile being right at the bucket boundary of a histogram with a greater resolution really helps.

I want to take another look to better understand the calculations being performed. At first I was trying to figure out how to match (at least approximately) the quantile to a bucket boundary of a native histogram which has an integer schema value. Seems that's not relevant though--the hypothetical histogram might have a non-integer schema value.

promql/quantile.go

suntala · 2024-09-06T14:33:24Z

docs/querying/functions.md

+You can use `histogram_quantile(1, v instant-vector)` to get the estimated
+maximum value stored in a histogram.
+
+Buckets of classic histograms are cumulative. Therefore, the following should


Are native histograms not cumulative?

No. Native histograms (in particular the float version, which is the one we always use within PromQL) are "regular histograms". Each bucket contains just the observations falling into it.

suntala · 2024-09-06T15:01:49Z

promql/quantile.go

+	logLower := math.Log2(math.Abs(bucket.Lower))
+	logUpper := math.Log2(math.Abs(bucket.Upper))
+	if bucket.Lower > 0 { // Positive bucket.
+		return math.Exp2(logLower + (logUpper-logLower)*fraction)


A comment here might be helpful since this is the main calculation for the exponential interpolation. It would particularly help to mention how this translates into a hypothetical bucket boundary. Expanding on the part fraction plays here might be enough.

I tried to make the explanation a bit more detailed.

Thank you. I'll take another look, probably only next week though.

beorn7 · 2024-09-10T11:31:34Z

Thanks for your review.

At first I was trying to figure out how to match (at least approximately) the quantile to a bucket boundary of a native histogram which has an integer schema value. Seems that's not relevant though--the hypothetical histogram might have a non-integer schema value.

Not sure I'm following here. The (so-called) schema is always an integer. (However, the resulting bucket boundaries can be (and often will be) irrational numbers.)

krajorama

LGTM. I have a feeling there's some optimization possible to the calculations with Frexp, but base 2 log and exp should be quick anyway.

krajorama · 2024-09-19T12:02:27Z

promql/quantile.go

+			if v > 0 {
+				fraction = (logV - logLower) / (logUpper - logLower)
+			} else {
+				fraction = 1 - ((logV - logUpper) / (logLower - logUpper))


for a second I thought this was wrong, because logV-logUpper is negative, but then realized that logLower-logUpper is also negative :)

beorn7 · 2024-09-19T12:11:14Z

Thanks. I'll resolve the merge conflicts and will merge on green.

The linear interpolation (assuming that observations are uniformly distributed within a bucket) is a solid and simple assumption in lack of any other information. However, the exponential bucketing used by standard schemas of native histograms has been chosen to cover the whole range of observations in a way that bucket populations are spread out over buckets in a reasonably way for typical distributions encountered in real-world scenarios. This is the origin of the idea implemented here: If we divide a given bucket into two (or more) smaller exponential buckets, we "most naturally" expect that the samples in the original buckets will split among those smaller buckets in a more or less uniform fashion. With this assumption, we end up with an "exponential interpolation", which therefore appears to be a better match for histograms with exponential bucketing. This commit leaves the linear interpolation in place for NHCB, but changes the interpolation for exponential native histograms to exponential. This affects `histogram_quantile` and `histogram_fraction` (because the latter is more or less the inverse of the former). The zero bucket has to be treated specially because the assumption above would lead to an "interpolation to zero" (the bucket density approaches infinity around zero, and with the postulated uniform usage of buckets, we would end up with an estimate of zero for all quantiles ending up in the zero bucket). We simply fall back to linear interpolation within the zero bucket. At the same time, this commit makes the call to stick with the assumption that the zero bucket only contains positive observations for native histograms without negative buckets (and vice versa). (This is an assumption relevant for interpolation. It is a mostly academic point, as the zero bucket is supposed to be very small anyway. However, in cases where it _is_ relevantly broad, the assumption helps a lot in practice.) This commit also updates and completes the documentation to match both details about interpolation. As a more high level note: The approach here attempts to strike a balance between a more simplistic approach without any assumption, and a more involved approach with more sophisticated assumptions. I will shortly describe both for reference: The "zero assumption" approach would be to not interpolate at all, but _always_ return the harmonic mean of the bucket boundaries of the bucket the quantile ends up in. This has the advantage of minimizing the maximum possible relative error of the quantile estimation. (Depending on the exact definition of the relative error of an estimation, there is also an argument to return the arithmetic mean of the bucket boundaries.) While limiting the maximum possible relative error is a good property, this approach would throw away the information if a quantile is closer to the upper or lower end of the population within a bucket. This can be valuable trending information in a dashboard. With any kind of interpolation, the maximum possible error of a quantile estimation increases to the full width of a bucket (i.e. it more than doubles for the harmonic mean approach, and precisely doubles for the arithmetic mean approach). However, in return the _expectation value_ of the error decreases. The increase of the theoretical maximum only has practical relevance for pathologic distributions. For example, if there are thousand observations within a bucket, they could _all_ be at the upper bound of the bucket. If the quantile calculation picks the 1st observation in the bucket as the relevant one, an interpolation will yield a value close to the lower bucket boundary, while the true quantile value is close to the upper boundary. The "fancy interpolation" approach would be one that analyses the _actual_ distribution of samples in the histogram. A lot of statistics could be applied based on the information we have available in the histogram. This would include the population of neighboring (or even all) buckets in the histogram. In general, the resolution of a native histogram should be quite high, and therefore, those "fancy" approaches would increase the computational cost quite a bit with very little practical benefits (i.e. just tiny corrections of the estimated quantile value). The results are also much harder to reason with. Signed-off-by: beorn7 <beorn@grafana.com>

suntala · 2024-09-19T13:10:45Z

Unfortunately, I won't be able to take another look as planned. Something came up and I'll need to be OOO for several days. Thanks for the clarifications.

beorn7 requested a review from roidelapluie as a code owner August 15, 2024 12:51

beorn7 requested review from krajorama and removed request for roidelapluie August 15, 2024 13:12

beorn7 mentioned this pull request Aug 15, 2024

promql (histograms): Refine interpolation behavior #11215

Closed

krajorama reviewed Aug 26, 2024

View reviewed changes

promql/quantile.go Show resolved Hide resolved

beorn7 force-pushed the beorn7/histogram branch from 7477cdd to 43be51a Compare August 28, 2024 12:11

suntala reviewed Sep 6, 2024

View reviewed changes

beorn7 force-pushed the beorn7/histogram branch from 08d0ff4 to 213c248 Compare September 10, 2024 11:29

krajorama previously approved these changes Sep 19, 2024

View reviewed changes

beorn7 dismissed krajorama’s stale review via 91ba3f4 September 19, 2024 12:15

beorn7 force-pushed the beorn7/histogram branch from 213c248 to 91ba3f4 Compare September 19, 2024 12:15

beorn7 force-pushed the beorn7/histogram branch from 91ba3f4 to 6fcd225 Compare September 19, 2024 12:19

beorn7 merged commit df9916e into main Sep 19, 2024

beorn7 deleted the beorn7/histogram branch September 19, 2024 16:09

harry671003 mentioned this pull request Nov 21, 2024

Update prometheus thanos-io/promql-engine#487

Merged

julienduchesne pushed a commit to julienduchesne/prometheus that referenced this pull request Dec 13, 2024

Merge pull request prometheus#14677 from prometheus/beorn7/histogram

fc554ca

This was referenced Oct 7, 2025

Add rank estimation for exponential histograms elastic/elasticsearch#135692

Merged

Use interpolation over exponential histogram samples elastic/elasticsearch#136354

Open

Conversation

beorn7 commented Aug 15, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

beorn7 commented Aug 15, 2024

Uh oh!

suntala commented Aug 23, 2024

Uh oh!

Uh oh!

beorn7 commented Aug 28, 2024

Uh oh!

suntala left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

suntala Sep 6, 2024

Choose a reason for hiding this comment

Uh oh!

beorn7 Sep 10, 2024

Choose a reason for hiding this comment

Uh oh!

suntala Sep 6, 2024

Choose a reason for hiding this comment

Uh oh!

beorn7 Sep 10, 2024

Choose a reason for hiding this comment

Uh oh!

suntala Sep 12, 2024

Choose a reason for hiding this comment

Uh oh!

beorn7 commented Sep 10, 2024

Uh oh!

krajorama left a comment

Choose a reason for hiding this comment

Uh oh!

krajorama Sep 19, 2024

Choose a reason for hiding this comment

Uh oh!

beorn7 commented Sep 19, 2024

Uh oh!

suntala commented Sep 19, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

beorn7 commented Aug 15, 2024 •

edited

Loading