PromQL (histograms): counter reset hint (AKA gauge vs. counter) is inconsistent when aggregating native histograms #17308

@beorn7

What did you do?

Aggregate a number of native histograms with avg/sum or their _over_time versions where those histograms are not all gauge histograms.

This is a relatively rare use case (as you should commonly only aggregate gauge histograms), and the counter reset hint is not heavily used from the user perspective. Thus, I would not let this issue block declaring NH a stable feature, but just consider it a known bug. However, let's aim for fixing it in time. Therefore, marking it as P2 for now.

What did you expect to see?

A consistent and reproducible handling of those:

  1. If all counter reset hints are the same, the result has that same counter reset hint.
  2. Otherwise, if there is at least one GaugeType, the outcome is GaugeType.
  3. Otherwise, the outcome is UnknownCounterReset.
  4. If in any case (except (1)), there is a direct contradiction in the mix (CounterReset vs. NotCounterReset), add a warn-level annotation.
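
The rules above can be sketched as a single order-independent function that looks at the whole set of hints before deciding. This is only an illustration, not the Prometheus implementation: the CounterResetHint type is redefined locally (the constant names follow this issue), combineHints is a hypothetical helper, and the "direct contradiction" in rule 4 is read as CounterReset meeting NotCounterReset.

```go
package main

import "fmt"

// CounterResetHint is a local stand-in defined for this sketch only.
// The constant names follow the ones used in this issue.
type CounterResetHint int

const (
	UnknownCounterReset CounterResetHint = iota
	CounterReset
	NotCounterReset
	GaugeType
)

// combineHints is a hypothetical helper that applies rules 1-4 to the
// complete set of hints, so the outcome cannot depend on input order.
// The empty-input result is an assumption; the issue does not cover it.
func combineHints(hints []CounterResetHint) (result CounterResetHint, warn bool) {
	if len(hints) == 0 {
		return UnknownCounterReset, false
	}
	seen := make(map[CounterResetHint]bool)
	for _, h := range hints {
		seen[h] = true
	}
	// Rule 1: all hints identical, keep that hint, no annotation.
	if len(seen) == 1 {
		return hints[0], false
	}
	// Rule 4: contradiction anywhere in the mix (assumed to mean
	// CounterReset together with NotCounterReset).
	warn = seen[CounterReset] && seen[NotCounterReset]
	// Rule 2: at least one GaugeType wins.
	if seen[GaugeType] {
		return GaugeType, warn
	}
	// Rule 3: everything else is unknown.
	return UnknownCounterReset, warn
}

func main() {
	got, warn := combineHints([]CounterResetHint{CounterReset, GaugeType, NotCounterReset})
	fmt.Println(got == GaugeType, warn) // prints "true true"
}
```

Collecting the set of hints first is the design choice that matters here: the warning is derived from the full mix rather than from whichever pair happens to meet first.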

What did you see instead? Under which circumstances?

The aggregation happens in pseudo-random order, which leads to different outcomes. Furthermore, the avg calculation uses an incremental mean calculation, which relies on the Sub method of FloatHistogram, and that method always sets the counter reset hint to GaugeType.
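
To make the avg problem concrete, here is a toy model of an incremental mean. Everything in it is an assumption for illustration: toyHistogram collapses a histogram to a single value, sub mimics only the behavior described above (the result is always GaugeType), and add uses a simplified hint rule that is not taken from the Prometheus code.

```go
package main

import "fmt"

// CounterResetHint is a local stand-in defined for this sketch only.
type CounterResetHint int

const (
	UnknownCounterReset CounterResetHint = iota
	CounterReset
	NotCounterReset
	GaugeType
)

// toyHistogram is hypothetical: one float stands in for all buckets.
type toyHistogram struct {
	value float64
	hint  CounterResetHint
}

// sub mimics the behavior described in the issue: the result of a
// subtraction always carries GaugeType.
func sub(a, b toyHistogram) toyHistogram {
	return toyHistogram{value: a.value - b.value, hint: GaugeType}
}

// add uses an assumed, simplified rule: equal hints are kept,
// GaugeType wins over anything else, all other mixes become unknown.
func add(a, b toyHistogram) toyHistogram {
	hint := UnknownCounterReset
	switch {
	case a.hint == b.hint:
		hint = a.hint
	case a.hint == GaugeType || b.hint == GaugeType:
		hint = GaugeType
	}
	return toyHistogram{value: a.value + b.value, hint: hint}
}

// incrementalAvg computes mean += (x - mean) / n step by step.
// The subtraction in every step injects GaugeType into the running mean.
func incrementalAvg(hs []toyHistogram) toyHistogram {
	if len(hs) == 0 {
		return toyHistogram{}
	}
	mean := hs[0]
	for i := 1; i < len(hs); i++ {
		delta := sub(hs[i], mean)
		delta.value /= float64(i + 1)
		mean = add(mean, delta)
	}
	return mean
}

func main() {
	avg := incrementalAvg([]toyHistogram{
		{1, NotCounterReset}, {2, NotCounterReset}, {3, NotCounterReset},
	})
	fmt.Println(avg.value, avg.hint == GaugeType) // prints "2 true"
}
```

In this model all inputs share NotCounterReset, yet the average comes out as GaugeType as soon as a second histogram is folded in, because the delta produced by sub is always a gauge.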

This results in the following behavior in the cases above:

  1. Works for sum/sum_over_time, but avg/avg_over_time always results in GaugeType, even if the counter reset hint shared by all involved histograms is a different one.
  2. This works already.
  3. Works for sum/sum_over_time, but avg/avg_over_time always results in GaugeType.
  4. The warn-level annotation is omitted in some cases, because a "lucky" order of aggregation might turn the contradicting counter reset hints into UnknownCounterReset or GaugeType before they hit their counterpart.
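
The order dependence in item 4 can be reproduced with a small pairwise fold. This is a sketch under stated assumptions: combinePair is a hypothetical pairwise rule chosen to match the description above, not the actual logic in Prometheus, and the hint type is redefined locally.

```go
package main

import "fmt"

// CounterResetHint is a local stand-in defined for this sketch only.
type CounterResetHint int

const (
	UnknownCounterReset CounterResetHint = iota
	CounterReset
	NotCounterReset
	GaugeType
)

// combinePair is a hypothetical pairwise rule: it only sees two hints at
// a time, so it can warn only when the contradicting hints meet directly.
func combinePair(a, b CounterResetHint) (CounterResetHint, bool) {
	switch {
	case a == b:
		return a, false
	case a == GaugeType || b == GaugeType:
		return GaugeType, false
	case (a == CounterReset && b == NotCounterReset) ||
		(a == NotCounterReset && b == CounterReset):
		return UnknownCounterReset, true
	default:
		return UnknownCounterReset, false
	}
}

// foldHints aggregates left to right, as a pairwise aggregation would.
func foldHints(hints []CounterResetHint) (CounterResetHint, bool) {
	if len(hints) == 0 {
		return UnknownCounterReset, false
	}
	acc, warned := hints[0], false
	for _, h := range hints[1:] {
		var w bool
		acc, w = combinePair(acc, h)
		warned = warned || w
	}
	return acc, warned
}

func main() {
	// Same three hints, two different aggregation orders.
	h1, w1 := foldHints([]CounterResetHint{CounterReset, NotCounterReset, GaugeType})
	h2, w2 := foldHints([]CounterResetHint{CounterReset, GaugeType, NotCounterReset})
	fmt.Println(h1 == GaugeType, w1) // prints "true true"
	fmt.Println(h2 == GaugeType, w2) // prints "true false"
}
```

In the second order, CounterReset is absorbed into GaugeType before it meets NotCounterReset, so the contradiction is never observed and the annotation is lost.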
