Description
What did you do?
Aggregate a number of native histograms with `avg`/`sum` or their `_over_time` versions where those histograms are not all gauge histograms.
This is a relatively rare use case (as you should commonly only aggregate gauge histograms), and the counter reset hint is not heavily used from the user perspective. Thus, I would not let this issue block declaring NH a stable feature, but just consider it a known bug. However, let's aim for fixing it in time. Therefore, marking it as P2 for now.
What did you expect to see?
A consistent and reproducible handling of those:
1. If all counter reset hints are the same, the result has that same counter reset hint.
2. Otherwise, if there is at least one `GaugeType`, the outcome is `GaugeType`.
3. Otherwise, the outcome is `UnknownCounterReset`.
4. If in any case (except (1)), there is a direct contradiction in the mix (`CounterReset` vs. `NotCounterReset`), add a warn-level annotation.
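To make the expected rules concrete, here is a minimal Go sketch that decides the outcome from the complete set of hints, so the result cannot depend on iteration order. The `Hint` type, its constants, and `combineHints` are hypothetical stand-ins written for this issue, not the actual Prometheus types or a proposed patch.

```go
package main

import "fmt"

// Hint is a hypothetical stand-in for a histogram's counter reset hint.
type Hint int

const (
	UnknownCounterReset Hint = iota
	CounterReset
	NotCounterReset
	GaugeType
)

// String makes the hints readable when printed.
func (h Hint) String() string {
	return [...]string{"UnknownCounterReset", "CounterReset", "NotCounterReset", "GaugeType"}[h]
}

// combineHints applies the four expected rules to all hints at once and
// reports whether a warn-level annotation would be due.
func combineHints(hints []Hint) (Hint, bool) {
	if len(hints) == 0 {
		// Assumption: nothing to aggregate yields an unknown hint.
		return UnknownCounterReset, false
	}
	seen := map[Hint]bool{}
	for _, h := range hints {
		seen[h] = true
	}
	// Rule 1: all hints are the same, so keep that hint and never warn.
	if len(seen) == 1 {
		return hints[0], false
	}
	// Rule 4: a direct contradiction anywhere in the mix triggers a warning.
	warn := seen[CounterReset] && seen[NotCounterReset]
	// Rule 2: at least one gauge histogram makes the result a gauge.
	if seen[GaugeType] {
		return GaugeType, warn
	}
	// Rule 3: everything else is unknown.
	return UnknownCounterReset, warn
}

func main() {
	fmt.Println(combineHints([]Hint{NotCounterReset, NotCounterReset}))
	fmt.Println(combineHints([]Hint{CounterReset, GaugeType, NotCounterReset}))
	fmt.Println(combineHints([]Hint{NotCounterReset, GaugeType, CounterReset}))
}
```

Because the decision is made on the set of hints rather than pairwise, the last two calls return the same result and the same warning flag regardless of input order.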
What did you see instead? Under which circumstances?
The aggregation happens in pseudo-random order, which leads to different outcomes. Furthermore, the `avg` calculation uses incremental mean calculation, which relies on the `Sub` method of `FloatHistogram`, and that method always sets the counter reset hint to `GaugeType`.
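The effect of the subtraction step can be illustrated with a toy model. The sketch below is not the Prometheus implementation: `toyHistogram` reduces a histogram to a single count plus a hint, `sub` mimics the behavior described above (the result is always `GaugeType`), and the hint handling in `add` is an assumption chosen to follow the expected rules.

```go
package main

import "fmt"

// Hint is a hypothetical stand-in for a histogram's counter reset hint.
type Hint int

const (
	UnknownCounterReset Hint = iota
	CounterReset
	NotCounterReset
	GaugeType
)

// toyHistogram is a deliberately simplified histogram: one count, one hint.
type toyHistogram struct {
	Count float64
	Hint  Hint
}

// sub mimics the described behavior: the result is always a gauge.
func (h toyHistogram) sub(o toyHistogram) toyHistogram {
	return toyHistogram{Count: h.Count - o.Count, Hint: GaugeType}
}

// add keeps a shared hint, prefers GaugeType if either side is a gauge,
// and falls back to UnknownCounterReset (an assumption for this sketch).
func (h toyHistogram) add(o toyHistogram) toyHistogram {
	hint := UnknownCounterReset
	switch {
	case h.Hint == o.Hint:
		hint = h.Hint
	case h.Hint == GaugeType || o.Hint == GaugeType:
		hint = GaugeType
	}
	return toyHistogram{Count: h.Count + o.Count, Hint: hint}
}

// div scales the count and leaves the hint untouched.
func (h toyHistogram) div(k float64) toyHistogram {
	return toyHistogram{Count: h.Count / k, Hint: h.Hint}
}

// incrementalMean computes mean_k = mean_{k-1} + (x_k - mean_{k-1}) / k.
// It expects at least one input.
func incrementalMean(hs []toyHistogram) toyHistogram {
	mean := hs[0]
	for i := 1; i < len(hs); i++ {
		delta := hs[i].sub(mean).div(float64(i + 1))
		mean = mean.add(delta)
	}
	return mean
}

func main() {
	in := []toyHistogram{
		{Count: 2, Hint: NotCounterReset},
		{Count: 4, Hint: NotCounterReset},
		{Count: 6, Hint: NotCounterReset},
	}
	mean := incrementalMean(in)
	fmt.Println(mean.Count, mean.Hint == GaugeType)
}
```

In this model, every input carries `NotCounterReset`, yet the mean ends up as `GaugeType` because each step passes through `sub`.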
This results in the following behavior in the cases above:
1. Works for `sum`/`sum_over_time`, but `avg`/`avg_over_time` always results in `GaugeType` even if the counter reset hint shared by all involved histograms is a different one.
2. This works already.
3. Works for `sum`/`sum_over_time`, but `avg`/`avg_over_time` always results in `GaugeType`.
4. The warn-level annotation is omitted in some cases, because a "lucky" order of aggregation might turn the contradicting counter reset hints into `UnknownCounterReset` or `GaugeType` before they hit their counterpart.
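The order dependence behind the missing annotation can be reproduced with a small Go sketch. It assumes a hypothetical pairwise rule, `mergePair`, that only ever sees the running result and the next hint; the names and the exact rule are illustrative and not taken from the Prometheus code.

```go
package main

import "fmt"

// Hint is a hypothetical stand-in for a histogram's counter reset hint.
type Hint int

const (
	UnknownCounterReset Hint = iota
	CounterReset
	NotCounterReset
	GaugeType
)

// mergePair is a hypothetical pairwise rule. It can only detect a
// contradiction if both conflicting hints meet directly.
func mergePair(a, b Hint) (Hint, bool) {
	switch {
	case a == b:
		return a, false
	case a == GaugeType || b == GaugeType:
		return GaugeType, false
	case a != UnknownCounterReset && b != UnknownCounterReset:
		// Only CounterReset vs. NotCounterReset reaches this branch.
		return UnknownCounterReset, true
	default:
		return UnknownCounterReset, false
	}
}

// foldHints merges the hints left to right, in the order given.
// It expects at least one hint.
func foldHints(hints []Hint) (Hint, bool) {
	acc, warned := hints[0], false
	for _, h := range hints[1:] {
		var w bool
		acc, w = mergePair(acc, h)
		warned = warned || w
	}
	return acc, warned
}

func main() {
	// The contradicting hints meet directly, so the warning is raised.
	fmt.Println(foldHints([]Hint{CounterReset, NotCounterReset, GaugeType}))
	// GaugeType absorbs CounterReset first, so the contradiction is never seen.
	fmt.Println(foldHints([]Hint{CounterReset, GaugeType, NotCounterReset}))
}
```

Both orders end in `GaugeType`, but only the first reports the contradiction, which matches the "lucky" order described in case 4.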