Skip to content

promql(histograms): Fix counter reset hint handling when aggregating#17312

Merged
krajorama merged 1 commit intomainfrom
beorn7/histogram
Oct 10, 2025
Merged

promql(histograms): Fix counter reset hint handling when aggregating#17312
krajorama merged 1 commit intomainfrom
beorn7/histogram

Conversation

@beorn7
Copy link
Member

@beorn7 beorn7 commented Oct 9, 2025

Fixes #17308.

As explained adding the warn-annotation about conflicting counter reset hints doesn't happen consistently. Furthermore, because of incremental mean calculation being used so far (which includes subtraction), avg calculation always created gauge histograms.

The fix is to make Sub behave like Add WRT counter reset handling, and then set the result of a subtraction to gauge explicitly in actual PromQL subtraction (rather than using Sub for something else, like incremental mean calculation). Also, track the presence of a CounterReset hint and a NotCounterReset hint separately for the entirety of aggregated histograms and create the warn-annotation based on that.

As a minor fix, this commit also consistently creates the warn annotation in aggregation to be about "aggregation" rather than "subtraction" or "addition", because the latter are just internal operations within the aggregation, which is not of interest for the user.

Does this PR introduce a user-facing change?

[BUGFIX] Consistent handling of gauge vs. counter histograms in aggregations.

@beorn7 beorn7 requested a review from roidelapluie as a code owner October 9, 2025 00:23
@beorn7 beorn7 requested review from krajorama and removed request for roidelapluie October 9, 2025 00:23
@beorn7
Copy link
Member Author

beorn7 commented Oct 9, 2025

/cc @crush-on-anechka

Copy link
Member

@krajorama krajorama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting there, I think I understand the changes now. I have some nit comments and request some clarifying docstrings.

Fixes #17308.

As explained adding the warn-annotation about conflicting counter
reset hints doesn't happen consistently. Furthermore, because of
incremental mean calculation being used so far (which includes
subtraction), avg calculation always created gauge histograms.

The fix is to make Sub behave like Add WRT counter reset handling, and
then set the result of a subtraction to gauge explicitly in actual
PromQL subtraction (rather than using Sub for something else, like
incremental mean calculation). Also, track the presence of a
CounterReset hint and a NotCounterReset hint separately for the
entirety of aggregated histograms and create the warn-annotation based
on that.

As a minor fix, this commit also consistently creates the warn
annotation in aggregation to be about "aggregation" rather than
"subtraction" or "addition", because the latter are just internal
operations within the aggregation, which is not of interest for the
user.

Signed-off-by: beorn7 <beorn@grafana.com>
Copy link
Member

@krajorama krajorama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@krajorama krajorama merged commit 8558b72 into main Oct 10, 2025
46 checks passed
@krajorama krajorama deleted the beorn7/histogram branch October 10, 2025 05:25
Comment on lines +1103 to +1108
switch h.H.CounterResetHint {
case histogram.CounterReset:
counterResetSeen = true
case histogram.NotCounterReset:
notCounterResetSeen = true
}
Copy link
Contributor

@charleskorn charleskorn Oct 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be done for s.Histograms[0].H as well? Otherwise if the first histogram is CounterReset and the rest are NotCounterReset (or vice versa), then the annotation won't be emitted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I actually did this right for avg_over_time (and I vaguely remember that I even wrote a test for it, but now I'm confused because I run the same tests for avg_over_time and sum_over_time). I will investigate and create a fix with a test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


counterResetCollision = hasCounterResetCollision(h, other)

h.CounterResetHint = GaugeType
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignore, bookmark

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PromQL (histograms): counter reset hint (AKA gauge vs. counter) is inconsistent when aggregating native histograms

3 participants