Skip to content

metrics/counter: wrap values over 2^53#139

Merged
olix0r merged 1 commit intolinkerd:masterfrom
lucab:ups/metrics-counter-u53
Nov 18, 2018
Merged

metrics/counter: wrap values over 2^53#139
olix0r merged 1 commit intolinkerd:masterfrom
lucab:ups/metrics-counter-u53

Conversation

@lucab
Copy link
Contributor

@lucab lucab commented Nov 17, 2018

This implements Prometheus reset semantics for counters, in order to
preserve precision when deriving rate of increase.
Wrapping is based on the fact that Prometheus models counters as f64
(52-bits mantissa), thus integer values over 2^53 are not guaranteed to
be correctly exposed.

This implements Prometheus reset semantics for counters, in order to
preserve precision when deriving rate of increase.
Wrapping is based on the fact that Prometheus models counters as `f64`
(52-bits mantissa), thus integer values over 2^53 are not guaranteed to
be correctly exposed.

Signed-off-by: Luca Bruno <luca.bruno@coreos.com>
@lucab lucab force-pushed the ups/metrics-counter-u53 branch from 3c87461 to a3df62a Compare November 17, 2018 11:33
Copy link
Member

@olix0r olix0r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @lucab! This looks great.

@wmorgan
Copy link
Member

wmorgan commented Nov 17, 2018

@lucab nice!

@olix0r olix0r merged commit 4f9adf9 into linkerd:master Nov 18, 2018
@lucab
Copy link
Contributor Author

lucab commented Nov 19, 2018

@olix0r late followup to this. Derived metrics are now precise but this is discarding the actual counter magnitude when the counter reaches ~8 Pebi-units.

For comparison, Go Prometheus library tries to tackle this in a peculiar way, which is still going to loose precision at exposition time due to the uint64 -> float64 conversion.

Speaking with some other monitoring folks in the office today, it looks like it would make sense for the counter to store two values internally (current wrapping-integer and number of wraps), and serialize two metrics at exposition time (normal counter metric plus a _magnitude one). What are your thoughts on that?

@olix0r
Copy link
Member

olix0r commented Nov 19, 2018

@lucab That sounds reasonable, though I'm not sure if the storage overhead is worth it. Specifically, it would be unfortunate to have to export a _magnitude for each bucket in a histogram, as this would effectively double to cost of exporting a histogram for a marginal operational benefit --- if my napkin math is right, a counter incremented once per millisecond would not wrap for ~285,000 years.

@lucab
Copy link
Contributor Author

lucab commented Nov 21, 2018

No indeed. My bad, I didn't see histograms reuse this exposition path, let's leave it as is for the now then. This doesn't concern much the single-increments counters, but more the "quantity adding" ones (data-transfer counters and similar metering).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants