Description
A target fully supporting OpenMetrics will expose ..._created lines for each metric child. The way Prometheus currently ingests OpenMetrics, each of those lines creates an additional time series. The value is (mostly) constant and will therefore compress well. However, the number of time series will (almost) double, which is a significant bump in resource usage. This is in stark contrast to the almost non-existent usefulness of those time series in the current Prometheus context.
Prometheus negotiates OpenMetrics by default (and that behavior cannot be disabled). Thus, with each target that starts to support OpenMetrics, the number of time series will increase. This will come as a surprise to the operator of the Prometheus server if the OpenMetrics support is just a side effect of upgrading to a new version of a 3rd-party-supported target. For example, once K8s components expose OpenMetrics including the ..._created lines, a routine K8s upgrade will suddenly start to expose all of them.
While technically, a metrics change is something operators should take into account when upgrading targets, I expect a lot of confusion and surprise and even monitoring outages if we don't handle the change in a more robust way. See prometheus/client_python#438 for reactions when the Python client started to support OpenMetrics.
The currently recommended action is probably to add metric_relabel_configs to drop all metrics with a name ending in _created. This has a number of issues:
- It is opt-in, and it is so in a very non-obvious way. Even if we advertise this practice aggressively, I'd expect many operators to not notice.
- It will drop all metrics with a name ending in _created, even those that are not auto-created but regular metrics.
- It has a more or less significant performance impact.
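To illustrate the second issue, here is a minimal Python sketch that mimics what a blanket drop rule on the metric name would do. It is not Prometheus code. The regex `.+_created` and the metric name `tickets_created` are assumptions made up for this example; the only Prometheus behavior relied on is that relabeling regexes are fully anchored, which `fullmatch()` imitates.

```python
import re

# Prometheus relabeling regexes are fully anchored, so fullmatch() mirrors that.
# The pattern itself is an assumed example of a "drop everything _created" rule.
DROP_CREATED = re.compile(r".+_created")


def drop_by_suffix(metric_names):
    """Return the names that survive a blanket drop of `..._created`."""
    return [name for name in metric_names if not DROP_CREATED.fullmatch(name)]


scraped = [
    "http_requests_total",
    "http_requests_created",  # auto-generated companion of http_requests_total
    "tickets_created",        # hypothetical regular metric, not auto-generated
]

kept = drop_by_suffix(scraped)
print(kept)  # tickets_created is gone as well, although it is a regular metric
```

The rule cannot tell the auto-generated line from the regular metric, because it only sees the name.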
I propose to handle the ..._created lines in a different way.
The minimal option would be to automatically ignore those lines if all of the following are true:
- It's an OpenMetrics exposition.
- The ..._created line has a corresponding "proper" metric (e.g. foobar_created goes along with foobar or foobar_total or foobar_sum/foobar_count etc.).
The above will avoid dropping regular metrics that happen to have a name ending in _created.
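The two conditions could be sketched roughly as follows. This is an illustration in Python, not a proposal for the actual scrape code. The function name `filter_created`, the flag `is_openmetrics`, and the exact suffix list are assumptions; in particular `_bucket` is my guess at what the "etc." covers.

```python
CREATED_SUFFIX = "_created"
# Suffixes under which the "proper" series of a metric may appear.
# "_bucket" is an assumption; the others come from the examples above.
COMPANION_SUFFIXES = ("", "_total", "_sum", "_count", "_bucket")


def filter_created(metric_names, is_openmetrics):
    """Drop ..._created names only if both conditions hold for the scrape."""
    if not is_openmetrics:
        # Condition 1 fails: not an OpenMetrics exposition, keep everything.
        return list(metric_names)
    present = set(metric_names)
    kept = []
    for name in metric_names:
        if name.endswith(CREATED_SUFFIX):
            base = name[: -len(CREATED_SUFFIX)]
            # Condition 2: a corresponding "proper" metric is in the same scrape.
            if base and any(base + s in present for s in COMPANION_SUFFIXES):
                continue
        kept.append(name)
    return kept


names = [
    "foobar_total",
    "foobar_created",     # companion of foobar_total, ignored
    "tickets_created",    # hypothetical regular metric, kept
    "latency_sum",
    "latency_count",
    "latency_created",    # companion of latency_sum/latency_count, ignored
]
print(filter_created(names, is_openmetrics=True))
```

Unlike a name-based drop rule, the regular metric `tickets_created` survives here because nothing else in the scrape shares its base name.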
However, perhaps we can do even better and actually make use of the ..._created lines: If the creation timestamp passes certain sanity checks (earlier than the scrape time, but not by too much), we can artificially insert "zero" samples for counters, histograms, and summaries that spring into existence with a value greater than zero. This would finally provide a solution to the long-standing problem described in #1673.
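A rough sketch of that idea, again in Python and purely illustrative: the `Sample` type, the function `synthetic_zero`, and above all the one-hour bound `MAX_CREATED_AGE_SECONDS` are hypothetical. What "not too much" should mean in practice is exactly the open question.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical bound on how far before the scrape the creation time may lie.
MAX_CREATED_AGE_SECONDS = 3600.0


@dataclass
class Sample:
    timestamp: float  # seconds since epoch
    value: float


def synthetic_zero(
    first_value: float,
    scrape_time: float,
    created_time: float,
    seen_before: bool,
    max_age: float = MAX_CREATED_AGE_SECONDS,
) -> Optional[Sample]:
    """Return a zero sample to insert at the creation time, or None."""
    if seen_before or first_value <= 0:
        # Only series that spring into existence with a value above zero.
        return None
    if created_time >= scrape_time:
        # Sanity check: creation must be earlier than the scrape.
        return None
    if scrape_time - created_time > max_age:
        # Sanity check: but not too much earlier.
        return None
    return Sample(timestamp=created_time, value=0.0)


# A counter first seen with value 5, created 10 seconds before the scrape.
print(synthetic_zero(5.0, scrape_time=1000.0, created_time=990.0, seen_before=False))
```

With the zero sample in place, the first scraped value would count as an increase since the creation time rather than being lost.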