scrape (histograms): Automatically reduce resolution rather than fail scrape #12864

@beorn7

Description
Proposal

Currently, we fail the entire scrape if the configured bucket limit is exceeded in a native histogram (scrape config option native_histogram_bucket_limit). It would be better to simply reduce the resolution (in other words, decrease the schema) just enough to stay within native_histogram_bucket_limit. This is computationally cheap (it only requires merging neighboring buckets by adding up their counts).
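To illustrate why this is cheap: in a native histogram at schema s, bucket i has the upper bound 2^(i/2^s), so decreasing the schema by 1 merges bucket pair (2j-1, 2j) into bucket j, i.e. index i maps to ceil(i/2). The sketch below is a hypothetical illustration of that merge step on a sparse index-to-count map, not Prometheus's actual implementation:

```go
package main

import "fmt"

// mergeBuckets halves the resolution of a native histogram's bucket
// counts, i.e. decreases the schema by 1. Bucket index i at schema s
// (upper bound 2^(i/2^s)) maps to index ceil(i/2) at schema s-1, so
// neighboring bucket pairs are merged by adding up their counts.
// Hypothetical sketch; Prometheus's real code operates on span/delta
// encoded buckets instead of a plain map.
func mergeBuckets(counts map[int]uint64) map[int]uint64 {
	merged := make(map[int]uint64)
	for i, c := range counts {
		// Compute ceil(i/2): Go's integer division truncates toward
		// zero, which already equals ceil for i <= 0.
		j := i / 2
		if i%2 != 0 && i > 0 {
			j++
		}
		merged[j] += c
	}
	return merged
}

func main() {
	counts := map[int]uint64{1: 3, 2: 5, 3: 2, 4: 7}
	// Buckets 1 and 2 merge into bucket 1 (count 8),
	// buckets 3 and 4 merge into bucket 2 (count 9).
	fmt.Println(mergeBuckets(counts))
}
```

To get under a given bucket limit, this step would simply be repeated (schema decreased further) until the bucket count fits.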

Once we have done this, it might also be useful to add another scrape config option, e.g. native_histogram_max_schema, to cap the resolution overall (automatically lowering the resolution of scraped histograms accordingly). This avoids needlessly storing histograms at a higher resolution than needed, while still giving other scrapers the option to scrape the histogram at the full resolution provided by the target.
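As a sketch of how this could look in a scrape config: native_histogram_bucket_limit already exists, while native_histogram_max_schema is the option proposed here (name and placement are illustrative only):

```yaml
scrape_configs:
  - job_name: "example"
    # Existing option: cap on the number of buckets per native histogram.
    # With this proposal, exceeding it would trigger automatic resolution
    # reduction instead of failing the scrape.
    native_histogram_bucket_limit: 160
    # Proposed (hypothetical) option: cap the schema, i.e. the resolution,
    # of all scraped native histograms for this job.
    native_histogram_max_schema: 3
```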

This is particularly relevant for OTel exponential histograms, which follow the strategy of starting with a very high resolution and only reducing it once the bucket limit is reached. While this approach minimizes the configuration required from the user, it creates a lot of "resolution change noise". The backend usually has a good idea of the maximum useful resolution (e.g. because histograms get aggregated over time or across labels anyway, which uses the lowest common resolution). Paying the storage cost for the frequent resolution changes, as well as for the short periods of higher resolution, should be avoided.

(For reference, Mimir has a similar feature request, but I think this is definitely useful here in upstream Prometheus.)

Metadata

Status: Done