Functions for range and subquery range_start_ts(), range_end_ts(), range_span_seconds(), range_step_seconds() #16962

@ringerc

Proposal

To make it easier to write range queries without relying so heavily on client-side substitution of time range and duration values, it would be extremely helpful to have functions that evaluate to the start timestamp, the end timestamp and, most importantly, the step duration in seconds of the current range query (or subquery).

The problem

Presently, if I have a client executing a range query in 30s steps over a 10m time span, then executing it in 1h steps over a 24h time span, I have no way to make the PromQL itself adapt to this. This is most important for the step duration.

E.g. if I want a range query showing the periods in which containers are unready, I can write:

# find containers that were unready over any part of the range
group by (uid, container) (
  min_over_time(kube_pod_container_status_ready[10m] @ end()) == 0
)
# Only report range query results for containers that were unready at any point
# in the overall range
* on (uid, container)
  group_right()
  min by (uid, container, pod, namespace) (
      # only examine container readiness states for samples strictly in the time
      # period of this range step.
      avg_over_time(kube_pod_container_status_ready[30s])
  )

... but I have to hard-code the range duration (10m) and step (30s) in the query itself, or rely on client-side query substitution. As there's no standard for such substitution, this makes the queries harder to author in a way that's portable between rules, dashboards, etc.
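For illustration, the client-side substitution workaround described above might look roughly like the following Python sketch. The `__SPAN__` and `__STEP__` placeholders and the helper names are hypothetical, which is exactly the portability problem: every client invents its own convention.

```python
# Hypothetical client-side substitution; placeholder names are made up for this sketch.
QUERY_TEMPLATE = """\
group by (uid, container) (
  min_over_time(kube_pod_container_status_ready[__SPAN__] @ end()) == 0
)
* on (uid, container)
  group_right()
  min by (uid, container, pod, namespace) (
      avg_over_time(kube_pod_container_status_ready[__STEP__])
  )
"""


def promql_duration(seconds: int) -> str:
    """Format a whole number of seconds as a PromQL duration literal."""
    if seconds <= 0:
        raise ValueError("duration must be a positive number of seconds")
    # Prefer the largest unit that divides the value evenly.
    for unit, size in (("h", 3600), ("m", 60)):
        if seconds % size == 0:
            return f"{seconds // size}{unit}"
    return f"{seconds}s"


def render_query(template: str, start_ts: float, end_ts: float, step_seconds: float) -> str:
    """Substitute the span and step of a range query into the template."""
    span = promql_duration(int(end_ts - start_ts))
    step = promql_duration(int(step_seconds))
    return template.replace("__SPAN__", span).replace("__STEP__", step)
```

Calling `render_query(QUERY_TEMPLATE, 0, 600, 30)` yields the hard-coded 10m/30s query shown above, while `render_query(QUERY_TEMPLATE, 0, 86400, 3600)` yields the 24h/1h variant. With the proposed expressions, none of this client-side machinery would be needed.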

Proposed solution

I want to instead be able to use the currently-nonexistent function-like expressions range_step_seconds() and range_span_seconds() to make the query adapt to the execution context.

Spec

The following expressions would be evaluated to float values in units of seconds (time spans or unix epoch timestamps as appropriate):

  • range_step_seconds() - evaluates to the step size in seconds for the range query or subquery currently executing
  • range_span_seconds() - evaluates to the total time-span covered by the current range query or subquery
  • range_start_ts() - evaluates to the start timestamp for the current range query or subquery; not the same as @ start() because it can be used outside modifier context, and it works within subqueries in instant queries
  • range_end_ts() - evaluates to the end timestamp for the current range query or subquery; the same caveats apply as for range_start_ts()
  • range_step_start_ts() - evaluates to the start timestamp for the current range query or subquery's step, the timestamp at which metric{}[range_step_seconds()] begins
  • range_step_end_ts() - evaluates to the end timestamp for the current range query or subquery's step, the timestamp at which metric{}[range_step_seconds()] ends
  • range_step_count() - convenience for range_span_seconds() / range_step_seconds()
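As a rough model of the values listed above, the following Python sketch computes each one from a range query's start, end and step. The `RangeContext` class is hypothetical and not part of Prometheus. It assumes that a step's window ends at the evaluation timestamp and begins one step earlier, matching how metric{}[range_step_seconds()] would be evaluated at that timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeContext:
    """Hypothetical model of a range query or subquery; all values in seconds."""

    start: float  # unix timestamp of the first evaluation step
    end: float    # unix timestamp of the last evaluation step
    step: float   # resolution step

    def range_step_seconds(self) -> float:
        return self.step

    def range_span_seconds(self) -> float:
        return self.end - self.start

    def range_start_ts(self) -> float:
        return self.start

    def range_end_ts(self) -> float:
        return self.end

    def range_step_start_ts(self, eval_ts: float) -> float:
        # Assumption: the step window ends at the evaluation timestamp.
        return eval_ts - self.step

    def range_step_end_ts(self, eval_ts: float) -> float:
        return eval_ts

    def range_step_count(self) -> float:
        # Convenience for range_span_seconds() / range_step_seconds().
        return self.range_span_seconds() / self.range_step_seconds()
```

For a 10m range query in 30s steps, `RangeContext(start=1000, end=1600, step=30)` gives a span of 600 seconds and a step count of 20.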

range_end_ts() is distinct from timestamp(last_over_time(up{}[range_span_seconds()])) in that it (a) is cheaper and (b) returns the exact timestamp of the end of the range, not the timestamp of the sample closest to the end of the range. range_start_ts() is even more important due to PromQL's lack of first_over_time(...).
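To make the difference in point (b) concrete, here is a small Python sketch with made-up scrape timestamps. The helper `last_sample_ts` is hypothetical; it only mimics what timestamp(last_over_time(...)) would report, assuming a window that is open on the left and closed on the right.

```python
from typing import Iterable, Optional


def last_sample_ts(sample_timestamps: Iterable[float],
                   range_start: float,
                   range_end: float) -> Optional[float]:
    """Timestamp of the newest sample inside (range_start, range_end], if any."""
    in_range = [ts for ts in sample_timestamps if range_start < ts <= range_end]
    return max(in_range) if in_range else None


# Made-up data: scrapes every 15 seconds, not aligned with the query range.
scrapes = list(range(1007, 1700, 15))
range_start, range_end = 1000, 1600

newest = last_sample_ts(scrapes, range_start, range_end)
lag = range_end - newest  # how far the newest sample lags behind the range end
```

Here the newest sample in the range sits at 1592, eight seconds before the true end of the range at 1600, which is the value range_end_ts() would return.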

range_step_start_ts() and range_step_end_ts() should be usable in @ modifiers and anywhere a float-literal is accepted.

range_step_seconds() and range_span_seconds() should be usable:

  • in range selectors e.g. foo[range_span_seconds()], foo[range_step_seconds()]
  • in subquery range specifiers e.g. (expr)[1h:range_step_seconds()], (expr)[range_span_seconds():5s], (expr)[range_span_seconds():range_step_seconds()]
  • where float literals are accepted e.g. foo / range_span_seconds()
  • in offset modifiers e.g. offset range_span_seconds()

range_start_ts(), range_end_ts(), range_step_start_ts() and range_step_end_ts() should be usable:

  • in @ modifiers
  • anywhere a float literal is legal

Details and examples

It would look like this:

# find containers that were unready over any part of the range
group by (uid, container) (
  min_over_time(kube_pod_container_status_ready[range_span_seconds()] @ end()) == 0
)
# Only report range query results for containers that were unready at any point
# in the overall range
* on (uid, container)
  group_right()
  min by (uid, container, pod, namespace) (
      # only examine container readiness states for samples strictly in the time
      # period of this range step.
      avg_over_time(kube_pod_container_status_ready[range_step_seconds()])
  )

Since these are not true functions they'd need some special case handling in the parser etc. This also means they cannot be implemented as Prometheus user-defined extension functions.

The reason I don't propose $range_step_seconds etc is to avoid conflicts with the $var syntax commonly used by tools like Grafana and by Prometheus's own env-var variable expansion features.

Importantly, unlike @ end() and @ start() these should be evaluated lazily in a range subquery context within an instant query, so they are meaningful when used in instant queries like

# number of times containers were unready for more than 60 contiguous seconds
count_over_time(
  (
    max_over_time(kube_pod_container_status_ready[range_step_seconds()]) == 0
  )[30m:1m]
)

... where the inner expression should not need to repeat the outer expression's step and span. That gets time consuming and hard to maintain as query complexity grows, and makes it much harder to maintain reusable libraries of PromQL expression snippets.

range_step_seconds(), range_span_seconds(), range_start_ts(), range_end_ts() etc. should expand to float scalar literals, so they can also be used in scalar expressions that compute things like time-percentages of the range's duration or the current step, like

# r
sum(
  count_over_time(
    scrape_samples_scraped[range_step_seconds()]
  )
) / (range_step_seconds())

Unsolved issues

While it'd be very helpful to be able to use this in subqueries, especially when they're used in an outer instant query, the question then arises: which subquery? When there are nested subqueries, or subqueries used within range queries, which range should be used for the step, duration, etc.?

Ideally this would probably use some kind of label system, so the desired level of evaluation could be labelled, then the label passed to the step pseudo-function. But this would be adding way more complexity and syntax change, so it doesn't make sense to roll into this proposal.

In the absence of that, should these expressions evaluate by default in the inner or outer nesting of range queries or subqueries? Or is it necessary to support something like range_span_seconds(inner) and range_span_seconds(outer) from the start? I suspect that evaluating them in the outer-most range is the simplest to implement, as evaluating in the inner-most range requires query analysis to "stop" evaluation at the boundary of an inner nested subquery.
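The inner-versus-outer question can be pictured as a lookup on a stack of enclosing range scopes. The Python sketch below is only a way to reason about the two options, not a description of how the Prometheus engine is structured; the `Scope` type and `resolve` function are hypothetical.

```python
from typing import NamedTuple, Sequence


class Scope(NamedTuple):
    """One enclosing range query or subquery; values in seconds."""

    span: float
    step: float


def resolve(scopes: Sequence[Scope], which: str = "outer") -> Scope:
    """Pick the scope that a range_*() expression would evaluate against.

    `scopes` is ordered from the outermost range to the innermost subquery.
    """
    if not scopes:
        raise ValueError("not inside any range query or subquery")
    if which == "outer":
        return scopes[0]
    if which == "inner":
        return scopes[-1]
    raise ValueError(f"unknown scope selector: {which!r}")


# A 24h range query in 1h steps that contains a [30m:1m] subquery.
scopes = [Scope(span=86400, step=3600), Scope(span=1800, step=60)]
outer_step = resolve(scopes, "outer").step  # 3600
inner_step = resolve(scopes, "inner").step  # 60
```

In this picture range_step_seconds(outer) would be 3600 and range_step_seconds(inner) would be 60; the open question is which of the two an argument-less call should mean.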

Implementation advice

If this idea is broadly appreciated I would be interested in attempting to implement it. Some advice on possible pitfalls and challenges, the appropriate subsystem/layer, and where to start looking would be welcomed, though I'd likely figure it out otherwise.
