Functions for range and subquery: `range_start_ts()`, `range_end_ts()`, `range_span_seconds()`, `range_step_seconds()` #16962

## Proposal

To make it easier to write range queries without relying so heavily on client-side substitution of time range and duration values, it would be extremely helpful to have functions that evaluate to the start timestamp, end timestamp, and, most importantly, step duration in seconds of the current range query or subquery.

## The problem

Presently, if I have a client executing a range query in 30s steps over a 10m time span, then executing it in 1h steps over a 24h time span, I have no way to make the PromQL itself adapt to this. This is most important for the step duration.

E.g. if I want a range query showing the periods in which containers are unready, I can write:

```promql
# find containers that were unready over any part of the range
group by (uid, container) (
  min_over_time(kube_pod_container_status_ready[10m] @ end()) == 0
)
# Only report range query results for containers that were unready at any point
# in the overall range
* on (uid, container)
  group_right()
  min by (uid, container, pod, namespace) (
      # only examine container readiness states for samples strictly in the time
      # period of this range step.
      avg_over_time(kube_pod_container_status_ready[30s])
  )
```

... but I have to hard-code the range duration (`10m`) and step (`30s`) in the query itself, or rely on client-side query substitution. As there's no standard for such substitution, this makes queries harder to author in a manner that's portable between rules, dashboards, etc.
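To illustrate the client-side substitution described above, here is a minimal Python sketch of what a client has to do today. The `%SPAN%` and `%STEP%` placeholders and the `render_query` helper are hypothetical names invented for this illustration; every tool has its own convention, which is the portability problem.

```python
# Hypothetical client-side templating. The placeholder names are invented
# for this sketch and are not a Prometheus or Grafana convention.
QUERY_TEMPLATE = """
group by (uid, container) (
  min_over_time(kube_pod_container_status_ready[%SPAN%] @ end()) == 0
)
* on (uid, container)
  group_right()
  min by (uid, container, pod, namespace) (
      avg_over_time(kube_pod_container_status_ready[%STEP%])
  )
"""


def render_query(template: str, span_seconds: int, step_seconds: int) -> str:
    """Substitute the range span and step into the query as PromQL durations."""
    return (
        template
        .replace("%SPAN%", f"{span_seconds}s")
        .replace("%STEP%", f"{step_seconds}s")
    )


# The same template must be re-rendered for every (span, step) combination.
short_query = render_query(QUERY_TEMPLATE, span_seconds=600, step_seconds=30)
long_query = render_query(QUERY_TEMPLATE, span_seconds=86400, step_seconds=3600)
```

The query text sent to the server differs for each call, so it cannot be stored once as a reusable expression.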

## Proposed solution

I want to instead be able to use the currently-nonexistent function-like expressions `range_step_seconds()` and `range_span_seconds()` to make the query adapt to the execution context. 

### Spec

The following expressions would be evaluated to float values in units of seconds (time spans or unix epoch timestamps as appropriate):

* `range_step_seconds()` - evaluates to the step size in seconds for the range query *or subquery* currently executing
* `range_span_seconds()` - evaluates to the total time-span covered by the current range query *or subquery*
* `range_start_ts()` - evaluates to the start timestamp for the current range query *or subquery*; not the same as `@ start()` because it can be used outside modifier context, and it works within subqueries in instant queries
* `range_end_ts()` - evaluates to the end timestamp for the current range query *or subquery*; the same caveats apply as for `range_start_ts()`
* `range_step_start_ts()` - evaluates to the start timestamp for the current range query or subquery's step, the timestamp at which `metric{}[range_step_seconds()]` begins
* `range_step_end_ts()` - evaluates to the end timestamp for the current range query or subquery's step, the timestamp at which `metric{}[range_step_seconds()]` ends
* `range_step_count()` - convenience for `range_span_seconds() / range_step_seconds()`
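As a rough model of the values listed above, the following Python sketch computes each of them from a query's start, end, and step. The `RangeContext` class is a hypothetical illustration, not Prometheus code, and the per-step start and end values assume that `metric{}[range_step_seconds()]` evaluated at a timestamp covers the one step ending at that timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeContext:
    """Parameters of a range query or subquery, in seconds (Unix epoch for timestamps)."""
    start_ts: float
    end_ts: float
    step_seconds: float

    def range_span_seconds(self) -> float:
        return self.end_ts - self.start_ts

    def range_step_count(self) -> float:
        # Defined in the list above as span / step.
        return self.range_span_seconds() / self.step_seconds

    def range_step_end_ts(self, eval_ts: float) -> float:
        # Assumption: the step's window ends at the evaluation timestamp.
        return eval_ts

    def range_step_start_ts(self, eval_ts: float) -> float:
        # Assumption: the step's window begins one step before eval_ts.
        return eval_ts - self.step_seconds


# A 10m range query evaluated in 30s steps.
ctx = RangeContext(start_ts=1_700_000_000.0, end_ts=1_700_000_600.0, step_seconds=30.0)
span = ctx.range_span_seconds()
steps = ctx.range_step_count()
first_step_start = ctx.range_step_start_ts(ctx.start_ts)
```

Note that a range query is evaluated at both endpoints, so when the span is a multiple of the step the number of evaluation timestamps is one more than this ratio.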

`range_end_ts()` is distinct from `timestamp(last_over_time(up{}[range_span_seconds()]))` in that it (a) is cheaper and (b) returns the exact timestamp of the end of the range, not the timestamp of the sample closest to the end of the range. `range_start_ts()` is even more important due to PromQL's lack of `first_over_time(...)`.
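The difference in point (b) can be seen in a small Python model. This is a sketch with made-up sample data: `last_sample_ts` is a hypothetical helper that mimics what `timestamp(last_over_time(...))` returns, assuming the left-open range window used by Prometheus 3.x.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, value)


def last_sample_ts(
    samples: Sequence[Sample], end_ts: float, span_seconds: float
) -> Optional[float]:
    """Model of timestamp(last_over_time(m[span])) evaluated at end_ts."""
    # Assumption: the window is left-open, (end_ts - span, end_ts].
    in_window = [ts for ts, _ in samples if end_ts - span_seconds < ts <= end_ts]
    return max(in_window) if in_window else None


# Scrapes every 15s that do not line up with the end of the range.
samples = [(1000.0, 1.0), (1015.0, 1.0), (1030.0, 1.0)]
end_ts = 1040.0

approx_end = last_sample_ts(samples, end_ts, span_seconds=60.0)
exact_end = end_ts  # what range_end_ts() would yield under this proposal
```

Here the sample-based value lags the true end of the range by 10 seconds, and it is undefined when the window contains no samples at all.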

`range_step_start_ts()` and `range_step_end_ts()` should be usable in `@` modifiers and anywhere a float-literal is accepted.

`range_step_seconds()` and `range_span_seconds()` should be usable:

* in range selectors e.g. `foo[range_span_seconds()]`, `foo[range_step_seconds()]`
* in subquery range specifiers e.g. `(expr)[1h:range_step_seconds()]`, `(expr)[range_span_seconds():5s]`, `(expr)[range_span_seconds():range_step_seconds()]`
* where float literals are accepted e.g. `foo / range_span_seconds()`
* in `offset` modifiers e.g. `offset range_span_seconds()`

`range_start_ts()`, `range_end_ts()`, `range_step_start_ts()` and `range_step_end_ts()` should be usable:

* in `@` modifiers
* anywhere a float literal is legal

### Details and examples

It would look like this:

```promql
# find containers that were unready over any part of the range
group by (uid, container) (
  min_over_time(kube_pod_container_status_ready[range_span_seconds()] @ end()) == 0
)
# Only report range query results for containers that were unready at any point
# in the overall range
* on (uid, container)
  group_right()
  min by (uid, container, pod, namespace) (
      # only examine container readiness states for samples strictly in the time
      # period of this range step.
      avg_over_time(kube_pod_container_status_ready[range_step_seconds()])
  )
```

Since these are not true functions, they'd need special-case handling in the parser and elsewhere. This also means they cannot be implemented as Prometheus user-defined extension functions.

The reason I don't propose `$range_step_seconds` etc. is to avoid conflicts with the `$var` syntax commonly used by tools like Grafana and by Prometheus's own environment-variable expansion features.

Importantly, unlike `@ end()` and `@ start()`, these should be evaluated _lazily_ in a range subquery context within an instant query, so that they are meaningful when used in instant queries like:

```promql
# number of times containers were unready for more than 60 contiguous seconds
count_over_time(
  (
    max_over_time(kube_pod_container_status_ready[range_step_seconds()]) == 0
  )[30m:1m]
)
```

... where the inner expression should not need to repeat the outer expression's step and span. Repeating them gets time-consuming and hard to maintain as query complexity grows, and makes it much harder to maintain reusable libraries of PromQL expression snippets.

`range_step_seconds()`, `range_span_seconds()`, `range_start_ts()`, `range_end_ts()` etc. should expand to float scalar literals, so they can also be used in scalar expressions that compute things like time-percentages of the range's duration or the current step, like:

```promql
# scrapes per second, summed across all targets, during the current step
sum(
  count_over_time(
    scrape_samples_scraped[range_step_seconds()]
  )
) / (range_step_seconds())
```

## Unsolved issues

While it'd be very helpful to be able to use this in subqueries, especially when they're used in an outer instant query, the question then arises: which subquery? When there are nested subqueries, or subqueries used within range queries, which range should supply the step, duration, etc.?

Ideally this would probably use some kind of label system, so the desired level of evaluation could be labelled, then the label passed to the step pseudo-function. But this would be adding way more complexity and syntax change, so it doesn't make sense to roll into this proposal.

In the absence of that, should these expressions evaluate by default in the inner or outer nesting of range queries or subqueries? Or is it necessary to support something like `range_span_seconds(inner)` and `range_span_seconds(outer)` from the start? I suspect that evaluating them in the outer-most range is the simplest to implement, as evaluating in the inner-most range requires query analysis to "stop" evaluation at the boundary of an inner nested subquery.
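To make the inner versus outer question concrete, here is a minimal Python sketch that treats the enclosing range query and subqueries as a stack. The `RangeScope` type and `resolve_scope` helper are hypothetical, and the `inner` / `outer` selectors simply mirror the names floated above; nothing here reflects how the Prometheus engine is structured.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RangeScope:
    span_seconds: float
    step_seconds: float


def resolve_scope(scopes: List[RangeScope], which: str) -> RangeScope:
    """Pick a scope from a stack ordered outermost first, innermost last."""
    if not scopes:
        raise ValueError("no enclosing range query or subquery")
    if which == "outer":
        return scopes[0]
    if which == "inner":
        return scopes[-1]
    raise ValueError(f"unknown scope selector: {which!r}")


# A range query (24h span, 1h step) whose expression contains a [30m:1m] subquery.
scopes = [RangeScope(86400.0, 3600.0), RangeScope(1800.0, 60.0)]
outer_step = resolve_scope(scopes, "outer").step_seconds
inner_step = resolve_scope(scopes, "inner").step_seconds
```

In this example the two choices differ by a factor of 60, so a query author would need to know which default applies before relying on `range_step_seconds()` inside a nested subquery.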

## Implementation advice

If this idea is broadly appreciated I would be interested in attempting to implement it. Some advice on possible pitfalls and challenges, the appropriate subsystem/layer, and where to start looking would be welcomed, though I'd likely figure it out otherwise.

## Related discussions

* https://github.com/prometheus/prometheus/issues/15862#issuecomment-2618506484
* https://docs.google.com/document/d/1jMeDsLvDfO92Qnry_JLAXalvMRzMSB1sBr9V7LolpYM/edit?usp=sharing
* https://github.com/prometheus/prometheus/issues/15838 - `start()` and `end()` don't work in subqueries in instant queries
