
PromQL: Labels of name "start" and "end" no longer allowed in grouping opts #9040

@claytonpeters

Description


What did you do?

Following an upgrade from 2.10.0 to 2.27.1, our alerting ruleset failed to load because of a supposed error in one of the rules, which references the label end. I have since reproduced the issue within Prometheus itself using a smaller test case. The issue appears to stem from the introduction of the "@ modifier" in 2.25. If the label name start or end is referenced in an on(), by(), ignoring(), group_left(), or group_right() clause, a parse error is returned. Metrics named start or end appear to trigger errors as well.

Note that this occurs even though the promql-at-modifier feature flag is not enabled.

After some testing, I see the following behaviour:

# The following queries are arbitrary examples and give meaningless results, but demonstrate the bug:

# CORRECT: Both fine and work as expected
prometheus_http_request_duration_seconds_sum and on(job) up
prometheus_http_request_duration_seconds_sum and on(job, instance) up

# CORRECT: Also works with labels that don't exist
prometheus_http_request_duration_seconds_sum and on(job, this_is_fictional) up

# CORRECT: Label name can happily overlap with function name
prometheus_http_request_duration_seconds_sum and on(job, sum_over_time) up

# BROKEN: However, the label list CANNOT contain "start" or "end" in 2.27.1 (presumably since 2.25.0). Both work in 2.10.0
prometheus_http_request_duration_seconds_sum and on(job, start) up
prometheus_http_request_duration_seconds_sum and on(job, end) up

# BROKEN: This is true of ignoring(), group_left(), etc. - These all give the same error in 2.27.1, but give correct error messages in 2.10.0
prometheus_http_request_duration_seconds_sum and ignoring(foo, end) up
prometheus_http_request_duration_seconds_sum * on(job) group_left(end) up 
prometheus_http_request_duration_seconds_sum * on(job) group_left(start) up 

# CORRECT: You _can_ however happily match labels called start and end:
up{start!="foo",end!="bar"}

# BROKEN: You _cannot_ however have metrics called start and/or end:
start{job="foo"}
end{job="foo"}
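As a rough aid for finding affected rules before upgrading, the pattern above can be approximated with a regular expression. This is only a heuristic sketch, not a PromQL parser: the function name find_reserved_labels is hypothetical, including without() is an assumption (I only tested the clauses listed above), and it does not detect metrics named start or end.

```python
import re

# Clauses that take a label list. "without" is an assumption; the report
# above only covers on(), by(), ignoring(), group_left() and group_right().
_CLAUSE = re.compile(
    r"\b(by|without|on|ignoring|group_left|group_right)\s*\(([^()]*)\)"
)

# Label names that 2.27.1 rejects inside a label list, per the report.
_RESERVED = {"start", "end"}


def find_reserved_labels(expr: str) -> list[tuple[str, str]]:
    """Return (clause, label) pairs where a label list names start or end."""
    hits = []
    for match in _CLAUSE.finditer(expr):
        clause, body = match.group(1), match.group(2)
        for label in (part.strip() for part in body.split(",")):
            if label in _RESERVED:
                hits.append((clause, label))
    return hits


if __name__ == "__main__":
    query = "prometheus_http_request_duration_seconds_sum and on(job, start) up"
    print(find_reserved_labels(query))
```

Because this is plain text matching, it can report false positives, for example when a string inside a label matcher happens to look like a clause. Anything it flags should be confirmed against a real 2.27.1 instance.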

What did you expect to see?

Prometheus should have parsed these particular rules and loaded the alerting ruleset without error.

What did you see instead? Under which circumstances?

If I use start or end as a label name in an on(), by(), ignoring(), group_left(), or group_right() clause, I get one of the following:

Error executing query: invalid parameter "query": 1:58: parse error: unexpected "start" in grouping opts, expected label
Error executing query: invalid parameter "query": 1:58: parse error: unexpected "end" in grouping opts, expected label
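The 1:58 prefix in these messages looks like a 1-based line:column position within the query. That reading is an assumption on my part, but it is consistent with the on(job, start) example, where column 58 is the first character of start. A small sketch that marks the reported position (point_at_error is a hypothetical helper, not part of Prometheus):

```python
import re

# Matches the "line:column" pair immediately before "parse error".
_POS = re.compile(r"(\d+):(\d+): parse error")


def point_at_error(expr: str, message: str) -> str:
    """Return the offending query line with a caret under the reported column.

    Assumes the position in the message is 1-based line:column.
    """
    match = _POS.search(message)
    if match is None:
        raise ValueError("no line:column position found in message")
    line_no, col = int(match.group(1)), int(match.group(2))
    line = expr.splitlines()[line_no - 1]
    return f"{line}\n{' ' * (col - 1)}^"


if __name__ == "__main__":
    query = "prometheus_http_request_duration_seconds_sum and on(job, start) up"
    error = (
        'invalid parameter "query": 1:58: parse error: '
        'unexpected "start" in grouping opts, expected label'
    )
    print(point_at_error(query, error))
```

For the query and error shown in the script, the caret lands under the s of start, which matches the token named in the message.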

Environment

  • System information:
Linux 3.10.0-1160.11.1.el7.x86_64 x86_64
  • Prometheus version:
prometheus, version 2.27.1 (branch: HEAD, revision: db7f0bcec27bd8aeebad6b08ac849516efa9ae02)
  build user:       root@fd804fbd4f25
  build date:       20210518-14:17:54
  go version:       go1.16.4
  platform:         linux/amd64
  • Prometheus configuration file:

Not particularly relevant in this case, but the test instance on which I reproduced the issue had a fairly basic config:

global:
  scrape_interval: 30s
  evaluation_interval: 15s
  scrape_timeout: 5s
  external_labels: {}

rule_files:
  - "/config/rules.yml"

# Alerting: Disabled in the test instance
#alerting:
#  alertmanagers:
#    - static_configs:
#        - targets: []

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  • Logs:

Following the upgrade, we saw the following in the logs:

level=error ts=2021-07-01T21:40:45.362Z caller=manager.go:956 component="rule manager" msg="loading groups failed" err="/config/rules.yml: 222:11: group \"REDACTED\", rule 10, \"REDACTED\": could not parse expression: 1:162: parse error: unexpected \"end\" in grouping opts, expected label"
level=error ts=2021-07-01T21:40:45.362Z caller=main.go:977 msg="Failed to apply configuration" err="error loading rules, previous rule set restored"

When reproducing the issue outside of the upgrade, using the start and end label names in the Prometheus query UI produces the following errors:

Error executing query: invalid parameter "query": 1:58: parse error: unexpected "start" in grouping opts, expected label
Error executing query: invalid parameter "query": 1:58: parse error: unexpected "end" in grouping opts, expected label
