Prometheus Input fails with metric_version = 2 #8617

@johnseekins

Relevant telegraf.conf:

[global_tags]
  # dc = "us-east-1" # will tag all metrics with dc=us-east-1
  # rack = "1a"
  ## Environment variables can be used as tags, and throughout the config file
  # user = "$USER"
[agent]
  interval = "30s"
  round_interval = false
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "2s"
  flush_interval = "10s"
  flush_jitter = "5s"
  precision = ""
  debug = false
  quiet = false
  logfile = ""
  hostname = ""
  omit_hostname = false
[[inputs.prometheus]]
  ## An array of urls to scrape metrics from.
  urls = ["http://localhost:9000/minio/prometheus/metrics"]
  metric_version = 2
  name_override = "minio"
  tagexclude = ["url"]

System info:

$ lsb_release -rc
Release:	10
Codename:	buster

$ dpkg -l | grep telegraf
ii  telegraf                             1.17.0-1                              amd64        Plugin-driven server agent for reporting metrics into InfluxDB.

$ minio -v
minio version RELEASE.2020-12-12T08-39-07Z

Steps to reproduce:

  1. Install telegraf and minio on a host
  2. Attempt to scrape prometheus stats from minio using telegraf with metric_version = 2.
$ sudo -u telegraf telegraf --test --config-directory /etc/telegraf/telegraf.d/ --input-filter prometheus
2020-12-24T17:51:35Z I! Starting Telegraf 1.17.0
2020-12-24T17:51:35Z I! Using config file: /etc/telegraf/telegraf.conf
2020-12-24T17:51:35Z E! [inputs.prometheus] Error in plugin: error reading metrics for http://localhost:9000/minio/prometheus/metrics: reading text format failed: text format parsing error in line 1: expected float as value, got ""
2020-12-24T17:51:35Z E! [telegraf] Error running agent: input plugins recorded 1 errors
  3. Switch to metric_version = 1 and stats will be collected, although the naming is then messed up.

Expected behavior:

  1. When scraping stats from minio's prometheus status page, telegraf fails to parse the output values
  2. If I set metric_version = 1 in the prometheus input, the stats are collected.
  3. This wasn't the case before 1.17.0. metric_version = 2 has been working well for quite some time.

Actual behavior:

Described above.

Additional info:

This is also occurring with other prometheus client endpoints (traefik, for example) on another Debian 10 system.

Example minio stats pulled from the host:

$ curl -Ss localhost:9000/minio/prometheus/metrics | head
# HELP disk_storage_available Total available space left on the disk
# TYPE disk_storage_available gauge
disk_storage_available{disk="/data/Blobstore"} 1.1720781201408e+13
# HELP disk_storage_total Total space on the disk
# TYPE disk_storage_total gauge
disk_storage_total{disk="/data/Blobstore"} 1.6001493008384e+13
# HELP disk_storage_used Total disk storage used on the disk
# TYPE disk_storage_used gauge
disk_storage_used{disk="/data/Blobstore"} 4.280711806976e+12
# HELP go_gc_duration_seconds A summary of the GC invocation durations.

It seems clear to me those are valid prometheus stats. What's happening here?
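
One way to double-check the claim that the endpoint output is well-formed is to run a rough sanity check over the sample lines. The sketch below is only an approximation, not the parser Telegraf uses: `check_prom_values` is a hypothetical helper name, and it assumes each sample line ends with its value (no trailing timestamp) and that label values contain no spaces.

```shell
# Hypothetical helper: flag sample lines whose last field does not look like a float.
# Reads Prometheus text exposition format on stdin.
# Assumptions: no trailing timestamp field, no spaces inside label values.
check_prom_values() {
  awk '
    /^#/ { next }      # skip HELP/TYPE comment lines
    NF == 0 { next }   # skip blank lines
    {
      total++
      v = $NF
      if (v !~ /^[-+]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][-+]?[0-9]+)?$/ && v !~ /^[-+]?(Inf|NaN)$/) {
        bad++
        printf "line %d: suspicious value \"%s\"\n", NR, v
      }
    }
    END { printf "samples=%d bad=%d\n", total, bad }
  '
}
```

It could be used against the same endpoint as the curl command above, for example `curl -sS localhost:9000/minio/prometheus/metrics | check_prom_values`. If it reports `bad=0`, the body returned to curl is at least plausibly valid, which would suggest the parse error arises from how Telegraf requests or interprets the response rather than from the exported values themselves.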

Labels

area/prometheus, bug (unexpected problem or unintended behavior)