Skip to content

Infer types in Prometheus remote_write#19944

Merged
ChrsMark merged 26 commits intoelastic:masterfrom
ChrsMark:infer_types_remote_write
Jul 29, 2020
Merged

Infer types in Prometheus remote_write#19944
ChrsMark merged 26 commits intoelastic:masterfrom
ChrsMark:infer_types_remote_write

Conversation

@ChrsMark
Copy link
Copy Markdown
Member

@ChrsMark ChrsMark commented Jul 15, 2020

This PR add types for Prometheus metrics that are collected via remote_write metricset.
These metrics are in raw format when received so a heuristic approach is applied to extract types from metrics' name patterns.

Closes #17675.

TODOs

  • Implementation
  • Tests
  • Update docs

How to test this PR

Env setup

Use https://github.com/ChrsMark/docker-prometheus-playground for a ready to use prometheus env.
For remote write this config is needed: https://github.com/ChrsMark/docker-prometheus-playground/blob/master/prometheus/prometheus.yml#L10
If using a darwin machine the following should be enough:

remote_write:
- url: "http://host.docker.internal:9201/write"

On linux machines in order to access host from within the Prometheus container one can set network_mode: host in the docker-compose.yml and then host can be reach using 0.0.0.0.

Testing

  1. Use oss version and test collector and remote_write (without types) metricsets for regressions.
  2. Use basic version and test collector (with types) for regressions.
  3. Use basic version and test that remote_write metricset:
    a. uses types as expected.
    b. can identify types based on input patterns: https://github.com/elastic/beats/pull/19944/files?file-filters%5B%5D=.asciidoc#diff-b69e84fad87eb2ae563115f43675a547R118

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jul 15, 2020
Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark ChrsMark force-pushed the infer_types_remote_write branch from 1d4c919 to c0c6a0f Compare July 15, 2020 14:00
@ChrsMark ChrsMark added the Team:Platforms Label for the Integrations - Platforms team label Jul 15, 2020
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jul 15, 2020
Signed-off-by: chrismark <chrismarkou92@gmail.com>
ChrsMark added 7 commits July 16, 2020 11:05
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
@elasticmachine
Copy link
Copy Markdown
Contributor

Pinging @elastic/integrations-platforms (Team:Platforms)

@ChrsMark
Copy link
Copy Markdown
Member Author

This one is ready for a first round of review. Implementation is complete along with unit tests to cover all the metrics processing chain from the heuristic approach of extracting types to grouping metrics with same labesets together.

ChrsMark added 2 commits July 17, 2020 17:26
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark ChrsMark added test-plan Add this PR to be manual test plan [zube]: In Review and removed [zube]: In Progress labels Jul 21, 2020
ChrsMark added 2 commits July 22, 2020 12:48
Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark
Copy link
Copy Markdown
Member Author

jenkins run the tests please

ChrsMark added 2 commits July 27, 2020 10:32
Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark
Copy link
Copy Markdown
Member Author

@exekias I did some cleanups and it looks like this is ready for a final review so feel free to hit it once you get the chance.

Copy link
Copy Markdown
Contributor

@exekias exekias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how things are looking, I left a few minor comments!

Signed-off-by: chrismark <chrismarkou92@gmail.com>
Copy link
Copy Markdown
Contributor

@exekias exekias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM although lint errors are legit, please review

Signed-off-by: chrismark <chrismarkou92@gmail.com>
Copy link
Copy Markdown
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice feature and well documented 👍 I have added some small suggestions, nothing serious.

. `_count` suffix: the metric is of Counter type
. `_bucket` suffix and `le` in labels: the metric is of Histogram type

Everything else is handled as a Gauge.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, is there a reason to default to gauge and not to counter?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, we check if a metric is a counter or a histogram and if not then we conclude that is a gauge. This is the "heuristic". If we fall-back to counters then we will not have a way to identify gauges.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I meant if there was a reason to add a heuristic to detect counters and handle everything else as gauges and not the other way around. Maybe it is more difficult to find a heuristic to detect gauges 🤔

Signed-off-by: chrismark <chrismarkou92@gmail.com>
Copy link
Copy Markdown
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the comments!

. `_count` suffix: the metric is of Counter type
. `_bucket` suffix and `le` in labels: the metric is of Histogram type

Everything else is handled as a Gauge.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I meant if there was a reason to add a heuristic to detect counters and handle everything else as gauges and not the other way around. Maybe it is more difficult to find a heuristic to detect gauges 🤔

const (
counterType = "counter_type"
histogramType = "histgram_type"
otherType = "other_type"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok, it is not only about gauges 👍 Should we mention in some place that summaries are handled as gauges?

ChrsMark added 2 commits July 29, 2020 13:34
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark
Copy link
Copy Markdown
Member Author

Thanks for the review @jsoriano!

Regarding #19944 (comment), it is easier to identify counters and histograms, gauge does not follow a specific pattern. This approach was also defined at #19944 (comment).

I also added a note in the docs about Summary handling 🍻 .

@ChrsMark ChrsMark merged commit b797a7e into elastic:master Jul 29, 2020
ChrsMark added a commit to ChrsMark/beats that referenced this pull request Jul 29, 2020
ChrsMark added a commit that referenced this pull request Jul 29, 2020
v1v added a commit to v1v/beats that referenced this pull request Jul 30, 2020
…ne-2.0

* upstream/master: (29 commits)
  Add an explicit system test for processes on unix systems (elastic#20320)
  [Autodiscovery] Ignore ErrInputNotFinished errors in autodiscover config checks (elastic#20305)
  [CI] Update README.md with CI references (elastic#20316)
  Add ECK doc links to Heartbeat docs (elastic#20284)
  [Filebeat] Add export tests to x-pack/filebeat (elastic#20156)
  feat(ci): support building docker images for PRs (elastic#20323)
  Update system tests dependency (elastic#20287)
  [Libbeat] Log debug message if the Kibana dashboard can not be imported from the archive (elastic#12211) (elastic#20150)
  [Filebeat][Gsuite] Transform all dates to timestamp with processor (elastic#20308)
  Infer types in Prometheus remote_write (elastic#19944)
  Remove unnecessary restarts of metricsets while using Node autodiscover (elastic#19974)
  docs: update changelog on master branch (elastic#20259)
  feat(ci): support storing artifacts for PRs in separate dirs (elastic#20282)
  [CI] Change upstream reference (elastic#20296)
  [Filebeat] Updates to Suricata module (elastic#20220)
  [docs] Fix Windows download link for agent (elastic#20258)
  [docs] Rename release highlights to what's new (elastic#20255)
  fix: update the display name of the multibranch job (elastic#20265)
  [Elastic Agent] Add basic protocol to control Elastic Agent. (elastic#20146)
  Cisco ASA: Fix message 106100 (elastic#20245)
  ...
@andresrc andresrc added the test-plan-added This PR has been added to the test plan label Oct 3, 2020
melchiormoulin pushed a commit to melchiormoulin/beats that referenced this pull request Oct 14, 2020
@zube zube bot removed the [zube]: Done label Oct 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

review Team:Platforms Label for the Integrations - Platforms team test-plan Add this PR to be manual test plan test-plan-added This PR has been added to the test plan v7.10.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Infer types in Prometheus remote_write API

5 participants