[Windows] Add metric_type mapping for the fields of `service` datastream. by ritalwar · Pull Request #7200 · elastic/integrations

ritalwar · 2023-08-01T05:00:07Z

Enhancement

What does this PR do?

This PR adds `metric_type` mappings for the fields of the `service` datastream.

Checklist

I have reviewed tips for building integrations and this pull request is aligned with them.
I have verified that all data streams collect metrics or logs.
I have added an entry to my package's changelog.yml file.
I have verified that Kibana version constraints are current according to guidelines.
Relates to Windows TSDB Enablement #6993

Screenshots

Refer: #6993

elasticmachine · 2023-08-01T09:20:34Z

💚 Build Succeeded


Build stats

Start Time: 2023-08-28T11:35:51.121+0000
Duration: 20 min 47 sec

Test stats 🧪

Test	Results
Failed	0
Passed	150
Skipped	0
Total	150

🤖 GitHub comments


To re-run your PR in the CI, just comment with:

/test : Re-trigger the build.

elasticmachine · 2023-08-01T09:20:45Z

🌐 Coverage report

Name	Metrics % (`covered/total`)	Diff
Packages	100.0% (`8/8`)	💚
Files	91.667% (`11/12`)	👍
Classes	91.667% (`11/12`)	👍
Methods	85.156% (`109/128`)	👍
Lines	92.459% (`5726/6193`)	👍
Conditionals	100.0% (`0/0`)	💚

agithomas · 2023-08-01T10:19:19Z

packages/windows/data_stream/service/fields/fields.yml

    - name: uptime.ms
      type: long
      format: duration
+      metric_type: gauge


Whether uptime should be a gauge or a counter is a common question. LGTM!

Since the uptime is not cumulative and continuously increases without any resets, it is more appropriate to represent it as a gauge.

> Since the uptime is not cumulative and continuously increases without any resets

That sounds like a counter, right? @felixbarny any thoughts?

It does indeed sound like a counter if the following assumptions are true:

The value is monotonically incrementing over time

It resets when the service restarts

However, it's somewhat different from other counters in that it wouldn't make sense to visualize the rate of that counter, because the increase always equals the elapsed time: in a 60 s interval, an uptime measured in seconds increases by 60, so the rate is just a flat line. But it still seems like a counter.
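To illustrate the flat-line point, here is a minimal sketch that computes a per-second rate between consecutive samples. The sample values, the 60-second scrape interval, and the `per_second_rate` helper are all hypothetical and are not taken from the integration; the request counter is only there for contrast.

```python
def per_second_rate(samples):
    """Per-second rate of change between consecutive (timestamp_s, value) samples."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rates.append((v1 - v0) / (t1 - t0))
    return rates


# Hypothetical uptime samples in milliseconds, scraped every 60 seconds.
uptime_ms = [(t, 5_000 + t * 1000) for t in range(0, 301, 60)]

# Hypothetical request counter sampled at the same timestamps, for contrast.
requests_total = [(0, 10), (60, 130), (120, 160), (180, 520), (240, 530), (300, 900)]

uptime_rates = per_second_rate(uptime_ms)
request_rates = per_second_rate(requests_total)

print(uptime_rates)   # [1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
print(request_rates)  # varies from interval to interval
```

The uptime rate is always 1000 ms of uptime per second of wall-clock time, so plotting it carries no information, whereas the request rate changes with load.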

Are we visualizing the uptime in any way? If so, how?

I had a look at what is used across various packages for assigning metric_type for uptime metrics. The distribution goes as below.

Gauge: Redis, GCP Redis, GCP Compute, HA Proxy, Memcached, Influxdb, Elasticsearch (JVM max uptime), Mongodb, Couchbase

Counter: Apache, Elastic Package Registry, System (uptime datastream), AWS (RDS)

So we may have a lack of consistency here. However, since the uptime datastream of the System package already uses counter as its metric_type, and in the absence of a clear source of truth, uptime.ms of this (service) datastream can be assigned the counter type. This ensures consistency within the same package.

Isn't that the definition of monotonically increasing?

Yes, I just wanted to rule out any confusion associated with continuous or monotonically increasing increments, making it a "counter."

It resets as it restarts, right?

Yes.

Does it really matter whether we define it as a counter or a gauge in this specific scenario? Leaving aside the monotonicity property of a counter, I think the question to ask when deciding whether a metric is a counter or a gauge is, for instance: does it make sense to calculate a sum (or average, or ...) aggregate over that metric, or do we first need to calculate a rate and then aggregate?

Also imagine using the uptime in a computation. For instance, you divide a quantity by the uptime to get some kind of rate over time, such as the average number of bytes processed by a host in a certain (up)time window. In that case you would need to use the difference between two values of the uptime (at t1 and t2); dividing by just t1 or t2 does not make sense, right? For a gauge you would need to sum all values between t1 and t2 to account for possible negative values, which means that a measure as a point-in-time value does not make sense. So, in my opinion, uptime is a counter.
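The computation described above can be sketched as follows. This is only an illustration: the `average_throughput` helper, its parameter names, and the sample numbers are hypothetical, and treating a non-positive uptime delta as a restart is an assumption of this sketch.

```python
def average_throughput(bytes_t1, bytes_t2, uptime_ms_t1, uptime_ms_t2):
    """Average bytes per second between two samples, using the uptime delta as the time base."""
    elapsed_ms = uptime_ms_t2 - uptime_ms_t1
    if elapsed_ms <= 0:
        # Assumption: uptime going backwards means the service restarted between samples.
        raise ValueError("uptime did not advance between t1 and t2")
    return (bytes_t2 - bytes_t1) / (elapsed_ms / 1000.0)


# Hypothetical samples: up for 1 hour at t1 and for 1 hour 10 minutes at t2.
rate = average_throughput(
    bytes_t1=4_000_000,
    bytes_t2=10_000_000,
    uptime_ms_t1=3_600_000,
    uptime_ms_t2=4_200_000,
)

# Dividing by the raw uptime at t2 instead of the delta gives a misleading number.
naive = (10_000_000 - 4_000_000) / (4_200_000 / 1000.0)

print(rate)   # 10000.0
print(naive)  # much smaller, because it divides by the whole uptime
```

Only the difference between the two uptime readings matches the window in which the bytes were processed, which is the counter-style usage the comment argues for.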

I'm +1 for counter.

There are two aspects to a counter:

Monotonically increasing between resets - true for uptime

Discrete - theoretically that's true for classical counter use cases (e.g. number of requests for a webpage) and false for uptime. In practice everything is discrete in our current computing systems, so it doesn't matter (we always count seconds or ms or some unit of time). The problem is that it is not intuitive to users, because they think of time as continuous and not discrete. On the plus side, they will get the right visualization and behavior, because practically this data is exactly like counter data. I think the upside outweighs the downside in this case.
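As a rough illustration of "monotonically increasing between resets", the sketch below accumulates the increase of a series and counts a reset whenever the value drops. The `counter_increase` helper and the sample values are hypothetical, and the convention of counting the first post-reset value as an increase from zero is an assumption of this sketch, not a description of how Elasticsearch or Kibana handle counters.

```python
def counter_increase(values):
    """Return (total increase, number of resets) for a counter-like series.

    Assumption: a drop in value marks a reset, and the first value after a
    reset is counted as an increase from zero.
    """
    increase = 0
    resets = 0
    for prev, cur in zip(values, values[1:]):
        if cur < prev:
            resets += 1
            increase += cur
        else:
            increase += cur - prev
    return increase, resets


# Hypothetical uptime readings in ms; the service restarts before the 4th sample.
uptime_ms = [1_000, 61_000, 121_000, 2_000, 62_000]

increase, resets = counter_increase(uptime_ms)
print(increase, resets)  # 182000 1
```

Under this convention the uptime series behaves like any other counter: it never decreases except at a restart, and the restart is detectable from the data alone.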

Following offline discussions, we decided to represent "uptime" metrics as a gauge. This choice came out of discussing which metric type fits best: we realized that using a counter for uptime could make calculating changes over time difficult. Considering "temporality," which is about whether metrics are reported as cumulative or delta values, we agreed that uptime, being a distinct metric, should be a gauge. This fits well with the immediate and non-negative nature of uptime values.

agithomas

LGTM!

packages/windows/changelog.yml

Dismissing as the PR link is not correct.

agithomas

LGTM!

elasticmachine · 2023-08-28T12:50:10Z

Package windows - 1.34.1 containing this change is available at https://epr.elastic.co/search?package=windows

[Windows] Add metric_type mapping for the fields of datastream.

bb06c9e

ritalwar requested review from a team as code owners August 1, 2023 05:00

ritalwar requested review from agithomas, belimawr, fearful-symmetry and lalit-satapathy August 1, 2023 05:00

ritalwar mentioned this pull request Aug 1, 2023

Windows TSDB Enablement #6993

Closed

6 tasks

Merge branch 'main' into windows_tsdb_metrictype_service_6993

079cf67

agithomas reviewed Aug 1, 2023


agithomas previously approved these changes Aug 1, 2023


agithomas reviewed Aug 1, 2023


packages/windows/changelog.yml (outdated)

Update changelog.yml

f098521

agithomas approved these changes Aug 1, 2023


Merge branch 'main' into windows_tsdb_metrictype_service_6993

9637469

andrewkroh added the Integration:windows Windows label Aug 1, 2023

Update metric_type to counter for uptime.ms

a6f9f15

belimawr approved these changes Aug 4, 2023


ritalwar and others added 3 commits August 17, 2023 11:42

Merge branch 'main' into windows_tsdb_metrictype_service_6993

dfedef9

Update metric_type value for uptime.

b0ae485

Merge branch 'main' into windows_tsdb_metrictype_service_6993

04e70b8

harnish-crest-data approved these changes Aug 22, 2023


ritalwar mentioned this pull request Aug 25, 2023

Changing uptime metric_type from counter to gauge across all packages for consistent mapping #7538

Closed

ritalwar and others added 2 commits August 28, 2023 15:46

Merge branch 'main' into windows_tsdb_metrictype_service_6993

ebf5218

Update README.md

28914fc

ritalwar merged commit fecc5c6 into elastic:main Aug 28, 2023
