Update kubernetes apiserver metrics and dashboard #31973

Merged
MichaelKatsoulis merged 9 commits into elastic:main from MichaelKatsoulis:revisit-apiserver-metrics
Jun 22, 2022

Contributor

@MichaelKatsoulis MichaelKatsoulis commented Jun 17, 2022

What does this PR do?

Update kubernetes apiserver metrics and dashboard

Why is it important?

As described in #31834, the apiserver metricset collects deprecated metrics from the apiserver Prometheus endpoint. These are:

  • http_request_duration_microseconds
  • http_request_size_bytes
  • http_response_size_bytes
  • http_requests_total
  • apiserver_request_latencies
  • etcd_object_counts
  • client

Some of these values can instead be taken from different Prometheus metrics:

  • http_request_duration_microseconds ---> None
  • http_response_size_bytes ---> None
  • http_requests_total ---> None
  • apiserver_request_latencies ---> apiserver_request_duration_seconds
  • http_request_size_bytes ---> None
  • etcd_object_counts ---> apiserver_storage_objects
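The mapping above can be expressed as a small lookup table. The following is only an illustrative sketch: `replacements` and `replacementFor` are hypothetical names, not part of the Beats codebase, and `example_metric` is a made-up name used to show the "not deprecated" case.

```go
package main

import "fmt"

// replacements maps each deprecated apiserver Prometheus metric listed in
// this PR to its replacement. An empty string stands for "None", meaning
// the PR names no replacement for that metric.
var replacements = map[string]string{
	"http_request_duration_microseconds": "",
	"http_response_size_bytes":           "",
	"http_requests_total":                "",
	"http_request_size_bytes":            "",
	"apiserver_request_latencies":        "apiserver_request_duration_seconds",
	"etcd_object_counts":                 "apiserver_storage_objects",
}

// replacementFor is a hypothetical helper. It reports whether metric is in
// the deprecated list and, if so, which metric replaces it (possibly none).
func replacementFor(metric string) (replacement string, deprecated bool) {
	replacement, deprecated = replacements[metric]
	return replacement, deprecated
}

func main() {
	// "example_metric" is a placeholder for any metric outside the list.
	for _, m := range []string{"etcd_object_counts", "http_requests_total", "example_metric"} {
		r, deprecated := replacementFor(m)
		switch {
		case !deprecated:
			fmt.Printf("%s: not in the deprecated list\n", m)
		case r == "":
			fmt.Printf("%s: deprecated, no replacement\n", m)
		default:
			fmt.Printf("%s: deprecated, use %s\n", m, r)
		}
	}
}
```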

Also, apiserver_watch_events_sizes and apiserver_response_sizes are interesting metrics that we were not collecting.

As part of this PR, the following Elasticsearch fields have been dropped (they were null in the latest Kubernetes versions, after 1.20):

  • kubernetes.apiserver.http.*
  • kubernetes.apiserver.request.latency.*
  • kubernetes.apiserver.request.client

and new fields have been added:

  • kubernetes.apiserver.watch.events.kind
  • kubernetes.apiserver.watch.events.size.bytes.*
  • kubernetes.apiserver.response.size.bytes.*

Also, the OOTB dashboards were broken because they were using deprecated fields. They have been updated.

apiserver

Checklist

  • My code follows the style guidelines of this project

  • I have commented my code, particularly in hard-to-understand areas

  • I have made corresponding changes to the documentation

  • I have made corresponding changes to the default configuration files

  • I have added tests that prove my fix is effective or that my feature works

  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

  • Closes Kubernetes Apiserver metricset collects deprecated metrics #31834

How to test this PR locally

Note

Bullet 3 of #31834 (comment), storing Prometheus histogram types as histograms in Elasticsearch, will be part of a follow-up PR, as the implementation needs further investigation. It requires a more generic solution and is not specific to apiserver.

Also, removing the following Elasticsearch fields as part of this PR should not be considered a breaking change but a bug fix, as those fields were no longer being populated in ES.

  • kubernetes.apiserver.http.*
  • kubernetes.apiserver.request.latency.*
  • kubernetes.apiserver.request.client
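For anyone auditing custom dashboards or saved queries against the removed fields, a check along these lines may help. It is a minimal sketch under stated assumptions: `droppedFields` and `isDropped` are hypothetical names, the trailing `.*` is treated as a simple prefix wildcard, the client field is written as `kubernetes.apiserver.request.client` following the field docs line quoted in the review, and the field names passed in `main` are illustrative only.

```go
package main

import (
	"fmt"
	"strings"
)

// droppedFields lists the Elasticsearch field patterns removed by this PR.
// A trailing ".*" is interpreted here as "any field under this prefix".
var droppedFields = []string{
	"kubernetes.apiserver.http.*",
	"kubernetes.apiserver.request.latency.*",
	"kubernetes.apiserver.request.client",
}

// isDropped is a hypothetical helper that reports whether field matches one
// of the dropped patterns, either exactly or through a ".*" prefix wildcard.
func isDropped(field string) bool {
	for _, pattern := range droppedFields {
		if strings.HasSuffix(pattern, ".*") {
			// Keep the trailing dot so only child fields match.
			prefix := strings.TrimSuffix(pattern, "*")
			if strings.HasPrefix(field, prefix) {
				return true
			}
		} else if field == pattern {
			return true
		}
	}
	return false
}

func main() {
	// Illustrative field names; the second one falls under a pattern added
	// by this PR (kubernetes.apiserver.response.size.bytes.*).
	for _, f := range []string{
		"kubernetes.apiserver.http.request.count",
		"kubernetes.apiserver.response.size.bytes.sum",
	} {
		fmt.Printf("%s dropped=%t\n", f, isDropped(f))
	}
}
```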

@MichaelKatsoulis MichaelKatsoulis requested a review from a team as a code owner June 17, 2022 12:03
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jun 17, 2022
@MichaelKatsoulis MichaelKatsoulis requested a review from a team June 17, 2022 12:04
@MichaelKatsoulis MichaelKatsoulis marked this pull request as draft June 17, 2022 12:04
@MichaelKatsoulis MichaelKatsoulis added the Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team label Jun 17, 2022
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jun 17, 2022
@MichaelKatsoulis MichaelKatsoulis removed the request for review from a team June 17, 2022 12:06
Contributor

elasticmachine commented Jun 17, 2022

💚 Build Succeeded

Build stats

  • Start Time: 2022-06-21T09:44:55.569+0000

  • Duration: 58 min 10 sec

Test stats 🧪

Test Results
Failed 0
Passed 3526
Skipped 873
Total 4399

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@MichaelKatsoulis MichaelKatsoulis marked this pull request as ready for review June 20, 2022 11:53
Member

@ChrsMark ChrsMark left a comment

Looks good! It's good to see such cleanups in the codebase!

- make `system/filesystem` code sensitive to `hostfs` and migrate libraries to `elastic-agent-opts` {pull}31001[31001]
- Fix kubernetes module's internal cache expiration issue. This avoid metrics like `kubernetes.container.cpu.usage.limit.pct` from not being populated. {pull}31785[31785]
- add missing HealthyHostCount and UnHealthyHostCount for application ELB. {pull}31853[31853]
- update kubernetes apiserver metricset to not collect deprecated metrics and fix dashboard
Member

PR number?


rcPost14 := false
for _, event := range events {
if ok, _ := event.HasKey("request.count"); ok {
Member

That's a really nice cleanup! 🍻

Contributor Author

That was ugly




*`kubernetes.apiserver.request.client`*::
Member

Add a note in the PR's description that it's not a breaking change but a bug fix, to avoid future confusion.

@MichaelKatsoulis MichaelKatsoulis merged commit b0bbd16 into elastic:main Jun 22, 2022
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
* Update kubernetes apiserver metrics and dashboard
Labels

bug Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team
