
Cherry-pick #21113 to 7.x: libbeat/cmd/instance: report cgroup stats#21334

Merged
axw merged 2 commits into elastic:7.x from axw:backport_21113_7.x
Sep 30, 2020

axw (Member) commented Sep 26, 2020

Cherry-pick of PR #21113 to 7.x branch. Original message:

What does this PR do?

Report cgroup limits/stats on Linux, similar to what Elasticsearch reports through node stats: https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html

Metric names are based on (but not exactly the same as) the system.process.cgroup.* fields.

Why is it important?

This is important for reporting accurate resource usage in containerised environments.

Checklist

  • My code follows the style guidelines of this project
    - [ ] I have commented my code, particularly in hard-to-understand areas
    - [ ] I have made corresponding changes to the documentation
    - [ ] I have made corresponding changes to the default configuration files
    - [ ] I have added tests that prove my fix is effective or that my feature works
    - [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

I couldn't see docs or tests to update - please point me to them if there are any.

How to test this PR locally

  1. Build and run a beat on Linux, with monitoring enabled
  2. Observe that cgroup metrics are reported
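
One way to check step 2 programmatically is to pull the cgroup section out of a stats document. The sketch below assumes the stats are available as JSON and that the new metrics sit under a `beat.cgroup` object; both the key layout and the sample document are assumptions for illustration, and `cgroupSection` is a hypothetical helper, not part of libbeat.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cgroupSection returns the "beat.cgroup" object from a JSON stats document.
// The boolean is false when the document has no such section, which would be
// expected on non-Linux hosts. The key names are assumptions.
func cgroupSection(doc []byte) (map[string]interface{}, bool, error) {
	var stats map[string]interface{}
	if err := json.Unmarshal(doc, &stats); err != nil {
		return nil, false, err
	}
	beat, ok := stats["beat"].(map[string]interface{})
	if !ok {
		return nil, false, nil
	}
	cg, ok := beat["cgroup"].(map[string]interface{})
	return cg, ok, nil
}

func main() {
	// Hypothetical, heavily trimmed stats document.
	doc := []byte(`{"beat":{"cgroup":{"memory":{"mem":{"limit":{"bytes":536870912}}}}}}`)
	cg, found, err := cgroupSection(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println("cgroup metrics reported:", found)
	fmt.Println(cg)
}
```

In practice the document would come from whatever monitoring output the Beat is configured with, rather than from an inline literal.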

Related issues

Closes #14691

* libbeat/cmd/instance: report cgroup stats

(cherry picked from commit b4c7a93)
axw added the [zube]: In Review, backport, and Team:Integrations labels Sep 26, 2020
botelastic bot added the needs_team label Sep 26, 2020
@elasticmachine
Contributor

Pinging @elastic/integrations (Team:Integrations)

botelastic bot removed the needs_team label Sep 26, 2020
elasticmachine (Contributor) commented Sep 26, 2020

💔 Tests Failed


Build stats

  • Build Cause: [Pull request #21334 updated]

  • Start Time: 2020-09-29T01:16:40.843+0000

  • Duration: 73 min 39 sec

Test stats 🧪

Test Results
  • Failed: 6
  • Passed: 15556
  • Skipped: 1293
  • Total: 16855

Test errors

  • Name: Build&Test / x-pack/elastic-agent-build / TestFleetGateway – application

    • Age: 1
    • Duration: 0
    • Error Details: Failed
  • Name: Build&Test / x-pack/elastic-agent-build / TestFleetGateway/send_no_event_and_receive_no_action – application

    • Age: 1
    • Duration: 0
    • Error Details: Failed
  • Name: Build&Test / x-pack/elastic-agent-build / TestFleetGateway/Successfully_connects_and_receives_a_series_of_actions – application

    • Age: 1
    • Duration: 0
    • Error Details: Failed
  • Name: Build&Test / x-pack/elastic-agent-build / TestFleetGateway/Periodically_communicates_with_Fleet – application

    • Age: 1
    • Duration: 0
    • Error Details: Failed
  • Name: Build&Test / libbeat-build / TestOutputReload – pipeline

    • Age: 1
    • Duration: 67.1
    • Error Details: Failed
  • Name: Build&Test / libbeat-build / TestOutputReload/client – pipeline

    • Age: 1
    • Duration: 36.18
    • Error Details: Failed

Steps errors

  • Name: mage build test
    • Description: mage build test
    • Duration: 21 min 13 sec
    • Start Time: 2020-09-29T01:51:18.347+0000
  • Name: Notifies GitHub of the status of a Pull Request
    • Description: script returned exit code 1
    • Duration: 0 min 1 sec
    • Start Time: 2020-09-29T02:12:45.103+0000
  • Name: mage build test
    • Description: mage build test
    • Duration: 11 min 4 sec
    • Start Time: 2020-09-29T01:51:58.498+0000
  • Name: Notifies GitHub of the status of a Pull Request
    • Description: FAILURE
    • Duration: 0 min 1 sec
    • Start Time: 2020-09-29T02:03:11.660+0000
  • Name: Terraform Apply on x-pack/metricbeat/module/aws
    • Description: (empty)
    • Duration: 0 min 2 sec
    • Start Time: 2020-09-29T01:52:03.715+0000
  • Name: Terraform Apply on x-pack/metricbeat/module/aws
    • Description: (empty)
    • Duration: 0 min 2 sec
    • Start Time: 2020-09-29T01:52:11.926+0000

Log output (last 100 lines)

[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-apache.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-kibana.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-linux.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-ceph.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-envoyproxy.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-logstash.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-elasticsearch.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-traefik.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-mysql.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-prometheus.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-graphite.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-memcached.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-couchdb.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-system.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-kvm.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-vsphere.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-http.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-haproxy.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-consul.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-zookeeper.xml
[2020-09-29T02:28:55.134Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-redis.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-couchbase.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-kafka.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-nginx.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-aerospike.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-nats.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-windows.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-postgresql.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-munin.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-uwsgi.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-etcd.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-docker.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-beat.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-mongodb.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-golang.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-rabbitmq.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-jolokia.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/x-pack-auditbeat-build/x-pack/auditbeat/build/TEST-go-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/x-pack-auditbeat-build/x-pack/auditbeat/build/TEST-python-integration.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/x-pack-auditbeat-build/x-pack/auditbeat/build/TEST-python-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/x-pack-auditbeat-build/x-pack/auditbeat/build/TEST-go-integration.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/x-pack-filebeat-windows-windows-2019/x-pack/filebeat/build/TEST-go-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/x-pack-filebeat-windows-windows-2019/x-pack/filebeat/build/TEST-python-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/filebeat-windows-windows-2019/filebeat/build/TEST-go-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/filebeat-windows-windows-2019/filebeat/build/TEST-python-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/auditbeat-windows-windows-2019/auditbeat/build/TEST-go-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/auditbeat-windows-windows-2019/auditbeat/build/TEST-python-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/auditbeat-build/auditbeat/build/TEST-go-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/auditbeat-build/auditbeat/build/TEST-python-integration.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/auditbeat-build/auditbeat/build/TEST-python-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/auditbeat-build/auditbeat/build/TEST-go-integration.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/packetbeat-build/packetbeat/build/TEST-go-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/packetbeat-build/packetbeat/build/TEST-python-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/x-pack-functionbeat-build/x-pack/functionbeat/build/TEST-go-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/x-pack-functionbeat-build/x-pack/functionbeat/build/TEST-python-unit.xml
[2020-09-29T02:28:55.135Z] ./src/github.com/elastic/beats/metricbeat-pythonIntegTest/metricbeat/build/TEST-python-integration.xml
[2020-09-29T02:28:55.135Z] + cat
[2020-09-29T02:28:55.135Z] + /usr/local/bin/runbld ./runbld-script --job-name elastic+beats+pull-request
[2020-09-29T02:28:55.135Z] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
[2020-09-29T02:29:01.754Z] runbld>>> runbld started
[2020-09-29T02:29:01.754Z] runbld>>> 1.6.12/f45d832f2ba0aa2722ab4ec1fda8ad140f027f8b
[2020-09-29T02:29:04.323Z] runbld>>> The following profiles matched the job 'elastic+beats+pull-request' in order of occurrence in the config (last value wins).
[2020-09-29T02:29:04.323Z] runbld>>> Matches in the system config:
[2020-09-29T02:29:04.323Z] runbld>>> - Matched ^elastic\+beats
[2020-09-29T02:29:04.323Z] runbld>>> - Matched ^elastic\+beats\+pull-request
[2020-09-29T02:29:05.279Z] runbld>>> Debug logging enabled.
[2020-09-29T02:29:05.279Z] runbld>>> Storing result
[2020-09-29T02:29:05.544Z] runbld>>> Store result: created {:total 2, :successful 2, :failed 0} 1
[2020-09-29T02:29:05.544Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1597739501209/t/20200929022905-4107745D
[2020-09-29T02:29:05.544Z] runbld>>> Adding system facts.
[2020-09-29T02:29:06.947Z] runbld>>> Adding vcs info for the latest commit:  e65a4a28ca166507ed1ecb87c97bdc7b8af0f474
[2020-09-29T02:29:06.947Z] runbld>>> >>>>>>>>>>>> SCRIPT EXECUTION BEGIN >>>>>>>>>>>>
[2020-09-29T02:29:06.947Z] runbld>>> Adding /usr/lib/jvm/java-8-openjdk-amd64/bin to the path.
[2020-09-29T02:29:06.947Z] Processing JUnit reports with runbld...
[2020-09-29T02:29:06.947Z] + echo 'Processing JUnit reports with runbld...'
[2020-09-29T02:29:07.211Z] runbld>>> <<<<<<<<<<<< SCRIPT EXECUTION END <<<<<<<<<<<<
[2020-09-29T02:29:07.211Z] runbld>>> DURATION: 26ms
[2020-09-29T02:29:07.211Z] runbld>>> STDOUT: 40 bytes
[2020-09-29T02:29:07.211Z] runbld>>> STDERR: 49 bytes
[2020-09-29T02:29:07.211Z] runbld>>> WRAPPED PROCESS: SUCCESS (0)
[2020-09-29T02:29:07.211Z] runbld>>> Searching for build metadata in /var/lib/jenkins/workspace/Beats_beats_PR-21334
[2020-09-29T02:29:08.165Z] runbld>>> Storing build metadata: 
[2020-09-29T02:29:08.165Z] runbld>>> Adding test report.
[2020-09-29T02:29:08.165Z] runbld>>> Searching for junit test output files with the pattern: TEST-.*\.xml$ in: /var/lib/jenkins/workspace/Beats_beats_PR-21334/src/github.com/elastic/beats
[2020-09-29T02:29:09.119Z] runbld>>> Found 99 test output files
[2020-09-29T02:29:11.050Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-21334/src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-graphite.xml
[2020-09-29T02:29:11.050Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-21334/src/github.com/elastic/beats/metricbeat-goIntegTest/metricbeat/build/TEST-go-integration-windows.xml
[2020-09-29T02:29:11.627Z] runbld>>> Test output logs contained: Errors: 0 Failures: 6 Tests: 16855 Skipped: 1061
[2020-09-29T02:29:11.891Z] runbld>>> Storing result
[2020-09-29T02:29:11.891Z] runbld>>> FAILURES: 6
[2020-09-29T02:29:13.291Z] runbld>>> Store result: updated {:total 2, :successful 2, :failed 0} 2
[2020-09-29T02:29:13.291Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1597739501209/t/20200929022905-4107745D
[2020-09-29T02:29:13.291Z] runbld>>> Email notification disabled by environment variable.
[2020-09-29T02:29:13.291Z] runbld>>> Slack notification disabled by environment variable.
[2020-09-29T02:29:19.346Z] Running on Jenkins in /var/lib/jenkins/workspace/Beats_beats_PR-21334
[2020-09-29T02:29:19.566Z] [INFO] getVaultSecret: Getting secrets
[2020-09-29T02:29:19.638Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-09-29T02:29:20.236Z] + chmod 755 generate-build-data.sh
[2020-09-29T02:29:20.236Z] + ./generate-build-data.sh https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-21334/ https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-21334/runs/2 FAILURE 4359127
[2020-09-29T02:29:20.236Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-21334/runs/2/steps/?limit=10000 -o steps-info.json

fearful-symmetry (Contributor) left a comment

LGTM.

adriansr (Contributor) left a comment

The changes in this PR have been causing a crash on master for some days. I sent a fix in #21355; you may want to add the extra check to this backport.

Gosigar's cgroups GetStatsForProcesses can return a nil Stats pointer
and no error when the ["blkio", "cpu", "cpuacct", "memory"] subsystems
are on the root cgroup.

Related elastic#21113
axw (Member, Author) commented Sep 29, 2020

Thanks @adriansr, I've cherry-picked your commit.

axw requested a review from adriansr September 29, 2020 01:15
axw merged commit 8f4d194 into elastic:7.x Sep 30, 2020
axw deleted the backport_21113_7.x branch September 30, 2020 02:35
axw added a commit to axw/apm-server that referenced this pull request Sep 30, 2020
axw added a commit to elastic/apm-server that referenced this pull request Oct 2, 2020
zube bot removed the [zube]: Done label Dec 29, 2020
Labels

backport, Team:Integrations

4 participants