[metricbeat] - Allow metricsets to report their status via v2 protocol#40400

Merged
VihasMakwana merged 50 commits into elastic:main from VihasMakwana:add-metricbeat-status-reporter
Aug 6, 2024

@VihasMakwana VihasMakwana commented Jul 31, 2024

Proposed commit message

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

Author's Checklist

  • [ ]

How to test this PR locally

  • How to test with agent:
    • Build agentbeat locally from this branch.
    • Download/Build elastic-agent.
    • Put the agentbeat binary in elastic-agent/data/elastic-agent-*/components/
    • Install the agent in unprivileged mode and observe the health.
      • You should see DEGRADED state in a couple of minutes.
      • You should also see non-fatal error fetching PID metrics and permission denied errors.

Related issues

Use cases

Screenshots

[Image: Screenshot 2024-07-18 at 10 43 38 PM]

@VihasMakwana VihasMakwana requested review from AndersonQ and rdner and removed request for leehinman August 1, 2024 16:09
VihasMakwana added a commit to elastic/elastic-agent-system-metrics that referenced this pull request Aug 1, 2024
## What does this PR do?

- Previously, we weren't passing errors to the caller while monitoring
a set of processes.
- With the recent introduction of the status reporter for metricsets, it
is impossible to change the status to degraded if such errors are not
passed to the caller.
- Fix this by passing errors to the caller. We also populate
process-related information on a best-effort basis.
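
A minimal sketch of the idea described above, not the code from this change: a collector keeps whatever it could read for each process and hands the combined errors back to the caller, which can then choose a health state. Every name here (`procInfo`, `fetchOne`, `fetchAll`, `statusFor`) is hypothetical, the `HEALTHY` label is an assumption, and `errors.Join` requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

// procInfo is a hypothetical, trimmed-down per-process record.
type procInfo struct {
	PID  int
	Name string
}

// fetchOne is a hypothetical per-process collector. Odd PIDs fail here
// purely to simulate a process that cannot be read.
func fetchOne(pid int) (procInfo, error) {
	info := procInfo{PID: pid}
	if pid%2 == 1 {
		return info, fmt.Errorf("pid %d: permission denied", pid)
	}
	info.Name = fmt.Sprintf("proc-%d", pid)
	return info, nil
}

// fetchAll keeps whatever could be collected and returns the combined
// errors to the caller instead of swallowing them.
func fetchAll(pids []int) ([]procInfo, error) {
	var (
		out  []procInfo
		errs []error
	)
	for _, pid := range pids {
		info, err := fetchOne(pid)
		if err != nil {
			errs = append(errs, err)
		}
		out = append(out, info) // best effort: keep the partial record
	}
	// errors.Join returns nil when errs is empty.
	return out, errors.Join(errs...)
}

// statusFor maps the returned error onto a health state name.
func statusFor(err error) string {
	if err != nil {
		return "DEGRADED"
	}
	return "HEALTHY"
}

func main() {
	procs, err := fetchAll([]int{2, 3, 4})
	fmt.Println(len(procs), statusFor(err)) // prints "3 DEGRADED"
}
```

Without the returned error, `statusFor` would have nothing to act on, which is the gap the bullets above describe.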

## Checklist

- [x] My code follows the style guidelines of this project
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added an entry in `CHANGELOG.md`

## Manual testing and general information

- See elastic/beats#40400 for testing it on `metricbeat`.

**NOTE**:
- **Only applicable if you're using the `system/process` module.**
- Non-fatal errors are only received when you have insufficient
privileges.

Steps:
- When you receive any error, check its nature:
   - Call `errors.Is(err, NonFatalErr{})` on the received error.
      - If true, the error is non-fatal and you can proceed (metrics will
        be partially available, most probably because of insufficient
        privileges).
      - Else, log the error and stop execution (metrics will be empty).

General info related to the changes in this PR:
- While getting process-related information, you might also receive a
non-nil error.
   - Such errors come in two flavours:
       - Fatal errors:
          - This indicates that the error was fatal (e.g. `no process found`).
          - The caller should stop further execution on receiving a fatal error.
       - Non-fatal errors:
          - This indicates that the error was non-fatal (e.g. `not enough
            privileges`).
          - It means that metrics are partially filled.
          - Further execution can continue if non-fatal errors are encountered.

- Closes #164
@VihasMakwana VihasMakwana added the backport-8.15 Automated backport to the 8.15 branch with mergify label Aug 2, 2024
@jennypavlova jennypavlova left a comment

Stack Monitoring changes LGTM

@pierrehilbert pierrehilbert requested a review from leehinman August 2, 2024 13:39
@VihasMakwana VihasMakwana requested a review from rdner August 6, 2024 07:47
@VihasMakwana VihasMakwana merged commit 2060383 into elastic:main Aug 6, 2024
mergify bot pushed a commit that referenced this pull request Aug 6, 2024
#40400)

* fix: initial commit

* tests: add integration test cases

* fix: expand testing scenarios

* fix: add comments

* fix: move integration tests to system/process

* cleanup

* fix: ci

* fix: ci and typos

* chore: update changelog

* fix: add helper

* fix: remove extra space

* fix: ci

* fix: move integration tests to x-pack

* fix: add null check

* fix: ci

* fix: remove unused code

* fix: lint

* fix: lint and imports

* fix: ci windows

* inting for windows

* fix lint linux

* fix: go imports

* fix: switch to the generic way

* chore: make error descriptive

* fix: move status report after fetch

* fix: typo

* fix: remove nolint

* Squashed commit of the following:

commit 18d38af
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Wed Jul 24 01:23:54 2024 +0530

    fix: add comments

commit 806cda4
Merge: 2e0bd28 b5b67a1
Author: VihasMakwana <121151420+VihasMakwana@users.noreply.github.com>
Date:   Wed Jul 24 01:20:38 2024 +0530

    Merge branch 'main' into metricbeat-process-multierr

commit 2e0bd28
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Wed Jul 24 01:20:14 2024 +0530

    fix: typo

commit 82dc103
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Wed Jul 24 01:19:35 2024 +0530

    fix: typo

commit b5b67a1
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Tue Jul 23 18:13:16 2024 +0000

    build(deps): bump the azure-sdks group with 2 updates (#40310)

    * build(deps): bump the azure-sdks group with 2 updates

    Bumps the azure-sdks group with 2 updates: [github.com/Azure/go-autorest/autorest](https://github.com/Azure/go-autorest) and [github.com/Azure/go-autorest/autorest/adal](https://github.com/Azure/go-autorest).

    Updates `github.com/Azure/go-autorest/autorest` from 0.11.28 to 0.11.29
    - [Release notes](https://github.com/Azure/go-autorest/releases)
    - [Changelog](https://github.com/Azure/go-autorest/blob/main/CHANGELOG.md)
    - [Commits](Azure/go-autorest@autorest/v0.11.28...autorest/v0.11.29)

    Updates `github.com/Azure/go-autorest/autorest/adal` from 0.9.21 to 0.9.22
    - [Release notes](https://github.com/Azure/go-autorest/releases)
    - [Changelog](https://github.com/Azure/go-autorest/blob/main/CHANGELOG.md)
    - [Commits](Azure/go-autorest@autorest/adal/v0.9.21...autorest/adal/v0.9.22)

    ---
    updated-dependencies:
    - dependency-name: github.com/Azure/go-autorest/autorest
      dependency-type: direct:production
      update-type: version-update:semver-patch
      dependency-group: azure-sdks
    - dependency-name: github.com/Azure/go-autorest/autorest/adal
      dependency-type: direct:production
      update-type: version-update:semver-patch
      dependency-group: azure-sdks
    ...

    Signed-off-by: dependabot[bot] <support@github.com>

    * Update NOTICE.txt

    ---------

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

commit 197396f
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Tue Jul 23 13:32:02 2024 -0400

    build(deps): bump the gcp-sdks group with 9 updates (#40311)

    * build(deps): bump the gcp-sdks group with 9 updates

    Bumps the gcp-sdks group with 9 updates:

    | Package | From | To |
    | --- | --- | --- |
    | [cloud.google.com/go/bigquery](https://github.com/googleapis/google-cloud-go) | `1.55.0` | `1.62.0` |
    | [cloud.google.com/go/monitoring](https://github.com/googleapis/google-cloud-go) | `1.16.0` | `1.20.1` |
    | [cloud.google.com/go/pubsub](https://github.com/googleapis/google-cloud-go) | `1.33.0` | `1.40.0` |
    | [cloud.google.com/go/compute](https://github.com/googleapis/google-cloud-go) | `1.23.0` | `1.27.2` |
    | [cloud.google.com/go/redis](https://github.com/googleapis/google-cloud-go) | `1.13.1` | `1.16.2` |
    | [cloud.google.com/go/compute/metadata](https://github.com/googleapis/google-cloud-go) | `0.2.3` | `0.4.0` |
    | [cloud.google.com/go/iam](https://github.com/googleapis/google-cloud-go) | `1.1.2` | `1.1.10` |
    | [cloud.google.com/go/longrunning](https://github.com/googleapis/google-cloud-go) | `0.5.1` | `0.5.9` |
    | [cloud.google.com/go/storage](https://github.com/googleapis/google-cloud-go) | `1.30.1` | `1.42.0` |

    Updates `cloud.google.com/go/bigquery` from 1.55.0 to 1.62.0
    - [Release notes](https://github.com/googleapis/google-cloud-go/releases)
    - [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
    - [Commits](googleapis/google-cloud-go@spanner/v1.55.0...spanner/v1.62.0)

    Updates `cloud.google.com/go/monitoring` from 1.16.0 to 1.20.1
    - [Release notes](https://github.com/googleapis/google-cloud-go/releases)
    - [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/documentai/CHANGES.md)
    - [Commits](googleapis/google-cloud-go@kms/v1.16.0...video/v1.20.1)

    Updates `cloud.google.com/go/pubsub` from 1.33.0 to 1.40.0
    - [Release notes](https://github.com/googleapis/google-cloud-go/releases)
    - [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
    - [Commits](googleapis/google-cloud-go@pubsub/v1.33.0...pubsub/v1.40.0)

    Updates `cloud.google.com/go/compute` from 1.23.0 to 1.27.2
    - [Release notes](https://github.com/googleapis/google-cloud-go/releases)
    - [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/documentai/CHANGES.md)
    - [Commits](googleapis/google-cloud-go@pubsub/v1.23.0...compute/v1.27.2)

    Updates `cloud.google.com/go/redis` from 1.13.1 to 1.16.2
    - [Release notes](https://github.com/googleapis/google-cloud-go/releases)
    - [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
    - [Commits](googleapis/google-cloud-go@asset/v1.13.1...redis/v1.16.2)

    Updates `cloud.google.com/go/compute/metadata` from 0.2.3 to 0.4.0
    - [Release notes](https://github.com/googleapis/google-cloud-go/releases)
    - [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
    - [Commits](googleapis/google-cloud-go@netapp/v0.2.3...v0.4.0)

    Updates `cloud.google.com/go/iam` from 1.1.2 to 1.1.10
    - [Release notes](https://github.com/googleapis/google-cloud-go/releases)
    - [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
    - [Commits](googleapis/google-cloud-go@iam/v1.1.2...iam/v1.1.10)

    Updates `cloud.google.com/go/longrunning` from 0.5.1 to 0.5.9
    - [Release notes](https://github.com/googleapis/google-cloud-go/releases)
    - [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
    - [Commits](googleapis/google-cloud-go@auth/v0.5.1...longrunning/v0.5.9)

    Updates `cloud.google.com/go/storage` from 1.30.1 to 1.42.0
    - [Release notes](https://github.com/googleapis/google-cloud-go/releases)
    - [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/documentai/CHANGES.md)
    - [Commits](googleapis/google-cloud-go@pubsub/v1.30.1...spanner/v1.42.0)

    ---
    updated-dependencies:
    - dependency-name: cloud.google.com/go/bigquery
      dependency-type: direct:production
      update-type: version-update:semver-minor
      dependency-group: gcp-sdks
    - dependency-name: cloud.google.com/go/monitoring
      dependency-type: direct:production
      update-type: version-update:semver-minor
      dependency-group: gcp-sdks
    - dependency-name: cloud.google.com/go/pubsub
      dependency-type: direct:production
      update-type: version-update:semver-minor
      dependency-group: gcp-sdks
    - dependency-name: cloud.google.com/go/compute
      dependency-type: direct:production
      update-type: version-update:semver-minor
      dependency-group: gcp-sdks
    - dependency-name: cloud.google.com/go/redis
      dependency-type: direct:production
      update-type: version-update:semver-minor
      dependency-group: gcp-sdks
    - dependency-name: cloud.google.com/go/compute/metadata
      dependency-type: indirect
      update-type: version-update:semver-minor
      dependency-group: gcp-sdks
    - dependency-name: cloud.google.com/go/iam
      dependency-type: indirect
      update-type: version-update:semver-patch
      dependency-group: gcp-sdks
    - dependency-name: cloud.google.com/go/longrunning
      dependency-type: indirect
      update-type: version-update:semver-patch
      dependency-group: gcp-sdks
    - dependency-name: cloud.google.com/go/storage
      dependency-type: direct:production
      update-type: version-update:semver-minor
      dependency-group: gcp-sdks
    ...

    Signed-off-by: dependabot[bot] <support@github.com>

    * Update NOTICE.txt

    ---------

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

commit 8940f7d
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Tue Jul 23 21:02:27 2024 +0530

    fix: update notice

commit 58bc2ff
Merge: 9433065 dd671a6
Author: VihasMakwana <121151420+VihasMakwana@users.noreply.github.com>
Date:   Tue Jul 23 20:59:16 2024 +0530

    Merge branch 'main' into metricbeat-process-multierr

commit 9433065
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Tue Jul 23 20:57:58 2024 +0530

    chore: update tests

commit c1d4aba
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Tue Jul 23 20:55:45 2024 +0530

    fix: add specifc version metric-system

commit dd671a6
Author: Vinit Chauhan <chauhanvinit23@gmail.com>
Date:   Tue Jul 23 10:20:37 2024 -0400

    filebeat/decode_cef - Add option to ignore empty values (#40268)

    Added option to ignore empty values in the decode_cef processor.

    In the decode_cef processor, when there are empty values in the extensions section, we get errors during log parsing. This change provides a flag in decode_cef config to override this default behavior and ignore the fields with empty value. Some example errors that this helps handle are:

        error in field 'cn1': strconv.ParseInt: parsing "": invalid syntax
        error in field 'destinationTranslatedAddress': value is not a valid IP address

    Closes #40236

commit add7a45
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Tue Jul 23 19:13:54 2024 +0530

    fix: unit test

commit 0293645
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Tue Jul 23 16:31:49 2024 +0530

    fix: remove ioutil

commit e842010
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Tue Jul 23 16:14:01 2024 +0530

    fix: update notice

commit 246d730
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Tue Jul 23 16:13:15 2024 +0530

    fix: add license, remove uuid5

commit ac01831
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Tue Jul 23 15:03:08 2024 +0530

    update: go.mod

commit 42101c8
Merge: 091fff8 7263696
Author: VihasMakwana <121151420+VihasMakwana@users.noreply.github.com>
Date:   Tue Jul 23 15:02:20 2024 +0530

    Merge branch 'main' into metricbeat-process-multierr

commit 091fff8
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Tue Jul 23 14:58:51 2024 +0530

    fix: test

commit fd6d312
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Tue Jul 23 14:57:13 2024 +0530

    fix: update go.mod, update uuid and metrics version

commit 7263696
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Jul 22 19:32:38 2024 +0000

    build(deps): bump github.com/elastic/elastic-agent-libs from 0.9.13 to 0.9.15 (#40300)

    * build(deps): bump github.com/elastic/elastic-agent-libs

    Bumps [github.com/elastic/elastic-agent-libs](https://github.com/elastic/elastic-agent-libs) from 0.9.13 to 0.9.15.
    - [Release notes](https://github.com/elastic/elastic-agent-libs/releases)
    - [Commits](elastic/elastic-agent-libs@v0.9.13...v0.9.15)

    ---
    updated-dependencies:
    - dependency-name: github.com/elastic/elastic-agent-libs
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <support@github.com>

    * Update NOTICE.txt

    ---------

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

commit e3d8f3b
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Jul 22 13:44:51 2024 -0400

    build(deps): bump github.com/elastic/elastic-agent-client/v7 from 7.14.0 to 7.15.0 (#40304)

    * build(deps): bump github.com/elastic/elastic-agent-client/v7

    Bumps [github.com/elastic/elastic-agent-client/v7](https://github.com/elastic/elastic-agent-client) from 7.14.0 to 7.15.0.
    - [Release notes](https://github.com/elastic/elastic-agent-client/releases)
    - [Commits](elastic/elastic-agent-client@v7.14.0...v7.15.0)

    ---
    updated-dependencies:
    - dependency-name: github.com/elastic/elastic-agent-client/v7
      dependency-type: direct:production
      update-type: version-update:semver-minor
    ...

    Signed-off-by: dependabot[bot] <support@github.com>

    * Update NOTICE.txt

    ---------

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

commit 3e95d53
Author: Maurizio Branca <maurizio.branca@elastic.co>
Date:   Mon Jul 22 18:24:59 2024 +0200

    Add CSP SDKs to the `allow` list (#40150)

commit f3f772f
Author: VihasMakwana <121151420+VihasMakwana@users.noreply.github.com>
Date:   Fri Jul 19 17:52:19 2024 +0530

    [filebeat][log] Enable status reporter for log input (#40075)

    * chore: initial commit, without tests

    * chore: tests

    * chore: add test cases

    * fix: add null check

    * fix: remove println

    * fix: lint

    * goimports

    * remove println

    * fix: changelog

    * update test for windows

    * fix: fix some comments

    * chore: add starting state in NewInput

    * fix: add sample output to verify the status

    * fix: remove println

    * fix: add integration tag

    * Update CHANGELOG.next.asciidoc

    Co-authored-by: Denis <denis@rdner.de>

    * fix: remove redundant bool

    * fix: add degraded

    ---------

    Co-authored-by: Pierre HILBERT <pierre.hilbert@elastic.co>
    Co-authored-by: Denis <denis@rdner.de>

commit 463bbb4
Author: Dan Kortschak <dan.kortschak@elastic.co>
Date:   Fri Jul 19 06:32:29 2024 +0930

    x-pack/filebeat/input/websocket: do minor clean-up in main loop (#40145)

    * remove unneeded goroutine
    * fix logging: The body was previously not being logged since an io.ReadCloser
      is not a JSON-serialisable type.

commit 908553d
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Thu Jul 18 19:04:52 2024 +0530

    chore: rename function

commit 51a7854
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Thu Jul 18 18:42:00 2024 +0530

    chore: update process summary

commit 21b102b
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Thu Jul 18 16:44:19 2024 +0530

    chore: add degradable error

commit 942f8c7
Author: Alejandro Fernández Haro <alejandro.haro@elastic.co>
Date:   Wed Jul 17 20:52:14 2024 +0200

    [Metricbeat/kibana/status] Add support for v8format (#40275)

commit 1bfcecb
Author: Vihas Makwana <vihas.makwana@elastic.co>
Date:   Wed Jul 17 23:31:10 2024 +0530

    fix: multierror support

* fix: nits and comments

* fix: fix notice, and test

* fix notice

* fix notice

* fix: lint

* fix: nits

* fix: update notice, go.mod

* fix: update notice, go.mod to v0.11.0

* temp

* fix: use ErrorIs

* fix: use ErrorIsf

---------

Co-authored-by: Pierre HILBERT <pierre.hilbert@elastic.co>
(cherry picked from commit 2060383)

# Conflicts:
#	NOTICE.txt
#	go.mod
#	go.sum

Labels

  • backport-8.15: Automated backport to the 8.15 branch with mergify
  • Team:Elastic-Agent-Data-Plane: Label for the Agent Data Plane team
