
[Auditbeat] Cherry-pick #9693 to 6.6: Report process errors#9845

Merged
cwurm merged 1 commit into elastic:6.6 from cwurm:backport_9693_6.6
Jan 4, 2019


@cwurm
Contributor

@cwurm cwurm commented Jan 2, 2019

Cherry-pick of PR #9693 to 6.6 branch. Original message:

So far, the process metricset has been rather strict. If an unexpected error occurred while collecting process information, the whole collection would stop and return an error.

This changes it to keep iterating through processes even when that happens. The unexpected error is stored in the Process object, sent to Elasticsearch, and logged as a warning. This only happens the first time the error is encountered for a process, not on subsequent collection cycles (with a typical collection frequency of 1s, reporting it every cycle would flood the log and ES).

For error documents, it sets event.kind: error and event.action: process_error.

FYI, I have renamed ProcessInfo to Process, not just because it now contains more than types.ProcessInfo, but also to bring it in line with Socket in socket.go. Socket already contains an Error field (and that was the inspiration for this change).

Beware: The diff GitHub shows is misleading in places; it shows replacements/deletions where a few lines have just moved down a bit.

Some additional background on why this change was made can be found in this comment thread on a PR that introduced some error catching during process collection.

If anybody wants to test what happens with errors, run it as non-root and comment out the continue statement in line 375. It will then report errors for processes of other users. At some point, we might want to have a test that simulates an error.

@cwurm cwurm changed the title Cherry-pick #9693 to 6.6: [Auditbeat] Report process errors [Auditbeat] Cherry-pick #9693 to 6.6: Report process errors Jan 2, 2019
@elasticmachine
Contributor

Pinging @elastic/secops

@cwurm cwurm requested review from andrewkroh and webmat and removed request for webmat January 2, 2019 16:14
Changes the process metricset to keep iterating through processes even when an unexpected error occurs. The error will be stored in the Process object and sent to Elasticsearch as well as logged as a warning. This only happens the first time the error is encountered for a process, not on subsequent collection cycles.

(cherry picked from commit 2cd7c42)
@cwurm cwurm force-pushed the backport_9693_6.6 branch from 795acfb to 18df8ae Compare January 3, 2019 16:10
Contributor

@webmat webmat left a comment

No diff vs #9693. LGTM

@cwurm cwurm merged commit 33e0227 into elastic:6.6 Jan 4, 2019
@cwurm cwurm deleted the backport_9693_6.6 branch January 4, 2019 11:06
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
Changes the process metricset to keep iterating through processes even when an unexpected error occurs. The error will be stored in the Process object and sent to Elasticsearch as well as logged as a warning. This only happens the first time the error is encountered for a process, not on subsequent collection cycles.

(cherry picked from commit f7ce3b1)