[Auditbeat] Fix up socket dataset runaway CPU usage #19764
Merged
andrewstucki merged 3 commits into elastic:master on Jul 9, 2020
Contributor
Pinging @elastic/siem (Team:SIEM)
Contributor
Great investigation @andrewstucki, thanks for fixing this! It explains the profiles shared in the discuss thread, with onSockDestroyed() taking most of the CPU while only doing a map lookup.
adriansr approved these changes on Jul 9, 2020
LGTM, needs a changelog entry
added 2 commits on July 9, 2020 07:14
Author
Just added on to your previous changelog entry and rebased from master to hopefully fix the weird CI mage issues.
Contributor
I've shared a custom 7.8.1 build with this fix on the discuss thread.
andrewstucki pushed a commit to andrewstucki/beats that referenced this pull request on Jul 9, 2020
* Fix up socket dataset
* Add Changelog entry

(cherry picked from commit cb4cedc)
andrewstucki pushed a commit that referenced this pull request on Jul 9, 2020
v1v added a commit to v1v/beats that referenced this pull request on Jul 9, 2020
* upstream/master:
  * Add `docker logs` support to the Elastic Log Driver (elastic#19531)
  * [Elastic Agent] Fix saving of agent configuration on Windows to have proper ACLs (elastic#19793)
  * Send the config revision down to the endpoint application. (elastic#19759)
  * [Elastic Agent] Add support for multiple hosts in connection to kibana (elastic#19628)
  * Remove the downloadConfig and retryConfig from plugin/process.Application and plugin/service.Application. (elastic#19603)
  * Update go version to 1.14.4 (elastic#19753)
  * ci: set builds as skipped when they do not match the trigger (elastic#19750)
  * [Auditbeat] Fix up socket dataset runaway CPU usage (elastic#19764)
  * Convert cloudfoundry input to v2 (elastic#19717)
melchiormoulin pushed a commit to melchiormoulin/beats that referenced this pull request on Oct 14, 2020
* Fix up socket dataset
* Add Changelog entry
leweafan pushed a commit to leweafan/beats that referenced this pull request on Apr 28, 2023
…unaway CPU usage (elastic#19783)
* [Auditbeat] Fix up socket dataset runaway CPU usage (elastic#19764)
* Fix up socket dataset
* Add Changelog entry

(cherry picked from commit f1ef970)
* fix up changelog
* Fix changelog
leweafan pushed a commit to leweafan/beats that referenced this pull request on Apr 28, 2023
…lastic#19781)
* Fix up socket dataset
* Add Changelog entry

(cherry picked from commit f1ef970)

What does this PR do?
Fix for auditbeat runaway CPU usage: #19141
So, here's the explanation: basically everything was as described in the previous PR (#19033). The only additional things that I found were that:
1. When a `*socket` is terminated by another socket with a different kernel `tid`, it's moved to the `closing` LRU list.
2. A new `*socket` is added to the state `socks` map with the ptr reference pointing to it.
3. `onSockTerminated` is called again.
4. In `onSockTerminated` the socket is pruned again from the `socks` map with the call to `delete(s.socks, sock.sock)`.
5. The `socks` map now refers to the new `*socket` rather than the old one.
6. The old `*socket` times out.
7. `onSockDestroyed` is called on it with the code that's doing the peek on the `socketLRU` in the reaper code.
8. Because the entry was removed from the `socks` map in step 5, the lookup that `onSockDestroyed` was running came back with `found` returning `false`, and the function was returning.
9. On the next `s.socketLRU.peek()` the same socket was getting returned over and over, resulting in the reaper routine getting wedged in a tight `for` loop (hence the high CPU usage).

The fix
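To make the mechanism concrete, here is a minimal, self-contained Go sketch of the wedge and of the pointer-passing approach. It is not the Auditbeat code: the `state` and `socket` types are stripped down, the LRU is modelled as a plain slice, timeouts are left out, and `peek`, `remove`, `reap`, `wedgedState`, `onSockDestroyedByKey` and the `maxIter` bound are hypothetical helpers introduced only for this illustration. How the real fix treats the map entry is also an assumption here.

```go
package main

import "fmt"

// socket is a hypothetical, stripped-down stand-in for the dataset's socket.
type socket struct {
	sock uintptr // kernel socket pointer, used as the map key
}

// state keeps only the two structures the description mentions. The LRU is
// modelled as a slice ordered oldest first (an assumption of this sketch).
type state struct {
	socks     map[uintptr]*socket
	socketLRU []*socket
}

// peek returns the oldest queued socket without removing it, or nil if empty.
func (s *state) peek() *socket {
	if len(s.socketLRU) == 0 {
		return nil
	}
	return s.socketLRU[0]
}

// remove drops sock from the queue by pointer identity.
func (s *state) remove(sock *socket) {
	for i, e := range s.socketLRU {
		if e == sock {
			s.socketLRU = append(s.socketLRU[:i], s.socketLRU[i+1:]...)
			return
		}
	}
}

// onSockDestroyedByKey mimics the old behaviour: it resolves the socket via
// the map and returns early when the key is missing, leaving the queue as is.
func (s *state) onSockDestroyedByKey(ptr uintptr) {
	sock, found := s.socks[ptr]
	if !found {
		return
	}
	delete(s.socks, ptr)
	s.remove(sock)
}

// onSockDestroyed mimics the fix: the caller passes the *socket itself, so no
// lookup is needed. Deleting the map entry only when it still points at this
// socket is a choice made for this sketch.
func (s *state) onSockDestroyed(sock *socket) {
	if cur, found := s.socks[sock.sock]; found && cur == sock {
		delete(s.socks, sock.sock)
	}
	s.remove(sock)
}

// reap drains the queue and returns the number of iterations used. maxIter
// bounds the loop so the wedged case terminates in this demo.
func reap(s *state, destroy func(*socket), maxIter int) int {
	n := 0
	for n < maxIter {
		sock := s.peek()
		if sock == nil {
			break
		}
		destroy(sock)
		n++
	}
	return n
}

// wedgedState builds the situation from the steps above: a socket is still
// queued for expiry, but its key is no longer present in the socks map.
func wedgedState() *state {
	old := &socket{sock: 0xabc}
	return &state{socks: map[uintptr]*socket{}, socketLRU: []*socket{old}}
}

func main() {
	s1 := wedgedState()
	byKey := reap(s1, func(k *socket) { s1.onSockDestroyedByKey(k.sock) }, 1000)
	fmt.Println("lookup by key, iterations:", byKey)

	s2 := wedgedState()
	byPtr := reap(s2, s2.onSockDestroyed, 1000)
	fmt.Println("pass pointer, iterations:", byPtr)
}
```

With the lookup-by-key variant the loop only stops because of the artificial `maxIter` bound, since the head of the queue never changes. With the pointer variant the queue drains after a single iteration.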
Basically we pass a reference to the `*socket` object in the reaper's `onSockDestroyed` call, that way we don't have to look up the socket in `s.socks` and instead handle the socket closure directly.

Related issues