[8.x](backport #42032) auditbeat: system/process module backed by quark #42810
This introduces a new provider for the system/process module on Linux.

The main motivation is to address some of the limitations of the current implementation. The gosysinfo provider sends state reports by scraping /proc from time to time, so it loses all short-lived processes. Some customers would also like to have full telemetry but can't run auditd for various reasons.

As a bonus we get some extra ECS fields that were not available before.

MAIN DIFFERENCES:
* Publishes every process in the system, regardless of lifespan.
* Publishes exec events for an existing process (without a fork).
* Aggregates fork+exec+exit within one event.
* Adds event.exit_code for processes that exited; can't express exit_time in ECS?
* Includes the original process.args. sysinfo reports the args as they were when it parsed /proc, so a userland process can masquerade itself. For the initial /proc scraping we report the current value, like sysinfo. We can't get the original value since the kernel overwrites it. If you wanna have fun: https://github.com/systemd/systemd/blob/main/src/basic/argv-util.c#L165
* Adds process.args_count.
* Adds process.interactive and, if true, process.tty.char_device.{major,minor}.
* Attempts to hash all processes, not just long-lived ones.
* Hashing is not rate-limited anymore, but it's cached and refreshed based on metadata. It's an LRU keyed by path and refreshed if the metadata of the file changes, using statx(2) if the kernel supports it, stat(2) otherwise.
* No more periodic state reports, only the initial batch.
* No more saving the timestamp of the last state report to disk.
* No more /proc parsing during runtime, only on boot.

MISSING:
* Unify entity id with sessionview.
* Docs.

EXTRA CHANGES:
* Added statx(2) to seccomp_linux so we can properly use CachedHasher.
* Updated quark to 0.3 so we have namespace inode numbers.
Co-authored-by: Nicholas Berlin <56366649+nicholasberlin@users.noreply.github.com> Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co> (cherry picked from commit ce6156b)
Pinging @elastic/sec-linux-platform (Team:Security-Linux Platform)

This pull request has not been merged yet. Could you please review and merge it @haesbaert? 🙏
I'll commit this tonight after my tests finish.
Proposed commit message
This introduces a new provider for the system/process module on Linux.
The main motivation is to address some of the limitations of the current implementation. The gosysinfo provider sends state reports by scraping /proc from time to time, so it loses all short-lived processes. Some customers would also like to have full telemetry but can't run auditd for various reasons.
As a bonus we get some extra ECS fields that were not available before.
MAIN DIFFERENCES:
MISSING:
Publish metrics from quark.Stats(). Done, but naming and gauges should be discussed.
EXTRA CHANGES:
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
How to test this PR locally
Run auditbeat on Linux with the following configuration:
(edit) process.backend was quark
Related issues
Integrated PRs related to this
List of previous work done to minimize the size of this PR
Screenshots
Non interactive SSH
Below is a shot of a non-interactive ssh session, done with ssh fc39vm /bin/echo hi from quarkio.

It shows the intermediary processes of sshd until we fork the shell and echo. The interesting bit is that we can see a process that forked+execed and then execs again: sshd forks+execs mksh, which in turn execs /bin/echo, without forking.
Comparison against the sysinfo provider for a long lived process:
Here we run a long sleep and compare the events against those of the existing provider on 8.14.3:

On event.type, event.action and others
I've tried to keep things as close as possible to the old provider, but it's really just a suggestion at this point and it's likely we want to change things.
As you can see, expressing things in event.action is not great. I'm open to suggestions; life would be easier if it could be an array. I've tried to compress more states into fewer words. process_changed_image might look a bit weird, but it's less ambiguous than "executed". Again, I'm really open to suggestions here and have no strong feelings about it.
event.kind is now always event, as there are no more state reports every X seconds. The initial state report at init remains, but it's also event.
On the state of this PR
This doesn't include the documentation bits; I'd like to do that in a subsequent PR once the naming, config and whatnot are decided.
We should unify process.entity_id with sessionviewer, and we can do it in this PR. It's worth noting that the gosysinfo backend calculates things differently as well, so this is no worse than that.
I'm going on holiday, but I'm taking this PR out of draft so that we can start the discussion and interested parties can test it.
This is an automatic backport of pull request #42032 done by Mergify.