Skip to content

enrich container id from process id#15947

Merged
exekias merged 14 commits intoelastic:masterfrom
BrookHF:cid_from_pid
Mar 25, 2020
Merged

enrich container id from process id#15947
exekias merged 14 commits intoelastic:masterfrom
BrookHF:cid_from_pid

Conversation

@BrookHF
Copy link
Copy Markdown
Contributor

@BrookHF BrookHF commented Jan 29, 2020

Type of change

  • Enhancement

What does this PR do?

This PR enables add process metadata to enrich container id in Kubernetes environment by cgroup using process id.

Why is it important?

Include important metadata of the container id in the Kubernetes environment, which beats don't have this ability yet.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works

Author's Checklist

  • [ ]

How to test this PR locally

Deploy Auditbeat in minikube with the include_cid set to true. Deploy an sshd server in minikube. After the Auditbeat is running, ssh into the sshd server, which will trigger an Auditbeat event. See if the event coming from the sshd container is enriched with container id cid

Related issues

Closes #14967

Use cases

Screenshots

image

Logs

@elasticmachine
Copy link
Copy Markdown
Contributor

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

1 similar comment
@elasticmachine
Copy link
Copy Markdown
Contributor

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't use an underscore in package name

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't use an underscore in package name

Copy link
Copy Markdown
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. I have a few suggestions for config and testing.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be configurable. Let's do it the same way as add_docker_metadata.

HostFS string `config:"system.hostfs"` // Specifies the mount point of the host’s filesystem for use in monitoring a host from within a container.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be configurable too. I'd prefer to have this be search list so that it works out of the box with multiple tools (/kubepods, /docker). I don't know how the other tools name their cgroups, but if we can cover them too this would be great.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see there is a mock cidProvider implementation for testing the processor. But that leaves this part untested. I recommend mocking out the cgroup.ProcessCgroupPaths function in order to get test coverage of this provider. add_docker_metadata does this with:

// processGroupPaths returns the cgroups associated with a process. This enables
// unit testing by allowing us to stub the OS interface.
var processCgroupPaths = cgroup.ProcessCgroupPaths

@cla-checker-service
Copy link
Copy Markdown

cla-checker-service bot commented Feb 25, 2020

💚 CLA has been signed

@BrookHF BrookHF requested a review from andrewkroh February 25, 2020 21:28
@BrookHF
Copy link
Copy Markdown
Contributor Author

BrookHF commented Feb 25, 2020

image
Updated configuration, test, and ECS field

@BrookHF
Copy link
Copy Markdown
Contributor Author

BrookHF commented Feb 25, 2020

Also, I signed the CLA, but it still shows not signed

Copy link
Copy Markdown
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making those updates. I have just a few minor suggestions on the updates.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// wantContainerID = false

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend leaving this as nil when container.id is not in the mappings. Then adjust the enrich function accordingly. This will save it from doing the work when it's disabled.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion!
Instead of making the cidProvider nil when it is disabled, I passed a dummy function to the cidProvider which also avoids the use of cgroup.ProcessCgroupPaths to see the cgroup of the process.
The reason for that is I felt that the use of nil is a code smell, as we could not extend nil in the future. Also, the extra steps introduced by the dummy function will probably be optimized away by the compiler.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
directory of the host `/`. HostPath is the path where /proc reside. For different
directory of the host `/`. This is the path where `/proc` is mounted. For different

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
runtime configuration of Kubernetes or docker, the host_path can be set to
runtime configuration of Kubernetes or Docker the `host_path` can be set to

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
overwrite the default `host_path`
overwrite the default.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`/kubepods` and `/docker`. CgroupPrefix is the prefix where the container id is
`/kubepods` and `/docker`. This is the prefix where the container ID is

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
inside cgroup. or different runtime configuration of Kubernetes or docker, the
inside cgroup. For different runtime configuration of Kubernetes or Docker the

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cgroup_prefixes can be set to overwrite the default `cgroup_prefixes`
`cgroup_prefixes` can be set to overwrite the defaults.

@BrookHF BrookHF requested a review from andrewkroh February 26, 2020 22:02
@BrookHF
Copy link
Copy Markdown
Contributor Author

BrookHF commented Feb 26, 2020

I recommend leaving this as nil when container.id is not in the mappings. Then adjust the enrich function accordingly. This will save it from doing the work when it's disabled.

Thanks for the suggestion!
Instead of making the cidProvider nil when it is disabled, I passed a dummy function to the cidProvider which also avoids the use of cgroup.ProcessCgroupPaths to see the cgroup of the process.
The reason for that is I felt that the use of nil is a code smell, as we could not extend nil in the future. Also, the extra steps introduced by the dummy function will probably be optimized away by the compiler.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
dummyProcessCgroupPaths := func(_ string, pid int) (map[string]string, error) {
dummyProcessCgroupPaths := func(_ string, _ int) (map[string]string, error) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cidProvider = newCidProvider(config.HostPath, []string{}, dummyProcessCgroupPaths)
cidProvider = newCidProvider(config.HostPath, nil, dummyProcessCgroupPaths)

https://github.com/golang/go/wiki/CodeReviewComments#declaring-empty-slices

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// add container.id into meta for mapping to pickup
// enrichContainerID adds container.id into meta for mapping to pickup.

I like to follow godoc style even for unexported types/methods.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
IncludeCid bool `config:"include_cid"`

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for that is I felt that the use of nil is a code smell, as we could not extend nil in the future. Also, the extra steps introduced by the dummy function will probably be optimized away by the compiler.

I still think that leaving this cidProvider as nil is simpler when the feature is disabled. It doesn't leave anything to chance with compiler optimizations.

I'm not sure what you meant by "as we could not extend nil in the future".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small style different, I changed it to use nil.

Trying to explain "as we could not extend nil in the future".

  • Put the cidProvider to nil, in the case of container.id is disabled, Then there has to be a nil check in the enrich function. If there is another case, there has to be another type of check for what cidProvider.
  • Whereas, by changing the behavior of the cidProvider so that when container.id is disabled it will not add container id. If there is another case, just pass an instance of cidProvider which behaved differently, then there is not any change required in the enrich.
  • Furthermore, say if there is any change required in the runtime. It would be possible to pass a different instance of cidProvider to the processor determined by some conditions.

It is not specifically required in this PR, but just thought it could leave more flexibility

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just overwrite/expand existing provider instead of creating another one? Some functionalities might still go for regular provider and e.g. try to read /proc/$some_pid/exe instead $host_path/proc/$some_pid/exe

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this needs to pass back the meta reference. Just have it return the error.

@andrewkroh
Copy link
Copy Markdown
Member

jenkins, test this please

Copy link
Copy Markdown
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just two minor nits to cleanup.

@BrookHF BrookHF requested a review from exekias March 19, 2020 20:45
Copy link
Copy Markdown
Contributor

@exekias exekias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting for CI

@exekias
Copy link
Copy Markdown
Contributor

exekias commented Mar 24, 2020

jenkins test this please

@exekias exekias merged commit 3cb957d into elastic:master Mar 25, 2020
@exekias
Copy link
Copy Markdown
Contributor

exekias commented Mar 25, 2020

Thank you so much for your contribution!!

exekias pushed a commit to exekias/beats that referenced this pull request Mar 25, 2020
* enrich container id from process id

(cherry picked from commit 3cb957d)
@exekias exekias added the v7.7.0 label Mar 25, 2020
exekias pushed a commit that referenced this pull request Mar 25, 2020
* enrich container id from process id

(cherry picked from commit 3cb957d)

Co-authored-by: Fang He <fangbrookhe@gmail.com>
exekias pushed a commit to exekias/beats that referenced this pull request Mar 25, 2020
* enrich container id from process id

(cherry picked from commit 3cb957d)
exekias pushed a commit that referenced this pull request Mar 25, 2020
* enrich container id from process id

(cherry picked from commit 3cb957d)

Co-authored-by: Fang He <fangbrookhe@gmail.com>
@exekias exekias added the test-plan Add this PR to be manual test plan label Mar 26, 2020
@andresrc andresrc added the test-plan-added This PR has been added to the test plan label Mar 27, 2020
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
* enrich container id from process id

(cherry picked from commit 3aaf027)

Co-authored-by: Fang He <fangbrookhe@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

review Team:Platforms Label for the Integrations - Platforms team test-plan Add this PR to be manual test plan test-plan-added This PR has been added to the test plan v7.7.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ContainerID support for non docker containers

10 participants