cgroups: Add support for cgroup.kill added in Linux Kernel 5.14#3199
cgroups: Add support for cgroup.kill added in Linux Kernel 5.14#3199flouthoc wants to merge 1 commit intoopencontainers:mainfrom
cgroup.kill added in Linux Kernel 5.14#3199Conversation
e5e673a to
b79fade
Compare
|
PS: this is already added in |
b79fade to
67bfe95
Compare
67bfe95 to
b89bae4
Compare
|
This would fix #3135 |
b89bae4 to
25916c6
Compare
|
@cyphar @AkihiroSuda @kolyshkin Addressed all the feedback points in latest commit. Could you please take a look again. |
libcontainer/init_linux.go
Outdated
| if err := m.Freeze(configs.Frozen); err != nil { | ||
| logrus.Warn(err) | ||
| } |
There was a problem hiding this comment.
Again, the whole point of introducing cgroup.kill is you don't have to freeze/thaw, and yet you do that.
There was a problem hiding this comment.
@kolyshkin ah i moved it after "cgroup.kill" in latest commit but as per man-pages it says
+ Killing a cgroup tree will deal with concurrent forks appropriately and
+ is protected against migrations. If callers require strict guarantees
+ they can issue the cgroup.kill request after a freezing the cgroup via
+ cgroup.freeze.```
There was a problem hiding this comment.
Please see the discussion between @brauner and @rgushchin, which stars here.
In short, the sentence you quote was only in v1 of the patchset and it was dropped later. IOW freezing is not required, and it's one of the "selling points" of this feature.
There was a problem hiding this comment.
@kolyshkin thanks for pointing out this discussion this has been fixed in last commit.
There was a problem hiding this comment.
Marking this block as resolved since this was fixed.
libcontainer/cgroups/systemd/v1.go
Outdated
| defer m.mu.Unlock() | ||
| // Since "cgroup.kill" is not a subsystem and we don't want to regenerate path using pid | ||
| // strip dir path from any available subsystem, most likely devices since its always there for v1 | ||
| dirPath, _ := filepath.Split(m.paths["devices"]) |
There was a problem hiding this comment.
So, why are you using filepath.Split here?
There was a problem hiding this comment.
@kolyshkin We want path to tree where we are going to write "cgroup.kill" for v1 , example /sys/fs/cgroup/containerID/cgroup.kill
There was a problem hiding this comment.
- The path you provided as an example does not make sense for cgroup v1.
- The path that will be produced as a result of
filepath.Splitwould be different from your example (it will actually be pointing to the parent cgroup, and I don't think it's a good idea to use it).
So, it looks like you're not testing this code at all (at least for v1). Do you?
There was a problem hiding this comment.
@kolyshkin ah my bad i didn't spent time testing this on v1 but I followed the path construction which was being created on crun let me harden this for v1 and if its not supported i will remove it. But if its not available for v1 i think man-pages should be updated.
There was a problem hiding this comment.
@kolyshkin I am unable to write file cgroup.kill when working with v1 under any of the paths even the base path but works when cgroup_hierarchy=1 also tried with root. But since its returning error for v1 so code fallbacks to original slow process killing. PS: did not test this with using v1 + hybrid
There was a problem hiding this comment.
@kolyshkin Now i am thinking should v1 managers just return error("Not supported") but i am not sure if it works with v1+hybrid or not.
There was a problem hiding this comment.
Todo: switch this with m.paths[""] to get the path for v1+hybrid. Wait for : #2087
|
@flouthoc on what distro are you testing this? Problem is, we don't have 5.14+ kernel in CI yet. |
25916c6 to
12be34a
Compare
|
@kolyshkin I am using ubuntu 21.04 but |
|
I have booted 5.14 myself and, as I was suspecting, the feature is only available for cgroup v2 (or when using v1 + hybrid hierarchy). So, it does not make sense to add to v1 managers (unless they gain hybrid support), and the code you're using to get paths to cgroup.kill in v1 managers is probably not working. Given the fact that this is not ready, and we don't have kernel 5.14 in CI, I think we have to postpone it to 1.2.0. |
|
@kolyshkin fair points i think we can hold on to this till kernel |
There's no cgroup.kill for v1, but it is there in case hybrid hierarchy is used. Support for hybrid hierarchy is added by various PRs. In this particular case, we need #2087, and once it's in, we can use |
|
@kolyshkin Sure makes sense lets wait for hybrid hierarchy. |
|
Marked as draft as we need
Problem is, #3059 needs criu 3.16 which is not yet released. |
|
@kolyshkin @cyphar Just curious how would |
|
Just want to do this testing myself here cause i think it will take some time to get new kernel in CI meanwhile i can just do testing on my own and we can keep PR on hold. |
|
Nvm i'll wait for dependent PRs then will pick this up again. I think would be hard to plumb things temporarily into code. |
12be34a to
473ca37
Compare
|
Just putting mock code so its easier to revisit, please don't review yet. |
|
@flouthoc can you revisit this? I almost ended up rewriting this from scratch (which would be a colossal waste of time). |
|
@kolyshkin Sure I'll revisit this. Thanks for the reminder. |
|
I completely forgot about this patch. Ping me to review it when you're done rebasing. :D |
|
@flouthoc can you take a look and refresh this PR? This would be nice to have in 1.2.0 |
473ca37 to
cd61d36
Compare
Following PR adds support for `cgroup.kill` to kill all the processes in a cgroup instead of iterating over each process to save cost if possible. Linux kernel 5.14 adds support for `cgroup.kill`. Ref * https://lwn.net/Articles/855049/ * https://lwn.net/Articles/855924/ Signed-off-by: Aditya Rajan <arajan@redhat.com> Signed-off-by: Aditya R <arajan@redhat.com>
cd61d36 to
84ae6e2
Compare
|
I am working on an alternative, implemented without adding a new public method to cgroup managers, and modelled after cgroup.kill kernel tests. Still a WIP. |
|
@kolyshkin Thanks a lot for taking this over, i am closing this one. |
Following PR adds support for
cgroup.killto kill all the processes ina cgroup instead of iterating over each process to save cost if
possible.
Linux kernel 5.14 adds support for
cgroup.kill.Ref