-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
cgroup v2: Don't require CPU controller for CPU accounting in 4.15+ #10507
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
(This is partially based on #9665, but the author has gone AWOL.) |
|
looks pretty ok. One thing is missing though: in bus_cgroup_set_property() we should probably only request CGROUP_MASK_CPUACCT to be invalidated when bus_cgroup_set_boolean() is called when the CPUAccounting property is changed dynamically. I'd also like to see a second commit in this PR which turns on global CPU accounting whenever need_cpu_controller_for_accounting() is false. i.e. in that case it's free, hence we should use it. THis would mean setting arg_default_cpu_accounting early on in main() somewhere. Would also need an update to the systemd-system.conf man page. |
|
The commit author of #9665 was "Ryutaroh Matsumoto". |
Thanks, that helps. I'll make sure it's in the commit/PR message and look at the rest tomorrow. |
|
Updated. |
|
Tested live on my laptop in four configurations:
|
src/core/dbus-cgroup.c
Outdated
| return bus_cgroup_set_boolean(u, name, &c->cpu_accounting, CGROUP_MASK_CPUACCT|CGROUP_MASK_CPU, message, flags, error); | ||
| if (streq(name, "CPUAccounting")) { | ||
| CGroupMask mask = CGROUP_MASK_CPUACCT; | ||
| if (need_cpu_controller_for_accounting()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is correct, since this might not be local, right? If this isn't correct for that reason I'll just remove it and only set CGROUP_MASK_CPUACCT I suppose, unless you have any better ideas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not use the cached version here?
poettering
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some issues
src/core/dbus-cgroup.c
Outdated
| return bus_cgroup_set_boolean(u, name, &c->cpu_accounting, CGROUP_MASK_CPUACCT|CGROUP_MASK_CPU, message, flags, error); | ||
| if (streq(name, "CPUAccounting")) { | ||
| CGroupMask mask = CGROUP_MASK_CPUACCT; | ||
| if (need_cpu_controller_for_accounting()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not use the cached version here?
|
(and sorry for not realizing the above earlier) |
|
Updated and tested the tristate does the right thing when manually set. Thanks for the suggestions! :-) |
7097b64 to
6ab76ab
Compare
src/core/cgroup.c
Outdated
| } | ||
|
|
||
| bool cpu_accounting_is_cheap(void) { | ||
| return !(get_cpu_accounting_mask() & CGROUP_MASK_CPU); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return get_cpu_accounting_mask() == 0;It's cheap precisely when we need no controller for it at all...
src/core/cgroup.c
Outdated
| if (cgroup_context_has_cpu_weight(c) || | ||
| cgroup_context_has_cpu_shares(c) || | ||
| c->cpu_quota_per_sec_usec != USEC_INFINITY) | ||
| mask |= CGROUP_MASK_CPUACCT | CGROUP_MASK_CPU; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, we might as well start being accurate here, and only set 'CGROUP_MASK_CPU` in this case...
src/core/cgroup.c
Outdated
|
|
||
| if ((u->cgroup_realized_mask & CGROUP_MASK_CPU) == 0) | ||
| if (!cpu_accounting_is_cheap() && (u->cgroup_realized_mask & CGROUP_MASK_CPU) == 0) | ||
| return -ENODATA; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move this check before the cg_all_unified() check above, in a generic fashion:
if (get_cpu_accounting_mask() & ~u->cgroup_realized_mask != 0)
return -ENODATA;
The above checks whether any of the controllers we need are among the unrealized controllers. This kind of check is pretty nice as it works for all three cases the same way.
|
added some comments on the first patch. the second commit looks perfect! |
|
Updated. As always, thanks for the tips :) |
poettering
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, the lack of emojis in that comment means it's not perfect, but I'll let it pass ;-)
excellent work, thanks!
src/core/cgroup.c
Outdated
| * ║ Unified │ nothing │ CGROUP_MASK_CPU ║ | ||
| * ╟───────────────┼─────────────────────┼─────────────────────╢ | ||
| * ║ Hybrid/Legacy │ CGROUP_MASK_CPUACCT │ CGROUP_MASK_CPUACCT ║ | ||
| * ╚═══════════════╧═════════════════════╧═════════════════════╝ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
beautiful! but why no emojis? ;-)
7f90bd4 to
729f8e2
Compare
|
This pull request introduces 2 alerts when merging 729f8e2 into fee04d7 - view on LGTM.com new alerts:
Comment posted by LGTM.com |
|
@cdown: It sounds like your earlier test patch is obsolete with this? |
|
This is ready for final review now. Two FIXMEs are added pending #10580 being cleaned up, but it doesn't make it worse than it is without this PR, since before we unconditionally used both cpuacct and cpu together anyway :-)
Yeah, no need for changes in debian downstream. We'll follow up on the JoinControllers shenanigans in #10580. |
|
Just a friendly poke, this is ready for review :-) Nothing has changed since you accepted it other than the FIXMEs being added for the JoinController bug (and thus always using both cpu and cpuacct controllers on legacy hierarchy together). CI passes. |
|
This pull request introduces 2 alerts when merging b38ee91 into 3da1da8 - view on LGTM.com new alerts:
Comment posted by LGTM.com |
|
Heya, sorry for not reporting back quickly on this. We did some internal surveying inside of RH to figure out if any customers actually used the joincontroller config thingy. Turns out they do not. So I'll prepare a patch now removing the ability to configure which controllers are mounted together, and we simply hardcode that cpu + cpuacct are mounted together. |
|
Thanks! Once you've done that, I'll rebase on top of it and go back to the FIXMEless version. |
This removes the ability to configure which cgroup controllers to mount together. Instead, we'll now hardcode that "cpu" and "cpuacct" are mounted together as well as "net_cls" and "net_prio". The concept of mounting controllers together has no future as it does not exist to cgroupsv2. Moreover, the current logic is systematically broken, as revealed by the discussions in systemd#10507. Also, we surveyed Red Hat customers and couldn't find a single user of the concept (which isn't particularly surprising, as it is broken...) This reduced the (already way too complex) cgroup handling for us, since we now know whenever we make a change to a cgroup for one controller to which other controllers it applies.
… mask according to cpu/cpuacct joint mounting Note that for cgroup_context_get_mask() this doesn't actually change much, but it does prepare the ground for systemd#10507 later on.
|
Heya, sorry for the delay. I have now prepared #10785 which removes the JoinControllers= setting. Would be thankful for a review, and then eventuall a rebase of your work on top of it. |
|
Oh, and I noticed that cgtop should probably be updated by your patch too |
This removes the ability to configure which cgroup controllers to mount together. Instead, we'll now hardcode that "cpu" and "cpuacct" are mounted together as well as "net_cls" and "net_prio". The concept of mounting controllers together has no future as it does not exist to cgroupsv2. Moreover, the current logic is systematically broken, as revealed by the discussions in systemd#10507. Also, we surveyed Red Hat customers and couldn't find a single user of the concept (which isn't particularly surprising, as it is broken...) This reduced the (already way too complex) cgroup handling for us, since we now know whenever we make a change to a cgroup for one controller to which other controllers it applies.
… mask according to cpu/cpuacct joint mounting Note that for cgroup_context_get_mask() this doesn't actually change much, but it does prepare the ground for systemd#10507 later on.
477e903 to
2c6c5a7
Compare
systemd only uses functions that are as of Linux 4.15+ provided externally to the CPU controller (currently usage_usec), so if we have a new enough kernel, we don't need to set CGROUP_MASK_CPU for CPUAccounting=true as the CPU controller does not need to necessarily be enabled in this case. Part of this patch is modelled on an earlier patch by Ryutaroh Matsumoto (see PR systemd#9665).
If CPU accounting is cheap, no controller necessarily needs to be enabled here for us to be able to read statistics.
We now don't enable the CPU controller just for CPU accounting if we are on 4.15+ and using pure unified hierarchy, as this is provided externally to the CPU controller. This makes CPUAccounting=yes essentially free, so enabling it by default when it's cheap seems like a good idea.
|
@poettering arm64 failure is unrelated, this is good to go. |
|
Looks excellent! Thanks! |
Summary: Rebase two fixes onto the version merged upstream: systemd/systemd#10507 systemd/systemd#10567 and backport a few more: systemd/systemd#10411 systemd/systemd#10493 systemd/systemd#10757 systemd/systemd#10876 These are almost all cgroup2 related. Reviewed By: cdown Differential Revision: D13351498 fbshipit-source-id: 87c8428d48dbb0eb2ae7d34f7381fff88f83872f

systemd only uses functions that are as of Linux 4.15+ provided
externally to the CPU controller (currently usage_usec), so if we have a
new enough kernel, we don't need to set CGROUP_MASK_CPU for
CPUAccounting=true as the CPU controller does not need to necessarily be
enabled in this case.