internal: consider hotplugged CPUs in PossibleCPUs and OnlineCPUs#58
internal: consider hotplugged CPUs in PossibleCPUs and OnlineCPUs#58tklauser wants to merge 1 commit intocilium:masterfrom
Conversation
With CPU hotplugging present, /sys/devices/system/cpu/online may contain not only ranges such as "0-3" but a mixture of CPU ranges and individual CPUs, e.g. "0,3-7" (meaning CPU 0 and 3-7). Consider this fact in the implementation of parseCPUs and also don't cache the results in OnlineCPUs, as CPUs might appear and disappear during runtime. See Documentation/admin-guide/cputopology.rst in the Linux kernel source tree for more information. Implementation based on github.com/tklauser/numcpus Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
lmb
left a comment
There was a problem hiding this comment.
Thanks for the PR! When we wrote the code to parse CPUs we chose to ignore hotplugging, since it adds complications:
- OnlineCPUs: for PerfEventArrays use the highest online CPU to control MaxElements. However, perf ring buffers can't be created for offline CPUs, so either the code in
perfneeds to know about which CPUs are online, or learn how to ignore errors due to trying to initialize a perf ring for an offline CPU. There is also a race condition here: what if online CPUs change between calls to OnlineCPUs? - PossibleCPUs: is it even possible to have ranges here? If yes there are more problems with per-CPU values.
We never hotplug CPUs, so we didn't bother dealing with these. If you have a use case for this, I'd suggest the following:
- Figure out whether possible CPUs can be a range.
- If they can't be a range, change PerfEventArrays to use PossibleCPUs as MaxElements. If they can we need to go back to the drawing board.
- Change the perf ring code in
perfto try and create a ring for each slot in the PerfEventArray, and ignore errors due to offline CPUs. - Figure out how to test this in CI.
| cpus := strings.Trim(string(buf), "\n ") | ||
| n := int(0) | ||
| for _, cpuRange := range strings.Split(cpus, ",") { | ||
| if len(cpuRange) == 0 { |
There was a problem hiding this comment.
Is an empty group legitimate? If not we should bail out here.
|
Thanks for your review.
No, I don't think it is. It's always
I currently don't have a use case for this, it just stood out to me while reading the code that this might be an issue. But I didn't consider the implications this has wrt. perf ring buffers. So I think I'll close this PR for now and resurrect if there ever comes the need to consider hotplugged CPUs. |
With CPU hotplugging present, /sys/devices/system/cpu/online
may contain not only ranges such as "0-3" but a mixture of CPU ranges
and individual CPUs, e.g. "0,3-7" (meaning CPU 0 and 3-7). Consider this
fact in the implementation of parseCPUs and also don't cache the results
in OnlineCPUs, as CPUs might appear and disappear during runtime.
See Documentation/admin-guide/cputopology.rst in the Linux kernel source
tree for more information.
Implementation based on github.com/tklauser/numcpus
Signed-off-by: Tobias Klauser tklauser@distanz.ch