Skip to content

internal: consider hotplugged CPUs in PossibleCPUs and OnlineCPUs#58

Closed
tklauser wants to merge 1 commit intocilium:masterfrom
tklauser:consider-cpu-hotplugging
Closed

internal: consider hotplugged CPUs in PossibleCPUs and OnlineCPUs#58
tklauser wants to merge 1 commit intocilium:masterfrom
tklauser:consider-cpu-hotplugging

Conversation

@tklauser
Copy link
Copy Markdown
Member

@tklauser tklauser commented Feb 5, 2020

With CPU hotplugging present, /sys/devices/system/cpu/online
may contain not only ranges such as "0-3" but a mixture of CPU ranges
and individual CPUs, e.g. "0,3-7" (meaning CPU 0 and 3-7). Consider this
fact in the implementation of parseCPUs and also don't cache the results
in OnlineCPUs, as CPUs might appear and disappear during runtime.

See Documentation/admin-guide/cputopology.rst in the Linux kernel source
tree for more information.

Implementation based on github.com/tklauser/numcpus

Signed-off-by: Tobias Klauser tklauser@distanz.ch

With CPU hotplugging present, /sys/devices/system/cpu/online
may contain not only ranges such as "0-3" but a mixture of CPU ranges
and individual CPUs, e.g. "0,3-7" (meaning CPU 0 and 3-7). Consider this
fact in the implementation of parseCPUs and also don't cache the results
in OnlineCPUs, as CPUs might appear and disappear during runtime.

See Documentation/admin-guide/cputopology.rst in the Linux kernel source
tree for more information.

Implementation based on github.com/tklauser/numcpus

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Copy link
Copy Markdown
Contributor

@lmb lmb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! When we wrote the code to parse CPUs we chose to ignore hotplugging, since it adds complications:

  • OnlineCPUs: for PerfEventArrays use the highest online CPU to control MaxElements. However, perf ring buffers can't be created for offline CPUs, so either the code in perf needs to know about which CPUs are online, or learn how to ignore errors due to trying to initialize a perf ring for an offline CPU. There is also a race condition here: what if online CPUs change between calls to OnlineCPUs?
  • PossibleCPUs: is it even possible to have ranges here? If yes there are more problems with per-CPU values.

We never hotplug CPUs, so we didn't bother dealing with these. If you have a use case for this, I'd suggest the following:

  • Figure out whether possible CPUs can be a range.
  • If they can't be a range, change PerfEventArrays to use PossibleCPUs as MaxElements. If they can we need to go back to the drawing board.
  • Change the perf ring code in perf to try and create a ring for each slot in the PerfEventArray, and ignore errors due to offline CPUs.
  • Figure out how to test this in CI.

Comment thread internal/cpu.go
cpus := strings.Trim(string(buf), "\n ")
n := int(0)
for _, cpuRange := range strings.Split(cpus, ",") {
if len(cpuRange) == 0 {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is an empty group legitimate? If not we should bail out here.

@tklauser
Copy link
Copy Markdown
Member Author

tklauser commented Feb 5, 2020

Thanks for your review.

Thanks for the PR! When we wrote the code to parse CPUs we chose to ignore hotplugging, since it adds complications:

* OnlineCPUs: for PerfEventArrays use the highest online CPU to control MaxElements. However, perf ring buffers can't be created for offline CPUs, so either the code in `perf` needs to know about which CPUs are online, or learn how to ignore errors due to trying to initialize a perf ring for an offline CPU. There is also a race condition here: what if online CPUs change between calls to OnlineCPUs?

* PossibleCPUs: is it even possible to have ranges here? If yes there are more problems with per-CPU values.

No, I don't think it is. It's always 0-n afaict. I changed the commit back to only get PossibleCPUs once (as before) but forgot to adjust the commit/PR message.

We never hotplug CPUs, so we didn't bother dealing with these. If you have a use case for this, I'd suggest the following:

I currently don't have a use case for this, it just stood out to me while reading the code that this might be an issue. But I didn't consider the implications this has wrt. perf ring buffers. So I think I'll close this PR for now and resurrect if there ever comes the need to consider hotplugged CPUs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants