Use number of possible CPUs for PerfEventArray.MaxEntries #81
Merged
We currently use the highest online CPU to determine the
size of a PerfEventArray. This has several problems: first,
the code to parse the online CPUs doesn't correctly handle
ranges: "0,2-7" is treated as if there were only one online CPU.
This can happen when a CPU is disabled at runtime using:

    echo 0 | sudo tee /sys/devices/system/cpu/cpu1/online

We silently create a map with an incorrect size, and user
code starts to fail with E2BIG.
Fix this by using the number of possible CPUs as the map size.
The number can only be changed by a reboot and so is safe to use.
It's also simpler to use, since we don't have to deal with
multiple ranges like in the online CPU case.
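To illustrate why the online CPU list is awkward to work with, here is a minimal Go sketch of parsing a kernel CPU list that contains several ranges. `parseCPUList` is a hypothetical helper written for this example and is not the library's code. For "0,2-7" it reports 7 CPUs with a highest index of 7, so, assuming map entries are indexed by CPU id, a map sized by the count alone would have no slot for CPU 7.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCPUList is a hypothetical helper (not part of cilium/ebpf).
// It parses a kernel CPU list such as "0,2-7" and returns the number
// of CPUs in the list and the highest CPU index it contains.
func parseCPUList(s string) (int, int, error) {
	s = strings.TrimSpace(s)
	if s == "" {
		return 0, 0, fmt.Errorf("empty CPU list")
	}

	count, highest := 0, -1
	for _, part := range strings.Split(s, ",") {
		// Each part is either a single index ("0") or a range ("2-7").
		lo, hi, isRange := strings.Cut(part, "-")

		first, err := strconv.Atoi(lo)
		if err != nil {
			return 0, 0, fmt.Errorf("parse %q: %w", part, err)
		}

		last := first
		if isRange {
			last, err = strconv.Atoi(hi)
			if err != nil {
				return 0, 0, fmt.Errorf("parse %q: %w", part, err)
			}
		}
		if last < first {
			return 0, 0, fmt.Errorf("invalid range %q", part)
		}

		count += last - first + 1
		if last > highest {
			highest = last
		}
	}
	return count, highest, nil
}
```

Calling `parseCPUList("0,2-7")` exercises both the single-index and the range branch. The possible CPU list avoids most of this, since it does not change when a CPU is taken offline at runtime.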
For some reason we call Resume twice when creating a new reader. Also switch from explicitly enumerating pauseFds to using range.
PerfEventArrays are now sized to the possible CPUs in the system. This means we may try creating a ring buffer for an offline CPU. In this case, perf_event_open helpfully returns ENODEV, which we can handle gracefully. Note that the reader won't work correctly if CPUs are added after it has been initialized. Systems that only have some of their CPU sockets populated will work, however.
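The ENODEV handling described above might look roughly like the following sketch. The `ring` type, `openRings` and `simulateOpen` are hypothetical names chosen for illustration. The real reader calls perf_event_open, which is replaced here by a stub so that the example is self-contained.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// ring stands in for a per-CPU perf ring buffer (hypothetical type).
type ring struct{ cpu int }

// openRings tries to open one ring per possible CPU using the supplied
// open function. CPUs that report ENODEV are treated as offline and get
// a nil slot, so the slice index still matches the CPU id.
func openRings(nCPU int, open func(cpu int) (*ring, error)) ([]*ring, error) {
	rings := make([]*ring, 0, nCPU)
	for cpu := 0; cpu < nCPU; cpu++ {
		r, err := open(cpu)
		if errors.Is(err, syscall.ENODEV) {
			// The CPU is possible but currently offline: skip it.
			rings = append(rings, nil)
			continue
		}
		if err != nil {
			return nil, fmt.Errorf("failed to create perf ring for CPU %d: %w", cpu, err)
		}
		rings = append(rings, r)
	}
	return rings, nil
}

// simulateOpen returns a stub that mimics perf_event_open failing with
// ENODEV for the CPUs marked offline. It is an assumption made for this
// example only.
func simulateOpen(offline map[int]bool) func(int) (*ring, error) {
	return func(cpu int) (*ring, error) {
		if offline[cpu] {
			return nil, fmt.Errorf("can't create perf event: %w", syscall.ENODEV)
		}
		return &ring{cpu: cpu}, nil
	}
}
```

With `openRings(4, simulateOpen(map[int]bool{1: true}))` the result has four slots and only slot 1 is nil. Code that later iterates over the rings has to tolerate those nil entries.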
tklauser
added a commit
to cilium/cilium
that referenced
this pull request
Jun 15, 2020
This pulls in cilium/ebpf#81 which fixes a crash when trying to initialize BPF perf ring buffers for offline CPUs:

    level=fatal msg="Cannot initialise BPF perf ring buffer sockets" error="failed to create perf ring for CPU 2: can't create perf event: can't create perf event: no such device" startTime="2020-06-15 12:15:09.153912253 +0000 UTC m=+129.850487215" subsys=monitor-agent

Reported-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
borkmann
pushed a commit
to cilium/cilium
that referenced
this pull request
Jun 16, 2020
[ upstream commit 00ef71b ]

This pulls in cilium/ebpf#81 which fixes a crash when trying to initialize BPF perf ring buffers for offline CPUs:

    level=fatal msg="Cannot initialise BPF perf ring buffer sockets" error="failed to create perf ring for CPU 2: can't create perf event: can't create perf event: no such device" startTime="2020-06-15 12:15:09.153912253 +0000 UTC m=+129.850487215" subsys=monitor-agent

Reported-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
gandro
added a commit
to cilium/cilium
that referenced
this pull request
Jul 8, 2020
[ upstream commit 00ef71b ]

This pulls in cilium/ebpf#81 which fixes a crash when trying to initialize BPF perf ring buffers for offline CPUs:

    level=fatal msg="Cannot initialise BPF perf ring buffer sockets" error="failed to create perf ring for CPU 2: can't create perf event: can't create perf event: no such device" startTime="2020-06-15 12:15:09.153912253 +0000 UTC m=+129.850487215" subsys=monitor-agent

Reported-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Based on discussion in #58 and prompted by @iAklis and his PRs #78 and #80.
This fixes the behaviour on systems that have one or more CPUs disabled. Previously things would only work if the disabled CPUs were at the end of the range.
It's still not possible to use PerfEventArrays that have more entries than the total number of CPUs in the system (as suggested by @iAklis), since I'm not sure that's a good idea. We also don't deal with CPUs being added or removed after a perf.Reader has been created.
More details in the commit descriptions.