Skip to content

eventsmap: Use present cpuset instead of possible for max_entries#12119

Closed
gandro wants to merge 1 commit intomasterfrom
pr/gandro/eventsmap-use-present-cpuset
Closed

eventsmap: Use present cpuset instead of possible for max_entries#12119
gandro wants to merge 1 commit intomasterfrom
pr/gandro/eventsmap-use-present-cpuset

Conversation

@gandro
Copy link
Copy Markdown
Member

@gandro gandro commented Jun 16, 2020

This fixes an issue where cilium monitor fails to report any events
on AKS, due to the perf_event_array duplicates being created
with max_entries=2 and max_entries=128, presumably causing the datapath
to write to the first one, while the agent is reading from the second
one.

This bug occurs for example on AKS due to the present/possible cpuset on
the VMs. The default Standard_D2s_v3 node size has 2 present CPUs, but
128 possible CPUs in /sys/devices/system/cpu.

Fixes #12070.

This fixes an issue where `cilium monitor` fails to report any events
on AKS, due to the `perf_event_array` duplicates being created
with max_entries=2 and max_entries=128, presumably causing the datapath
to write to the first one, while the agent is reading from the second
one.

This bug occurs for example on AKS due to the present/possible cpuset on
the VMs. The default `Standard_D2s_v3` node size has 2 present CPUs, but
128 possible CPUs in /sys/devices/system/cpu.

Fixes #12070.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
@gandro gandro added priority/release-blocker area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/bug This PR fixes an issue in a previous release of Cilium. labels Jun 16, 2020
@gandro gandro requested review from a team and pchaigno June 16, 2020 19:16
@gandro gandro requested a review from a team as a code owner June 16, 2020 19:16
@aanm
Copy link
Copy Markdown
Member

aanm commented Jun 16, 2020

@gandro do you know which commit is this fixing?

@gandro
Copy link
Copy Markdown
Member Author

gandro commented Jun 16, 2020

While I have validated that this seems to fix the linked issue, I still don't understand why.

git bisect blames commit 1256a13 and indeed, reverting the commit removes the commit removes the perf_event_array map with max_entries=128. Similarly, the parent commit also does not seem to have the issue.

It's also not clear to me if we want to use GetNumPresentCPUs for other per-CPU maps as well. (metricsmap and signalmap). They don't seem to suffer duplicates, so it doesn't seem to be needed?

@pchaigno
Copy link
Copy Markdown
Member

An alternative fix could be to switch the datapath to use possible for max_entries. It would be more involved, but would be consistent with cilium/ebpf's behavior. It might be required if we expect offline CPUs to come back online.

&Value{},
int(unsafe.Sizeof(Value{})),
common.GetNumPossibleCPUs(log),
common.GetNumPresentCPUs(log),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we should use present or possible or what the role of 1256a13 is, but we want to use the same value in datapath and Golang for all maps. So we should also fix this for signalmap.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this on previous PRs and deliberately chose possible so if there is any hotplugging, we would handle that correctly (ie not just drop events on newly hotplugged CPUs).

Do we actually need to be fixing up the BPF side to ensure that it opens using possible instead?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pchaigno @gandro @joestringer It sounds like -D__NR_CPUS__=$(nproc) is broken then in bpf_compile() from bpf/init.sh. Does echo $(nproc) return 2 or 128?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on hotplug, if suddenly CPU 4 would start generating events these would otherwise not be visible if we use present instead of possible. All of BPF kernel side uses possible.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gandro reported that nproc returns 2. I think we want nproc --all. runtime.NumCPU() also returns 2.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, such that agent and datapath both has the max_entries=128.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable to me as well. I can try to whip up a fix and validate it. Are there cases where __NR_CPUS__ should not be equal nproc --all?

Copy link
Copy Markdown
Member Author

@gandro gandro Jun 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative PR to have the datapath always use possible: #12121

@gandro
Copy link
Copy Markdown
Member Author

gandro commented Jun 16, 2020

Closing in favor of #12121

@gandro gandro closed this Jun 16, 2020
@gandro gandro deleted the pr/gandro/eventsmap-use-present-cpuset branch June 16, 2020 21:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/bug This PR fixes an issue in a previous release of Cilium.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cilium monitor broken on AKS 1.15 (engine 0.47)

5 participants