bpf: Fix 4.19 program size and complexity issues#11915
Closed
bpf: Fix 4.19 program size and complexity issues#11915
Conversation
03bbcaa to
90c8fa2
Compare
16150cb to
62e8b9e
Compare
Contributor
|
hey @pchaigno i'm still hitting the same issue following https://docs.cilium.io/en/latest/gettingstarted/minikube/ |
12950fd to
04f0c98
Compare
Signed-off-by: Paul Chaignon <paul@cilium.io>
This fix is required to get the host policies to work properly with BPF-backed NodePort services. It is not a sufficient fix, but required to resolve the 4.19 program size issue in subsequent commits, so sending now. Fixes: 88bf291 ("bpf: Enforce host policies for IPv4") Signed-off-by: Paul Chaignon <paul@cilium.io>
04f0c98 to
390afe7
Compare
Member
Author
|
test-me-please |
Member
Author
I suspect that the |
qmonnet
reviewed
Jun 8, 2020
Member
qmonnet
left a comment
There was a problem hiding this comment.
All changes so far look good to me (minor nits inline below), thanks a lot for this work!
Contributor
|
@pchaigno i tested this branch on minikube, and it does fix the issue i was seeing 💯 👍 |
joestringer
approved these changes
Jun 8, 2020
Comment on lines
+505
to
+508
| // DoesNotRunOn419Kernel is the complement function of RunsOn419Kernel. | ||
| func DoesNotRunOn419Kernel() bool { | ||
| return !RunsOn419Kernel() | ||
| } |
Member
There was a problem hiding this comment.
nit: Can't we just open-code the !RunsOn419Kernel() in the callers?
Member
Author
There was a problem hiding this comment.
Yeah, probably. I was following the pattern established by other, similar helpers.
On Linux 4.19 (and any kernel with 4096 BPF instructions limit), when
running with kube-proxy-replacement=strict, IPv4, IPv6, fragment support
and in debug mode, Cilium fails to load bpf_overlay and bpf_host with:
Prog section '2/19' rejected: Argument list too long (7)
- Type: 3
- Attach Type: 0
- Instructions: 4203 (107 over limit)
- License: GPL
Error filling program arrays!
2/19 is section CILIUM_CALL_NODEPORT_REVNAT. We can reduce the number of
instructions in that section by breaking the IPv4 and IPv6 cases with
tail calls. Instead of introducing a new tail call on each path, we can
turn CILIUM_CALL_NODEPORT_REVNAT into CILIUM_CALL_IPV{4,6}_NODEPORT_REVNAT.
Signed-off-by: Paul Chaignon <paul@cilium.io>
This commit extends tail_call.cocci to:
- recognize invoke_tailcall_if() calls followed by 'return ret;', as
used in nodeport.h;
- refactor the code a bit using disjunctions;
- ensure the return statement we are trying to match (e.g., return
DROP_MISSED_TAIL_CALL or return send_drop_notify()) is not preceeded by
another return statement.
Fixes: f02bd2a ("cocci: Detect unlogged missed tail calls")
Signed-off-by: Paul Chaignon <paul@cilium.io>
relax_verifier() is supposed to help the verifier prune branches and therefore decrease the reported complexity. On 4.19 however, the call to relax_verifier in CILIUM_CALL_IPV6_FROM_LXC actually increases the complexity because it adds instructions in complexity-hot paths. After a quick check, it doesn't seem to be required for 4.9 either, so let's see if we can remove it. Signed-off-by: Paul Chaignon <paul@cilium.io>
When IPv6 is disabled (e.g., default quick-install), the verifier rejects
the from-container section of bpf_lxc with a complexity error. This
error happens because when one of IPv6 and IPv4 is disabled, we don't
use a tail call to call tail_handle_ipv{4,6} and from-container gets
just long enough to cause a complexity error.
This commit fixes it by always using a tail call to execute
tail_handle_ipv{4,6}.
Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Paul Chaignon <paul@cilium.io>
We disable it to avoid hitting complexity issues on that kernel. The fragment tracking test disables debug mode instead. Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Paul Chaignon <paul@cilium.io>
390afe7 to
6b14bad
Compare
Member
Author
|
retest-net-next |
qmonnet
approved these changes
Jun 9, 2020
brb
added a commit
that referenced
this pull request
Jun 10, 2020
Disable the kube-proxy replacement on the CI 4.19 job until has been merged #11915, but keep bpf_sock to avoid bpf_lxc complexity issues. Also, get rid of non-working kernel vsn setting which complicates passing of the KERNEL env var to ginkgo test runner. Signed-off-by: Martynas Pumputis <m@lambda.lt>
brb
added a commit
that referenced
this pull request
Jun 11, 2020
Disable the kube-proxy replacement on the CI 4.19 job until has been merged #11915, but keep bpf_sock to avoid bpf_lxc complexity issues. Also, get rid of non-working kernel vsn setting which complicates passing of the KERNEL env var to ginkgo test runner. Finally, disable ipsec + vxlan test on 4.19, as it is broken with the contemporary 4.19 setup (TODO to investigate it). Signed-off-by: Martynas Pumputis <m@lambda.lt>
aanm
pushed a commit
that referenced
this pull request
Jun 11, 2020
Disable the kube-proxy replacement on the CI 4.19 job until has been merged #11915, but keep bpf_sock to avoid bpf_lxc complexity issues. Also, get rid of non-working kernel vsn setting which complicates passing of the KERNEL env var to ginkgo test runner. Finally, disable ipsec + vxlan test on 4.19, as it is broken with the contemporary 4.19 setup (TODO to investigate it). Signed-off-by: Martynas Pumputis <m@lambda.lt>
aanm
pushed a commit
that referenced
this pull request
Jun 11, 2020
[ upstream commit 6beb6b3 ] Disable the kube-proxy replacement on the CI 4.19 job until has been merged #11915, but keep bpf_sock to avoid bpf_lxc complexity issues. Also, get rid of non-working kernel vsn setting which complicates passing of the KERNEL env var to ginkgo test runner. Finally, disable ipsec + vxlan test on 4.19, as it is broken with the contemporary 4.19 setup (TODO to investigate it). Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: André Martins <andre@cilium.io>
joestringer
pushed a commit
that referenced
this pull request
Jun 12, 2020
[ upstream commit 6beb6b3 ] Disable the kube-proxy replacement on the CI 4.19 job until has been merged #11915, but keep bpf_sock to avoid bpf_lxc complexity issues. Also, get rid of non-working kernel vsn setting which complicates passing of the KERNEL env var to ginkgo test runner. Finally, disable ipsec + vxlan test on 4.19, as it is broken with the contemporary 4.19 setup (TODO to investigate it). Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: André Martins <andre@cilium.io>
Member
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR does mainly two things, implemented by the 10 following commits.
Fix program size and complexity issues:
tail_call.cocci.relax_verifier()that increases complexity.Enable
kube-proxy-replacement=strictin 4.19 CI build:RunsOn419Kernel()helperkube-proxy-replacement=strict.Fixes: #11175.
/cc @brb