workflows/tests-datapath-verifier: Test mcpu v3 with RHEL 8.6 kernel#40390
Merged
dylandreimerink merged 3 commits intomainfrom Jul 9, 2025
Merged
workflows/tests-datapath-verifier: Test mcpu v3 with RHEL 8.6 kernel#40390dylandreimerink merged 3 commits intomainfrom
dylandreimerink merged 3 commits intomainfrom
Conversation
Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
Member
Author
|
/ci-verifier |
When compiling the BPF programs on RHEL 8.6 with mCPU v3, clang produces bytecode in which there exists a path where the `info` pointer passed to `nodeport_add_tunnel_encap` is `NULL`. This shouldn't happen because `tunnel_endpoint` can only be non 0 if `info` is non-NULL. It seems only on RHEL 8.6, this relation between `info` and `tunnel_endpoint` cannot be tracked by the verifier. And we get the following verifier error: ``` parent didn't have regs=2 stack=0 marks last_idx 1049 first_idx 1033 regs=2 stack=0 before 1049: (71) r1 = *(u8 *)(r1 +0) ; return ctx_store_bytes(ctx, off + ETH_ALEN + ETH_ALEN, 1071: (79) r9 = *(u64 *)(r10 -272) ; if (info->flag_ipv6_tunnel_ep) 1072: (71) r1 = *(u8 *)(r9 +23) R9 invalid mem access 'inv' ```` Adding an additional NULL check for `info` fixes the issue. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
Member
Author
|
/ci-verifier |
When compiling with -mcpu=v3, the compiler generated byte code that
was rejected by the verifier on RHEL 8.6. The error seems to be
unrelated to the mcpu value, but rather a bug in the compiler that
specifically got triggered with the full set of settings.
The error was:
```
; if (IS_ERR(ret))
2038: (61) r1 = *(u32 *)(r10 -192)
2039: (56) if w1 != 0x0 goto pc-248
R7=invP0
2040: (05) goto pc-514
; if (a->d1 != b->d1)
1527: (71) r1 = *(u8 *)(r7 +32)
R7 invalid mem access 'inv'
```
It is very odd, it says R7 is null. It tries to take offset 32, which
is the nexthdr of the IPv6 header. `iphdr` in
`snat_v6_rev_nat_handle_icmp_pkt_toobig`. But before its passed to
the function where the read happens we copy that field into
`struct ipv6_ct_tuple tuple` which is stack allocated.
The compiler seems to track the chain of assignments, and completely
eliminate the step of reading the packet data into the stack. It seems
the verifier can still find a branch where it could attempt to read
from the ctx->data pointer, while its 0.
The only way I found to get the compiler to not do this optimization
is to insert an asm volatile statement that does nothing but takes
the memory address of the tuple, which forces the compiler to
put it on the stack and not optimize it away.
Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
2c81f2d to
45a9a21
Compare
Member
Author
|
/test |
YutaroHayakawa
approved these changes
Jul 8, 2025
Member
YutaroHayakawa
left a comment
There was a problem hiding this comment.
This was a fun one. Thanks!
dylandreimerink
added a commit
that referenced
this pull request
Jul 9, 2025
In v1.18 we changed the minimum supported kernel version to v5.10. For a while we still were testing RHEL 8.6 against v5.4, but that is no longer the case since #40390. So we can remove the compile permutations for v5.4. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
dylandreimerink
added a commit
that referenced
this pull request
Jul 9, 2025
In v1.18 we changed the minimum supported kernel version to v5.10. For a while we still were testing RHEL 8.6 against v5.4, but that is no longer the case since #40390. So we can remove the compile permutations for v5.4. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
dylandreimerink
added a commit
that referenced
this pull request
Jul 9, 2025
In v1.18 we changed the minimum supported kernel version to v5.10. For a while we still were testing RHEL 8.6 against v5.4, but that is no longer the case since #40390. So we can remove the compile permutations for v5.4. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
dylandreimerink
added a commit
that referenced
this pull request
Jul 9, 2025
In v1.18 we changed the minimum supported kernel version to v5.10. For a while we still were testing RHEL 8.6 against v5.4, but that is no longer the case since #40390. So we can remove the compile permutations for v5.4. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
dylandreimerink
added a commit
that referenced
this pull request
Jul 15, 2025
In v1.18 we changed the minimum supported kernel version to v5.10. For a while we still were testing RHEL 8.6 against v5.4, but that is no longer the case since #40390. So we can remove the compile permutations for v5.4. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
dylandreimerink
added a commit
that referenced
this pull request
Jul 15, 2025
In v1.18 we changed the minimum supported kernel version to v5.10. For a while we still were testing RHEL 8.6 against v5.4, but that is no longer the case since #40390. So we can remove the compile permutations for v5.4. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
github-merge-queue bot
pushed a commit
that referenced
this pull request
Jul 15, 2025
In v1.18 we changed the minimum supported kernel version to v5.10. For a while we still were testing RHEL 8.6 against v5.4, but that is no longer the case since #40390. So we can remove the compile permutations for v5.4. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
github-merge-queue bot
pushed a commit
that referenced
this pull request
Jul 15, 2025
In v1.18 we changed the minimum supported kernel version to v5.10. For a while we still were testing RHEL 8.6 against v5.4, but that is no longer the case since #40390. So we can remove the compile permutations for v5.4. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
dylandreimerink
added a commit
that referenced
this pull request
Jul 15, 2025
In v1.18 we changed the minimum supported kernel version to v5.10. For a while we still were testing RHEL 8.6 against v5.4, but that is no longer the case since #40390. So we can remove the compile permutations for v5.4. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
github-merge-queue bot
pushed a commit
that referenced
this pull request
Jul 16, 2025
In v1.18 we changed the minimum supported kernel version to v5.10. For a while we still were testing RHEL 8.6 against v5.4, but that is no longer the case since #40390. So we can remove the compile permutations for v5.4. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
rabelmervin
pushed a commit
to rabelmervin/cilium
that referenced
this pull request
Aug 18, 2025
In v1.18 we changed the minimum supported kernel version to v5.10. For a while we still were testing RHEL 8.6 against v5.4, but that is no longer the case since cilium#40390. So we can remove the compile permutations for v5.4. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
julianwiedmann
added a commit
that referenced
this pull request
Sep 1, 2025
#40390 removed the last usage of KERNEL=54 in CI, no need to special-case it any longer. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
julianwiedmann
added a commit
that referenced
this pull request
Sep 1, 2025
#40390 removed the last usage of KERNEL=54 in CI, no need to special-case it any longer. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
5 tasks
github-merge-queue bot
pushed a commit
that referenced
this pull request
Sep 4, 2025
#40390 removed the last usage of KERNEL=54 in CI, no need to special-case it any longer. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Currently when compiling verifier tests we use a mcpu value that is based on a kernel version number passed in by the test matrix. For RHEL 8.6 we had it set at
54. For54specifically we were compiling withmcpu=v2instead ofmcpu=v3which we use for all other kernels. However, while working on #40367 I discovered that the probe we use during runtime to detect mcpu version tells us that RHEL 8.6 should usev3as well.So, that means that our CI is not aligned with what we actually run. Furthermore, when you change the value (the first commit in this PR) it actually breaks the complexity tests. The other two commits in this PR tweak the code so the compiler emits bytecode that passes RHEL 8.6 when under
mcpu=v3. See the commit messages for details.