[v1.18] loader: XDP attach type fallback logic #44499

Merged
ti-mo merged 4 commits into v1.18 from pr/v1.18-backport-2026-02-23-12-45 on Feb 26, 2026

@ti-mo
Contributor

@ti-mo ti-mo commented Feb 23, 2026

Once this PR is merged, a GitHub action will update the labels of these PRs:

 41967 44209

viktor-kurchenko and others added 4 commits February 23, 2026 12:58
[ upstream commit 62f4856 ]

This commit adds retry logic for when loading the XDP program fails with
an `invalid argument` error. That error may indicate that the network
interface is configured with a jumbo MTU, so loading is retried with the
`BPF_F_XDP_HAS_FRAGS` flag set, in the hope that the NIC driver is XDP
fragment-aware.

Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 7c633a7 ]

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit c861775 ]

Due to changes in newer kernels and the cilium/ebpf library, XDP
programs will in the future be loaded with the ebpf.AttachXDP attach type. However, older
Cilium versions will have already created links with ebpf.AttachNone
programs. The kernel does not allow us to change the program of a link
if its attach type does not match.

This means that we can only use the new XDP attach type when a link is
newly created. This commit adds logic which detects errors on link
update and attempts to load and attach with the other attach type
instead. So when upgrading from an older version to a newer version,
new links are created with the AttachXDP attach type, but existing links
keep using AttachNone. On downgrade, new links are created with
AttachNone, and existing links continue to use AttachXDP.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
Co-authored-by: Timo Beckers <timo@isovalent.com>
Signed-off-by: Timo Beckers <timo@isovalent.com>
This commit is unique to release branches. ebpf-go will now return
AttachXDP as the attach type of XDP programs by default. Cilium versions
that upgrade to, or from, versions using the new ebpf-go release need to
account for this.

This commit restores the old behaviour of the library on top of having the
retry loop added in the previous commit, making sure we don't use the new
attach type unless strictly necessary.

Signed-off-by: Timo Beckers <timo@isovalent.com>
@ti-mo ti-mo added the kind/backports and backport/1.18 labels Feb 23, 2026
@ti-mo ti-mo changed the title from "v1.18 Backports 2026-02-23" to "[v1.18] loader: XDP attach type fallback logic" Feb 23, 2026
@ti-mo
Contributor Author

ti-mo commented Feb 23, 2026

/test

@ti-mo ti-mo marked this pull request as ready for review February 23, 2026 13:22
@ti-mo ti-mo requested a review from a team as a code owner February 23, 2026 13:22
Contributor

@viktor-kurchenko viktor-kurchenko left a comment

Thank you!

@ti-mo ti-mo enabled auto-merge February 24, 2026 08:48
@ti-mo ti-mo added this pull request to the merge queue Feb 26, 2026
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge label Feb 26, 2026
Merged via the queue into v1.18 with commit fbf6a4e Feb 26, 2026
310 checks passed
@ti-mo ti-mo deleted the pr/v1.18-backport-2026-02-23-12-45 branch February 26, 2026 11:11
@julianwiedmann
Member

@ti-mo @dylandreimerink @viktor-kurchenko

Looks like this PR broke GKE across the board for v1.18 - at least reverting it allows Cilium to start up again.

Given that v1.19 looks fine, most likely something in the backport is broken. I intend to merge the revert - unless someone has a specific idea what's wrong, and sufficient cycles to fix it?

@viktor-kurchenko
Contributor

> Given that v1.19 looks fine, most likely something in the backport is broken. I intend to merge the revert - unless someone has a specific idea what's wrong, and sufficient cycles to fix it?

I won't be able to look today, so I don't mind reverting it and investigating afterwards.
Thanks @julianwiedmann

@lictw

lictw commented Mar 23, 2026

It seems this isn't limited to GKE: updating to Cilium 1.18.8 also broke Talos v1.12.4.
The cilium-* pods fail to start and end up in a crash loop with the following error:

panic: Start or stop failed to finish on time, aborting forcefully

As a result, pods are failing to start with the error: unable to connect to Cilium agent.

blacksd added a commit to kube-on-the-cheap/platform that referenced this pull request Mar 23, 2026
Cilium 1.18.8 breaks Talos due to XDP attach fallback logic
(cilium/cilium#44499) causing BPF dead code elimination probe
failures and hive startup timeout panics on both nodes.

Upgrade to 1.19.x and apply required migration changes:
- CiliumLoadBalancerIPPool apiVersion v2alpha1 -> v2
- Explicitly enable mesh authentication (default changed in 1.19)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Labels

backport/1.18: This PR represents a backport for Cilium 1.18.x of a PR that was merged to main.
kind/backports: This PR provides functionality previously merged into master.
ready-to-merge: This PR has passed all tests and received consensus from code owners to merge.

6 participants