bpf: don't use bpf_redirect_neigh() from overlay programs #42000

Merged
julianwiedmann merged 3 commits into main from pr/jwi/main/bpf-fib-overlay
Oct 6, 2025

@julianwiedmann
Member

@julianwiedmann julianwiedmann commented Oct 3, 2025

Yusuke chased down a kernel bug [0] that causes memory leaks when bpf_redirect_neigh() is called from bpf_overlay.

As a workaround, we can opt out of using the helper and go down the fallback path in the callers (e.g. treating it like a call from XDP context).

Work around a memory leak in the kernel networking stack, which occurs when redirecting packets inside a node that previously arrived via Cilium's overlay interface.
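
The opt-out described above can be pictured as a compile-time capability check that the callers consult before choosing a redirect path. This is only a minimal sketch and not the code from this PR: `sketch_neigh_resolver_available()`, `sketch_pick_redirect_path()` and the `redirect_path` enum are hypothetical names, and the only identifier taken from the thread is the file-wide `IS_BPF_OVERLAY` define.

```c
#include <stdbool.h>

/* Assumption: the overlay program sets a file-wide define before it
 * includes any shared header. It is defined here so that this sketch
 * compiles as if it were overlay code. */
#define IS_BPF_OVERLAY 1

/* Hypothetical: the two ways a caller could forward the packet. */
enum redirect_path {
	PATH_REDIRECT_NEIGH, /* let the kernel resolve the neighbor */
	PATH_FALLBACK,       /* fallback path, as taken from XDP context */
};

/* Hypothetical capability check, resolved at compile time. */
static inline bool sketch_neigh_resolver_available(void)
{
#ifdef IS_BPF_OVERLAY
	/* Opt out: pretend the helper does not exist in this program. */
	return false;
#else
	return true;
#endif
}

/* Hypothetical caller: tolerates that the helper is unavailable. */
static inline enum redirect_path sketch_pick_redirect_path(void)
{
	if (sketch_neigh_resolver_available())
		return PATH_REDIRECT_NEIGH;

	return PATH_FALLBACK;
}
```

Because the check is a constant, a compiler can drop the unused branch, so a program built this way would not contain the helper call at all.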

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 3, 2025
@julianwiedmann julianwiedmann force-pushed the pr/jwi/main/bpf-fib-overlay branch from b7d735e to 00d0398 Compare October 3, 2025 05:17
@julianwiedmann julianwiedmann force-pushed the pr/jwi/main/bpf-fib-overlay branch 2 times, most recently from 245e928 to c27db88 Compare October 3, 2025 09:50
mcast.h by accident pulls in common.h, but before the various file-wide
defines like IS_BPF_OVERLAY are set. This makes them unusable in places
like overloadable_skb.h.

Fix things up by making common.h the first regular header that gets included,
as done in all other programs. For this, also shuffle tailcall.h further
down, even though it's currently harmless.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
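
The include-order problem in this commit message comes down to how the C preprocessor works: an `#ifdef` only sees macros that were defined earlier in the translation unit. The sketch below simulates this in a single file, assuming a header body that tests `IS_BPF_OVERLAY`; the `EARLY_SEES_OVERLAY` and `LATE_SEES_OVERLAY` macros are hypothetical and exist only for illustration.

```c
/* Stand-in for a header that gets pulled in too early, before the
 * program's file-wide defines are set. */
#ifdef IS_BPF_OVERLAY
# define EARLY_SEES_OVERLAY 1
#else
# define EARLY_SEES_OVERLAY 0
#endif

/* The program's file-wide define, set only after the early include. */
#define IS_BPF_OVERLAY 1

/* Stand-in for a header that is included after the define. */
#ifdef IS_BPF_OVERLAY
# define LATE_SEES_OVERLAY 1
#else
# define LATE_SEES_OVERLAY 0
#endif

/* Both outcomes are fixed at preprocessing time. */
_Static_assert(EARLY_SEES_OVERLAY == 0, "early check misses the define");
_Static_assert(LATE_SEES_OVERLAY == 1, "late check sees the define");
```

With an include guard in place, the early inclusion wins and a later `#include` of the same header becomes a no-op, which is why the order has to be fixed at the first inclusion.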
This reverts commit 3c1b39a.

For the following patch we need to tolerate that bpf_redirect_neigh() is
not available when egress_gw_fib_lookup_and_redirect_v*() is called from
inside bpf_overlay.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Yusuke chased down a kernel bug [0] that causes memory leaks when
bpf_redirect_neigh() is called from bpf_overlay.

As a workaround, we can opt out of using the helper and go down the
fallback path in the callers (e.g. treating it like a call from XDP context).

[0]: https://lore.kernel.org/netdev/20251003073418.291171-1-daniel@iogearbox.net/T/#u
Reported-by: Yusuke Suzuki <yusuke.suzuki@isovalent.com>
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
@julianwiedmann julianwiedmann force-pushed the pr/jwi/main/bpf-fib-overlay branch from c27db88 to 8eb7efa Compare October 6, 2025 07:21
@julianwiedmann julianwiedmann added area/kernel Requires upstream work in the Linux kernel. area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/misc This PR makes changes that have no direct user impact. labels Oct 6, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 6, 2025
@julianwiedmann
Member Author

I'm tempted to add a fine-grained neigh_resolver_without_nh_available() to more accurately reflect the condition which triggers the memory leak. But let's go with this for now, and do the fine-tuning separately.

@julianwiedmann
Member Author

/test

@julianwiedmann julianwiedmann marked this pull request as ready for review October 6, 2025 07:27
@julianwiedmann julianwiedmann requested review from a team as code owners October 6, 2025 07:27
Member

@ysksuzuki ysksuzuki left a comment

LGTM, thanks!

@julianwiedmann julianwiedmann added this pull request to the merge queue Oct 6, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Oct 6, 2025
Merged via the queue into main with commit 7350d08 Oct 6, 2025
667 of 675 checks passed
@julianwiedmann julianwiedmann deleted the pr/jwi/main/bpf-fib-overlay branch October 6, 2025 12:03
julianwiedmann added a commit that referenced this pull request Oct 7, 2025
#42000 limited the usage of bpf_redirect_neigh() from overlay programs
to work around a kernel bug that causes a memory leak.

This bug only manifests when bpf_redirect_neigh() is called without
next-hop information - or in Cilium terms, without a preceding FIB lookup.
By annotating such specific usage of bpf_redirect_neigh() with a
fine-grained capability check, we can otherwise allow the use of
bpf_redirect_neigh() from overlay context.

Start by introducing the neigh_resolver_without_nh_available() check in all
relevant places.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
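
The fine-grained check described in this commit message separates "the helper is usable" from "the helper is usable without next-hop information". A minimal sketch of that split follows. The real `neigh_resolver_without_nh_available()` is not shown in this thread, so every `sketch_` name and the `struct sketch_nh` type are hypothetical stand-ins, and the overlay-specific answer is an assumption based on the description above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: compiled as overlay code, see the file-wide define. */
#define IS_BPF_OVERLAY 1

/* Hypothetical stand-in for the result of a preceding FIB lookup. */
struct sketch_nh {
	bool valid; /* true if the lookup supplied next-hop information */
};

/* Coarse check: the helper as such is usable from this program. */
static inline bool sketch_neigh_resolver_available(void)
{
	return true;
}

/* Fine-grained check: may the helper be called without a next hop? */
static inline bool sketch_neigh_resolver_without_nh_available(void)
{
#ifdef IS_BPF_OVERLAY
	/* The leak only manifests on this path, so opt out here. */
	return false;
#else
	return sketch_neigh_resolver_available();
#endif
}

/* Hypothetical caller-side decision. */
static inline bool sketch_can_redirect_neigh(const struct sketch_nh *nh)
{
	if (!sketch_neigh_resolver_available())
		return false;

	/* With next-hop information the helper stays allowed. */
	if (nh != NULL && nh->valid)
		return true;

	return sketch_neigh_resolver_without_nh_available();
}
```

In this sketch an overlay caller that has done a FIB lookup keeps using the helper, while a caller without next-hop information is steered to its fallback path.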
@julianwiedmann
Member Author

> I'm tempted to add a fine-grained neigh_resolver_without_nh_available() to more accurately reflect the condition which triggers the memory leak. But let's go with this for now, and do the fine-tuning separately.

Started this in #42052.

github-merge-queue bot pushed a commit that referenced this pull request Oct 10, 2025
aditighag pushed a commit to aditighag/cilium that referenced this pull request Oct 14, 2025
@cilium-release-bot cilium-release-bot bot moved this to Released in cilium v1.19.0 Feb 3, 2026
Labels

area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. area/kernel Requires upstream work in the Linux kernel. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.

Projects

No open projects
Status: Released

3 participants