Skip to content

NodePort: dynamic SNAT#44625

Merged
ldelossa merged 4 commits intomainfrom
ldelossa/nodeport-dynamic-snat
Mar 24, 2026
Merged

NodePort: dynamic SNAT#44625
ldelossa merged 4 commits intomainfrom
ldelossa/nodeport-dynamic-snat

Conversation

@ldelossa
Copy link
Copy Markdown
Contributor

@ldelossa ldelossa commented Mar 4, 2026

This pull request uses the fairly new BPF_FIB_LOOKUP_SRC FIB lookup flag to support dynamic source IP resolution in the N/S node port load balancing path.

A new flag is introduced to opt into this new functionality --enable-dynamic-source-lookup-nodeport, nodePort.enableDynamicSourceLookup being the Helm analog.

Additionally a probe is added which short-circuits the source IP lookup functionality if the kernel does not support it.

When the flag is enabled and the kernel supports the feature the N/S load balancing path will dynamically resolve the preferred source IP, from the kernel's FIB perspective, it should use for the SNAT operation prior to forwarding the packet to the final destination.

This pull request makes no change in the data path which immediately delivers a node port request to a local backend, as no SNAT is required.

nodeport: support n/s dynamic source ip resolution 

@ldelossa ldelossa requested review from a team as code owners March 4, 2026 14:53
@ldelossa ldelossa added area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. area/loadbalancing Impacts load-balancing and Kubernetes service implementations labels Mar 4, 2026
@ldelossa ldelossa requested a review from a team as a code owner March 4, 2026 14:53
@ldelossa ldelossa added the release-note/major This PR introduces major new functionality to Cilium. label Mar 4, 2026
@ldelossa ldelossa requested review from a team as code owners March 4, 2026 14:53
@ldelossa ldelossa added area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/major This PR introduces major new functionality to Cilium. labels Mar 4, 2026
@ldelossa ldelossa requested a review from joamaki March 4, 2026 14:53
@ldelossa ldelossa added the area/loadbalancing Impacts load-balancing and Kubernetes service implementations label Mar 4, 2026
@ldelossa ldelossa force-pushed the ldelossa/nodeport-dynamic-snat branch from 795e4b1 to cfda5f1 Compare March 4, 2026 15:08
@ldelossa
Copy link
Copy Markdown
Contributor Author

ldelossa commented Mar 4, 2026

/test

Copy link
Copy Markdown
Member

@julianwiedmann julianwiedmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just dropping a few minor comments. It's a bit annoying that we're hiding this feature behind a new config flag (default false), that will make e2e test coverage really difficult.

  • What's your overall intention towards testing? I think we'll definitely want some BPF-level tests to demonstrate this code-path in action, and what results the FIB lookup potentially returns (mocking would be good enough here for a start). So it's easier to catch regressions, and others later working on the code have a reference.

  • I'm less certain about e2e-level testing - in the current framework, are we able to test more than the good path? I feel there might be some value in having a more involved e2e-test once we also switch the BPF-Masq path over - confirming that an external endpoint is accessed with the expected interface / source IP. Similar how we currently test EGW.

  • I see an upgrade impact here, where in-flight connections would potentially change their source IP.

  • When users modify their routing setup on-the-fly and this changes what source IP is returned for some connections, this would also impact established connections. But I guess the answer there is "don't do that".

@ldelossa ldelossa force-pushed the ldelossa/nodeport-dynamic-snat branch from cfda5f1 to e782ada Compare March 5, 2026 16:00
@ldelossa
Copy link
Copy Markdown
Contributor Author

ldelossa commented Mar 5, 2026

@julianwiedmann re:

Just dropping a few minor comments. It's a bit annoying that we're hiding this feature behind a new config flag (default false), that will make e2e test coverage really difficult.

  • What's your overall intention towards testing? I think we'll definitely want some BPF-level tests to demonstrate this code-path in action, and what results the FIB lookup potentially returns (mocking would be good enough here for a start). So it's easier to catch regressions, and others later working on the code have a reference.

Yes, I think for now, we add a fib lookup mock which injects a different source IP from the fallback IP used (direct routing IP), and simply confirm we see this in the resulting skb_buff. I can add this as an additional commit to this PR.

  • I'm less certain about e2e-level testing - in the current framework, are we able to test more than the good path? I feel there might be some value in having a more involved e2e-test once we also switch the BPF-Masq path over - confirming that an external endpoint is accessed with the expected interface / source IP. Similar how we currently test EGW.

Yeah, for e2e to be worth it, we need a multi-homed node I think, and we need to confirm the packet egresses that interface with the expected source IP. I'm not familiar with EGW tests, but I can look into this to understand if it provides inspiration.

  • I see an upgrade impact here, where in-flight connections would potentially change their source IP.

  • When users modify their routing setup on-the-fly and this changes what source IP is returned for some connections, this would also impact established connections. But I guess the answer there is "don't do that".

Yes, I would say this is an operational issue, just like modifying your routing table at runtime. No safeguards there on native linux, you need to know what you're doing. That is why this is 'opt-in', it gives a switch for the user, saying "okay now, native FIB matters, and we need to be conscious of changes there".

@ldelossa ldelossa added the dont-merge/discussion A discussion is ongoing and should be resolved before merging, regardless of reviews & tests status. label Mar 5, 2026
@ldelossa
Copy link
Copy Markdown
Contributor Author

ldelossa commented Mar 5, 2026

Added don't merge label until eBPF tests are implemented.

@ldelossa ldelossa force-pushed the ldelossa/nodeport-dynamic-snat branch from e782ada to b1fe67d Compare March 5, 2026 18:54
Copy link
Copy Markdown
Member

@julianwiedmann julianwiedmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few smaller things left ...

@ldelossa ldelossa force-pushed the ldelossa/nodeport-dynamic-snat branch from 73d4507 to a51d1a6 Compare March 19, 2026 19:04
@ldelossa ldelossa force-pushed the ldelossa/nodeport-dynamic-snat branch from a51d1a6 to c9e8c30 Compare March 19, 2026 19:15
@julianwiedmann
Copy link
Copy Markdown
Member

(looking at the CI failures, you'll need to adjust for #44755)

@ldelossa ldelossa force-pushed the ldelossa/nodeport-dynamic-snat branch from c9e8c30 to ae30fd9 Compare March 20, 2026 13:33
@ldelossa
Copy link
Copy Markdown
Contributor Author

@julianwiedmann yup just saw that on my end prior to reading that comment, handled in latest force push.

@ldelossa ldelossa force-pushed the ldelossa/nodeport-dynamic-snat branch 2 times, most recently from 78203d8 to 47a5253 Compare March 20, 2026 14:14
@ldelossa ldelossa force-pushed the ldelossa/nodeport-dynamic-snat branch 2 times, most recently from 04d006c to 54c7ee8 Compare March 21, 2026 00:24
Copy link
Copy Markdown
Member

@julianwiedmann julianwiedmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@ldelossa
Copy link
Copy Markdown
Contributor Author

/test

@ldelossa ldelossa removed the dont-merge/discussion A discussion is ongoing and should be resolved before merging, regardless of reviews & tests status. label Mar 24, 2026
@ldelossa ldelossa force-pushed the ldelossa/nodeport-dynamic-snat branch 2 times, most recently from 50b9932 to c7322ee Compare March 24, 2026 17:35
Add a flag for opt-in of dynamic source IP resolution in the N/S
nodeport loadbalancing path, not yet implemented.

Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
Add a probe to test if the BPF_FIB_LOOKUP_SRC flag is supported for the
bpf_fib_lookup helper.

This probe results in a new config variable declared in node.h,
`supports_fib_lookup_src`, which informs the data-path if the preferred
source IP can be retrieved on fib lookup.

Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
Perform dynamic source IP resolution for SNAT in the N/S nodeport load
balancing path.

Introduce the `fib_lookup_src_v[4|6]` for resolving a source address
given a destination.

Utilize this method in the nodeport egress SNAT path for both native
routing mode SNAT and post-SNAT tunnel mode encapsulation.

Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
Add three new tests which ensure source IP resolution works correctly in
the nodeport load balancing path when supported.

Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
@ldelossa ldelossa force-pushed the ldelossa/nodeport-dynamic-snat branch from c7322ee to 8d0620d Compare March 24, 2026 17:38
@ldelossa
Copy link
Copy Markdown
Contributor Author

/test

@ldelossa ldelossa added this pull request to the merge queue Mar 24, 2026
Merged via the queue into main with commit a3752f5 Mar 24, 2026
704 of 713 checks passed
@ldelossa ldelossa deleted the ldelossa/nodeport-dynamic-snat branch March 24, 2026 22:30
@maintainer-s-little-helper maintainer-s-little-helper bot added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Mar 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. area/loadbalancing Impacts load-balancing and Kubernetes service implementations ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/major This PR introduces major new functionality to Cilium.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants