bpf: Generalize SNAT conflict detection#43955
Conversation
Moving forward the node-port code path will rely on iptables to identify locally generated traffic and create NAT reservations for this traffic in order to avoid reservation conflicts. We will consider it a fatal error moving forward to disable iptable rule installation while node-port or eBPF masquerading is enabled. Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
f4ee1b3 to
2d20789
Compare
32d19a6 to
4eb8509
Compare
|
/test |
julianwiedmann
left a comment
There was a problem hiding this comment.
Prior to this pull request the node-port code made the assumption that locally generated traffic would always be sourced from the direct routing interface IP and will always egress on this interface as well.
I think it's the other way around - so far the N/S code was always using one specific saddr, and thus it didn't need to care about any other host traffic.
On a more general note (not necessarily something we need to resolve here) - did you consider why we even need this conflict resolution? Why can't we teach the host netns to avoid using the relevant port range? I remember @borkmann mentioning that we even have some logic for this in the agent ...
rastislavs
left a comment
There was a problem hiding this comment.
ACK for the agent bit
|
(feel free to remove the dont-merge label when the comments are resolved) |
fc6577b to
6ed3e59
Compare
|
/test |
julianwiedmann
left a comment
There was a problem hiding this comment.
One small change, otherwise this looks good to me!
Prior to this commit the nodeport SNAT conflict detection assumed host traffic was sourced from the direct routing interface's IP and always egressed on this device. Remove this assumption by detecting locally generated traffic from any egress interface 'bpf_host' is attached to. This removes the dependency on the direct routing interface in the node-port path. Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
6ed3e59 to
f81a7f8
Compare
|
/test |
|
/test |
| // The NodePort code must identify locally generated traffic to avoid | ||
| // NAT port allocation conflicts. | ||
| if (params.KPRConfig.KubeProxyReplacement || params.DaemonConfig.EnableBPFMasquerade) && !params.DaemonConfig.InstallIptRules { | ||
| return fmt.Errorf("kube-proxy replacement (--%s) or BPF masquerade (--%s) requires iptables rules (--%s=\"true\")", |
There was a problem hiding this comment.
@ldelossa I think we should also mention this in the upgrade notes for v1.20. Could you please send a patch?
Within the node-port/masq path we must detect locally generated host traffic and create a SNAT reservation or translation to avoid conflict with pod traffic.
Prior to this pull request the node-port code made the assumption that locally generated traffic would always be sourced from the direct routing interface IP and will always egress on this interface as well.
Moving forward we want to remove the direct routing interface and allow multi-homed clusters, where egress traffic maybe sourced from a multitude of source IPs and leave on arbitrary devices.
Therefore, we update the node-port nat conflict detection in the
to-netdevpath to identity host traffic generically.Host traffic can be generated from: