bpf: lxc: also handle non-DSR Nodeport services in per-packet LB path#44507
bpf: lxc: also handle non-DSR Nodeport services in per-packet LB path#44507julianwiedmann merged 1 commit intomainfrom
Conversation
|
/test |
|
This will need some trivial changes to the upgrade notes (maybe docs too), but the code changes should be good for review already. |
gyutaeb
left a comment
There was a problem hiding this comment.
Everything looks good to me! The integration of the test case is well done. Also, thank you for explaining the Network policy implications; I learned a lot from it. I'm glad to see the structure improving.
FWIW, I don't see the Network policy aspects as important. Folks would already have the "correct" policy in place to support access via ClusterIP - it's just that now they can remove the slightly awkward tolerations for egressing to REMOTE_NODE and ingressing from WORLD (and WORLD would still be required to support proper N/S access to the Nodeport service). |
#41963 added support for wildcard matching of Nodeport services in bpf_lxc's per-packet LB path. When accessing a DSR Nodeport service on a remote node, the LB now happens at the source. This addressed the scenario where client and backend sit on the same node, and therefore reply traffic would never pass through a native interface hook for RevDNAT. Extend this logic to also cover non-DSR Nodeport services. They are not affected by the mentioned bug (since replies would first go back to the intermediate LB node) - but doing the LB at the source still provides benefits. It allows for precise network policy, and saves resources on the intermediate node. In detail the Network policy implications are: 1. client egress policy needs to allows access to the service's backends. But no longer needs to allow access to remote nodes. 2. backend ingress policy needs to allow access by client pods. But no longer needs to allow WORLD for serving in-cluster clients. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
8bcffd2 to
7fd30ed
Compare
|
now also added docs @gyutaeb I believe this roughly also applies to the DSR scenario, so I kept it generic for all NodePort-type services. |
|
/test |
qmonnet
left a comment
There was a problem hiding this comment.
Ack for doc change, I haven't looked at the datapath change in details though.
ldelossa
left a comment
There was a problem hiding this comment.
Nice! Cleans things up well.
Thanks for adding the docs! I agree that making it generic for all NodePort-type services is the right approach. LGTM. |
This removes the test introduced by a899c4a ("bpf: test: cover NAT LB with catch-all EGW policy"), as part of #36929. The test targeted the scenario of an in-cluster client which accesses a NodePort service on a remote node. But with the changes from #44507, this type of access no longer exists, as the traffic will always get DNATed at the source node. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
This removes the test introduced by a899c4a ("bpf: test: cover NAT LB with catch-all EGW policy"), as part of #36929. The test targeted the scenario of an in-cluster client which accesses a NodePort service on a remote node. But with the changes from #44507, this type of access no longer exists, as the traffic will always get DNATed at the source node. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
This removes the test introduced by a899c4a ("bpf: test: cover NAT LB with catch-all EGW policy"), as part of cilium#36929. The test targeted the scenario of an in-cluster client which accesses a NodePort service on a remote node. But with the changes from cilium#44507, this type of access no longer exists, as the traffic will always get DNATed at the source node. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
[ upstream commit c3238e8 ] [ backporter's notes: added lb4/6_svc_is_nodeport() check before the existing nodeport_uses_dsr4/6() check. On main, nodeport_uses_dsr4/6() was removed by PR cilium#44507, but that PR is not backported here. ] After the wildcard NodePort lookup in the per-packet LB path, verify that the returned service actually has the SVC_FLAG_NODEPORT flag set. Without this check, a non-NodePort service entry (e.g. HostPort) that happens to match the wildcard port could be incorrectly treated as a NodePort service. This matches SocketLB's sock4_wildcard_lookup_full() / sock6_wildcard_lookup_full() behavior, which guards the wildcard result with lb4_svc_is_nodeport() / lb6_svc_is_nodeport(). Signed-off-by: saiaunghlyanhtet <saiaunghlyanhtet2003@gmail.com>
[ upstream commit 3e363a8 ] [ backporter's notes: applied to tc_lxc_lb{4,6}_dsr_nodeport.c and tc_lxc_lb{4,6}_hybrid_dsr_nodeport.c instead of tc_lxc_lb{4,6}_nodeport.c, as PR cilium#44507 which renamed these files is not backported. ] Signed-off-by: saiaunghlyanhtet <saiaunghlyanhtet2003@gmail.com>
[ upstream commit 3e363a8 ] [ backporter's notes: applied to tc_lxc_lb{4,6}_dsr_nodeport.c and tc_lxc_lb{4,6}_hybrid_dsr_nodeport.c instead of tc_lxc_lb{4,6}_nodeport.c, as PR cilium#44507 which renamed these files is not backported. ] Signed-off-by: saiaunghlyanhtet <saiaunghlyanhtet2003@gmail.com>
[ upstream commit c3238e8 ] [ backporter's notes: added lb4/6_svc_is_nodeport() check before the existing nodeport_uses_dsr4/6() check. On main, nodeport_uses_dsr4/6() was removed by PR cilium#44507, but that PR is not backported here. ] After the wildcard NodePort lookup in the per-packet LB path, verify that the returned service actually has the SVC_FLAG_NODEPORT flag set. Without this check, a non-NodePort service entry (e.g. HostPort) that happens to match the wildcard port could be incorrectly treated as a NodePort service. This matches SocketLB's sock4_wildcard_lookup_full() / sock6_wildcard_lookup_full() behavior, which guards the wildcard result with lb4_svc_is_nodeport() / lb6_svc_is_nodeport(). Signed-off-by: saiaunghlyanhtet <saiaunghlyanhtet2003@gmail.com>
[ upstream commit c3238e8 ] [ backporter's notes: added lb4/6_svc_is_nodeport() check before the existing nodeport_uses_dsr4/6() check. On main, nodeport_uses_dsr4/6() was removed by PR #44507, but that PR is not backported here. ] After the wildcard NodePort lookup in the per-packet LB path, verify that the returned service actually has the SVC_FLAG_NODEPORT flag set. Without this check, a non-NodePort service entry (e.g. HostPort) that happens to match the wildcard port could be incorrectly treated as a NodePort service. This matches SocketLB's sock4_wildcard_lookup_full() / sock6_wildcard_lookup_full() behavior, which guards the wildcard result with lb4_svc_is_nodeport() / lb6_svc_is_nodeport(). Signed-off-by: saiaunghlyanhtet <saiaunghlyanhtet2003@gmail.com>
#41963 added support for wildcard matching of Nodeport services in bpf_lxc's per-packet LB path. When accessing a DSR Nodeport service on a remote node, the LB now happens at the source. This addressed the scenario where client and backend sit on the same node, and therefore reply traffic would never pass through a native interface hook for RevDNAT.
Extend this logic to also cover non-DSR Nodeport services. They are not affected by the mentioned bug (since replies would first go back to the intermediate LB node) - but doing the LB at the source still provides benefits. It allows for precise network policy, and saves resources on the intermediate node.
In detail the Network policy implications are: