Running the host firewall on a kube-proxy-free cluster with IPv4 only results in the following error:
```
level=warning msg="Prog section 'to-netdev' rejected: Argument list too long (7)!" subsys=datapath-loader
level=warning msg=" - Type: 3" subsys=datapath-loader
level=warning msg=" - Attach Type: 0" subsys=datapath-loader
level=warning msg=" - Instructions: 4179 (83 over limit)" subsys=datapath-loader
level=warning msg=" - License: GPL" subsys=datapath-loader
```
This is easy to reproduce, for example on GKE, with the following command:
```sh
helm install cilium ./cilium --namespace cilium \
  --set nodeinit.enabled=true \
  --set nodeinit.reconfigureKubelet=true \
  --set nodeinit.removeCbrBridge=true \
  --set cni.binPath=/home/kubernetes/bin \
  --set tunnel=disabled \
  --set endpointRoutes.enabled=false \
  --set ipam.mode=kubernetes \
  --set nativeRoutingCIDR=$NATIVE_CIDR \
  --set nodeinit.restartPods=true \
  --set hostFirewall=true
```
The regression doesn't affect 1.9.0. It was probably introduced by some of the recent work on IPv4 fragment tracking.
The section failing to load, to-netdev, performs the following actions:
- Host policy enforcement
- Bandwidth rate limiter enforcement (disabled on GKE; not required to reproduce)
- NAT & service handling
The complexity issue only occurs when IPv6 is disabled (and possibly also when IPv4 is disabled), because the third step above is compiled into a separate tail call only when both IPv4 and IPv6 are enabled:
From `bpf/lib/nodeport.h`, lines 1629 to 1649 at commit 89c0f08:

```c
static __always_inline int nodeport_nat_fwd(struct __ctx_buff *ctx)
{
	int ret = CTX_ACT_OK;
	__u16 proto;

	if (!validate_ethertype(ctx, &proto))
		return CTX_ACT_OK;
	switch (proto) {
#ifdef ENABLE_IPV4
	case bpf_htons(ETH_P_IP):
		invoke_tailcall_if(__and(is_defined(ENABLE_IPV4), is_defined(ENABLE_IPV6)),
				   CILIUM_CALL_IPV4_ENCAP_NODEPORT_NAT,
				   tail_handle_nat_fwd_ipv4);
		break;
#endif /* ENABLE_IPV4 */
#ifdef ENABLE_IPV6
	case bpf_htons(ETH_P_IPV6):
		invoke_tailcall_if(__and(is_defined(ENABLE_IPV4), is_defined(ENABLE_IPV6)),
				   CILIUM_CALL_IPV6_ENCAP_NODEPORT_NAT,
				   tail_handle_nat_fwd_ipv6);
		break;
```
I will send a pull request to fix it by compiling the NAT & service handling into a separate program when the host firewall is enabled.