Summary
When Cilium runs alongside kube-proxy, hostNetwork pods (or applications running directly on the hosts) cannot establish connectivity to pods on remote nodes via ClusterIP services. Regular pods are not affected by this issue.
Environment
- Cilium v1.8.2
- Kernel 5.4
- Kubernetes v1.17 via KIND v0.7.0
How to reproduce the issue
- Create a 2-node KIND cluster from the cilium git tree using
kind create cluster --config=.github/kind-config.yaml
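For context, the actual cluster layout lives in `.github/kind-config.yaml` in the cilium repository; a minimal sketch of what a 2-node KIND config of this shape looks like (the real file may differ, so treat this as an illustration, not the repo's exact contents):

```yaml
# Hypothetical sketch of a 2-node KIND config; the real file is
# .github/kind-config.yaml in the cilium tree and may differ.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  disableDefaultCNI: true   # Cilium will be installed as the CNI instead
nodes:
  - role: control-plane
  - role: worker
```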
- Load Cilium via these instructions (from cilium/.github/workflows/smoke-test.yaml, lines 126 to 143 at 0a4f34a):

```yaml
- name: Load local images into kind cluster
  run: |
    kind load docker-image --name chart-testing cilium/cilium:latest
    kind load docker-image --name chart-testing cilium/operator-generic:latest

- name: Install cilium chart
  run: |
    helm install cilium ./install/kubernetes/cilium \
      --wait \
      --namespace kube-system \
      --set global.nodeinit.enabled=true \
      --set global.kubeProxyReplacement=partial \
      --set global.hostServices.enabled=false \
      --set global.externalIPs.enabled=true \
      --set global.nodePort.enabled=true \
      --set global.hostPort.enabled=true \
      --set config.ipam=kubernetes \
      --set global.pullPolicy=Never
```
(I later modified the image manually to v1.8.2, but you can use docker pull ... followed by kind load docker-image ... to preload specific versions.)
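For example, preloading the v1.8.2 images could look like this (a sketch; the cluster name "chart-testing" is taken from the workflow snippet above, and the operator-generic tag is assumed to match the cilium release):

```shell
# Pull a specific Cilium release instead of :latest,
# then load the images into the KIND cluster's nodes.
docker pull cilium/cilium:v1.8.2
docker pull cilium/operator-generic:v1.8.2
kind load docker-image --name chart-testing cilium/cilium:v1.8.2
kind load docker-image --name chart-testing cilium/operator-generic:v1.8.2
```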
- Deploy the connectivity-check YAML
- Edit the host-to-b-multi-node-clusterip deployment via
kubectl edit deploy host-to-b-multi-node-clusterip
and replace its livenessProbe with a readinessProbe
- Observe that the new pod created for this deployment never becomes ready
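The probe edit above amounts to renaming the probe key on the container, so a failing check shows up as the pod never becoming Ready rather than as restarts. Schematically (the curl-based probe body below is a hypothetical stand-in, not copied from the connectivity-check manifest):

```yaml
# Before: failures of the livenessProbe cause container restarts
livenessProbe:
  exec:
    command: ["curl", "-sS", "--fail", "--connect-timeout", "5", "echo-b"]
```

```yaml
# After: the same check as a readinessProbe; failures keep the
# pod NotReady, which is what the repro step observes
readinessProbe:
  exec:
    command: ["curl", "-sS", "--fail", "--connect-timeout", "5", "echo-b"]
```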
Symptoms
Source node sees SYN go out to the remote pod via the overlay:
root@kind-control-plane:/home/cilium# cilium monitor
...
-> overlay flow 0x88ee9aa7 identity 6->0 state new ifindex cilium_vxlan orig-ip 0.0.0.0: 172.17.0.3:28440 -> 10.244.1.175:80 tcp SYN
The destination node (kind-worker) sees the SYN and passes it to the pod, which responds with a SYN-ACK; the SYN-ACK is then passed to the stack:
root@kind-worker:/home/cilium# cilium monitor
...
-> endpoint 351 flow 0xfd614d2c identity 6->13896 state new ifindex lxccc74c557b238 orig-ip 172.17.0.3: 172.17.0.3:29228 -> 10.244.1.175:80 tcp SYN
-> stack flow 0xeac287bd identity 13896->6 state reply ifindex 0 orig-ip 0.0.0.0: 10.244.1.175:80 -> 172.17.0.3:29228 tcp SYN, ACK
With tcpdump on the destination node (kind-worker), we see that the response is SNAT'd to the node's IP (172.17.0.2) instead of retaining the pod's IP (10.244.1.175), so the client's connection tracking never matches the reply:
# tcpdump -nvei eth0 | grep 172.17.0.3
...
172.17.0.3.50507 > 10.244.1.175.80: Flags [S], cksum 0xb8e5 (incorrect -> 0x36d2), seq 2626802039, win 64240, options [mss 1460,sackOK,TS val 3387777011 ecr 0,nop,wscale 7], length 0
...
172.17.0.2.57629 > 172.17.0.3.50507: Flags [S.], cksum 0x5856 (incorrect -> 0x72a4), seq 2606385031, ack 2626802040, win 64308, options [mss 1410,sackOK,TS val 901524270 ecr 3387777011,nop,wscale 7], length 0
Mitigation
I manually modified the cilium-config ConfigMap to set enable-bpf-masquerade: "false", then restarted the Cilium pods, and the connectivity check started working.
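A minimal sketch of that mitigation with kubectl (assuming the default kube-system namespace and the standard cilium DaemonSet name):

```shell
# Disable BPF-based masquerading in the Cilium configuration.
kubectl -n kube-system patch configmap cilium-config \
  --type merge -p '{"data":{"enable-bpf-masquerade":"false"}}'

# Restart the Cilium agents so they pick up the new setting.
kubectl -n kube-system rollout restart daemonset/cilium
```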