cilium: fix up source address selection for cluster ip #8141
Merged
Conversation
Member
Author
|
test-me-please |
The CI test 'K8sServicesTest Checks ClusterIP Connectivity Checks service on same node' failed recently due to a buggy L7 policy removal from a prior test; fixed via d8ff018 ("test: fix incorrect deletion statement for policy"). This incorrect deletion uncovered another bug, however, triggered by curl failing to connect to the endpoint backing the service:

  # curl --path-as-is -s -D /dev/stderr --fail --connect-timeout 5 --max-time 8 http://10.110.178.169/ -w "time-> DNS: '%{time_namelookup}(%{remote_ip})', Connect: '%{time_connect}',Transfer '%{time_starttransfer}', total '%{time_total}'"
  time-> DNS: '0.000038()', Connect: '0.000000',Transfer '0.000000', total '3.203689'

(Note, DNS has nothing to do with the failure.)

  # docker exec -ti 6d5465170bf4 bash
  # cilium bpf lb list
  SERVICE ADDRESS       BACKEND ADDRESS
  10.110.178.169:80     0.0.0.0:0 (6)            <--- service from above curl
                        10.10.0.183:80 (6)
                        10.10.0.62:80 (6)
  10.101.55.250:2379    10.10.1.67:2379 (3)
                        0.0.0.0:0 (3)
                        10.10.1.162:2379 (3)
                        10.10.0.24:2379 (3)
  10.96.0.10:53         0.0.0.0:0 (2)
                        10.10.1.185:53 (2)
  10.96.0.1:443         192.168.36.11:6443 (1)
                        0.0.0.0:0 (1)
  # exit

  # ip route get 10.110.178.169
  10.110.178.169 via 10.0.2.2 dev enp0s3 src 10.0.2.15 uid 0
      cache
  # ip route get 10.10.0.183
  10.10.0.183 via 10.10.0.160 dev cilium_host src 10.10.0.160 uid 0
      cache
  # ip route get 10.10.0.62
  10.10.0.62 via 10.10.0.160 dev cilium_host src 10.10.0.160 uid 0
      cache

  # ip r
  default via 10.0.2.2 dev enp0s3 proto dhcp src 10.0.2.15 metric 100
  10.0.2.0/24 dev enp0s3 proto kernel scope link src 10.0.2.15
  10.0.2.2 dev enp0s3 proto dhcp scope link src 10.0.2.15 metric 100
  10.10.0.0/24 via 10.10.0.160 dev cilium_host src 10.10.0.160
  10.10.0.160 dev cilium_host scope link
  10.10.1.0/24 via 10.10.0.160 dev cilium_host src 10.10.0.160
  172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
  172.28.128.0/24 dev enp0s9 proto kernel scope link src 172.28.128.3
  192.168.9.0/24 dev br-e6fcdf93b490 proto kernel scope link src 192.168.9.1
  192.168.36.0/24 dev enp0s8 proto kernel scope link src 192.168.36.11

The curl is executed out of hostns and the service IP is xlated via Kubernetes cluster IP which hooks into iptables nat table in post-routing. What could be seen in cilium monitor was that at the cilium_host device (bpf_netdev on egress), the following mapping was found in the ipcache ...

  Successfully mapped daddr=10.0.2.15 to identity=2

... and the packet dropped later on due to policy. The id mapping of 2 (world) turns out to be the 0/0 catch-all:

  # cilium bpf ipcache get 10.0.2.15
  10.0.2.15 maps to identity 2 0 0.0.0.0

Aside from the mapping, the other unexpected part is the source IP selection. The assumption would have been that, given the backend is local, we would have used 10.10.0.160 based on this route:

  10.10.0.0/24 via 10.10.0.160 dev cilium_host src 10.10.0.160

As can be seen from the `ip route get` output above, however, the source IP selection is based upon the service IP. In retrospect this makes sense because the Kubernetes iptables backend selection is at post-routing, so nothing in terms of source IP will change at that point.

Thus, add a SNAT rule that comes after the KUBE-POSTROUTING chain such that we can fix the source address to cilium_host's address instead, where it will later also find the host identity in the ipcache.

Fixes: #7902
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
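The reasoning in the commit message can be replayed as a small longest-prefix-match sketch. This is only an illustration and not Cilium or kernel code: the route table is a subset copied from the `ip r` output above, the helper names (`longest_prefix_match`, `source_for`, `identity_for`) are hypothetical, and the ipcache is reduced to two entries. The 0/0 -> identity 2 (world) entry comes from the commit message; the /32 entry mapping cilium_host's address to the host identity 1 is an assumption made for the example.

```python
import ipaddress

# Subset of the routing table from `ip r`: (prefix, preferred source address).
ROUTES = [
    ("0.0.0.0/0", "10.0.2.15"),       # default via 10.0.2.2 dev enp0s3
    ("10.0.2.0/24", "10.0.2.15"),     # dev enp0s3
    ("10.10.0.0/24", "10.10.0.160"),  # via 10.10.0.160 dev cilium_host
    ("10.10.1.0/24", "10.10.0.160"),  # via 10.10.0.160 dev cilium_host
]

# Simplified ipcache: (prefix, identity).
IPCACHE = [
    ("0.0.0.0/0", 2),        # world, the 0/0 catch-all from the commit message
    ("10.10.0.160/32", 1),   # assumption: cilium_host address -> host identity
]


def longest_prefix_match(table, addr):
    """Return the value of the most specific prefix in `table` containing `addr`."""
    ip = ipaddress.ip_address(addr)
    best_net, best_value = None, None
    for prefix, value in table:
        net = ipaddress.ip_network(prefix)
        if ip in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_value = net, value
    if best_net is None:
        raise LookupError(f"no entry covers {addr}")
    return best_value


def source_for(dst):
    """Source address the route lookup would pick for destination `dst`."""
    return longest_prefix_match(ROUTES, dst)


def identity_for(addr):
    """Identity the simplified ipcache would return for `addr`."""
    return longest_prefix_match(IPCACHE, addr)


if __name__ == "__main__":
    service_ip, backend_ip = "10.110.178.169", "10.10.0.183"

    # The source is selected against the service IP, which only the default
    # route covers, so the packet leaves with 10.0.2.15 as its source.
    buggy_src = source_for(service_ip)
    print(buggy_src, "-> identity", identity_for(buggy_src))

    # Had the lookup been done against the backend, or with the SNAT rule in
    # place, the source would be cilium_host's address.
    fixed_src = source_for(backend_ip)
    print(fixed_src, "-> identity", identity_for(fixed_src))
```

With the source selected against the service IP, the address only matches the catch-all and resolves to world, which is what the policy drop in the monitor output reflects. Rewriting the source to cilium_host's address is what lets the lookup resolve to the host identity instead.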
1817172 to
10f2ffc
Compare
Member
Author
|
test-me-please |
Member
Author
|
test-missed-k8s |
tgraf
approved these changes
May 28, 2019
Member
Author
|
timeout, retrying |
Member
Author
|
test-missed-k8s |
Member
Author
|
test-me-please |
Member
|
Test failures: https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-Validated/12645/ Running again to see if they reproduce. |
Member
|
Sigh, I fat-fingered and closed the PR :facepalm: |
Member
Author
|
Hit #7105 |
Member
Author
|
test-me-please |
Contributor
|
@borkmann Since we are only hitting known flakes, let's merge. |