Skip to content

test: Set devices and enable host firewall in kube-proxy CI#11969

Merged
aanm merged 5 commits intomasterfrom
pr/pchaigno/set-device-in-ci
Jun 17, 2020
Merged

test: Set devices and enable host firewall in kube-proxy CI#11969
aanm merged 5 commits intomasterfrom
pr/pchaigno/set-device-in-ci

Conversation

@pchaigno
Copy link
Copy Markdown
Member

@pchaigno pchaigno commented Jun 8, 2020

This pull request sets devices and enables the host firewall in kube-proxy CIs.
See commit messages for details.

Fixes: #11972

@pchaigno pchaigno added the release-note/misc This PR makes changes that have no direct user impact. label Jun 8, 2020
@coveralls
Copy link
Copy Markdown

coveralls commented Jun 8, 2020

Coverage Status

Coverage decreased (-0.04%) to 37.057% when pulling c917376 on pr/pchaigno/set-device-in-ci into bf64aa9 on master.

@pchaigno
Copy link
Copy Markdown
Member Author

pchaigno commented Jun 9, 2020

retest-4.9

https://jenkins.cilium.io/job/Cilium-PR-K8s-newest-kernel-4.9/678/
Those same three tests have been failing in the two last runs, although not with the same error messages:

Suite-k8s-1.18.K8sDatapathConfig AutoDirectNodeRoutes Check connectivity with automatic direct nodes routes
Suite-k8s-1.18.K8sDatapathConfig AutoDirectNodeRoutes Check direct connectivity with per endpoint routes
Suite-k8s-1.18.K8sDatapathConfig  Transparent encryption DirectRouting Check connectivity with  transparent encryption and direct routing

I also managed to reproduce locally.

@pchaigno
Copy link
Copy Markdown
Member Author

pchaigno commented Jun 10, 2020

I looked a bit into the kube-dns issue. Apparently, in direct routing mode, DNS replies are coming from the wrong address:

$ dig kubernetes.default.svc.cluster.local @10.0.1.157
;; reply from unexpected source: 192.168.36.12#53, expected 10.0.1.157#53
;; reply from unexpected source: 192.168.36.12#53, expected 10.0.1.157#53

The query reaches the second node with kube-dns as expected, but then the answer leaves the second node with source IP address 192.168.36.12 instead of the 10.0.1.157 expected by the client. It looks like we are either (1) NATing at second node when we shouldn't or (2) missing a reverse NATing.

@brb
Copy link
Copy Markdown
Member

brb commented Jun 10, 2020

We should either disable masquerading for those tests global.masquerade=false, or if the pod cidr allocation is known, then set global.nativeRountingCIDR. The latter will change the masquerade rule from -A CILIUM_POST_nat -s 10.0.1.0/24 ! -d 10.0.1.0/24 ! -o cilium_+ -m comment --comment "cilium masquerade non-cluster" -j MASQUERADE to A CILIUM_POST_nat -s 10.0.1.0/24 ! -d 10.0.0.0/16 ! -o cilium_+ -m comment --comment "cilium masquerade non-cluster" -j MASQUERADE which will prevent from the unnecessary masquerading which makes the rev-DNAT xlation impossible.

@brb
Copy link
Copy Markdown
Member

brb commented Jun 10, 2020

One more data point: when we do NOT set --device(s), a reply from bpf_lxc does not enter nat/POSTROUTING chain (in which the MASQ rule is installed).

@brb
Copy link
Copy Markdown
Member

brb commented Jun 10, 2020

tc filter del dev enp0s8 ingress fixes the issue.

@pchaigno pchaigno force-pushed the pr/pchaigno/set-device-in-ci branch 5 times, most recently from 5296abc to 5cb4ada Compare June 12, 2020 05:33
@pchaigno pchaigno force-pushed the pr/pchaigno/set-device-in-ci branch from 5cb4ada to 8766e0f Compare June 12, 2020 07:11
@pchaigno pchaigno closed this Jun 12, 2020
@pchaigno pchaigno reopened this Jun 12, 2020
@pchaigno pchaigno force-pushed the pr/pchaigno/set-device-in-ci branch from 8766e0f to 05f731a Compare June 12, 2020 14:11
@pchaigno pchaigno changed the title test: Set --device in all CI pipelines test: Set devices and enable host firewall in kube-proxy CIs Jun 12, 2020
@pchaigno pchaigno force-pushed the pr/pchaigno/set-device-in-ci branch 4 times, most recently from 1d9bcb4 to 5032bb7 Compare June 15, 2020 06:06
Previously the devices were only configured when kube-proxy was
disabled, for use in BPF-based NodePort. This commit always sets the
private and default interfaces, in preparation for the host firewall
enablement in kube-proxy CI pipelines.

Signed-off-by: Paul Chaignon <paul@cilium.io>
@pchaigno pchaigno force-pushed the pr/pchaigno/set-device-in-ci branch from 5032bb7 to 284bd18 Compare June 16, 2020 13:24
@pchaigno
Copy link
Copy Markdown
Member Author

test-me-please

@pchaigno pchaigno marked this pull request as ready for review June 16, 2020 18:27
@pchaigno pchaigno requested a review from a team June 16, 2020 18:27
@pchaigno pchaigno requested a review from a team as a code owner June 16, 2020 18:27
@pchaigno pchaigno changed the title test: Set devices and enable host firewall in kube-proxy CIs test: Set devices and enable host firewall in kube-proxy CI Jun 16, 2020
@pchaigno pchaigno requested a review from brb June 16, 2020 19:05
@pchaigno pchaigno added needs-backport/1.8 area/CI Continuous Integration testing issue or flake labels Jun 16, 2020
Comment thread test/k8sT/DatapathConfiguration.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a small comment explaining why (1) when we run w/ kube-proxy, (2) we disable masquerade for this particular test case.

Copy link
Copy Markdown
Member Author

@pchaigno pchaigno Jun 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brb TBH, I'm not sure I understand why this change is necessary. It is supposed to work with masquerade enabled, isn't it? Is there an issue opened somewhere to track this?

@maintainer-s-little-helper maintainer-s-little-helper Bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 17, 2020
Copy link
Copy Markdown
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one suggestion to add a comment.

pchaigno added 4 commits June 17, 2020 15:21
Otherwise, setting devices results in invalid masquerading rules.

Related: #12141
Suggested-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Paul Chaignon <paul@cilium.io>
For IPSec, we attach bpf_network to the private interface. If we set the
devices, we will however attach bpf_host to the private interface.

Since we currently cannot have both devices specific and IPSec enabled,
unset devices for the IPSec test.

Signed-off-by: Paul Chaignon <paul@cilium.io>
Both features cannot be used together currently since they will both try
to attach different programs, bpf_host and bpf_network, to the
encryption interface.

Fixes: f74087e ("daemon: Introduce enable-host-firewall option")
Signed-off-by: Paul Chaignon <paul@cilium.io>
The host firewall was previously enabled only in kube-proxy-free CI
pipelines. This commit enables it in kube-proxy CI pipelines as well.

Signed-off-by: Paul Chaignon <paul@cilium.io>
@pchaigno pchaigno force-pushed the pr/pchaigno/set-device-in-ci branch from 284bd18 to c917376 Compare June 17, 2020 13:25
@aanm aanm merged commit 989eced into master Jun 17, 2020
@aanm aanm deleted the pr/pchaigno/set-device-in-ci branch June 17, 2020 13:47
jrajahalme added a commit that referenced this pull request Jun 17, 2020
Without this override we'd get the following in the generated Cilium yaml:

$ diff -C3 good-cilium.yaml cilium-1619730510c46898.yaml
*** good-cilium.yaml	2020-06-17 14:07:44.000000000 -0700
--- cilium-1619730510c46898.yaml	2020-06-17 14:46:51.000000000 -0700
***************
*** 224,229 ****
--- 224,232 ----
    install-iptables-rules: "true"
    auto-direct-node-routes: "false"
    native-routing-cidr: 10.0.0.0/8
+   # List of devices used to attach bpf_host.o (implements BPF NodePort,
+   # host-firewall and BPF masquerading)
+   devices: "eth0 eth0\neth0"
    kube-proxy-replacement:  "probe"
    node-port-mode: "snat"
    node-port-bind-protection: "true"

Fixes: #11969
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
jrajahalme added a commit that referenced this pull request Jun 18, 2020
Without this override we'd get the following in the generated Cilium yaml:

$ diff -C3 good-cilium.yaml cilium-1619730510c46898.yaml
*** good-cilium.yaml	2020-06-17 14:07:44.000000000 -0700
--- cilium-1619730510c46898.yaml	2020-06-17 14:46:51.000000000 -0700
***************
*** 224,229 ****
--- 224,232 ----
    install-iptables-rules: "true"
    auto-direct-node-routes: "false"
    native-routing-cidr: 10.0.0.0/8
+   # List of devices used to attach bpf_host.o (implements BPF NodePort,
+   # host-firewall and BPF masquerading)
+   devices: "eth0 eth0\neth0"
    kube-proxy-replacement:  "probe"
    node-port-mode: "snat"
    node-port-bind-protection: "true"

Fixes: #11969
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
joestringer pushed a commit that referenced this pull request Jun 18, 2020
Without this override we'd get the following in the generated Cilium yaml:

$ diff -C3 good-cilium.yaml cilium-1619730510c46898.yaml
*** good-cilium.yaml	2020-06-17 14:07:44.000000000 -0700
--- cilium-1619730510c46898.yaml	2020-06-17 14:46:51.000000000 -0700
***************
*** 224,229 ****
--- 224,232 ----
    install-iptables-rules: "true"
    auto-direct-node-routes: "false"
    native-routing-cidr: 10.0.0.0/8
+   # List of devices used to attach bpf_host.o (implements BPF NodePort,
+   # host-firewall and BPF masquerading)
+   devices: "eth0 eth0\neth0"
    kube-proxy-replacement:  "probe"
    node-port-mode: "snat"
    node-port-bind-protection: "true"

Fixes: #11969
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
@pchaigno pchaigno added the area/host-firewall Impacts the host firewall or the host endpoint. label Jul 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/CI Continuous Integration testing issue or flake area/host-firewall Impacts the host firewall or the host endpoint. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Setting devices without NodePort breaks multi-node connectivity

6 participants