nodeinit: Disable default ip-masq-agent jumps #11782
Conversation
Signed-off-by: John Watson <johnw@planetscale.com>
Please set the appropriate release note label.
Signed-off-by: John Watson <johnw@planetscale.com>
Never mind about AKS; I see AKS itself has added Cilium as a supported network plugin, which fixes this going forward. Just updating the AKS docs can fix it now.
test-gke
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
```
{{- if .Values.global.gke.enabled }}
# If the IP-MASQ chain exists, add back jump rule
{{- end }}
```
This needs a more descriptive comment on why this is needed instead of what the code does. Why is this needed? Why does Cilium need to manipulate k8s iptables rules?
@tgraf Add more info to the comment in the install script portion
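As a minimal sketch of the guarded logic under review (the rule spec and function name here are illustrative assumptions, not the PR's exact code), with `IPT` overridable so the logic can be dry-run without root:

```shell
#!/bin/sh
# Sketch only, assuming the nodeinit-script shape discussed above; the
# exact rule spec in Cilium's template may differ.
# IPT is overridable (e.g. IPT="echo iptables") for a rootless dry run.
IPT="${IPT:-iptables}"

readd_ipmasq_jump() {
  # `iptables -L IP-MASQ` exits non-zero when the chain does not exist,
  # i.e. when ip-masq-agent never installed it on this node.
  if $IPT -w -t nat -L IP-MASQ >/dev/null 2>&1; then
    $IPT -w -t nat -A POSTROUTING -j IP-MASQ
  fi
}

readd_ipmasq_jump
```

The point of the chain-existence check is that the jump is only re-added when ip-masq-agent is actually managing masquerade on the node.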
Commit 80b4ac05cf42d85ad575a1ad1e8086c2ac4eaea9 does not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin
Signed-off-by: John Watson <johnw@planetscale.com>
80b4ac0 to 9a06f30 (force-push)
test-gke

retest-netnext
https://jenkins.cilium.io/job/Cilium-PR-K8s-GKE/2061/consoleFull is where I pulled the excerpt from.
retest-gke

retest-gke
Looks like just this over and over: …
@dctrwatson have you been able to manually test this in your own GKE environment? The GKE CI job is not required, but since this PR is specifically about GKE, the main thing we're looking for is reassurance that it works. We're working through some CI infrastructure provisioning issues with the GKE-based CI, but as long as you've done due diligence this seems reasonable to merge. EDIT: I should maybe add that seeing …
Yes, I have been using this with Cilium 1.7.x since a couple of days before submitting this PR.
GKE Node Version:

Though I had to fix the port for the 2 nodeport pods:

```diff
--- connectivity-check.yaml	2020-08-07 18:45:53.970029385 -0700
+++ connectivity-check.yaml.orig	2020-08-07 18:47:32.146661105 -0700
@@ -365,13 +365,13 @@
       - -c
       - sleep 1000000000
       image: docker.io/byrnedo/alpine-curl:0.1.8
-      imagePullPolicy: IfNotPresent
+      imagePullPolicy: IfNotPresent
       livenessProbe:
         exec:
-          command: [ "curl", "-sS", "--fail", "-o", "/dev/null", "echo-b-host-headless:41000" ]
+          command: [ "curl", "-sS", "--fail", "-o", "/dev/null", "echo-b-host-headless:31313" ]
       readinessProbe:
         exec:
-          command: [ "curl", "-sS", "--fail", "-o", "/dev/null", "echo-b-host-headless:41000" ]
+          command: [ "curl", "-sS", "--fail", "-o", "/dev/null", "echo-b-host-headless:31313" ]
       name: pod-to-b-intra-node-hostport
 ---
 apiVersion: apps/v1
@@ -506,10 +506,10 @@
       imagePullPolicy: IfNotPresent
       livenessProbe:
         exec:
-          command: [ "curl", "-sS", "--fail", "-o", "/dev/null", "echo-b-host-headless:41000" ]
+          command: [ "curl", "-sS", "--fail", "-o", "/dev/null", "echo-b-host-headless:31313" ]
       readinessProbe:
         exec:
-          command: [ "curl", "-sS", "--fail", "-o", "/dev/null", "echo-b-host-headless:41000" ]
+          command: [ "curl", "-sS", "--fail", "-o", "/dev/null", "echo-b-host-headless:31313" ]
       name: pod-to-b-multi-node-nodeport
```
Hmm. I don't actually see a nodeport declaration in that v1.7 version of the connectivity-check. Port 41000 actually seems to be a host application listening on the port, so it seems like we need to fix that up. If you compare to the v1.8 YAML, there's actually a nodeport configured on port …
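For context, a NodePort Service pins the node-level port explicitly, along these lines (an illustrative sketch only, not the actual connectivity-check manifest; the name and cluster ports are assumptions, and 31313 is the port used in the diff above):

```yaml
# Illustrative sketch -- not the real connectivity-check YAML.
apiVersion: v1
kind: Service
metadata:
  name: echo-b-node-port   # hypothetical name
spec:
  type: NodePort
  selector:
    name: echo-b           # assumed pod label
  ports:
  - port: 8080             # cluster-internal service port (assumed)
    targetPort: 8080       # container port (assumed)
    nodePort: 31313        # fixed node port, matching the probes above
```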
test-gke
I just tried with the v1.8 connectivity-check.yaml and it worked out of the box!
test-gke
The GKE failure seems to be caused by a known flake; let's merge.
Discussed during the SIG-datapath meeting: this fixes BPF masquerade (including ip-masq-agent) in v1.8. However, as it applies right now, if BPF masquerade is disabled it could break the ip-masq-agent configuration in GKE. The default Cilium configuration should be to enable BPF masquerade (on new enough kernels), in which case we want the logic introduced here (delete the rule). The concern we have is that when BPF masquerade is disabled, this could cause unintended SNAT. The check switches on the …
Oh, interesting. This was originally against v1.7, so … The intention is that by using Cilium, ip-masq-agent is no longer used, thus making … I see 3 configurations: …

So I think another flag would have to be added, explicitly stating to fall back to the old behavior of using ip-masq-agent?
FWIW, on GKE option 3 means we should just let GKE's ip-masq-agent do its thing, so the current behavior might be an issue in this case?
@dctrwatson right now the check is just for whether GKE is enabled: … Perhaps a check for …
@Weil0ng that's the behavior I found surprising and why I made this PR. My expectation was that Cilium manages all aspects of networking once installed, overriding kube-proxy, ip-masq-agent, etc. (on new enough kernels). I'm not opposed to the old behavior of falling back to ip-masq-agent if the docs are updated to reflect that. In GKE, there are default masquerade rules that will always apply, so to actually disable masquerade would take 3 flags: …
@joestringer that would satisfy my intention for case (3). But as @Weil0ng points out, maybe not everyone wants to bypass ip-masq-agent rules?
Bypassing/replacing ip-masq-agent or not is more of a strategic decision. There are users who have custom maps for ip-masq-agent, so it might be beneficial to preserve the old behavior (default back to ip-masq-agent …)
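To make the trade-off concrete, the three configurations discussed could be expressed as Helm values along these lines (the key names here are hypothetical placeholders for illustration, not the chart's actual values):

```yaml
# Hypothetical values sketch -- key names are placeholders.
# 1) BPF masquerade on: Cilium owns SNAT, delete the POSTROUTING -> IP-MASQ jump.
bpfMasquerade: true      # hypothetical key
# 2) BPF masquerade off, Cilium iptables masquerade on: jump handling is
#    where the unintended-SNAT concern arises.
# bpfMasquerade: false
# masquerade: true
# 3) Cilium masquerade fully off: leave ip-masq-agent's rules untouched so
#    users with custom ip-masq-agent maps keep their behavior.
# masquerade: false
```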
@Weil0ng @David0922 @joestringer I think #12952 handles the unexpected behavior this PR introduced.
Noticed some surprising behavior where disabling Cilium masquerade does not actually disable masquerade in GKE.
In GKE, even if the IP masquerade agent is not enabled, the instance configure script adds some default rules.

This removes the jump from `POSTROUTING` to the `IP-MASQ` chain.
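As a sketch of that idea (the rule spec and function name are assumptions for illustration, not the PR's exact nodeinit code, which may also match on ip-masq-agent's comment string), with `IPT` overridable so the command can be dry-run:

```shell
#!/bin/sh
# Sketch only: drop the jump that GKE's instance configure script installs
# from the nat POSTROUTING chain into ip-masq-agent's IP-MASQ chain, leaving
# Cilium's BPF masquerade as the only SNAT on the node.
# IPT is overridable (e.g. IPT="echo iptables") for a rootless dry run.
IPT="${IPT:-iptables}"

delete_ipmasq_jump() {
  # -D removes the first matching rule; ignore failure if it is absent.
  $IPT -w -t nat -D POSTROUTING -j IP-MASQ 2>/dev/null || true
}

delete_ipmasq_jump
```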