Fix GKE Helm options for CI and docs. #12087
|
test-gke |
|
test-me-please |
Note that we need this particular change on v1.7 as well (but not the ipam option). Given you're on backports this week, probably easiest for you to make sure the right commit from this PR goes back.
Actually, it looks like we backported this change to the docs but never backported the helm side interpretation so it has been a no-op the entire time 🤦
|
test-gke |
|
Pushed one more commit only touching the GKE jenkinsfile after all other CI jobs already succeeded. |
|
test-gke |
|
test-gke |
|
@nebril added one more commit to pass Hubble-relay image with the proper tag to Ginkgo, please have a look! |
|
test-gke |
|
This needs #12076; waiting for it to be merged before rebasing. |
92a1fc9 to
f7fbf0a
|
rebased, retesting.. |
|
test-gke |
|
5 failures in https://jenkins.cilium.io/job/Cilium-PR-K8s-GKE/1687/ |
|
test-gke |
|
GKE pipeline failed but now with some more data: https://jenkins.cilium.io/job/Cilium-PR-K8s-GKE/1688/ |
f7fbf0a to
2b4e618
|
test-gke |
Helm name for the native routing CIDR is "global.nativeRoutingCIDR". Cilium agent command line option name is "native-routing-cidr".
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
GKE install guide specifies the "ipam.config=kubernetes" option while CI gkeHelmOverrides do not. Adding this option to CI gkeHelmOverrides allowed Ginkgo runs in GKE to succeed.
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
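To illustrate how the two options from the commit messages above would be handed to Helm, here is a minimal sketch. The helper `helm_set_flags` is hypothetical and not part of the Cilium CI code; the CIDR value is only an example, and the option names are copied from the commit messages as written, so they should be checked against the chart version in use.

```shell
# Hypothetical helper: turn key=value overrides into repeated "--set" flags.
helm_set_flags() {
  flags=""
  for kv in "$@"; do
    # Add a separating space only when flags is already non-empty.
    flags="${flags:+$flags }--set $kv"
  done
  printf '%s\n' "$flags"
}

# Example overrides; names taken from the commit messages, value assumed.
helm_set_flags "global.nativeRoutingCIDR=10.0.0.0/8" "ipam.config=kubernetes"
```

The point of the sketch is that the Helm value name ("global.nativeRoutingCIDR") is what goes on the Helm command line, while "native-routing-cidr" is what ends up in the generated agent configuration.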
Without this override we'd get the following in the generated Cilium yaml:
$ diff -C3 good-cilium.yaml cilium-1619730510c46898.yaml
*** good-cilium.yaml 2020-06-17 14:07:44.000000000 -0700
--- cilium-1619730510c46898.yaml 2020-06-17 14:46:51.000000000 -0700
***************
*** 224,229 ****
--- 224,232 ----
install-iptables-rules: "true"
auto-direct-node-routes: "false"
native-routing-cidr: 10.0.0.0/8
+ # List of devices used to attach bpf_host.o (implements BPF NodePort,
+ # host-firewall and BPF masquerading)
+ devices: "eth0 eth0\neth0"
kube-proxy-replacement: "probe"
node-port-mode: "snat"
node-port-bind-protection: "true"
Fixes: #11969
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
Using the "fat" cilium-operator image fails with:
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: ContainerCannotRun
Message: OCI runtime create failed: container_linux.go:345: starting container process caused "exec: \"cilium-operator-generic\": executable file not found in $PATH": unknown
Exit Code: 127
Looks like the "cilium-operator" image tries to exec "cilium-operator-generic"?
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
Pass the full image name to Ginkgo in the GKE Jenkinsfile, as otherwise Ginkgo tries to use the default tag (latest), which is not available for the test setup.
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
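The default-tag behavior described above can be sketched as follows. This is not Ginkgo's or the Jenkinsfile's actual logic; `with_default_tag` is a hypothetical helper that mimics the usual container convention of treating an untagged reference as `:latest`, and the image names are made up.

```shell
# Hypothetical helper: append ":latest" when the reference carries no tag.
with_default_tag() {
  ref=$1
  # Look only at the last path component, so that a registry port
  # such as "localhost:5000" is not mistaken for a tag.
  last=${ref##*/}
  case $last in
    *:*) printf '%s\n' "$ref" ;;           # tag (or digest) already present
    *)   printf '%s\n' "${ref}:latest" ;;  # no tag: fall back to "latest"
  esac
}

with_default_tag "localhost:5000/hubble-relay"
with_default_tag "localhost:5000/hubble-relay:2b4e618"
```

Passing the fully tagged name avoids the fallback branch, which matters when no image was pushed under `latest`.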
…docs
Basic regular expressions in sed do not support '+', while the command line arguments for extended regular expressions are different for GNU sed and Mac OSX. Use (unquoted) '*' instead, as there is no harm replacing '[]' with '[]'. This form works on both Linux and Mac OSX. Add a note to the e2e testing docs so that users know to run the script if namespaces get stuck at 'Terminating'.
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
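A small sketch of the sed portability point above. The sed expressions and the input line are illustrative assumptions, not the ones used in the actual script: in a basic regular expression an unescaped '+' is a literal plus sign, so a '+'-based pattern silently fails to match, while '*' behaves the same way in GNU and BSD sed.

```shell
# Hypothetical patterns; the real script's expression is not shown in this PR.
strip_with_plus() { sed 's/\[.+]/[]/'; }   # BRE: '+' is literal, so no match here
strip_with_star() { sed 's/\[.*]/[]/'; }   # BRE: '*' means zero or more

line='"finalizers": ["kubernetes"]'

printf '%s\n' "$line" | strip_with_plus    # line passes through unchanged
printf '%s\n' "$line" | strip_with_star    # list contents are emptied
printf '%s\n' '"finalizers": []' | strip_with_star   # '[]' is replaced by '[]'
```

Since '*' also matches an empty list, the only cost of the portable form is a harmless no-op substitution.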
GKE can time out on apply when installing Cilium; use a longer timeout.
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
Create the cluster lock in namespace 'cilium-ci-lock' instead of 'default' so that it survives the CI deleting all resources in the default namespace:
15:57:58 22:57:57 STEP: Deleting rs [lock-6f4fb6cdfb] in namespace default
15:57:58 22:57:57 STEP: Waiting for 1 deletes to return (lock-6f4fb6cdfb)
15:57:58 22:57:57 STEP: Deleting deployment [lock] in namespace default
15:57:58 22:57:57 STEP: Waiting for 1 deletes to return (lock)
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
0214ce8 to
af11191
|
test-me-please |
|
test-gke |
|
rebased, no changes |
|
test-gke |
|
test-gke |
|
The only change here that can affect non-GKE tests is the doubling of the Apply timeout from 30 seconds to 60 seconds. Due to the cluster locking bug, all GKE test runs can step on each other at any time until this PR is merged and all PRs are rebased to include the locking fix. There have been many successful GKE CI runs, but AFAIK all of them have been on cluster Given the above, this PR is ready to merge. |
GKE install guide specifies the "ipam.config=kubernetes" option while
CI gkeHelmOverrides do not. Adding this option to CI gkeHelmOverrides
allowed Ginkgo runs in GKE to succeed.
Helm name for the native routing CIDR is
"global.nativeRoutingCIDR". Cilium agent command line option name is
"native-routing-cidr".
Fixes: #12053