Conversation
[ upstream commit 8f624a7 ] Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 269c985 ] If the cluster has just been recreated, it can have no nodepool yet. Don't choose such clusters because scaling it will fail. Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 82f0cd0 ] We are hitting a lot of issues caused by state left over from test run. This change deletes the cluster after the test run and relies on our infra recreating the cluster for subsequent builds. Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 0868c6c ] Remove the generated temp files after they are no longer needed. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 425fbd7 ] We already have the test where the request fails, so add one for the case where it must succeed from outside. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 5382f2a ] Hubble v0.5 used to display FQDN names without a trailing dot for absolute paths, i.e. `cilium.io` instead of `cilium.io.`. We accidentally changed this in Hubble v0.6 when we started accessing the Cilium DNS cache directly. This in in turn broke filtering on FQDNs (--{from-,to-,}fqdn), as the filtering logic does not assume trailing dots, same as Hubble UI, the CLI and metrics. This commit restores the old behavior of stripping trailing dots from the source and destination name. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 00fc18f ] This performs an additional `hubble observe` invocation when testing DNS-based policies. We expect Hubble to annotate the `destination_names` field of the test flows with the observed DNS name exactly as stated in the test. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 86729fb ] Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit edff374 ] This PR adds in an option 'enable-health-check-nodeport' to the cilium-agent for disabling 'HealthCheckNodePort' based on the KubeProxyReplacement configuration. It also adds in an opton 'enableHealthCheckNodePort' to the helm charts and documents the new config options with impacts when kubeProxyReplacement is set to 'partial'. Fixes: #11168 Signed-off-by: Swaminathan Vasudevan <svasudevan@suse.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 082e879 ] This change removes reliance on hardcoded node names when retrieving pods in several tests, which caused GKE tests (where we don't control node names) to fail. Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 54275be ] Signed-off-by: Vlad Ungureanu <vladu@palantir.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit d131945 ] It's no longer required to specify podCIDR [1] when provisioning k8s with kubeadm for Cilium. Removing this allows us to simplify the guide by getting rid of passing the ipam mode to helm which was introduced by #12246. [1]: kubernetes/website#21432 (comment) Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit fd56ee9 ] Noticed we didn't mention anything about it yet, so state it explicitly here. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit f4ea3f6 ] Also, bump tests to 1.18.5, 1.17.8 and 1.16.12 Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
|
test-backport-1.8 |
|
In all failed jobs it's the same test failing |
| res := kubectl.HubbleObserve(helpers.CiliumNamespace, ciliumPod, | ||
| fmt.Sprintf("--last 1 --from-pod %s/%s --to-fqdn %q", | ||
| namespaceForTest, appPods[helpers.App2], "*.cilium.io")) | ||
| res.ExpectContainsFilterLine("{.destination_names[0]}", "vagrant-cache.ci.cilium.io") |
There was a problem hiding this comment.
Looks like this newly added line is the culprit of the test failures. Tracing a bit of git blame it seems like this may rely on #12332, but that PR was already backported to v1.8. @christarazi I think you did the backport of that PR, do you recall anything unusual about the backport of that one?
Anyway, if it's so reliably failing in every job then this shouldn't be a huge effort to reproduce locally & iterate to locate/fix the issue.
There was a problem hiding this comment.
Sorry I'm not seeing which PR you're referring to. I see the backport PR I've done, but not the upstream PR.
I don't recall any conflicts when doing 1.8 backports. Also, the tests passed the first time in the backport PR (#12332), so it seems we have a new failure here, rather than in the previous.
There was a problem hiding this comment.
Oh, I quoted the wrong PR, I meant #12252 . However if there weren't any conflicts then this won't provide us any simple answers anyway, something else must be missing.
There was a problem hiding this comment.
I have absolutely no idea what could be wrong here, since the full stdout is:
{"time":"2020-07-13T11:21:23.607974208Z","verdict":"FORWARDED","ethernet":{"source":"82:6a:80:39:5f:4b","destination":"5a:7e:bc:be:37:ab"},"IP":{"source":"10.0.0.131","destination":"147.75.38.95","ipVersion":"IPv4"},"l4":{"TCP":{"source_port":56212,"destination_port":80,"flags":{"ACK":true}}},"source":{"ID":277,"identity":25122,"namespace":"202007131320k8spolicytestbasictesttrafficredirectionstoproxytes","labels":["k8s:appSecond=true","k8s:id=app2","k8s:io.cilium.k8s.policy.cluster=default","k8s:io.cilium.k8s.policy.serviceaccount=app2-account","k8s:io.kubernetes.pod.namespace=202007131320k8spolicytestbasictesttrafficredirectionstoproxytes","k8s:zgroup=testapp"],"pod_name":"app2-88b7f8c4b-l784f"},"destination":{"identity":2,"labels":["reserved:world"]},"Type":"L3_L4","node_name":"k8s1","destination_names":["vagrant-cache.ci.cilium.io"],"event_type":{"type":4,"sub_type":3},"traffic_direction":"EGRESS","Summary":"TCP Flags: ACK"}
so the .destination[0] selector should clearly return vagrant-cache.ci.cilium.io.
The output of the failing test also seems weird:
[2020-07-07T11:12:48.199Z] <string>: vagrant-cache.ci.cilium.io
[2020-07-07T11:12:48.199Z] to be an element of
[2020-07-07T11:12:48.199Z] <[]interface {} | len:1, cap:1>: [
[2020-07-07T11:12:48.199Z] [
[2020-07-07T11:12:48.199Z] vagrant-cache.ci.cilium.io,
[2020-07-07T11:12:48.199Z] ],
[2020-07-07T11:12:48.199Z] ]
So... basically... "I could not find vagrant-cache.ci.cilium.io in [ vagrant-cache.ci.cilium.io, ]." Huh?
I suspect that maybe the backported ExpectContainsFilterLine function #12252 is now working properly, but still, I have no idea why it could be. Anyway, I'm going to check it with Delve and see if it leads me anywhere.
There was a problem hiding this comment.
So... basically... "I could not find
vagrant-cache.ci.cilium.ioin[ vagrant-cache.ci.cilium.io, ]." Huh?
Sounds a bit like a type issue,i.e. the needle is of type string, but the haystack is of type[]string, so there is never a match. We had a similar issue before: #11794 -- but not sure why this would only happen on the v1.8 branch.
There was a problem hiding this comment.
Good catch, thanks a lot!
[ upstream commit a20dcd1 ] Current matchers implementation makes policy related specs always pass even when value is present in the output. This patch updates ExpectContainsFilterLine and ExpectDoesNotContainFilterLine matchers to match expected value against string slice rather than slice of FilterBuffer. This patch also updates policy specs to use ExpectContainsFilterLine matcher where applicable. Minor consistency changes were also added to policy specs. Signed-off-by: Arthur Evstifeev <aevstifeev@gitlab.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
|
test-backport-1.8 |
|
The CI is still failing with: which is due to @gandro Any pointers which backport might be missing? |
"connect: no route to host" looks like a connectivity problem from the vagrant machine to the image registry. Nothing specific to Hubble as far as I can tell, as there are other non-Hubble failures as well. Packet having connectivity issues? |
|
test-backport-1.8 |
Once this PR is merged, you can update the PR labels via: