v1.8 backports 2020-07-15 (#12536)
Conversation
[ upstream commit a091ba1 ] This avoids complications of the compiler reordering the fields which we also use from Go code, and also allows the use of masks in checks. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 414cadf ] Instead of just a boolean, return the reason of failure in isConnectedAndHasQuorum() Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
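The pattern described above — returning the failure reason instead of a bare boolean — can be sketched as below. This is a minimal illustration, not Cilium's actual code; the `client` type and error values are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// Illustrative error values a health check might surface instead of "false".
var (
	errNotConnected = errors.New("not connected to etcd")
	errNoQuorum     = errors.New("no etcd quorum")
)

// client is a hypothetical stand-in for the kvstore client state.
type client struct {
	connected bool
	hasQuorum bool
}

// isConnectedAndHasQuorum returns nil when healthy, or the reason for failure,
// so callers can report it in status output rather than a bare "not ok".
func (c *client) isConnectedAndHasQuorum() error {
	if !c.connected {
		return errNotConnected
	}
	if !c.hasQuorum {
		return errNoQuorum
	}
	return nil
}

func main() {
	c := &client{connected: true, hasQuorum: false}
	if err := c.isConnectedAndHasQuorum(); err != nil {
		fmt.Println("unhealthy:", err) // prints "unhealthy: no etcd quorum"
	}
}
```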
[ upstream commit 5875b77 ] In general, the etcd client library should fail over to a healthy etcd endpoint and quorum errors should automatically resolve. If this does not happen for some reason, report unhealthy status to trigger a restart of the agent. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit b95650b ] Shuffling the list of etcd endpoints on each bootstrap has two positive effects: * Agents in the cluster will connect to different etcd members. * In case etcd fails to fail over due to a bug, a Cilium agent restart has a chance to fail over to a healthy etcd member. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
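A bootstrap-time shuffle of the configured endpoints could look like the following sketch. The function name is illustrative, not Cilium's actual code.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// shuffleEndpoints returns a randomly permuted copy of the configured etcd
// endpoints, so that different agents connect to different etcd members and
// an agent restart has a chance to land on a healthy member.
func shuffleEndpoints(eps []string) []string {
	r := rand.New(rand.NewSource(time.Now().UnixNano()))
	shuffled := make([]string, len(eps))
	copy(shuffled, eps)
	r.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	return shuffled
}

func main() {
	eps := []string{"https://etcd-0:2379", "https://etcd-1:2379", "https://etcd-2:2379"}
	fmt.Println(shuffleEndpoints(eps))
}
```

Shuffling a copy (rather than in place) keeps the original configuration slice untouched for later reuse.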
[ upstream commit 53c17cd ] Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit d7e2076 ] Previously, BPF masquerading could not be performed when running with non-hybrid DSR due to a call to nodeport_nat_fwd() being compiled out. The latter is responsible for SNATing packets from pods to world. Fix: 0962c02 ("datapath: Enable BPF MASQ for veth mode in IPv4") Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 31592c5 ] Host Policies can break Kubernetes if applied incorrectly. If a user unintentionally applies an overly aggressive host-level network policy, it will break access to their nodes and to the Kubernetes control plane. Given the blast radius of a mistake and the undocumented fix, I discussed an initial solution in the docs with @pchaigno and we agreed that a good start is preventing an example policy from loading until the user changes example values. This has the benefit of explicitly requiring some thought from users before applying a policy that could lead to a k8s outage. Happy to discuss alternative ideas here, but the goal is to protect operators from breaking their environments while testing this feature. Signed-off-by: Jed Salazar <jed@isovalent.com> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 9fe5b33 ] Signed-off-by: JieJhih Jhang <jiejhihjhang@gmail.com> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit f6ffe0d ] Name it generically and add another bool to control whether the dst port must be inside or outside of the NodePort range. For the current users, it is set so that they match on the configured NodePort range. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 48bf28a ] Prepare the socket LB for a hostport-related wildcard lookup. The NodePort and HostPort range are exclusive, so the lookup is still fast given we can bail out early w/o map lookup in sock{4,6}_wildcard_lookup(). Aside from the range, we must match on HOST_ID dst address for the hostport lookup. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Martynas Pumputis <m@lambda.lt>
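The early bail-out described above can be sketched as follows — in Go for readability, while the real check lives in the BPF C code of sock{4,6}_wildcard_lookup(). Because the NodePort and HostPort ranges are exclusive, a wildcard lookup can return early without any map lookup when the destination port falls outside the range it cares about. All names here are illustrative.

```go
package main

import "fmt"

// portInRange reports whether port lies within [lo, hi].
func portInRange(port, lo, hi uint16) bool {
	return port >= lo && port <= hi
}

// needsLookup reports whether the service map must be consulted at all:
// NodePort lookups only match inside the configured NodePort range, while
// HostPort entries live outside of it, so either way we can bail out early
// for ports in the wrong range.
func needsLookup(dstPort, npMin, npMax uint16, hostPort bool) bool {
	in := portInRange(dstPort, npMin, npMax)
	if hostPort {
		return !in
	}
	return in
}

func main() {
	fmt.Println(needsLookup(31000, 30000, 32767, false)) // NodePort candidate: true
	fmt.Println(needsLookup(8080, 30000, 32767, false))  // bail out early: false
	fmt.Println(needsLookup(8080, 30000, 32767, true))   // HostPort candidate: true
}
```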
[ upstream commit 7555632 ] Ensure that we don't process such packets from outside if they ever land on our node. Instead of doing a more expensive memcmp() on the address, just include this into the test for the underlying type. The latter is still present for the case where the agent is restarted & reconfigured and map entries are still stale before k8s resync. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 81353c6 ] In IPv6 case we already use net.IPv6zero in LoadBalancerNodeAddresses(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 7651ecd ] In order to be compatible with portmap's HostPort mode, we need to expose the host port on all addresses for the case where no hostIP was specified. This also includes the loopback address. For the case where hostIP was specified, we only expose the user-provided IP address under HostPort. A typical portmap rule set looks like (removing CNI-HOSTPORT-SETMARK from here):

```
[...]
[6:408] -A PREROUTING -m addrtype --dst-type LOCAL -j CNI-HOSTPORT-DNAT
[...]
[0:0] -A CNI-DN-1ba85aa0447ede8ba6dab -d 8.8.8.8/32 -p tcp -m tcp --dport 999 -j DNAT --to-destination 10.0.0.217:80
[0:0] -A CNI-DN-489a67438eed03d272897 -d 127.0.0.1/32 -p tcp -m tcp --dport 298 -j DNAT --to-destination 10.0.0.27:80
[2:120] -A CNI-DN-526ac7f749f8260cb0d9f -p tcp -m tcp --dport 666 -j DNAT --to-destination 10.0.0.188:80
[2:120] -A CNI-DN-81de08afe40acdfde3fac -d 192.168.178.29/32 -p tcp -m tcp --dport 777 -j DNAT --to-destination 10.0.0.138:80
[2:120] -A CNI-HOSTPORT-DNAT -p tcp -m comment --comment "dnat name: \"portmap\" id: \"dd111758b3cc7d46e0c027725f4c1cc3d5ac349c6cb1d6ebb919a5f433f155aa\"" -m multiport --dports 666 -j CNI-DN-526ac7f749f8260cb0d9f
[...]
```

In the above example, hostPort 666 has no hostIP specified. hostPort 777 has the hostIP 192.168.178.29 specified; in such a case, access via 127.0.0.1:777 is not possible. hostPort 298 has 127.0.0.1 as hostIP, so it is only available via loopback but not from the outside world. And hostPort 999 has 8.8.8.8 as hostIP; that is not forbidden, but it won't match/xlate if 8.8.8.8 is not a local address on the node. Implement these semantics also for the BPF-based replacement by reworking the k8s watcher's genServiceMappings(). Fixes: #12116 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Martynas Pumputis <m@lambda.lt>
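The exposure semantics above can be condensed into a small sketch. This is not Cilium's actual genServiceMappings() implementation, just an illustration of the rule it implements: no hostIP means "all node addresses plus loopback"; an explicit hostIP means "only that address".

```go
package main

import "fmt"

// hostPortFrontends returns the frontend addresses a hostPort should be
// exposed on, following the portmap-compatible semantics: with no hostIP,
// expose on loopback and all node addresses; with a hostIP, expose only on
// that address (even if it is not local to the node, in which case traffic
// simply never matches).
func hostPortFrontends(hostIP string, nodeAddrs []string) []string {
	if hostIP != "" {
		return []string{hostIP}
	}
	return append([]string{"127.0.0.1"}, nodeAddrs...)
}

func main() {
	// Like hostPort 666 above: no hostIP, exposed everywhere including loopback.
	fmt.Println(hostPortFrontends("", []string{"192.168.178.29", "10.0.0.1"}))
	// Like hostPort 777: pinned to one address, so 127.0.0.1:777 won't work.
	fmt.Println(hostPortFrontends("192.168.178.29", nil))
}
```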
[ upstream commit 1550a36 ] Final one to enforce drop of surrogate entries for external traffic. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 56e0e72 ] Extend the batch of hostPort tests to improve coverage. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 651199c ] Some keys can be reused in between GC function calls. To avoid them being GCed we should not mark them as stale keys and wait for the next GC call to be executed. Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit fe9ca34 ] Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 0892086 ] Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit ea57e36 ] To prevent bursts of security identity deletions in the KVStore, which could cause the Cilium agent to have high CPU usage due to policy calculation, this commit adds a rate limiter for such KVStore deletes. For example, if there are 1000 identities to be GCed, the operator will delete 250 every minute until all 1000 identities are GCed. Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 93e4845 ] Simplify the code which handles HostPort's frontend/backend generation to make it less complex. Suggested-by: Andre Martins <andre@cilium.io> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 3a311ab ] This commit fixes a bug introduced by 9975bba where we were accidentally writing into the object metadata of the local pod store, which should never happen. Doing so could cause k8s pod update events to hand the Update function handlers an old pod structure with those fields modified. Fixes: 9975bba ("pkg/endpoint: fetch pod and namespace labels from local stores") Reported-by: Deepesh Pathak <deepshpathak@gmail.com> Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 657295c ] Since we address NodePort in k8s2 using the DNS proxy port of k8s2 as the source port from k8s1, one round is enough regardless of the backend selection, as in both cases the replies are reverse NATted at k8s2 (where the port conflict was happening before it was fixed by #12248). Fixes: #12336 Signed-off-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit d88d5f9 ] The linuxNodeHandler.neighByNode map is protected by linuxNodeHandler.mutex in all other code paths. Protect access to it in (*linuxNodeHandler).NodeValidateImplementation as well. Fixes #12460 Fixes: 6c06c51 ("node: Remove permanent ARP entry when remote node is deleted") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 84dda5f ] We missed this step during the v1.8 release cycle. Clarify the required steps to hopefully reduce the likelihood we skip past it. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 5e9c427 ] 0a9685e introduced the reserved:unmanaged identity, but did not introduce the associated entity. This commit fixes it and allows the following in policies: - fromEntity: - unmanaged The goal is to allow policies to work even before all pods are restarted and managed by Cilium. In particular, this can be an issue with host policies since the host is more likely than pods to need to talk with unmanaged pods (e.g., on GKE right after deploying Cilium). Fixes: 0a9685e ("identity: Introduce reserved:unmanaged identity") Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 7e6a7bb ] Do not convert the loopback address into a surrogate, but instead, if explicitly selected, ignore it and dump a warning message to the agent log. Converting to surrogate is not comparable given this would also allow _local_ Pods to connect via the Node's IP. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 2c0da38 ] Document the expected behavior with regards to addressing. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 4d831cd ] Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit cc162e5 ] "Backend not initialized" does not mean much to users. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit a751039 ] The initial status message of the etcd subsystem is:

```
KVStore: Ok   No connection to etcd
```

This can be misleading as it does not indicate whether the etcd session was ever established or not. Clarify this:

```
KVStore: Ok   Waiting for initial connection to be established
```

Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 03b3c10 ] Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 0b0dd55 ] When releasing the etcd connection, an attempt is made to revoke sessions. If the etcd connection is unhealthy, the operation will fail and eventually time out, which can take a long time. Instead of blocking, release the resources in the background. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
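The non-blocking release described above follows a common Go pattern: run the revocation in a goroutine with its own timeout so an unhealthy endpoint cannot stall the caller. This is a minimal sketch with illustrative names, not Cilium's actual API.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// releaseInBackground revokes sessions in a goroutine with a bounded
// timeout, so the caller returns immediately even if etcd is unhealthy
// and the revoke would otherwise block until it fails.
func releaseInBackground(revoke func(ctx context.Context) error) {
	go func() {
		ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
		defer cancel()
		if err := revoke(ctx); err != nil {
			fmt.Println("session revoke failed:", err)
		}
	}()
}

// demoRelease shows that the caller is not blocked: it only waits on the
// channel to observe that the background revoke actually ran.
func demoRelease() string {
	done := make(chan struct{})
	releaseInBackground(func(ctx context.Context) error {
		close(done)
		return nil
	})
	<-done
	return "released"
}

func main() {
	fmt.Println(demoRelease())
}
```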
[ upstream commit 3e951e1 ] Good condition:

```
cluster2: ready, 4 nodes, 3 identities, 1 services, 0 failures (last: never)
```

Bad condition:

```
cluster2: not-ready, 0 nodes, 0 identities, 0 services, 1 failures (last: 9s ago)
```

Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 21783d5 ] Previously, regenerating an endpoint manually via the CLI resulted in compilation errors as seen below. These errors occurred because the endpoint headerfile (ep_config.h or lxc_config.h) was not written to the `<endpoint ID>_next` directory (referred to as "next" below), where the BPF compilation takes place. The reason the headerfile was not written to the "next" directory was that Cilium only wrote the headerfile if it had changed (via hash). However, a manual regeneration triggered through the API sets the regeneration level to "compile+load" (a full regeneration), where Cilium expects all relevant files, including the headerfile, to be present in the "next" directory (`<endpoint ID>_next`). Hence the compilation errors. This commit fixes the issue by checking whether a full regeneration has been requested, in which case the headerfile is written to the "next" directory.

```
root@k8s2:/var/run/cilium/state# clang -emit-llvm -O2 -target bpf -std=gnu89 -nostdinc -D__NR_CPUS__=2 -Wall -Wextra -Werror -Wshadow -Wno-address-of-packed-member -Wno-unknown-warning-option -Wno-gnu-variable-sized-type-not-at-end -Wdeclaration-after-statement -I/var/run/cilium/state/globals -I1285_next -I/var/lib/cilium/bpf -I/var/lib/cilium/bpf/include -c /var/lib/cilium/bpf/bpf_lxc.c -o /tmp/c
In file included from /var/lib/cilium/bpf/bpf_lxc.c:22:
/var/lib/cilium/bpf/lib/icmp6.h:50:29: error: use of undeclared identifier 'NODE_MAC'
        union macaddr smac, dmac = NODE_MAC;
                                   ^
/var/lib/cilium/bpf/lib/icmp6.h:359:30: error: use of undeclared identifier 'NODE_MAC'
        union macaddr router_mac = NODE_MAC;
                                   ^
In file included from /var/lib/cilium/bpf/bpf_lxc.c:26:
/var/lib/cilium/bpf/lib/lxc.h:67:29: error: use of undeclared identifier 'NODE_MAC'
        union macaddr router_mac = NODE_MAC;
                                   ^
/var/lib/cilium/bpf/bpf_lxc.c:159:10: error: implicit declaration of function 'lookup_ip6_remote_endpoint' [-Werror,-Wimplicit-function-declaration]
        info = lookup_ip6_remote_endpoint(&orig_dip);
               ^
/var/lib/cilium/bpf/bpf_lxc.c:159:10: note: did you mean 'lookup_ip6_endpoint'?
/var/lib/cilium/bpf/lib/eps.h:13:1: note: 'lookup_ip6_endpoint' declared here
lookup_ip6_endpoint(struct ipv6hdr *ip6)
^
/var/lib/cilium/bpf/bpf_lxc.c:159:8: error: incompatible integer to pointer conversion assigning to 'struct remote_endpoint_info *' from 'int' [-Werror,-Wint-conversion]
        info = lookup_ip6_remote_endpoint(&orig_dip);
             ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/var/lib/cilium/bpf/bpf_lxc.c:529:10: error: implicit declaration of function 'lookup_ip4_remote_endpoint' [-Werror,-Wimplicit-function-declaration]
        info = lookup_ip4_remote_endpoint(orig_dip);
               ^
/var/lib/cilium/bpf/bpf_lxc.c:529:10: note: did you mean 'lookup_ip4_endpoint'?
/var/lib/cilium/bpf/lib/eps.h:35:1: note: 'lookup_ip4_endpoint' declared here
lookup_ip4_endpoint(const struct iphdr *ip4)
^
/var/lib/cilium/bpf/bpf_lxc.c:529:8: error: incompatible integer to pointer conversion assigning to 'struct remote_endpoint_info *' from 'int' [-Werror,-Wint-conversion]
        info = lookup_ip4_remote_endpoint(orig_dip);
             ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/var/lib/cilium/bpf/bpf_lxc.c:1027:10: error: implicit declaration of function 'lookup_ip6_remote_endpoint' [-Werror,-Wimplicit-function-declaration]
        info = lookup_ip6_remote_endpoint(src);
               ^
/var/lib/cilium/bpf/bpf_lxc.c:1027:8: error: incompatible integer to pointer conversion assigning to 'struct remote_endpoint_info *' from 'int' [-Werror,-Wint-conversion]
        info = lookup_ip6_remote_endpoint(src);
             ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/var/lib/cilium/bpf/bpf_lxc.c:1256:10: error: implicit declaration of function 'lookup_ip4_remote_endpoint' [-Werror,-Wimplicit-function-declaration]
        info = lookup_ip4_remote_endpoint(ip4->saddr);
               ^
/var/lib/cilium/bpf/bpf_lxc.c:1256:8: error: incompatible integer to pointer conversion assigning to 'struct remote_endpoint_info *' from 'int' [-Werror,-Wint-conversion]
        info = lookup_ip4_remote_endpoint(ip4->saddr);
             ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
11 errors generated.
```

Fixes #10630 Fixes #12005 Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Martynas Pumputis <m@lambda.lt>
Hit also #12555 in the k8s-1.13 run. https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-K8s/3312/

test-missed-k8s

I believe once we pick up the backport of #12559, the remaining k8s CI test should begin passing again.

(all green except
[ upstream commit 037bef5 ] Previously, we did not take into account 'from-network' sources in the monitor aggregation logic check in `send_trace_notify()`, which was fine because we rarely ever sent such events (limited to ipsec for instance). However, since commit c470e28 we also use this in bpf_host which suddenly means that any and all traffic from the network will trigger monitor events, flooding the monitor output. Fixes: 7a4b0be ("bpf: Add MonitorAggregation option") Fixes: c470e28 ("Adds TRACE_TO_NETWORK obs label and trace pkts in to-netdev prog.") Fixes: #12555 Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt>
test-backport-1.8

@joestringer I've included #12559 into this backport.
qmonnet left a comment:
Looks all good to me, let's wait for the tests and we're good to go!
retest-missed-k8s

Pulling images has failed. E.g.

test-missed-k8s

test-missed-k8s

test-missed-k8s (Damn, the GH UI didn't show that it had already been merged.)