https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-Validated/18496/testReport/Suite-k8s-1/11/K8sDatapathConfig_MonitorAggregation_Checks_that_monitor_aggregation_restricts_notifications/
The Cilium logs contain the following messages for the coredns endpoint that appear related to the failure:
2020-04-03T14:27:49.939971063Z level=warning msg="JoinEP: Failed to load program" containerID=83dd70756c datapathPolicyRevision=0 desiredPolicyRevision=1 endpointID=957 error="Failed to replace Qdisc for lxceb5f17118667: Link not found" file-path=957_next/bpf_lxc.o identity=3583 ipv4=10.10.1.222 ipv6="f00d::a0c:0:0:5ad" k8sPodName=kube-system/coredns-687db6485c-65fst subsys=datapath-loader veth=lxceb5f17118667
2020-04-03T14:27:49.940044547Z level=error msg="Error while rewriting endpoint BPF program" containerID=83dd70756c datapathPolicyRevision=0 desiredPolicyRevision=1 endpointID=957 error="Failed to replace Qdisc for lxceb5f17118667: Link not found" identity=3583 ipv4=10.10.1.222 ipv6="f00d::a0c:0:0:5ad" k8sPodName=kube-system/coredns-687db6485c-65fst subsys=endpoint
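The "Failed to replace Qdisc ...: Link not found" error comes from the datapath loader trying to attach the endpoint's BPF program to a veth that no longer exists, which lines up with the DNS pods being restarted mid-test (see the steps below; coredns-687db6485c-65fst is also absent from the final pod listing). A minimal sketch of the failing operation, using github.com/vishvananda/netlink for illustration rather than Cilium's actual loader code:

```go
package main

import (
	"fmt"

	"github.com/vishvananda/netlink"
)

// replaceQdisc mirrors the step that fails in the logs above: look up the
// endpoint's veth by name and replace its clsact qdisc so BPF programs can
// be attached. If the veth has already been deleted, LinkByName fails with
// "Link not found" and endpoint regeneration aborts.
func replaceQdisc(ifName string) error {
	link, err := netlink.LinkByName(ifName)
	if err != nil {
		// This is the path taken above: the veth vanished between endpoint
		// creation and BPF program attachment.
		return fmt.Errorf("Failed to replace Qdisc for %s: %w", ifName, err)
	}
	qdisc := &netlink.GenericQdisc{
		QdiscAttrs: netlink.QdiscAttrs{
			LinkIndex: link.Attrs().Index,
			Parent:    netlink.HANDLE_CLSACT,
			Handle:    netlink.MakeHandle(0xffff, 0),
		},
		QdiscType: "clsact",
	}
	return netlink.QdiscReplace(qdisc)
}

func main() {
	// The veth name from the failing endpoint (957) in the logs above.
	if err := replaceQdisc("lxceb5f17118667"); err != nil {
		fmt.Println(err)
	}
}
```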
Stacktrace
/home/jenkins/workspace/Cilium-PR-Ginkgo-Tests-Validated/k8s-1.11-gopath/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:430
Failure while waiting for all cilium endpoints to reach ready state
Expected
<*helpers.SSHMetaError | 0xc0001552e0>: Error: context deadline exceeded
Extended info: Cilium Pod: cilium-dctmj Endpoint: 381 Identity: 3583 State: ready
Cilium Pod: cilium-dctmj Endpoint: 957 Identity: 3583 State: not-ready
Cilium Pod: cilium-dctmj Endpoint: 1035 Identity: 38794 State: ready
Cilium Pod: cilium-dctmj Endpoint: 2113 Identity: 44755 State: ready
Cilium Pod: cilium-dctmj Endpoint: 2236 Identity: 9564 State: ready
Cilium Pod: cilium-dctmj Endpoint: 2865 Identity: 4 State: ready
Cilium Pod: cilium-jvkg9 Endpoint: 187 Identity: 38794 State: ready
Cilium Pod: cilium-jvkg9 Endpoint: 331 Identity: 44755 State: ready
Cilium Pod: cilium-jvkg9 Endpoint: 745 Identity: 4 State: ready
to be nil
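The SSHMetaError above is the test harness's deadline-bounded polling loop giving up while endpoint 957 sits in not-ready. Roughly, the pattern looks like this sketch; the listStates helper is hypothetical, standing in for the harness's wrapper around `cilium endpoint list`, and the timeouts are shortened for the demo:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitForEndpointsReady polls until every endpoint reports "ready" or the
// context deadline expires, at which point "context deadline exceeded"
// surfaces in the failure message above.
func waitForEndpointsReady(ctx context.Context, listStates func() map[int]string) error {
	ticker := time.NewTicker(2 * time.Second)
	defer ticker.Stop()
	for {
		allReady := true
		for id, state := range listStates() {
			if state != "ready" {
				fmt.Printf("endpoint %d is %s\n", id, state)
				allReady = false
			}
		}
		if allReady {
			return nil
		}
		select {
		case <-ctx.Done():
			// Endpoint 957 never left not-ready, so this branch fired.
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

func main() {
	// The real suite uses a much longer deadline; shortened here.
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	stuck := func() map[int]string {
		return map[int]string{381: "ready", 957: "not-ready"}
	}
	fmt.Println(waitForEndpointsReady(ctx, stuck))
}
```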
Standard output
⚠️ Found a "JoinEP: " in logs
⚠️ Found a "JoinEP: " in logs
Number of "context deadline exceeded" in logs: 0
Number of "level=error" in logs: 4
⚠️ Number of "level=warning" in logs: 24
Number of "Cilium API handler panicked" in logs: 0
Number of "Goroutine took lock for more than" in logs: 0
Top 5 errors/warnings:
Unable to release endpoint ID
Unable to restore endpoint, ignoring
Error while rewriting endpoint BPF program
endpoint regeneration failed
Disabling NodePort's \
Cilium pods: [cilium-dctmj cilium-jvkg9]
Netpols loaded:
CiliumNetworkPolicies loaded:
Endpoint Policy Enforcement:
Pod Ingress Egress
test-k8s2-848b6f7864-8vb4j
testds-rw59d
app2-dc85b4585-fmdq8
app1-798d4f944d-9fjxd
app3-68fb594d47-lfcg8
coredns-687db6485c-579m5
coredns-687db6485c-65fst
testclient-v7cm2
testclient-vz9jk
testds-6plh5
Cilium agent 'cilium-dctmj': Status: Ok Health: Ok Nodes "" ContinerRuntime: Kubernetes: Ok KVstore: Ok Controllers: Total 31 Failed 0
Cilium agent 'cilium-jvkg9': Status: Ok Health: Ok Nodes "" ContinerRuntime: Kubernetes: Ok KVstore: Ok Controllers: Total 21 Failed 0
Standard error
STEP: Installing Cilium
STEP: Installing DNS Deployment
STEP: Restarting DNS Pods
STEP: Performing Cilium preflight check
STEP: Performing Cilium status preflight check
STEP: Performing Cilium controllers preflight check
STEP: Performing Cilium health check
Cilium is not ready yet: connectivity health is failing: Cluster connectivity is unhealthy on 'cilium-jvkg9': Exitcode: 255
Stdout:
Stderr:
Error: Cannot get status/probe: Put "http://%2Fvar%2Frun%2Fcilium%2Fhealth.sock/v1beta/status/probe": dial unix /var/run/cilium/health.sock: connect: no such file or directory
command terminated with exit code 255
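The preflight health check talks to cilium-health over a unix socket, so the probe fails with "no such file or directory" whenever /var/run/cilium/health.sock has not yet been (re)created after an agent restart. A minimal sketch of that probe, assuming only the socket path and the /v1beta/status/probe endpoint shown in the error above:

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
)

func main() {
	const sock = "/var/run/cilium/health.sock"
	client := &http.Client{
		Transport: &http.Transport{
			// Route every request to the unix socket instead of TCP.
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				return (&net.Dialer{}).DialContext(ctx, "unix", sock)
			},
		},
	}
	// The host in the URL is a placeholder; the transport above ignores it.
	req, err := http.NewRequest(http.MethodPut, "http://unix/v1beta/status/probe", nil)
	if err != nil {
		fmt.Println(err)
		return
	}
	resp, err := client.Do(req)
	if err != nil {
		// With the socket missing, this prints the same failure as the logs:
		// dial unix /var/run/cilium/health.sock: connect: no such file or directory
		fmt.Println(err)
		return
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}
```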
STEP: Performing Cilium status preflight check
STEP: Performing Cilium controllers preflight check
STEP: Performing Cilium health check
STEP: Performing Cilium status preflight check
STEP: Performing Cilium controllers preflight check
STEP: Performing Cilium health check
STEP: Performing Cilium status preflight check
STEP: Performing Cilium controllers preflight check
STEP: Performing Cilium health check
STEP: Performing Cilium service preflight check
STEP: Performing K8s service preflight check
STEP: Waiting for cilium-operator to be ready
STEP: Waiting for kube-dns to be ready
STEP: Running kube-dns preflight check
STEP: Performing K8s service preflight check
STEP: Making sure all endpoints are in ready state
=== Test Finished at 2020-04-03T14:32:21Z====
===================== TEST FAILED =====================
cmd: kubectl get pods -o wide --all-namespaces
Exitcode: 0
Stdout:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
default test-k8s2-848b6f7864-8vb4j 2/2 Running 0 5m 10.10.1.214 k8s2 <none>
default testclient-v7cm2 1/1 Running 0 5m 10.10.0.201 k8s1 <none>
default testclient-vz9jk 1/1 Running 0 5m 10.10.1.167 k8s2 <none>
default testds-6plh5 2/2 Running 0 5m 10.10.1.239 k8s2 <none>
default testds-rw59d 2/2 Running 0 5m 10.10.0.165 k8s1 <none>
external-ips-test app1-798d4f944d-9fjxd 2/2 Running 0 55m 10.10.0.238 k8s1 <none>
external-ips-test app2-dc85b4585-fmdq8 1/2 Running 0 55m 10.10.0.18 k8s1 <none>
external-ips-test app3-68fb594d47-lfcg8 1/2 Running 0 55m 10.10.1.158 k8s2 <none>
external-ips-test host-client-hw7q8 1/1 Running 0 55m 192.168.36.12 k8s2 <none>
external-ips-test host-client-k8z6r 1/1 Running 0 55m 192.168.36.11 k8s1 <none>
external-ips-test host-server-1-56c9467d4b-fwlvr 2/2 Running 0 55m 192.168.36.11 k8s1 <none>
external-ips-test host-server-2-b8d89c58c-jrtmg 2/2 Running 0 55m 192.168.36.11 k8s1 <none>
kube-system cilium-dctmj 1/1 Running 1 5m 192.168.36.12 k8s2 <none>
kube-system cilium-jvkg9 1/1 Running 0 5m 192.168.36.11 k8s1 <none>
kube-system cilium-operator-8677f5c767-tpcwk 1/1 Running 0 5m 192.168.36.12 k8s2 <none>
kube-system coredns-687db6485c-579m5 1/1 Running 0 5m 10.10.1.98 k8s2 <none>
kube-system etcd-k8s1 1/1 Running 0 1h 192.168.36.11 k8s1 <none>
kube-system kube-apiserver-k8s1 1/1 Running 0 1h 192.168.36.11 k8s1 <none>
kube-system kube-controller-manager-k8s1 1/1 Running 0 1h 192.168.36.11 k8s1 <none>
kube-system kube-scheduler-k8s1 1/1 Running 0 1h 192.168.36.11 k8s1 <none>
kube-system log-gatherer-97wzg 1/1 Running 0 1h 192.168.36.11 k8s1 <none>
kube-system log-gatherer-kml6f 1/1 Running 0 1h 192.168.36.12 k8s2 <none>
kube-system log-gatherer-nl9k9 1/1 Running 0 1h 192.168.36.13 k8s3 <none>
kube-system registry-adder-7f6nb 1/1 Running 0 1h 192.168.36.13 k8s3 <none>
kube-system registry-adder-cbrmj 1/1 Running 0 1h 192.168.36.12 k8s2 <none>
kube-system registry-adder-sns97 1/1 Running 0 1h 192.168.36.11 k8s1 <none>
Stderr:
Fetching command output from pods [cilium-dctmj cilium-jvkg9]
cmd: kubectl exec -n kube-system cilium-dctmj -- cilium bpf tunnel list
Exitcode: 0
Stdout:
TUNNEL VALUE
f00d::a0b:0:0:0:0 192.168.36.11:0
10.10.0.0:0 192.168.36.11:0
Stderr:
cmd: kubectl exec -n kube-system cilium-dctmj -- cilium endpoint list
Exitcode: 0
Stdout:
ENDPOINT POLICY (ingress) POLICY (egress) IDENTITY LABELS (source:key[=value]) IPv6 IPv4 STATUS
ENFORCEMENT ENFORCEMENT
381 Disabled Disabled 3583 k8s:io.cilium.k8s.policy.cluster=default f00d::a0c:0:0:b5d2 10.10.1.98 ready
k8s:io.cilium.k8s.policy.serviceaccount=coredns
k8s:io.kubernetes.pod.namespace=kube-system
k8s:k8s-app=kube-dns
957 Disabled Disabled 3583 k8s:io.cilium.k8s.policy.cluster=default f00d::a0c:0:0:5ad 10.10.1.222 not-ready
k8s:io.cilium.k8s.policy.serviceaccount=coredns
k8s:io.kubernetes.pod.namespace=kube-system
k8s:k8s-app=kube-dns
1035 Disabled Disabled 38794 k8s:io.cilium.k8s.policy.cluster=default f00d::a0c:0:0:c76b 10.10.1.167 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:zgroup=testDSClient
2113 Disabled Disabled 44755 k8s:io.cilium.k8s.policy.cluster=default f00d::a0c:0:0:bfd7 10.10.1.239 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:zgroup=testDS
2236 Disabled Disabled 9564 k8s:io.cilium.k8s.policy.cluster=default f00d::a0c:0:0:5486 10.10.1.214 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:zgroup=test-k8s2
2865 Disabled Disabled 4 reserved:health f00d::a0c:0:0:cdb6 10.10.1.9 ready
Stderr:
cmd: kubectl exec -n kube-system cilium-jvkg9 -- cilium bpf tunnel list
Exitcode: 0
Stdout:
TUNNEL VALUE
10.10.1.0:0 192.168.36.12:0
f00d::a0c:0:0:0:0 192.168.36.12:0
Stderr:
cmd: kubectl exec -n kube-system cilium-jvkg9 -- cilium endpoint list
Exitcode: 0
Stdout:
ENDPOINT POLICY (ingress) POLICY (egress) IDENTITY LABELS (source:key[=value]) IPv6 IPv4 STATUS
ENFORCEMENT ENFORCEMENT
187 Disabled Disabled 38794 k8s:io.cilium.k8s.policy.cluster=default f00d::a0b:0:0:bb7 10.10.0.201 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:zgroup=testDSClient
331 Disabled Disabled 44755 k8s:io.cilium.k8s.policy.cluster=default f00d::a0b:0:0:9a7a 10.10.0.165 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:zgroup=testDS
745 Disabled Disabled 4 reserved:health f00d::a0b:0:0:9531 10.10.0.61 ready
Stderr:
f9f3b3b1_K8sDatapathConfig_MonitorAggregation_Checks_that_monitor_aggregation_restricts_notifications.zip