CI: K8sUpdates Tests upgrade and downgrade from a Cilium stable image to master: Node Lost #10930

@joestringer

Description

K8sUpdates Tests upgrade and downgrade from a Cilium stable image to master CI failure

We seem to have lost a node, which causes the Cilium state cleanup to fail. Infrastructure issue?
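If the node really dropped out, a couple of standard kubectl checks against the cluster should confirm it before blaming the test itself (hypothetical triage commands, not output from this CI run):

  kubectl get nodes -o wide     # k8s2 should show NotReady if its kubelet stopped heartbeating
  kubectl describe node k8s2    # Conditions section shows the last heartbeat time and the NotReady reason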

Seen in PR #10910; it doesn't seem related to that PR.
https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-Validated/18664/
--->
https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-Validated/18664/execution/node/198/log/?consoleFull

test_results_Cilium-PR-Ginkgo-Tests-Validated_18664_BDD-Test-PR-K8s-1.11-net-next-kubeproxy-free.zip

19:23:32  K8sUpdates 
19:23:32    Tests upgrade and downgrade from a Cilium stable image to master
19:23:32    /home/jenkins/workspace/Cilium-PR-Ginkgo-Tests-Validated/k8s-1.11-gopath/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:430
19:23:49  STEP: Deleting Cilium, CoreDNS, and etcd-operator...
19:23:55  STEP: Waiting for pods to be terminated..
19:23:55  STEP: Cleaning Cilium state
19:33:59  === Test Finished at 2020-04-10T02:33:56Z====
19:34:29  ===================== TEST FAILED =====================
19:34:59  cmd: kubectl get pods -o wide --all-namespaces
19:34:59  Exitcode: 0 
19:34:59  Stdout:
19:34:59   	 NAMESPACE           NAME                             READY   STATUS     RESTARTS   AGE   IP              NODE     NOMINATED NODE
19:34:59  	 external-ips-test   app1-798d4f944d-6wtqc            2/2     Running    0          46m   10.10.0.102     k8s1     <none>
19:34:59  	 external-ips-test   app2-dc85b4585-c9c4g             1/2     Running    0          46m   10.10.0.119     k8s1     <none>
19:34:59  	 external-ips-test   app3-68fb594d47-l97wc            0/2     Pending    0          6m    <none>          <none>   <none>
19:34:59  	 external-ips-test   host-client-nhj4x                1/1     NodeLost   0          46m   192.168.36.12   k8s2     <none>
19:34:59  	 external-ips-test   host-client-q5n4p                1/1     Running    0          46m   192.168.36.11   k8s1     <none>
19:34:59  	 external-ips-test   host-server-1-56c9467d4b-hcpsj   2/2     Running    0          46m   192.168.36.11   k8s1     <none>
19:34:59  	 external-ips-test   host-server-2-b8d89c58c-ntr5q    2/2     Running    0          46m   192.168.36.11   k8s1     <none>
19:34:59  	 kube-system         cilium-hvvs6                     1/1     Running    0          11m   192.168.36.11   k8s1     <none>
19:34:59  	 kube-system         cilium-pw685                     0/1     Init:0/1   0          11m   192.168.36.12   k8s2     <none>
19:34:59  	 kube-system         coredns-687db6485c-5f272         1/1     Running    0          6m    10.10.0.146     k8s1     <none>
19:34:59  	 kube-system         coredns-687db6485c-h98qq         0/1     Unknown    0          11m   <none>          k8s2     <none>
19:34:59  	 kube-system         etcd-k8s1                        1/1     Running    0          1h    192.168.36.11   k8s1     <none>
19:34:59  	 kube-system         kube-apiserver-k8s1              1/1     Running    0          1h    192.168.36.11   k8s1     <none>
19:34:59  	 kube-system         kube-controller-manager-k8s1     1/1     Running    0          1h    192.168.36.11   k8s1     <none>
19:34:59  	 kube-system         kube-scheduler-k8s1              1/1     Running    0          1h    192.168.36.11   k8s1     <none>
19:34:59  	 kube-system         log-gatherer-cbhxj               1/1     NodeLost   0          1h    192.168.36.12   k8s2     <none>
19:34:59  	 kube-system         log-gatherer-txst6               1/1     Running    0          1h    192.168.36.13   k8s3     <none>
19:34:59  	 kube-system         log-gatherer-vjdqj               1/1     Running    0          1h    192.168.36.11   k8s1     <none>
19:34:59  	 kube-system         registry-adder-4qf7h             1/1     Running    0          1h    192.168.36.11   k8s1     <none>
19:34:59  	 kube-system         registry-adder-gd5tg             1/1     NodeLost   0          1h    192.168.36.12   k8s2     <none>
19:34:59  	 kube-system         registry-adder-zbmv8             1/1     Running    0          1h    192.168.36.13   k8s3     <none>
19:34:59  	 
19:34:59  Stderr:
19:34:59   	 
19:34:59  
19:34:59  Fetching command output from pods [cilium-hvvs6 cilium-pw685]
20:11:09  cmd: kubectl exec -n kube-system cilium-hvvs6 -- cilium endpoint list
20:11:09  Exitcode: 0 
20:11:09  Stdout:
20:11:09   	 ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])                       IPv6                 IPv4          STATUS   
20:11:09  	            ENFORCEMENT        ENFORCEMENT                                                                                                       
20:11:09  	 319        Disabled           Disabled          4          reserved:health                                   f00d::a0b:0:0:6f3c   10.10.0.196   ready   
20:11:09  	 2957       Disabled           Disabled          57680      k8s:io.cilium.k8s.policy.cluster=default          f00d::a0b:0:0:5568   10.10.0.146   ready   
20:11:09  	                                                            k8s:io.cilium.k8s.policy.serviceaccount=coredns                                              
20:11:09  	                                                            k8s:io.kubernetes.pod.namespace=kube-system                                                  
20:11:09  	                                                            k8s:k8s-app=kube-dns                                                                         
20:11:09  	 
20:11:09  Stderr:
20:11:09   	 
20:11:09  
20:11:09  cmd: kubectl exec -n kube-system cilium-pw685 -- cilium endpoint list
20:11:09  Exitcode: 1 
20:11:09  Stdout:
20:11:09   	 
20:11:09  Stderr:
20:11:09   	 Error from server: error dialing backend: dial tcp 192.168.36.12:10250: i/o timeout
20:11:09  	 
20:11:09  
20:11:09  ===================== Exiting AfterFailed =====================
20:29:09  Sending interrupt signal to process
20:29:29  After 20s process did not stop
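The failed kubectl exec against cilium-pw685 points the same way: exec requests are proxied by the apiserver to the kubelet on port 10250, so "dial tcp 192.168.36.12:10250: i/o timeout" means the apiserver could not reach the k8s2 kubelet at all, which matches the NodeLost/Unknown pod statuses above. A quick reachability probe from k8s1 or the Jenkins runner would distinguish a dead VM from a wedged kubelet (hypothetical check, not part of the test output):

  ping -c 3 192.168.36.12            # is the k8s2 VM up at all?
  nc -vz -w 5 192.168.36.12 10250    # is the kubelet port reachable?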

Metadata

    Labels

    area/CI: Continuous Integration testing issue or flake
    kind/bug/CI: This is a bug in the testing code.
    needs/triage: This issue requires triaging to establish severity and next steps.
    stale: The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale.
