Expected Behavior
The Calico CNI plugin tears down Pods in a timely manner.
Current Behavior
The Calico CNI plugin reports errors when terminating Pods, so eviction takes too long. This is especially relevant in Kubernetes conformance testing.
Aug 18 18:19:04.521: INFO: At 2021-08-18 18:18:01 +0000 UTC - event for taint-eviction-a1: {kubelet ip-10-0-8-52} FailedKillPod: error killing pod: failed to "KillPodSandbox" for "0701ef9b-e17d-43b5-a48f-89fa3ac00999" with KillPodSandboxError: "rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod \"taint-eviction-a1_taint-multiple-pods-4011\" network: error getting ClusterInformation: connection is unauthorized: Unauthorized"
The natural thing to check is the RBAC permissions, which match the recommendations:
- apiGroups:
  - crd.projectcalico.org
  resources:
  - globalfelixconfigs
  - felixconfigurations
  - bgppeers
  - globalbgpconfigs
  - bgpconfigurations
  - ippools
  - ipamblocks
  - globalnetworkpolicies
  - globalnetworksets
  - networkpolicies
  - networksets
  - clusterinformations
  - hostendpoints
  - blockaffinities
  verbs:
  - get
  - list
  - watch
...
To be certain, we can use the actual kubeconfig that Calico writes to the host's /etc/cni/net.d. It does appear to have permission to get clusterinformations, so the Unauthorized error above is unexpected.
./kubectl --kubeconfig /etc/cni/net.d/calico-kubeconfig auth can-i get clusterinformations --all-namespaces
yes
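One further check that may help narrow this down: in Kubernetes, "Unauthorized" usually corresponds to an HTTP 401 (the credential itself was rejected), whereas a missing RBAC rule normally surfaces as 403 "Forbidden". The sketch below is only a diagnostic aid, not a confirmed root cause. It assumes the kubeconfig in /etc/cni/net.d embeds a service account token in JWT form under a `token:` field, and that the token carries an `exp` claim (legacy secret-based tokens may not). The function names `jwt_claims` and `seconds_until_expiry` are hypothetical helpers, and the signature is deliberately not verified.

```python
import base64
import json
import time
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated segments")
    payload = parts[1]
    # base64url segments omit '=' padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_until_expiry(token: str, now: Optional[float] = None) -> Optional[float]:
    """Seconds until the `exp` claim; negative if already expired.

    Returns None when the token has no `exp` claim.
    """
    exp = jwt_claims(token).get("exp")
    if exp is None:
        return None
    current = time.time() if now is None else now
    return exp - current
```

To use it, copy the value of the `token:` field from the kubeconfig and pass it to `seconds_until_expiry`. A negative result would suggest the credential had expired when the teardown ran; a positive result or `None` would point elsewhere, for example at a token that was revoked on the server side.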
Steps to Reproduce (for bugs)
sonobuoy run --e2e-focus="NoExecuteTaintManager Multiple Pods" --e2e-skip="" \
--plugin-env=e2e.E2E_EXTRA_ARGS="--non-blocking-taints=node-role.kubernetes.io/controller"
Context
This issue affects Kubernetes Conformance tests:
Summarizing 1 Failure:
[Fail] [sig-node] NoExecuteTaintManager Multiple Pods [Serial] [It] evicts pods with minTolerationSeconds [Disruptive] [Conformance]
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/onsi/ginkgo/internal/leafnodes/runner.go:113
The test in question creates two Pods that don't tolerate a taint and expects them to be terminated within certain time limits. In the Kubelet logs, the Calico CNI plugin reports the errors shown above, and termination takes too long.
Your Environment