Is there an existing issue for this?
What happened?
- Cilium initially runs as expected.
- At some point (the trigger is unknown), pods scheduled to one Kubernetes node are never created because CNI setup fails. The other nodes keep working as expected, and pods already running on the affected node continue to work.
- The node does not recover on its own.
- Deleting the Cilium DaemonSet pod on this node triggers re-initialization, which fixes the problem.
Cilium Version
v1.8.1
Kernel Version
4.19.91-24.8.an8.x86_64
Kubernetes Version
v1.19.7
Sysdump
No response
Relevant log output
E1221 10:24:15.458364 9396 pod_workers.go:191] Error syncing pod 166301b4-3195-40a4-a16e-6972e467ef68 ("fas-app-stg3-z-5bb78f5fb8-4nsdw_finloan-test-stg3(166301b4-3195-40a4-a16e-6972e467ef68)"), skipping: failed to "CreatePodSandbox" for "fas-app-stg3-z-5bb78f5fb8-4nsdw_finloan-test-stg3(166301b4-3195-40a4-a16e-6972e467ef68)" with CreatePodSandboxError: "CreatePodSandbox for pod \"fas-app-stg3-z-5bb78f5fb8-4nsdw_finloan-test-stg3(166301b4-3195-40a4-a16e-6972e467ef68)\" failed: rpc error: code = Unknown desc = [failed to set up sandbox container \"06391fb699c3c116c9dc5988bef364f711fce5390bee3302b687b67cc0a47309\" network for pod \"fas-app-stg3-z-5bb78f5fb8-4nsdw\": networkPlugin cni failed to set up pod \"fas-app-stg3-z-5bb78f5fb8-4nsdw_finloan-test-stg3\" network: Unable to create endpoint: Put \"http:///var/run/cilium/cilium.sock/v1/endpoint/cilium-local:0\": context deadline exceeded, failed to stop sandbox container \"06391fb699c3c116c9dc5988bef364f711fce5390bee3302b687b67cc0a47309\" for pod \"fas-app-stg3-z-5bb78f5fb8-4nsdw\": Error response from daemon: No such container: 06391fb699c3c116c9dc5988bef364f711fce5390bee3302b687b67cc0a47309]"
level=error msg="Error while rewriting endpoint BPF program" containerID= datapathPolicyRevision=0 desiredPolicyRevision=129 endpointID=3946 error="BPF template compilation failed: serializer context deadline exceeded" identity=8890 ipv4= ipv6= k8sPodName=/ subsys=endpoint
level=warning msg="generating BPF for endpoint failed, keeping stale directory." containerID= datapathPolicyRevision=0 desiredPolicyRevision=129 endpointID=3946 file-path=3946_next_fail identity=8890 ipv4= ipv6= k8sPodName=/ subsys=endpoint
level=warning msg="Regeneration of endpoint failed" bpfCompilation=0s bpfLoadProg=0s bpfWaitForELF="50.608µs" bpfWriteELF=0s buildDuration=13m4.152869852s containerID= datapathPolicyRevision=0 desiredPolicyRevision=129 endpointID=3946 error="BPF template compilation failed: serializer context deadline exceeded" identity=8890 ipv4= ipv6= k8sPodName=/ mapSync="68.12µs" policyCalculation="55.975µs" prepareBuild="406.159µs" proxyConfiguration="13.729µs" proxyPolicyCalculation="18.176µs" proxyWaitForAck=0s reason="updated security labels" subsys=endpoint waitingForCTClean=13m4.150046915s waitingForLock="15.991µs"
level=error msg="endpoint regeneration failed" containerID= datapathPolicyRevision=0 desiredPolicyRevision=129 endpointID=3946 error="BPF template compilation failed: serializer context deadline exceeded" identity=8890 ipv4= ipv6= k8sPodName=/ subsys=endpoint
level=warning msg="Error getting CEP" containerID= controller="sync-to-k8s-ciliumendpoint (3946)" datapathPolicyRevision=0 desiredPolicyRevision=129 endpointID=3946 error="Get \"https://192.168.0.1:443/apis/cilium.io/v2/namespaces/finloan-test-stg3/ciliumendpoints/fas-app-stg3-z-5bb78f5fb8-4nsdw\": context canceled" identity=8890 ipv4= ipv6= k8sPodName=/ subsys=endpointsynchronizer
level=info msg="Releasing key" key="[k8s:app=fas-app-stg3 k8s:io.cilium.k8s.namespace.labels.field.cattle.io/projectId=p-nqc2q k8s:io.cilium.k8s.policy.cluster=default k8s:io.cilium.k8s.policy.serviceaccount=default k8s:io.kubernetes.pod.namespace=finloan-test-stg3 k8s:nonce=z k8s:stark-app=fas k8s:stark-ns=finloan-test-stg3 k8s:stark-res=Deployment]" subsys=allocator
level=warning msg="Released IP after expiration" ip=10.225.150.4 subsys=ipam uuid=47d82cfb-80d5-11ed-abd6-083a88c5eb90
level=info msg="Removed endpoint" containerID= datapathPolicyRevision=0 desiredPolicyRevision=129 endpointID=251 identity=11767 ipv4= ipv6= k8sPodName=/ subsys=endpoint
level=warning msg="Creation of endpoint failed" containerID= datapathPolicyRevision=0 desiredPolicyRevision=129 endpointID=251 error="timeout while waiting for initial endpoint generation to complete" identity=11767 ipv4= ipv6= k8sPodName=/ subsys=daemon
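One way to read the "Regeneration of endpoint failed" line above is to compare its per-phase timers: `waitingForCTClean=13m4.150046915s` accounts for almost all of `buildDuration=13m4.152869852s`. The following is a minimal Python sketch for pulling that out of such a line, not part of Cilium. The helper names (`parse_logfmt`, `go_duration_seconds`, `slowest_phase`) are hypothetical, the sample line is abridged from the log, and the assumption is that the fields are logfmt-style `key=value` pairs with Go-formatted durations. It only shows where the time went; it does not establish a root cause.

```python
import re
import shlex

# Abridged copy of the "Regeneration of endpoint failed" line from the log above.
LINE = (
    'level=warning msg="Regeneration of endpoint failed" bpfCompilation=0s '
    'bpfLoadProg=0s bpfWaitForELF="50.608µs" bpfWriteELF=0s '
    'buildDuration=13m4.152869852s endpointID=3946 mapSync="68.12µs" '
    'policyCalculation="55.975µs" prepareBuild="406.159µs" '
    'proxyWaitForAck=0s reason="updated security labels" subsys=endpoint '
    'waitingForCTClean=13m4.150046915s waitingForLock="15.991µs"'
)

# Seconds per Go duration unit; both common code points for "µ" are accepted.
UNIT_SECONDS = {
    "ns": 1e-9, "us": 1e-6, "\u00b5s": 1e-6, "\u03bcs": 1e-6,
    "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0,
}
DURATION_PART = re.compile("(\\d+(?:\\.\\d+)?)(ns|us|\u00b5s|\u03bcs|ms|s|m|h)")


def parse_logfmt(line):
    """Split a key=value log line into a dict; shlex handles quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def go_duration_seconds(text):
    """Convert a Go-style duration such as '13m4.15s' to seconds, else None."""
    parts = DURATION_PART.findall(text)
    # Reject values that are not made up entirely of number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        return None
    return sum(float(n) * UNIT_SECONDS[u] for n, u in parts)


def slowest_phase(fields, total_key="buildDuration"):
    """Return (field, seconds) for the longest timer other than the total."""
    timed = {}
    for key, value in fields.items():
        seconds = go_duration_seconds(value)
        if key != total_key and seconds is not None:
            timed[key] = seconds
    if not timed:
        return None
    name = max(timed, key=timed.get)
    return name, timed[name]


if __name__ == "__main__":
    fields = parse_logfmt(LINE)
    name, seconds = slowest_phase(fields)
    total = go_duration_seconds(fields["buildDuration"])
    print(f"{name}: {seconds:.3f}s of {total:.3f}s total")
```

Running the same helper over every "Regeneration of endpoint failed" line in the agent log would show whether the wait is always in the same phase.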
Anything else?
No response
Code of Conduct