Containerd v1.6.12 slow memory leak when pod readiness probe gets stuck forever #7802
Description
Test with a non-responding curl command
Pod.yaml
readinessProbe:
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 5
  exec:
    command:
    - /app/readiness-probe.sh
readiness-probe.sh
#!/bin/bash
curl -sS -X GET "http://localhost:9000/health/readiness"
When the curl command does not respond, the exec probe's bash script is left with curl running as a foreground process while probing the HTTP server. With periodSeconds: 5 and timeoutSeconds: 5, the following happens: the bash process times out after 5 s, and containerd/the shim kills the bash process it started, but not the foreground curl process.
After the timeout, the next probe runs after 2 minutes and 5 seconds rather than after 5 s as periodSeconds specifies. This is due to the kubelet configuration variable runtimeRequestTimeout in /var/lib/kubelet/config.yaml: it is set to 0s, and when it is 0 an internal kubelet timer falls back to a default of 2 minutes plus the spec's timeoutSeconds, in this case 2m + 5s.
This leads to a slow memory leak in containerd.
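The leaked-child behavior can be sketched outside Kubernetes (a minimal sketch, assuming a POSIX shell with pgrep available): a wrapper shell starts a long-running foreground child, only the wrapper is killed, as happens on probe timeout, and the child survives, reparented away from the wrapper.

```shell
#!/bin/bash
# Stand-in for the probe script: a shell whose foreground child ("curl")
# is simulated by a long sleep. The trailing ':' stops the shell from
# exec-ing the sleep directly, so sleep really is a child process.
sh -c 'sleep 300; :' &
wrapper=$!
sleep 1

child=$(pgrep -P "$wrapper" sleep)   # PID of the foreground child

# Kill only the wrapper, analogous to what happens on probe timeout.
kill -TERM "$wrapper"
sleep 1

# The child is still alive -- this is the process that leaks in the pod.
if kill -0 "$child" 2>/dev/null; then
    echo "child $child survived its parent"
fi
kill "$child" 2>/dev/null   # clean up
```

A group-wide signal would reap the child as well, but that requires the wrapper to lead its own process group (e.g. via setsid); the single-PID kill is what leaves orphans behind.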
Steps to reproduce the issue
To simulate the same scenario and make reproduction easy, I replaced the stuck curl command with a very long-running sleep. Deploying this pod reproduces the problem.
Pod with the probe stuck in sleep:
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: readiness
  name: readiness-exec
spec:
  containers:
  - name: readiness
    image: registry.k8s.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600000000 # A huge sleep to keep the pod running.
    readinessProbe:
      exec:
        command:
        - /bin/sh
        - -c
        - sleep 100000000 & # Fork a very long-running sleep to simulate a stuck probe.
      initialDelaySeconds: 5
      periodSeconds: 5
      timeoutSeconds: 5
Describe the results you received and expected
With the above pod deployed, monitoring containerd memory shows a slow leak on the worker where the pod is running:
worker:~> echo; date; ps -e -o rss,args | grep containerd| grep log-level
Mon 12 Dec 2022 12:52:45 PM UTC
81656 /usr/local/bin/containerd --log-level=warn
Mon 12 Dec 2022 01:03:09 PM UTC
82180 /usr/local/bin/containerd --log-level=warn
Mon 12 Dec 2022 01:09:06 PM UTC
82964 /usr/local/bin/containerd --log-level=warn
...
...
Mon 12 Dec 2022 02:34:44 PM UTC
90264 /usr/local/bin/containerd --log-level=warn
Mon 12 Dec 2022 03:03:13 PM UTC
92312 /usr/local/bin/containerd --log-level=warn
Mon 12 Dec 2022 03:25:04 PM UTC
94360 /usr/local/bin/containerd --log-level=warn
Mon 12 Dec 2022 04:00:28 PM UTC
96408 /usr/local/bin/containerd --log-level=warn
Mon 12 Dec 2022 04:28:40 PM UTC
100624 /usr/local/bin/containerd --log-level=warn
...
...
Tue 13 Dec 2022 02:42:26 AM UTC
150236 /usr/local/bin/containerd --log-level=warn
An exec into the pod shows sleeping processes whose number keeps increasing over time, one added every 2m 5s as explained above:
ps -ef
PID USER COMMAND
1 root /bin/sh -c touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600000000
15 root sleep 100000000
16 root sh
24 root sleep 600000000
37 root sleep 100000000
46 root sleep 100000000
53 root sleep 100000000
60 root sleep 100000000
67 root sleep 100000000
75 root sleep 100000000
82 root sleep 100000000
89 root sleep 100000000
96 root sleep 100000000
103 root sleep 100000000
110 root sleep 100000000
117 root sleep 100000000
124 root sleep 100000000
132 root sleep 100000000
138 root sleep 100000000
144 root sleep 100000000
151 root sleep 100000000
...
...
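The growth can also be watched from outside the pod; a hypothetical one-liner (pod name taken from the repro manifest above) counts the orphaned probe children, and the count rises by one roughly every 2m 5s:

```shell
# Count the leaked probe children inside the pod; rerun (or wrap in
# `watch`) to see the number grow by one about every 2m 5s.
kubectl exec readiness-exec -- ps -ef | grep -c 'sleep 100000000'
```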
What version of containerd are you using?
containerd github.com/containerd/containerd v1.6.12 a05d175
Any other relevant information
worker:~> runc -version
runc version 1.1.4
commit: v1.1.4-0-g5fd4c4d1
spec: 1.0.2-dev
go: go1.17.6
libseccomp: 2.5.3
worker:~> sudo crictl info
{
"status": {
"conditions": [
{
"type": "RuntimeReady",
"status": true,
"reason": "",
"message": ""
},
{
"type": "NetworkReady",
"status": true,
"reason": "",
"message": ""
}
]
},
"cniconfig": {
"PluginDirs": [
"/opt/cni/bin"
],
"PluginConfDir": "/etc/cni/net.d",
"PluginMaxConfNum": 1,
"Prefix": "eth",
"Networks": [
{
"Config": {
"Name": "cni-loopback",
"CNIVersion": "0.3.1",
"Plugins": [
{
"Network": {
"type": "loopback",
"ipam": {},
"dns": {}
},
"Source": "{\"type\":\"loopback\"}"
}
],
"Source": "{\n\"cniVersion\": \"0.3.1\",\n\"name\": \"cni-loopback\",\n\"plugins\": [{\n \"type\": \"loopback\"\n}]\n}"
},
"IFName": "lo"
},
{
"Config": {
"Name": "k8s-pod-network",
"CNIVersion": "0.3.1",
"Plugins": [
{
"Network": {
"type": "calico",
"ipam": {
"type": "calico-ipam"
},
"dns": {}
},
"Source": "{\"container_settings\":{\"allow_ip_forwarding\":true},\"etcd_ca_cert_file\":\"/etc/cni/net.d/calico-tls/etcd-ca\",\"etcd_cert_file\":\"/etc/cni/net.d/calico-tls/etcd-cert\",\"etcd_endpoints\":\"https://10.0.16.2:2379,https://10.0.16.20:2379,https://10.0.16.4:2379\",\"etcd_key_file\":\"/etc/cni/net.d/calico-tls/etcd-key\",\"ipam\":{\"assign_ipv4\":\"true\",\"assign_ipv6\":\"false\",\"type\":\"calico-ipam\"},\"kubernetes\":{\"kubeconfig\":\"/etc/cni/net.d/calico-kubeconfig\"},\"log_level\":\"error\",\"mtu\":2070,\"policy\":{\"k8s_api_root\":\"https://[10.96.0.1]:443\",\"type\":\"k8s\"},\"type\":\"calico\"}"
},
{
"Network": {
"type": "tuning",
"ipam": {},
"dns": {}
},
"Source": "{\"sysctl\":{\"net.ipv4.tcp_mtu_probing\":\"1\"},\"type\":\"tuning\"}"
},
{
"Network": {
"type": "portmap",
"capabilities": {
"portMappings": true
},
"ipam": {},
"dns": {}
},
"Source": "{\"capabilities\":{\"portMappings\":true},\"snat\":true,\"type\":\"portmap\"}"
}
],
"Source": "{\n \"name\": \"k8s-pod-network\",\n \"cniVersion\": \"0.3.1\",\n \"plugins\": [\n {\n \"type\": \"calico\",\n \"log_level\": \"error\",\n \"etcd_endpoints\": \"https://10.0.16.2:2379,https://10.0.16.20:2379,https://10.0.16.4:2379\",\n \"etcd_key_file\": \"/etc/cni/net.d/calico-tls/etcd-key\",\n \"etcd_cert_file\": \"/etc/cni/net.d/calico-tls/etcd-cert\",\n \"etcd_ca_cert_file\": \"/etc/cni/net.d/calico-tls/etcd-ca\",\n \"mtu\": 2070,\n \"ipam\": {\n \"type\": \"calico-ipam\",\n \"assign_ipv4\": \"true\",\n \"assign_ipv6\": \"false\"\n },\n \"container_settings\": {\n \"allow_ip_forwarding\": true\n },\n \"policy\": {\n \"type\": \"k8s\",\n \"k8s_api_root\": \"https://[10.96.0.1]:443\"\n },\n \"kubernetes\": {\n \"kubeconfig\": \"/etc/cni/net.d/calico-kubeconfig\"\n }\n },\n {\n \"type\": \"tuning\",\n \"sysctl\": {\"net.ipv4.tcp_mtu_probing\": \"1\"}\n },\n {\n \"type\": \"portmap\",\n \"snat\": true,\n \"capabilities\": {\"portMappings\": true}\n }\n ]\n}"
},
"IFName": "eth0"
}
]
},
"config": {
"containerd": {
"snapshotter": "overlayfs",
"defaultRuntimeName": "runc",
"defaultRuntime": {
"runtimeType": "",
"runtimePath": "",
"runtimeEngine": "",
"PodAnnotations": null,
"ContainerAnnotations": null,
"runtimeRoot": "",
"options": null,
"privileged_without_host_devices": false,
"baseRuntimeSpec": "",
"cniConfDir": "",
"cniMaxConfNum": 0
},
"untrustedWorkloadRuntime": {
"runtimeType": "",
"runtimePath": "",
"runtimeEngine": "",
"PodAnnotations": null,
"ContainerAnnotations": null,
"runtimeRoot": "",
"options": null,
"privileged_without_host_devices": false,
"baseRuntimeSpec": "",
"cniConfDir": "",
"cniMaxConfNum": 0
},
"runtimes": {
"runc": {
"runtimeType": "io.containerd.runc.v2",
"runtimePath": "",
"runtimeEngine": "",
"PodAnnotations": null,
"ContainerAnnotations": null,
"runtimeRoot": "",
"options": {
"SystemdCgroup": true
},
"privileged_without_host_devices": false,
"baseRuntimeSpec": "",
"cniConfDir": "",
"cniMaxConfNum": 0
}
},
"noPivot": false,
"disableSnapshotAnnotations": true,
"discardUnpackedLayers": false,
"ignoreRdtNotEnabledErrors": false
},
"cni": {
"binDir": "/opt/cni/bin",
"confDir": "/etc/cni/net.d",
"maxConfNum": 1,
"confTemplate": "",
"ipPref": ""
},
"registry": {
"configPath": "/etc/containerd/certs.d",
"mirrors": {},
"configs": null,
"auths": null,
"headers": null
},
"imageDecryption": {
"keyModel": "node"
},
"disableTCPService": true,
"streamServerAddress": "127.0.0.1",
"streamServerPort": "0",
"streamIdleTimeout": "4h0m0s",
"enableSelinux": false,
"selinuxCategoryRange": 1024,
"sandboxImage": "registry.eccd.local:5000/pause:3.8-1-3e405cfb",
"statsCollectPeriod": 10,
"systemdCgroup": false,
"enableTLSStreaming": false,
"x509KeyPairStreaming": {
"tlsCertFile": "",
"tlsKeyFile": ""
},
"maxContainerLogSize": 16384,
"disableCgroup": false,
"disableApparmor": false,
"restrictOOMScoreAdj": false,
"maxConcurrentDownloads": 3,
"disableProcMount": false,
"unsetSeccompProfile": "",
"tolerateMissingHugetlbController": true,
"disableHugetlbController": true,
"device_ownership_from_security_context": true,
"ignoreImageDefinedVolumes": false,
"netnsMountsUnderStateDir": false,
"enableUnprivilegedPorts": false,
"enableUnprivilegedICMP": false,
"containerdRootDir": "/var/lib/docker/containerd/root",
"containerdEndpoint": "/run/containerd/containerd.sock",
"rootDir": "/var/lib/docker/containerd/root/io.containerd.grpc.v1.cri",
"stateDir": "/run/containerd/io.containerd.grpc.v1.cri"
},
"golang": "go1.17.6",
"lastCNILoadStatus": "OK",
"lastCNILoadStatus.default": "OK"
}
worker:~> uname -a
Linux worker-pool1-xxxxxx 5.14.21-150400.24.33-default #1 SMP PREEMPT_DYNAMIC Fri Nov 4 13:55:06 UTC 2022 (76cfe60) x86_64 x86_64 x86_64 GNU/Linux
worker:~> cat /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
- 169.254.20.10
clusterDomain: cluster.local
containerLogMaxFiles: 5
containerLogMaxSize: 50Mi
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
featureGates:
  AllAlpha: false
  LegacyServiceAccountTokenNoAutoGeneration: false
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageGCHighThresholdPercent: 80
imageGCLowThresholdPercent: 75
imageMinimumGCAge: 0s
kind: KubeletConfiguration
kubeletCgroups: /ccd.slice/kubelet.service
logging:
  flushFrequency: 0
  options:
    json:
      infoBufferSize: "0"
  verbosity: 0
memorySwap: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
rotateCertificates: true
runtimeRequestTimeout: 0s
serverTLSBootstrap: true
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
tlsCipherSuites:
- TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
- TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
- TLS_RSA_WITH_AES_256_GCM_SHA384
- TLS_RSA_WITH_AES_128_GCM_SHA256
volumeStatsAggPeriod: 0s
cpuManagerPolicy: static
reservedSystemCPUs: "1,3"
systemReserved: {'ephemeral-storage': '1Gi', 'cpu': '1000m', 'memory': '500Mi'}
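If the analysis above is right, the 2m 5s probe gap comes from runtimeRequestTimeout being 0s, which makes kubelet fall back to its internal 2-minute default. A hedged tweak, not a fix for the leaked processes themselves, would be to set an explicit value so the fallback never applies:

```yaml
# /var/lib/kubelet/config.yaml (fragment) -- an explicit, non-zero value
# replaces the internal 2m fallback; 30s here is an illustrative choice.
runtimeRequestTimeout: 30s
```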
Show configuration if it is related to CRI plugin.
version = 2
root = "/var/lib/docker/containerd/root"
state = "/run/containerd"
plugin_dir = ""
disabled_plugins = []
required_plugins = []
oom_score = -999
[grpc]
address = "/run/containerd/containerd.sock"
tcp_address = ""
tcp_tls_cert = ""
tcp_tls_key = ""
uid = 0
gid = 0
max_recv_message_size = 16777216
max_send_message_size = 16777216
[ttrpc]
address = ""
uid = 0
gid = 0
[debug]
address = ""
uid = 0
gid = 0
level = ""
[metrics]
address = ""
grpc_histogram = false
[cgroup]
path = ""
[timeouts]
"io.containerd.timeout.shim.cleanup" = "5s"
"io.containerd.timeout.shim.load" = "5s"
"io.containerd.timeout.shim.shutdown" = "3s"
"io.containerd.timeout.task.state" = "2s"
[plugins]
[plugins."io.containerd.gc.v1.scheduler"]
pause_threshold = 0.02
deletion_threshold = 0
mutation_threshold = 100
schedule_delay = "0s"
startup_delay = "100ms"
[plugins."io.containerd.grpc.v1.cri"]
disable_tcp_service = true
stream_server_address = "127.0.0.1"
stream_server_port = "0"
stream_idle_timeout = "4h0m0s"
enable_selinux = false
sandbox_image = "registry.eccd.local:5000/pause:3.8-1-3e405cfb"
stats_collect_period = 10
systemd_cgroup = false
enable_tls_streaming = false
max_container_log_line_size = 16384
disable_cgroup = false
disable_apparmor = false
restrict_oom_score_adj = false
max_concurrent_downloads = 3
device_ownership_from_security_context = true
disable_proc_mount = false
[plugins."io.containerd.grpc.v1.cri".containerd]
snapshotter = "overlayfs"
default_runtime_name = "runc"
no_pivot = false
[plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
runtime_type = ""
runtime_engine = ""
runtime_root = ""
privileged_without_host_devices = false
[plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime]
runtime_type = ""
runtime_engine = ""
runtime_root = ""
privileged_without_host_devices = false
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
runtime_engine = ""
runtime_root = ""
privileged_without_host_devices = false
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/cni/bin"
conf_dir = "/etc/cni/net.d"
max_conf_num = 1
conf_template = ""
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = "/etc/containerd/certs.d"
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
tls_cert_file = ""
tls_key_file = ""
[plugins."io.containerd.internal.v1.opt"]
path = "/opt/containerd"
[plugins."io.containerd.internal.v1.restart"]
interval = "10s"
[plugins."io.containerd.metadata.v1.bolt"]
content_sharing_policy = "shared"
[plugins."io.containerd.monitor.v1.cgroups"]
no_prometheus = false
[plugins."io.containerd.runtime.v1.linux"]
shim = "containerd-shim"
runtime = "runc"
runtime_root = ""
no_shim = false
shim_debug = false
[plugins."io.containerd.runtime.v2.task"]
platforms = ["linux/amd64"]
[plugins."io.containerd.service.v1.diff-service"]
default = ["walking"]
[plugins."io.containerd.snapshotter.v1.devmapper"]
root_path = ""
pool_name = ""
base_image_size = ""