Application crashes because k8s 1.9.x enables kernel memory accounting by default #61937

After we upgraded k8s from 1.6.4 to 1.9.0, our production environment began reporting, within a few days, that machines hung and that JVMs inside containers crashed at random. We found that cgroup memory css IDs were not being released. Once the cgroup css ID grows larger than 65535, the machine hangs and we have to restart it.

We found that runc's `libcontainer/cgroups/fs/memory.go`, as vendored in k8s 1.9.0, had deleted an `if` condition, which causes kernel memory accounting to be enabled by default. We are running kernel 3.10.0-514.16.1.el7.x86_64, and on this version the kernel memory limit is not stable: it leaks memory cgroups and makes applications crash at random.

When we run `docker run -d --name test001 --kernel-memory 100M`, Docker reports:

> WARNING: You specified a kernel memory limit on a kernel older than 4.0. Kernel memory limits are experimental on older kernels, it won't work as expected and can cause your system to be unstable.

`k8s.io/kubernetes/vendor/github.com/opencontainers/runc/libcontainer/cgroups/fs/memory.go`:

```diff
-		if d.config.KernelMemory != 0 {
+			// Only enable kernel memory accouting when this cgroup
+			// is created by libcontainer, otherwise we might get
+			// error when people use `cgroupsPath` to join an existed
+			// cgroup whose kernel memory is not initialized.
 			if err := EnableKernelMemoryAccounting(path); err != nil {
 				return err
 			}
```
I would like to know why kernel memory accounting is enabled by default. Could k8s take the kernel version into account?
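
To illustrate the second question, here is a minimal sketch of what a kernel-version gate might look like. It is not code from runc or k8s: `parseKernelRelease` and `kmemConsideredStable` are hypothetical names, and the 4.0 threshold is simply the one quoted in the Docker warning above. The version number alone is only a heuristic, since distribution kernels may carry backports.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// leadingInt parses the decimal digits at the start of s, so that a
// component such as "0-514" yields 0.
func leadingInt(s string) (int, error) {
	end := 0
	for end < len(s) && s[end] >= '0' && s[end] <= '9' {
		end++
	}
	return strconv.Atoi(s[:end])
}

// parseKernelRelease (hypothetical helper) extracts the major and minor
// numbers from a release string in the format printed by `uname -r`,
// e.g. "3.10.0-514.16.1.el7.x86_64".
func parseKernelRelease(release string) (int, int, error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("unexpected kernel release %q", release)
	}
	major, err := leadingInt(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("bad major version in %q: %v", release, err)
	}
	minor, err := leadingInt(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("bad minor version in %q: %v", release, err)
	}
	return major, minor, nil
}

// kmemConsideredStable (hypothetical helper) reports whether the kernel is
// at least 4.0, the threshold named in the Docker warning.
func kmemConsideredStable(release string) (bool, error) {
	major, _, err := parseKernelRelease(release)
	if err != nil {
		return false, err
	}
	return major >= 4, nil
}

func main() {
	release := "3.10.0-514.16.1.el7.x86_64" // the kernel from this report
	ok, err := kmemConsideredStable(release)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: enable kernel memory accounting = %v\n", release, ok)
}
```

A caller could skip `EnableKernelMemoryAccounting` when such a check returns false, unless a kernel memory limit was requested explicitly.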

**Is this a BUG REPORT or FEATURE REQUEST?**: BUG REPORT

> Uncomment only one, leave it on its own line: 
>
> /kind bug
> /kind feature


**What happened**:
Applications crash and memory cgroups leak.

**What you expected to happen**:
Applications stay stable and memory cgroups do not leak.

**How to reproduce it (as minimally and precisely as possible)**:
Install k8s 1.9.x on a machine running kernel 3.10.0-514.16.1.el7.x86_64, then create and delete pods repeatedly. After more than 65535/3 (about 21,845) creations, the kubelet reports a "cgroup no space left on device" error. After the cluster has run for a few days, containers start to crash.

**Anything else we need to know?**:

**Environment**: kernel 3.10.0-514.16.1.el7.x86_64   
- Kubernetes version (use `kubectl version`): k8s 1.9.x
- Cloud provider or hardware configuration: 
- OS (e.g. from /etc/os-release):  
```
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
```
- Kernel (e.g. `uname -a`):  3.10.0-514.16.1.el7.x86_64 
- Install tools: rpm 
- Others:

