Closed
Description
We observed higher memory usage (likely during container startup) after the CVE fix in commit 0a8e411.
We have a test that sets a 10m memory cgroup limit on the container. It never failed before, but now the container gets OOM-killed frequently. For example: https://gubernator.k8s.io/build/kubernetes-jenkins/logs/ci-containerd-node-e2e-1-2/2500.
kernel: runc:[2:INIT] invoked oom-killer: gfp_mask=0x24000c0, order=0, oom_score_adj=998
kernel: runc:[2:INIT] cpuset=80e651c417ebd71d83e5023ee59b281e585497468bd71ee7c7b3ae6730d9ec8f mems_allowed=0
kernel: CPU: 0 PID: 333 Comm: runc:[2:INIT] Not tainted 4.4.64+ #1
kernel: Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
kernel: 0000000000000000 ffff880003e87ca8 ffffffff9f317334 ffff880003e87d88
kernel: ffff8800bb3e8000 ffff880003e87d18 ffffffff9f1a8fb4 ffff880003e87ce0
kernel: ffffffff9f13e780 ffff8800bb3eb500 0000000000000206 ffff880003e87cf0
kernel: Call Trace:
kernel: [<ffffffff9f317334>] dump_stack+0x63/0x8f
kernel: [<ffffffff9f1a8fb4>] dump_header+0x65/0x1d4
kernel: [<ffffffff9f13e780>] ? find_lock_task_mm+0x20/0xb0
kernel: [<ffffffff9f13ef1d>] oom_kill_process+0x28d/0x430
kernel: [<ffffffff9f1a3e6b>] ? mem_cgroup_iter+0x1db/0x390
kernel: [<ffffffff9f1a6374>] mem_cgroup_out_of_memory+0x284/0x2d0
kernel: [<ffffffff9f1a6de9>] mem_cgroup_oom_synchronize+0x2f9/0x310
kernel: [<ffffffff9f1a1ab0>] ? memory_high_write+0xc0/0xc0
kernel: [<ffffffff9f13f5f8>] pagefault_out_of_memory+0x38/0xa0
kernel: [<ffffffff9f045a27>] mm_fault_error+0x77/0x150
kernel: [<ffffffff9f046264>] __do_page_fault+0x414/0x420
kernel: [<ffffffff9f046292>] do_page_fault+0x22/0x30
kernel: [<ffffffff9f5b1f98>] page_fault+0x28/0x30
This seems to be caused by the memory spike from the binary copy introduced by the fix. Should we always enforce a minimum memory limit for runc containers going forward?
My runc binary is statically linked:
$ ls -alh usr/local/sbin/runc
-rwxr-xr-x 1 lantaol primarygroup 7.8M Feb 12 13:46 usr/local/sbin/runc
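To make the suspected spike concrete, here is a rough back-of-the-envelope sketch. It assumes that "10m" means 10 MiB and that a full copy of the runc binary is charged to the container's memory cgroup during init. Neither assumption is confirmed here, and `headroom_after_copy` is a hypothetical helper used only for illustration.

```python
MIB = 1024 * 1024


def headroom_after_copy(limit_bytes: int, binary_bytes: int) -> int:
    """Bytes left under the cgroup limit once one full copy of the binary is charged to it.

    Assumption: the whole copy counts against the container's memory cgroup.
    """
    return limit_bytes - binary_bytes


# Assumption: the test's "10m" limit is 10 MiB.
limit = 10 * MIB
# Size taken from the `ls -alh` output above (7.8M, rounded by ls).
binary = int(7.8 * MIB)

left = headroom_after_copy(limit, binary)
print(f"{left / MIB:.1f} MiB left for everything else in the cgroup")
```

Under these assumptions only about 2 MiB of the limit remains for the rest of container startup, which would be consistent with the OOM kills above.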