Conversation

@ningmingxiao
Contributor

@ningmingxiao ningmingxiao commented Aug 16, 2024

@laurazard after commit 892dc54
we found that init exited events are sometimes dropped.
Fixes #10589

@k8s-ci-robot

Hi @ningmingxiao. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ningmingxiao
Contributor Author

ningmingxiao commented Aug 16, 2024

Adding some sleeps makes this reproducible:

```go
func (s *service) preStart(c *runc.Container) (handleStarted func(*runc.Container, process.Process), cleanup func()) {
	exits := make(map[int][]runcC.Exit)
	s.exitSubscribers[&exits] = struct{}{}

	if c != nil {
		// Remove container init process from s.running so it will once again be
		// treated as an early exit if it exits before handleStarted is called.
		pid := c.Pid()
		var newRunning []containerProcess
		for _, cp := range s.running[pid] {
			if cp.Container != c {
				newRunning = append(newRunning, cp)
			}
		}
		if len(newRunning) > 0 {
			s.running[pid] = newRunning
		} else {
			delete(s.running, pid)
		}
	}

	handleStarted = func(c *runc.Container, p process.Process) {
		var pid int
		if p != nil {
			pid = p.Pid()
		}

		_, init := p.(*process.Init)
		if !init {
			time.Sleep(time.Second * 30)
		}
		s.lifecycleMu.Lock()

		var initExits []runcC.Exit
		var initCps []containerProcess
		if !init {
			time.Sleep(time.Second * 30)
```
Repro steps:

1. `crictl create pod`
2. `crictl exec` into the container (it will hang for some time because of the sleep)
3. In another terminal, kill the container's init process and the exec process.
4. `crictl ps` still shows the container as running.

@ningmingxiao changed the title from "fix init exited event sometimes lost" to "fix init process exited event lost" on Aug 16, 2024
@ningmingxiao force-pushed the fix_event_lost_new branch 2 times, most recently from 4009103 to 110d5b5 on August 16, 2024 06:34
@samuelkarp
Member

/ok-to-test

@ningmingxiao force-pushed the fix_event_lost_new branch 4 times, most recently from d393cd4 to f2eb6e9 on August 16, 2024 09:25
@ningmingxiao
Contributor Author

/ok-to-test

done! thank you.

@ningmingxiao force-pushed the fix_event_lost_new branch 2 times, most recently from 2d91851 to ba07947 on August 16, 2024 09:37
@laurazard requested a review from cpuguy83 on August 16, 2024 10:49
@laurazard
Member

@ningmingxiao can you explain why the exits are being lost/how this fixes it?

The original code skips exit events for init processes for as long as there is a pending exec for that container, and stores those skipped events in s.running for that pid.

This is because exits for the init process shouldn't be sent while there are still unresolved pending execs.

Those skipped events should then be processed after the pending execs are resolved.

Simply emitting those exit events where this change does (before the pending execs have been resolved) doesn't look correct (but I need to take a closer look).

@laurazard
Member

laurazard commented Aug 16, 2024

I also don't really understand your repro instructions: you're adding a time.Sleep, for the !init case, twice: once before the lifecycleMu.Lock() and once after. What does this demonstrate?

Trying to reproduce this, after killing the init process, the container correctly shows as stopped:

```shell
$ sudo ctr run -d docker.io/library/alpine:latest test sh -c "while true; do sleep 1; done"
$ sudo ctr t exec -d --exec-id 1 test sleep 120
[this hangs because of the introduced sleep]
[on another terminal]
$ ps aux | grep sleep
root       46277  0.0  0.0   1720   896 ?        Ss   12:44   0:00 sh -c while true; do sleep 1; done
root       46381  0.0  0.0  14060  4224 pts/10   S+   12:44   0:00 sudo ctr t exec -d --exec-id 1 borkeee sleep 120
root       46383  0.0  0.0  14060  1572 pts/9    Ss   12:44   0:00 sudo ctr t exec -d --exec-id 1 borkeee sleep 120
root       46384  0.0  0.2 1987836 23040 pts/9   Sl+  12:44   0:00 ctr t exec -d --exec-id 1 borkeee sleep 120
root       46404  0.0  0.0   1704   768 ?        Ss   12:44   0:00 sleep 120

$ kill -9 46277
$ sudo ctr t ls
TASK    PID      STATUS
test    46277    STOPPED
```

@ningmingxiao
Contributor Author

ningmingxiao commented Aug 16, 2024

```shell
[root@LIN-FB738BFD367 pods]# cat pod.json
{
    "metadata": {
        "name": "sandbox-01027951850",
        "namespace": "default",
        "attempt": 1,
        "uid": "hdishd83djaidwnduwk28bcsb"
    },
    "log_directory": "/tmp",
    "linux": {
    }
}

[root@LIN-FB738BFD367 pods]# cat container.json
{
    "metadata": {
        "name": "gpu"
    },
    "image": {
        "image": "docker.io/library/busybox:1.28"
    },
    "command": [
        "top"
    ],
    "tty": true
}

[root@LIN-FB738BFD367 pods]# crictl run --no-pull container.json pod.json
1e5aec477334f2616131c2d3f710af349aca0224c2261f6b5052d98667091767
[root@LIN-FB738BFD367 pods]# crictl run --no-pull container.json pod.json
5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9
[root@LIN-FB738BFD367 pods]# crictl ps
CONTAINER           IMAGE                            CREATED             STATE               NAME                ATTEMPT             POD ID              POD
5f6c4461e154f       docker.io/library/busybox:1.28   2 seconds ago       Running             gpu                 0                   e2181ce0e5c30       unknown

# ps -auxf
root     1839998  0.0  0.0 1241592 14860 pts/9   Sl   20:40   0:00 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id e2181ce0e5c3023358273e43
root     1840031  0.0  0.0   1024     4 ?        Ss   20:40   0:00  \_ /pause
root     1840091  0.0  0.0   1240     4 pts/0    Ss+  20:40   0:00  \_ top

[root@LIN-FB738BFD367 pods]# crictl exec -it 5f6 sh
```

Open another terminal:

```shell
[root@LIN-FB738BFD367 ~]# kill -9 1840091
[root@LIN-FB738BFD367 ~]# crictl ps
CONTAINER           IMAGE                            CREATED             STATE               NAME                ATTEMPT             POD ID              POD
5f6c4461e154f       docker.io/library/busybox:1.28   38 seconds ago      Running             gpu                 0                   e2181ce0e5c30       unknown
# ps -auxf
root     1839998  0.0  0.0 1241592 14988 pts/9   Sl   20:40   0:00 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id e2181ce0e5c3023358273e43
root     1840031  0.0  0.0   1024     4 ?        Ss   20:40   0:00  \_ /pause
root     1842880  0.0  0.0   1244     4 pts/1    Ss+  20:41   0:00  \_ sh

[root@LIN-FB738BFD367 ~]# kill -9 1842880
[root@LIN-FB738BFD367 ~]# crictl ps
CONTAINER           IMAGE                            CREATED              STATE               NAME                ATTEMPT             POD ID              POD
5f6c4461e154f       docker.io/library/busybox:1.28   About a minute ago   Running             gpu                 0                   e2181ce0e5c30       unknown
```

Even after a long time, `crictl ps` still shows the container as running:

```shell
[root@LIN-FB738BFD367 ~]# crictl ps
CONTAINER           IMAGE                            CREATED             STATE               NAME                ATTEMPT             POD ID              POD
5f6c4461e154f       docker.io/library/busybox:1.28   8 minutes ago       Running             gpu                 0                   e2181ce0e5c30       unknown
```

containerd log (commit 919beb1):

```text
INFO[2024-08-16T20:40:41.594260261+08:00] CreateContainer within sandbox "e2181ce0e5c3023358273e43b54cd48b31fa9d2d421125f9a78d6b9a6a68618a" for container &ContainerMetadata{Name:gpu,Attempt:0,}
INFO[2024-08-16T20:40:41.594768938+08:00] Logging will be disabled due to empty log paths for sandbox ("/tmp") or container ("")
INFO[2024-08-16T20:40:41.623434266+08:00] CreateContainer within sandbox "e2181ce0e5c3023358273e43b54cd48b31fa9d2d421125f9a78d6b9a6a68618a" for &ContainerMetadata{Name:gpu,Attempt:0,} returns container id "5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9"
INFO[2024-08-16T20:40:41.624527310+08:00] StartContainer for "5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9"
INFO[2024-08-16T20:40:41.625515845+08:00] connecting to shim 5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9  address="unix:///run/containerd/s/487ed06380981c608328c0ef9115c338208b1ddbfcd6af78bc8243c2f04a3cc6" protocol=ttrpc version=3
INFO[2024-08-16T20:40:41.734063051+08:00] StartContainer for "5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9" returns successfully
INFO[2024-08-16T20:42:08.095278885+08:00] TaskExit event in podsandbox handler container_id:"5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9"  id:"c98c01da1f4be3c39b2432fa99f42223a3635d52bae7f142dc49257d6bb7ea17"  pid:1842880  exit_status:137  exited_at:{seconds:1723812128  nanos:94596305}
INFO[2024-08-16T20:42:08.095511442+08:00] Container exec "c98c01da1f4be3c39b2432fa99f42223a3635d52bae7f142dc49257d6bb7ea17" stdin closed
ERRO[2024-08-16T20:42:08.095948969+08:00] Failed to resize process "c98c01da1f4be3c39b2432fa99f42223a3635d52bae7f142dc49257d6bb7ea17" console for container "5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9"  error="cannot resize a stopped container: unknown"
WARN[2024-08-16T20:45:25.847943487+08:00] container event discarded                     container=2c67115babd793d8b5bb0b781399b165a4698558c9b3bff7550b5dea8f48b0d5 type=CONTAINER_STOPPED_EVENT
WARN[2024-08-16T20:45:25.848030440+08:00] container event discarded                     container=afed902f31aea8b2f95ce68de586163694458f74815de37fbf9707faa12b2aee type=CONTAINER_STOPPED_EVENT
WARN[2024-08-16T20:45:25.858362216+08:00] container event discarded                     container=cb20c90399b3ca1127f4170138eea404eaa8a34b354f3c9f4b2c55527dea462c type=CONTAINER_STOPPED_EVENT
WARN[2024-08-16T20:45:27.359043553+08:00] container event discarded                     container=ace36326a028d450dae3b3a181c8ba12b5749b3b2115726016154af254996dda type=CONTAINER_STOPPED_EVENT
WARN[2024-08-16T20:45:41.603021797+08:00] container event discarded                     container=e2181ce0e5c3023358273e43b54cd48b31fa9d2d421125f9a78d6b9a6a68618a type=CONTAINER_CREATED_EVENT
WARN[2024-08-16T20:45:41.603129911+08:00] container event discarded                     container=e2181ce0e5c3023358273e43b54cd48b31fa9d2d421125f9a78d6b9a6a68618a type=CONTAINER_STARTED_EVENT
WARN[2024-08-16T20:45:41.634464404+08:00] container event discarded                     container=5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9 type=CONTAINER_CREATED_EVENT
WARN[2024-08-16T20:45:41.744911609+08:00] container event discarded                     container=5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9 type=CONTAINER_STARTED_EVENT
```

There is no exit log for pid 1840091. @laurazard

@laurazard
Member

laurazard commented Aug 16, 2024

@ningmingxiao so this happens if you kill the main process while the exec is starting/preStart hasn't run yet for the exec? There might be a missing lock/unlock/deadlock issue there, I'm unsure. Can you explain what you think the issue is/how your change fixes it?

@ningmingxiao
Contributor Author

> @ningmingxiao so this happens if you kill the main process while the exec is starting/preStart hasn't run yet for the exec? There might be a missing lock/unlock/deadlock issue there, I'm unsure. Can you explain what you think the issue is/how your change fixes it?
>
> Since you introduced a sleep, by the time you try to start the exec/the lock gets released it might be in a weird state. It could happen that the init exit gets skipped but, since the exec fails somewhere unexpected, it's also never appended/never gets processed.

I added some logging and found that the init exit event is added to skipped:

```go
for _, initPidCp := range s.running[initPid] {
	if initPidCp.Container == c {
		initCps = append(initCps, initPidCp)
	} else {
		skipped = append(skipped, initPidCp)
	}
}
```

and handleProcessExit never runs, because exited is never set:

```go
if pid == 0 || exited {
	s.lifecycleMu.Unlock()
	for _, ee := range ees {
		s.handleProcessExit(ee, c, p)
	}
	for _, eee := range initExits {
		for _, cp := range initCps {
			s.handleProcessExit(eee, cp.Container, cp.Process)
		}
	}
}
```
		}

@laurazard
Member

Where are the logs for the exec started in #10603 (comment)?

@ningmingxiao
Contributor Author

ningmingxiao commented Aug 16, 2024

Another way to reproduce, without adding a sleep:

Using the same pod.json and container.json as above:

```shell
[root@LIN-FB738BFD367 pods]# crictl run --no-pull container.json pod.json
1e5aec477334f2616131c2d3f710af349aca0224c2261f6b5052d98667091767
```
Then run this program, which starts 500 concurrent execs and kills the container's init process two seconds in:

```go
package main

import (
	"fmt"
	"os/exec"
	"sync"
	"time"
)

func main() {
	go func() {
		time.Sleep(time.Second * 2)
		runCmd("kill -9 `pidof top`")
	}()
	test()
	fmt.Println("done")
}

func runCmd(command string) {
	cmd := exec.Command("bash", "-c", command)
	if err := cmd.Run(); err != nil {
		fmt.Println(err)
	}
}

func test() {
	var wg sync.WaitGroup
	const numContainers = 500
	for i := 1; i <= numContainers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			runCmd("crictl exec 1e5aec477 date")
		}(i)
	}
	wg.Wait()
}
```

`crictl ps` shows the container is still running.

@ningmingxiao force-pushed the fix_event_lost_new branch 3 times, most recently from 96bf2c6 to 7cec07e on August 24, 2024 15:18
@ningmingxiao
Contributor Author

ningmingxiao commented Aug 24, 2024

Kill the init process's children when the init process exit is received, and send the event later. @corhere @laurazard
We no longer need to prevent new exec processes.
#10634

@ningmingxiao force-pushed the fix_event_lost_new branch 2 times, most recently from 65bf79f to 8868f15 on August 24, 2024 20:12
@ningmingxiao force-pushed the fix_event_lost_new branch 3 times, most recently from 8077774 to 00ac973 on August 25, 2024 01:22
@ningmingxiao force-pushed the fix_event_lost_new branch 3 times, most recently from 75369cf to 39d2441 on August 26, 2024 08:26

```go
func (s *service) processExits() {
	for e := range s.ec {
		s.killInitChildren(e)
```
Member

I don't believe this change alone fixes the issue. Even if we kill the execs when the init exits, what's there to prevent new execs from being consistently started (and thereby incrementing s.pendingExecs[c]), never allowing us to process the init's exit?

Contributor Author

@ningmingxiao ningmingxiao Aug 26, 2024


After `runc kill --all -9`, a new exec will return an error:

```shell
[root@LIN-FB738BFD367 pods]# crictl exec 6318 date
FATA[0000] execing command in container: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "d116cb9da07e0d27d9ddbbc8bd469bbc994adb67e12cbba86cce549148650e31": OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown
```

After `runc kill --all -9` the runc container status is stopped, so `runc exec` will fail.

Contributor Author


How about changing handleProcessExit to sendProcessExit? We just need the init exit to be the last event sent. We should call p.SetExited(e.Status) as soon as possible.

Member


We'd have to dig into what the consequences of calling p.SetExited (potentially) long before sending the exit event would be. Does that mean that if I check the shim status it'll say it's dead, even though it (potentially) hasn't emitted the init exit yet? Does that break any kind of explicit (or implicit) contract?

Signed-off-by: ningmingxiao <ning.mingxiao@zte.com.cn>
@samuelkarp
Member

Closing in favor of #10650 and #10651 since work has moved there.
