Conversation

@ningmingxiao
Contributor

@ningmingxiao ningmingxiao commented Aug 16, 2024

@laurazard after commit 892dc54
we found that init exited events are sometimes dropped.
Fixes #10589

@k8s-ci-robot

Hi @ningmingxiao. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ningmingxiao
Contributor Author

ningmingxiao commented Aug 16, 2024

Adding some sleeps makes this reproducible:

```go
func (s *service) preStart(c *runc.Container) (handleStarted func(*runc.Container, process.Process), cleanup func()) {
	exits := make(map[int][]runcC.Exit)
	s.exitSubscribers[&exits] = struct{}{}

	if c != nil {
		// Remove container init process from s.running so it will once again be
		// treated as an early exit if it exits before handleStarted is called.
		pid := c.Pid()
		var newRunning []containerProcess
		for _, cp := range s.running[pid] {
			if cp.Container != c {
				newRunning = append(newRunning, cp)
			}
		}
		if len(newRunning) > 0 {
			s.running[pid] = newRunning
		} else {
			delete(s.running, pid)
		}
	}

	handleStarted = func(c *runc.Container, p process.Process) {
		var pid int
		if p != nil {
			pid = p.Pid()
		}

		_, init := p.(*process.Init)
		if !init {
			time.Sleep(time.Second * 30)
		}
		s.lifecycleMu.Lock()

		var initExits []runcC.Exit
		var initCps []containerProcess
		if !init {
			time.Sleep(time.Second * 30)
```
Repro steps:

1. `crictl create pod`
2. `crictl exec` into the container (it will hang for some time because of the sleep)
3. In another terminal, kill the container's init process and the exec process.
4. `crictl ps` still shows the container as running.

@ningmingxiao changed the title from "fix init exited event sometimes lost" to "fix init process exited event lost" on Aug 16, 2024
@ningmingxiao force-pushed the fix_event_lost_new branch 2 times, most recently from 4009103 to 110d5b5 on August 16, 2024 06:34
@samuelkarp
Member

/ok-to-test

@ningmingxiao force-pushed the fix_event_lost_new branch 4 times, most recently from d393cd4 to f2eb6e9 on August 16, 2024 09:25
@ningmingxiao
Contributor Author

/ok-to-test

done! thank you.

@ningmingxiao force-pushed the fix_event_lost_new branch 2 times, most recently from 2d91851 to ba07947 on August 16, 2024 09:37
@laurazard requested a review from cpuguy83 on August 16, 2024 10:49
@laurazard
Member

@ningmingxiao can you explain why the exits are being lost/how this fixes it?

The original code skips exit events for init processes for as long as there is a pending exec for that container, and stores those skipped events in s.running for that pid.

This is because exits for the init process shouldn't be sent while there are still unresolved pending execs.

Those skipped events should then be processed after the pending execs are resolved.

Simply emitting those exit events where this change does (before the pending execs have been resolved) doesn't look correct (but I need to take a closer look).

@laurazard
Member

laurazard commented Aug 16, 2024

I also don't really understand your repro instructions: you're adding a time.Sleep, for the !init case, twice: once before the lifecycleMu.Lock() and once after. What does this demonstrate?

Trying to reproduce this, after killing the init process, the container correctly shows as stopped:

```shell
$ sudo ctr run -d docker.io/library/alpine:latest test sh -c "while true; do sleep 1; done"
$ sudo ctr t exec -d --exec-id 1 test sleep 120
[this hangs because of the introduced sleep]
[on another terminal]
$ ps aux | grep sleep
root       46277  0.0  0.0   1720   896 ?        Ss   12:44   0:00 sh -c while true; do sleep 1; done
root       46381  0.0  0.0  14060  4224 pts/10   S+   12:44   0:00 sudo ctr t exec -d --exec-id 1 borkeee sleep 120
root       46383  0.0  0.0  14060  1572 pts/9    Ss   12:44   0:00 sudo ctr t exec -d --exec-id 1 borkeee sleep 120
root       46384  0.0  0.2 1987836 23040 pts/9   Sl+  12:44   0:00 ctr t exec -d --exec-id 1 borkeee sleep 120
root       46404  0.0  0.0   1704   768 ?        Ss   12:44   0:00 sleep 120

$ kill -9 46277
$ sudo ctr t ls
TASK    PID      STATUS
test    46277    STOPPED
```

@ningmingxiao
Contributor Author

ningmingxiao commented Aug 16, 2024

```shell
[root@LIN-FB738BFD367 pods]# cat pod.json
{
    "metadata": {
        "name": "sandbox-01027951850",
        "namespace": "default",
        "attempt": 1,
        "uid": "hdishd83djaidwnduwk28bcsb"
    },
    "log_directory": "/tmp",
    "linux": {
    }
}

[root@LIN-FB738BFD367 pods]# cat container.json
{
    "metadata": {
        "name": "gpu"
    },
    "image": {
        "image": "docker.io/library/busybox:1.28"
    },
    "command": [
        "top"
    ],
    "tty": true
}

[root@LIN-FB738BFD367 pods]# crictl run --no-pull container.json pod.json
1e5aec477334f2616131c2d3f710af349aca0224c2261f6b5052d98667091767
[root@LIN-FB738BFD367 pods]# crictl run --no-pull container.json pod.json
5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9
[root@LIN-FB738BFD367 pods]# crictl ps
CONTAINER           IMAGE                            CREATED             STATE               NAME                ATTEMPT             POD ID              POD
5f6c4461e154f       docker.io/library/busybox:1.28   2 seconds ago       Running             gpu                 0                   e2181ce0e5c30       unknown

# ps -auxf
root     1839998  0.0  0.0 1241592 14860 pts/9   Sl   20:40   0:00 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id e2181ce0e5c3023358273e43
root     1840031  0.0  0.0   1024     4 ?        Ss   20:40   0:00  \_ /pause
root     1840091  0.0  0.0   1240     4 pts/0    Ss+  20:40   0:00  \_ top

[root@LIN-FB738BFD367 pods]# crictl exec -it 5f6 sh
```

Open another terminal:

```shell
[root@LIN-FB738BFD367 ~]# kill -9 1840091
[root@LIN-FB738BFD367 ~]# crictl ps
CONTAINER           IMAGE                            CREATED             STATE               NAME                ATTEMPT             POD ID              POD
5f6c4461e154f       docker.io/library/busybox:1.28   38 seconds ago      Running             gpu                 0                   e2181ce0e5c30       unknown
# ps -auxf
root     1839998  0.0  0.0 1241592 14988 pts/9   Sl   20:40   0:00 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id e2181ce0e5c3023358273e43
root     1840031  0.0  0.0   1024     4 ?        Ss   20:40   0:00  \_ /pause
root     1842880  0.0  0.0   1244     4 pts/1    Ss+  20:41   0:00  \_ sh

[root@LIN-FB738BFD367 ~]# kill -9 1842880
[root@LIN-FB738BFD367 ~]# crictl ps
CONTAINER           IMAGE                            CREATED              STATE               NAME                ATTEMPT             POD ID              POD
5f6c4461e154f       docker.io/library/busybox:1.28   About a minute ago   Running             gpu                 0                   e2181ce0e5c30       unknown
```

Even after a long time, `crictl ps` still shows the container as running:

```shell
[root@LIN-FB738BFD367 ~]# crictl ps
CONTAINER           IMAGE                            CREATED             STATE               NAME                ATTEMPT             POD ID              POD
5f6c4461e154f       docker.io/library/busybox:1.28   8 minutes ago       Running             gpu                 0                   e2181ce0e5c30       unknown
```

containerd log (commit 919beb1):

```text
INFO[2024-08-16T20:40:41.594260261+08:00] CreateContainer within sandbox "e2181ce0e5c3023358273e43b54cd48b31fa9d2d421125f9a78d6b9a6a68618a" for container &ContainerMetadata{Name:gpu,Attempt:0,}
INFO[2024-08-16T20:40:41.594768938+08:00] Logging will be disabled due to empty log paths for sandbox ("/tmp") or container ("")
INFO[2024-08-16T20:40:41.623434266+08:00] CreateContainer within sandbox "e2181ce0e5c3023358273e43b54cd48b31fa9d2d421125f9a78d6b9a6a68618a" for &ContainerMetadata{Name:gpu,Attempt:0,} returns container id "5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9"
INFO[2024-08-16T20:40:41.624527310+08:00] StartContainer for "5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9"
INFO[2024-08-16T20:40:41.625515845+08:00] connecting to shim 5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9  address="unix:///run/containerd/s/487ed06380981c608328c0ef9115c338208b1ddbfcd6af78bc8243c2f04a3cc6" protocol=ttrpc version=3
INFO[2024-08-16T20:40:41.734063051+08:00] StartContainer for "5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9" returns successfully
INFO[2024-08-16T20:42:08.095278885+08:00] TaskExit event in podsandbox handler container_id:"5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9"  id:"c98c01da1f4be3c39b2432fa99f42223a3635d52bae7f142dc49257d6bb7ea17"  pid:1842880  exit_status:137  exited_at:{seconds:1723812128  nanos:94596305}
INFO[2024-08-16T20:42:08.095511442+08:00] Container exec "c98c01da1f4be3c39b2432fa99f42223a3635d52bae7f142dc49257d6bb7ea17" stdin closed
ERRO[2024-08-16T20:42:08.095948969+08:00] Failed to resize process "c98c01da1f4be3c39b2432fa99f42223a3635d52bae7f142dc49257d6bb7ea17" console for container "5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9"  error="cannot resize a stopped container: unknown"
WARN[2024-08-16T20:45:25.847943487+08:00] container event discarded                     container=2c67115babd793d8b5bb0b781399b165a4698558c9b3bff7550b5dea8f48b0d5 type=CONTAINER_STOPPED_EVENT
WARN[2024-08-16T20:45:25.848030440+08:00] container event discarded                     container=afed902f31aea8b2f95ce68de586163694458f74815de37fbf9707faa12b2aee type=CONTAINER_STOPPED_EVENT
WARN[2024-08-16T20:45:25.858362216+08:00] container event discarded                     container=cb20c90399b3ca1127f4170138eea404eaa8a34b354f3c9f4b2c55527dea462c type=CONTAINER_STOPPED_EVENT
WARN[2024-08-16T20:45:27.359043553+08:00] container event discarded                     container=ace36326a028d450dae3b3a181c8ba12b5749b3b2115726016154af254996dda type=CONTAINER_STOPPED_EVENT
WARN[2024-08-16T20:45:41.603021797+08:00] container event discarded                     container=e2181ce0e5c3023358273e43b54cd48b31fa9d2d421125f9a78d6b9a6a68618a type=CONTAINER_CREATED_EVENT
WARN[2024-08-16T20:45:41.603129911+08:00] container event discarded                     container=e2181ce0e5c3023358273e43b54cd48b31fa9d2d421125f9a78d6b9a6a68618a type=CONTAINER_STARTED_EVENT
WARN[2024-08-16T20:45:41.634464404+08:00] container event discarded                     container=5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9 type=CONTAINER_CREATED_EVENT
WARN[2024-08-16T20:45:41.744911609+08:00] container event discarded                     container=5f6c4461e154f2399ec4fa473fac931a68df67bc59be46fe16936f0d2d3113d9 type=CONTAINER_STARTED_EVENT
```

There is no exit log for pid 1840091. @laurazard

@laurazard
Member

laurazard commented Aug 16, 2024

@ningmingxiao so this happens if you kill the main process while the exec is starting/preStart hasn't run yet for the exec? There might be a missing lock/unlock/deadlock issue there, I'm unsure. Can you explain what you think the issue is/how your change fixes it?

@ningmingxiao
Contributor Author

> @ningmingxiao so this happens if you kill the main process while the exec is starting/preStart hasn't run yet for the exec? There might be a missing lock/unlock/deadlock issue there, I'm unsure. Can you explain what you think the issue is/how your change fixes it?
>
> Since you introduced a sleep, by the time you try to start the exec/the lock gets released it might be in a weird state. It could happen that the init exit gets skipped but, since the exec fails somewhere unexpected, it's also never appended/never gets processed.

I added some logging and found that the init exit event is added to skipped:

```go
for _, initPidCp := range s.running[initPid] {
	if initPidCp.Container == c {
		initCps = append(initCps, initPidCp)
	} else {
		skipped = append(skipped, initPidCp)
	}
}
```

and handleProcessExit never runs, because exited is never set:

```go
if pid == 0 || exited {
	s.lifecycleMu.Unlock()
	for _, ee := range ees {
		s.handleProcessExit(ee, c, p)
	}
	for _, eee := range initExits {
		for _, cp := range initCps {
			s.handleProcessExit(eee, cp.Container, cp.Process)
		}
	}
}
```
		}

@laurazard
Member

Where are the logs for the exec started in #10603 (comment)?

@ningmingxiao
Contributor Author

ningmingxiao commented Aug 16, 2024

Another way to reproduce, without adding a sleep:

Using the same pod.json and container.json as above:

```shell
[root@LIN-FB738BFD367 pods]# crictl run --no-pull container.json pod.json
1e5aec477334f2616131c2d3f710af349aca0224c2261f6b5052d98667091767
```
Then run this program, which starts 500 concurrent execs and kills the container's init process two seconds in:

```go
package main

import (
	"fmt"
	"os/exec"
	"sync"
	"time"
)

func main() {
	go func() {
		time.Sleep(time.Second * 2)
		runCmd("kill -9 `pidof top`")
	}()
	test()
	fmt.Println("done")
}

func runCmd(command string) {
	cmd := exec.Command("bash", "-c", command)
	if err := cmd.Run(); err != nil {
		fmt.Println(err)
	}
}

func test() {
	var wg sync.WaitGroup
	const numContainers = 500
	for i := 1; i <= numContainers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			runCmd("crictl exec 1e5aec477 date")
		}(i)
	}
	wg.Wait()
}
```

`crictl ps` shows the container is still running.

@ningmingxiao force-pushed the fix_event_lost_new branch 3 times, most recently from 96bf2c6 to 7cec07e on August 24, 2024 15:18
@ningmingxiao
Contributor Author

ningmingxiao commented Aug 24, 2024

Kill the init process's children when the init process exit is received, and send the event later. @corhere @laurazard
We no longer need to prevent new exec processes.
#10634

@ningmingxiao force-pushed the fix_event_lost_new branch 2 times, most recently from 65bf79f to 8868f15 on August 24, 2024 20:12
@ningmingxiao force-pushed the fix_event_lost_new branch 3 times, most recently from 8077774 to 00ac973 on August 25, 2024 01:22
@ningmingxiao force-pushed the fix_event_lost_new branch 3 times, most recently from 75369cf to 39d2441 on August 26, 2024 08:26

```go
func (s *service) processExits() {
	for e := range s.ec {
		s.killInitChildren(e)
```
Member

I don't believe this change alone fixes the issue. Even if we kill the execs when the init exits, what's there to prevent new execs from being consistently started (and thereby incrementing s.pendingExecs[c]), never allowing us to process the init's exit?

Contributor Author

@ningmingxiao ningmingxiao Aug 26, 2024


After `runc kill --all -9`, a new exec will return an error:

```shell
[root@LIN-FB738BFD367 pods]# crictl exec 6318 date
FATA[0000] execing command in container: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "d116cb9da07e0d27d9ddbbc8bd469bbc994adb67e12cbba86cce549148650e31": OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown
```

After `runc kill --all -9` the runc container status is stopped, so `runc exec` will fail.

Contributor Author


How about changing handleProcessExit to sendProcessExit? We just need the init exit to be the last event sent. We should call p.SetExited(e.Status) as soon as possible.

Member


We'd have to dig into what the consequences of calling p.SetExited (potentially) long before sending the exit event would be. Does that mean that if I check the shim status it'll say it's dead, even though it (potentially) hasn't emitted the init exit yet? Does that break any kind of explicit (or implicit) contract?

Signed-off-by: ningmingxiao <ning.mingxiao@zte.com.cn>
@samuelkarp
Member

Closing in favor of #10650 and #10651 since work has moved there.
