pkg/cri: add timeout to drain exec io #7832
Conversation
Could there be processes, though, that genuinely take a long time to finish? I get that the given example might be unfortunate, but I'm wondering if there are valid cases where it's desired to wait for the process to finish and return a result.
I think that we should only care about the main exec process. If a child process is still working, the main exec process should wait for it instead of exiting. After the main exec process exits, the kernel re-parents the child to pid 1 in the pid namespace, where it is practically impossible to trace from the shim. And an exec process doesn't have a dedicated cgroup, so once child processes are re-parented it is impossible to trace their life-cycle and hard to clean them up. I don't think this is a valid use case we should support.
What command/scenario are you testing this with? Not sure if I'll have time to review tomorrow, but I'll drop a tidbit for some scenarios where that may happen; I don't think the child should exit if the parent just runs to completion, but my memory's hazy. I would have to fully refresh my mind on the behavior, but I'm fairly sure that by default on Windows child processes inherit their parent's console, so any event (signal) sent to the console gets propagated to the children as well (CTRL-C, CTRL-BREAK, and so on). In those scenarios, yes, the child will likely die as well. I think we tried to map everything but SIGKILL to a console event, and SIGKILL would be mapped to
Some processes created by the exec main process are still running after the exec main process exits, and they are holding the stdout/stderr file descriptors inherited from it. In the runc shim, this blocks forever until those processes exit. So we have to delete the exec-process record in the shim and forcibly release the IO. The first issue about this is described in #3286. cc @thaJeztah
|
Ping @thaJeztah @mikebrow @dmcgowan |
By default, the child processes spawned by the exec process inherit its standard io file descriptors. The shim server creates a pipe as the data channel. Both the exec process and its children write data into the write end of the pipe, and the shim server reads data from it. As long as the write end is still open, the shim server will continue to wait for data from the pipe. So if the exec command is something like `bash -c "sleep 365d &"`, the exec process is `bash`, which quits right after creating `sleep 365d`. But `sleep 365d` will hold the write end of the pipe for a year! It doesn't make sense for the CRI plugin to wait for it. For this case, we should use a timeout to drain the exec process's io instead of waiting for it. Fixes: containerd#7802 Signed-off-by: Wei Fu <fuweid89@gmail.com>
99f74d5 to 652111c
We should validate the drainExecSyncIO timeout at the beginning and raise an error for any invalid input. Signed-off-by: Wei Fu <fuweid89@gmail.com>
```go
)

func TestContainerDrainExecIOAfterExit(t *testing.T) {
	// FIXME(fuweid): support it for windows container.
```
Let's make an issue for this; I don't have enough time to look into how to support this at the moment :/ but it can be a follow-up.
I was trying to test it with `cmd /c powershell Start-Process -FilePath timeout.exe -ArgumentList -1 -NoNewWindow`, but it fails. It seems there's no `nohup` equivalent here. Yeah, it should be handled in the follow-up.
pkg/cri/config/config.go
Outdated
```go
	// Validation for drain_exec_sync_io_timeout
	if c.DrainExecSyncIOTimeout != "" {
		if _, err := time.ParseDuration(c.DrainExecSyncIOTimeout); err != nil {
			return fmt.Errorf("invalid drain exec sync io timeout: %w", err)
```
This can be a follow-up, since the other two timeouts didn't follow this pattern, but it looks like there was previously a pattern of using the toml name of the configuration in the error message reported to the user. So this would be:
```diff
-	return fmt.Errorf("invalid drain exec sync io timeout: %w", err)
+	return fmt.Errorf("invalid `drain_exec_sync_io_timeout`: %w", err)
```
pkg/cri/config/config_test.go
Outdated
```go
		},
		DrainExecSyncIOTimeout: "10",
	},
	expectedErr: "invalid drain exec sync io timeout: time: missing unit in duration \"10\"",
```
Matching against the entire error (meaning including the bit from the stdlib) seems brittle, but I don't know how much the Go team values not changing random error strings 🤷♂️. It's just an `errors.New` defined inline in the function, so if they ever changed the text a little this would fail: https://cs.opensource.google/go/go/+/refs/tags/go1.20.1:src/time/format.go;l=1652
Good point! I'll just verify that the error contains "invalid drain_exec_sync_io_timeout".
```go
	select {
	case <-timerCh:

		_, err := execProcess.Delete(ctx, containerd.WithProcessKill)
		if err != nil {
			if !errdefs.IsNotFound(err) {
				return fmt.Errorf("failed to release exec io by deleting exec process %q: %w",
					execProcess.ID(), err)
			}
		}
		return fmt.Errorf("failed to drain exec process %q io in %s because io is still held by other processes",
			execProcess.ID(), drainExecIOTimeout)
```
I'm confused: if we successfully deleted the process, wouldn't there be a success route here? From line 300 down there's no happy path, just different ways to fail?
Yes. Reaching line 300 means that there are still exec-child-processes running in the container. The delete action only removes the record of the exec-init-process; it doesn't mean that all the resources related to this exec context have been cleaned up. The error message warns the user: hey, you might have a leaky-process issue in your container, please check. It might be caused by an exec-child-process stuck in D (uninterruptible sleep) status, or by a wrong configuration. The error can be helpful.
Okay, the log above (to me) reads as if there's a way to get us unstuck from this state ("Trying to delete exec process to release io"), so it was surprising to see nothing but different failures below. This is fine though; we can reword it if needed.
1. It's easier to check for wrong input if the error message uses `drain_exec_sync_io_timeout`.
2. Avoid matching the full error message, as the part generated by the Go stdlib could change in the future.
3. Delete the extra empty line.

Signed-off-by: Wei Fu <fuweid89@gmail.com>