Skip to content

don't cancel container stop when cancelling context#45738

Merged
thaJeztah merged 1 commit intomoby:masterfrom
thaJeztah:dont_cancel_stop
Jun 20, 2023
Merged

don't cancel container stop when cancelling context#45738
thaJeztah merged 1 commit intomoby:masterfrom
thaJeztah:dont_cancel_stop

Conversation

@thaJeztah
Copy link
Copy Markdown
Member

Commit 90de570 passed through the request context to daemon.ContainerStop(). As a result, cancelling the context would cancel the "graceful" stop of the container, and would proceed with forcefully killing the container.

This patch partially reverts the changes from 90de570 and breaks the context to prevent cancelling the context from cancelling the stop.

- What I did

- How I did it

- How to verify it

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)


// containerStop sends a stop signal, waits, sends a kill signal.
func (daemon *Daemon) containerStop(ctx context.Context, ctr *container.Container, options containertypes.StopOptions) (retErr error) {
func (daemon *Daemon) containerStop(_ context.Context, ctr *container.Container, options containertypes.StopOptions) (retErr error) {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Been going back-and-forth whether to remove the context.Context here, but decided to go for the most minimal changes to fix the regression to reduce code-churn (and we may want to have the context available here in future for other purposes).


// Give the ContainerStop some time to make sure it's being handled,
// then cancel the context or close the client to cancel the stop.
time.Sleep(100 * time.Millisecond)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very happy with the test (and this sleep); better suggestions welcome 😅

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collude with the container: trap the stop signal and handle it by logging to stdout then sleeping some more, similar to the tests in wait_test.go. The test can then wait for the log message from the container to know when it can proceed.

@thaJeztah thaJeztah requested a review from corhere June 13, 2023 22:32
Comment on lines +99 to +104
switch tc {
case cancelContext:
assert.Check(t, is.ErrorType(err, errdefs.IsCancelled))
case closeConnection:
assert.Check(t, err)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no synchronization between this goroutine and the test. These checks could potentially run after the test function has returned. I recommend sending the error over a channel and performing these checks on the test goroutine, e.g.:

stoppedCh := make(chan error)
go func() {
    stoppedCh <- apiClient.ContainerStop(ctxCancel, ...)
}()

// [...]

var stoppedErr error
select {
case stoppedErr = <-stoppedCh:
case <-time.After(...):
    t.Fatal("Didn't stop??")
}
// Assert stoppedErr

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks; yes, that probably works


// Give the ContainerStop some time to make sure it's being handled,
// then cancel the context or close the client to cancel the stop.
time.Sleep(100 * time.Millisecond)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collude with the container: trap the stop signal and handle it by logging to stdout then sleeping some more, similar to the tests in wait_test.go. The test can then wait for the log message from the container to know when it can proceed.

@neersighted neersighted changed the title don't cancel container stop when canelling context don't cancel container stop when cancelling context Jun 14, 2023
Comment on lines +115 to +117
// close the client connection
assert.Check(t, apiClient.Close())
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this test-case doesn't make sense at all; apiClient.Close() does not terminate requests, it closes idle connections;

moby/client/client.go

Lines 186 to 192 in a978888

// Close the transport used by the client
func (cli *Client) Close() error {
if t, ok := cli.client.Transport.(*http.Transport); ok {
t.CloseIdleConnections()
}
return nil
}

@thaJeztah thaJeztah force-pushed the dont_cancel_stop branch 4 times, most recently from b88c10b to b089f27 Compare June 16, 2023 12:43
@thaJeztah
Copy link
Copy Markdown
Member Author

Hmm... weird why this failed in the docker-py dockerfile;

Loaded image: emptyfs:latest
Loaded image ID: sha256:0df1207206e5288f4a989a2f13d1f5b3c4e70467702c1d5d21dfc9f002b7bd43
INFO: Building docker-sdk-python3:5.0.3...
tests/Dockerfile:6
--------------------
   5 |     ARG APT_MIRROR
   6 | >>> RUN sed -ri "s/(httpredir|deb).debian.org/${APT_MIRROR:-deb.debian.org}/g" /etc/apt/sources.list \
   7 | >>>     && sed -ri "s/(security).debian.org/${APT_MIRROR:-security.debian.org}/g" /etc/apt/sources.list
   8 |     
--------------------
ERROR: failed to solve: process "/bin/sh -c sed -ri \"s/(httpredir|deb).debian.org/${APT_MIRROR:-deb.debian.org}/g\" /etc/apt/sources.list     && sed -ri \"s/(security).debian.org/${APT_MIRROR:-security.debian.org}/g\" /etc/apt/sources.list" did not complete successfully: exit code: 2

Unless the base-image changed 🤔

@thaJeztah
Copy link
Copy Markdown
Member Author

Commit 90de570 passed through the request
context to daemon.ContainerStop(). As a result, cancelling the context would
cancel the "graceful" stop of the container, and would proceed with forcefully
killing the container.

This patch partially reverts the changes from 90de570
and breaks the context to prevent cancelling the context from cancelling the stop.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@thaJeztah
Copy link
Copy Markdown
Member Author

Rebased to get the docker-py fix in

@thaJeztah thaJeztah merged commit 66c497c into moby:master Jun 20, 2023
@thaJeztah thaJeztah deleted the dont_cancel_stop branch June 20, 2023 15:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Issue with containerStop and long running tasks prematurely exiting

3 participants