Skip to content

daemon: containerStop: fix ordering of "stop" and "die" events#50220

Merged
thaJeztah merged 1 commit intomoby:masterfrom
thaJeztah:fix_event_ordering
Jun 20, 2025
Merged

daemon: containerStop: fix ordering of "stop" and "die" events#50220
thaJeztah merged 1 commit intomoby:masterfrom
thaJeztah:fix_event_ordering

Conversation

@thaJeztah
Copy link
Member

@thaJeztah thaJeztah commented Jun 17, 2025

relates to;

Commit 8e6cd44 added synchronisation to wait for the container's status to be updated in memory. However, since 952902e, a defer was used to produce the container's "stop" event.

As a result of the sychronisation that was added, the "die" event would now be produced before the "stop" event.

This patch moes the locking inside the defer to restore the previous behavior.

Unfortunately the order of events is still not guaranteed, because events are emited from multiple goroutines that don't have synchronisation between them; this is something to look at for follow ups. This patch keeps the status quo and should preserve the old behavior, which was "more" correct in most cases.

- What I did

- How I did it

- How to verify it

- Human readable description for the release notes

- A picture of a cute animal (not mandatory but encouraged)

@thaJeztah thaJeztah force-pushed the fix_event_ordering branch from 353a256 to c0486dc Compare June 17, 2025 16:59
@thaJeztah
Copy link
Member Author

@vvoland updated with your suggestion; PTAL 👍

Comment on lines +133 to +135
inspect, err := apiClient.ContainerInspect(ctx, cID)
assert.NilError(t, err)
assert.Check(t, is.Equal(false, inspect.State.Running))
Copy link
Contributor

@mabrarov mabrarov Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, it's better to use same API (/containers/json) as mentioned in #50133 to ensure this pull request doesn't introduce regression for #50133:

Suggested change
inspect, err := apiClient.ContainerInspect(ctx, cID)
assert.NilError(t, err)
assert.Check(t, is.Equal(false, inspect.State.Running))
containers, err := apiClient.ContainerList(ctx, containertypes.ListOptions{
All: true,
Filters: filters.NewArgs(filters.Arg("id", cID)),
})
assert.NilError(t, err)
assert.Assert(t, is.Len(containers, 1))
assert.Check(t, containers[0].State != containertypes.StateRunning)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! so my intent here was to just have a sanity check for the container to be stopped before we start checking the events, not so much to verify #50133

For that one, we should probably still look at adding a test specific to that.

Comment on lines +129 to +130
timeOut := 1
err := apiClient.ContainerStop(ctx, cID, containertypes.StopOptions{Timeout: &timeOut})
Copy link
Contributor

@vvoland vvoland Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should either set a bigger timeout for Windows (like

var pollOpts []poll.SettingOp
if isWindows {
pollOpts = append(pollOpts, poll.WithTimeout(StopContainerWindowsPollTimeout))
}
) or just leave the default 10s.

Otherwise this will end up being flaky.

Copy link
Member Author

@thaJeztah thaJeztah Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... yup can do.

But thinking about this; it actually makes me curious now; everything in containerStop looks to be coded to have a promise that "once the function returned, the container is stopped". Or at least; there's checks in multiple places where we have an explicit "wait" for the container to reach the desired state;

moby/daemon/stop.go

Lines 98 to 101 in c0486dc

if status := <-ctr.Wait(subCtx, containertypes.WaitConditionNotRunning); status.Err() == nil {
// container did exit, so ignore any previous errors and return
return nil
}

Why do we need those checks? Have we been working around a bug? If stop was meant to be async, then I would've expected it to return immediately ("we signaled it to stop; check back later"), but we don't; the api call blocks until it (should be) exited (and the state (thus docker inspect?) to be persisted to disk?)

docker run -qdit --rm --name foo busybox

time curl -XPOST --unix-socket /var/run/docker.sock http://local/v1.51/containers/foo/stop

real	0m10.366s
user	0m0.004s
sys	0m0.032s


docker run -qdit --rm --name foo busybox

time curl -XPOST --unix-socket /var/run/docker.sock 'http://local/v1.51/containers/foo/stop?t=1'

real	0m1.307s
user	0m0.010s
sys	0m0.007s

@thaJeztah thaJeztah force-pushed the fix_event_ordering branch from c0486dc to ff960ce Compare June 17, 2025 18:57
Comment on lines +132 to +137
var pollOpts []poll.SettingOp
if testEnv.DaemonInfo.OSType == "windows" {
pollOpts = append(pollOpts, poll.WithTimeout(StopContainerWindowsPollTimeout))
}

poll.WaitOn(t, container.IsStopped(ctx, apiClient, cID), pollOpts...)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vvoland added the poll; PTAL

We should probably look at that comment I left; really curious now!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

☝️ per discussion on Slack; @vvoland didn't necessarily mean an actual poll here, just making sure the timeout is long enough for windows; this can probably be an inspect again, if needed at all (will have a look)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to care about this timeout at all? I would use default value, because this timeout is about graceful stop vs kill and in this test (if I'm not wrong) we are OK even if container is killed (i.e. was not able to stop within given/default timeout).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that was my initial thinking behind setting it to 1 second; either it stops gracefully, or kill it after 1 second; both should produce stop -> die -> exited in that order.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, default timeout would be better (lesser code -> better code). Just need to check if respective image stops correctly (doesn't ignore signal used for stop of container created from image).

@thaJeztah thaJeztah force-pushed the fix_event_ordering branch from ff960ce to 4b2081b Compare June 18, 2025 09:26
@thaJeztah
Copy link
Member Author

Hm... interesting; Windows still flipped the order somehow? https://github.com/moby/moby/actions/runs/15729094026/job/44326581774?pr=50220

=== Failed
=== FAIL: github.com/docker/docker/integration/container TestStopEventsOrder (2.80s)
    stop_test.go:151: assertion failed: 
        --- ←
        +++ →
          []events.Action{
        - 	"stop",
        + 	"die",
        - 	"die",
        + 	"stop",
          }

@thaJeztah thaJeztah force-pushed the fix_event_ordering branch 2 times, most recently from 282b14e to 147e2a8 Compare June 18, 2025 11:08
@mabrarov
Copy link
Contributor

mabrarov commented Jun 18, 2025

Hi @thaJeztah,

Hm... interesting; Windows still flipped the order somehow? https://github.com/moby/moby/actions/runs/15729094026/job/44326581774?pr=50220

=== Failed
=== FAIL: github.com/docker/docker/integration/container TestStopEventsOrder (2.80s)
    stop_test.go:151: assertion failed: 
        --- ←
        +++ →
          []events.Action{
        - 	"stop",
        + 	"die",
        - 	"die",
        + 	"stop",
          }

Even on Linux I don't see any synchronization guaranteeing the order of events b/w

daemon.LogContainerEventWithAttributes(c, events.ActionDie, attributes)

and

daemon.LogContainerEvent(ctr, events.ActionStop)

Both actions are performed in goroutines which can run at the same point of time and both actions are performed without holding any shared lock (like lock on container).

Do you?

@thaJeztah
Copy link
Member Author

No, you're right; I made the assumption that the code before this handled the correct order, but an initial look at the Wait function; the wait looks to depend on stopWaiters;

s.stopWaiters = append(s.stopWaiters, waitC)

Which gets unblocked through State.SetStopped;

s.notifyAndClear(&s.stopWaiters)

Which 🫠 gets called before statecounters are updated, state is persisted to disk, and the event is sent;

moby/daemon/monitor.go

Lines 120 to 130 in 147e2a8

c.SetStopped(&ctrExitStatus)
if !c.HasBeenManuallyRestarted {
defer daemon.autoRemove(&cfg.Config, c)
}
}
defer c.Unlock() // needs to be called before autoRemove
daemon.setStateCounter(c)
checkpointErr := c.CheckpointTo(context.TODO(), daemon.containersReplica)
daemon.LogContainerEventWithAttributes(c, events.ActionDie, attributes)

And the lock is released before that, so even setStateCounter() could potentially race, as there's no lock when getting the container state (I see it calls State.StateString() without lock).

So, while this PR (AFAIC) doesn't make things worse and mostly brings us back to the situation before #50136, there already was a race, and it just became more apparent with that PR merged.

@thaJeztah
Copy link
Member Author

Based on the above, perhaps we should consider removing the test for now, and re-apply it when we looked at the underlying issue in more depth.

I think the change in this PR still is good to take, as it would keep the status quo from before #50136

let me know what you think @vvoland @mabrarov

@vvoland
Copy link
Contributor

vvoland commented Jun 18, 2025

Let's keep the test, but add a skip with a TODO link for the ticket.

@thaJeztah
Copy link
Member Author

Yeah, was considering that, but didn't want to add a test that we know is flaky and skip from the start; let me just move the test to a draft PR, and use that as starting point for further work; I'll also draft up a tracking ticket to capture what was discussed in this (and the other PR) for future work to look into.

I knew some of this logic was convoluted, but it looks like there's just too much logic stacked on top of existing bits over the Years to fix various issues, and it's starting to show. We need to take a few steps back and make an inventory of the logic and rework some of it at least.

@thaJeztah thaJeztah force-pushed the fix_event_ordering branch from 147e2a8 to 7be2e74 Compare June 18, 2025 18:42
Commit 8e6cd44 added synchronisation to
wait for the container's status to be updated in memory. However, since
952902e, a defer was used to produce
the container's "stop" event.

As a result of the sychronisation that was added, the "die" event would
now be produced before the "stop" event.

This patch moves the locking inside the defer to restore the previous
behavior.

Unfortunately the order of events is still not guaranteed, because events
are emited from multiple goroutines that don't have synchronisation between
them; this is something to look at for follow ups. This patch keeps the status
quo and should preserve the old behavior, which was "more" correct in most
cases.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@thaJeztah
Copy link
Member Author

Now hoping GitHub recovers because it's been bad whole day

@thaJeztah
Copy link
Member Author

@robmry @vvoland ptal (it's green now 🎉)

@thaJeztah thaJeztah merged commit a0f36cc into moby:master Jun 20, 2025
332 of 370 checks passed
@thaJeztah thaJeztah deleted the fix_event_ordering branch June 20, 2025 10:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants