Skip to content

libcontainerd/supervisor: fix data race#47300

Merged
cpuguy83 merged 1 commit intomoby:masterfrom
corhere:libc8d/fix-startup-data-race
Feb 1, 2024
Merged

libcontainerd/supervisor: fix data race#47300
cpuguy83 merged 1 commit intomoby:masterfrom
corhere:libc8d/fix-startup-data-race

Conversation

@corhere
Copy link
Contributor

@corhere corhere commented Feb 1, 2024

- What I did
I fixed a data race in libcontainerd/supervisor that I added in #44215.

- How I did it
The monitorDaemon() goroutine calls startContainerd() then blocks on <-daemonWaitCh to wait for it to exit. The startContainerd() function would (re)initialize the daemonWaitCh so a restarted containerd could be waited on. This implementation was race-free because startContainerd() would synchronously initialize the daemonWaitCh before returning. When the call to start the managed containerd process was moved into the waiter goroutine, the code to initialize the daemonWaitCh struct field was also moved into the goroutine. This introduced a race condition.

Move the daemonWaitCh initialization to guarantee that it happens before the startContainerd() call returns.

- How to verify it
Build a daemon with BUILDFLAGS=-race. Start it and check that no WARNING: DATA RACE messages are emitted.

Verify that the containerd supervisor is working correctly: it is launched when the daemon starts up, is restarted when killed, and is shut down when the daemon shuts down.

- Description for the changelog

  • Fixed a potential race condition in the managed containerd supervisor

- A picture of a cute animal (not mandatory but encouraged)

The monitorDaemon() goroutine calls startContainerd() then blocks on
<-daemonWaitCh to wait for it to exit. The startContainerd() function
would (re)initialize the daemonWaitCh so a restarted containerd could be
waited on. This implementation was race-free because startContainerd()
would synchronously initialize the daemonWaitCh before returning. When
the call to start the managed containerd process was moved into the
waiter goroutine, the code to initialize the daemonWaitCh struct field
was also moved into the goroutine. This introduced a race condition.

Move the daemonWaitCh initialization to guarantee that it happens before
the startContainerd() call returns.

Signed-off-by: Cory Snider <csnider@mirantis.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

data race during daemon startup (libcontainerd/supervisor.(*remote).monitorDaemon())

4 participants