Ensure daemon root is unmounted on shutdown#36107
Conversation
68da97b to
2a45e4f
Compare
2a45e4f to
6e01047
Compare
|
Looks like this test is flaky (or broken) on Windows; |
daemon/daemon_linux.go
Outdated
There was a problem hiding this comment.
Bit confused here; the function description mentions "umounts shm/mqueue mounts for old containers", but here we're unmounting /var/lib/docker?
There was a problem hiding this comment.
Updated the doc string.
6e01047 to
3701d43
Compare
af26f64 to
739cd67
Compare
|
Looks like this can be simplified to something like: - return daemon.cleanupMountsByID("")
+ if err := daemon.cleanupMountsByID(""); err != nil {
+ return err
+ }
+
+ logrus.WithField("mountpoint", daemon.root).Debug("unmounting daemon root")
+ return mount.Unmount(daemon.root)Reason1: keep it simple Or, in case you only want to unmount
Otherwise there are two different checks in two different places, and they might diverge resulting in a breakage. |
|
@kolyshkin Figuring out if the mount was from docker or not is not so simple as we could have an unclean shutdown at some point. |
@cpuguy83 Right. The current way, i.e. check that daemon.root is a mount point with a Maybe make a function, like |
|
@kolyshkin This isn't just checking if the mountpoint is shared, it's checking if it is a mountpoint at all and if the root of the mountpoint is itself. I can't really see a way to use the same function for both mount and unmount reliably. The one thing we could do is persist a value somewhere so we know that that we at least mounted it at some point, but then we still need to check everything to make sure it should be unmount. A simple |
|
ping @kolyshkin what's the status on this one? |
|
LGTM |
739cd67 to
70b4f4e
Compare
|
Flaky test? Not sure I've seen this one before; https://jenkins.dockerproject.org/job/Docker-PRs-powerpc/8485/console |
|
While testing this |
70b4f4e to
32f4c20
Compare
This is only for the case when dockerd has had to re-mount the daemon root as shared. Signed-off-by: Brian Goff <cpuguy83@gmail.com>
|
@cpuguy83 can you restart the z test? It's hard to tell what happened from that output, and could be a flaky case. |
|
@nishanttotla There's a stack trace in the daemon logs for the stuck daemon. |
|
ping @tonistiigi @kolyshkin this is green now; changes still LGTY? |
|
still LGTM |
|
I'm not sure this is a good way to detect if mount was created by 2 issues: Running In dind, |
|
@tonistiigi The first issue is a simple change, though there are other things that could happen to wind up in that state. When you say add a label, I'm not sure we can just add random labels to the bind mount? |
Ah, yes, I didn't think of that.
What do you mean by tracking a file here? Creating a file on mounting to detect it later? Why shouldn't it be safe to unmount if the detected file was only created by docker on mounting? |
|
We'd still have to make sure the mount was actually safe to remove since file itself may be out of sync with the system. |
|
@cpuguy83 How? if someone removes internal files from |
After step 4 if we only relied on the file existing we would unmount the user mount... |
|
Or not because if it's still our mount from before the crash we'd want to clean it up. |
I wonder if we have to take very scenario in mind here: what would be the "bad" thing of the mount being left behind after a crash? Restarting the daemon will reuse it, restarting the host will unmount it, correct? |
|
@thaJeztah The case here is that dockerd crashes, user manually unmounts, now for some reason user mounts manually, dockerd is started. In that case on stop dockerd would umount something manually mounted by user that is quite bad. It is a very unlikely scenario though. |
|
@tonistiigi oh, perhaps I was unclear: the "unmount" scenario can be covered by writing a file (we check that file on shutdown, and unmount if it's there, and remove the file on startup if it exists, and there's a mount). So, the "crash" situation would then only result in us not un-mounting, which would in most situations be no problem (given that we need that mount, and likely start the daemon again after a crash). |
|
@thaJeztah Ok, yes, if we don't attempt to resync the file after a crash then there is no extra unmount. But it means that we will always leak a mount if there is a crash. If we attempt to resync then we don't leak a mount 99.9% of the cases, even on crash, but there is a tiny possibility of false unmount. |
|
Yes, so my train of though is: daemon crashes, which is an exceptional condition. Either something is very wrong, and manual intervention is needed, or the daemon is started again, and everything recovers. During start, an existing mount is found:
If the host is rebooted (which could be part of recovering after a crash): situation is reset to "pristine" |
|
I think a false unmount is a worse scenario than a mount leak. |
|
So, looks like the “file” approach would give us that? |
ping @tonistiigi