devicemapper: remove container rootfs mountPath after umount #34573
vdemeester merged 2 commits into moby:master
Conversation
force-pushed from 41b98d1 to a7f8155
force-pushed from 9934083 to a875ebf
We should have separate suites for testing the graph drivers, so we can run the suite for any supported drivers.
integration/util/kernel.go
Could this be inlined into run_test.go? Do you expect to use it outside of container tests?
integration/container/run_test.go
I think this should t.Skip() if it's not devicemapper.
I think it would be a good measure to make sure the bug doesn't happen on other storage drivers too. I should probably just remove this check altogether.
I don't think this is the right way to test storage drivers. These "integration" tests are very slow; we need to be conservative about the tests we add here (see TESTING).
We should be able to test storage drivers without the daemon running, which makes it much easier and faster to test them. It looks like each graphdriver has a test suite which uses graphtest. We should test this case in that suite.
Why is it not possible to just check that the mount point doesn't exist anymore from that test suite?
I would need to check again, but it's not that it's impossible, it's that I'd have to effectively start re-implementing parts of runc inside the test framework to make sure we're testing the right interaction between mount namespaces and libdm.
Just checking if the mountpoint has gone isn't good enough, you need to check that the mount has disappeared in other namespaces (that's where the bug comes from).
integration/container/run_test.go
Some distributions may have backported the fix, so the kernel version alone isn't enough to know whether the test will work (but we don't hard fail on older kernels).
You can use kernel.CheckKernelVersion from github.com/docker/docker/pkg/parsers/kernel here. For example:
import "github.com/docker/docker/pkg/parsers/kernel"
...
oldKernel := !kernel.CheckKernelVersion(3, 18, 0)
...
@dnephin So where does that leave this PR's tests? Should I not include tests at all? (These tests actually currently fail silently because the error handling in the new [As an aside, I'm not sure it was a good idea to deprecate
I don't understand this. What do you mean by error handling doesn't exist?
It's a work in progress. If you see something significant that is missing, please let me know.
You can change tests, you just can't add new ones.
The ideal place for this test would be in
Why use EnsureRemoveAll? This function is trying to do so much that I don't even understand what it is doing. After a lazy unmount, why don't we simply do an os.Remove()?
So the idea is that removing the directory yanks the mount points that might have leaked into other mount namespaces, and this will only work on newer kernels. I think this should work reasonably well. I'm a little concerned about some use case being broken somewhere; there have been quite a few changes w.r.t. reference counting. Especially make sure that the case of the docker daemon being shut down and restarted works while containers were running the whole time.
Yeah, os.Remove is probably a better choice.
Especially make sure that the case of the docker daemon being shut down and restarted works while containers were running the whole time.
Can you explain what you mean here? Are you referring to the "live restore" stuff?
Yes, I was referring to live restore functionality.
It might be good to mention that removing the directory will remove the mount point on this directory from all mount namespaces.
@cyphar We will require a similar change in the Remove() path as well? That is, make sure the device is unmounted, then remove the mnt directory, and then try device deactivation and deletion?
cc @rhatdan
@rhvgoyal Yes, probably. I'll need to figure out the best place to put this change (I'd prefer not to have to duplicate it in both paths).
force-pushed from a875ebf to ba9ee05
Actually @rhvgoyal I don't think we need this in the Remove path. The Remove path is only called from the driver after it's been
force-pushed from ef75b39 to 330461c
libdm currently has a fairly substantial DoS bug that makes certain operations fail on a libdm device if the device has active references through mountpoints. This is a significant problem with the advent of mount namespaces and MS_PRIVATE, and can cause certain --volume mounts to cause libdm to no longer be able to remove containers:

% docker run -d --name testA busybox top
% docker run -d --name testB -v /var/lib/docker:/docker busybox top
% docker rm -f testA
[fails on libdm with dm_task_run errors.]

This also solves the problem of unprivileged users being able to DoS docker by using unprivileged mount namespaces to preserve mounts that Docker has dropped.

SUSE-Bug: https://bugzilla.suse.com/show_bug.cgi?id=1045628
SUSE-Backport: moby#34573
Signed-off-by: Aleksa Sarai <asarai@suse.de>
nudge
Maybe a unit test that uses unshare(1), and can skip if it's not available. Not exactly ideal, but ...
The problem with such a test is that [Again, this would not be a problem if I could just write a normal integration test... 😞]
Why would it break other unit tests?
force-pushed from 262017f to 2ee2f84
In order to avoid reverting our fix for mount leakage in devicemapper, add a test which checks that devicemapper's Get() and Put() cycle can survive having a command running in an rprivate mount propagation setup in-between. While this is quite rudimentary, it should be sufficient. We have to skip this test for pre-3.18 kernels. Signed-off-by: Aleksa Sarai <asarai@suse.de>
force-pushed from 2ee2f84 to 1af8ea6
Restarted PowerPC; I was told there were some connection issues with Jenkins and the PowerPC nodes.
/cc @vdemeester
dnephin left a comment
Thanks for writing the unit test
@dnephin Sorry, I never answered your question. I was worried about using
Walter by Alexander Lemke.
Closes #34542
Signed-off-by: Aleksa Sarai <asarai@suse.de>
/cc @rhvgoyal This is an alternative to #34542
/cc @vrothberg