Skip to content

Mount Point Plugins Implementation#33375

Closed
dsheets wants to merge 14 commits intomoby:masterfrom
dsheets:mountpoint-plugins
Closed

Mount Point Plugins Implementation#33375
dsheets wants to merge 14 commits intomoby:masterfrom
dsheets:mountpoint-plugins

Conversation

@dsheets
Copy link
Copy Markdown
Contributor

@dsheets dsheets commented May 24, 2017

- What I did

Designed and implemented mount point interposition plugins as (mostly) described by #33048. This PR implements that proposal but improves the mount point filtering system to be more general and accurate. In particular, filters loaded from mount point plugins are now DNF and include negation (necessary to select local volume bind mounts and anonymous local volume mounts).

I'm moderately happy with how the implementation turned out.

I'm not sure what the Windows story is.

I'm not sure how I should adjust the tests to account for the new API-tests-only policy. Are there good examples somewhere that I should follow? Or a transition PR that I can crib from?

This is my first major contribution to the Docker engine and the most golang I have written in a single patch. I'm very happy to take pointers on both Docker code base best practice and golang best practice.

Container error conditions, daemon restart/reload, and plugin server assumptions were some of my biggest hurdles.

In terms of design, one of my remaining reservations is the level of expressivity of the mount point plugin filter 'language'. It supports what is mostly an equality predicate (set membership for values) and negation. I can see the case for more expressive predicates like path prefix (useful for --opt device=/prefix/rest) but this would require expanding the predicate language further (currently negation is a barely acceptable special case).

- How I did it

I read a lot of the volume/mount/bind related code and then copied a lot of the structure of the AuthZ plugin subsystem and hooked into the places that seemed relevant from reading and testing.

- How to verify it

I wrote some tests in bd35f31. I plan to write some more to test the plugin V2 dynamics (similar to integration-cli/docker_cli_authz_plugin_v2_test.go) but the relevant client package isn't published yet (it's in this PR) and I haven't worked out how to fake-vendor it, yet.

I also recommend reading 9d58803 to understand the design.

- Description for the changelog

Mount point plugins allow bind mounts and volume mounts to be selectively intercepted

- A picture of a cute animal (not mandatory but encouraged)

marine otter

Lontra felina, or the marine otter, is believed to be the smallest marine mammal on Earth. It is endangered and can be found along the western coast of South America.

/cc @cpuguy83 @thaJeztah @yallop @dnephin

@AkihiroSuda
Copy link
Copy Markdown
Member

Does it replace volume plugin?

@dsheets
Copy link
Copy Markdown
Contributor Author

dsheets commented May 24, 2017

@AkihiroSuda this does not replace volume plugins -- it's a different layer from volume plugins. Volume plugins provide some volume file system at a source location. Mount point plugins can then act on this data (or change the source of the mount) before a container starts and after a container stops (but before a return code/exit event is issued). The current implementation allows for a stack of mount point plugins to participate in host bind mounts and volume mounts. The design should be able to accommodate interposing on tmpfs, layers, config, network, and secrets mounts as well (but none of these are implemented yet). I recommend reading #33048 for the motivation and background of this feature.

@dsheets dsheets force-pushed the mountpoint-plugins branch 3 times, most recently from cd05492 to ed6d965 Compare May 24, 2017 15:47
@dsheets
Copy link
Copy Markdown
Contributor Author

dsheets commented May 24, 2017

I'm not sure what to do about:

These files import internal code: (either directly or indirectly)
15:48:03  - pkg/mountpoint/api.go imports github.com/docker/docker/api/types/mount
15:48:03  - pkg/mountpoint/chain.go imports github.com/docker/docker/api/types/mount
15:48:03  - pkg/mountpoint/chain.go imports github.com/docker/docker/volume
15:48:03  - pkg/mountpoint/mountpoint.go imports github.com/docker/docker/api/types/mount
15:48:03  - pkg/mountpoint/mountpoint.go imports github.com/docker/docker/volume
15:48:03  - pkg/mountpoint/plugin.go imports github.com/docker/docker/api/types/mount

Should pkg/mountpoint live somewhere else? It is intended to be used by plugins.

Perhaps this highlights an issue with sharing the volume.MountPoint serialization via the mount point plugin API because it is not meant to be externally visible?

Should docker/docker/api be in a package?

@dsheets dsheets force-pushed the mountpoint-plugins branch 2 times, most recently from 47272ab to da6b434 Compare May 30, 2017 11:22
@dsheets
Copy link
Copy Markdown
Contributor Author

dsheets commented May 30, 2017

I've rebased this PR and moved the implementation from pkg/mountpoint into volume/. In the process, I've moved some types from api/types/mount/mount.go into pkg/mountpoint/api.go. This passes the basic validation stages of the CI. The remaining CI failures are cross-platform and easily fixed, spurious, unknown, or intermittent and related. I'm still tracking down the intermittent issue but the rest is ready for design review, afaik.

Copy link
Copy Markdown
Member

@dnephin dnephin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did the types move to pkg/mountpoint? I think these types should be under api/types.

Was this because of import checks when you were using pkg/? Now that you've moved the logic to volume/ I think you can use api/types again?

@dsheets
Copy link
Copy Markdown
Contributor Author

dsheets commented May 30, 2017

The types moved to pkg/mountpoint because they are used in the pkg/mountpoint API. The types could be duplicated (😢) or api/types could be put into a pkg/.

@dnephin
Copy link
Copy Markdown
Member

dnephin commented May 30, 2017

Why does pkg/mountpoint need to be there? How about moving it to volumes/mountpoint then you can import from api/types ?

@dsheets
Copy link
Copy Markdown
Contributor Author

dsheets commented May 30, 2017

pkg/mountpoint specifies the API for both the client-side and server-side of the mount point plugin API. My understanding was that packages meant to be used by third parties or containing reusable components should live in pkg/. Is that not right? I'm just trying to follow the example of pkg/authorization...

@dnephin
Copy link
Copy Markdown
Member

dnephin commented May 30, 2017

api/types is the package for "interface" types that need to be used by both client and server.

In #32989 the proposed new location for pkg/authorization is plugins/authorization.

I think you're right about the intent of pkg/ however in practice I don't think that really works. The only guarantee we can make about pkg/ is that it doesn't import other parts of docker/docker.

@dsheets dsheets force-pushed the mountpoint-plugins branch from da6b434 to 9a11d2e Compare May 30, 2017 17:37
@dsheets
Copy link
Copy Markdown
Contributor Author

dsheets commented May 30, 2017

OK, I've put all the new modules under volume/ and volume/mountpoint/ and rebased again.

@dsheets dsheets force-pushed the mountpoint-plugins branch from 7ac780b to 83b68b2 Compare May 31, 2017 12:58
Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just some nits found while reading

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing trailing comma (hence the "red" highlighting because it's invalid JSON)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing trailing comma

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing closing }

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing trailing comma

@thaJeztah
Copy link
Copy Markdown
Member

ping @johnstep to have a look from the windows side of things

@dsheets
Copy link
Copy Markdown
Contributor Author

dsheets commented Jun 10, 2017

I've been prototyping with this PR and I think that the mount point filtering should be much more expressive (e.g. to allow filtering on volume labels) and the attach response message should be more uniform. In particular, I think both should use datatypes with the same shape as the attach request mount point struct.

Regardless of those changes, I believe the basic design of this changeset can be reviewed/approved. I will be off the grid from June 15 until July 7 but @yallop will be available to answer questions.

@thaJeztah thaJeztah added the status/needs-attention Calls for a collective discussion during a review session label Jun 15, 2017
Copy link
Copy Markdown
Member

@cpuguy83 cpuguy83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I'm good with design, though I would like to see the pieces split up a bit such that we have mount middleware, and then plugins that can act as mount middleware rather than just mount plugins.
Likewise the plugin API should reflect that the plugins are implementing mountpoint middleware rather than a mount plugin.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we define a struct to encapsulate this rather than string parsing?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the filters need to be more expressive than they currently are and packing expressions into strings is not really appropriate (was it ever?). I'm working on this now.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use the types from api/types/mount?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could but eventually this enumeration will be bigger than the one in api/types/mount. For instance, layers and secrets could also be interposed on. This implementation currently doesn't support tmpfs so perhaps that variant should be removed from this type.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem like a very good reason to create yet another type...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I understand the use of the clock terminology here. Is this an ordinal describing the order in which the mounts were applied?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the field doc is Clock is a positive integer used to ensure mount detachments occur in the correct order. Because interposition is targeted based on mount point properties, different mounts into the same container can have different stacks of participating plugins. To unwind the plugins in the correct order without maintaining a separate index, we use this ordinal. See https://github.com/moby/moby/pull/33375/files#diff-077bf4d53f2bf30ff32bdb1c2c2c70daR208 for the unwinding routine.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dsheets It might be good to have some commentary on why this works, over using the separate index approach or just a stack. I don't think I quite understand the intended behavior here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a glance, this does seem more complicated than other methods. I'm not understanding the unwind code very well.

David Sheets added 13 commits September 19, 2017 12:13
Previously, these values which are pointers to strings would be
dereferenced to have their (aliased) strings written to. If these fields
were not specified in the plugin configuration, the values would be
pointers and dereferencing would fail. This removes the use of the
aliased string pointer field and eliminates any mutation side effects.

Signed-off-by: David Sheets <dsheets@docker.com>
Signed-off-by: David Sheets <dsheets@docker.com>
… store

Signed-off-by: David Sheets <dsheets@docker.com>
Necessary until moby#33658 is merged.

Signed-off-by: David Sheets <dsheets@docker.com>
This requires an update to swarmkit so the swarm protobuf protocol includes
the consistency field.

Signed-off-by: David Sheets <dsheets@docker.com>
Previously, volume.MountPoint.Setup would return the path at which the
mount would appear but the volume.MountPoint object itself might not
contain this path (e.g. in the case of anonymous volume mounts). Now,
.Setup does not return the path and, instead, guarantees the
postcondition that, after a successful call, .Source will contain the
path to the mount point.

Signed-off-by: David Sheets <dsheets@docker.com>
Signed-off-by: David Sheets <dsheets@docker.com>
Signed-off-by: David Sheets <dsheets@docker.com>
Mount point plugins are just one type of mount point middleware,
now. This will allow us to create internal mount point middleware that
performs similar functions to mount point plugins without all of the
plugin machinery including IPC.

daemon/volumes: split volume.MountPoint.Setup into Realize and Setup

Realize ensures that there is a path to the mount point file system
whereas Setup performs finalization of a mount point including creation
of the mount point path if it does not exist.

volume: moved source path existence validation to mount failure

This simplifies mount spec validation by removing the file system state
input side effect of checking whether the source path exists. Instead,
this error condition is naturally handled during bind mounting. To
achieve this, I introduce a new flag on volume.MountPoint,
CreateSourceIfMissing, that indicates that the source path should be
created if it is missing. We use this flag to decide whether to
idtools.MkdirAllAndChownNew rather than always executing that function
and relying on the early spec validation to prevent --mount requests
from creating directory hierarchies. I also simplified
volume.MountPoint.Setup to remove the optional checking function as this
check can now be performed accurately immediately after
volume.MountPoint.Realize for Unix-like systems.

This change was necessary in order to support mount point plugins that
implement full file system interposition as the requested source should
not be checked early in that case and, instead, the effective source
should be used for mounting (or fail).

daemon/cluster/executor/container: don't precheck bind mount sources

Checking bind mount source directories for existence interferes with
mount point plugins and is unnecessary as automatic creation of source
directories only occurs with 'raw' mount specifications (e.g. `-v
/foo:/bar`) and not with structured specifications.

daemon/cluster/executor/container: add mount point plugins to info

Also, fix an authz plugin type name reporting bug.

volume/mountpoint: add emergency detach instructions for middleware

In the case where middleware is unreachable when it is time to detach an
interposed mount, emergency detach instructions can be used to prevent
mount leaks or file system debris from accumulating or causing further
issues. Without emergency detach instructions, many mount point
middleware chains could fail to detach if a plugin is unreachable (even
if later plugins plowed ahead, they may be unable to clean up their
mounts or directories if the unresponsive plugin used them). With
emergency detach instructions, many mount point plugin chains can safely
and completely detach without _requiring_ responsive plugins. For
resource tracking purposes, we still prefer to contact plugins for
detach.

Signed-off-by: David Sheets <dsheets@docker.com>
Signed-off-by: David Sheets <dsheets@docker.com>
Signed-off-by: David Sheets <dsheets@docker.com>
A missing bind source on container creation is OK. The failure for a
missing bind source should only occur much later after any mount point
middleware has participated and then only if the the mount point has its
CreateSourceIfMissing flag set (e.g. from -v).

Signed-off-by: David Sheets <dsheets@docker.com>
Signed-off-by: David Sheets <dsheets@docker.com>
sysInfo = sysinfo.New(true)

testEnv.Print()
setupSuite()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to break windows CI (I just realized that these suites aren't running on windows CI but that should be fixed shortly).

It looks like mountpoint_windows_test.go is missing which needs to define these.

mountPointController[2] = mountPointControllerForLocalVolumeBindMounts()
mountPointController[3] = mountPointControllerForAnonymousBindMounts()
mountPointController[4] = mountPointControllerForAllBindMounts()
mountPointController[5] = mountPointControllerForMountsIntoContainersWithoutAllowLabel()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that there are 6 controllers being setup, but some tests also modify the controllers. I believe not every test uses every controller, is that right?

I think this is going to be difficult to maintain. Instead of having everything setup for every test could you have a function like:

func setupPlugin(controllers ...testMountPointController) { ... }

and call that on each test with just the controllers that it needs? That should also remove the need to modify the static controllers from individual tests, and will hopefully remove the indexing using numbers as well.

setupPlugin() can still add them to a static slice, and teardown can iterate over the slice to perform cleanup.

@dsheets
Copy link
Copy Markdown
Contributor Author

dsheets commented Oct 5, 2017

Could someone please review the design and implementation of this PR?

A few additional test cases and some minor clean-up will be forthcoming but nothing is blocking a review, afaik.

@vdemeester
Copy link
Copy Markdown
Member

cc @thaJeztah @cpuguy83 @vieux what is the status for this one ? 👼

@flx42
Copy link
Copy Markdown
Contributor

flx42 commented Dec 14, 2017

FWIW, we (NVIDIA, for GPU enablement in containers) would not benefit from this new type of plugin, I believe. Having support for standardized OCI hooks would be better for us. As described in #28837, such hooks could either be registered to the docker daemon (like we do for runtimes), or passed to docker run with something like --exec-opt prestart=hook1,hook2, or dropped inside a directory like in Red Hat's fork.

@cpuguy83
Copy link
Copy Markdown
Member

@flx42 I think hooks are basically deprecated in favor of the create/start split that is available in runc.... how to expose that from docker I'm really not certain.... right now there isn't even such a split in the moby codebase yet.

@flx42
Copy link
Copy Markdown
Contributor

flx42 commented Dec 14, 2017

@cpuguy83 What do you mean "deprecated"? Aren't they part of the OCI runtime spec?
https://github.com/opencontainers/runtime-spec/blob/v1.0.1/specs-go/config.go#L122-L130

@cpuguy83
Copy link
Copy Markdown
Member

@flx42 Sure it's part of the spec, but it basically the preference is to use create/start instead of hooks. Deprecated doesn't mean gone.

@tiborvass
Copy link
Copy Markdown
Contributor

@anusha-ragunathan it would be interesting to investigate whether this approach could work for @flx42 as well.

@vdemeester
Copy link
Copy Markdown
Member

@cpuguy83 what is the status here ?

@cpuguy83
Copy link
Copy Markdown
Member

If this is something we need to have, it would be good if someone who is interested in having this merged to take it over.
Also are we sure this is the right layer to abstract? Could (or should) this be generalized for the entire container (i.g. full container lifecycle hooks)?

I'm going to go ahead and close this one (which has tons of merge conflicts as is) and move discussion back to the open issue. Thanks! 👼

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status/needs-attention Calls for a collective discussion during a review session status/1-design-review

Projects

None yet

Development

Successfully merging this pull request may close these issues.