Skip to content

Set daemon root to use shared propagation#36096

Merged
yongtang merged 1 commit intomoby:masterfrom
cpuguy83:use_rshared_prop_for_daemon_root
Jan 24, 2018
Merged

Set daemon root to use shared propagation#36096
yongtang merged 1 commit intomoby:masterfrom
cpuguy83:use_rshared_prop_for_daemon_root

Conversation

@cpuguy83
Copy link
Copy Markdown
Member

This change sets an explicit mount propagation for the daemon root.
This is useful for people who need to bind mount the docker daemon root
into a container.

Since bind mounting the daemon root should only ever happen with at
least rlsave propagation (to prevent the container from holding
references to mounts making it impossible for the daemon to clean up its
resources), we should make sure the user is actually able to this.

Most modern systems have shared root (/) propagation by default
already, however there are some cases where this may not be so
(e.g. potentially docker-in-docker scenarios, but also other cases).
So this just gives the daemon a little more control here and provides
a more uniform experience across different systems.

Copy link
Copy Markdown
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🐻

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone has set MountFlags=Private on docker.service in systemd, won't this result in dockerd root being converted to shared, thereby negating the operator's preference to isolate docker mounts? If so, and that was their intent, we should recommend they switch to MountFlags=slave.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With MountFlags=Private, this would till be disconnected from the parent mount namespace even though we've set it to shared.
So new mounts would not leak in or out (to/from the parent).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

works for me.

This change sets an explicit mount propagation for the daemon root.
This is useful for people who need to bind mount the docker daemon root
into a container.

Since bind mounting the daemon root should only ever happen with at
least `rlsave` propagation (to prevent the container from holding
references to mounts making it impossible for the daemon to clean up its
resources), we should make sure the user is actually able to this.

Most modern systems have shared root (`/`) propagation by default
already, however there are some cases where this may not be so
(e.g. potentially docker-in-docker scenarios, but also other cases).
So this just gives the daemon a little more control here and provides
a more uniform experience across different systems.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
@cpuguy83 cpuguy83 force-pushed the use_rshared_prop_for_daemon_root branch from 1954c67 to a510192 Compare January 23, 2018 22:17
@kolyshkin
Copy link
Copy Markdown
Contributor

LGTM

Copy link
Copy Markdown
Member

@yongtang yongtang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yongtang yongtang merged commit 3ca99ac into moby:master Jan 24, 2018
@tonistiigi
Copy link
Copy Markdown
Member

Where is this mount cleaned up?

This is useful for people who need to bind mount the docker daemon root
into a container.

Why do we want to allow that? We are setting the parent directories of graph-dir and container dir to private/unbindable (in part) to avoid these accidental leaks and possible EBUSY. If something outside is tracking the graphdriver mounts then the reference counter in Docker gets confused and will delete mounted data (or at least in some versions in aufs fail doing that).

@cpuguy83
Copy link
Copy Markdown
Member Author

@tonistiigi

Where is this mount cleaned up?

Nice catch, I'll follow up to clean that up on shutdown.

Why do we want to allow that?

It's already alllowed today, lots of things do it including Kube, cadavisor, and even docker-ee.
We could restrict this even farther to say you can only mount with slave propagation (e.g. no changes in a container would propagate to the host), but I didn't feel like it's totally necessary to make such a restriction since someone has to actually specify shared propagation.... there must be some reason they decided to.

We are setting the parent directories of graph-dir and container dir to private/unbindable

We pretty much need to stop setting private propagation on these things because it is really causing a lot of these issues. See #36047 for this change and explanation.

We could do unbindable... and maybe we should still... but I worry about this breaking people. I definitely don't think we should set the graph dir itself to unbindable because this would prevent any sort of way to do analysis on the graph driver dir from a container. This would mean setting up a new sub-dir (like we did for IPC and secret mounts on containers) and adjusting the graphdriver implementation to use it, but also be backwards compatible for old containers (live restore at least).

@cpuguy83 cpuguy83 deleted the use_rshared_prop_for_daemon_root branch January 24, 2018 23:03
@cpuguy83
Copy link
Copy Markdown
Member Author

Also, maybe missing #36055 as context to also fix issues with people bind mounting the daemon root.

@thaJeztah thaJeztah added the area/daemon Core Engine label Jun 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants