daemon/command: disable c8d snapshotter when userns remapping enabled#51042
Conversation
daemon/command/daemon_linux.go
Outdated
| func disableC8dSnapshotterOnUsernsRemap(cfg *config.Config) { | ||
| if cfg.RemappedRoot != "" { | ||
| if enabled, ok := cfg.Features["containerd-snapshotter"]; !ok || enabled { | ||
| log.G(context.TODO()).Warn("userns remapping enabled, disabling containerd snapshotter") |
There was a problem hiding this comment.
Can we instead make this a hard failure and return an error (validate the combination of userns-remapping and snapshotter, then invalidate the config and error)?
Lines 35 to 42 in dcf5db2
There was a problem hiding this comment.
Can we instead make this a hard failure and return an error (validate the combination of userns-remapping and snapshotter, then invalidate the config and error)?
Not sure if that makes sense. In v29.0, we're making the feature containerd-snapshotter an opt-out. Users who have userns-remap set and that'd upgrade to v29.0 would find their Engine is now failing to start and would need to explicitly opt-out of that feature.
I think it'd be best to return an error if containerd-snapshotter is explicitly opted-in, and userns-remap is enabled. If it's not explicitly set, we shouldn't enable it if userns-remap is set. Is this what you're suggesting?
There was a problem hiding this comment.
+1 for not opting in to containerd when userns-remap is enabled. And if both containerd AND userns-remap is enabled then return a hard error.
|
Looking at the issue; Who creates that |
The original {
"destination": "/etc/resolv.conf",
"type": "bind",
"source": "/var/lib/docker/100000.100000/buildkit/executor/resolv.conf",
"options": [
"nosuid",
"noexec",
"nodev",
"rbind",
"ro"
]
},The original issue: comes from this runc code path: https://github.com/opencontainers/runc/blob/00aec12c711a4d1562578e1d5e99d77610a169d9/libcontainer/rootfs_linux.go#L160. The rootfs bind-mounted by buildkit's runcexecutor isn't id-mapped (https://github.com/moby/buildkit/blob/1cff4fb6f7938128c24eacae7c064b589423cb0e/executor/runcexecutor/executor.go#L239) — the rootMount := mount.Mount{
Type: "bind",
Source: "/var/lib/docker/100000.100000/buildkit/containerd-overlayfs/cachemounts/buildkit1070698249",
Options: []string{"rw", "rbind"},
}That 'Source' itself comes from buildkit's These two client opts add labels If I add this to buildkit's if s.idmap != nil {
for i := range mounts {
if mounts[i].Type != "bind" && mounts[i].Type != "overlay" {
continue
}
for _, m := range s.idmap.UIDMaps {
mounts[i].Options = append(mounts[i].Options, fmt.Sprintf("uidmap=%d:%d:%d", m.ID, m.ParentID, m.Count))
}
for _, m := range s.idmap.GIDMaps {
mounts[i].Options = append(mounts[i].Options, fmt.Sprintf("gidmap=%d:%d:%d", m.ID, m.ParentID, m.Count))
}
}
}Then the rootfs is correctly id-mapped, and buildkit's UIDMappings: []specs.LinuxIDMapping{{ContainerID: 0, HostID: 100000, Size: 65536}},
GIDMappings: []specs.LinuxIDMapping{{ContainerID: 0, HostID: 100000, Size: 65536}},After that, everything works correctly. I'm not sure if that's the proper fix though (Tõnis seemed to suggest it wasn't), and it'd require changes in buildkit which won't receive a new minor release before current target date for v29.0. |
2ef4c9a to
ceac428
Compare
Thought I left a comment, but probably forgot to press the button; if wiring up is the issue, given that we (currently) only provide a single mapping ... wondering if we could (but it's a bit of a hack) just have a global var that's set to the right mapping for this (until things are wired up) 🤔 |
03463dd to
f3ed627
Compare
|
|
||
| ctx := testutil.GetContext(c) | ||
| s.d.StartWithBusybox(ctx, c, "--userns-remap", "default") | ||
| s.d.StartWithBusybox(ctx, c, "--userns-remap", "default", "--storage-driver", "vfs") |
There was a problem hiding this comment.
Not entirely sure why but the daemon fails to start in this test when overlay2 is used — it's running an error from here:
moby/daemon/graphdriver/overlayutils/overlayutils.go
Lines 72 to 75 in 45bc224
ISTR the kernel limits the size of the mount option string, so I think that's due to that (it's ~1800 chars long).
Also, I think this test was working correctly before ead007f as DOCKER_GRAPHDRIVER was defaulting to vfs (see ead007f#diff-3e42eb9b4a639f89ff1487b88be1de1111e14d83fe672e3cf69b87bcf86b48b0R43).
ERRO[2025-10-08T09:50:35.932904673Z] Failed to mount overlayfs filesystem error="invalid argument" lowerDir="/go/src/github.com/docker/docker/bundles/test-integration/TestDockerDaemonSuite/TestDaemonUserNamespaceRootSetting/de382817ce451/root/100000.100000/check-overlayfs-support3140787799/lower2:/go/src/github.com/docker/docker/bundles/test-integration/TestDockerDaemonSuite/TestDaemonUserNamespaceRootSetting/de382817ce451/root/100000.100000/check-overlayfs-support3140787799/lower1" mnt=/go/src/github.com/docker/docker/bundles/test-integration/TestDockerDaemonSuite/TestDaemonUserNamespaceRootSetting/de382817ce451/root/100000.100000/check-overlayfs-support3140787799/merged opts="lowerdir=/go/src/github.com/docker/docker/bundles/test-integration/TestDockerDaemonSuite/TestDaemonUserNamespaceRootSetting/de382817ce451/root/100000.100000/check-overlayfs-support3140787799/lower2:/go/src/github.com/docker/docker/bundles/test-integration/TestDockerDaemonSuite/TestDaemonUserNamespaceRootSetting/de382817ce451/root/100000.100000/check-overlayfs-support3140787799/lower1,upperdir=/go/src/github.com/docker/docker/bundles/test-integration/TestDockerDaemonSuite/TestDaemonUserNamespaceRootSetting/de382817ce451/root/100000.100000/check-overlayfs-support3140787799/upper,workdir=/go/src/github.com/docker/docker/bundles/test-integration/TestDockerDaemonSuite/TestDaemonUserNamespaceRootSetting/de382817ce451/root/100000.100000/check-overlayfs-support3140787799/work" upperDir=/go/src/github.com/docker/docker/bundles/test-integration/TestDockerDaemonSuite/TestDaemonUserNamespaceRootSetting/de382817ce451/root/100000.100000/check-overlayfs-support3140787799/upper workDir=/go/src/github.com/docker/docker/bundles/test-integration/TestDockerDaemonSuite/TestDaemonUserNamespaceRootSetting/de382817ce451/root/100000.100000/check-overlayfs-support3140787799/work
daemon/command/daemon_linux.go
Outdated
| // Buildkit breaks when userns remapping is enabled and containerd snapshotter is used. As a temporary workaround, | ||
| // if containerd snapshotter is explicitly enabled, and userns remapping is enabled too, return an error. If userns | ||
| // remapping is enabled, but containerd-snapshotter is enabled by default, disable it. See https://github.com/moby/moby/issues/47377. | ||
| func disableC8dSnapshotterOnUsernsRemap(cfg *config.Config) error { | ||
| if cfg.RemappedRoot == "" { | ||
| return nil | ||
| } | ||
|
|
||
| enabled := cfg.Features["containerd-snapshotter"] | ||
| if enabled { | ||
| return errors.New("containerd-snapshotter is explicitly enabled, but is not comptatible with userns remapping. Please disable userns remapping or containerd-snapshotter") | ||
| } | ||
|
|
||
| log.G(context.TODO()).Warn("userns remapping enabled, disabling containerd snapshotter") | ||
| if cfg.Features == nil { | ||
| cfg.Features = make(map[string]bool) | ||
| } | ||
| cfg.Features["containerd-snapshotter"] = false | ||
|
|
||
| return nil | ||
| } |
There was a problem hiding this comment.
Perhaps not a strict blocker, but I just noticed that this possible could be rolled into setPlatformOptions, because it's specific to Linux, and which .. looks to be dealing with userns-remapping already;
moby/daemon/command/daemon_linux.go
Lines 12 to 23 in c09d53a
There was a problem hiding this comment.
Good call out! I've merged disableC8dSnapshotterOnUsernsRemap into setPlatformOptions.
3f8398a to
4460a72
Compare
4460a72 to
4c8111d
Compare
Buildkit fails when userns remapping is enabled and c8d snapshotter is used. As a temporary workaround, disable c8d snapshotter when userns remapping is enabled. This will need a proper fix in the future. Signed-off-by: Albin Kerouanton <albinker@gmail.com>
4c8111d to
e1722eb
Compare
| enabled := conf.Features["containerd-snapshotter"] | ||
| if enabled { | ||
| return errors.New("containerd-snapshotter is explicitly enabled, but is not compatible with userns remapping. Please disable userns remapping or containerd-snapshotter") | ||
| } |
There was a problem hiding this comment.
containerd-snapshotter is no longer an opt-in for fresh engine installations.
So if you have the userns enabled in your daemon.json on a fresh install then it would miss this.
We need to handle it somewhere around here:
Line 1276 in 45ddf37
There was a problem hiding this comment.
This condition only tests if the feature flag was explicitly set (e.g. when upgrading the daemon), and returns an error if so.
If it's not set, the feature flag is automatically disabled a few lines below: https://github.com/akerouanton/docker/blob/e1722eb8d8013019524c3ff56e45374b629d6e68/daemon/command/daemon_linux.go#L40
There was a problem hiding this comment.
Right but that will break the default opt-in, because containerd-snapshotter is set to explicit false (instead of no value present).
There was a problem hiding this comment.
Oh wait, I missed this line:
moby/daemon/command/daemon_linux.go
Lines 17 to 19 in e1722eb
- What I did
Buildkit fails when userns remapping is enabled and c8d snapshotter is used. As a temporary workaround, disable c8d snapshotter when userns remapping is enabled. This will need a proper fix in the future.
- How to verify it
- Human readable description for the release notes