Description
When using containers with user namespaces enabled, we are observing that it is possible for containerd to remove files from the overlayfs snapshotter's data for the container image, as well as from other mounts that may be added to the container. When the removed files are on the snapshot, it affects future containers started on the host. When the removed files are from other mounts it could cause permanent data loss.
Steps to reproduce the issue
- Start a container on a host with user namespaces enabled
- Run some executable processes in one of these containers or open some files to cause the idmapped mount to be busy
- Kill the container
- Be unlucky. This does not always cause data loss; presumably it happens only when the kernel has not finished cleaning up its accounting of open file handles by the time containerd attempts the unmount call.
Describe the results you received and expected
Actual results, from containerd log:
time="2024-09-17T23:04:52.334331723Z" level=warning msg="failed to unmount temp lowerdir /var/lib/containerd/tmpmounts/ovl-idmapped1999575633/2" error="device or resource busy"
time="2024-09-17T23:04:52.337862379Z" level=info msg="TaskExit event in podsandbox handler container_id:\"3e5d4ccbf1246380d2db67bb97d49f614dc21b2493fac0656f7d5dd8815cec5c\" id:\"a917de21e41c6c09e3e06ecbd605b3775e68c6bb40e1aef6ab37338dbcc59df3\" pid:2449241 exited_at:{seconds:1726614292 nanos:327893138}"
time="2024-09-17T23:04:52.388515611Z" level=warning msg="failed to remove temporary overlay lowerdir's" error="unlinkat /var/lib/containerd/tmpmounts/ovl-idmapped1999575633/2: device or resource busy"
As a result, the files under the path behind /var/lib/containerd/tmpmounts/ovl-idmapped1999575633/2, a layer of the container image, were removed, because the directory was not successfully unmounted before os.RemoveAll() was called on it.
Expected results:
Temporary idmap mount is cleaned up without underlying data loss
What version of containerd are you using?
containerd github.com/containerd/containerd v2.0.0-rc.0 93022d8
Any other relevant information
Relevant code appears to be here,
containerd/core/mount/mount_linux.go
Lines 254 to 260 in 67b0687
            if err := unix.Unmount(lowerDir, 0); err != nil {
                log.L.WithError(err).Warnf("failed to unmount temp lowerdir %s", lowerDir)
            }
        }
        if terr := os.RemoveAll(filepath.Clean(filepath.Join(tmpLowerDirs[0], ".."))); terr != nil {
            log.L.WithError(terr).Warnf("failed to remove temporary overlay lowerdir's")
        }
If an unmount fails, the error is only logged as a warning; it does not prevent the subsequent os.RemoveAll from running.
Show configuration if it is related to CRI plugin.
No response