Description
When an image defines VOLUME [ "/run" ], an attempt to run a user-namespaced container with hostUsers: false via K3s fails with the message
Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "/var/lib/kubelet/pods/9587ef35-67eb-4fcb-9b57-a28fc23bd1fb/volumes/kubernetes.io~projected/kube-api-access-94hrt" to rootfs at "/var/run/secrets/kubernetes.io/serviceaccount": create mountpoint for /var/run/secrets/kubernetes.io/serviceaccount mount: mkdirat /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/test-volume/rootfs/run/secrets: permission denied
To work around the failure, it is necessary to mount /run explicitly as an emptyDir volume.
The problem is present with the containerd and runc shipped in stock K3s (v2.0.4-k3s2). It also reproduces with containerd 2.1, runc 1.3.0, and crun 1.21.
Steps to reproduce the issue
- Have an Ubuntu 24.04 machine (VM).
export INSTALL_K3S_EXEC="--kubelet-arg feature-gates=UserNamespacesSupport=true --kube-apiserver-arg feature-gates=UserNamespacesSupport=true --kube-controller-manager-arg feature-gates=UserNamespacesSupport=true --kube-scheduler-arg feature-gates=UserNamespacesSupport=true"
curl -sfL https://get.k3s.io | sh -s - --write-kubeconfig ~/.kube/config
sudo chown -R $( id -u ):$( id -g ) ~/.kube
sudo apt update ; sudo apt install -y buildah
- Have a Dockerfile with
FROM docker.io/library/alpine:latest
VOLUME [ "/run" ]
buildah build -t docker-archive:/tmp/test-volume.tar:localhost/test-volume
sudo k3s ctr images import - < /tmp/test-volume.tar
- Have a test-volume.yaml with
apiVersion: v1
kind: Pod
metadata:
  name: test-volume
spec:
  restartPolicy: Never
  hostUsers: false
  containers:
  - name: test-volume
    image: localhost/test-volume
    imagePullPolicy: Never
    command: [ "cat", "/proc/self/uid_map" ]
kubectl apply -f test-volume.yaml
kubectl logs test-volume
kubectl describe pod/test-volume | tail -2
Describe the results you received and expected
Expected:
Something like
0 383975424 65536
indicating that the container is running user-namespaced, and
Normal Created 5s kubelet Created container: test-volume
Normal Started 5s kubelet Started container test-volume
with no error.
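A small helper can make the "running user-namespaced" check explicit. This is only a sketch: the function name is_userns_mapped is hypothetical, and it relies on the heuristic that with hostUsers: false container UID 0 is mapped to a non-zero host UID, whereas a container sharing the host user namespace shows the identity mapping 0 0 4294967295.

```shell
# Hypothetical helper, not part of K3s or kubectl.
# Reads uid_map contents on stdin; each line is "<inside-id> <outside-id> <length>".
# Succeeds (exit 0) if container UID 0 is mapped to a non-zero host UID.
is_userns_mapped() {
    awk '$1 == 0 && $2 != 0 { found = 1 } END { exit (found ? 0 : 1) }'
}

# Intended use with the test pod from this report (left as a comment
# because it needs a running cluster):
#   kubectl logs test-volume | is_userns_mapped && echo "user-namespaced"
```

With the expected output from above (0 383975424 65536) the helper succeeds; with the host identity mapping it fails.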
Actual:
No output from kubectl logs test-volume. The describe output shows
Normal Created 5s kubelet Created container: test-volume
Warning Failed 5s kubelet Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "/var/lib/kubelet/pods/9587ef35-67eb-4fcb-9b57-a28fc23bd1fb/volumes/kubernetes.io~projected/kube-api-access-94hrt" to rootfs at "/var/run/secrets/kubernetes.io/serviceaccount": create mountpoint for /var/run/secrets/kubernetes.io/serviceaccount mount: mkdirat /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/test-volume/rootfs/run/secrets: permission denied
What version of containerd are you using?
containerd github.com/containerd/containerd/v2 v2.1.0 061792f
Any other relevant information
I originally filed this issue with K3s as k3s-io/k3s#12332 and was advised to file it with containerd, so this is essentially a copy of that issue.
I am able to make things work by adding
    volumeMounts:
    - mountPath: /run
      name: run-volume
  volumes:
  - name: run-volume
    emptyDir: {}
to test-volume.yaml.
But I would expect VOLUME [ "/run" ] to do effectively the same, and user-namespaced containers not to choke on that image.
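For reference, the workaround merged into the full pod manifest could look like the sketch below. The file name test-volume-workaround.yaml is made up for illustration. The placement assumes volumeMounts belongs to the test-volume container and volumes to the pod spec, as the Pod schema requires.

```shell
# Hypothetical file name; the content is test-volume.yaml from the
# reproduction steps with the emptyDir workaround added.
cat > test-volume-workaround.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: test-volume
spec:
  restartPolicy: Never
  hostUsers: false
  containers:
  - name: test-volume
    image: localhost/test-volume
    imagePullPolicy: Never
    command: [ "cat", "/proc/self/uid_map" ]
    volumeMounts:
    - mountPath: /run
      name: run-volume
  volumes:
  - name: run-volume
    emptyDir: {}
EOF

# Then, on the test machine:
#   kubectl apply -f test-volume-workaround.yaml
```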
This is a minimized example of an issue we've seen with https://github.com/freeipa/freeipa-container and K3s and containerd. The reason why we have VOLUME [ "/run" ] in the image is to make the usage of that systemd-based image easier with
securityContext:
  readOnlyRootFilesystem: true
Show configuration if it is related to CRI plugin.
version = 3
[plugins.'io.containerd.cri.v1.runtime'.cni]
bin_dirs = [ "/var/lib/rancher/k3s/data/cni" ]
conf_dir = "/var/lib/rancher/k3s/agent/etc/cni/net.d"
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc]
cgroup_writable = true
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc.options]
SystemdCgroup = true