Description
After upgrading from runc 1.1.3 to runc 1.1.4 we started to see a particular container fail with an error like:
FATA[0000] failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec /entrypoint.sh: permission denied: unknown
The same container runs successfully with runc 1.1.3.
Upon inspection of the container, we discovered that the entrypoint's permissions were set to 744, but it was owned by root instead of the container's user. We were able to fix the issue by correctly chowning the entrypoint script, but the upgrade caused an outage, so I'm reporting it here as a bug.
Root Cause
I traced the cause down to this PR #3522
In that PR, a new check was introduced to exit early if the entrypoint isn't executable. It does so by
- Attempting to use faccessat2(2) to check the effective permissions if available (requires kernel 5.8+ and libseccomp 2.4.0+)
- Falling back to access(2) to check the real user's permissions if faccessat2(2) isn't available.
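The two-step check described above can be sketched in Python. This is only an illustration, not runc's Go code: `os.access(..., effective_ids=True)` stands in for the effective-ID check, plain `os.access` stands in for the access(2) fallback, and the function name `is_executable` is hypothetical.

```python
import os


def is_executable(path: str) -> bool:
    """Return True if the calling process appears allowed to execute `path`."""
    # Preferred: check against the effective IDs, where the platform supports it.
    if os.access in os.supports_effective_ids:
        try:
            return os.access(path, os.X_OK, effective_ids=True)
        except NotImplementedError:
            pass  # fall through to the real-ID check below
    # Fallback: plain access(2), evaluated against the real UID/GID.
    return os.access(path, os.X_OK)
```

The point of the sketch is the ordering: the fallback branch answers a different question (real IDs) than the preferred branch (effective IDs), so the two can disagree for the same file.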
We were hitting the second case. The problem is that this check is more restrictive than execve(2) because access(2) evaluates permissions against the real UID and GID and, when the real UID is not root, ignores the process's capabilities.
Docker and containerd have a set of default capabilities that they add to the runtime spec. Both include CAP_DAC_OVERRIDE in this set. See containerd's list, docker's list.
The capabilities man page says this about CAP_DAC_OVERRIDE
Bypass file read, write, and execute permission checks. (DAC is an abbreviation of "discretionary access control".)
So in runc 1.1.3, execve worked because runc has the CAP_DAC_OVERRIDE capability to ignore the fact that the user doesn't actually have execute permission. In runc 1.1.4, the additional access(2) check stripped capabilities when checking permissions and caused runc to exit with permission denied: unknown.
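To make the capability reasoning concrete, here is a small Python model of the permission decision. It is a simplified sketch, not kernel or runc code: it ignores ACLs, noexec mounts, security modules, and user namespaces, and the function names and the UID/GID 1000 for the container user are assumptions made for illustration.

```python
import stat

ANY_EXEC = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def mode_allows_exec(mode, file_uid, file_gid, uid, gids):
    """Plain DAC check: pick the owner, group, or other bits, then test execute."""
    if uid == file_uid:
        return bool(mode & stat.S_IXUSR)
    if file_gid in set(gids):
        return bool(mode & stat.S_IXGRP)
    return bool(mode & stat.S_IXOTH)


def can_exec(mode, file_uid, file_gid, uid, gids, cap_dac_override):
    """DAC check, optionally bypassed by CAP_DAC_OVERRIDE."""
    if mode_allows_exec(mode, file_uid, file_gid, uid, gids):
        return True
    # For regular files, Linux lets CAP_DAC_OVERRIDE bypass the execute check
    # only when at least one execute bit is set on the file.
    return cap_dac_override and bool(mode & ANY_EXEC)


# The entrypoint from this report: mode 744, owned by root:root,
# run as a non-root user (UID/GID 1000 is an assumed value).
args = (0o744, 0, 0, 1000, [1000])
print(can_exec(*args, cap_dac_override=True))   # True: capability honored
print(can_exec(*args, cap_dac_override=False))  # False: capability ignored
```

The two calls differ only in whether the capability is taken into account, which mirrors the difference between the execve(2) path and the access(2) fallback described above.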
Conditions for this bug
- kernel < 5.8 or libseccomp < 2.4.0
- An entry point script where the owner has execute permission, but the container user has readonly permission
- CAP_DAC_OVERRIDE or similar capabilities set (which is the default for both containerd and docker)
Reproduction
entrypoint.sh
#!/bin/sh
echo "Hello, World!"
Dockerfile
FROM alpine:latest
COPY ./entrypoint.sh /entrypoint.sh
RUN chown root:root /entrypoint.sh \
    && chmod 744 /entrypoint.sh
RUN adduser --disabled-password testuser
USER testuser
ENTRYPOINT ["/entrypoint.sh"]
When run with runc 1.1.3:
$ sudo nerdctl run --rm -it --net=none runc114error:latest
Hello, World!
When run with runc 1.1.4:
$ sudo nerdctl run --rm -it --net=none runc114error:latest
FATA[0000] failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec /entrypoint.sh: permission denied: unknown