Runc 1.1.4 container fails with permission denied: unknown when relying on capabilities #3715

@Kern--

Description

After upgrading from runc 1.1.3 to runc 1.1.4 we started to see a particular container fail with an error like:

FATA[0000] failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec /entrypoint.sh: permission denied: unknown

The same container runs successfully with runc 1.1.3.

Upon inspecting the container, we discovered that the entrypoint's permissions were set to 744, but it was owned by root rather than by the container's user. We fixed the issue by chowning the entrypoint script to the correct user, but the upgrade caused an outage, so I'm reporting it here as a bug.

Root Cause
I traced the cause to PR #3522.

That PR introduced a new check that exits early if the entrypoint isn't executable. It does so by:

  1. Attempting to use faccessat2(2) to check the effective permissions, if available (requires kernel 5.8+ and libseccomp 2.4.0+).
  2. Falling back to access(2), which checks the real user's permissions, if faccessat2(2) isn't available.

We were hitting case 2. The problem is that this check is more restrictive than execve(2): access(2) evaluates permissions against the real UID and GID and, when the real UID is not root, ignores the process's effective capabilities.
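The two-step check can be sketched as a rough Python analogy. This is not runc's Go code: the helper name `check_executable` is hypothetical, and it relies on Python's `os.access`, which calls access(2) by default and checks against the effective IDs when `effective_ids=True` is supported. How faithfully the effective-ID path honours capabilities depends on the libc and kernel in use.

```python
import os


def check_executable(path: str) -> bool:
    """Return True if `path` looks executable, preferring an effective-ID check."""
    # Step 1: check against the effective UID/GID where the platform allows it.
    if os.access in os.supports_effective_ids:
        return os.access(path, os.X_OK, effective_ids=True)
    # Step 2: fall back to plain access(2), which checks the real UID/GID
    # and can therefore be stricter than what execve(2) would permit.
    return os.access(path, os.X_OK)
```

A pre-flight check like this is only safe if it is never stricter than the execve(2) call it guards, and the fallback branch breaks that property.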

Docker and containerd have a set of default capabilities that they add to the runtime spec. Both include CAP_DAC_OVERRIDE in this set. See containerd's list, docker's list.
The capabilities(7) man page says this about CAP_DAC_OVERRIDE:

Bypass file read, write, and execute permission checks.  (DAC is an abbreviation of "discretionary access control".)

So in runc 1.1.3, execve worked because runc has the CAP_DAC_OVERRIDE capability, which lets it ignore the fact that the user doesn't actually have execute permission. In runc 1.1.4, the additional access(2) check stripped capabilities when checking permissions and caused runc to exit with permission denied: unknown.
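The mismatch between the two checks can be illustrated with a deliberately simplified model of the permission decision. This is a sketch, not kernel code: the function names are hypothetical, and it ignores ACLs, LSMs, and mount options. It assumes a regular file and a non-root real UID, and it encodes the Linux rule that CAP_DAC_OVERRIDE only permits execution when at least one execute bit is set on the file.

```python
import stat


def dac_allows_exec(mode: int, is_owner: bool, in_group: bool) -> bool:
    """Plain mode-bit check: pick the owner, group, or other execute bit."""
    if is_owner:
        return bool(mode & stat.S_IXUSR)
    if in_group:
        return bool(mode & stat.S_IXGRP)
    return bool(mode & stat.S_IXOTH)


def execve_allows(mode: int, is_owner: bool, in_group: bool,
                  cap_dac_override: bool) -> bool:
    """Model of execve(2): mode bits first, then CAP_DAC_OVERRIDE."""
    if dac_allows_exec(mode, is_owner, in_group):
        return True
    any_exec_bit = bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
    return cap_dac_override and any_exec_bit


def access_fallback_allows(mode: int, is_owner: bool, in_group: bool) -> bool:
    """Model of access(2) for a non-root real UID: capabilities are ignored."""
    return dac_allows_exec(mode, is_owner, in_group)


# The entrypoint from this report: mode 744, owned by root, run as another user.
print(execve_allows(0o744, False, False, cap_dac_override=True))  # True
print(access_fallback_allows(0o744, False, False))                # False
```

The two results differ for exactly the case in this report, which is why the container started under 1.1.3 and was rejected under 1.1.4.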

Conditions for this bug

  1. kernel < 5.8 or libseccomp < 2.4.0
  2. An entrypoint script where the owner has execute permission, but the container user has read-only permission
  3. CAP_DAC_OVERRIDE or similar capabilities set (which is the default for both containerd and docker)
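As a rough aid for checking whether a setup matches all three conditions, they can be combined into one predicate. This is a minimal sketch under stated assumptions: `kernel_older_than` and `matches_bug_conditions` are hypothetical helpers, the kernel release is a `uname -r` style string, the libseccomp version is passed as a tuple, and the caller supplies the file mode and ownership relationship.

```python
import re
import stat


def kernel_older_than(release: str, major: int, minor: int) -> bool:
    """Compare a `uname -r` style string such as '5.4.0-135-generic' to major.minor."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) < (major, minor)


def matches_bug_conditions(kernel_release: str, libseccomp_version: tuple,
                           file_mode: int, user_is_owner: bool,
                           user_in_group: bool, has_cap_dac_override: bool) -> bool:
    # Condition 1: faccessat2(2) is unavailable, so the access(2) fallback runs.
    uses_fallback = (kernel_older_than(kernel_release, 5, 8)
                     or tuple(libseccomp_version) < (2, 4, 0))
    # Condition 2: the owner may execute, but the container user's own bits say no.
    owner_can = bool(file_mode & stat.S_IXUSR)
    if user_is_owner:
        user_can = owner_can
    elif user_in_group:
        user_can = bool(file_mode & stat.S_IXGRP)
    else:
        user_can = bool(file_mode & stat.S_IXOTH)
    # Condition 3: the container relies on CAP_DAC_OVERRIDE to execute anyway.
    return uses_fallback and owner_can and not user_can and has_cap_dac_override
```

For example, `matches_bug_conditions("5.4.0-135-generic", (2, 5, 1), 0o744, False, False, True)` evaluates to `True`, while the same call with mode `0o755` evaluates to `False`.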

Reproduction

entrypoint.sh

#!/bin/sh

echo "Hello, World!"

Dockerfile

FROM alpine:latest

COPY ./entrypoint.sh /entrypoint.sh
RUN chown root:root /entrypoint.sh \
    && chmod 744 /entrypoint.sh

RUN adduser --disabled-password testuser
USER testuser
ENTRYPOINT ["/entrypoint.sh"]

When run with runc 1.1.3:

$ sudo nerdctl run --rm -it --net=none runc114error:latest
Hello, World!

When run with runc 1.1.4:

$ sudo nerdctl run --rm -it --net=none runc114error:latest
FATA[0000] failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec /entrypoint.sh: permission denied: unknown
