Skip to content

Tighten permissions on created directory#743

Merged
rhatdan merged 2 commits intocontainers:masterfrom
rhatdan:perms
Oct 30, 2020
Merged

Tighten permissions on created directory#743
rhatdan merged 2 commits intocontainers:masterfrom
rhatdan:perms

Conversation

@rhatdan
Copy link
Member

@rhatdan rhatdan commented Oct 12, 2020

The mount points of containers are being marked as 755 where
a lot of images now come with the default for "/", as 555.

The tighter controls on / means that a root process can not
modify the / directory without DAC_OVERRIDE. And some consider
this better security. Bottom line is tools like rpm -qVf /
are reporting failures on permissions on / in lots of RPM
based images.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1854566

Signed-off-by: Daniel J Walsh dwalsh@redhat.com

@rhatdan
Copy link
Member Author

rhatdan commented Oct 12, 2020

@nalind @mtrmac @vrothberg @rhvgoyal @saschagrunert PTAL
I am not sure if I went too far.

@mtrmac
Copy link
Collaborator

mtrmac commented Oct 12, 2020

Conceptually, I would expect the permissions of the / directory, as visible from inside the container, to come from the permissions set in the corresponding layer tarball — and with a trivial test with a loopback-mounted filesystem, the permissions of the mount point are indeed set from the mounted filesystem, not from the parent.

So I don’t understand how the PR description and the code really relates; at best changing one hard-coded value for another replaces one set of images with non-matching permissions with another set. With my very minimal understanding of overlay, are all of those directories even container-observable? I thought that the permissions of mount points before the mount happens basically don’t matter.

Either way, I was expecting some way to apply the permissions from the layer tarball to the root of the extracted tree (or, even, some code that removes something that prevents such permission application, which should be the default behavior).

This may well be correct but it is surprising to me at a first glance, and I don’t know enough about overlay or c/storage to immediately say one way or another — I’ll defer to experts.

@giuseppe
Copy link
Member

agree with @mtrmac. We should probably set the same permissions as the rootfs in the image, otherwise switching a hardcoded value for another will silent the rpm warning but can introduce other regressions. As you pointed out, running without CAP_DAC_OVERRIDE and not being able to create dir/files in the root directory.

Reinstalling filesystem from the container fixes the issue with /, but there are other errors reported by rpm -qV, such as /proc and /sys having the wrong ownership, that cannot be fixed from a user namespace.
While we can think of fixing the root permissions to follow what the rootfs tarball specifies, I think these other issues must be addressed in the rpm tool itself and make it aware of running in a container

@rhatdan
Copy link
Member Author

rhatdan commented Oct 13, 2020

@nalind Didn't we talk about this a while ago, and figured out there was an issue in / of the container.

@nalind
Copy link
Member

nalind commented Oct 13, 2020

The root directory of a layer is frequently omitted from layer blobs in images (for example, docker.io/library/fedora), though the one in the bug report appears to be an exception to that. In the case of overlay, if I'm reading the docs right, the permissions on the upper ("diff") directory are the ones that are visible for the mountpoint.

_ = idtools.MkdirAs(lower1Dir, 0500, rootUID, rootGID)
_ = idtools.MkdirAs(lower2Dir, 0500, rootUID, rootGID)
_ = idtools.MkdirAs(upperDir, 0500, rootUID, rootGID)
_ = idtools.MkdirAs(workDir, 0500, rootUID, rootGID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is run-time feature test code. We delete these directories when the test is finished. What's the purpose of this hunk?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just globally changed the permissions.

@nalind
Copy link
Member

nalind commented Oct 13, 2020

As an alternate approach that I haven't tested, does defaulting the permissions of the "diff" directory for a layer to the permissions of its parent layer's "diff" directory, if it has one, provide the desired result? I'd expect that to bring us more in line the device mapper driver.

@rhatdan rhatdan force-pushed the perms branch 2 times, most recently from 54cca22 to 6d33a94 Compare October 29, 2020 19:48
@rhatdan
Copy link
Member Author

rhatdan commented Oct 29, 2020

@giuseppe @nalind @mtrmac PTAL, I think I got it correctly, Tested this with Podman and it seems to work correctly.

@rhatdan rhatdan force-pushed the perms branch 2 times, most recently from 3109c8c to ee02f73 Compare October 29, 2020 20:00
The mount points of containers are being marked as 755 where
a lot of images now come with the default for "/", as 555.

The tighter controls on / means that a root process can not
modify the / directory without DAC_OVERRIDE.  And some consider
this better security.  Bottom line is tools like `rpm -qVf /`
are reporting failures on permissions on / in lots of RPM
based images.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
@rhatdan
Copy link
Member Author

rhatdan commented Oct 29, 2020

@giuseppe @mtrmac PTAL

@rhatdan
Copy link
Member Author

rhatdan commented Oct 29, 2020

Nalin points out that most/alot of container images do not come with ./ inside of them.

IE the tar ball contains. (Fedora comes this way)
tmp
usr
var
as opposed to (UBI8 comes this way)
./tmp
./usr
./var

For these containers need a safe default. Fedora, Centos, RHEL have decided that the safe default for /, is 555. So we should go with that.

This PR attempts to look at the parent directory or the directory of lower layers and use it.

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ... Ubuntu is sick atm

@rhatdan
Copy link
Member Author

rhatdan commented Oct 30, 2020

@giuseppe @vrothberg Please /approve this PR so testing will happen

@giuseppe
Copy link
Member

/approve

@rhatdan rhatdan merged commit 3223e3e into containers:master Oct 30, 2020
if err != nil {
return err
}
perms = os.FileMode(st.Mode())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all instances: AFAICS st.Mode() is already an os.FileMode, so the cast is unnecessary; OTOH the value also includes the file type bits.

So AFAICS this should be perms = st.Mode().Perms().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I pass st.Mode() I get this error.

make
GO111MODULE=on go build -mod=vendor -compiler gc -tags "  "  ./cmd/containers-storage
# github.com/containers/storage/drivers/overlay
drivers/overlay/overlay.go:582:9: cannot use st.Mode() (type uint32) as type os.FileMode in assignment
make: *** [Makefile:53: containers-storage] Error 2

Which indicates that st.Mode() has no value of perms and needs the cast.

Note this is a system.Stat not an os.Stat.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, my mistake.

kolyshkin added a commit to kolyshkin/cri-tools that referenced this pull request Nov 12, 2020
A change introduced by [1] makes it impossible to create a directory
entry right under container's root directory, so the test case
"should enforce a permissive profile" fails.

For the purpose of this test it does not matter where we create the
file, so let's do it under /tmp.

[1] containers/storage#743

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
kolyshkin added a commit to kolyshkin/cri-tools that referenced this pull request Nov 12, 2020
A change introduced by [1] makes it impossible to create a directory
entry right under container's root directory, so the test case
"should enforce a permissive profile" fails.

For the purpose of this test it does not matter where we create the
file, so let's do it under /tmp.

[1] containers/storage#743

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
kolyshkin added a commit to kolyshkin/cri-tools that referenced this pull request Nov 12, 2020
A change introduced by [1] makes it impossible to create a directory
entry right under container's root directory, so the test case
"should enforce a permissive profile" fails.

For the purpose of this test it does not matter where we create the
file, so let's do it under /tmp.

[1] containers/storage#743

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 956463e)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@cgwalters
Copy link
Contributor

I know this is an old PR but all the relevant people are here, and this is now biting me. It took me quite a while to understand that basically container runtimes have always ignored / in the tar stream (if it exists, which per this PR often doesn't, but sometimes it does).

AI-generated summary in https://gist.github.com/cgwalters/c18c9337aa9345d763aa446cc95c7847 (but of course with a lot of guidance and tweaking from me)

I am probably going to work around this in composefs by setting the metadata for / to equal /usr...probably no one is going to ever care, but it sure is ugly.

cgwalters added a commit to cgwalters/composefs-rs that referenced this pull request Jan 9, 2026
OCI container layer tars often don't include a root directory entry,
and when they do, container runtimes (Podman, Docker) ignore it and
use hardcoded defaults. This makes root metadata non-deterministic,
causing composefs digest mismatches.

For detailed analysis of this issue, see:
  https://gist.github.com/cgwalters/c18c9337aa9345d763aa446cc95c7847
  containers/storage#743

Since the metadata of `/` is ill-defined in the OCI spec, we default
to copying it from `/usr` - no one should care if these differ, except
for SELinux labels which we handle separately in the boot path.

Add new APIs:
- copy_root_metadata_from_usr(): copies mode, uid, gid, mtime, xattrs
  from /usr to root directory
- read_container_root(): wrapper that calls read_filesystem() then
  copy_root_metadata_from_usr()

Remove the incomplete have_root_stat/ensure_root_stat() mechanism which
only handled mtime, not permissions or other metadata.

Update cfsctl commands (compute-id, create-image, create-dumpfile) to
use read_container_root() by default. Add --no-propagate-usr-to-root
flag to use raw read_filesystem() when root metadata is well-defined.

Assisted-by: OpenCode (Opus 4.5)
cgwalters added a commit to cgwalters/composefs-rs that referenced this pull request Jan 9, 2026
OCI container layer tars often don't include a root directory entry,
and when they do, container runtimes (Podman, Docker) ignore it and
use hardcoded defaults. This makes root metadata non-deterministic,
causing composefs digest mismatches.

For detailed analysis of this issue, see:
  https://gist.github.com/cgwalters/c18c9337aa9345d763aa446cc95c7847
  containers/storage#743

Since the metadata of `/` is ill-defined in the OCI spec, we default
to copying it from `/usr` - no one should care if these differ, except
for SELinux labels which we handle separately in the boot path.

Add new APIs:
- copy_root_metadata_from_usr(): copies mode, uid, gid, mtime, xattrs
  from /usr to root directory
- read_container_root(): wrapper that calls read_filesystem() then
  copy_root_metadata_from_usr()

Remove the incomplete have_root_stat/ensure_root_stat() mechanism which
only handled mtime, not permissions or other metadata.

Update cfsctl commands (compute-id, create-image, create-dumpfile) to
use read_container_root() by default. Add --no-propagate-usr-to-root
flag to use raw read_filesystem() when root metadata is well-defined.

Assisted-by: OpenCode (Opus 4.5)
Signed-off-by: Colin Walters <walters@verbum.org>
cgwalters added a commit to cgwalters/composefs-rs that referenced this pull request Jan 10, 2026
OCI container layer tars often don't include a root directory entry,
and when they do, container runtimes (Podman, Docker) ignore it and
use hardcoded defaults. This makes root metadata non-deterministic,
causing composefs digest mismatches.

For detailed analysis of this issue, see:
  https://gist.github.com/cgwalters/c18c9337aa9345d763aa446cc95c7847
  containers/storage#743

Since the metadata of `/` is ill-defined in the OCI spec, we default
to copying it from `/usr` - no one should care if these differ, except
for SELinux labels which we handle separately in the boot path.

Add new APIs:
- copy_root_metadata_from_usr(): copies mode, uid, gid, mtime, xattrs
  from /usr to root directory
- read_container_root(): wrapper that calls read_filesystem() then
  copy_root_metadata_from_usr()

Remove the incomplete have_root_stat/ensure_root_stat() mechanism which
only handled mtime, not permissions or other metadata.

Update cfsctl commands (compute-id, create-image, create-dumpfile) to
use read_container_root() by default. Add --no-propagate-usr-to-root
flag to use raw read_filesystem() when root metadata is well-defined.

Assisted-by: OpenCode (Opus 4.5)
Signed-off-by: Colin Walters <walters@verbum.org>
cgwalters added a commit to cgwalters/composefs-rs that referenced this pull request Jan 12, 2026
OCI container layer tars often don't include a root directory entry,
and when they do, container runtimes (Podman, Docker) ignore it and
use hardcoded defaults. This makes root metadata non-deterministic,
causing composefs digest mismatches.

For detailed analysis of this issue, see:
  https://gist.github.com/cgwalters/c18c9337aa9345d763aa446cc95c7847
  containers/storage#743

Since the metadata of `/` is ill-defined in the OCI spec, we default
to copying it from `/usr` - no one should care if these differ, except
for SELinux labels which we handle separately in the boot path.

Add new APIs:
- copy_root_metadata_from_usr(): copies mode, uid, gid, mtime, xattrs
  from /usr to root directory
- read_container_root(): wrapper that calls read_filesystem() then
  copy_root_metadata_from_usr()

Remove the incomplete have_root_stat/ensure_root_stat() mechanism which
only handled mtime, not permissions or other metadata.

Update cfsctl commands (compute-id, create-image, create-dumpfile) to
use read_container_root() by default. Add --no-propagate-usr-to-root
flag to use raw read_filesystem() when root metadata is well-defined.

Assisted-by: OpenCode (Opus 4.5)
Signed-off-by: Colin Walters <walters@verbum.org>
cgwalters added a commit to cgwalters/composefs-rs that referenced this pull request Jan 14, 2026
OCI container layer tars often don't include a root directory entry,
and when they do, container runtimes (Podman, Docker) ignore it and
use hardcoded defaults. This makes root metadata non-deterministic,
causing composefs digest mismatches.

For detailed analysis of this issue, see:
  https://gist.github.com/cgwalters/c18c9337aa9345d763aa446cc95c7847
  containers/storage#743

Since the metadata of `/` is ill-defined in the OCI spec, we default
to copying it from `/usr` - no one should care if these differ, except
for SELinux labels which we handle separately in the boot path.

Add new APIs:
- copy_root_metadata_from_usr(): copies mode, uid, gid, mtime, xattrs
  from /usr to root directory
- read_container_root(): wrapper that calls read_filesystem() then
  copy_root_metadata_from_usr()

Remove the incomplete have_root_stat/ensure_root_stat() mechanism which
only handled mtime, not permissions or other metadata.

Update cfsctl commands (compute-id, create-image, create-dumpfile) to
use read_container_root() by default. Add --no-propagate-usr-to-root
flag to use raw read_filesystem() when root metadata is well-defined.

Assisted-by: OpenCode (Opus 4.5)
Signed-off-by: Colin Walters <walters@verbum.org>
cgwalters added a commit to composefs/composefs-rs that referenced this pull request Jan 15, 2026
OCI container layer tars often don't include a root directory entry,
and when they do, container runtimes (Podman, Docker) ignore it and
use hardcoded defaults. This makes root metadata non-deterministic,
causing composefs digest mismatches.

For detailed analysis of this issue, see:
  https://gist.github.com/cgwalters/c18c9337aa9345d763aa446cc95c7847
  containers/storage#743

Since the metadata of `/` is ill-defined in the OCI spec, we default
to copying it from `/usr` - no one should care if these differ, except
for SELinux labels which we handle separately in the boot path.

Add new APIs:
- copy_root_metadata_from_usr(): copies mode, uid, gid, mtime, xattrs
  from /usr to root directory
- read_container_root(): wrapper that calls read_filesystem() then
  copy_root_metadata_from_usr()

Remove the incomplete have_root_stat/ensure_root_stat() mechanism which
only handled mtime, not permissions or other metadata.

Update cfsctl commands (compute-id, create-image, create-dumpfile) to
use read_container_root() by default. Add --no-propagate-usr-to-root
flag to use raw read_filesystem() when root metadata is well-defined.

Assisted-by: OpenCode (Opus 4.5)
Signed-off-by: Colin Walters <walters@verbum.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants