Skip to content

Conversation

@kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Nov 17, 2021

As reported in containers/podman#12205, in a case where read-only fuse (sshfs)
mount is used as a volume without specifying ro flag, the kernel fails
to remount it (when adding various flags such as nosuid and nodev),
returning EPERM.

Here's the relevant strace line:

[pid 333966] mount("/tmp/bats-run-PRVfWc/runc.RbNv8g/bundle/mnt", "/proc/self/fd/7", 0xc0001e9164, MS_NOSUID|MS_NODEV|MS_REMOUNT|MS_BIND|MS_REC, NULL) = -1 EPERM (Operation not permitted)

I was not able to reproduce it with other read-only mounts as the source
(tried tmpfs, read-only bind mount, and an ext2 mount), so somehow this
might be specific to fuse.

The fix is to check whether the source has RDONLY flag, and retry the
remount with this flag added.

A test case (which was kind of hard to write) is added, and it fails
without the fix:

not ok 80 runc run [bind mount of a read-only sshfs mount]
# (in test file tests/integration/mounts.bats, line 78)
#   `[ "$status" -eq 0 ]' failed
# runc spec --rootless (status=0):
# 
# runc run test_busybox (status=1):
# time="2021-11-17T20:58:16Z" level=warning msg="unable to get oom kill count" error="no directory specified for memory.oom_control"
# time="2021-11-17T20:58:16Z" level=error msg="runc run failed: unable to start container process: error during container init: error mounting \"/tmp/bats-run-5470/runc.OkQNok/bundle/mnt\" to rootfs at \"/mnt\": mount /tmp/bats-run-5470/runc.OkQNok/bundle/mnt:/mnt (via /proc/self/fd/7), flags: 0x5026: operation not permitted"

Note that rootless user need to be able to ssh to rootless@localhost
in order to sshfs to work -- amend setup scripts to make it work.

1.0 backport: #3292

@kolyshkin kolyshkin force-pushed the rootless-ro-bind-rw branch 7 times, most recently from 08915b8 to dd466b3 Compare November 18, 2021 00:12
@kolyshkin kolyshkin changed the title WIP test Fix failure with rw bind mount of a ro fuse Nov 18, 2021
@kolyshkin kolyshkin added backport/1.0-todo A PR in main branch which needs to be backported to release-1.0 kind/bug labels Nov 18, 2021
@kolyshkin kolyshkin added this to the 1.1.0 milestone Nov 18, 2021
@kolyshkin kolyshkin marked this pull request as ready for review November 18, 2021 03:35
[ "$status" -eq 0 ]
}

@test "runc run [rw bind mount of a ro fuse sshfs mount]" {
Copy link
Member

@AkihiroSuda AkihiroSuda Nov 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this into mounts_sshfs.bats, as this has huge extra depenedency?

Or maybe just skip the sshfs test by default and require some env var to opt-in to the sshfs test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. If sshfs is not installed or not working for some reason, the test is skipped.

return mount(source, m.Destination, procfd, m.Device, uintptr(m.Flags|unix.MS_REMOUNT), "")
flags := uintptr(m.Flags | unix.MS_REMOUNT)
err := mount(source, m.Destination, procfd, m.Device, flags, "")
if err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rather check errors.Is(err, syscall.EPERM)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A different kernel might return a different error, and if the original mount is not read-only, we still return that error, and if it is, a retry with added RO flag will most probably return the very same error.

In other words, the error path overhead of a statfs() and (maybe) a mount() is pretty small, and the error returned is most probably the same, so it makes sense to retry on any error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In short, it doesn't hurt to retry even if there was an error other than EPERM.

@kolyshkin kolyshkin force-pushed the rootless-ro-bind-rw branch 2 times, most recently from d5bf4e0 to 20e15a2 Compare November 18, 2021 16:38
As reported in [1], in a case where read-only fuse (sshfs) mount
is used as a volume without specifying ro flag, the kernel fails
to remount it (when adding various flags such as nosuid and nodev),
returning EPERM.

Here's the relevant strace line:

> [pid 333966] mount("/tmp/bats-run-PRVfWc/runc.RbNv8g/bundle/mnt", "/proc/self/fd/7", 0xc0001e9164, MS_NOSUID|MS_NODEV|MS_REMOUNT|MS_BIND|MS_REC, NULL) = -1 EPERM (Operation not permitted)

I was not able to reproduce it with other read-only mounts as the source
(tried tmpfs, read-only bind mount, and an ext2 mount), so somehow this
might be specific to fuse.

The fix is to check whether the source has RDONLY flag, and retry the
remount with this flag added.

A test case (which was kind of hard to write) is added, and it fails
without the fix. Note that rootless user need to be able to ssh to
rootless@localhost in order to sshfs to work -- amend setup scripts
to make it work, and skip the test if the setup is not working.

[1] containers/podman#12205

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin
Copy link
Contributor Author

@thaJeztah PTAL

@kolyshkin
Copy link
Contributor Author

This fixes a real bug and we need to backport it to 1.0.3. Can this be merged? PTAL @opencontainers/runc-maintainers

@thaJeztah
Copy link
Member

Ahm, sorry, I started looking, but got pulled away

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah thaJeztah merged commit 02fe5f4 into opencontainers:master Nov 22, 2021
if err == nil {
return nil
}
// Check if the source has ro flag...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meta question: Can this be solved by requiring the entity configuring the mounts to set the necessary flags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it's not clear who should handle this (e.g. podman thinks it shouldn't, see containers/podman#12205) , and handling it in runc is kind of trivial. Besides, crun already does this, and this makes it hard to convince upper layers should do this.

@kolyshkin kolyshkin added backport/1.0-done A PR in main branch which has been backported to release-1.0 and removed backport/1.0-todo A PR in main branch which needs to be backported to release-1.0 labels Nov 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport/1.0-done A PR in main branch which has been backported to release-1.0 kind/bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants