conf: create separate peer group for container's root #4229
Merged
stgraber merged 3 commits into lxc:master on Nov 29, 2022
68485b0 to 86cd616
Finally, we turn the rootfs into a shared mount. Note that this doesn't reestablish mount propagation with the host's mount namespace. Instead, we'll create a new peer group.

We're doing this because most workloads rely on the rootfs being a shared mount. For example, systemd daemons like systemd-udevd run in their own mount namespace. Their mount namespace has been made a dependent mount (MS_SLAVE) with the host rootfs as its dominating mount. This means new mounts on the host propagate into the respective services.

This is broken if we leave the container's rootfs a dependent mount. In that case, both the container's rootfs and the service's rootfs will be dependent mounts with the host's rootfs as their dominating mount. So if you were to mount over the rootfs from the host, it would not just propagate into the container's mount namespace, it would also propagate into the service. That's nonsense semantics for nearly all relevant use-cases. Instead, establish the container's rootfs as a separate peer group, mirroring the behavior on the host.

Signed-off-by: Christian Brauner (Microsoft) <christian.brauner@ubuntu.com>
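The difference between a dependent mount and a member of a peer group, as discussed in the commit message above, is visible in the optional fields of `/proc/<pid>/mountinfo`: `shared:N` marks membership in peer group N, and `master:N` marks a mount that receives propagation from peer group N. The following is a minimal sketch for inspecting those fields, not code from this pull request; the function name `propagation` and the sample line are made up for illustration.

```python
def propagation(mountinfo_line):
    """Classify one /proc/<pid>/mountinfo line by its optional fields.

    Hypothetical helper for illustration; not part of LXC.
    """
    fields = mountinfo_line.split()
    # Six fixed fields come first; optional fields follow until the "-" separator.
    sep = fields.index("-", 6)
    result = {"shared": None, "master": None, "unbindable": False}
    for tag in fields[6:sep]:
        key, _, value = tag.partition(":")
        if key in ("shared", "master"):
            result[key] = int(value)  # peer group ID
        elif key == "unbindable":
            result["unbindable"] = True
    return result


if __name__ == "__main__":
    # Made-up sample line: a dependent mount whose master is peer group 1.
    print(propagation("36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw"))
```

Applying the helper to each line of `/proc/self/mountinfo` inside a container would be one way to check whether the rootfs carries a `shared:` tag of its own.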
Jenkins: test this please
86cd616 to 4f5e5cc
jenkins: test this please
Signed-off-by: Christian Brauner (Microsoft) <christian.brauner@ubuntu.com>
4f5e5cc to 01ae6d4
Signed-off-by: Christian Brauner (Microsoft) <christian.brauner@ubuntu.com>
Testsuite passed
cmatsuoka added a commit to cmatsuoka/craft-parts that referenced this pull request Mar 9, 2023
Address shared mount issues affecting /dev mount in chroots. This is a result of lxc/lxc#4229 (the container rootfs became a shared mount, meaning that unmounts propagate through the shared group and the original mounts are unmounted too). See canonical/rockcraft#195 for details.

Signed-off-by: Claudio Matsuoka <claudio.matsuoka@canonical.com>
cmatsuoka added a commit to canonical/craft-parts that referenced this pull request Mar 9, 2023
Address shared mount issues affecting /dev mount in chroots. This is a result of lxc/lxc#4229 (the container rootfs became a shared mount, meaning that unmounts propagate through the shared group and the original mounts are unmounted too). See canonical/rockcraft#195 for details.

Signed-off-by: Claudio Matsuoka <claudio.matsuoka@canonical.com>
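To illustrate the problem described in the commit message above: a recursive bind mount of a shared `/dev` joins the source's peer group, so unmounting it inside the chroot also unmounts the original. One general way to avoid this is to mark the bind mount recursively private right after creating it. The sketch below only builds the `mount` command lines (both `--rbind` and `--make-rprivate` are util-linux `mount` options); the function name and paths are hypothetical, and this is not claimed to be the change made in craft-parts.

```python
def chroot_bind_mount_commands(source, target):
    """Return argv lists for a recursive bind mount that is then detached
    from mount propagation. Hypothetical helper for illustration only."""
    return [
        # Recursive bind mount; the new mounts inherit the source's peer group.
        ["mount", "--rbind", source, target],
        # Make the new mounts private so later unmounts do not propagate back.
        ["mount", "--make-rprivate", target],
    ]


if __name__ == "__main__":
    for argv in chroot_bind_mount_commands("/dev", "/tmp/chroot/dev"):
        print(" ".join(argv))
```

The commands are returned rather than executed because running them requires root; they could be passed to `subprocess.run(argv, check=True)` in a privileged context.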
mihalicyn added a commit to mihalicyn/lxc that referenced this pull request Mar 31, 2023
Long story behind this.

Many years ago, Stéphane Graber discovered an issue with AppArmor mount rules. Since commit lxc@7f2b132 ("apparmor: Update mount states handling"), changing mount propagation flags has been prohibited, because adding rules that allow changing mount propagation gives a user inside the container the ability to mount everything [1].

With modern systemd versions this problem has become more critical than before. For instance, ArchLinux containers fail to start without the nesting AppArmor profile enabled (because the nesting profile effectively just allows all mounts). Of course, that's a security issue.

We've also enabled sharing on the container rootfs: lxc#4229. Now many workloads need to change the propagation flag to private (see canonical/craft-parts#400).

Issue:

    $ lxc-start -F archlinux-test
    systemd 253-1-arch running in system mode (+PAM +AUDIT -SELINUX -APPARMOR -IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +BPF_FRAMEWORK +XKBCOMMON +UTMP -SYSVINIT default-hierarchy=unified)
    Detected virtualization lxc.
    Detected architecture x86-64.

    Welcome to Arch Linux!

    bpf-lsm: BPF LSM hook not enabled in the kernel, BPF LSM not supported
    Failed to remount root directory as MS_SLAVE: Permission denied
    (sd-gens) failed with exit status 1.
    [!!!!!!] Failed to start up manager.
    Exiting PID 1...

Workaround (unsafe):

    $ lxc-start -s lxc.apparmor.allow_nesting=1 -s lxc.apparmor.profile=generated -F arch-test

John Johansen (AppArmor maintainer) and the LXD team worked on a fix [2]. It has already been merged into the stable AppArmor 3.0 and 3.1 branches. There is no stable AppArmor version tag for it yet, but I think it will be in AppArmor version 3.0.10.

See also:
[1] https://bugs.launchpad.net/apparmor/+bug/1597017
[2] https://gitlab.com/apparmor/apparmor/-/merge_requests/333

Fixes: lxc#4280

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>