Description
There are a few aspects of mount(8)'s behaviour when dealing with bind-mounts and remounting that are quite strange. I started looking at writing some patches to fix them, but given that these changes would affect every libmount user, it seemed prudent to ask the maintainer's opinion before writing patches.
In short, because of the way the old mount(2) API is implemented and the understandable-but-somewhat-naive way that libmount handles mount flags, `mount --bind -o ...`, `mount --bind -o remount,...`, and `mount -o remount,...` have confusing behaviour. I haven't yet tested whether the fsconfig-based libmount hooks behave the same way, but given that the implementation was designed to be compatible, it seems pretty likely that they do.
- `mount --bind -o rw` on a read-only mount will silently produce a read-only bind-mount. This applies to all other clearing flags (`exec`, `dev`, `suid`, etc.) because `libmount` will only try to do a `MS_BIND|MS_REMOUNT` reconfigure if there are any flags requested -- but the code doesn't account for the fact that the user might've requested a `MNT_INVERT` flag. Basically, it seems that this should be handled like `mount --bind -o remount,...`, where the current flags are read and then the options are applied to the existing flags. Or, if the intended behaviour is "I only want the requested flags, ignore the old flags", then `MS_BIND|MS_REMOUNT` needs to be done unconditionally.
  - This behaviour also means that `mount --bind -o ro` on a `nosuid` mount will clear the `nosuid` bit silently.
- For `mount --bind -o remount,...`, the way `libmount` treats atime flags (as though they were standard mount flags) leads to confusing behaviour. The current behaviour handles the most trivial case -- `mount --bind -o remount,atime` on a `noatime` mount works -- but the following cases do not:
  - `mount --bind -o remount,relatime` on a `noatime` mount doesn't work (it silently produces a `noatime` mount) because `libmount` will do `mount(MS_NOATIME|MS_RELATIME)`, but `MS_RELATIME` is actually ignored by the kernel if `MS_NOATIME` is set.
  - `mount --bind -o remount,nodiratime,norelatime` on a `strictatime` mount will produce a `nodiratime,relatime` mount. `norelatime` in this case should be replaced with `strictatime`, because a `MS_NODIRATIME` mount ends up implying `MS_RELATIME` internally.
  - `mount --bind -o remount,ro` of a `nodiratime,strictatime` mount produces a `nodiratime,relatime` mount because `MS_STRICTATIME` is not an actual mount flag shown in `statfs` and `/proc/self/mountinfo` -- so `libmount` ends up passing just `MS_NODIRATIME` as the "previous" mount flags, which the kernel then turns into `MS_NODIRATIME|MS_RELATIME`.
  - `mount --bind -o remount,strictatime` on a `nodiratime` mount will produce a `nodiratime` mount because `MS_STRICTATIME` doesn't clear `MS_NODIRATIME`. To be fair, this is arguably "expected" behaviour depending on what semantics you want.
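The atime cases above can be reproduced with a small model of how the kernel resolves the atime bits it is handed. This is only a sketch: the `MS_*` values are the ones from `<linux/mount.h>`, but `effective_atime` and its `previous` parameter are hypothetical names, and the logic is a simplification of the behaviour described in this issue rather than actual kernel code.

```python
# Flag values as defined in <linux/mount.h>.
MS_NOATIME = 1 << 10
MS_NODIRATIME = 1 << 11
MS_RELATIME = 1 << 21
MS_STRICTATIME = 1 << 24
ATIME_FLAGS = MS_NOATIME | MS_NODIRATIME | MS_RELATIME | MS_STRICTATIME


def effective_atime(ms_flags, previous="relatime"):
    """Hypothetical model: atime options a remount ends up with.

    `previous` is the mount's current atime setting, which is kept
    when the caller passes no atime flags at all.
    """
    if not ms_flags & ATIME_FLAGS:
        return previous
    if ms_flags & MS_STRICTATIME:
        mode = "strictatime"
    elif ms_flags & MS_NOATIME:
        mode = "noatime"    # MS_RELATIME is ignored once MS_NOATIME is set
    else:
        mode = "relatime"   # the default, whether or not MS_RELATIME was passed
    if ms_flags & MS_NODIRATIME:
        # Independent of the mode above; MS_STRICTATIME does not clear it.
        mode = "nodiratime," + mode
    return mode


# remount,relatime on a noatime mount: libmount keeps the old MS_NOATIME.
print(effective_atime(MS_NOATIME | MS_RELATIME))        # noatime
# remount,ro on nodiratime,strictatime: only MS_NODIRATIME is passed on.
print(effective_atime(MS_NODIRATIME))                   # nodiratime,relatime
# remount,strictatime on a nodiratime mount.
print(effective_atime(MS_NODIRATIME | MS_STRICTATIME))  # nodiratime,strictatime
```

The three mode bits behave like an enum with a fixed priority, which is why OR-ing a requested mode into the old flags gives surprising results.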
I don't know what the expected semantics of `mount --bind -o` are supposed to be, but if the idea is that you specify all of the flags (and the old mount flags are completely ignored), then you need to unconditionally call `mount(MS_BIND|MS_REMOUNT)` in order to ensure that you clear flags when requested. The downside is that this also means that `mount --bind` and `mount --bind -o` might have different behaviour (presumably the behaviour that `mount --bind` retains the old mount flags is something people want). In addition, you probably want to pass `MS_RELATIME` by default (if no atime flags are specified) in order to force the default kernel setting, rather than inheriting the old atime flag (to match the other mount flags).
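To make the difference between the current and the "unconditional remount" behaviour concrete, here is a minimal model of the decision. It is a sketch under stated assumptions: `OPTIONS`, `requested_flags`, `bind_with_options` and the `always_remount` switch are hypothetical names, not libmount API, and the model only tracks four non-atime flags.

```python
# Flag values as defined in <linux/mount.h>.
MS_RDONLY = 1 << 0
MS_NOSUID = 1 << 1
MS_NODEV = 1 << 2
MS_NOEXEC = 1 << 3

# Option name -> (flag bit, True if the option clears the bit).
OPTIONS = {
    "ro": (MS_RDONLY, False), "rw": (MS_RDONLY, True),
    "nosuid": (MS_NOSUID, False), "suid": (MS_NOSUID, True),
    "nodev": (MS_NODEV, False), "dev": (MS_NODEV, True),
    "noexec": (MS_NOEXEC, False), "exec": (MS_NOEXEC, True),
}


def requested_flags(options):
    """Fold -o options into a flag word, starting from zero."""
    flags = 0
    for opt in options:
        bit, inverted = OPTIONS[opt]
        flags = flags & ~bit if inverted else flags | bit
    return flags


def bind_with_options(source_flags, options, always_remount=False):
    """Hypothetical model: flags of the bind-mount that results."""
    flags = requested_flags(options)
    result = source_flags   # mount(MS_BIND) inherits the source's flags
    if flags or always_remount:
        result = flags      # mount(MS_BIND|MS_REMOUNT) replaces them
    return result


# -o rw on a read-only source: no bits are set, so no remount happens.
print(bind_with_options(MS_RDONLY, ["rw"]) == MS_RDONLY)          # True
# With an unconditional remount, the read-only bit is cleared.
print(bind_with_options(MS_RDONLY, ["rw"], always_remount=True))  # 0
# -o ro on a nosuid source: the remount drops nosuid.
print(bind_with_options(MS_NOSUID, ["ro"]) == MS_RDONLY)          # True
```

In this model a clearing option on its own is indistinguishable from "no options", which is the root of the first problem described above.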
For `mount --bind -o remount,...` the atime semantics should probably be something like:
- If `MS_RELATIME|MS_NOATIME` are not present in `/proc/self/mountinfo` or `statfs` then add `MS_STRICTATIME` to correctly handle the `strictatime` cases.
- If an atime flag (`strictatime`, `relatime`, `noatime`) is requested, clear all other atime flags from the "old" set. Currently the old atime flags are kept but this results in weird behaviour because some of the atime flags are technically an enum and so passing multiple values produces incorrect behaviour (most notably in the `MS_RELATIME|MS_NOATIME` case).
- `norelatime` should probably be converted to `strictatime` in some cases.
- The interaction of `atime`, `diratime`, `norelatime` and `nostrictatime` needs to be reconsidered. Does `atime` just mean "no `MS_NOATIME`" or should it also clear `MS_NODIRATIME`? What is `nostrictatime` supposed to mean? You cannot clear the `MS_STRICTATIME` flag as it is not a real flag -- should it instead mean `MS_RELATIME` (the default atime since 2009)?
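The first three rules above could be sketched roughly as follows. This is one possible reading, not a proposed patch: `merge_atime` is a hypothetical helper working on option names rather than `MS_*` bits, the handling of `norelatime` is an assumption about what "in some cases" should mean, and `atime` and `nostrictatime` are deliberately left unhandled because their meaning is the open question.

```python
ATIME_MODES = ("strictatime", "relatime", "noatime")


def merge_atime(old_options, requested):
    """Hypothetical helper: atime options to request on remount.

    `old_options` are the atime options read from mountinfo,
    `requested` are the -o options given by the user.
    """
    new = set(old_options) & {"relatime", "noatime", "nodiratime"}
    # Rule 1: strictatime is never listed, so infer it when neither
    # relatime nor noatime is present.
    if not new & {"relatime", "noatime"}:
        new.add("strictatime")
    for opt in requested:
        if opt in ATIME_MODES:
            # Rule 2: the modes form an enum, so drop the old mode.
            new -= set(ATIME_MODES)
            new.add(opt)
        elif opt == "norelatime":
            # Rule 3 (assumption): only meaningful if relatime is active.
            if "relatime" in new:
                new.discard("relatime")
                new.add("strictatime")
        elif opt == "nodiratime":
            new.add("nodiratime")
        elif opt == "diratime":
            new.discard("nodiratime")
        # Anything else (ro, atime, nostrictatime, ...) is left alone here.
    return new


# remount,relatime on a noatime mount
print(sorted(merge_atime(["noatime"], ["relatime"])))
# remount,nodiratime,norelatime on a strictatime mount
print(sorted(merge_atime([], ["nodiratime", "norelatime"])))
# remount,ro on a nodiratime,strictatime mount
print(sorted(merge_atime(["nodiratime"], ["ro"])))
```

Because the result always contains exactly one mode, the flag word built from it never mixes `MS_NOATIME` with `MS_RELATIME`.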
I discovered this while working on runc's mount handling code. The OCI runtime-spec requires us to mirror mount(8) semantics, and I then found that several bugs present in runc are also present in libmount, in slightly different forms.
I will write up a simple test script to help make it easier to understand the multitude of issues.