
images: fix segfault when mounting without cap_sys_admin#25244

Merged
openshift-merge-bot[bot] merged 1 commit into containers:main from giuseppe:mount-fix-segfault
Feb 7, 2025

Conversation

@giuseppe (Member) commented Feb 6, 2025

commit c6fe5e5 rearranged the code so that it accesses the store before checking if the current process has CAP_SYS_ADMIN.

Restore the check and augment it to also check for CAP_SYS_ADMIN.

Closes: #25241

Does this PR introduce a user-facing change?

None
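A minimal sketch of the ordering fix the PR describes: verify mount permission before touching the storage runtime, so a nil runtime is never dereferenced. All types and names below are illustrative stand-ins, not podman's real API; the real code gates on os.Geteuid() and unshare.HasCapSysAdmin().

```go
package main

import (
	"errors"
	"fmt"
)

// imageEngine is a hypothetical stand-in for podman's image engine; the
// real engine exposes the storage runtime via LibimageRuntime().
type imageEngine struct {
	runtime *struct{} // nil when the store was never initialized
}

// mount checks permissions first (euid 0, or CAP_SYS_ADMIN for rootless)
// and only then touches the runtime. The buggy ordering accessed the
// store before this check, dereferencing a nil runtime and panicking.
func (e *imageEngine) mount(euid int, hasCapSysAdmin func() (bool, error)) error {
	if euid != 0 {
		ok, err := hasCapSysAdmin()
		if err != nil {
			return err
		}
		if !ok {
			return errors.New("cannot mount in rootless mode, must execute `podman unshare` first")
		}
	}
	if e.runtime == nil {
		return errors.New("image runtime not initialized")
	}
	// ... proceed with the mount using e.runtime ...
	return nil
}

func main() {
	e := &imageEngine{runtime: nil}
	// Rootless without CAP_SYS_ADMIN: a plain error instead of a segfault.
	fmt.Println(e.mount(1000, func() (bool, error) { return false, nil }))
}
```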

@giuseppe giuseppe added No New Tests Allow PR to proceed without adding regression tests and removed release-note-none labels Feb 6, 2025
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 6, 2025
@giuseppe (Member, Author) commented Feb 6, 2025

this should be made more robust, making sure we never access LibimageRuntime() if it is nil; but for now, just restore the previous behavior (and account for the CAP_SYS_ADMIN check that was added in the meanwhile) so that we don't segfault

@Luap99 (Member) left a comment


Why do we duplicate this complex reexec logic here? No other command needs this, so why should mount need it?
I guess the goal is/was to show rootless users the cannot run command %q in rootless mode, must execute podman unshare first message, so they know they only see the mount point in the unshare environment.

So if that is the case, I think we should just drop that code and instead return an error if we are not already in the user namespace, as this is what the code tries to do today.

Also, you are missing ContainerMount(), which used the same broken code.

@giuseppe (Member, Author) commented Feb 6, 2025

> Why do we duplicate this complex reexec logic here? No other command needs this, so why should mount need it? I guess the goal is/was to show rootless users the cannot run command %q in rootless mode, must execute podman unshare first message, so they know they only see the mount point in the unshare environment.
>
> So if that is the case, I think we should just drop that code and instead return an error if we are not already in the user namespace, as this is what the code tries to do today.
>
> Also, you are missing ContainerMount(), which used the same broken code.

we have this ugly hack because we still want mount to succeed for vfs

@giuseppe (Member, Author) commented Feb 6, 2025

the reason for this is that "mount" doesn't work from the host; users get confused because they need to run podman unshare first.

With vfs, instead, it doesn't matter, because the "mount" point is accessible from the host since there is no actual mount
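The driver distinction giuseppe describes could be expressed as a trivial predicate; the function name and driver strings below are illustrative, not podman's actual code.

```go
package main

import "fmt"

// needsUserNamespace sketches the special case above: with drivers like
// overlay the rootless mount exists only inside the user+mount namespace,
// so the user must run inside `podman unshare`; with vfs the "mount
// point" is just a directory in the store, visible from the host, so no
// namespace is required.
func needsUserNamespace(storageDriver string) bool {
	return storageDriver != "vfs"
}

func main() {
	for _, d := range []string{"overlay", "vfs"} {
		fmt.Printf("driver=%s needs userns=%v\n", d, needsUserNamespace(d))
	}
}
```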

commit c6fe5e5 rearranged the code so
that it accesses the store before checking if the current process has
CAP_SYS_ADMIN.

Restore the check and augment it to also check for CAP_SYS_ADMIN.

Closes: containers#25241

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@giuseppe (Member, Author) commented Feb 6, 2025

> Also, you are missing ContainerMount(), which used the same broken code.

thanks, fixed now

@baude (Member) commented Feb 6, 2025

lgtm

@Luap99 (Member) commented Feb 6, 2025

Is it really that important to special-case a storage driver there? I mean, I agree that we should not just break that, for backwards-compatibility reasons, but maybe we should consider removing this when we have the chance, or rework it in a way where we do not duplicate this fragile logic.


```go
func (ic *ContainerEngine) ContainerMount(ctx context.Context, nameOrIDs []string, options entities.ContainerMountOptions) ([]*entities.ContainerMountReport, error) {
	if os.Geteuid() != 0 {
		hasCapSysAdmin, err := unshare.HasCapSysAdmin()
```
@Luap99 (Member):

Can this error happen without something critical going wrong?

@giuseppe (Member, Author):

we have this function called in different places, and the result is cached. So if it fails, even if we ignore it here, it will be caught somewhere else
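The call-once caching giuseppe refers to can be sketched with sync.Once; the real caching lives in c/storage's pkg/unshare and may differ in detail, and probe here is a stand-in for the actual capability read.

```go
package main

import (
	"fmt"
	"sync"
)

// Cached result of the capability probe: it runs once, and every later
// caller reuses the stored (value, error) pair, so an error ignored at
// one call site still surfaces at the next one.
var (
	once   sync.Once
	hasCap bool
	capErr error
)

// hasCapSysAdmin runs probe at most once and caches its result.
func hasCapSysAdmin(probe func() (bool, error)) (bool, error) {
	once.Do(func() {
		hasCap, capErr = probe()
	})
	return hasCap, capErr
}

func main() {
	calls := 0
	probe := func() (bool, error) { calls++; return true, nil }
	hasCapSysAdmin(probe)
	hasCapSysAdmin(probe)
	fmt.Println("probe ran", calls, "time(s)") // prints "probe ran 1 time(s)"
}
```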

@giuseppe (Member, Author) commented Feb 7, 2025

> Is it really that important to special-case a storage driver there? I mean, I agree that we should not just break that, for backwards-compatibility reasons, but maybe we should consider removing this when we have the chance, or rework it in a way where we do not duplicate this fragile logic.

I don't think it is such an important case, except that the error is helpful when users don't see the mount after running podman (container|image) mount because they are not in the podman unshare environment.

@Luap99 (Member) commented Feb 7, 2025

> except the error is helpful if users don't see the mount after running podman (container|image) mount

Yes, we can keep that, but it is unrelated to duplicating the reexec.
If we just want to error, all we need to do is check that _CONTAINERS_USERNS_CONFIGURED is not set to done, and error out instead of doing the reexec.
Looking at this, I am not even sure how we would ever reach the vfs case; the special error is always set here and overwrites the main command:

podman/cmd/podman/main.go

Lines 98 to 105 in 2cbb5fe

```go
_, found := c.Command.Annotations[registry.UnshareNSRequired]
if found {
	if rootless.IsRootless() && os.Getuid() != 0 {
		c.Command.RunE = func(cmd *cobra.Command, args []string) error {
			return fmt.Errorf("cannot run command %q in rootless mode, must execute `podman unshare` first", cmd.CommandPath())
		}
	}
}
```

@Luap99 (Member) left a comment

anyway let's get this merged, it fixes the panic.
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 7, 2025
@openshift-ci bot (Contributor) commented Feb 7, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe, Luap99

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 24b686e into containers:main Feb 7, 2025
81 of 82 checks passed
@stale-locking-app stale-locking-app bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label May 9, 2025
@stale-locking-app stale-locking-app bot locked as resolved and limited conversation to collaborators May 9, 2025

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.
locked - please file new issue/PR: Assist humans wanting to comment on an old issue or PR with locked comments.
No New Tests: Allow PR to proceed without adding regression tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

mount: null deref when not running with a userns

4 participants