Skip to content

Conversation

@gabriel-samfira
Copy link
Contributor

Images may be created with a VOLUME stanza pointed to drive letters that are not C:. Currently, an image that has such VOLUMEs defined, will cause containerd to error out when starting a container.

This change skips copying existing contents to volumes that are not C:. as an image can only hold files that are destined for the C: drive of a container.

Fixes: #8171

Tests will be added in a separate PR, as we also need to update the volume-copy-up image to include a D: volume.

@k8s-ci-robot
Copy link

Hi @gabriel-samfira. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@thaJeztah
Copy link
Member

Wondering if this is expected to be supported (but things may have changed), as I seem to recall "non-C" destinations were not supported?

@gabriel-samfira
Copy link
Contributor Author

Wondering if this is expected to be supported (but things may have changed), as I seem to recall "non-C" destinations were not supported?

This will work:

docker run --rm -v C:\test:D: img_tag

This will fail:

docker run --rm -v C:\test:D:\ img_tag

We can mount a folder as a different drive letter than C, but we cannot mount it as a subfolder of a non-C drive. It's also possible to mount folders from the hosts other non-C drives.

image

@thaJeztah
Copy link
Member

Ah! Yes, I recall now, a drive works, but not a directory to another (non-c) drive, as there is no drive present there to mount on; thanks!

@thaJeztah
Copy link
Member

I wonder (but I'm not very familiar with the Windows internals) if we could make such a scenario work by implicitly creating a drive before mounting the directory?

@TBBle
Copy link
Contributor

TBBle commented Apr 7, 2023

It would probably be possible, but need testing. We could definitely create an implicit volume and mkdir the path to the mount-point (and then another implicit volume, I guess?), i.e. implicitly VOLUME ["D:\some\path"] does the equivalent of VOLUME ["D:", "D:\some\path"] and then -v D:\hostpath:D:\some\path would probably work. (Assuming hcsshim and HCS itself doesn't reject that.)

Because this is implemented in front-ends, not containerd itself, we'd want to update Docker as well, to avoid surprises for users. Happily the builders don't try to filter this list, so it'd only be a containerd-cri/Docker change.

Docker already diverges in VOLUME implementation as it currently misses the 'copy existing files' functionality for VOLUME. I'm not sure if that can be fixed before Windows-containerd is done as I'm not sure if the limitation mentioned in that code was a HCSv1 API limitation or a HCS platform version limitation from LTSC 2016.

If we did that, would we also make to implement logic so that -v D:\hostpath:D:\some\path doesn't require VOLUME to prepare the mount-point?

So yeah, this sounds like a separate feature request to consider.

@gabriel-samfira
Copy link
Contributor Author

I can investigate as part of the buildkit work. But in the meantime this will fix the containers not starting issue.

@dcantah
Copy link
Member

dcantah commented Apr 7, 2023

cc @kevpar @kiashok

@gabriel-samfira
Copy link
Contributor Author

@estesp @kzys I think it's fairly safe to merge this when you have some time, if you don't see anything wrong with it.

@gabriel-samfira
Copy link
Contributor Author

CC @MikeZappa87

If you have some cycles, a review would be great 😄

Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment .. need to decide if we are trimming c: ignore case or skipping c: ignore case

continue
}
// The volume may have been defined with a C: prefix, which we can't use here.
volume = strings.TrimPrefix(volume, "C:")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the above new block overrides this trim by continuing to the next item in volumeMounts

c: || C: will continue on...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do a case insensitive check for that prefix and strip it. Good catch.

Copy link
Contributor Author

@gabriel-samfira gabriel-samfira May 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-worded the comments a bit and I replaced the strings.TrimPrefix() call with a simple slice expression removing the first 2 characters.

The code does a case insensitive comparison of the drive letter via strings.EqualFold(), and if we're not dealing with a volume under C: we continue with the next item in the volumeMounts.

I will add a new test after this merges that will make use of volumes mounted asD: and another volume under C: vut defined as lower case. But that implies updating the volume-copy-up image, so we'll have to merge this first, then build the image and then add the tests.

@gabriel-samfira gabriel-samfira requested a review from mikebrow May 11, 2023 17:02
if len(volume) >= 2 && string(volume[1]) == ":" {
// Perform a case insensitive comparison to "C", and skip non-C mounted volumes.
if !strings.EqualFold(string(volume[0]), "c") {
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a windows fan, is this something a user knows in advance? do we need to add a log?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a windows fan

That's quite alright. I am happy to provide some more context.

is this something a user knows in advance?

It's somewhat documented here: https://docs.docker.com/engine/reference/commandline/run/#volume

do we need to add a log?

TL;DR

I don't think a log message here would be helpful. There is nothing to correct and the user can't really do anything about the situation. I will explain bellow.

TS;NM

The code here tries to populate a VOLUME with pre-existing files from within an image. For any volume mounted somewhere outside the root filesystem, we can't really store files within an image. At least not without a spec change (I may be wrong, need to check).

On Linux, all volumes get mounted somewhere under /. So if we try to mount a docker volume, a block volume or a remote filesystem like NFS, all mounts end up somewhere under /. The DOS drive letters in Windows are like having multiple / on Linux, but only one of which is the actual boot-able filesystem. Windows allows us to mount filesystem inside folders, just like on Linux, but it also allows us to mount filesystems as different drive letters, which as I mentioned, act like a separate /.

So in essence, for pre-existing VOLUME contents we can easily support the case where volumes are mounted as folders (which is identical to Linux), but we can't support the case where volumes are mounted as different drive letters, as the image build process doesn't capture files that are not under C:.

For the case where there is a VOLUME mounted as a different drive letter, the target must always be a bare drive letter like: D: and not a sub-folder of that drive letter like: D:\ or D:\somepath. This is because we must first have a valid store for a drive letter and that drive letter set up, before we can mount anything under it. It's like first having to mount a FS under / on Linux before you can mount anything in /var.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for telling me why it is done.

I wasn't sure if users would be confused after skipping some volume copies and how we could make this easier for them to know. Of course if this is common knowledge in windows, then I guess my fears are superfluous 😁

Copy link
Contributor

@TBBle TBBle May 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also implicit from the VOLUME documentation, because the only way to have a non-C: mount on Windows during build-time is by calling VOLUME ["D:"] earlier (which will as a new volume, will be empty), and as noted, anything done to that volume after that step is discarded, same as Linux. (In the future, BuildKit will support RUN --mount which could also create non-C: volumes, but again, any changes would not be persisted to the image, consistent with Linux).

So from a user perspective, it's not so much that we skip the volume copy, as that there's no way for the user to have put anything there to be copied, same as if they used VOLUME ["C:\some\empty\dir"] or VOLUME ["/newdir"].

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would feel much better about this if this code path for stripping and/or skipping volumes was limited to windows build versions of containerd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made the change in a separate commit. PTAL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the Stat check should be sufficient otherwise..

On Windows, if we don't strip the drive letter, fs.RootPath() will try to join the root path on the host, to the volume defined by the user, and we end up with C:\ProgramData\containerd\....truncated....\D: which is an invalid path on Windows and throws an error. The os.Stat() check will fail with an error that does not match os.IsNotExist().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sigh.. I meant if all this was hidden under a .windows.go then I wouldn't have to worry about it :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have loved to hide this under a _windows.go, but Windows hosts can also create Linux containers 😅. This code path needs to be applied to Windows Containers On Windows (WCOW) and skipped on Linux Containers On Windows (LCOW).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WithVolumes change looks great, trying to think of any esoteric edge cases though..

Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definitely like this better.. :-) can you add some tests for the variations mentioned?

mountMap[filepath.Clean(v.HostPath)] = v.ContainerPath
}
opts = append(opts, customopts.WithVolumes(mountMap))
opts = append(opts, customopts.WithVolumes(mountMap, image.ImageSpec.OS))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may need other fields in the ImageSpec over time if variants change or this gets more complicated.. suppose passing in just OS is fine for now

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interface will be more consistent passing in the Platform object. We commonly use that type in our interfaces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmcgowan Changes made. PTAL.

Failures seem unrelated.

@gabriel-samfira
Copy link
Contributor Author

definitely like this better.. :-) can you add some tests for the variations mentioned?

Will add some tests shortly! Thank you for the review @mikebrow !

@gabriel-samfira
Copy link
Contributor Author

I created a PR to update the volume-copy-up images:

Once that merges and gets pushed to ghcr.io/containerd, I can change existing tests and add new ones to account for the scenarios discussed here, as part of this PR.

@mikebrow would you mind having a look at the image changes in case I missed a relevant test case?

@mikebrow
Copy link
Member

I created a PR to update the volume-copy-up images:

Once that merges and gets pushed to ghcr.io/containerd, I can change existing tests and add new ones to account for the scenarios discussed here, as part of this PR.

@mikebrow would you mind having a look at the image changes in case I missed a relevant test case?

reviewed looks good... just need to rebase this one to pick up the merged test

Images may be created with a VOLUME stanza pointed to drive letters that
are not C:. Currently, an image that has such VOLUMEs defined, will
cause containerd to error out when starting a container.

This change skips copying existing contents to volumes that are not C:.
as an image can only hold files that are destined for the C: drive of a
container.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Windows systems are capable of running both Windows Containers and Linux
containers. For windows containers we need to sanitize the volume path
and skip non-C volumes from the copy existing contents code path. Linux
containers running on Windows and Linux must not have the path sanitized
in any way.

Supplying the targetOS of the container allows us to proprely decide
when to activate that code path.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
@gabriel-samfira gabriel-samfira force-pushed the fix-non-c-volume branch 2 times, most recently from a7770fd to a4efcb0 Compare May 25, 2023 10:12
@gabriel-samfira
Copy link
Contributor Author

Tests were updated. Failures seem unrelated. PTAL.

Copy link
Contributor

@TBBle TBBle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits and ideas, but nothing blocking that leaps out at me. I didn't check the test results though.

@gabriel-samfira gabriel-samfira force-pushed the fix-non-c-volume branch 7 times, most recently from 63dff64 to c2d0a3f Compare May 26, 2023 06:54
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
@gabriel-samfira
Copy link
Contributor Author

@mikebrow kind reminder

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kzys
Copy link
Member

kzys commented Jun 9, 2023

Let me babysit some flaky tests and merge this PR...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

containerd trips over Dockerfile instruction VOLUME for Windows container image

9 participants