
tracker for "physically bound" containers #644

@cgwalters

Description


Splitting this out from #128 and also from CentOS/centos-bootc#282

What we want to support is an opinionated way to "physically" embed (app) containers inside a (bootc) container.

From the UX point of view, a key property is that there is exactly one container image, which keeps the problem domain of versioning/mirroring simple.

There are approaches like

```dockerfile
FROM quay.io/centos-bootc/centos-bootc:stream9
RUN podman --storage-driver=vfs --root=/usr/share/containers/storage pull <someimage>
COPY somecontainer.container /usr/share/containers/systemd
```

as one implementation path. There are a lot of sub-issues here around nested overlayfs and whiteouts.
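For context, the `somecontainer.container` file copied into `/usr/share/containers/systemd` above is a Quadlet unit that systemd turns into a service. A minimal sketch (the description, image name, and target are placeholders, not from this issue) might look like:

```
[Unit]
Description=Example physically-bound app

[Container]
Image=quay.io/example/app:latest

[Install]
WantedBy=multi-user.target
```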

I personally think what would work best here is a model with an intelligent build process. Basically, we should support a flow that takes the underlying layers (tarballs) of the embedded image, renames all the files to be prefixed with /usr/share/containers/storage/overlay or so, and then adds a final layer with all the metadata. This would help ensure that we never re-pull unchanged layers, even for "physically" bound images.
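The renaming step could be sketched with GNU tar's `--transform`, which rewrites member names as the archive is created. Everything here is illustrative: `BLOBID` and the file contents are placeholders (a real build process would use the layer's actual blob digest and preserve ownership/xattrs), and this only shows the re-rooting idea, not a full implementation:

```shell
set -eu

# Hypothetical layer blob ID; a real flow would compute the layer digest.
BLOBID=0123456789abcdef

# Stand-in for an unpacked layer of the embedded image.
mkdir -p layer/etc
echo "hello" > layer/etc/example.conf

# Re-root every member name under podman's overlay storage path for this blob.
tar -cf layer-rerooted.tar \
    --transform "s,^,usr/share/containers/storage/overlay/${BLOBID}/," \
    -C layer etc

# The members now live under the storage prefix.
tar -tf layer-rerooted.tar
```

The point is that the file *contents* of each embedded layer pass through unchanged; only the paths move, so an unchanged app layer produces an unchanged bootc layer.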

In other words, the layer stack would look like:

```
[base image layer 1]
[base image layer 2]
...
[embedded content layer 1, with all included files renamed to be prefixed with /usr/share/containers/storage/overlay/<blobid>]
...
[embedded layer with everything else in /usr/share/containers/storage *except* the layers]
...
```

The big difference from `RUN podman --root ... pull` is that the latter inherently results in a single "physical" layer in the bootc image, even if the input container image has multiple layers.

A reason I argue for this is that `RUN podman pull` (without forcing options like `podman build --timestamp`) is inherently going to be highly subject to "timestamp churn" in the random JSON files that podman creates. That means every time the base image changes, the client has to re-download these "physically embedded" images, even if they didn't logically change. Of course, there are still outstanding bugs like containers/buildah#5592 that defeat layer caching in general.
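The "timestamp churn" problem can be demonstrated without podman at all: two tarballs with byte-identical file content but different mtimes hash differently, so a caching system keyed on the layer digest sees a "new" layer on every rebuild. Pinning the timestamps (conceptually what `podman build --timestamp` does) restores reproducibility. The filenames and dates below are arbitrary placeholders:

```shell
set -eu

# A stand-in for one of the JSON metadata files podman writes into storage.
mkdir -p d
echo '{"config":true}' > d/meta.json

# Same content, two different mtimes -> two different archives.
touch -d '2024-01-01 00:00:00 UTC' d/meta.json
tar --sort=name -cf a.tar -C d meta.json
touch -d '2024-01-02 00:00:00 UTC' d/meta.json
tar --sort=name -cf b.tar -C d meta.json
sha256sum a.tar b.tar   # digests differ: the layer cache is defeated

# Clamping every member mtime to a fixed value makes the output reproducible.
tar --sort=name --mtime='@0' -cf a0.tar -C d meta.json
tar --sort=name --mtime='@0' -cf b0.tar -C d meta.json
sha256sum a0.tar b0.tar # digests match: unchanged content re-uses the cache
```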

However, note that this model "squashes" all the layers of the app images into one layer in the base image. So on the network, if e.g. the base image used by an app changes, it forces a re-fetch of the entire app (all its layers), even if some of the app layers didn't change.

In other words, IMO this model breaks some of the advantages of the content-addressed storage in OCI by default. We'd need deltas to mitigate.

(For people using ostree-on-the-network for the host today, this is mitigated because ostree always behaves similarly to zstd:chunked and has static deltas; but I think we want to make this work with OCI)

Longer term though, IMO this approach clashes with the direction I think we need to take for e.g. configmaps: we really will need to get into the business of managing more than just one bootable container image.

Metadata

Labels: area/client (Related to the client/CLI), area/install (Issues related to `bootc install`), enhancement (New feature or request)
