WIP - support multi-image docker archives#975
WIP - support multi-image docker archives#975vrothberg wants to merge 1 commit intocontainers:masterfrom
Conversation
|
libpod PR to illustrate how I imagine it to be used -> containers/podman#6811 |
mtrmac
left a comment
There was a problem hiding this comment.
Just a very quick first pass for now.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
@mtrmac, mind taking another look? I've been starring at it too long. |
mtrmac
left a comment
There was a problem hiding this comment.
From a quick look, this still seems to lean rather towards inheritance, almost towards monkey-patching rather than restructuring; most importantly I think we should not need two more ImageReference implementations.
I was thinking:
- For sources:
- Split the part of
tarfile.Sourcethat contains archive-wide state (tarPath,removeTarPathOnClose, maybe some of the layer info) into adocker/internal/tarfile.ReadableArchiveor so (no need to make this public? I might have to think about the public/private structure more. The name can certainly be better.) - Keep the existing public
tarfile.Sourceworking, on top of that, but also add a way to create it with a caller-suppliedReadableArchive(in which casetarfile.Sourcedoes not drive deleting temporary files). - Add a
*ReadableArchivetoarchiveReference; inarchive.newImageSource, ifref.readableArchiveis set, use it to create atarfile.Source; if not, use the old code path. - Add support for lookup by tag and ID/index/something to
archiveReference; it automatically applies both to the single-image and multi-image case. - Then make
MultiImageSource, which creates aReadableArchiveand provides an API to enumerate / create references.
- Split the part of
And similarly for destinations.
|
|
||
| // Reference returns an ImageReference embedding the MultiImageDestination. | ||
| func (m *MultiImageDestination) Reference() types.ImageReference { | ||
| ref := &archiveReference{path: m.path} |
There was a problem hiding this comment.
This does not include the tag
There was a problem hiding this comment.
Tags are set via AdditionalTags.
https://github.com/containers/libpod/pull/6811/files#diff-3303ff4a5ad91328550c7bee8df0be69R259
There was a problem hiding this comment.
Yes, but not on the reference = not in copy.Image error messages.
There was a problem hiding this comment.
OTOH there is still the case of saving untagged images, so references don’t always have any extra data anyway.
I’ll read all of this more carefully a bit later.
There was a problem hiding this comment.
Much appreciated, thanks!
I am on PTO tomorrow but would will go back to this on Monday morning. Ideally, we need to get the feature in next week.
There was a problem hiding this comment.
Can we please proceed?
There was a problem hiding this comment.
I’m sorry for not getting back, I did promise I would.
Still, #975 (review) was I think fairly clear to the general direction. See #991 for an unfinished prototype of the read side. Yes, the PR is larger, but it already includes the ability to read any (even untagged) image in an archive using a textual reference, and a lot of the “new” code is actually only moved — tarfile.Source was just split into two, the only non-trivial net new code at that layer is just chooseManifest.
There was a problem hiding this comment.
Thanks! I am still not sure how to proceed.
Now we have two PRs and I am worried we are running out of time; in fact, we might not get it into RHEL 8.3 any more.
If you agree, I want to focus on Podman-only needs first. We can still make follow-up cards for a more generally applicable solution, in case that will buy us some time. Certainly, the API shouldn't break.
| // Reset the repoTags to prevent them from leaking into a following | ||
| // image/manifest. | ||
| d.repoTags = []reference.NamedTagged{} | ||
| return nil |
There was a problem hiding this comment.
This is still stateful, then…
mtrmac
left a comment
There was a problem hiding this comment.
More detailed comments after a careful read, finally. Conceptually, I’d still prefer for the code paths to be shared as much as possible, instead of a layer on top that partially overrides behavior like the destination here.
| return s.loadTarManifest() | ||
| if s.manifests != nil { | ||
| return s.manifests, nil | ||
| } |
There was a problem hiding this comment.
This is somewhat contra to the documentation of the function.
| // MultiImageSourceItem is a reference to _one_ image in a multi-image archive. | ||
| // Note that MultiImageSourceItem implements types.ImageReference. It's a | ||
| // long-lived object that can only be closed via it's parent MultiImageSource. | ||
| type MultiImageSourceItem struct { |
There was a problem hiding this comment.
Conceptually, I’m concerned about having two ImageReference implementations with different behavior but the same string syntax. ref.Transport().ParseReference(ref.StringWithinTransport()) is documented to be “equivalent to” ref; that’s easiest to do when there is just one implementation, and not the case here (with a docker-archive:$path-formatted references that successfully access images in multi-image archives, but fail when used from the CLI).
| } | ||
|
|
||
| // Manifest returns the tarfile.ManifestItem. | ||
| func (m *MultiImageSourceItem) Manifest() (*tarfile.ManifestItem, error) { |
There was a problem hiding this comment.
Nothing but RepoTags is really usable by callers, see e.g. the use of Config in containers/podman#6811 ; so I’m a bit reluctant to commit to this as an API. OTOH, we clearly can keep this stable.
| // MultiImageDestinations allows for creating and writing to docker archives | ||
| // that include more than one image. | ||
| type MultiImageDestination struct { | ||
| *archiveImageDestination |
There was a problem hiding this comment.
Doesn’t this externally expose PutBlob and everything else?
|
|
||
| // Close is a NOP. Please use Finalize() for committing the archive and | ||
| // closing the underlying resources. | ||
| func (m *MultiImageDestination) Close() error { |
There was a problem hiding this comment.
Close() is conceptually a bit different from Finalize() (or ImageDestination.Commit); it allows cleaning up the temporary files even on error.
Sure, this is a possible way to structure the API, but it’s a bit inconvenient to use: A typically caller will use something like defer multiDest.Close() and might not even check for errors if there is already a failure with a more important root cause to preserve, whereas on success the caller really wants to check that multiDest.Finalize() did succeed. Having “commit” and “deallocate” be the same operation forces every such caller to have a committed flag that’s checked inside the defer, or to have a critical part of the process in a defer.
| // Reset the repoTags to prevent them from leaking into a following | ||
| // image/manifest. | ||
| d.repoTags = []reference.NamedTagged{} | ||
| return nil |
There was a problem hiding this comment.
… it’s also not documented that the caller is supposed to use (only) AddRepoTags after each PutManifest AFAICS.
|
Closing as @mtrmac took over. Thanks again! |
Add a
MultiImageArchive{Reader,Writer}todocker/archiveto supportdocker archives with more than one image.
To allow the new archive reader/writer to be used for copying images,
add an
Image{Destination,Source}tocopy.Options. When set, thedestination/source referenced will be ignored and the specified
Image{Destination,Source}will be used instead.Signed-off-by: Valentin Rothberg rothberg@redhat.com