Skip to content

Support reading any/all images from docker-archive: tarballs#991

Merged
vrothberg merged 15 commits intocontainers:masterfrom
mtrmac:tarfile-read
Aug 20, 2020
Merged

Support reading any/all images from docker-archive: tarballs#991
vrothberg merged 15 commits intocontainers:masterfrom
mtrmac:tarfile-read

Conversation

@mtrmac
Copy link
Copy Markdown
Collaborator

@mtrmac mtrmac commented Jul 25, 2020

  • Add support for choosing an image to read from a docker-archive:, either by reference (using the existing syntax) or by index (docker-archive:/some/path:@0).
  • Add docker/archive.Reader that allows listing all images in a docker-archive-formatted file, and reading any/all of them, only copying/extracting the archive once.

See individual commit messages for details.

Copy link
Copy Markdown
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the direction and embedding the index into the reference!

// MultiImageSource manages a single Docker archive, allows listing its contents
// and accessing individual images with less overhead than creating image references
// individually (because the archive is, if necessary, copied or decompressed only once).
type MultiImageSource struct {
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not entirely happy about this name - it’s not an ImageSource. docker/archive.Reader? MultiReader? MultiImageReader?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying docker/archive.Reader — it’s not just for multi-image readers, a caller that wants to open a single-image archive, extract the tag from the manifest, and read the single image, would benefit as well.

)

// ReadableArchive is a ((docker save)-formatted) tar archive that allows random access to any component.
type ReadableArchive struct {
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name feels a bit weird, but so does internal/tarfile.Reader. I’m very much open to suggestions.

OTOH it’s internal, so it’s not as critical.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying tarfile.Reader, with variable name archive. The internal/tarfile.Reader/archive.Reader pair is somewhat symmetrical, but it might be too confusing; I’m not sure.

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Jul 28, 2020

@vrothberg This should now be code-complete (full transport tests, leading to fixes in reference parsing), but I still need to rebase to move some changes across commits. And it remains mostly untested, hoping Podman’s tests will handle the basics.

@mtrmac mtrmac force-pushed the tarfile-read branch 4 times, most recently from 4d46560 to dcfdda0 Compare July 29, 2020 17:34
@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Jul 29, 2020

@vrothberg Ready for review (but still basically untested).

@mtrmac mtrmac marked this pull request as ready for review July 29, 2020 17:37
@mtrmac mtrmac changed the title WIP: Support reading any/all images from docker-archive: tarballs Support reading any/all images from docker-archive: tarballs Jul 29, 2020
@vrothberg
Copy link
Copy Markdown
Member

Thanks, I'll test #998 and will comment there.

Copy link
Copy Markdown
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented Aug 11, 2020

Needs a rebase

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Aug 12, 2020

Rebased, now includes all of #998.

@vrothberg
Copy link
Copy Markdown
Member

@rhatdan ... merge me :)

Copy link
Copy Markdown
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add means to figure out if a given reference refers to a file? Ultimately, if it's a file then we have to create a Reader and query for the reference list.

Currently, I don't find any other way than checking if (ref).StringWithinTransport() is on disk or not. However, that's not ideal as it may very well be a typo or the file really doesn't exit.

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Aug 12, 2020

It wouldn’t be unprecedented… but I kind of think that the caller should know that in the first place. IIRC the only really ambiguous case is podman pull docker-archive:$path, where I’d argue “treat the transport:name syntax always as a single image” is closest to a consistent semantics.

Could we add means to figure out if a given reference refers to a file? Ultimately, if it's a file then we have to create a Reader and query for the reference list.

Currently, I don't find any other way than checking if (ref).StringWithinTransport() is on disk or not.

Well, if you want an ugly hack, the syntax is documented to split path:rest on a :. But I’d recommend against that.

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Aug 12, 2020

Added another commit, “Add docker/archive.NewReaderForReference”. DO NOT MERGE until it is verified it is useful for Podman.

@mtrmac mtrmac changed the title Support reading any/all images from docker-archive: tarballs DO NOT MERGE: Support reading any/all images from docker-archive: tarballs Aug 12, 2020
@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Aug 13, 2020

Also added Reader.ManifestTagsForReference to accommodate Podman’s handling of unqualified short names in the archives (which are only created by projectatomic/docker AFAIK). Ugh.

Copy link
Copy Markdown
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ... ready to merge from my side

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented Aug 19, 2020

Need @mtrmac to remove the "DO NOT Merge:" from the commit message.

@mtrmac mtrmac changed the title DO NOT MERGE: Support reading any/all images from docker-archive: tarballs Support reading any/all images from docker-archive: tarballs Aug 19, 2020
mtrmac added 15 commits August 19, 2020 21:23
It's a possible user error, not a supposedly-unreachable internal
error.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... so that we can add more functionality to it without making it
public API.

Keep the original docker/tarfile.Source as a compatibility shim.

Similarly, move docker/tarfile/src_test.go as well, to keep it
close to the implementation being tested.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…archive

... instead of the forwarding shims.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…rfile

This is a private API now, so we can just drop it; rename the
...With[System]Context variants to use the shorter names, and update
all callers.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... to eventually allow creating multiple Sources from a single
Reader (= a temporary file containing a seekable/uncompressed
copy of the archive).

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
At least docker/archive will need to deal with the two
optionally separately, so such a separate constructor must exist;
and the other callers are not much more complex if they separate
it as well.

This will also allow us to add reference lookup without having
to duplicate the API.

As a concession to the simpler callers, add a closeArchive parameter
(currently always true) that allows them not to worry about
the lifetime of the tarfile.Reader.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Every caller will need that data; this way it can be shared across
several consumers, and we don't need to synchronize access/creation
of the parsed data structure.

Should not change behavior, except that errors are now reported
earlier.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
We can drop the unused error return value now.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
We already accept the syntax for docker-archive: references,
now implement the lookup instead of warning and ignoring the value.

Implement the lookup in tarfile.Reader, not tarfile.Source,
because we will want to provide an API to obtain tags from a
Reader+Reference, without constructing a Source.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Add support for path:@index (e.g. path:@0, path:@1 ...) reference syntax
to docker-archive.

This will allow reading even untagged images from multi-image archives.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
with only two operations: Close(), and List() which
returns a set of ImageReference objects that allow accessing
the individual images.

For now, use of every reference triggers creation of a new
tarfile.Reader; that will be fixed momentarily.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
In archive.Reader, embed a reference to tarfile.Reader to
the created ImageReference objects, and use them in NewImageSource.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
To allow accessing the same image multiple times, if given
a (probably user-specified) ImageReference.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
to allow Podman default to localhost/$tag for unqualified $tag
values in docker-archive tarnballs.

(Note that only projectatomic/docker creates such tarballs;
almost no-one else should use this.)

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Aug 19, 2020

OK, rebased and “DO NOT MERGE” dropped (although I haven’t had time to fully review the Podman PR)

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented Aug 19, 2020

LGTM
Merge once it passes the tests.

Copy link
Copy Markdown
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks a lot for the great work and your support, @mtrmac !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants