Skip to content

docker-archive: read+write#998

Closed
mtrmac wants to merge 37 commits intocontainers:masterfrom
mtrmac:tarfile-integration
Closed

docker-archive: read+write#998
mtrmac wants to merge 37 commits intocontainers:masterfrom
mtrmac:tarfile-integration

Conversation

@mtrmac
Copy link
Copy Markdown
Collaborator

@mtrmac mtrmac commented Jul 29, 2020

This combines #991 and #996.

@vrothberg
Copy link
Copy Markdown
Member

So far it's looking good. I found some time to wire this PR into the Podman one. Writing works very conveniently. Will take a look into loading and report back; presumably tomorrow.

@vrothberg
Copy link
Copy Markdown
Member

Created a first draft based on this PR (containers/podman#6811). It's looking pretty good so far!

@mtrmac mtrmac force-pushed the tarfile-integration branch from 5a67f7f to 5697640 Compare August 1, 2020 20:06
@vrothberg
Copy link
Copy Markdown
Member

@mtrmac, can you have a look at containers/podman#6811 and check if the usage of the new interfaces is as intended? Once, we're cool there, I can add tests to it and then do the vendor dance.

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Aug 3, 2020

@mtrmac, can you have a look at containers/podman#6811 and check if the usage of the new interfaces is as intended?

Yes, looks fine.

Once, we're cool there, I can add tests to it and then do the vendor dance.

Note that the current write implementation is not correct yet, at least it generates wrong legacy-format data for images that share layers but differ in configs.

@mtrmac mtrmac force-pushed the tarfile-integration branch from 5697640 to cb6958c Compare August 6, 2020 23:45
@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Aug 6, 2020

@vrothberg This was updated based on new versions of the two constituent PRs. I assume we’ll want to review one of the PRs individually, after which I will include the extra changes from that first PR into the other one, and this integration PR will be closed without merging.

mtrmac added 21 commits August 7, 2020 02:07
We do support the legacy format nowadays.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... so that we can add more functionality to it without making it
public API.

Keep the original docker/tarfile.Destination as a compatibility shim.

ManifestItem is moved to the private package to avoid an import cycle,
but remains visible.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…cker/archive

... instead of the forwarding shims.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…rfile

his is a private API now, so we can just drop it; rename the
...With[System]Context varianteto use the shorter name, and update
all callers.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
We will be splitting uses of these paths across two objects, so
make them a private API instead of a copy&pasted convention.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... to eventually allow creating multiple Destinations from a single
Writer.

NOTE: This only splits the implementation into two, mostly at function
boundary; it does NOTHING to support writing multiple images; top-level
metadata is written independently on each Destination.PutManifest .
This will be fixed soon.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... to follow the PutManifest flow.

Should not change behavior, nothing about the code was changed.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This still only supports one image, but we will fix that momentarily.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... only to make future commits easier to review.

This introduces a bit of inefficiency (creating an on-stack
ManifestItem only to copy it into a single-element array), but
that will be gone momentarily.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... from writeLegacyLayerMetadata to the non-legacy createManifest.
Now that we have a dedicated function for computing the path consistently,
introducing another reference to it does not hurt maintainability, and the
small efficiency gain of computing the path only once is not really worth
coupling the legacy/non-legacy code, especially when the legacy code
is going to get more complex.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
NOTE: This is not sufficient to create correct multi-image archives
yet, the legacy format is still invalid.

This will eventually allow creating multiple Destinations from a single
Writer.

Should not change behavior for current callers, except that possible
JSON failures are now reported later.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... so that we can later only do this for layers that haven't been
created yet.

Should not change behavior, apart from timing of some reported errors.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... because it will soon depend on the rest of layerConfig.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... so that the caller does not have to care about lastLayerID and
empty images.

This preserves the current behavior of silently ignoring tags intended
for empty images. That's probably not quite right, but also not a subject
of this PR.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Now that both the legacy and non-legacy format can incrementally add
images to a single archive, this will allow creating multiple
Destinations from a writer Writer.

Image IDs are now generated differently; that may be observable by
very old Docker versions.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
At least docker/archive will need to deal with the two
optionally separately, so such a separate constructor must exist;
and the other callers are not much more complex if they separate
it as well.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This does not really allow _concurrency_ because we are streaming
the data to a single io.Writer, but it gives us safety against concurrent
callers (hypothetically when copying multiple images, at least).

Note that we do not currently set HasThreadSafePutBlob, although
we could, because the benefit is probably fairly small (basically it would
parallelize creating on-disk copies of streamed inputs with unknown
size or digest); that might be a wrong guess.

Also adds a sanity check against using the Writer after Finish().

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
We will add another caller in the next commit.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... which allows creating ImageReferece objects that all write
to a shared tarfile.Writer.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
It's a possible user error, not a supposedly-unreachable internal
error.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added 16 commits August 7, 2020 02:07
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... so that we can add more functionality to it without making it
public API.

Keep the original docker/tarfile.Source as a compatibility shim.

ManifestItem is moved to the private package to avoid an import cycle,
but remains visible.

docker/tarfile/src_test.go should logically be moved as well, but we
can't do that yet due to the import cycle.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…archive

... instead of the forwarding shims.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…rfile

This is a private API now, so we can just drop it; rename the
...With[System]Context variants to use the shorter names, and update
all callers.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... to eventually allow creating multiple Sources from a single
Reader (= a temporary file containing a seekable/uncompressed
copy of the archive).

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
At least docker/archive will need to deal with the two
optionally separately, so such a separate constructor must exist;
and the other callers are not much more complex if they separate
it as well.

This will also allow us to add reference lookup without having
to duplicate the API.

As a concession to the simpler callers, add a closeArchive parameter
(currently always true) that allows them not to worry about
the lifetime of the tarfile.Reader.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Every caller will need that data; this way it can be shared across
several consumers, and we don't need to synchronize access/creation
of the parsed data structure.

Should not change behavior, except that errors are now reported
earlier.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
We can drop the unused error return value now.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
We already accept the syntax for docker-archive: references,
now implement the lookup instead of warning and ignoring the value.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Add support for path:@index (e.g. path:@0, path:@1 ...) reference syntax
to docker-archive.

This will allow reading even untagged images from multi-image archives.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
with only two operations: Close(), and List() which
returns a set of ImageReference objects that allow accessing
the individual images.

For now, use of every reference triggers creation of a new
tarfile.Reader; that will be fixed momentarily.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
In archive.Reader, embed a reference to tarfile.Reader to
the created ImageReference objects, and use them in NewImageSource.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
, closer to the implementation being tested.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac mtrmac force-pushed the tarfile-integration branch from cb6958c to d7c80f4 Compare August 7, 2020 00:19
@vrothberg
Copy link
Copy Markdown
Member

@mtrmac, I am currently combining a review of the PRs with updating the open PR in c/podman. I'll add some tests on top etc.

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Aug 10, 2020

FWIW I do plan to actually test the code modified by these PRs sometime later today. Probably not automated tests at this point, though.

@vrothberg
Copy link
Copy Markdown
Member

FWIW I do plan to actually test the code modified by these PRs sometime later today. Probably not automated tests at this point, though.

Feel free to share ideas for tests in containers/podman#6811

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Aug 11, 2020

All of the post-merge changes in this PR have been included in #991, closing.

@mtrmac mtrmac closed this Aug 11, 2020
@mtrmac mtrmac deleted the tarfile-integration branch August 11, 2020 23:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants