Correctly compute UncompressedSize on zstd:chunked pull, don’t set it on estargz#2130
Conversation
|
Cc: @giuseppe . This should fix containers/skopeo#2437 (comment) . |
|
I picked this into my pet pr, and it works slightly better, but now it breaks in the podman additional-store test: Basically, $ hack/bats -T --root 010
...
same failure, down to the same hashes |
|
Skopeo is involved, though (for the copy to the store dir), and I haven't rebuilt skopeo, so it's possible that the bad thing is happening in that step. |
|
Thanks! I can‘t see how the two-ID output can happen. I’ll set up a VM and test (a better version of) this PR, as well as those tests, tomorrow. |
|
PR updated to compute the size correctly for zstd:chunked with tar-split, manually tested per containers/skopeo#2437 (comment) (and inspecting the created layer metadata). Ready for review. |
Testing that on a slightly unclean VM, Podman commit 2aacd4e212525db4ee06be8e44e4405400d4df9d + this c/storage fix passes 010 successfully in And, on the command line, |
|
Scratch that last part, that’s not invoking the tested version. |
|
Ugh. The podman-pause thing is my fault, I couldn't find a good solution to a nasty problem. Please try this: |
cgwalters
left a comment
There was a problem hiding this comment.
What about adding the uncompressed size to the ToC? Seems like it'd possibly simplify things?
This looks sane to me but I only gave it a superficial review.
pkg/chunked/compression_linux.go
Outdated
| return nil | ||
| } | ||
|
|
||
| // tarSizeFromTarSplit computes the total tarball size, computing only tarSplit |
There was a problem hiding this comment.
| // tarSizeFromTarSplit computes the total tarball size, computing only tarSplit | |
| // tarSizeFromTarSplit computes the total tarball size using only the tarSplit metadata |
There was a problem hiding this comment.
Thanks, I don’t know what I was thinking.
| case storage.FileType: | ||
| // entry.Size is the “logical size”, which might not be the physical size for sparse entries; | ||
| // but the way tar-split/tar/asm.WriteOutputTarStream combines FileType entries and returned files contents, | ||
| // sparse files are not supported. |
There was a problem hiding this comment.
But we don't error out today?
There was a problem hiding this comment.
If we should be erroring out, that’s something that should happen at tar-split construction.
The physical size outright isn’t available in the tar-split format, so here this consumer is only documenting the impact of that assumption.
(As is, sparse files are barely supported by archive/tar or the tar-split fork: they are indicated either by a special file type, or as a regular file a special PAX record; and there is no API to expose the data / hole segments. Also, c/storage/pkg/archive does not special-case them at all.)
Anyway, I’m open to adding a comment to the |
The current value obtained by summing the sizes of regular file contents does not match the size of the uncompressed layer tarball. We don't have a convenient source to compute the correct size for estargz without pulling the full layer and defeating the point; so we must allow for the size being unknown. For recent zstd:chunked images, we have the full tar-split, so we can compute the correct size; that will happen in the following commits. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Empty tar-split shouldn't ever happen, but being precise here doesn't hurt. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
|
On a VM manually building things: All of:
pass Similarly, containers/podman@9d8e3b0 passes Either way, I haven’t been able to reproduce this:
I have now filed containers/podman#24265 to exercise that in the usual environment. Is there anything else I should be looking at? |
> go mod edit -replace github.com/containers/storage=github.com/mtrmac/storage@chunked-size > go mod tidy && go mod vendor + a HACK to override the bloat check Signed-off-by: Miloslav Trmač <mitr@redhat.com>
|
@mtrmac please forgive me, I've switched gears for today and have commitments I must attend to. The crucial element I don't see in your comment is running the full CI suite using |
|
Thanks, I was looking for references to |
|
Thank you. That's really pretty huge. |
|
/lgtm |
| // - If UncompressedDigest is set, this must be set to a valid value. | ||
| // - Otherwise, if TOCDigest is set, this is either valid or -1. | ||
| // - If neither of this digests is set, this should be treated as if it were | ||
| // an uninitialized value. |
There was a problem hiding this comment.
You need indentation for list items (same as for code blocks, i.e. an extra space).
| // - If UncompressedDigest is set, this must be set to a valid value. | |
| // - Otherwise, if TOCDigest is set, this is either valid or -1. | |
| // - If neither of this digests is set, this should be treated as if it were | |
| // an uninitialized value. | |
| // - If UncompressedDigest is set, this must be set to a valid value. | |
| // - Otherwise, if TOCDigest is set, this is either valid or -1. | |
| // - If neither of this digests is set, this should be treated as if it were | |
| // an uninitialized value. |
There was a problem hiding this comment.
Yeah, gofmt didn’t want to help with this (and struct field comments are formatted in HTML just as an uninterpreted code block), so I punted.
You’re right, this is the right thing to do. Fixed in #2136 .
| if tarSplit != nil { | ||
| uncompressedTarSize, err = tarSizeFromTarSplit(tarSplit) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("computing size from tar-split") |
There was a problem hiding this comment.
nit: errors.New, or fmt.Errorf("computing site from tar-split: %w", err), or just err.
Note to self (and in case it is useful to others): Commit “DO NOT MERGE: Test with a zstd:chunked testimage” in containers/podman#24287 . |
Is there more info on that? Sounds worth adding to |
|
@cgwalters That’s not directly a c/storage property; it’s a c/image choice of a a deterministic image ID, in https://github.com/containers/image/blob/cba49408c0ea237a6aa6dba5e81b74f4a8f23480/storage/storage_dest.go#L671-L684 . Yes, we might eventually need a user-facing explanation; I’m not sure where is a good place for it, the internal c/* projects are not too likely to be known to users. Maybe a Podman blog post. |
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The current value obtained by summing the sizes of regular file contents does not match the size of the uncompressed layer tarball.
We don't have a convenient source to compute the correct size for estargz without pulling the full layer and defeating the point.
For recent zstd:chunked images, we have the full tar-split, so we can compute the correct size;
for now, this doesn't do that. That might slow down image size computation.Absolutely untested, and we probably do want the tar-split-based computation to happen.