Skip to content

Support estargz compression type#2246

Merged
AkihiroSuda merged 1 commit intomoby:masterfrom
ktock:esgzcompression
Aug 25, 2021
Merged

Support estargz compression type#2246
AkihiroSuda merged 1 commit intomoby:masterfrom
ktock:esgzcompression

Conversation

@ktock
Copy link
Copy Markdown
Collaborator

@ktock ktock commented Jul 13, 2021

Currently, BuildKit supports lazy pulling of eStargz but not supports creating eStargz images. It would be great if it also supports the creation (there was also a discussion in slack recently). This commit adds support for compression=estargz to the cache as one of the compression variants.

buildctl build ... \
  --output type=image,name=docker.io/username/image,push=true,compression=estargz,oci-mediatypes=true

This commit also contains updates to docs to describe this usage.

@tonistiigi
Copy link
Copy Markdown
Member

Do I get correctly that this creates 2 blobs. One from differ and then tries to convert it. This should be avoided. PTAL containerd/containerd#4263 (especially the last section about adding a workaround if there is no full support).

@ktock
Copy link
Copy Markdown
Collaborator Author

ktock commented Jul 13, 2021

@tonistiigi Thank you for the comment.

Do I get correctly that this creates 2 blobs. One from differ and then tries to convert it. This should be avoided.

Yes, this creates one blob from differ and converts it. This might slower the overall time of estargz conversion but why I choose this approach is because it allows us to have this feature without having an additional (forked) implementation of diff.Comparer. Could you tell me why we should avoid this approach? --force-compression already does something similar. (I might have overlooked something)

PTAL containerd/containerd#4263 (especially the last section about adding a workaround if there is no full support).

Is there example/existing implementation of add a callback option to differ where I can pass the WriteCloser for compression. mentioned in containerd/containerd#4263 ?

@tonistiigi
Copy link
Copy Markdown
Member

--force-compression already does something similar. (I might have overlooked something)

My understanding was that this only happened if the blob with a different compression already existed from the previous build/pull. In that case, the behavior is expected. If that is not the only case we should check that there isn't a performance or storage regression on the default path.

Is there example/existing implementation of add a callback option to differ where I can pass the WriteCloser for compression.

No that would need to be an update in the differ method in containerd library. When I looked at it last time containerd only supported uncompressed/gzip. This is needed for zstd as well, this is the case that I created the issue for.

@ktock
Copy link
Copy Markdown
Collaborator Author

ktock commented Jul 13, 2021

@tonistiigi

My understanding was that this only happened if the blob with a different compression already existed from the previous build/pull. In that case, the behavior is expected.

Yes, the conversion by --force-compression happens only when the blob from the previous build/pull has the compression type different from the specified one.

If that is not the only case we should check that there isn't a performance or storage regression on the default path.

This patch creates 2 blobs only when the compression is estargz and doesn't affect the default path of gzip and uncompressed. Do you think that should be avoided even for estargz? If so I think we need to have a new diff.Comparer implementation that supports estargz. WDYT?

@tonistiigi
Copy link
Copy Markdown
Member

Yes, the conversion by --force-compression happens only when the blob from the previous build/pull has the compression type different from the specified one.

Thanks for confirming.

Do you think that should be avoided even for estargz? If so I think we need to have a new diff. Comparer implementation that supports estargz. WDYT?

Yes, it would be preferred to avoid creating 2 blobs. It takes more storage and slows things down without the user expecting it. We have this issue anyway when we want to add more compression methods like zstd. As containerd differ only supports gzip I think the simplest way is to extend it with a callback method that implements the compression that it will call instead of the builtin gzip routine. Eventually, they probably want to extend their stream processors capability but I think that is a bigger change(maybe you disagree?). FYI @fuweid

@ktock ktock marked this pull request as draft July 14, 2021 13:17
@ktock ktock force-pushed the esgzcompression branch from 7f822fc to 395d4c2 Compare July 14, 2021 13:18
cache/blobs.go Outdated
Comment on lines 131 to 137
descr, err = sr.cm.Differ.Compare(ctx, lower, upper,
diff.WithMediaType(mediaType),
diff.WithReference(sr.ID()),
diff.WithCompression(compressor),
)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonistiigi Added this API to the fork of containerd (ktock/containerd@11b4622). Now this patch doesn't create 2 blobs. I'll work on upstreaming this API.

@ktock ktock force-pushed the esgzcompression branch 7 times, most recently from 4f79c89 to de8a342 Compare July 20, 2021 01:49
go.mod Outdated
github.com/containerd/go-cni v1.0.2
github.com/containerd/go-runc v1.0.0
github.com/containerd/stargz-snapshotter v0.6.4
github.com/containerd/stargz-snapshotter/estargz v0.6.4
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump up to 0.7.0, and update Dockerfile as well

@ktock
Copy link
Copy Markdown
Collaborator Author

ktock commented Jul 27, 2021

#2246 (comment) has been merged in containerd. Updated this patch not to depend on the fork.
Also bumped up stargz snapshotter to v0.7.0

@ktock ktock force-pushed the esgzcompression branch 3 times, most recently from 284ef97 to 152998f Compare August 3, 2021 01:25
cache/blobs.go Outdated
mediaType = ocispec.MediaTypeImageLayer
case compression.Gzip:
mediaType = ocispec.MediaTypeImageLayerGzip
compressor = func(dest io.Writer, requiredMediaType string) (io.WriteCloser, error) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to set custom compressor for plain gzip?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed this not to set the custom compressor.

return mt
}

func eStargzLayerConvertFunc(ctx context.Context, cs content.Store, desc ocispec.Descriptor) (*ocispec.Descriptor, error) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you move these stargz-only helper functions to a separate stargz file

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Moved this into a new file.

cache/remote.go Outdated
}
desc.Annotations[dslKey] = strings.Join(existingRepos, ",")
if dh, ok := sr.descHandlers[desc.Digest]; ok {
desc.Annotations = mergeEStargzAnnotations(dh.SnapshotLabels, desc.Annotations)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note. It would be better if this annotations load/save wouldn't need startgz-only exceptions everywhere and would be more generic. But I don't have a specific proposal how to improve this.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the note. Fixed the patch to make it more generic. Now *immutableRef.addCompressionBlob() saves estargz-related annotations as well so we don't need estargz-only expections.

}
}
if esgz && !i.ociTypes {
logrus.Warn("estargz compression should be used with oci-mediatypes for enabling layer verification")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't want to make this an error? Or always imply ocitypes for stargz, at least if not set?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed this to always imply ocitypes with a warning log.

@sipsma
Copy link
Copy Markdown
Collaborator

sipsma commented Aug 16, 2021

Haven't taken a look in depth yet, but one note, if #2273 gets merged before this PR please update the new TestGetRemote in manager_test.go to also create this new estargz mediatype (right now it covers gzip and uncompressed) so we can make sure there aren't any regressions in parallel blob creation.

@ktock
Copy link
Copy Markdown
Collaborator Author

ktock commented Aug 19, 2021

@sipsma Thank you for the note about the tests. Added estargz compression to TestGetRemote.

For more details about lazy pull with stargz/eStargz images, please refer to the docs on these repositories.

### Getting stargz/eStargz formatted images
## Obtaining stargz/eStargz images
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/obtain/create/ perhaps

}
}
if esgz && !i.ociTypes {
logrus.Warn("forcefully turning on oci-mediatype mode for estargz")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably s/forcefully/forcibly/

@AkihiroSuda
Copy link
Copy Markdown
Member

Sorry needs rebase

@ktock
Copy link
Copy Markdown
Collaborator Author

ktock commented Aug 24, 2021

Rebased. Also added tests to check the converted descriptor is valid and fixed estargz generation code to pass the tests.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
@AkihiroSuda
Copy link
Copy Markdown
Member

3 LGTMs, merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants