
llbsolver: change references to evaluate lazily #870

@tonistiigi

The BuildKit solver works with arbitrary result types without knowing what they actually mean. In practice, a result reference is currently mapped to a cache reference, which in turn is mapped to a snapshot (plus an optional blob) on disk.

The solver also defines an object called Remote

buildkit/solver/types.go, lines 99 to 102 in f3b968c:

    type Remote struct {
        Descriptors []ocispec.Descriptor
        Provider    content.Provider
    }

that represents a stack of snapshots distributable as blobs. A remote can be implemented without the local data being present (hence the name); e.g., this is how cache import determines matches without pulling all the data.

Workers can convert a local ref to a remote and vice versa.

Proposal

Instead of returning snapshot references, ops should be able to return a reference that can be either a local cache reference or a remote. If an op can return a remote, that is preferred.

Most importantly, the image source would just return a remote instead of an unpacked snapshot, and cache loaded from remote locations would also initially load only the remote. Note that returning a remote in that case means that the actual image bytes were not pulled; only a pointer to them was returned.

All ops that use inputs should expect that an input could be a remote and not a cache.ImmutableRef. If they receive a remote but need to access the data directly (for example, the exec op), they can call a method on it that extracts it and changes the type to cache.ImmutableRef. This method is where the pull would now actually happen, and it needs to be connected with the progress of the original vertex.

The same goes for exporters, which should handle the new reference type.

The benefit of this is that it exposes cases where the builder can work on remote data without a slow pull-and-extract process. For example, if you get a cache match, the builder will just return without pulling anything down. If you are using the image exporter, we can push directly from the remote, effectively just copying data between registries (or better, with cross-repo push support). This is common, for example, when adding existing images to manifest lists. In distributed mode (#231), remotes can be used to pass data between workers without actually transferring the bits.
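To make the "push directly from the remote" case concrete, here is a toy model in which registries are in-memory maps. Everything in it is an assumption made for illustration: `Registry`, `pushRemote`, and this `Remote` shape (digest strings plus a `Source` map standing in for a content provider) are hypothetical and do not correspond to real BuildKit or registry APIs.

```go
package main

import "fmt"

// Registry is a toy, in-memory stand-in for a blob store keyed by digest.
type Registry map[string][]byte

// Remote is a hypothetical, simplified remote: the ordered layer digests
// plus the place the blobs can be read from.
type Remote struct {
	Digests []string
	Source  Registry
}

// pushRemote copies only the blobs that the destination is missing,
// reading them straight from the remote's source. Nothing is unpacked
// into a local snapshot. It returns how many blobs were copied.
func pushRemote(dst Registry, r Remote) (int, error) {
	copied := 0
	for _, d := range r.Digests {
		if _, ok := dst[d]; ok {
			continue // destination already has this blob
		}
		blob, ok := r.Source[d]
		if !ok {
			return copied, fmt.Errorf("blob %s not found in source", d)
		}
		dst[d] = blob
		copied++
	}
	return copied, nil
}

func main() {
	src := Registry{
		"sha256:base": []byte("base layer"),
		"sha256:app":  []byte("app layer"),
	}
	// The destination already holds the base layer.
	dst := Registry{"sha256:base": []byte("base layer")}

	remote := Remote{Digests: []string{"sha256:base", "sha256:app"}, Source: src}
	n, err := pushRemote(dst, remote)
	if err != nil {
		panic(err)
	}
	fmt.Println("blobs copied:", n)
}
```

In this sketch the exporter never touches a snapshot; with a real registry, the skipped-blob branch is roughly where an existence check or cross-repo mount would come in.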

When data is required locally, the change should be invisible to the end user.
