Description
The BuildKit solver works with arbitrary result types without knowing what they actually mean. In practice, a result reference is currently mapped to a cache reference, which is mapped to a snapshot (plus an optional blob) on disk.
The solver also defines an object called Remote (lines 99 to 102 in f3b968c):

```go
type Remote struct {
	Descriptors []ocispec.Descriptor
	Provider    content.Provider
}
```
Workers can convert a local ref to a remote and vice versa.
Proposal
Instead of returning snapshot references, ops should be able to return a reference that can be either a local cache reference or a remote. If an op can return a remote, that is preferred.
Most importantly, the image source would just return a remote instead of an unpacked snapshot, and cache loaded from remote locations would also initially load only the remote. Note that returning a remote in that case means the actual image bytes were not pulled; only a pointer to them was returned.
All ops that use inputs should expect that an input could be a remote and not a cache.ImmutableRef. If they receive a remote but need to access the data directly (for example, exec op), they can call a method on it that extracts the data and changes the type to cache.ImmutableRef. This method is where the pull would now actually happen, and it needs to be connected with the progress of the original vertex.
The same goes for exporters, which should handle the new reference type.
The benefit is that this exposes cases where the builder can work on remote data without a slow pull and extract process. For example, on a cache match the builder can just return without pulling anything down. With the image exporter, we can push directly from the remote, effectively just copying data between registries (or better, with cross-repo push support). This can be common when adding existing images to manifest lists. In distributed mode (#231), remotes can be used to pass data between workers without actually transferring the bits.
When data is required locally, the change should be invisible to the end user.
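The exporter side could be sketched roughly like this. The `Ref` type, the `Plan` values, and `planExport` are made up for illustration and do not correspond to existing BuildKit code; the sketch only shows the decision of preferring the remote when one is available.

```go
package main

import (
	"errors"
	"fmt"
)

// Ref is a hypothetical result reference: remote layers, a local snapshot, or both.
type Ref struct {
	RemoteDigests []string // layer digests known to exist in a registry; empty if none
	LocalID       string   // ID of an unpacked local snapshot; empty if not pulled
}

// Plan names the path an image exporter would take for one ref.
type Plan string

const (
	CopyFromRemote Plan = "copy-from-remote" // registry-to-registry, no local pull
	PushFromLocal  Plan = "push-from-local"  // upload blobs from local data
)

// planExport prefers the remote, so no pull and extract is needed when one exists.
func planExport(r Ref) (Plan, error) {
	switch {
	case len(r.RemoteDigests) > 0:
		return CopyFromRemote, nil
	case r.LocalID != "":
		return PushFromLocal, nil
	default:
		return "", errors.New("ref has neither remote nor local data")
	}
}

func main() {
	refs := []Ref{
		{RemoteDigests: []string{"sha256:abc"}},
		{LocalID: "snapshot-1"},
	}
	for _, r := range refs {
		p, err := planExport(r)
		fmt.Println(p, err)
	}
}
```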