
Add MergeOp to LLB #1431

@tonistiigi


This is a replacement proposal for #871, which proposed changing results to an array of refs to solve the same problems. The problem with that approach is that when subbuild results are reused with #1286, the client can't predict how many refs will be returned.

For example, a client wants to do a subbuild and then run touch foo on top of that result. This is simple as long as the subbuild returns a single ref: you can just convert that result to llb.State and run state.Run() on top of it. But should it return multiple results, the client would need to detect that and convert them into multiple states, and then the only thing it could do is use the slow llb.Copy to join them into a single state that could be used as a rootfs for running the command. This is very slow, and we can't expect every client to write these complicated exceptions for every case. The subbuild is likely a black box to the main client/frontend.

Instead, an alternative solution is to add a new LLB operation type MergeOp.

MergeOp takes multiple inputs and layers them on top of each other, as if the files from each later input were copied over the inputs before it.

E.g. a Dockerfile

FROM alpine
COPY / /

could be written either as

llb.Image("alpine").Copy(llb.Local(), "/", "/")

or

llb.Merge(llb.Image("alpine"), llb.Local())

The difference is that the latter case avoids the potentially expensive copy, and the cache chains are now tracked independently, whereas previously the copy had a dependency on the alpine image.
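To make the intended result concrete, here is a minimal sketch of the layering semantics described above. It is only a toy model: `Layer` and `Merge` are hypothetical names, not BuildKit types, and the map copy stands in for a result that a real MergeOp is meant to produce without copying anything.

```go
package main

// Layer is a toy stand-in for a filesystem snapshot: path -> file contents.
// Hypothetical type for illustration only, not part of BuildKit.
type Layer map[string]string

// Merge applies the inputs in order, so a file in a later input shadows a
// file at the same path in an earlier input. It models only the visible
// result of the merge, not how an efficient implementation would reach it.
func Merge(inputs ...Layer) Layer {
	out := Layer{}
	for _, in := range inputs {
		for path, data := range in {
			out[path] = data
		}
	}
	return out
}
```

Merging a toy "alpine" layer with a toy "local" layer that both contain /etc/os-release yields the local version of that file, while files that exist in only one input pass through unchanged.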

The key to making MergeOp more efficient than Copy is that the implementation should be lazily evaluated and only do expensive work when the result of the merge is actually needed. By default MergeOp should return a reference that contains pointers to its input references without doing any work.
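The lazy behaviour described above could look roughly like the following sketch. All names here (`Ref`, `leafRef`, `mergedRef`, `newMergedRef`) are hypothetical and much simpler than the real cache ref manager; the `copies` counter exists only to show when the expensive work runs, and the sketch is not safe for concurrent use.

```go
package main

// Ref is a hypothetical, simplified reference to a snapshot.
type Ref interface {
	// Materialize returns the files of the snapshot, doing any expensive
	// work that is required to produce them.
	Materialize() map[string]string
}

// leafRef is a snapshot that already exists.
type leafRef struct{ files map[string]string }

func (r *leafRef) Materialize() map[string]string { return r.files }

// mergedRef only stores pointers to its inputs. Flattening is deferred to
// the first Materialize call and the result is cached afterwards.
type mergedRef struct {
	inputs []Ref
	cached map[string]string
	copies int // how many times the expensive flattening ran
}

// newMergedRef does no work besides recording the inputs.
func newMergedRef(inputs ...Ref) *mergedRef {
	return &mergedRef{inputs: inputs}
}

func (m *mergedRef) Materialize() map[string]string {
	if m.cached != nil {
		return m.cached
	}
	m.copies++
	out := map[string]string{}
	for _, in := range m.inputs {
		for p, d := range in.Materialize() {
			out[p] = d // later inputs shadow earlier ones
		}
	}
	m.cached = out
	return out
}
```

Creating the merged reference is free in this model; the cost is paid once, and only if something asks for the flattened contents.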

When a reference from MergeOp is sent to the exporter, the differ can generally work fine with the individual sub-snapshots. E.g. if two images are layered on top of each other, there is no need to run the differ again. Generally, if a sub-snapshot already has calculated blobs, they don't need to be reevaluated after the merge.

When the merged reference does need to be mounted (e.g. to run a container on top of it), things get a bit more complicated. Whether this mount can be done efficiently now depends on the underlying snapshot implementation. On overlay-based snapshotters you can just take the lower dirs from each sub-snapshot and join them together into a single snapshot without moving any data. For other implementations, a copy can't be avoided, and the data needs to be duplicated. Here it is especially important that the copy does not happen before the mount is actually needed. Implementing this requires some significant changes to the cache ref manager. What the actual snapshot implementation does should be invisible to the LLB layer and the solver/exporters.
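For the overlay case, joining lower dirs might be sketched as below. This assumes each sub-snapshot exposes its layer directories ordered from topmost to bottommost, which is the order overlayfs expects in its lowerdir= option. The `Snapshot` type, the function names and any paths passed in are hypothetical; a writable mount would additionally need upperdir= and workdir=, which are left out here.

```go
package main

import "strings"

// Snapshot is a hypothetical view of an overlay-backed snapshot: its layer
// directories ordered from topmost to bottommost.
type Snapshot struct {
	LowerDirs []string
}

// MergeLowerDirs joins the layer directories of the inputs without moving
// any data. Inputs are given bottom-to-top (later inputs shadow earlier
// ones), so they are walked in reverse to get a top-to-bottom list.
func MergeLowerDirs(inputs ...Snapshot) []string {
	var dirs []string
	for i := len(inputs) - 1; i >= 0; i-- {
		dirs = append(dirs, inputs[i].LowerDirs...)
	}
	return dirs
}

// OverlayOptions renders the lowerdir option for a read-only overlay mount.
func OverlayOptions(dirs []string) string {
	return "lowerdir=" + strings.Join(dirs, ":")
}
```

With a two-layer base input and a one-layer second input, the second input's directory ends up first in the option string, so its files win on conflicts, and no file data is touched.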

This should also work well with the lazy refs proposal #870. It probably makes sense to implement them together. Supporting stargz #1402 is also somewhat related.

message MergeOp {
    repeated Partition partitions = 1;
}

message Partition {
    int64 input = 1 [(gogoproto.customtype) = "InputIndex", (gogoproto.nullable) = false];
    string addPrefix = 2;
    string trimPrefix = 3;
}

AddPrefix and TrimPrefix can be used to access or create subdirectories over the input references to support cases like COPY /in /out.
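The proposal does not spell out the exact path semantics, so the following is one possible reading, stated as an assumption: trimPrefix selects a subdirectory of the input and strips it from each path, and addPrefix then places the remainder under a new directory. The Go `Partition` struct and `MapPath` method are hypothetical and only mirror the message fields.

```go
package main

import (
	"path"
	"strings"
)

// Partition mirrors the fields of the proposed message. The mapping
// semantics implemented below are an assumption, not part of the proposal.
type Partition struct {
	Input      int64
	AddPrefix  string
	TrimPrefix string
}

// MapPath returns where a file at src in the input would appear in the
// merge result. It returns false if src lies outside TrimPrefix, meaning
// the file would not be part of the merge.
func (p Partition) MapPath(src string) (string, bool) {
	src = path.Clean("/" + src)
	trim := path.Clean("/" + p.TrimPrefix)
	rel := src
	if trim != "/" {
		// Match whole path components only, so "/in" does not match "/input".
		if src != trim && !strings.HasPrefix(src, trim+"/") {
			return "", false
		}
		rel = strings.TrimPrefix(src, trim)
	}
	return path.Join("/", p.AddPrefix, rel), true
}
```

Under this reading, COPY /in /out corresponds to a partition with trimPrefix "/in" and addPrefix "/out": /in/a/b.txt maps to /out/a/b.txt, and files outside /in are excluded.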

@hinshun @sipsma
