Proposal: embed build sources in image config

Once an image has been built, it is useful to know which dependencies went into that specific build. This makes it possible to detect when a dependency has been updated and the build should be run again. In the future, the same data could also be used to pin dependencies to a specific digest for reproducibility.

LLB has a `Source` operation for these cases: a container image, a git commit, an HTTP URL, or a local directory. Everything except the local directory can be tracked with an immutable digest based only on the LLB definition.

When this immutable digest is computed in `CacheMap()` (https://github.com/moby/buildkit/blob/1879325ec570a6c9dfd8f0670d700335fa6a0cb4/solver/types.go#L147), we can extend the return structure (https://github.com/moby/buildkit/blob/1879325ec570a6c9dfd8f0670d700335fa6a0cb4/solver/types.go#L160) with extra information that is later added to the image config. Because the `solver` package is generic and doesn't know about LLB or snapshots, I think it should just be a string map. I don't think it makes sense to reuse the existing `CacheOpts` field for this (@sipsma).

```
ResolveResponse map[string]string

{ "container-image://docker.io/library/alpine:3.13": "sha256:deadbeef" }
```
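To make the shape of the extension concrete, here is a minimal sketch. The `CacheMap` type below is a trimmed-down stand-in rather than the real `solver.CacheMap`, and `imageSourceCacheMap` is a hypothetical helper that exists only for illustration.

```go
package main

import "fmt"

// CacheMap is a simplified stand-in for solver.CacheMap; only the fields
// needed for this sketch are shown.
type CacheMap struct {
	// Digest is the immutable cache key computed for the vertex.
	Digest string
	// ResolveResponse is the proposed extra field. It maps a source
	// identifier to the digest it resolved to and is opaque to the solver.
	ResolveResponse map[string]string
}

// imageSourceCacheMap is a hypothetical helper showing how an image source
// could fill in the new field once its ref has been resolved.
func imageSourceCacheMap(ref, resolved string) *CacheMap {
	return &CacheMap{
		Digest: resolved,
		ResolveResponse: map[string]string{
			"container-image://" + ref: resolved,
		},
	}
}

func main() {
	// "sha256:deadbeef" is a placeholder digest, as in the example above.
	cm := imageSourceCacheMap("docker.io/library/alpine:3.13", "sha256:deadbeef")
	fmt.Println(cm.ResolveResponse)
}
```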

When the solver runs the build, it already stores the `CacheMap` value for every vertex that runs as part of the build. Before returning `CachedResult` (https://github.com/moby/buildkit/blob/1879325ec570a6c9dfd8f0670d700335fa6a0cb4/solver/jobs.go#L508), it can walk back through all the parent vertexes, gather their `ResolveResponse` values, and combine them into a single structure that is returned from the `Build()` function. The extra return value is needed because `Metadata` in `CachedResult` is not typed. Maybe it should be, but that is for a different proposal.
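The gathering step could look roughly like the sketch below. The `vertex` type and `collectSources` function are hypothetical simplifications, not BuildKit types; the real solver tracks vertexes and their cache maps differently.

```go
package main

import "fmt"

// vertex is a hypothetical, simplified stand-in for a solved vertex.
type vertex struct {
	name            string
	inputs          []*vertex
	resolveResponse map[string]string // copied from the vertex's CacheMap
}

// collectSources visits v and each of its ancestors once and merges their
// resolveResponse maps into a single map.
func collectSources(v *vertex) map[string]string {
	out := map[string]string{}
	seen := map[*vertex]bool{}
	var walk func(cur *vertex)
	walk = func(cur *vertex) {
		if cur == nil || seen[cur] {
			return
		}
		seen[cur] = true
		for id, pin := range cur.resolveResponse {
			out[id] = pin
		}
		for _, in := range cur.inputs {
			walk(in)
		}
	}
	walk(v)
	return out
}

func main() {
	// Identifiers and digests are placeholders for illustration.
	alpine := &vertex{name: "alpine", resolveResponse: map[string]string{
		"container-image://docker.io/library/alpine:3.13": "sha256:deadbeef",
	}}
	src := &vertex{name: "buildx", resolveResponse: map[string]string{
		"git://github.com/docker/buildx#master": "sha1:deadbeef",
	}}
	run := &vertex{name: "run", inputs: []*vertex{alpine, src}}
	// alpine is reachable through two paths but is only visited once.
	final := &vertex{name: "copy", inputs: []*vertex{run, alpine}}

	for id, pin := range collectSources(final) {
		fmt.Println(id, "=>", pin)
	}
}
```

Tracking visited vertexes keeps the walk linear even when the same source is shared by several branches of the build graph.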

This structure can now be passed to the exporter. The image exporter would add it to the image config as an extra field. As this is BuildKit-specific, I think it makes sense to do something similar to what we do with the inline build cache: use a single base64-encoded string with a BuildKit-specific name.

```
"moby.buildkit.buildinfo.v0": <base64>
```

The base64 value decodes to:

```
{
  "sources": [
    {
      "type": "image",
      "ref": "docker.io/library/alpine:3.13",
      "pin": "sha256:"
    },
    {
      "type": "git",
      "ref": "github.com/docker/buildx#master",
      "pin": "sha1:deadbeef"
    }
  ]
}
```
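As a rough illustration of the encoding, the sketch below round-trips such a structure through JSON and base64. The `Source` and `BuildInfo` type names, the helper functions, and the choice of standard (padded) base64 are assumptions made for this example; only the key name and the JSON field names come from the proposal.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// buildInfoKey is the image config key named in the proposal.
const buildInfoKey = "moby.buildkit.buildinfo.v0"

// Source and BuildInfo are hypothetical Go types mirroring the JSON above.
type Source struct {
	Type string `json:"type"`
	Ref  string `json:"ref"`
	Pin  string `json:"pin"`
}

type BuildInfo struct {
	Sources []Source `json:"sources"`
}

// encodeBuildInfo marshals bi to JSON and base64-encodes the result.
func encodeBuildInfo(bi BuildInfo) (string, error) {
	dt, err := json.Marshal(bi)
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(dt), nil
}

// decodeBuildInfo reverses encodeBuildInfo.
func decodeBuildInfo(s string) (BuildInfo, error) {
	var bi BuildInfo
	dt, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return bi, err
	}
	err = json.Unmarshal(dt, &bi)
	return bi, err
}

func main() {
	// Digests are placeholders, not real values.
	bi := BuildInfo{Sources: []Source{
		{Type: "image", Ref: "docker.io/library/alpine:3.13", Pin: "sha256:deadbeef"},
		{Type: "git", Ref: "github.com/docker/buildx#master", Pin: "sha1:deadbeef"},
	}}
	enc, err := encodeBuildInfo(bi)
	if err != nil {
		panic(err)
	}
	// The exporter would store the encoded string under the BuildKit key.
	config := map[string]string{buildInfoKey: enc}

	dec, err := decodeBuildInfo(config[buildInfoKey])
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", dec)
}
```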

There is one special case to take into account. A frontend might have already transformed a string the user typed before generating LLB. For example, in Dockerfile this happens for `FROM` images, because the Dockerfile frontend needs to load their image config in order to access env, onbuild, etc. While doing that, Dockerfile always adds the digest to the image ref so that the LLB solve always points to the same image. So in LLB we already have the digest ref, but in the embedded buildinfo it would be better to show the original value.

The solution for this is that the Dockerfile frontend can create its own `moby.buildkit.buildinfo.v0` key in the image config for the values it sees, and the image exporter can then fix it up after the full solve. This is similar to how the history array currently works: Dockerfile adds the command strings, and the exporter fills in dates etc. later in `patchImageConfig()`. Dockerfile can add a record like:

```
{
   "type": "image",
   "ref": "docker.io/library/alpine:3.13",
   "alias": "docker.io/library/alpine:3.13@sha256:",
   "pin": "sha256:"
}
```

When LLB then adds a source for "docker.io/library/alpine:3.13@sha256:", the exporter fixes it up and uses `alpine:3.13` as the original ref instead.
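
The exporter-side fix-up could be sketched as follows. `Source` and `fixupSources` are hypothetical names for this example, and matching on the pair of type and alias is an assumption about how records would be correlated.

```go
package main

import "fmt"

// Source is a hypothetical Go type mirroring one buildinfo record.
type Source struct {
	Type  string
	Ref   string
	Alias string
	Pin   string
}

// aliasKey identifies a frontend record by source type and alias.
type aliasKey struct {
	typ   string
	alias string
}

// fixupSources returns the solver-reported sources, replacing each ref that
// matches a frontend alias with the original ref recorded by the frontend.
func fixupSources(frontend, solved []Source) []Source {
	byAlias := map[aliasKey]Source{}
	for _, fe := range frontend {
		if fe.Alias != "" {
			byAlias[aliasKey{fe.Type, fe.Alias}] = fe
		}
	}
	out := make([]Source, 0, len(solved))
	for _, s := range solved {
		if fe, ok := byAlias[aliasKey{s.Type, s.Ref}]; ok {
			s.Ref = fe.Ref // show what the user originally wrote
		}
		out = append(out, s)
	}
	return out
}

func main() {
	// Digests are placeholders, not real values.
	frontend := []Source{{
		Type:  "image",
		Ref:   "docker.io/library/alpine:3.13",
		Alias: "docker.io/library/alpine:3.13@sha256:deadbeef",
		Pin:   "sha256:deadbeef",
	}}
	solved := []Source{
		{Type: "image", Ref: "docker.io/library/alpine:3.13@sha256:deadbeef", Pin: "sha256:deadbeef"},
		{Type: "git", Ref: "github.com/docker/buildx#master", Pin: "sha1:deadbeef"},
	}
	for _, s := range fixupSources(frontend, solved) {
		fmt.Println(s.Type, s.Ref, s.Pin)
	}
}
```

Sources without a matching frontend record pass through unchanged, so the same path would work for LLB that did not come from the Dockerfile frontend.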

We can start by adding this frontend component in Dockerfile and extend it to support full LLB.

I think this can be enabled by default. Exposing the source images shouldn't raise any security concerns, since most of this information is already present in textual form in the history array. But we should provide a way to opt out with a special key in `-o`.

Proposal: embed build sources in image config #2269

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Proposal: embed build sources in image config #2269

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions