containerd integration: Make build work#44079
Conversation
I squashed a couple of commits as some of them were fix-ups of previous commits; I kept the last two separate (but possibly the "runc" one should be squashed too).

@tonistiigi @rumpl PTAL (also at #44079 (comment))
Ah, dang; conflict in vendor.sum; I'll rebase.

done 👍
Not sure I've seen this before; |
/cc @tonistiigi |
Alright; this time it passed without the weird error, so it looks like a race indeed? @tonistiigi PTAL if this PR looks good to you 🙏
@thaJeztah FYI, the BuildKit tests in this workflow https://github.com/moby/moby/blob/master/.github/workflows/buildkit.yml don't run with the containerd snapshotter at the moment. As discussed with @tonistiigi, this matrix should run both with the dockerd worker and with containerd integration. Basically, we should make the worker being tested a new matrix entry and also update the docker daemon config accordingly. I can do it on @rumpl's fork so you can cherry-pick here. LGTY?
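The matrix change discussed here could look roughly like the sketch below; the entry names are assumptions, not the actual workflow contents:

```yaml
# Hypothetical sketch of extending buildkit.yml to cover both workers;
# the matrix entry names are made up for illustration.
strategy:
  matrix:
    worker:
      - dockerd             # current default worker
      - dockerd-containerd  # dockerd with the containerd snapshotter enabled
# the daemon config would then be selected per matrix entry, e.g.:
# {"features": {"containerd-snapshotter": true}}
```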
cpuguy83 left a comment:
Code looks mostly good.
Also tested this out and it seems to work pretty well, even supports multi-platform output.
Logs are quite verbose after a build, and it looks like there's a transient error that occurs:
INFO[2022-11-01T21:53:03.707692329Z] [core] Subchannel Connectivity change to CONNECTING module=grpc
INFO[2022-11-01T21:53:03.707771334Z] [core] Subchannel picks a new address "localhost" to connect module=grpc
INFO[2022-11-01T21:53:03.707832438Z] [core] Channel Connectivity change to CONNECTING module=grpc
WARN[2022-11-01T21:53:03.707856239Z] [core] grpc: addrConn.createTransport failed to connect to {localhost localhost <nil> <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing only one connection allowed" module=grpc
INFO[2022-11-01T21:53:03.707904743Z] [core] Subchannel Connectivity change to TRANSIENT_FAILURE module=grpc
INFO[2022-11-01T21:53:03.707934945Z] [core] Channel Connectivity change to TRANSIENT_FAILURE module=grpc
ERRO[2022-11-01T21:53:03.708072154Z] healthcheck failed fatally
INFO[2022-11-01T21:53:03.708146258Z] [core] Channel Connectivity change to SHUTDOWN module=grpc
Not sure if that's related to this change or just the containerd snapshotter use-case as a whole.
Yeah, for that specific case, I'm thinking that (ideally) exporters should have a mechanism to "register" themselves, and some way to get that list. Similar to how (e.g.) graph drivers register themselves 🤔
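A registration mechanism like the one suggested above could be sketched as follows. This is a hypothetical illustration, not the PR's actual code; the names `Register`, `List`, and `ExporterFunc` are all assumptions:

```go
// Hypothetical sketch of cache exporters registering themselves, similar
// to how graph drivers register via graphdriver.Register.
package main

import (
	"fmt"
	"sort"
)

// ExporterFunc stands in for a resolved cache-exporter constructor.
type ExporterFunc func() error

var registry = map[string]ExporterFunc{}

// Register adds an exporter under a unique name, typically at init time.
func Register(name string, fn ExporterFunc) {
	registry[name] = fn
}

// List returns the registered exporter names in alphabetical order,
// so callers don't depend on map iteration order.
func List() []string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	Register("inline", func() error { return nil })
	Register("local", func() error { return nil })
	fmt.Println(List()) // sorted exporter names
}
```

With this pattern, the controller could build its exporter map from `List()` instead of hard-coding each entry.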
However, I think in the end we still will need to support the
We already have the

Opened a vendor PR: #45095

Rebased and included the s3 cache support. Now all BuildKit tests should pass.
builder/builder-next/controller.go
Outdated
"inline": inlineremotecache.ResolveCacheExporterFunc(),
"local": localremotecache.ResolveCacheExporterFunc(opt.SessionManager),
"gha": gha.ResolveCacheExporterFunc(),
"s3": s3remotecache.ResolveCacheExporterFunc(),
Comparing these two, I noticed that these are not "symmetrical";

- ❓ is it expected that the `inline` cache is included in Exporters, but not in Importers?
- (nit): perhaps we should sort both lists so that the exporters/importers appear in the same order (we could just sort them alphabetically; then at least there's no bike-shedding about what order they should appear in).
Sorting would definitely make sense!
Regarding the inline cache: according to BuildKit's docs, it embeds the cache into the image and is expected to be imported with the registry cache: https://github.com/moby/buildkit#inline-push-image-and-cache-together
Yes, I was wondering if it was something like that 🤔 it was confusing me a bit, though (we could even consider adding a comment to describe that, if what you linked to is indeed the expected behaviour). /cc @crazy-max @jedevc (if you know)
Yeah, this is expected - they are asymmetrical. Inline cache works fundamentally differently than the other types of cache, though they expose the same interface (which is why we can import them in the same way).
I would really prefer to keep s3/azure caching out of the main dockerd binary.
I'm tempted to say: open this as a separate PR, as there's other caching that we may (or may not?) want to add. But IIUC, it was currently needed to make the BuildKit CI pass (as it runs some tests that expect s3 to be supported?). For the caching, I was wondering how authentication is handled, and it looks like they use environment variables that are supported by the SDKs. If we add support for those (I guess this happens "automatically"?), we at least need to update the daemon documentation to mention the environment variables it may be using.
That said, perhaps we need to look at those;
As discussed in the call:
BuildKit CI was green before those patches were removed (but they allowed us to verify that everything worked).
thaJeztah left a comment:
LGTM (can't approve as this PR was originally opened by me)
@cpuguy83 @neersighted @rumpl PTAL if you're okay with the last iteration.
- Only use the image exporter in build if we don't use containerd
Without this "docker build" fails with:
Error response from daemon: exporter "image" could not be found
- let buildx know we support containerd snapshotter
- Pass the current snapshotter to the buildkit worker
If buildkit uses a different snapshotter we can't list the images any
more because we can't find the snapshot.
builder/builder-next: make ContainerdWorker a minimal wrapper
Note that this makes "Worker" a public field, so technically one could
overwrite it.
builder-next: reenable runc executor
Currently, without special CNI config, the builder would
only create host-network containers, which is a security issue.
Using runc directly instead of the shim is also faster,
as the builder doesn't need anything from the shim. The overhead
of setting up the network sandbox is much slower, of course.
builder/builder-next: simplify options handling
Trying to simplify the logic;
- Use an early return if multiple outputs are provided
- Only construct the list of tags if we're using an image (or moby) exporter
- Combine some logic for snapshotter and non-snapshotter handling
Create a constant for the moby exporter
Pass a context when creating a router
The context has a 10-second timeout, which should be more than enough to
get the answer from containerd.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Co-authored-by: Tonis Tiigi <tonistiigi@gmail.com>
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Co-authored-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
For transparency: this PR does add
I think it's okay to keep that change in, but I'll open a tracking ticket for further discussion on how we want to go forward with configuring these.
Yay!

Upstreaming
No conflicts, except for vendoring / vendor.sum
- A picture of a cute animal (not mandatory but encouraged)