Skip to content

Introduce Docker layer caching#1559

Merged
tynes merged 3 commits intoethereum-optimism:developfrom
mslipper:feature/layer-caching
Oct 12, 2021
Merged

Introduce Docker layer caching#1559
tynes merged 3 commits intoethereum-optimism:developfrom
mslipper:feature/layer-caching

Conversation

@mslipper
Copy link
Copy Markdown
Contributor

@mslipper mslipper commented Oct 10, 2021

Overview

This PR improves the performance of our integration tests by adding Docker layer caching. After running a couple of test builds on my fork, I was able to reduce the time taken by the "build services" step from 15 minutes to under 5.

Implementation

To make layer caching work I did the following things:

  1. Set up a new GH Actions cache step.
  2. Updated the build-ci.sh script to use BuildKit/BuildX, then configured the builder to store layers in the cache directory.
  3. Updated the Dockerfiles that depend on the builder image to support a custom registry URL. I had to do this in order to get the builder image out of BuildKit and into Docker since BuildKit isolates every image in its own container.

Caveats

  • The 15 minutes to under 5 result assumes a warm cache. Exact timings will vary based on which code is changed in each commit.
  • Worst-case build time is slightly increased (~1m) because of cache write overhead.
  • BuildX has a native GH Actions cache backend. However, I found that this backend did not reliably use the cache even when it could so I rolled my own.
  • @gakonst as far as I could tell the yarn build in the build-ci.sh script was superfluous so I removed it. Let me know if this is incorrect and I'll fix it. The tests all passed after removing it.

@changeset-bot
Copy link
Copy Markdown

changeset-bot bot commented Oct 10, 2021

⚠️ No Changeset found

Latest commit: ed52b25

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@github-actions github-actions bot added 2-reviewers M-ci Meta: ci related work A-ops Area: ops labels Oct 10, 2021
@snario snario requested review from gakonst and snario October 10, 2021 21:24
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Oct 10, 2021

Codecov Report

Merging #1559 (ed52b25) into develop (ad4b432) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff            @@
##           develop    #1559   +/-   ##
========================================
  Coverage    76.71%   76.71%           
========================================
  Files           82       82           
  Lines         3032     3032           
  Branches       463      463           
========================================
  Hits          2326     2326           
  Misses         706      706           
Flag Coverage Δ
batch-submitter 61.74% <ø> (ø)
contracts 86.05% <ø> (ø)
core-utils 65.03% <ø> (ø)
data-transport-layer 37.86% <ø> (ø)
message-relayer 83.48% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ad4b432...ed52b25. Read the comment docs.

Copy link
Copy Markdown
Contributor

@snario snario left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me and really easy to understand. Had a couple questions and a code style comment.

@@ -1,4 +1,5 @@
FROM ethereumoptimism/builder AS builder
ARG LOCAL_REGISTRY=docker.io
FROM ${LOCAL_REGISTRY}/ethereumoptimism/builder AS builder
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more of a question, and not a comment on this PR, but do you know if it would be possible to also pull a tagged version of this builder image with an ARG too? We recently had some confusion on the team due to this being pushed as latest on every CI run.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, we can pull a tagged version of the builder image using an ARG. Doing so introduces some funky behavior around using the ARG elsewhere in the build but it's totally doable. Here's the Docker documentation on doing this:

https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact

id: buildx
with:
version: latest
driver-opts: image=moby/buildkit:master,network=host
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does image=moby/buildkit:master basically mean "use the docker-container backend"? I was just curious how we toggle between the native binary and this option.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This directive specifies configures the docker-container backend. To toggle between backends, you'd specify the driver option.

I'm using the master version of buildkit here because of this bug, which requires changes both in buildkit and containerd. Unfortunately, GH Action's version of containerd is out of date and does not contain the fix. By specifying the version here the bug will hopefully fix itself as soon as GitHub updates containerd since we're already at the correct bulidkit version.

mslipper and others added 2 commits October 10, 2021 16:13
Co-authored-by: Liam Horne <liam@lihorne.com>
@mslipper
Copy link
Copy Markdown
Contributor Author

mslipper commented Oct 10, 2021

Now that the cache is warm, I rebuilt the integration tests and got the following results:

  • Total time (first/second build): 28m 52s/23m 43s
  • Service build time: (first/second build): 11m 56s/4m 29s

Turns out the --parallel option in the previous version of the script was being ignored, so even though this was a clean-slate build it was still faster than previously.

Copy link
Copy Markdown
Contributor

@gakonst gakonst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - Matt and I synced on this offline. The warm/cold cache caveat seems fine to me.

@tynes tynes merged commit e5c2686 into ethereum-optimism:develop Oct 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-ops Area: ops M-ci Meta: ci related work

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants