🚀 Feature
Motivation
The workflow for building docker images for CI today is painful. It involves:
- Editing the CircleCI configuration
- Reverting said configuration
- Copying the workflow ID from step 1
- Editing all related files so the image tag is the new workflow ID
- Adding the workflow ID to the ECR garbage collector so the image isn't cleaned up
- Getting everything merged
Pitch
To make this easier, I propose we take a two-tiered approach:
- Calculate a hash of the docker image and all of its dependencies using a hashing tool (like sha256sum); the hash should change if any file changes
- Check whether that hash already exists in our docker repositories for said image using docker manifest inspect ${IMAGE}:${HASH}
- If the hash exists, do nothing; if it doesn't, build the image
- Save the hash to an env file that gets passed on to dependent jobs
- Have all jobs that use this image depend on the job that builds the image
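A rough sketch of what the check-then-build step could look like. The function names, the argument order, and the DOCKER_IMAGE / DOCKER_TAG variable names in the env file are all made up for illustration; only docker manifest inspect, docker build and docker push are real commands. Depending on the Docker version, docker manifest may need experimental CLI features enabled.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Succeeds only if the registry already has a manifest for the given reference.
# Note: any failure (including an auth error) is treated as "not found".
image_exists() {
  docker manifest inspect "$1" >/dev/null 2>&1
}

# Hypothetical helper: build and push IMAGE:HASH unless it is already
# published, then record the tag in an env file for dependent jobs.
# Arguments: image name, content hash, build context, env file path.
ensure_image() {
  local image="$1" hash="$2" context="$3" env_file="$4"
  if image_exists "${image}:${hash}"; then
    echo "${image}:${hash} already exists, skipping build" >&2
  else
    docker build -t "${image}:${hash}" "${context}" >&2
    docker push "${image}:${hash}" >&2
  fi
  # Variable names below are assumptions, not an existing convention.
  printf 'DOCKER_IMAGE=%s\nDOCKER_TAG=%s\n' "${image}" "${hash}" > "${env_file}"
}
```

A build job would call something like ensure_image "${IMAGE}" "${HASH}" docker docker_image.env and hand the env file to the jobs that depend on it. Since an inspect failure of any kind falls through to a rebuild, the worst case is an unnecessary build rather than a missing image.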
Example script to calculate hash
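#!/usr/bin/env bash
# Print one sha256 digest covering every file under HASH_DIR (default: docker).
set -euo pipefail
HASH_DIR=${HASH_DIR:-docker}
# Hash each file, sort the lines so the result does not depend on traversal
# order, then hash the sorted list. LC_ALL=C keeps the sort locale-independent.
find "${HASH_DIR}" -type f -exec sha256sum {} \; \
| LC_ALL=C sort \
| sha256sum \
| cut -d' ' -f1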
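As a sanity check, here is a small sketch showing that this kind of hash is stable across runs and changes when a file changes. The hash_dir function is a hypothetical wrapper around the same pipeline as the script above, with one variation: it changes into the directory first so the hashed paths do not depend on where the directory lives. The temporary directory and its Dockerfile are throwaway test data.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical wrapper: digest of all files under the directory given as $1.
# Runs in a subshell so the caller's working directory is untouched.
hash_dir() {
  (cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort | sha256sum | cut -d' ' -f1)
}

demo="$(mktemp -d)"
echo 'FROM ubuntu' > "${demo}/Dockerfile"

before="$(hash_dir "${demo}")"   # first run
again="$(hash_dir "${demo}")"    # same content, should match the first run

echo 'RUN true' >> "${demo}/Dockerfile"
after="$(hash_dir "${demo}")"    # content changed, should differ

echo "before=${before}"
echo "again=${again}"
echo "after=${after}"
```

Because each line fed to the final sha256sum includes the file path, renaming or moving a file also changes the hash, not just editing its contents.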
Additional notes
- I feel like we should move off of ECR if at all possible; Docker Hub would let us stop relying on ECR garbage collection
- With this, I also propose we eventually think about moving the images that are currently built in pytorch/builder to be built in this repo as well, so that they can reap the benefits of this new docker builder workflow
cc @ezyang @seemethere @malfet