[RELAND] [RELAND] .circleci: Improve docker image build workflow#38335
Closed
seemethere wants to merge 2 commits intopytorch:masterfrom
Closed
[RELAND] [RELAND] .circleci: Improve docker image build workflow#38335seemethere wants to merge 2 commits intopytorch:masterfrom
seemethere wants to merge 2 commits intopytorch:masterfrom
Conversation
…ocker image build workflow" This reverts commit 6e66e85.
💊 CI failures summary and remediationsAs of commit edbdc85 (more details on the Dr. CI page):
ci.pytorch.org: 1 failedThis comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions on the GitHub issue tracker. This comment has been revised 10 times. |
6c50de3 to
cc99a12
Compare
malfet
approved these changes
May 12, 2020
Contributor
|
Can you please pull this one into |
Member
Author
|
Running |
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
cc99a12 to
edbdc85
Compare
Contributor
facebook-github-bot
left a comment
There was a problem hiding this comment.
@seemethere is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Contributor
|
@seemethere merged this pull request in cdf4d42. |
This was referenced May 14, 2020
facebook-github-bot
pushed a commit
that referenced
this pull request
May 14, 2020
Summary: Previous attempts to get this right: * #38335 * #38279 * #37976 This tag kept getting deleted before the docker image ci workflow could be merged causing it to have upstream breakages. It'd be best to make sure the garbage collector just doesnt garbage collect it. This is a pre-step to merge #38484 Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Pull Request resolved: #38483 Differential Revision: D21577359 Pulled By: seemethere fbshipit-source-id: c4e0709bd8fff8f24a988b60eaa9f8c01576ef2f
facebook-github-bot
pushed a commit
that referenced
this pull request
May 15, 2020
Summary: closes #37855 Relies on #38483 Previous attempts to get this right: * #38335 * #38279 * #37976 This reverts commit 8063960. Improves the docker image build workflow from many steps to basically transparent from a user's perspective. To update docker images now all one has to do is edit the .circleci/docker folder and it will update automatically and also dynamically add the tags to the list of tags to keep from the garbage collector. Adding a new image will currently stay the same but we can explore doing that dynamically as well. How the build workflow works: - Docker tags are determined by the hash defined from git for the .circleci/docker sub-directory (extracted using git rev-parse) - Images are only built if the computed hash is not found in ecr and the hash is different than the previously computed hash. The previously computed hash is found using the same process as before but subbing out HEAD for the merge base between HEAD and the base git revision - That tag is then passed through the jobs using a shared workspace which is added to downstream jobs using the circleci ${BASH_ENV} How the new garbage collection works: - Tags to keep are generated by stepping through all of the commits in in the .circleci/docker subdirectory Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Pull Request resolved: #38484 Differential Revision: D21585458 Pulled By: seemethere fbshipit-source-id: 37792a1e0f5e5531438c4ae61507639c133aa76d
This was referenced May 18, 2020
laurentdupin
pushed a commit
to laurentdupin/pytorch
that referenced
this pull request
Apr 24, 2026
…orch#38335) Summary: This reverts commit 401ffb0. Two things learned from the previous reland: * `cirlceci-agent step halt` doesn't actually halt the step in place, you must explicitly exit the step after the `step halt` is called * Even though `circleci` uses `git` to checkout repositories inside of docker images, that does not mean `git` is available after the fact. <details> <summary> Changes from previous reland </summary> ```patch commit cc99a12 Author: Eli Uriegas <eliuriegas@fb.com> Date: Tue May 12 10:58:18 2020 -0700 .cirlceci: Install git for gc, exit step explicitly Signed-off-by: Eli Uriegas <eliuriegas@fb.com> diff --git a/.circleci/config.yml b/.circleci/config.yml index 481d788..856a0fb 100644 --- a/.circleci/config.yml +++ b/.circleci/config.yml @@ -2018,13 +2018,15 @@ jobs: export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_DOCKER_BUILDER_V1} eval $(aws ecr get-login --no-include-email --region us-east-1) set -x + PREVIOUS_DOCKER_TAG=$(git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.circleci/docker") # Check if image already exists, if it does then skip building it if docker manifest inspect "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/${IMAGE_NAME}:${DOCKER_TAG}"; then circleci-agent step halt + # circleci-agent step halt doesn't actually halt the step so we need to + # explicitly exit the step here ourselves before it causes too much trouble + exit 0 fi - PREVIOUS_DOCKER_TAG=$(git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.circleci/docker") # If no image exists but the hash is the same as the previous hash then we should error out here - # no stampeding herd effect plz. if [[ ${PREVIOUS_DOCKER_TAG} = ${DOCKER_TAG} ]]; then echo "ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch" echo " contact the PyTorch team to restore the original images" diff --git a/.circleci/ecr_gc_docker/Dockerfile b/.circleci/ecr_gc_docker/Dockerfile index d0198ac..36347d5 100644 --- a/.circleci/ecr_gc_docker/Dockerfile +++ b/.circleci/ecr_gc_docker/Dockerfile @@ -1,6 +1,6 @@ FROM ubuntu:16.04 -RUN apt-get update && apt-get install -y python-pip && rm -rf /var/lib/apt/lists/* /var/log/dpkg.log +RUN apt-get update && apt-get install -y git python-pip && rm -rf /var/lib/apt/lists/* /var/log/dpkg.log ADD requirements.txt /requirements.txt diff --git a/.circleci/verbatim-sources/docker_jobs.yml b/.circleci/verbatim-sources/docker_jobs.yml index e04d11c..3918cc0 100644 --- a/.circleci/verbatim-sources/docker_jobs.yml +++ b/.circleci/verbatim-sources/docker_jobs.yml @@ -35,13 +35,15 @@ export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_DOCKER_BUILDER_V1} eval $(aws ecr get-login --no-include-email --region us-east-1) set -x + PREVIOUS_DOCKER_TAG=$(git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.circleci/docker") # Check if image already exists, if it does then skip building it if docker manifest inspect "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/${IMAGE_NAME}:${DOCKER_TAG}"; then circleci-agent step halt + # circleci-agent step halt doesn't actually halt the step so we need to + # explicitly exit the step here ourselves before it causes too much trouble + exit 0 fi - PREVIOUS_DOCKER_TAG=$(git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.circleci/docker") # If no image exists but the hash is the same as the previous hash then we should error out here - # no stampeding herd effect plz. if [[ ${PREVIOUS_DOCKER_TAG} = ${DOCKER_TAG} ]]; then echo "ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch" echo " contact the PyTorch team to restore the original images" ``` </details> Pull Request resolved: pytorch#38335 Differential Revision: D21536269 Pulled By: seemethere fbshipit-source-id: 5577f84fa49dd6e1e88fce461646fd68be3d417d
laurentdupin
pushed a commit
to laurentdupin/pytorch
that referenced
this pull request
Apr 24, 2026
Summary: Previous attempts to get this right: * pytorch#38335 * pytorch#38279 * pytorch#37976 This tag kept getting deleted before the docker image ci workflow could be merged causing it to have upstream breakages. It'd be best to make sure the garbage collector just doesnt garbage collect it. This is a pre-step to merge pytorch#38484 Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Pull Request resolved: pytorch#38483 Differential Revision: D21577359 Pulled By: seemethere fbshipit-source-id: c4e0709bd8fff8f24a988b60eaa9f8c01576ef2f
laurentdupin
pushed a commit
to laurentdupin/pytorch
that referenced
this pull request
Apr 24, 2026
Summary: closes pytorch#37855 Relies on pytorch#38483 Previous attempts to get this right: * pytorch#38335 * pytorch#38279 * pytorch#37976 This reverts commit 46bc26c. Improves the docker image build workflow from many steps to basically transparent from a user's perspective. To update docker images now all one has to do is edit the .circleci/docker folder and it will update automatically and also dynamically add the tags to the list of tags to keep from the garbage collector. Adding a new image will currently stay the same but we can explore doing that dynamically as well. How the build workflow works: - Docker tags are determined by the hash defined from git for the .circleci/docker sub-directory (extracted using git rev-parse) - Images are only built if the computed hash is not found in ecr and the hash is different than the previously computed hash. The previously computed hash is found using the same process as before but subbing out HEAD for the merge base between HEAD and the base git revision - That tag is then passed through the jobs using a shared workspace which is added to downstream jobs using the circleci ${BASH_ENV} How the new garbage collection works: - Tags to keep are generated by stepping through all of the commits in in the .circleci/docker subdirectory Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Pull Request resolved: pytorch#38484 Differential Revision: D21585458 Pulled By: seemethere fbshipit-source-id: 37792a1e0f5e5531438c4ae61507639c133aa76d
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This reverts commit 6e66e85.
Two things learned from the previous reland:
cirlceci-agent step haltdoesn't actually halt the step in place, you must explicitly exit the step after thestep haltis calledcircleciusesgitto checkout repositories inside of docker images, that does not meangitis available after the fact.Changes from previous reland
commit cc99a12c9029472bd73325876bc0e9dbb1746b05 Author: Eli Uriegas <eliuriegas@fb.com> Date: Tue May 12 10:58:18 2020 -0700 .cirlceci: Install git for gc, exit step explicitly Signed-off-by: Eli Uriegas <eliuriegas@fb.com> diff --git a/.circleci/config.yml b/.circleci/config.yml index 481d7889da..856a0fb10a 100644 --- a/.circleci/config.yml +++ b/.circleci/config.yml @@ -2018,13 +2018,15 @@ jobs: export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_DOCKER_BUILDER_V1} eval $(aws ecr get-login --no-include-email --region us-east-1) set -x + PREVIOUS_DOCKER_TAG=$(git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.circleci/docker") # Check if image already exists, if it does then skip building it if docker manifest inspect "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/${IMAGE_NAME}:${DOCKER_TAG}"; then circleci-agent step halt + # circleci-agent step halt doesn't actually halt the step so we need to + # explicitly exit the step here ourselves before it causes too much trouble + exit 0 fi - PREVIOUS_DOCKER_TAG=$(git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.circleci/docker") # If no image exists but the hash is the same as the previous hash then we should error out here - # no stampeding herd effect plz. if [[ ${PREVIOUS_DOCKER_TAG} = ${DOCKER_TAG} ]]; then echo "ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch" echo " contact the PyTorch team to restore the original images" diff --git a/.circleci/ecr_gc_docker/Dockerfile b/.circleci/ecr_gc_docker/Dockerfile index d0198acb86..36347d5e6d 100644 --- a/.circleci/ecr_gc_docker/Dockerfile +++ b/.circleci/ecr_gc_docker/Dockerfile @@ -1,6 +1,6 @@ FROM ubuntu:16.04 -RUN apt-get update && apt-get install -y python-pip && rm -rf /var/lib/apt/lists/* /var/log/dpkg.log +RUN apt-get update && apt-get install -y git python-pip && rm -rf /var/lib/apt/lists/* /var/log/dpkg.log ADD requirements.txt /requirements.txt diff --git a/.circleci/verbatim-sources/docker_jobs.yml b/.circleci/verbatim-sources/docker_jobs.yml index e04d11c5cd..3918cc04ae 100644 --- a/.circleci/verbatim-sources/docker_jobs.yml +++ b/.circleci/verbatim-sources/docker_jobs.yml @@ -35,13 +35,15 @@ export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_DOCKER_BUILDER_V1} eval $(aws ecr get-login --no-include-email --region us-east-1) set -x + PREVIOUS_DOCKER_TAG=$(git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.circleci/docker") # Check if image already exists, if it does then skip building it if docker manifest inspect "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/${IMAGE_NAME}:${DOCKER_TAG}"; then circleci-agent step halt + # circleci-agent step halt doesn't actually halt the step so we need to + # explicitly exit the step here ourselves before it causes too much trouble + exit 0 fi - PREVIOUS_DOCKER_TAG=$(git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.circleci/docker") # If no image exists but the hash is the same as the previous hash then we should error out here - # no stampeding herd effect plz. if [[ ${PREVIOUS_DOCKER_TAG} = ${DOCKER_TAG} ]]; then echo "ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch" echo " contact the PyTorch team to restore the original images"