[CI] Use sccache installed in docker image in xla build #153002
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/153002
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 18 Cancelled Jobs, 2 Unrelated Failures
As of commit 7be1f26 with merge base 5fe58ab:
NEW FAILURE - The following job has failed:
CANCELLED JOBS - The following jobs were cancelled. Please retry:
UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot merge -f "Sparta!!! (i.e. let's test in trunk)"
@pytorchbot merge -f "should be fine, xla is fine and I checked that the docker digest is the new one, canceled jobs are dup pull jobs"
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki.
Questions? Feedback? Please reach out to the PyTorch DevX Team
@pytorchbot cherry-pick --onto release/2.7 -c critical
Cherry picking #153002
The cherry pick PR is at #153983 and it is recommended to link a critical cherry pick PR with an issue. The following tracker issues are updated:
Details for Dev Infra team: Raised by workflow job
[CI] Use sccache installed in docker image in xla build (#153002)
Pull Request resolved: #153002
Approved by: https://github.com/malfet
(cherry picked from commit cbcb57d)
Co-authored-by: Catherine Lee <csl@fb.com>
The edited comment should have the info. The code change looks large, but it's copied from the install_cache script that our docker images use:
pytorch/.ci/docker/common/install_cache.sh
Line 42 in 6a80064
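For readers unfamiliar with that kind of script, the general idea is a compiler wrapper: a small stub named after the compiler is placed in a directory that comes early on PATH, and the stub forwards every invocation to sccache. The following is only a simplified sketch of that idea, not the code from install_cache.sh; the function name `write_stub`, the `STUB_DIR` variable, and its default location are assumptions made for illustration.

```shell
#!/bin/sh
# Simplified sketch of a compiler-wrapper stub; NOT the code from install_cache.sh.
# STUB_DIR and write_stub are hypothetical names chosen for this example.
STUB_DIR="${STUB_DIR:-/tmp/sccache-stubs}"
mkdir -p "$STUB_DIR"

write_stub() {
  # $1: compiler name (e.g. gcc); $2: absolute path to the real compiler.
  # The generated stub runs: sccache <real compiler> <original arguments>.
  # Assumes the compiler path contains no whitespace.
  printf '#!/bin/sh\nexec sccache %s "$@"\n' "$2" > "$STUB_DIR/$1"
  chmod a+x "$STUB_DIR/$1"
}

# Example: wrap gcc only if it is installed on this machine.
if real_gcc="$(command -v gcc)"; then
  write_stub gcc "$real_gcc"
fi
```

A build would then run with `STUB_DIR` placed first on PATH. In this sketch the real compiler must be resolved before `STUB_DIR` is added to PATH, otherwise `command -v gcc` would find the stub itself and the stub would call itself through sccache.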
sccache stopped working on xla at some point around Dec 17, 2023. I am not sure which commit caused it; I think it was having trouble writing to the cache.
Either way, there is an sccache already installed on the docker image, so we should use that instead of a binary from S3 whose origin and source commit we are probably no longer sure of.
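As a rough illustration of "use what is on the image", a build script can look up sccache on PATH instead of downloading a binary. This is a sketch under assumptions, not the change made in this PR: the helper name `find_sccache` and the fallback message are hypothetical.

```shell
# Sketch: prefer the sccache that is already on PATH over a downloaded binary.
# find_sccache is a hypothetical helper name used only in this example.
find_sccache() {
  if command -v sccache >/dev/null 2>&1; then
    # Print the resolved location so callers can log or reuse it.
    command -v sccache
  else
    echo "sccache not found on PATH; building without a compiler cache" >&2
    return 1
  fi
}

if SCCACHE_BIN="$(find_sccache)"; then
  # Logging the version makes it easy to see which build of sccache the job used.
  "$SCCACHE_BIN" --version
fi
```

Logging the resolved path and version in the job output addresses the traceability concern above, since the binary in use can always be matched to the image it came from.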
The one in the docker image is installed here https://github.com/pytorch/xla/blob/69d438ee65cc250c974ca80edd80462ffbb2e163/.github/upstream/Dockerfile#L61 and is also very old, so I have pytorch/xla#9102 to update it.
sccache is still not writing properly and I will investigate, but the xla build is currently broken after the above xla PR, and this should fix it.