Add workaround for nvcc header dependencies bug #62550

Closed
peterbell10 wants to merge 54 commits into gh/peterbell10/110/base from gh/peterbell10/110/head

peterbell10 (Collaborator) commented Aug 1, 2021

Stack from ghstack:

Differential Revision: D31503351

I noticed that running the build twice in a row caused ~80 CUDA files to be
rebuilt the second time. Running `ninja -d explain` shows:
```
ninja explain: TH/generic/THStorage.h is dirty
ninja explain: TH/generic/THStorageCopy.h is dirty
ninja explain: THC/generic/THCStorage.h is dirty
ninja explain: THC/generic/THCStorageCopy.h is dirty
ninja explain: TH/generic/THTensor.h is dirty
ninja explain: THC/generic/THCTensor.h is dirty
ninja explain: THC/generic/THCTensorCopy.h is dirty
ninja explain: THC/generic/THCTensorMath.h is dirty
ninja explain: THC/generic/THCTensorMathMagma.h is dirty
ninja explain: THC/generic/THCTensorMathPairwise.h is dirty
ninja explain: THC/generic/THCTensorScatterGather.h is dirty
```

Since `ninja` resolves paths relative to the `build` folder, these files don't
actually exist there. I traced this back to the `.d` files written by `nvcc -MD`,
which contain paths relative to the include directory instead of absolute paths.

This adds a small script that launches the compiler and then resolves any relative
paths in the `.d` file before `ninja` reads it. To use it, I run the build with:
```
export CMAKE_CUDA_COMPILER_LAUNCHER="python;`pwd`/tools/nvcc_fix_deps.py;ccache"
```
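For illustration only, the path-resolution step could look roughly like the sketch below. This is not the contents of `tools/nvcc_fix_deps.py`; the helper names (`include_dirs_from_args`, `resolve_include`, `fix_depfile_text`) are hypothetical, and the sketch assumes POSIX paths without spaces or colons and only the `-I` and `-isystem` spellings of the include flags.

```python
import os
from typing import List


def include_dirs_from_args(args: List[str]) -> List[str]:
    """Collect include directories given as -I<dir>, -I <dir> or -isystem <dir>."""
    dirs = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in ("-I", "-isystem") and i + 1 < len(args):
            dirs.append(args[i + 1])
            i += 2
            continue
        if arg.startswith("-I") and len(arg) > 2:
            dirs.append(arg[2:])
        i += 1
    return dirs


def resolve_include(path: str, include_dirs: List[str]) -> str:
    """Return an absolute path for `path`, trying each include dir in order."""
    if os.path.isabs(path):
        return path
    for inc in include_dirs:
        candidate = os.path.join(inc, path)
        if os.path.exists(candidate):
            return os.path.abspath(candidate)
    # Nothing matched: leave the entry untouched rather than guess.
    return path


def fix_depfile_text(text: str, include_dirs: List[str]) -> str:
    """Rewrite the dependency list of a Makefile-style depfile (assumed format)."""
    target, sep, deps = text.partition(":")
    if not sep:
        return text
    # Join backslash-continued lines, then split on whitespace.
    tokens = deps.replace("\\\n", " ").split()
    fixed = [resolve_include(tok, include_dirs) for tok in tokens]
    return target.rstrip() + ": " + " \\\n  ".join(fixed) + "\n"
```

A launcher built on this idea would first run the real compiler command, then read the `.d` file, pass its text through `fix_depfile_text`, and write it back before returning the compiler's exit code.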

There are some possible pitfalls here. The same relative path might resolve against
two different include directories, and the compiler could have picked a different
one than the script does. Or the compiler might have additional implicit include
directories that are needed to resolve the path. However, this has worked perfectly
in my testing, and it's completely opt-in, so it should be fine.
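To make the first pitfall concrete, here is a small hypothetical check (not part of this PR) that reports when a relative header path exists under more than one include directory, which is the case where a first-match resolution might disagree with the compiler. The function names are assumptions made for this sketch.

```python
import os
from typing import List


def candidate_matches(path: str, include_dirs: List[str]) -> List[str]:
    """Return every include directory under which `path` exists."""
    return [inc for inc in include_dirs
            if os.path.exists(os.path.join(inc, path))]


def is_ambiguous(path: str, include_dirs: List[str]) -> bool:
    """True when more than one include directory could supply `path`."""
    return len(candidate_matches(path, include_dirs)) > 1
```

A wrapper could, for example, print a warning for ambiguous entries instead of silently taking the first match. The second pitfall (implicit include directories) would show up here as an empty result from `candidate_matches`.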

[ghstack-poisoned]
facebook-github-bot (Contributor) commented Aug 1, 2021

💊 CI failures summary and remediations

As of commit d6bba04 (more details on the Dr. CI page):


  • 17/17 failures introduced in this PR

🕵️ 17 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build docker-pytorch-linux-xenial-py3-clang5-android-ndk-r19c (1/17)

Step: "Check if image should be built"

ERROR: Something has gone wrong and the previou... isn't available for the merge-base of your branch
+ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3-clang5-android-ndk-r19c:95cdae0bef25f951661d3fb0458cbfc86ca2898e
no such manifest: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3-clang5-android-ndk-r19c:95cdae0bef25f951661d3fb0458cbfc86ca2898e
++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
+ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
95cdae0bef25f951661d3fb0458cbfc86ca2898e
+++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
+ PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
+ [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
+ echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
+ echo '       contact the PyTorch team to restore the original images'
       contact the PyTorch team to restore the original images
+ exit 1


Exited with code exit status 1

See CircleCI build docker-pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7 (2/17)

Step: "Check if image should be built"

ERROR: Something has gone wrong and the previou... isn't available for the merge-base of your branch
+ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7:95cdae0bef25f951661d3fb0458cbfc86ca2898e
no such manifest: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7:95cdae0bef25f951661d3fb0458cbfc86ca2898e
++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
+ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
95cdae0bef25f951661d3fb0458cbfc86ca2898e
+++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
+ PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
+ [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
+ echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
+ echo '       contact the PyTorch team to restore the original images'
       contact the PyTorch team to restore the original images
+ exit 1


Exited with code exit status 1

See CircleCI build docker-pytorch-linux-bionic-rocm4.2-py3.6 (3/17)

Step: "Check if image should be built"

ERROR: Something has gone wrong and the previou... isn't available for the merge-base of your branch
+ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-rocm4.2-py3.6:95cdae0bef25f951661d3fb0458cbfc86ca2898e
no such manifest: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-rocm4.2-py3.6:95cdae0bef25f951661d3fb0458cbfc86ca2898e
++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
+ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
95cdae0bef25f951661d3fb0458cbfc86ca2898e
+++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
+ PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
+ [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
+ echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
+ echo '       contact the PyTorch team to restore the original images'
       contact the PyTorch team to restore the original images
+ exit 1


Exited with code exit status 1

See CircleCI build docker-pytorch-linux-bionic-py3.6-clang9 (4/17)

Step: "Check if image should be built"

ERROR: Something has gone wrong and the previou... isn't available for the merge-base of your branch
+ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-py3.6-clang9:95cdae0bef25f951661d3fb0458cbfc86ca2898e
no such manifest: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-py3.6-clang9:95cdae0bef25f951661d3fb0458cbfc86ca2898e
++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
+ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
95cdae0bef25f951661d3fb0458cbfc86ca2898e
+++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
+ PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
+ [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
+ echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
+ echo '       contact the PyTorch team to restore the original images'
       contact the PyTorch team to restore the original images
+ exit 1


Exited with code exit status 1

See CircleCI build docker-pytorch-linux-xenial-py3.6-gcc5.4 (5/17)

Step: "Check if image should be built"

ERROR: Something has gone wrong and the previou... isn't available for the merge-base of your branch
+ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.6-gcc5.4:95cdae0bef25f951661d3fb0458cbfc86ca2898e
no such manifest: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.6-gcc5.4:95cdae0bef25f951661d3fb0458cbfc86ca2898e
++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
+ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
95cdae0bef25f951661d3fb0458cbfc86ca2898e
+++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
+ PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
+ [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
+ echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
+ echo '       contact the PyTorch team to restore the original images'
       contact the PyTorch team to restore the original images
+ exit 1


Exited with code exit status 1

See CircleCI build docker-pytorch-linux-xenial-py3-clang5-asan (6/17)

Step: "Check if image should be built"

ERROR: Something has gone wrong and the previou... isn't available for the merge-base of your branch
+ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3-clang5-asan:95cdae0bef25f951661d3fb0458cbfc86ca2898e
no such manifest: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3-clang5-asan:95cdae0bef25f951661d3fb0458cbfc86ca2898e
++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
+ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
95cdae0bef25f951661d3fb0458cbfc86ca2898e
+++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
+ PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
+ [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
+ echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
+ echo '       contact the PyTorch team to restore the original images'
       contact the PyTorch team to restore the original images
+ exit 1


Exited with code exit status 1

See CircleCI build docker-pytorch-linux-bionic-rocm4.1-py3.6 (7/17)

Step: "Check if image should be built"

ERROR: Something has gone wrong and the previou... isn't available for the merge-base of your branch
+ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-rocm4.1-py3.6:95cdae0bef25f951661d3fb0458cbfc86ca2898e
no such manifest: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-rocm4.1-py3.6:95cdae0bef25f951661d3fb0458cbfc86ca2898e
++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
+ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
95cdae0bef25f951661d3fb0458cbfc86ca2898e
+++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
+ PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
+ [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
+ echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
+ echo '       contact the PyTorch team to restore the original images'
       contact the PyTorch team to restore the original images
+ exit 1


Exited with code exit status 1

See CircleCI build docker-pytorch-linux-bionic-rocm4.3.1-py3.6 (8/17)

Step: "Check if image should be built"

ERROR: Something has gone wrong and the previou... isn't available for the merge-base of your branch
+ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-rocm4.3.1-py3.6:95cdae0bef25f951661d3fb0458cbfc86ca2898e
no such manifest: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-rocm4.3.1-py3.6:95cdae0bef25f951661d3fb0458cbfc86ca2898e
++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
+ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
95cdae0bef25f951661d3fb0458cbfc86ca2898e
+++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
+ PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
+ [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
+ echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
+ echo '       contact the PyTorch team to restore the original images'
       contact the PyTorch team to restore the original images
+ exit 1


Exited with code exit status 1

See CircleCI build docker-pytorch-linux-xenial-py3.6-gcc7 (9/17)

Step: "Check if image should be built"

ERROR: Something has gone wrong and the previou... isn't available for the merge-base of your branch
+ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.6-gcc7:95cdae0bef25f951661d3fb0458cbfc86ca2898e
no such manifest: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.6-gcc7:95cdae0bef25f951661d3fb0458cbfc86ca2898e
++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
+ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
95cdae0bef25f951661d3fb0458cbfc86ca2898e
+++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
+ PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
+ [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
+ echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
+ echo '       contact the PyTorch team to restore the original images'
       contact the PyTorch team to restore the original images
+ exit 1


Exited with code exit status 1

See GitHub Actions build linux-xenial-py3.6-clang7-asan / build (10/17)

Step: "Check if image should be built"

2021-10-08T02:03:19.2538830Z ERROR: Something h... isn't available for the merge-base of your branch
2021-10-08T02:03:19.2303085Z ++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:19.2503569Z + MERGE_BASE=840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:19.2504601Z + git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:19.2516668Z 95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:19.2521200Z ++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:19.2533569Z + PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:19.2535088Z + [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
2021-10-08T02:03:19.2536709Z + echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
2021-10-08T02:03:19.2537724Z + echo '       contact the PyTorch team to restore the original images'
2021-10-08T02:03:19.2538165Z + exit 1
2021-10-08T02:03:19.2538830Z ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
2021-10-08T02:03:19.2539503Z        contact the PyTorch team to restore the original images
2021-10-08T02:03:19.2543501Z ##[error]Process completed with exit code 1.
2021-10-08T02:03:19.2661455Z ##[group]Run .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:19.2662010Z .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:19.2672643Z shell: /usr/bin/bash -e {0}
2021-10-08T02:03:19.2672963Z env:
2021-10-08T02:03:19.2673470Z   BUILD_ENVIRONMENT: linux-xenial-py3.6-clang7-asan
2021-10-08T02:03:19.2674498Z   DOCKER_IMAGE_BASE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3-clang7-asan
2021-10-08T02:03:19.2675556Z   SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2
2021-10-08T02:03:19.2676544Z   XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla

See GitHub Actions build periodic-pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7-slow-gradcheck / build (11/17)

Step: "Check if image should be built"

2021-10-08T02:03:19.0614096Z ERROR: Something h... isn't available for the merge-base of your branch
2021-10-08T02:03:19.0381529Z ++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:19.0581012Z + MERGE_BASE=840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:19.0581950Z + git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:19.0594031Z 95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:19.0598013Z ++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:19.0609884Z + PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:19.0610785Z + [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
2021-10-08T02:03:19.0612164Z + echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
2021-10-08T02:03:19.0613027Z + echo '       contact the PyTorch team to restore the original images'
2021-10-08T02:03:19.0613451Z + exit 1
2021-10-08T02:03:19.0614096Z ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
2021-10-08T02:03:19.0614729Z        contact the PyTorch team to restore the original images
2021-10-08T02:03:19.0618419Z ##[error]Process completed with exit code 1.
2021-10-08T02:03:19.0731880Z ##[group]Run .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:19.0732467Z .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:19.0742759Z shell: /usr/bin/bash -e {0}
2021-10-08T02:03:19.0743061Z env:
2021-10-08T02:03:19.0744166Z   BUILD_ENVIRONMENT: periodic-pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7-slow-gradcheck
2021-10-08T02:03:19.0745880Z   DOCKER_IMAGE_BASE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7
2021-10-08T02:03:19.0747032Z   SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2
2021-10-08T02:03:19.0747892Z   XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla

See GitHub Actions build linux-xenial-py3.6-clang7-onnx / build (12/17)

Step: "Check if image should be built"

2021-10-08T02:03:18.8133785Z ERROR: Something h... isn't available for the merge-base of your branch
2021-10-08T02:03:18.7899621Z ++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:18.8099660Z + MERGE_BASE=840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:18.8100845Z + git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:18.8112893Z 95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:18.8116629Z ++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:18.8129073Z + PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:18.8130066Z + [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
2021-10-08T02:03:18.8131588Z + echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
2021-10-08T02:03:18.8132711Z + echo '       contact the PyTorch team to restore the original images'
2021-10-08T02:03:18.8133142Z + exit 1
2021-10-08T02:03:18.8133785Z ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
2021-10-08T02:03:18.8134456Z        contact the PyTorch team to restore the original images
2021-10-08T02:03:18.8138094Z ##[error]Process completed with exit code 1.
2021-10-08T02:03:18.8251193Z ##[group]Run .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:18.8251686Z .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:18.8262187Z shell: /usr/bin/bash -e {0}
2021-10-08T02:03:18.8262501Z env:
2021-10-08T02:03:18.8262996Z   BUILD_ENVIRONMENT: linux-xenial-py3.6-clang7-onnx
2021-10-08T02:03:18.8263993Z   DOCKER_IMAGE_BASE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3-clang7-onnx
2021-10-08T02:03:18.8265025Z   SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2
2021-10-08T02:03:18.8265900Z   XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla

See GitHub Actions build linux-xenial-cuda11.3-py3.6-gcc7 / build (13/17)

Step: "Check if image should be built"

2021-10-08T02:03:17.2805141Z ERROR: Something h... isn't available for the merge-base of your branch
2021-10-08T02:03:17.2565635Z + [[ 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0 = \d\4\4\4\6\3\3\1\7\e\4\a\2\c\2\8\e\8\f\4\7\4\7\4\6\7\3\2\f\c\e\b\c\4\f\2\0\c\4\2 ]]
2021-10-08T02:03:17.2567994Z ++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:17.2769812Z + MERGE_BASE=840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:17.2770902Z + git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:17.2783415Z 95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:17.2787075Z ++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:17.2799735Z + PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:17.2800937Z + [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
2021-10-08T02:03:17.2802248Z + echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
2021-10-08T02:03:17.2803704Z + echo '       contact the PyTorch team to restore the original images'
2021-10-08T02:03:17.2805141Z ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
2021-10-08T02:03:17.2806121Z + exit 1
2021-10-08T02:03:17.2806535Z        contact the PyTorch team to restore the original images
2021-10-08T02:03:17.2811188Z ##[error]Process completed with exit code 1.
2021-10-08T02:03:17.2932345Z ##[group]Run .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:17.2932899Z .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:17.2944010Z shell: /usr/bin/bash -e {0}
2021-10-08T02:03:17.2944313Z env:
2021-10-08T02:03:17.2944832Z   BUILD_ENVIRONMENT: linux-xenial-cuda11.3-py3.6-gcc7
2021-10-08T02:03:17.2945981Z   DOCKER_IMAGE_BASE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-cuda11.3-cudnn8-py3-gcc7
2021-10-08T02:03:17.2947239Z   SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2

See GitHub Actions build linux-vulkan-bionic-py3.6-clang9 / build (14/17)

Step: "Check if image should be built"

2021-10-08T02:03:22.9415338Z + echo 'ERROR: Som...'\''t available for the merge-base of your branch'
2021-10-08T02:03:22.9167688Z ++ git rev-parse HEAD
2021-10-08T02:03:22.9179178Z + [[ 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0 = \d\4\4\4\6\3\3\1\7\e\4\a\2\c\2\8\e\8\f\4\7\4\7\4\6\7\3\2\f\c\e\b\c\4\f\2\0\c\4\2 ]]
2021-10-08T02:03:22.9182649Z ++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:22.9381395Z + MERGE_BASE=840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:22.9382530Z + git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:22.9394485Z 95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:22.9398174Z ++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:22.9410421Z + PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:22.9411831Z + [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
2021-10-08T02:03:22.9413644Z ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
2021-10-08T02:03:22.9415338Z + echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
2021-10-08T02:03:22.9416839Z + echo '       contact the PyTorch team to restore the original images'
2021-10-08T02:03:22.9417574Z + exit 1
2021-10-08T02:03:22.9418216Z        contact the PyTorch team to restore the original images
2021-10-08T02:03:22.9421329Z ##[error]Process completed with exit code 1.
2021-10-08T02:03:22.9533129Z ##[group]Run .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:22.9533628Z .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:22.9544011Z shell: /usr/bin/bash -e {0}
2021-10-08T02:03:22.9544309Z env:
2021-10-08T02:03:22.9544833Z   BUILD_ENVIRONMENT: linux-vulkan-bionic-py3.6-clang9
2021-10-08T02:03:22.9545839Z   DOCKER_IMAGE_BASE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-py3.6-clang9

See GitHub Actions build linux-bionic-py3.6-clang9 / build (15/17)

Step: "Check if image should be built"

2021-10-08T02:03:19.9621810Z ERROR: Something h... isn't available for the merge-base of your branch
2021-10-08T02:03:19.9385658Z ++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:19.9587548Z + MERGE_BASE=840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:19.9588844Z + git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:19.9601205Z 95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:19.9604919Z ++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:19.9617130Z + PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:19.9618637Z + [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
2021-10-08T02:03:19.9619858Z + echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
2021-10-08T02:03:19.9620699Z + echo '       contact the PyTorch team to restore the original images'
2021-10-08T02:03:19.9621148Z + exit 1
2021-10-08T02:03:19.9621810Z ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
2021-10-08T02:03:19.9622505Z        contact the PyTorch team to restore the original images
2021-10-08T02:03:19.9626793Z ##[error]Process completed with exit code 1.
2021-10-08T02:03:19.9747091Z ##[group]Run .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:19.9747659Z .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:19.9758838Z shell: /usr/bin/bash -e {0}
2021-10-08T02:03:19.9759171Z env:
2021-10-08T02:03:19.9759639Z   BUILD_ENVIRONMENT: linux-bionic-py3.6-clang9
2021-10-08T02:03:19.9760577Z   DOCKER_IMAGE_BASE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-py3.6-clang9
2021-10-08T02:03:19.9761589Z   SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2
2021-10-08T02:03:19.9762481Z   XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla

See GitHub Actions build linux-xenial-py3.6-gcc5.4 / build (16/17)

Step: "Check if image should be built"

2021-10-08T02:03:19.3791884Z ERROR: Something h... isn't available for the merge-base of your branch
2021-10-08T02:03:19.3542748Z ++ git rev-parse HEAD
2021-10-08T02:03:19.3554479Z + [[ 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0 = \d\4\4\4\6\3\3\1\7\e\4\a\2\c\2\8\e\8\f\4\7\4\7\4\6\7\3\2\f\c\e\b\c\4\f\2\0\c\4\2 ]]
2021-10-08T02:03:19.3557084Z ++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:19.3758171Z + MERGE_BASE=840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:19.3759254Z + git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:19.3772263Z 95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:19.3776177Z ++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:19.3788366Z + PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:19.3789265Z + [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
2021-10-08T02:03:19.3790449Z + echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
2021-10-08T02:03:19.3791884Z ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
2021-10-08T02:03:19.3792717Z + echo '       contact the PyTorch team to restore the original images'
2021-10-08T02:03:19.3793154Z + exit 1
2021-10-08T02:03:19.3793557Z        contact the PyTorch team to restore the original images
2021-10-08T02:03:19.3797572Z ##[error]Process completed with exit code 1.
2021-10-08T02:03:19.3917737Z ##[group]Run .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:19.3918294Z .github/scripts/wait_for_ssh_to_drain.sh
2021-10-08T02:03:19.3929269Z shell: /usr/bin/bash -e {0}
2021-10-08T02:03:19.3929593Z env:
2021-10-08T02:03:19.3930058Z   BUILD_ENVIRONMENT: linux-xenial-py3.6-gcc5.4
2021-10-08T02:03:19.3931007Z   DOCKER_IMAGE_BASE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.6-gcc5.4

See GitHub Actions build linux-xenial-py3.6-gcc7-bazel-test / build-and-test (17/17)

Step: "Check if image should be built"

2021-10-08T02:03:21.1841554Z ERROR: Something h... isn't available for the merge-base of your branch
2021-10-08T02:03:21.1598151Z ++ git rev-parse HEAD
2021-10-08T02:03:21.1609521Z + [[ 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0 = \d\4\4\4\6\3\3\1\7\e\4\a\2\c\2\8\e\8\f\4\7\4\7\4\6\7\3\2\f\c\e\b\c\4\f\2\0\c\4\2 ]]
2021-10-08T02:03:21.1611795Z ++ git merge-base HEAD 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:21.1810345Z + MERGE_BASE=840f9e1c6c8e8adf23e6b9de93aa724a04672aa0
2021-10-08T02:03:21.1811370Z + git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:21.1823178Z 95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:21.1826894Z ++ git rev-parse 840f9e1c6c8e8adf23e6b9de93aa724a04672aa0:.circleci/docker
2021-10-08T02:03:21.1838331Z + PREVIOUS_DOCKER_TAG=95cdae0bef25f951661d3fb0458cbfc86ca2898e
2021-10-08T02:03:21.1839194Z + [[ 95cdae0bef25f951661d3fb0458cbfc86ca2898e = \9\5\c\d\a\e\0\b\e\f\2\5\f\9\5\1\6\6\1\d\3\f\b\0\4\5\8\c\b\f\c\8\6\c\a\2\8\9\8\e ]]
2021-10-08T02:03:21.1840374Z + echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
2021-10-08T02:03:21.1841554Z ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
2021-10-08T02:03:21.1842366Z + echo '       contact the PyTorch team to restore the original images'
2021-10-08T02:03:21.1842953Z + exit 1
2021-10-08T02:03:21.1843365Z        contact the PyTorch team to restore the original images
2021-10-08T02:03:21.1847263Z ##[error]Process completed with exit code 1.
2021-10-08T02:03:21.1965440Z ##[group]Run # Ensure the working directory gets chowned back to the current user
2021-10-08T02:03:21.1966258Z # Ensure the working directory gets chowned back to the current user
2021-10-08T02:03:21.1966852Z docker run --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
2021-10-08T02:03:21.1978269Z shell: /usr/bin/bash -e {0}
2021-10-08T02:03:21.1978585Z env:
2021-10-08T02:03:21.1979123Z   BUILD_ENVIRONMENT: linux-xenial-py3.6-gcc7-bazel-test

This comment was automatically generated by Dr. CI.

I noticed that running the build twice in a row resulted in ~80 CUDA files being
rebuilt. Running `ninja -d explain` shows
```
ninja explain: TH/generic/THStorage.h is dirty
ninja explain: TH/generic/THStorageCopy.h is dirty
ninja explain: THC/generic/THCStorage.h is dirty
ninja explain: THC/generic/THCStorageCopy.h is dirty
ninja explain: TH/generic/THTensor.h is dirty
ninja explain: THC/generic/THCTensor.h is dirty
ninja explain: THC/generic/THCTensorCopy.h is dirty
ninja explain: THC/generic/THCTensorMath.h is dirty
ninja explain: THC/generic/THCTensorMathMagma.h is dirty
ninja explain: THC/generic/THCTensorMathPairwise.h is dirty
ninja explain: THC/generic/THCTensorScatterGather.h is dirty
```

Since `ninja` resolves paths relative to the `build` folder, these files don't
actually exist at those locations. I traced this back to the output of
`nvcc -MD`, which contains paths relative to the include directory instead of
absolute paths.
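
To illustrate the diagnosis, here is a minimal sketch (not the script from this PR) that lists the entries of a Makefile-style `.d` file that are relative and do not exist under the build directory. The function names `parse_deps` and `missing_relative_deps` are hypothetical, and the parsing is simplified on the assumption that paths contain no escaped spaces or colons.

```python
from pathlib import Path
from typing import List


def parse_deps(depfile_text: str) -> List[str]:
    """Return the prerequisite paths from Makefile-style dependency text.

    Simplification (assumption): paths contain no escaped spaces or colons.
    """
    # Join backslash-continued lines into one logical line.
    joined = depfile_text.replace("\\\n", " ")
    deps: List[str] = []
    for line in joined.splitlines():
        if ":" not in line:
            continue
        # Everything after the first ':' is the prerequisite list.
        _, _, prereqs = line.partition(":")
        deps.extend(prereqs.split())
    return deps


def missing_relative_deps(depfile_text: str, build_dir: Path) -> List[str]:
    """List relative prerequisites that do not exist under build_dir."""
    return [
        dep
        for dep in parse_deps(depfile_text)
        if not Path(dep).is_absolute() and not (build_dir / dep).exists()
    ]
```

Any path this reports is one that a build tool working from `build_dir` would fail to find, which is consistent with the "is dirty" messages above.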

This adds a small script that launches the compiler and then resolves any
relative paths in the `.d` file before `ninja` reads it. To use it, I run the
build with
```
export CMAKE_CUDA_COMPILER_LAUNCHER="python;`pwd`/tools/nvcc_fix_deps.py;ccache"
```
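
For readers unfamiliar with compiler launchers: CMake prefixes the compile command with the launcher entries, so the script receives the rest of the command line (here `ccache`, then the compiler and its flags) as its arguments. The following is a rough sketch of that wrapper shape, not the contents of `tools/nvcc_fix_deps.py`. The helper `find_depfile` is hypothetical, and it assumes the dependency file is passed as `-MF <path>`, which may not match every generator or compiler version.

```python
import subprocess
import sys
from typing import List, Optional


def find_depfile(args: List[str]) -> Optional[str]:
    """Return the path following a '-MF' flag, if present.

    Assumption: the build system passes the dependency file as '-MF <path>'.
    """
    for i, arg in enumerate(args[:-1]):
        if arg == "-MF":
            return args[i + 1]
    return None


def main(argv: List[str]) -> int:
    """Run the wrapped compiler command, then locate its dependency file."""
    command = argv[1:]  # e.g. ["ccache", "nvcc", ...]
    if not command:
        return 1
    result = subprocess.run(command)
    if result.returncode != 0:
        # Propagate compiler failures unchanged.
        return result.returncode
    depfile = find_depfile(command)
    if depfile is not None:
        # Placeholder: a real script would rewrite relative paths here.
        print(f"would post-process {depfile}", file=sys.stderr)
    return 0
```

A real launcher would end with `sys.exit(main(sys.argv))` so that the build system sees the compiler's exit code.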

There are some possible pitfalls here. The same relative path might resolve
under two include directories, and the compiler could have picked a different
one than the script does. Or the compiler might have additional implicit
include directories that are needed to resolve the path. However, this has
worked perfectly in my testing, and it's completely opt-in, so it should be fine.
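
The first pitfall can be made concrete with a small sketch. This is an illustrative variant rather than the behaviour of the script in this PR: it refuses to guess when a relative path exists under more than one include directory, or under none. The names `candidate_paths` and `resolve_include` are hypothetical, and implicit compiler include directories are assumed to have been added to `include_dirs` by the caller.

```python
from pathlib import Path
from typing import List, Sequence


def candidate_paths(path: Path, include_dirs: Sequence[Path]) -> List[Path]:
    """Return every distinct existing location of `path` under the include dirs."""
    if path.is_absolute():
        return [path]
    seen: List[Path] = []
    for d in include_dirs:
        candidate = (d / path).resolve()
        if candidate.exists() and candidate not in seen:
            seen.append(candidate)
    return seen


def resolve_include(path: Path, include_dirs: Sequence[Path]) -> Path:
    """Resolve `path`, raising when it is missing or ambiguous."""
    matches = candidate_paths(path, include_dirs)
    if len(matches) == 1:
        return matches[0]
    tried = "\n  ".join(str(d / path) for d in include_dirs)
    problem = "ambiguous" if matches else "not found"
    raise RuntimeError(f"{path} is {problem}; tried:\n  {tried}")
```

Failing loudly on ambiguity is stricter than a first-match lookup. The trade-off is that it can reject builds that a first-match lookup would have handled correctly.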

[ghstack-poisoned]
peterbell10 added a commit that referenced this pull request Aug 2, 2021
ghstack-source-id: 123c589
Pull Request resolved: #62550
peterbell10 added a commit that referenced this pull request Aug 2, 2021
ghstack-source-id: 2d1fd3a
Pull Request resolved: #62550
peterbell10 added a commit that referenced this pull request Aug 2, 2021
ghstack-source-id: c2f1aed
Pull Request resolved: #62550
if abs_path.exists():
    return abs_path

paths = '\n '.join(str(d / path) for d in include_dirs)
Contributor

Nit (it goes against my gut feeling, but I was told by many PyTorch contributors that they like to follow this rule: https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#strings)

Suggested change
- paths = '\n '.join(str(d / path) for d in include_dirs)
+ paths = "\n ".join(str(d / path) for d in include_dirs)

@malfet (Contributor) left a comment

…es bug"

peterbell10 added a commit that referenced this pull request Sep 23, 2021
ghstack-source-id: 234f436
Pull Request resolved: #62550
peterbell10 added a commit that referenced this pull request Sep 23, 2021
ghstack-source-id: 28265e9
Pull Request resolved: #62550
peterbell10 added a commit that referenced this pull request Oct 6, 2021
ghstack-source-id: 798ce27
Pull Request resolved: #62550
peterbell10 added a commit that referenced this pull request Oct 7, 2021
ghstack-source-id: 864176b
Pull Request resolved: #62550
peterbell10 added a commit that referenced this pull request Oct 7, 2021
ghstack-source-id: 2a4b1ab
Pull Request resolved: #62550
peterbell10 added a commit that referenced this pull request Oct 7, 2021
ghstack-source-id: dbc1415
Pull Request resolved: #62550
peterbell10 added a commit that referenced this pull request Oct 8, 2021
ghstack-source-id: 0dc9305
Pull Request resolved: #62550
@malfet
Contributor

malfet commented Oct 8, 2021

@malfet has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot facebook-github-bot deleted the gh/peterbell10/110/head branch October 15, 2021 14:24