
[CUDA12] Make PyTorch compatible with CUDA 12#91118

Closed
jianyuh wants to merge 1 commit into pytorch:master from jianyuh:pytorch_cuda12

Conversation

@jianyuh
Member

@jianyuh jianyuh commented Dec 19, 2022

Fixes the failures when building PyTorch from source with CUDA 12:

In file included from /home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAFunctions.h:12,
                 from /home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAStream.h:10,
                 from /home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAGraphsC10Utils.h:3,
                 from /home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.h:5,
                 from /home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp:2:
/home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp: In member function ‘void at::cuda::CUDAGraph::capture_end()’:
/home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp:168:75: warning: converting to non-pointer type ‘long long unsigned int’ from NULL [-Wconversion-null]
     AT_CUDA_CHECK(cudaGraphInstantiate(&graph_exec_, graph_, NULL, NULL, 0));
                                                                           ^
/home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAException.h:31:42: note: in definition of macro ‘C10_CUDA_CHECK’
     C10_UNUSED const cudaError_t __err = EXPR;                           \
                                          ^~~~
/home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp:168:5: note: in expansion of macro ‘AT_CUDA_CHECK’
     AT_CUDA_CHECK(cudaGraphInstantiate(&graph_exec_, graph_, NULL, NULL, 0));
     ^~~~~~~~~~~~~
/home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp:168:75: error: too many arguments to function ‘cudaError_t cudaGraphInstantiate(CUgraphExec_st**, cudaGraph_t, long long unsigned int)’
     AT_CUDA_CHECK(cudaGraphInstantiate(&graph_exec_, graph_, NULL, NULL, 0));
                                                                           ^
/home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAException.h:31:42: note: in definition of macro ‘C10_CUDA_CHECK’
     C10_UNUSED const cudaError_t __err = EXPR;                           \
                                          ^~~~
/home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp:168:5: note: in expansion of macro ‘AT_CUDA_CHECK’
     AT_CUDA_CHECK(cudaGraphInstantiate(&graph_exec_, graph_, NULL, NULL, 0));
     ^~~~~~~~~~~~~
In file included from /home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAStream.h:6,
                 from /home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAGraphsC10Utils.h:3,
                 from /home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.h:5,
                 from /home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp:2:
/usr/local/cuda/include/cuda_runtime_api.h:11439:39: note: declared here
 extern __host__ cudaError_t CUDARTAPI cudaGraphInstantiate(cudaGraphExec_t *pGraphExec, cudaGraph_t graph, unsigned long long flags __dv(0));
                                       ^~~~~~~~~~~~~~~~~~~~
ninja: build stopped: subcommand failed.
/home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp: In function ‘void torch::cuda::shared::initCudartBindings(PyObject*)’:
/home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:34:13: error: ‘cudaOutputMode_t’ was not declared in this scope
   py::enum_<cudaOutputMode_t>(
             ^~~~~~~~~~~~~~~~
/home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:34:13: note: suggested alternative: ‘cudaGraphNode_t’
   py::enum_<cudaOutputMode_t>(
             ^~~~~~~~~~~~~~~~
             cudaGraphNode_t
/home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:34:29: error: template argument 1 is invalid
   py::enum_<cudaOutputMode_t>(
                             ^
/home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:38:30: error: ‘cudaKeyValuePair’ was not declared in this scope
       .value("KeyValuePair", cudaKeyValuePair)
                              ^~~~~~~~~~~~~~~~
/home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:39:21: error: ‘cudaCSV’ was not declared in this scope
       .value("CSV", cudaCSV);
                     ^~~~~~~
/home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:39:21: note: suggested alternative: ‘cudart’
       .value("CSV", cudaCSV);
                     ^~~~~~~
                     cudart
/home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:99:7: error: ‘cudaProfilerInitialize’ was not declared in this scope
       cudaProfilerInitialize);
       ^~~~~~~~~~~~~~~~~~~~~~
/home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:99:7: note: suggested alternative: ‘cudaProfilerStart’
       cudaProfilerInitialize);
       ^~~~~~~~~~~~~~~~~~~~~~
       cudaProfilerStart
ninja: build stopped: subcommand failed.

After these fixes, PyTorch builds successfully against CUDA 12 using the standard OSS build instructions:

USE_CUDA=1 python setup.py develop 2>&1 | tee compile.log

cc @ngimel

@pytorch-bot

pytorch-bot bot commented Dec 19, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/91118

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 35f7265:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@jianyuh jianyuh requested a review from ngimel December 19, 2022 21:17
@ngimel
Collaborator

ngimel commented Dec 19, 2022

fyi @ptrblck

Collaborator


does cudaProfilerInitialize not work anymore?

Member Author


Thanks! Just added a fix.

Collaborator


and this would prevent profiler initialization?

Member Author


Thanks! Just added a fix.

Collaborator

@ngimel ngimel left a comment


Stamp to unblock, but we should have a proper task listing things that need to be done for CUDA 12.

@ngimel ngimel changed the title Make PyTorch compatible with CUDA 12 [CUDA 12] Make PyTorch compatible with CUDA 12 Dec 20, 2022
@ngimel ngimel changed the title [CUDA 12] Make PyTorch compatible with CUDA 12 [CUDA12] Make PyTorch compatible with CUDA 12 Dec 20, 2022
@facebook-github-bot
Contributor

@jianyuh has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


@jianyuh
Member Author

jianyuh commented Dec 20, 2022

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 20, 2022
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced debugging: check the merge workflow status here.

@pytorchmergebot
Collaborator

The merge job was canceled. If you believe this is a mistake, you can re-trigger it through pytorch-bot.

@jianyuh
Member Author

jianyuh commented Dec 20, 2022

@pytorchbot merge

@facebook-github-bot
Contributor

@jianyuh has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


@brad-mengchi brad-mengchi self-requested a review December 20, 2022 06:02
@pytorchmergebot
Collaborator

The merge job was canceled. If you believe this is a mistake, you can re-trigger it through pytorch-bot.

@jianyuh
Member Author

jianyuh commented Dec 20, 2022

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


@facebook-github-bot
Contributor

@jianyuh has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@jianyuh jianyuh added the module: cuda (Related to torch.cuda, and CUDA support in general) and release notes: cuda (release notes category) labels Feb 26, 2023