Commit a5b5ea9
use new cuda kernel launch code in nvprof parsing (#35016)
Summary:
This PR fixes #33986.
The meaning of cbid 13 and 211 can be found here:
https://github.com/ezyang/nvprof2json/blob/837c094852c9c5164344db7c19432da37d9a8b09/nvprof2json.py#L238
https://github.com/ezyang/nvprof2json/blob/837c094852c9c5164344db7c19432da37d9a8b09/nvprof2json.py#L436
or it can also be found in the header file at `/usr/local/cuda/extras/CUPTI/include/cupti_runtime_cbid.h`.
See also [this Stack Overflow question](https://stackoverflow.com/questions/48552390/whats-the-difference-between-launching-with-an-api-call-vs-the-triple-chevron-s). I also ran the profiling code from the issue on CUDA 9.2, and the cbid there is already 211. In case someone builds PyTorch against an older CUDA version, I kept both 13 and 211 in the assertion.
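
To illustrate the idea of accepting both callback ids, here is a minimal sketch. It is not the code from this commit: the constant names, the helper functions, and the `row` dict shape are hypothetical, and only the values 13 and 211 come from the message above. The reading of 13 as the legacy launch entry and 211 as the newer one is an assumption based on the linked nvprof2json source.

```python
# Hypothetical names for illustration; only the numeric values 13 and 211
# come from the commit message.
CBID_LEGACY_LAUNCH = 13    # assumed: launch callback id on older CUDA
CBID_LAUNCH_KERNEL = 211   # assumed: launch callback id seen on CUDA 9.2

KERNEL_LAUNCH_CBIDS = frozenset({CBID_LEGACY_LAUNCH, CBID_LAUNCH_KERNEL})


def is_kernel_launch(cbid):
    """Return True if the callback id is one of the accepted launch ids."""
    return cbid in KERNEL_LAUNCH_CBIDS


def check_launch_row(row):
    """Assert that a parsed row (a dict with a 'cbid' key) is a kernel launch."""
    cbid = row["cbid"]
    assert is_kernel_launch(cbid), f"unexpected cbid {cbid}"
    return cbid
```

Keeping the accepted ids in one set means a further id can be added in a single place if a later CUDA version changes the value again.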
cc csarofeen ptrblck ezyang ngimel
Pull Request resolved: #35016
Differential Revision: D20550879
Pulled By: ezyang
fbshipit-source-id: 968efc5e1126f1dd31acc9f5f4463f351d8a4c4f
1 parent e327255 commit a5b5ea9
1 file changed
Lines changed: 2 additions & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 787 | 787 | |
| 788 | 788 | |
| 789 | 789 | |
| 790 | | - |
| | 790 | + |
| | 791 | + |
| 791 | 792 | |
| 792 | 793 | |
| 793 | 794 | |