use new cuda kernel launch code in nvprof parsing #35016

Closed
xwang233 wants to merge 1 commit into pytorch:master from xwang233:cuda-kernel-launch-fix

@xwang233
Collaborator

This PR fixes #33986.

The meanings of cbid 13 and 211 can be found here:

https://github.com/ezyang/nvprof2json/blob/837c094852c9c5164344db7c19432da37d9a8b09/nvprof2json.py#L238

https://github.com/ezyang/nvprof2json/blob/837c094852c9c5164344db7c19432da37d9a8b09/nvprof2json.py#L436

They are also listed in the header file at /usr/local/cuda/extras/CUPTI/include/cupti_runtime_cbid.h.

Please also see this Stack Overflow question: https://stackoverflow.com/questions/48552390/whats-the-difference-between-launching-with-an-api-call-vs-the-triple-chevron-s. I also ran the profiling code from the issue on CUDA 9.2, and the cbid had already changed to 211 there. In case someone builds PyTorch against an older CUDA version, I left both 13 and 211 in the assertion.
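
The idea of accepting both cbid values can be sketched roughly as below. This is a minimal illustration and not the code changed in this PR: the helper name `check_launch_cbids` and the row layout (dicts with a `cbid` key) are assumptions made for the example, and the enum names in the comments are my reading of `cupti_runtime_cbid.h`, so please verify them against your own CUDA install.

```python
# Runtime callback ids (cbid) for kernel launches, as listed in
# cupti_runtime_cbid.h. The enum names in the comments are an assumption
# based on reading that header; check them against your CUDA version.
CBID_CUDA_LAUNCH = 13          # cudaLaunch_v3020, the older launch entry point
CBID_CUDA_LAUNCH_KERNEL = 211  # cudaLaunchKernel_v7000, observed on CUDA 9.2

KERNEL_LAUNCH_CBIDS = frozenset({CBID_CUDA_LAUNCH, CBID_CUDA_LAUNCH_KERNEL})


def check_launch_cbids(rows):
    """Hypothetical helper: assert each row carries a known launch cbid.

    `rows` is any iterable of mappings with a "cbid" key, standing in for
    the rows of a kernel/runtime join over an nvprof trace.
    Returns the number of rows checked.
    """
    checked = 0
    for row in rows:
        cbid = row["cbid"]
        # Accept both ids so traces from older and newer CUDA both parse.
        assert cbid in KERNEL_LAUNCH_CBIDS, f"unexpected cbid {cbid}"
        checked += 1
    return checked
```

Keeping both ids in one set means a trace recorded with either launch path passes the check, while any other runtime call in that position still fails loudly.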

cc @csarofeen @ptrblck @ezyang @ngimel

@ngimel
Collaborator

ngimel commented Mar 19, 2020

That's cool. I guess we can't easily run tests that require nvprof in our CI, but can you guys add a test in yours? Also, what's the plan for when nvprof is no longer supported, and there's only nsight systems?

@dr-ci

dr-ci Bot commented Mar 19, 2020

💊 CircleCI build failures summary and remediations

As of commit 516687b (more details on the Dr. CI page):


None of the build failures appear to be your fault 💚


  • 4/4 broken upstream at merge base e5ee95e since Mar 19

    Please rebase on the viable/strict branch (expand for instructions)

    If your commit is newer than viable/strict, you can try basing on an older, stable commit:

    git fetch https://github.com/pytorch/pytorch viable/strict
    git rebase --onto FETCH_HEAD $(git merge-base origin/master HEAD)
    

    If your commit is older than viable/strict:

    git fetch https://github.com/pytorch/pytorch viable/strict
    git rebase FETCH_HEAD
    

    Check out the recency history of this "viable master" tracking branch.


🚧 4 upstream failures:

These were probably caused by upstream breakages:


This comment was automatically generated by Dr. CI (expand for details). Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker.

This comment has been revised 4 times.

@ezyang
Contributor

ezyang commented Mar 20, 2020

This seems pretty harmless to accept.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ptrblck
Collaborator

ptrblck commented Mar 20, 2020

@ngimel Xiao has written the test, and we'll add it to our CI.
The nvprof/nsys question needs some offline discussion.

@facebook-github-bot
Contributor

@ezyang merged this pull request in a5b5ea9.

laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
Summary:
This PR fixes pytorch#33986.

The meanings of cbid 13 and 211 can be found here:

https://github.com/ezyang/nvprof2json/blob/837c094852c9c5164344db7c19432da37d9a8b09/nvprof2json.py#L238

https://github.com/ezyang/nvprof2json/blob/837c094852c9c5164344db7c19432da37d9a8b09/nvprof2json.py#L436

They are also listed in the header file at `/usr/local/cuda/extras/CUPTI/include/cupti_runtime_cbid.h`.

Please also see [this Stack Overflow question](https://stackoverflow.com/questions/48552390/whats-the-difference-between-launching-with-an-api-call-vs-the-triple-chevron-s). I also ran the profiling code from the issue on CUDA 9.2, and the cbid had already changed to 211 there. In case someone builds PyTorch against an older CUDA version, I left both 13 and 211 in the assertion.

cc csarofeen ptrblck ezyang ngimel
Pull Request resolved: pytorch#35016

Differential Revision: D20550879

Pulled By: ezyang

fbshipit-source-id: 968efc5e1126f1dd31acc9f5f4463f351d8a4c4f

Development

Successfully merging this pull request may close these issues.

Unable to load nvprof trace
