[cudnn] Support v8 API in fbcode by xw285cornell · Pull Request #96512 · pytorch/pytorch

xw285cornell · 2023-03-10T09:17:06Z

Summary: It turns out we never turned on the cuDNN v8 API, which blocks bf16 conv. Enable the new v8 API.

Test Plan: buck run mode/dev-nosan scripts/xdwang/example:fc_pytorch

Reviewed By: ngimel

Differential Revision: D43784279

pytorch-bot · 2023-03-10T09:17:09Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/96512


❌ 1 Failures

As of commit 3862dc1:

NEW FAILURES - The following jobs have failed:

android-emulator-build-test / build-and-test (default, 1, 1, ubuntu-latest) (gh)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2023-03-10T09:18:08Z

This pull request was exported from Phabricator. Differential Revision: D43784279

facebook-github-bot · 2023-03-10T10:08:13Z

This pull request was exported from Phabricator. Differential Revision: D43784279

xw285cornell · 2023-03-11T01:07:56Z

@ngimel @malfet if you can help take a look :)

malfet

LGTM, though I'm surprised that the size_t cast is needed.

malfet · 2023-03-11T01:09:34Z

aten/src/ATen/native/cudnn/Conv_v8.cpp

Why is this change needed? If it is, then please use static_cast<size_t>(plan.getWorkspaceSize())

Suggested change:

- if ((size_t) plan.getWorkspaceSize() <= max_workspace_size) {
+ if (plan.getWorkspaceSize() <= max_workspace_size) {

ngimel · 2023-03-11

That's probably our favorite signed-unsigned comparison. Can we enable compilation flags to error on it in OSS builds? It's been a nuisance.

ngimel · 2023-03-11

Given that getWorkspaceSize() returns int64_t in the cudnn frontend, a better fix would be to change our variables like max_workspace_size and curr_workspace_size to int64_t instead of size_t.

xw285cornell · 2023-03-14

Sounds good! I'll make the change and see if I can turn this from a warning into an error in OSS :)
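To make the trade-off in this review thread concrete, here is a minimal sketch of the two options discussed: casting the result to size_t versus keeping both sides as int64_t. `FakePlan`, `fits_with_cast`, and `fits_all_signed` are hypothetical names invented for illustration; only the int64_t return type of getWorkspaceSize() comes from the comments above, and this is not the code in Conv_v8.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a cudnn frontend execution plan. The only
// property borrowed from the thread is that getWorkspaceSize() returns int64_t.
struct FakePlan {
  int64_t workspace_size;
  constexpr int64_t getWorkspaceSize() const { return workspace_size; }
};

// Option A (the cast under review): convert the signed result to size_t so
// both operands are unsigned. A negative value wraps to a very large number.
constexpr bool fits_with_cast(const FakePlan& plan, size_t max_workspace_size) {
  return static_cast<size_t>(plan.getWorkspaceSize()) <= max_workspace_size;
}

// Option B (the alternative proposed in review): keep the limit as int64_t
// so the comparison is signed on both sides and needs no cast.
constexpr bool fits_all_signed(const FakePlan& plan, int64_t max_workspace_size) {
  return plan.getWorkspaceSize() <= max_workspace_size;
}

// Both options agree for ordinary, non-negative sizes.
static_assert(fits_with_cast(FakePlan{1024}, 4096), "1024 fits in 4096");
static_assert(fits_all_signed(FakePlan{1024}, 4096), "1024 fits in 4096");
static_assert(!fits_with_cast(FakePlan{8192}, 4096), "8192 does not fit");
static_assert(!fits_all_signed(FakePlan{8192}, 4096), "8192 does not fit");

// They disagree only for a negative size: the cast wraps it around, while the
// signed comparison treats it as smaller than any non-negative limit.
static_assert(!fits_with_cast(FakePlan{-1}, 4096), "-1 wraps to a huge size_t");
static_assert(fits_all_signed(FakePlan{-1}, 4096), "-1 <= 4096 when signed");
```

The checks run at compile time, so the snippet compiles as a standalone translation unit. Whether a negative workspace size can actually occur is outside what this thread states; the sketch only shows where the two comparisons diverge.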

facebook-github-bot · 2023-03-15T08:41:32Z

This pull request was exported from Phabricator. Differential Revision: D43784279

Summary: As discussed in pytorch#96512, turn on sign-compare for OSS build

Test Plan: pytorch CI

Differential Revision: D44085536

fbshipit-source-id: d2093131d84230aed316f783198f9229f2a773dc
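For context on what a sign-compare check flags, here is a small sketch. GCC and Clang provide `-Wsign-compare`, which can be promoted to an error with `-Werror=sign-compare`; whether the referenced change uses exactly these flags is not stated here, so treat that as an assumption. `less_equal_mixed` is a hypothetical helper written for illustration, not a PyTorch function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: compares a signed value against an unsigned one
// without relying on the implicit signed-to-unsigned conversion.
constexpr bool less_equal_mixed(int64_t lhs, size_t rhs) {
  // Writing `return lhs <= rhs;` here is the pattern -Wsign-compare reports:
  // lhs would be converted to unsigned, so a negative lhs becomes huge.
  //
  // Instead: a negative lhs is always <= any unsigned rhs; otherwise the
  // cast to uint64_t preserves the value and both operands are unsigned.
  return lhs < 0 || static_cast<uint64_t>(lhs) <= static_cast<uint64_t>(rhs);
}

// Compile-time checks of the intended mathematical ordering.
static_assert(less_equal_mixed(4, 4), "4 <= 4");
static_assert(!less_equal_mixed(5, 4), "5 > 4");
static_assert(less_equal_mixed(-1, 0), "-1 <= 0 mathematically");
```

Compiling with something like `g++ -std=c++14 -Wall -Wextra -Werror=sign-compare -c example.cpp` (file name assumed) should succeed for this version, while the commented-out mixed comparison would be rejected.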

facebook-github-bot · 2023-03-15T08:51:06Z

This pull request was exported from Phabricator. Differential Revision: D43784279

xw285cornell · 2023-03-15T17:50:40Z

Looks like this PR has a conflict with #96723, so I'll probably wait a bit for things to get in sync.

facebook-github-bot · 2023-03-21T14:55:40Z

This pull request was exported from Phabricator. Differential Revision: D43784279

Summary: Pull Request resolved: pytorch#96512

It turns out we never turned on the cuDNN v8 API, which blocks bf16 conv. Enable the new v8 API.

Test Plan: buck run mode/dev-nosan scripts/xdwang/example:fc_pytorch

Reviewed By: ngimel

Differential Revision: D43784279

fbshipit-source-id: 902a4e162807faae874cc9c4baaa90479cd72006

facebook-github-bot · 2023-03-21T15:04:26Z

This pull request was exported from Phabricator. Differential Revision: D43784279

facebook-github-bot · 2023-03-23T01:39:08Z

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

pytorchmergebot · 2023-03-23T01:40:59Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Summary: It turns out we never turned on the cuDNN v8 API, which blocks bf16 conv. Enable the new v8 API.

Test Plan: buck run mode/dev-nosan scripts/xdwang/example:fc_pytorch

Reviewed By: ngimel

Differential Revision: D43784279

Pull Request resolved: pytorch/pytorch#96512

Approved by: https://github.com/malfet

facebook-github-bot added the fb-exported label Mar 10, 2023

xw285cornell requested review from malfet and ngimel March 10, 2023 09:18

xw285cornell force-pushed the export-D43784279 branch from 1b12f00 to 88103da Compare March 10, 2023 10:08

malfet approved these changes Mar 11, 2023

This was referenced Mar 11, 2023

MPS: Add support for TopK (k>16) on M1 GPU #78915

Closed

MPS cumsum issue - RuntimeError: MPS does not support cumsum op with int64 input. Support has been added in macOS 13.3 #96610

Closed

xw285cornell force-pushed the export-D43784279 branch from 88103da to e42ab56 Compare March 15, 2023 08:41

xw285cornell mentioned this pull request Mar 15, 2023

Turn on sign/unsign check #96808

Closed

xw285cornell force-pushed the export-D43784279 branch from e42ab56 to dcb3435 Compare March 15, 2023 08:51

xw285cornell force-pushed the export-D43784279 branch from dcb3435 to a692d89 Compare March 21, 2023 14:55

xw285cornell force-pushed the export-D43784279 branch from a692d89 to 3862dc1 Compare March 21, 2023 15:04

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Mar 23, 2023

pytorchmergebot added the Merged label Mar 23, 2023

pytorchmergebot closed this in 788300c Mar 23, 2023

5 participants