
Refactor cudnn convolution#49109

Closed
zasdfgbnm wants to merge 4 commits into master from cudnn7-params-refactor

Conversation

@zasdfgbnm
Collaborator

The cuDNN v7 API has been deprecated, so we need to migrate to the cuDNN v8 API. The v8 API does not exist on cuDNN 7, so there will be a long period during which both APIs have to coexist.

This is step 0 of adding the cuDNN v8 API. There is no real code change in this PR; it just copy-pastes existing code. The original `Conv.cpp` is split into `ConvPlaceholders.cpp`, `ConvShared.cpp`, `ConvShared.h`, `Conv_v7.cpp`, and `Conv_v8.cpp`. Currently `Conv_v8.cpp` is empty, and it will be filled in the future.

`ConvPlaceholders.cpp` contains placeholder implementations of cuDNN convolution for builds where cuDNN is not enabled. These operators only raise errors and do no real computation. This file also contains deprecated operators, which are implemented in terms of current operators.

`ConvShared.cpp` and `ConvShared.h` contain code shared by the v7 and v8 APIs, including the definitions of the structs `ConvolutionParams` and `ConvolutionArgs`, as well as ATen-exposed APIs like `cudnn_convolution` and the intermediate `cudnn_convolution_forward`. These exposed functions call raw APIs like `raw_cudnn_convolution_forward_out` in `Conv_v7.cpp` or `Conv_v8.cpp` for the real implementation.

`Conv_v7.cpp` and `Conv_v8.cpp` contain the implementations of the raw APIs, which differ between v7 and v8.
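
The resulting layering can be sketched as follows. This is a standalone simplification with stand-in types and hypothetical, heavily trimmed signatures (the real ATen functions take padding, stride, dilation, groups, and more), just to show how the shared entry point delegates to a version-specific raw implementation:

```cpp
#include <stdexcept>

// Stand-in for at::Tensor, only to make the sketch self-contained.
struct Tensor { int dim = 4; };

// ConvShared.h: declaration of the raw API. Exactly one of Conv_v7.cpp
// or Conv_v8.cpp provides the definition that gets linked in.
void raw_cudnn_convolution_forward_out(
    Tensor& output, const Tensor& input, const Tensor& weight);

// ConvShared.cpp: the ATen-exposed entry point. It performs the shared
// work (argument checking, output allocation) and delegates the actual
// cuDNN calls to the raw API.
Tensor cudnn_convolution_forward(const Tensor& input, const Tensor& weight) {
  if (input.dim != weight.dim) {
    throw std::runtime_error("expected input and weight to have the same dim");
  }
  Tensor output;
  raw_cudnn_convolution_forward_out(output, input, weight);
  return output;
}

// Conv_v7.cpp (or Conv_v8.cpp): the version-specific implementation.
void raw_cudnn_convolution_forward_out(
    Tensor& output, const Tensor& input, const Tensor& weight) {
  // ... set up cuDNN descriptors and launch the convolution here ...
  output.dim = input.dim;
}
```

With this split, swapping the backend means compiling a different `Conv_v*.cpp`; nothing above the raw API needs to change.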

@dr-ci

dr-ci Bot commented Dec 9, 2020

💊 CI failures summary and remediations

As of commit e4be821 (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)

Extra GitHub checks: 1 failed



@codecov

codecov Bot commented Dec 10, 2020

Codecov Report

Merging #49109 (e4be821) into master (3f9ff48) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #49109      +/-   ##
==========================================
- Coverage   80.73%   80.73%   -0.01%     
==========================================
  Files        1871     1871              
  Lines      201759   201759              
==========================================
- Hits       162888   162885       -3     
- Misses      38871    38874       +3     

Contributor

@facebook-github-bot facebook-github-bot left a comment


@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

params->dataType = dataType;
// ASSERT(weight.dim() == input.dim())
for (int i = 0; i != input.dim(); ++i) {
  params->input_size[i] = (int) input.size(i);
Collaborator


Can you please do unchecked accesses here, i.e. `input.sizes()[i]`?
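
For context, `input.size(i)` goes through a checked accessor, while `input.sizes()[i]` indexes the size array directly; since the loop bound here is already `input.dim()`, the check is redundant. The trade-off is the same as `std::vector::at` versus `operator[]` — the following is a standalone analogy, not the ATen code:

```cpp
#include <vector>
#include <stdexcept>
#include <cstddef>

// Checked access: validates the index on every call, throws on error.
long checked_get(const std::vector<long>& sizes, std::size_t i) {
  return sizes.at(i);  // bounds check, analogous to Tensor::size(i)
}

// Unchecked access: no branch; valid only when the caller guarantees
// 0 <= i < sizes.size(), as a loop bounded by sizes.size() does.
long unchecked_get(const std::vector<long>& sizes, std::size_t i) {
  return sizes[i];  // no bounds check, analogous to Tensor::sizes()[i]
}
```

In a hot path that is hashed for every convolution call, skipping the redundant check is a cheap win.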

Collaborator Author


will fix in later PR

Contributor


IIRC, these were all from the old code, right?

Collaborator


Yes, they are, but that's not a reason not to fix them.


// Input
checkDimRange(c, input, 3, 6 /* exclusive */);
checkSize(c, input, input_channels_dim, weight->size(1) * groups);
Collaborator


Unchecked access for `size(1)` as well.


auto layout = cudnn_conv_use_channels_last(*input, *weight) ?
    at::MemoryFormat::ChannelsLast : at::MemoryFormat::Contiguous;
auto output_t = at::empty(
Collaborator


You can do `at::native::empty_cuda` here to avoid dispatch.


Tensor grad_input, grad_weight;
if (output_mask[0]) {
grad_input = at::cudnn_convolution_transpose_backward_input(grad_output, weight, padding, stride, dilation, groups, benchmark, deterministic, allow_tf32);
Collaborator


Maybe you could call `cudnn_convolution_forward` directly here, and get rid of `cudnn_convolution` and `cudnn_convolution_transpose_backward_input`? Do they provide any value?

Collaborator Author


I think the value is a better error message, i.e. correctly calling it `grad_output` instead of `input`.

@zasdfgbnm zasdfgbnm deleted the cudnn7-params-refactor branch December 10, 2020 19:09
@facebook-github-bot
Contributor

@ezyang merged this pull request in 45473ff.

facebook-github-bot pushed a commit that referenced this pull request Jan 25, 2021
Summary:
- Resolves ngimel's review comments in #49109
- Move `ConvolutionArgs` from `ConvShared.h` to `Conv_v7.cpp`, because cuDNN v8 uses different descriptors and therefore will not share the same `ConvolutionArgs`.
- Refactor the `ConvolutionParams` (the hash key for benchmark):
  - Remove `input_stride`
  - Add `input_dim`
  - Add `memory_format`
- Make `repro_from_args` take `ConvolutionParams` instead of `ConvolutionArgs` as its argument so that it can be shared between v7 and v8
- Rename some `layout` to `memory_format`. `layout` should be sparse/strided and `memory_format` should be contiguous/channels_last. They are different things.
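
The benchmark cache keys on the raw bytes of `ConvolutionParams`, so every field that affects algorithm choice must live in the struct, and padding bytes must be zeroed before use or equal keys can compare unequal. A standalone sketch of that pattern, with hypothetical fields rather than the actual struct layout:

```cpp
#include <cstring>

// Plain-old-data cache key, compared bytewise. Must be memset to zero
// before filling so that compiler-inserted padding compares equal.
struct ConvKey {
  int dataType;
  int input_dim;       // number of valid entries in input_size
  int memory_format;   // e.g. 0 = contiguous, 1 = channels-last
  int input_size[5];
};

bool keys_equal(const ConvKey& a, const ConvKey& b) {
  return std::memcmp(&a, &b, sizeof(ConvKey)) == 0;
}

ConvKey make_key(int dtype, int dim, int fmt, const int* sizes) {
  ConvKey k;
  std::memset(&k, 0, sizeof(k));  // critical: clear padding and unused slots
  k.dataType = dtype;
  k.input_dim = dim;
  k.memory_format = fmt;
  for (int i = 0; i < dim; ++i) k.input_size[i] = sizes[i];
  return k;
}
```

Replacing per-dimension strides with a single `memory_format` field keeps the key small while still distinguishing contiguous from channels-last inputs, which is what the algorithm choice actually depends on.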

Pull Request resolved: #50827

Reviewed By: bdhirsh

Differential Revision: D26048274

Pulled By: ezyang

fbshipit-source-id: f71aa02d90ffa581c17ab05b171759904b311517