Skip to content

Use NNPACK for strided convolutions.#27402

Closed
AshkanAliabadi wants to merge 16 commits intogh/AshkanAliabadi/3/basefrom
gh/AshkanAliabadi/3/head
Closed

Use NNPACK for strided convolutions.#27402
AshkanAliabadi wants to merge 16 commits intogh/AshkanAliabadi/3/basefrom
gh/AshkanAliabadi/3/head

Conversation

@AshkanAliabadi
Copy link
Contributor

@AshkanAliabadi AshkanAliabadi commented Oct 4, 2019

Stack from ghstack:

Differential Revision: D18126265

@pytorchbot pytorchbot added module: cpu CPU specific problem (e.g., perf, algorithm) module: internals Related to internal abstractions in c10 and ATen module: nn Related to torch.nn module: operators labels Oct 4, 2019
AshkanAliabadi pushed a commit that referenced this pull request Oct 5, 2019
ghstack-source-id: dc68314
Pull Request resolved: #27402
AshkanAliabadi pushed a commit that referenced this pull request Oct 9, 2019
ghstack-source-id: 8f71a05
Pull Request resolved: #27402
if (params.use_cpu_depthwise3x3_winograd(input, weight)) {
output = convolution_depthwise3x3_winograd_stub(
input.device().type(), input, weight, bias, params.padding, params.stride, params.groups);
input.device().type(), input, weight, bias, params.stride, params.padding, params.groups);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, was it a bug? do we have a unittest for it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a crazy one. Here's my explanation of how I think it was working: #27117 (comment)

AshkanAliabadi pushed a commit that referenced this pull request Oct 24, 2019
ghstack-source-id: 9344289
Pull Request resolved: #27402
Copy link
Contributor Author

@AshkanAliabadi AshkanAliabadi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pending CI, it seems that I finally managed to get this working. This is really not an ideal solution in my opinion and definitely not the direction we should be taking PyTorch mobile in, in the long run. I think the ideal solution would be to have one single mobile-focused backend, just like the reference or vendor-provided backends we currently have, that handles floating point and quantized operations both for NHWC and NCHW layout but efficiently for mobile. That is a considerable undertaking though, and if we want to close the gap in the meanwhile, we need short-term solutions like this unfortunately.

if(AT_NNPACK_ENABLED)
include_directories(${NNPACK_INCLUDE_DIRS})
list(APPEND ATen_CPU_DEPENDENCY_LIBS nnpack) # cpuinfo is added below
list(APPEND ATen_CPU_DEPENDENCY_LIBS nnpack nnpack_reference_layers) # cpuinfo is added below
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NNPACK does not support strided convolutions on batches (used during training,) hence we fallback to the slower reference implementation here.

weight.data_ptr<float>(),
bias_.data_ptr<float>(),
output.data_ptr<float>(),
nnpack_threadpool());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NNPACK does not support strided convolutions on batches (used during training,) hence we fallback to the slower reference implementation here.

@AshkanAliabadi AshkanAliabadi requested a review from ljk53 October 24, 2019 00:35
AshkanAliabadi pushed a commit that referenced this pull request Oct 24, 2019
ghstack-source-id: 7386b44
Pull Request resolved: #27402
AshkanAliabadi pushed a commit that referenced this pull request Oct 28, 2019
ghstack-source-id: 685c631
Pull Request resolved: #27402
AshkanAliabadi pushed a commit that referenced this pull request Oct 29, 2019
ghstack-source-id: c629489
Pull Request resolved: #27402
AshkanAliabadi pushed a commit that referenced this pull request Oct 31, 2019
ghstack-source-id: 0ab1420
Pull Request resolved: #27402
AshkanAliabadi pushed a commit that referenced this pull request Nov 1, 2019
ghstack-source-id: e524d7a
Pull Request resolved: #27402
AshkanAliabadi pushed a commit that referenced this pull request Nov 1, 2019
ghstack-source-id: da21324
Pull Request resolved: #27402
@AshkanAliabadi AshkanAliabadi removed the request for review from ljk53 November 1, 2019 22:42
@facebook-github-bot facebook-github-bot deleted the gh/AshkanAliabadi/3/head branch December 2, 2019 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

module: cpu CPU specific problem (e.g., perf, algorithm) module: internals Related to internal abstractions in c10 and ATen module: nn Related to torch.nn

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants