[Caffe2] MIOpen bug fixes and performance enhancements by ashishfarmer · Pull Request #11766 · pytorch/pytorch

ashishfarmer · 2018-09-17T18:47:57Z

This PR contains changes for:

Performance enhancements for group conv using MIOpen
Performance enhancements by removing unnecessary computations while running pooling through MIOpen
Added check for bwdData computation while running MIOpen convGradient operator
Fix in MIOpen poolingGradient operator to compute window size for global pooling case
Minor code cleanup in MIOpen spatial batch norm operator

cc: @bddppq @petrex

petrex · 2018-09-18T17:21:00Z

@ashishfarmer Thanks! Do you have data regarding the performance increase?

petrex · 2018-09-18T17:23:40Z

@bddppq Is there a way we can run op tests on this particular PR? thx

bddppq · 2018-09-20T17:28:49Z

@petrex @ashishfarmer yes you can trigger the tests here https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang3.8-rocm1.7.1-ubuntu16.04-trigger-test/build?delay=0sec with "GIT_COMMIT" as "origin/pr/YOUR_PR_NUMBER/head" (in this case it's "origin/pr/11766/head").

Triggered here: https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang3.8-rocm1.7.1-ubuntu16.04-trigger-test/13272/

bddppq

LG. Could you add a test case that could trigger the issue that you fixed?

caffe2/operators/hip/pool_op_miopen.cc

        beta_(OperatorBase::GetSingleArgument<float>("beta", 0.0)),
        do_backward_(
-            OperatorBase::GetSingleArgument<bool>("do_backward", true)),
+            OperatorBase::GetSingleArgument<bool>("do_backward", false)),


bddppq · 2018-09-20T17:53:53Z

Also

pytorch/caffe2/operators/hip/spatial_batch_norm_op_miopen.cc

Lines 143 to 144 in 5392b12

    
vector<int> dims = {N, C, H, W, D};
vector<int> strides = {C * H * W * D, H * W * D, W * D, D, 1};

and

pytorch/caffe2/operators/hip/spatial_batch_norm_op_miopen.cc

Lines 278 to 279 in 5392b12

    
vector<int> dims = {N, C, H, W, D};
vector<int> strides = {C * H * W * D, H * W * D, W * D, D, 1};

look unnecessary to me.
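
For reference, the strides in the quoted lines are the packed row-major strides implied by dims, so they carry no information beyond the dims themselves. A minimal sketch of that relationship follows; `ContiguousStrides` is a hypothetical helper written for illustration and is not part of Caffe2 or MIOpen.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Caffe2 API): packed row-major strides.
// The last dimension has stride 1; each earlier stride is the product
// of all later dimension sizes.
std::vector<int> ContiguousStrides(const std::vector<int>& dims) {
  assert(!dims.empty());
  std::vector<int> strides(dims.size(), 1);
  for (std::size_t i = dims.size() - 1; i > 0; --i) {
    strides[i - 1] = strides[i] * dims[i];
  }
  return strides;
}
```

For dims {N, C, H, W, D} this yields {C * H * W * D, H * W * D, W * D, D, 1}, matching the quoted strides vector.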

bddppq · 2018-09-20T17:55:02Z

And here

pytorch/caffe2/operators/hip/spatial_batch_norm_op_miopen.cc

Line 280 in 5392b12

MIOPEN_ENFORCE(miopenSet4dTensorDescriptor(

needs to update miopen_input_dims_

ashishfarmer · 2018-09-20T18:41:41Z

The case that is fixed in this PR (global pool gradient) is already covered by test_global_pooling in caffe2/python/operator_test/pooling_test.py

bddppq · 2018-09-20T18:50:59Z

@ashishfarmer lol then why was it not failing before?

ashishfarmer · 2018-09-20T19:02:49Z

It was an intermittent failure before, and we traced its root cause to this uninitialized variable issue. It is triggered only in a very specific case (MIOpen engine + AveragePool + NCHW layout), so Hypothesis would for the most part not encounter it.
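
The actual change is not shown in this thread, so the following is only a sketch of the idea behind the global pooling fix: when global pooling is requested, the window is derived from the input's spatial dimensions rather than read from a kernel size that may never have been set. `PoolArgs` and `PoolWindow` are hypothetical names used for illustration, not Caffe2 types.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the pooling arguments; not the Caffe2 type.
struct PoolArgs {
  bool global_pooling = false;
  std::vector<int> kernel;  // e.g. {kernel_h, kernel_w}; may be unset for global pooling
};

// Returns the pooling window for an NCHW-style shape {N, C, spatial...}.
// With global pooling the window covers each full spatial dimension,
// so it never depends on a kernel value that was left uninitialized.
std::vector<int> PoolWindow(const PoolArgs& args,
                            const std::vector<int>& input_dims) {
  assert(input_dims.size() >= 3);
  if (!args.global_pooling) {
    assert(args.kernel.size() == input_dims.size() - 2);
    return args.kernel;
  }
  return std::vector<int>(input_dims.begin() + 2, input_dims.end());
}
```

With an input of shape {2, 3, 7, 5} and global pooling enabled, the sketch returns a 7 x 5 window regardless of what the kernel field holds.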

ashishfarmer · 2018-09-20T20:07:10Z

Also
pytorch/caffe2/operators/hip/spatial_batch_norm_op_miopen.cc

Lines 143 to 144 in 5392b12

vector<int> dims = {N, C, H, W, D};
vector<int> strides = {C * H * W * D, H * W * D, W * D, D, 1};

and
pytorch/caffe2/operators/hip/spatial_batch_norm_op_miopen.cc

Lines 278 to 279 in 5392b12

vector<int> dims = {N, C, H, W, D};
vector<int> strides = {C * H * W * D, H * W * D, W * D, D, 1};
look unnecessary to me.

Removed these vectors

ashishfarmer · 2018-09-20T20:07:32Z

And here

pytorch/caffe2/operators/hip/spatial_batch_norm_op_miopen.cc

Line 280 in 5392b12

MIOPEN_ENFORCE(miopenSet4dTensorDescriptor(
needs to update miopen_input_dims_

Good catch! Thank you! Fixed this
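
The review point is about a reshape check: a cached copy of the input dims decides whether the tensor descriptor has to be set again, so the cache must be refreshed whenever the descriptor is. The sketch below shows that general pattern under stated assumptions. `FakeDescriptor` and `CachedDescriptor` are hypothetical stand-ins, and nothing here calls the real MIOpen API or reproduces the operator's code.

```cpp
#include <vector>

// Hypothetical stand-in for a MIOpen tensor descriptor; it only records
// the dims it was given and how many times it was configured.
struct FakeDescriptor {
  std::vector<int> dims;
  int set_calls = 0;
  void Set(const std::vector<int>& d) {
    dims = d;
    ++set_calls;
  }
};

class CachedDescriptor {
 public:
  // Reconfigures the descriptor only when the input shape has changed.
  void Update(const std::vector<int>& input_dims) {
    if (input_dims != cached_dims_) {
      desc_.Set(input_dims);
      // Keep the cache in sync. In this sketch, omitting this line would
      // make every call reconfigure the descriptor.
      cached_dims_ = input_dims;
    }
  }
  int set_calls() const { return desc_.set_calls; }

 private:
  FakeDescriptor desc_;
  std::vector<int> cached_dims_;
};
```

Calling `Update` twice with the same shape configures the descriptor once; a call with a different shape configures it again.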

ashishfarmer · 2018-09-20T20:08:11Z

@bddppq - Added the fixes as per your review

facebook-github-bot

bddppq has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Summary:
This PR contains changes for:
1. Performance enhancements for group conv using MIOpen
2. Performance enhancements by removing unnecessary computations while running pooling through MIOpen
3. Added check for bwdData computation while running MIOpen convGradient operator
4. Fix in MIOpen poolingGradient operator to compute window size for global pooling case
5. Minor code cleanup in MIOpen spatial batch norm operator

Differential Revision: D9979050
Pulled By: bddppq
fbshipit-source-id: fabc7a44a2f9ca0307d99564d1ce8fe1de9a6fbb

bddppq · 2018-09-21T00:55:52Z

Merged to master in c7751f4

Ashish added 8 commits September 12, 2018 10:42

perf enhancements for MIOpen pool

b69f374

global pool fix for miopen bwd pooling

1ebe626

Added check for bwd data compute for MIOpen conv gradient op

aef1cdc

group conv fix for MIOpen conv op

ebcc6c1

use switch for global pooling fix

1e6a836

Merge branch 'master' of https://github.com/pytorch/pytorch into af/miopen_fixes

c3b29f9

minor cleanup

4c2de93

Merge branch 'master' of https://github.com/pytorch/pytorch into af/miopen_fixes

5392b12

pytorchbot added the caffe2 label Sep 17, 2018

ashishfarmer mentioned this pull request Sep 18, 2018

[Caffe2] Update on MIOPEN pooling ops ROCm/pytorch#175

Closed

bddppq self-requested a review September 20, 2018 17:42

bddppq reviewed Sep 20, 2018

Ashish added 3 commits September 20, 2018 12:09

Removed unused do_backward flag

e4da715

Update miopen_input_dims for reshape check

814d24c

Removed unused strides and dims lists

01d7e5d

facebook-github-bot reviewed Sep 20, 2018

bddppq approved these changes Sep 20, 2018

bddppq closed this Sep 21, 2018

petrex mentioned this pull request Oct 25, 2018

[Caffe2] Update MIOPEN ops (for release 1.5) ROCm/pytorch#184

Closed

ezyang added open source labels Jun 24, 2019

twumasipennoh mentioned this pull request Feb 8, 2024

dynamo + autograd.Function: dynamo doesn't model multiple ctx.save_for_backward calls. #117652

Closed

6 participants