[ROCm] improve use of ROCm libraries, enable more tests, small fixes by iotamudelta · Pull Request #10406 · pytorch/pytorch

iotamudelta · 2018-08-10T16:50:57Z

* some small leftovers from the last PR review
* enable more unit test sets for CI
* replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND)
* use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2
* use strided_batched gemm interface also from the batched internal interface
* re-enable Dropout.cu as we now have philox w/ rocRAND
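
Routing a pointer-array batched call through a strided-batched interface only works when the matrices of each operand sit at a fixed distance from each other in memory. The host-only sketch below illustrates that precondition without calling any BLAS library: it computes the element stride of a densely packed column-major batch and checks that a pointer array matches `base + i * stride`. The helper names `packed_stride` and `is_uniformly_strided` and the dense-packing assumption are illustrative and are not taken from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: element stride between consecutive matrices in a
// densely packed, column-major batch (each matrix occupies ld * cols elements).
inline int64_t packed_stride(int64_t ld, int64_t cols) {
  return ld * cols;
}

// Hypothetical helper: true if ptrs[i] sits exactly i * stride elements after
// ptrs[0] for every i, i.e. the pointer array could be replaced by a
// (base pointer, stride) pair in a strided-batched call.
template <typename T>
bool is_uniformly_strided(const std::vector<const T*>& ptrs, int64_t stride) {
  if (ptrs.empty()) {
    return true;
  }
  const auto base = reinterpret_cast<std::uintptr_t>(ptrs[0]);
  for (std::size_t i = 1; i < ptrs.size(); ++i) {
    const auto expected =
        base + i * static_cast<std::uintptr_t>(stride) * sizeof(T);
    if (reinterpret_cast<std::uintptr_t>(ptrs[i]) != expected) {
      return false;
    }
  }
  return true;
}
```

If the check fails for any operand, the batch is not evenly spaced and a strided-batched call with a single base pointer would read the wrong memory.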

Add workarounds in pyHIPIFY for __forceinline__ and std:: math functions in HIP

This requires changes to the hipify script, essentially making it a little smarter: it statically casts only arguments 5 and onwards (the actual kernel arguments, ignoring the launch arguments) while still doing the templating for the actual kernel. Co-production w/ Jithun to get the logic right.

…le progress bar

Merge from upstream

No more patches

As per review, change cast to just int.

Removed the change to cudaOccupancyMaxActiveBlocksPerMultiprocessor in SoftMax.cu inside disable_features.yaml.

…hin hipify-python

Merge from upstream

Refactoring & Fixing the pyhipify script.

… work for multi-GPU setup anyway, and gives a seg fault on call to getNumGPUs()

… baseline tests for ROCm CI; add override so user can run all tests if desired

Hardcode getNumGPUs() to 1 for ROCm builds …

Merge from upstream

hcRNG is not supported any longer, rocRAND is the library replacing it. This requires us to disable one function, add some include directories to setup.py and add an include to THCTensorRandom if we are in the ROCm context.

Disabling KMTHINLTO.

Including HCC issue #.

Enable test_torch, test_dataloader, test_indexing and test_utils …

Merge from upstream

aten/src/THC/THCBlas.cu

+    THError("Cublas_SgemmBatched only supports m, n, k, lda, ldb, ldc, batchCount"
+            "with the bound [val] <= %d", INT_MAX);
+  }
+


aten/src/THC/THCBlas.cu

+                             double alpha, const double *a[], int64_t lda, const double *b[], int64_t ldb,
+                             double beta, double *c[], int64_t ldc, int64_t batchCount)
+{
+  if( (m >= INT_MAX) || (n >= INT_MAX) || (k >= INT_MAX) || (lda >= INT_MAX)  || (ldb >= INT_MAX) || (ldc >= INT_MAX) || (batchCount >= INT_MAX) )


ezyang

Lint failed. This must be fixed before we can merge.

facebook-github-bot

ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Remove duplicate definition of skipIfRocm

Scope ifdef down as per review.

facebook-github-bot

ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

aten/src/ATen/CMakeLists.txt

+ FIND_LIBRARY(HIPRAND_LIBRARY hiprand HINTS ${HIPRAND_PATH}/lib)

- list(APPEND ATen_CUDA_DEPENDENCY_LIBS ${HIPBLAS_LIBRARY} ${HIPRNG_LIBRARY})
+ list(APPEND ATen_CUDA_DEPENDENCY_LIBS ${ROCBLAS_LIBRARY} ${HIPRAND_LIBRARY})


Summary:
* some small leftovers from the last PR review
* enable more unit test sets for CI
* replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND)
* use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2
* use strided_batched gemm interface also from the batched internal interface
* re-enable Dropout.cu as we now have philox w/ rocRAND

Pull Request resolved: pytorch/pytorch#10406

Reviewed By: Jorghi12

Differential Revision: D9277093

Pulled By: ezyang

fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2

iotamudelta and others added 30 commits July 13, 2018 10:47

Merge pull request #27 from jithunnair-amd/math_and_inline_workaround

4c14b2c

Add workarounds in pyHIPIFY for __forceinline__ and std:: math functions in HIP

Cleaned argument list & added show-progress field with False value

43cf936

Improved comments for disablefuncmode ENUM and added ability to disab…

5b1e9f5

…le progress bar

Merge pull request #33 from iotamudelta/master

ac5448f

Merge from upstream

Merge branch 'master' into patchesbegone

48d194d

Merge pull request #30 from iotamudelta/patchesbegone

d2528ec

No more patches

De-coupled PyTorch specific parts in the script.

d343ee2

If/def and removing from disable_features

7dae82a

As per review, change cast to just int.

a2bb69a

Merge pull request #34 from iotamudelta/master

dd69afb

As per review, change cast to just int.

Removed unnecessary field from disable_features.yaml

c462928

Update disabled_features.yaml

772f737

Removed the change to cudaOccupancyMaxActiveBlocksPerMultiprocessor in SoftMax.cu inside disable_features.yaml.

Fixing variable name

85d60a6

Handling ArgParse bool type issue.

b5486f4

Merge remote-tracking branch 'upstream/master'

298e77e

Further decoupling of PyTorch specific information into constants wit…

0269858

…hin hipify-python

Merge branch 'master' into refactor_pyhipify

4c0811e

Merge pull request #35 from iotamudelta/master

3c7d028

Merge from upstream

Merge pull request #36 from Jorghi12/refactor_pyhipify

09fd104

Refactoring & Fixing the pyhipify script.

Hardcode getNumGPUs() to 1 for ROCm builds since it currently doesn't…

3f58e08

… work for multi-GPU setup anyway, and gives a seg fault on call to getNumGPUs()

Disable certain unit tests in test_utils for ROCm builds to establish…

b0a84c6

… baseline tests for ROCm CI; add override so user can run all tests if desired

Merge pull request #37 from jithunnair-amd/getNumGPUs_hack_for_ROCm

3fadf87

Hardcode getNumGPUs() to 1 for ROCm builds …

Merge remote-tracking branch 'upstream/master'

b29376c

Merge pull request #39 from iotamudelta/master

33ebb58

Merge from upstream

cub-hip fixes have been merged to master.

a1a7e3f

Replace hcRNG with rocRAND.

c1652be

hcRNG is not supported any longer, rocRAND is the library replacing it. This requires us to disable one function, add some include directories to setup.py and add an include to THCTensorRandom if we are in the ROCm context.

Update build.sh

9f3fb17

Disabling KMTHINLTO.

Handling OOM

d472732

Update build.sh

e949ad7

Including HCC issue #.

iotamudelta and others added 5 commits August 10, 2018 10:13

Merge remote-tracking branch 'rocm_upstream/master'

7ac442c

Merge remote-tracking branch 'upstream/master'

6260298

Merge pull request #108 from jithunnair-amd/enable_unit_tests_for_rocm_2

3636c54

Enable test_torch, test_dataloader, test_indexing and test_utils …

Merge remote-tracking branch 'rocm_upstream/master'

30ebd41

Merge pull request #112 from iotamudelta/master

e8b0772

Merge from upstream

iotamudelta requested review from apaszke, colesbury, ezyang, gchanan, soumith and zdevito as code owners August 10, 2018 16:50

ezyang reviewed Aug 10, 2018

aten/src/THC/THCBlas.cu

THError("Cublas_SgemmBatched only supports m, n, k, lda, ldb, ldc, batchCount"

"with the bound [val] <= %d", INT_MAX);

}

ezyang reviewed Aug 10, 2018

ezyang approved these changes Aug 10, 2018

ezyang previously requested changes Aug 10, 2018

facebook-github-bot reviewed Aug 10, 2018

iotamudelta and others added 4 commits August 10, 2018 17:44

Scope ifdef down as per review.

6316bd3

Remove duplicate definition of skipIfRocm

58d6a22

Merge pull request #117 from jithunnair-amd/flake8_fixes

35bfa51

Remove duplicate definition of skipIfRocm

Merge pull request #116 from iotamudelta/master

73d48cf

Scope ifdef down as per review.

facebook-github-bot reviewed Aug 13, 2018

bddppq reviewed Aug 13, 2018

facebook-github-bot closed this in 75651d5 Aug 13, 2018

ezyang added open source merged labels Jun 24, 2019

jithunnair-amd deleted the libraries_PR branch September 25, 2025 16:32

iotamudelta wants to merge 335 commits into pytorch:master from ROCm:libraries_PR

8 participants
