
Get rid of some template arguments in GPU loop#33308

Closed
zasdfgbnm wants to merge 4 commits into pytorch:master from zasdfgbnm:reduce_template_args

Conversation

@zasdfgbnm
Collaborator

@zasdfgbnm zasdfgbnm commented Feb 13, 2020

Globally define

```C++
constexpr int num_threads = C10_WARP_SIZE * 2;
constexpr int thread_work_size = 4;
constexpr int block_work_size = thread_work_size * num_threads;
```

and kill all the template arguments passing these values.

These values are effectively global, but they are currently passed around as template arguments, which makes the code unnecessarily cumbersome.

@dr-ci

dr-ci bot commented Feb 13, 2020

💊 CircleCI build failures summary and remediations

As of commit 48ba22b:

  • 1/2 failures introduced in this PR
  • 1/2 recognized as flaky ❄️

Detailed failure analysis

One may explore the probable reasons each build failed interactively on the Dr. CI website.

🕵️ 1 new failure recognized by patterns

The following build failures do not appear to be due to upstream breakage:

See CircleCI build pytorch_xla_linux_xenial_py3_6_clang7_build (1/1)

Step: "Build" (full log | pattern match details)

Feb 13 20:28:57 -- Build files have been written to: /var/lib/jenkins/workspace/xla/test/cpp/build 
Feb 13 20:28:57 + make -j 
Feb 13 20:28:57 Scanning dependencies of target googletest 
Feb 13 20:28:57 [  4%] Creating directories for 'googletest' 
Feb 13 20:28:57 [  9%] Performing download step (git clone) for 'googletest' 
Feb 13 20:28:59 -- googletest download command succeeded.  See also /var/lib/jenkins/workspace/xla/test/cpp/build/gtest/src/googletest-stamp/googletest-download-*.log 
Feb 13 20:28:59 [ 19%] No patch step for 'googletest' 
Feb 13 20:28:59 [ 19%] Performing update step for 'googletest' 
Feb 13 20:28:59 Current branch master is up to date. 
Feb 13 20:28:59 [ 23%] Performing configure step for 'googletest' 
Feb 13 20:29:01 CMake Error at /var/lib/jenkins/workspace/xla/test/cpp/build/gtest/src/googletest-stamp/googletest-configure-Release.cmake:16 (message): 
Feb 13 20:29:01   Command failed: 1 
Feb 13 20:29:01  
Feb 13 20:29:01    '/usr/bin/cmake' '-GUnix Makefiles' '/var/lib/jenkins/workspace/xla/test/cpp/build/gtest/src/googletest-src' 
Feb 13 20:29:01  
Feb 13 20:29:01   See also 
Feb 13 20:29:01  
Feb 13 20:29:01     /var/lib/jenkins/workspace/xla/test/cpp/build/gtest/src/googletest-stamp/googletest-configure-*.log 
Feb 13 20:29:01  
Feb 13 20:29:01  
Feb 13 20:29:01 CMakeFiles/googletest.dir/build.make:105: recipe for target 'gtest/src/googletest-stamp/googletest-configure' failed 

❄️ 1 failure recognized as flaky

The following build failures have been detected as flaky and may not be your fault:

See CircleCI build pytorch_windows_test2 (1/1)

Step: "Test" (full log | pattern match details) ❄️

====================================================================== 
FAIL: test_cuda_extension (__main__.TestCppExtensionAOT) 
---------------------------------------------------------------------- 
Traceback (most recent call last): 
  File "test_cpp_extensions_aot.py", line 67, in test_cuda_extension 
    self.assertEqual(z, torch.ones_like(z)) 
  File "C:\Users\circleci\project\build\win_tmp\build\torch\testing\_internal\common_utils.py", line 862, in assertEqual 
    assertTensorsEqual(x, y) 
  File "C:\Users\circleci\project\build\win_tmp\build\torch\testing\_internal\common_utils.py", line 832, in assertTensorsEqual 
    self.assertLessEqual(max_err, prec, message) 
AssertionError: tensor(1.) not less than or equal to 1e-05 :  
 
---------------------------------------------------------------------- 
Ran 10 tests in 1.339s 
 
FAILED (failures=1, skipped=1) 
Traceback (most recent call last): 
  File "run_test.py", line 486, in <module> 
    main() 
  File "run_test.py", line 479, in main 
    raise RuntimeError(message) 

This comment was automatically generated by Dr. CI.

@zasdfgbnm zasdfgbnm changed the title from "[WIP] Get rid of some template arguments in GPU loop" to "Get rid of some template arguments in GPU loop" Feb 13, 2020
@zasdfgbnm zasdfgbnm requested a review from ngimel February 13, 2020 21:09
@zasdfgbnm
Collaborator Author

Test failures are unrelated.

```diff
-namespace at { namespace native { namespace modern { namespace detail {
+namespace at { namespace native {

 constexpr int num_threads = C10_WARP_SIZE * 2;
```

I assume those values are unchanged from what they used to be for ROCm? And at present we don't need to specialize them on a per-kernel basis? If so, then making them constexpr rather than passing them as template arguments makes sense.

@ngimel
Collaborator

ngimel commented Feb 13, 2020

cc @iotamudelta. This PR does not change any existing behavior, just makes some template arguments that were never changed into constexprs.

Contributor

@facebook-github-bot facebook-github-bot left a comment


@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


@facebook-github-bot
Contributor

@ngimel merged this pull request in cd038c0.

ttumiel pushed a commit to ttumiel/pytorch that referenced this pull request Mar 4, 2020
Summary:
Globally define
```C++
constexpr int num_threads = C10_WARP_SIZE * 2;
constexpr int thread_work_size = 4;
constexpr int block_work_size = thread_work_size * num_threads;
```
and kill all the template arguments passing these values.

These values are effectively global, but they are currently passed around as template arguments, which makes the code unnecessarily cumbersome.
Pull Request resolved: pytorch#33308

Differential Revision: D19907250

Pulled By: ngimel

fbshipit-source-id: 4623b69baea7e6e77f460ffdfa07cf9f8cba588a
