Fix issues in reductions and thread predicates by naoyam · Pull Request #470 · csarofeen/pytorch

naoyam · 2020-10-30T08:07:00Z

Thread predicates are ignored when calling blockReduce and gridReduce. See #468 for a reproducer of the blockReduce case and #367 for a reproducer of the gridReduce case. This PR also adds both reproducers as new tests.

For gridReduce, the best way to apply thread predicates is, IMO, to set the template parameters for the predicated TIDx/y/z dimensions to false. An assumption I have is that the thread predicate of a GridReduction must not include BIDx/y/z since we don't allow multiple calls to gridReduce in a single kernel.

To pass around the predicate info until the CUDA code for calling gridReduce is generated, a new field of type ParallelTypeBitmap is added to kir::GridReduction. To use ParallelTypeBitmap from kernel_ir.h, I also extracted the class from lower_utils.h into its own header file.
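To make the flag rule concrete, here is a minimal C++ sketch of how the six gridReduce template flags might be derived from a bitmap of reduced dimensions and a bitmap of predicated dimensions. The `Bitmap` alias, the `ParallelType` enum, and `gridReduceTemplateFlags` are simplified stand-ins, not the classes touched by this PR. The rule used for the block flags, and for thread dimensions that are themselves reduced, is an assumption.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the parallel types discussed in the PR.
enum class ParallelType { BIDx, BIDy, BIDz, TIDx, TIDy, TIDz };

// Hypothetical, simplified stand-in for ParallelTypeBitmap: one bit per type.
using Bitmap = std::bitset<6>;

inline std::size_t bit(ParallelType pt) {
  return static_cast<std::size_t>(pt);
}

inline bool isBlockDim(ParallelType pt) {
  return bit(pt) < 3;
}

// Builds the comma-separated template flags for a gridReduce call.
// Assumed rule: a block flag is true when that block dimension is reduced;
// a thread flag is true only when the dimension is neither reduced nor
// predicated, so a predicated TIDx/y/z dimension always yields "false".
std::string gridReduceTemplateFlags(
    const Bitmap& reduced,
    const Bitmap& thread_pred) {
  const std::array<ParallelType, 6> ptypes = {
      ParallelType::BIDx, ParallelType::BIDy, ParallelType::BIDz,
      ParallelType::TIDx, ParallelType::TIDy, ParallelType::TIDz};
  std::ostringstream flags;
  for (ParallelType pt : ptypes) {
    const bool is_reduced = reduced[bit(pt)];
    const bool is_pred = thread_pred[bit(pt)];
    if (is_reduced && is_pred) {
      throw std::logic_error("cannot reduce a predicated axis");
    }
    bool flag = false;
    if (isBlockDim(pt)) {
      // The PR assumes the thread predicate never includes BIDx/y/z.
      if (is_pred) {
        throw std::logic_error("predicate on a block dimension");
      }
      flag = is_reduced;
    } else {
      flag = !is_pred && !is_reduced;
    }
    if (pt != ptypes.front()) {
      flags << ", ";
    }
    flags << (flag ? "true" : "false");
  }
  return flags.str();
}
```

With a reduction over BIDx and a predicate on TIDx, this sketch emits `true, false, false, false, true, true`; without the predicate the TIDx flag would be `true`.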

For blockReduce, the change is much simpler (see IndexLowering::visit(kir::ReductionOp*)).

Fixes #367
Fixes #468

When TIDx/y/z are predicated, set the TIDx/y/z template flags as false Closes #367

csarofeen

This looks good to me. Do we assert somewhere if someone attempts to use 2 grid reductions in the same kernel? We may actually want to support this at some point; I just want to make sure that until then we throw a hard error.

Can you also please update the issues with the code that is now generated?
Thanks!

naoyam · 2020-11-02T18:07:38Z

Filed an issue for detecting multiple grid reductions (#475).
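As an illustration of the kind of hard error requested above, the following C++ sketch counts grid reductions in a flat list of expression kinds and throws when more than one is found. The `ExprKind` enum and `assertAtMostOneGridReduction` are hypothetical names for this example only. They do not come from the codebase, and this is not the check tracked in #475.

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

// Hypothetical expression kinds; the real kernel IR is much richer.
enum class ExprKind { Unary, Binary, BlockReduction, GridReduction };

// Throws if the kernel contains more than one grid reduction.
void assertAtMostOneGridReduction(const std::vector<ExprKind>& kernel_exprs) {
  const auto count = std::count(
      kernel_exprs.begin(), kernel_exprs.end(), ExprKind::GridReduction);
  if (count > 1) {
    throw std::runtime_error(
        "Multiple grid reductions in a single kernel are not supported");
  }
}
```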

naoyam · 2020-11-02T18:12:38Z

Updated issue #367 with the generated kernel.

naoyam · 2020-11-02T18:16:44Z

Updated issue #468 with the generated kernel.

tlemo · 2020-11-02T20:55:52Z

    }
  }

+  std::string generateGridReduceTemplateFlags(


Please place this local helper in an anonymous namespace.

It's already in a class defined in an anonymous namespace.

Ah, right. That's one reason I don't like anonymous namespaces: you have to scroll around a lot to find them :(

Yeah, that's a downside.

tlemo · 2020-11-02T22:19:55Z

  ReductionOp* reduction_op_ = nullptr;
  Allocate* reduction_buffer_ = nullptr;
  Allocate* sync_buffer_ = nullptr;
+  ParallelTypeBitmap thread_pred_;


Why can't we use the "normal" predicate (Expr::predicate_) instead of this?

We could. We would need to make some changes to gridReduce. In particular, indexing the work buffer written by each thread block needs some non-significant change.

However, gridReduce does have template bool parameters exactly for predicating based on block and thread indices being zero, so using those template flags should make more sense. The normal predicate is not for thread/block indices, so it can't be mapped to the template flags, and that's why we need to separate them for gridReduce. Note that for other expressions, we just combine them by &&.

Added a comment to the code itself too.
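For contrast with gridReduce, here is a small C++ sketch of the "combine them by &&" case for ordinary expressions: the normal predicate and the thread predicate are joined into a single condition string. The helpers `conjoin` and `combinedPredicate` are hypothetical. Rendering each predicated thread dimension as `threadIdx.? == 0` is an assumption based on the description of the template flags as checks on indices being zero.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: joins the non-empty conditions with "&&".
std::string conjoin(const std::vector<std::string>& conds) {
  std::string out;
  for (const auto& c : conds) {
    if (c.empty()) {
      continue;
    }
    if (!out.empty()) {
      out += " && ";
    }
    out += c;
  }
  return out.empty() ? "true" : out;
}

// Hypothetical: builds the guard for an ordinary (non-gridReduce) expression
// from its normal predicate and the predicated thread dimensions.
std::string combinedPredicate(
    const std::string& normal_pred,
    bool pred_tidx,
    bool pred_tidy,
    bool pred_tidz) {
  std::vector<std::string> conds{normal_pred};
  if (pred_tidx) {
    conds.push_back("threadIdx.x == 0");
  }
  if (pred_tidy) {
    conds.push_back("threadIdx.y == 0");
  }
  if (pred_tidz) {
    conds.push_back("threadIdx.z == 0");
  }
  return conjoin(conds);
}
```

For gridReduce, only the normal predicate would go through a path like this; the thread part is carried separately so it can be mapped onto the template flags.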

Naoya Maruyama added 4 commits October 29, 2020 17:12

Add test cases

a113a3e

Fix #468

ad07f75

Fix thread predicate for GridReduction

6f9abe8

When TIDx/y/z are predicated, set the TIDx/y/z template flags as false Closes #367

cleanup

ef7031e

naoyam requested review from csarofeen and tlemo October 30, 2020 08:07

Naoya Maruyama added 3 commits October 30, 2020 01:30

clang-tidy

df9e9ff

Delete accidentally added file

1c5aede

Remove unnecessary include

b9add42

csarofeen approved these changes Nov 2, 2020

naoyam mentioned this pull request Nov 2, 2020

Issue with block reduction then grid reduction #367

Closed

naoyam mentioned this pull request Nov 2, 2020

ReductionOp ignores thread predicates #468

Closed

tlemo approved these changes Nov 2, 2020

Naoya Maruyama added 3 commits November 2, 2020 15:01

PR feedback

1f93e86

clang-format

34c41ac

Add a comment on GridReduction::thread_predicate_

a341179

naoyam merged commit a4d48c3 into 20_10_20_devel Nov 2, 2020

naoyam deleted the fix-issue367 branch November 2, 2020 23:23
