Fix scheduling for fp16 reductions#370

Merged
jjsjann123 merged 9 commits into 20_8_18_devel from fp16reduction on Sep 14, 2020

Conversation

@csarofeen
Owner

Fixes #362

Collaborator

@jjsjann123 jjsjann123 left a comment


LGTM. I'm not stamping it yet, as I haven't gotten around to trying the fix.

// them.
struct FindOutputs : public IterVisitor {
const std::unordered_set<Val*>& of_;
std::unordered_set<Val*> outs;
Collaborator


Nitpick: inconsistent naming between `of_` and `outs` (only one member has the trailing underscore).

traverseFrom(fusion, fusion->outputs(), false);
};

public:
Collaborator


Nitpick: `public:` is redundant in a struct, since struct members are public by default.

// Use dependency check to find the reduction tv as it returns used values
// instead of exprs.
auto used_vals = DependencyCheck::getAllValsBetween(
{fusion.inputs().begin(), fusion.inputs().end()}, fusion.outputs());
Collaborator


Does getAllValsBetween(...) have a smaller traversal space than fusion.exprs() because we might have trailing nodes after the outputs? Just curious about the motivation for this code change.

// tv->axis(-1)->parallelize(ParallelType::TIDx);
// }
}
TensorView* out0 = fusion->outputs()[0]->as<TensorView>();
Collaborator


The code below `out0` should be cleaned up as well; it was only used previously, when scheduling was marked explicitly in the transformation.

jjsjann123 added a commit that referenced this pull request Sep 14, 2020
Fixes #357. Two things in this PR:

Do type propagation even for the profiling executor -> this is the root cause of the bug reported in #357.
Allow a dtype argument in sum, which is simply handled in our type propagation.

This exposes a scheduling issue in #362, for which we added python tests (currently disabled). The fix is in PR #370; we will enable the tests after it merges.
@jjsjann123 jjsjann123 merged commit 83028a4 into 20_8_18_devel Sep 14, 2020
@jjsjann123 jjsjann123 deleted the fp16reduction branch September 14, 2020 23:15
Successfully merging this pull request may close these issues.

reduction with fp16 cast generates wrong kernel indexing

3 participants