[CUDA_FUSER] Expand operation support for cuda fuser #37849
Closed
jjsjann123 wants to merge 62 commits into pytorch:master from
Conversation
…our op declaration / or revisit type promotion
…ul, add_alpha, and sub_alpha.
… that Christian is working on a fix
…as to be solved either by defining them all internal to the fuser or somehow including a CUDA path.
Collaborator
Author
The Windows test seems to be jammed... I clicked the rerun button, which somehow only spins up the tests that already passed 🤦
csarofeen
approved these changes
May 7, 2020
Contributor
facebook-github-bot
left a comment
@soumith has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Contributor
glaringlee
pushed a commit
that referenced
this pull request
May 27, 2020
This is to reland #38675, testing cpp_extension compatibility in _test only; this is enough, since the purpose of the test is to make sure pytorch and cpp extensions are compatible with xenial + cuda 9.2 + gcc 5.4. Two changes that are not gcc 5.4 (+ cuda 9.2) compatible were introduced recently: #37849 #38627, which caused the following failures: https://app.circleci.com/pipelines/github/pytorch/pytorch/173756/workflows/7445e169-9c26-4ec4-a23a-ff6160d155b1/jobs/5582207/steps https://app.circleci.com/pipelines/github/pytorch/pytorch/173970/workflows/bf0de0f2-9156-4c8f-a097-53ca8e20d4b0/jobs/5589265/steps The root cause is that gcc 5.4 does not support uniform initialization lists well: it cannot deduce the correct type in some cases. This is probably a bug in the gcc 5 compiler, so I modified the code slightly to make it compatible with cuda 9.2 + gcc 5.4. People are still using xenial + gcc 5.4 + cuda 9.x, and this environment should be covered until xenial is deprecated. Differential Revision: [D21731026](https://our.internmc.facebook.com/intern/diff/D21731026) [ghstack-poisoned]
jjsjann123
added a commit
to jjsjann123/nvfuser
that referenced
this pull request
Oct 29, 2022
Summary: This PR adds more supported operations to the CUDA fuser, covering the major point-wise operations supported in the legacy fuser. In an attempt to adapt to the legacy executor: 1. added a naive shape propagation pass on the pytorch JIT IR; 2. small refactor of graph partitioning; 3. fallback interpreter execution of the fusion group. Pull Request resolved: pytorch/pytorch#37849 Reviewed By: yf225 Differential Revision: D21444320 Pulled By: soumith fbshipit-source-id: 712e18ab8497f8d58a07e6f8d200cdab52cf0d74
This PR adds more supported operations to the CUDA fuser, covering the major point-wise operations supported in the legacy fuser.
In an attempt to adapt to the legacy executor:
1. added a naive shape propagation pass on the pytorch JIT IR;
2. small refactor of graph partitioning;
3. fallback interpreter execution of the fusion group.