
Splits CPU and CUDA fusion compilers #10981

Closed

mruberry wants to merge 21 commits into pytorch:master from mruberry:fusion_compiler_split

Conversation

@mruberry
Collaborator

@mruberry mruberry commented Aug 28, 2018

This PR splits the CPU and CUDA fusion compilers, putting them into a new jit/fusers/ directory with jit/fusers/common for common components. In particular:

  • A fusion interface is created that allows "fusion handles" to be requested
  • The CPU and CUDA fusers implement this interface, with dispatch determined by device
  • The fusion compilers, fusion function specializations and resource strings are split
  • CPU-specific classes like TempFile and DynamicLibrary are in the CPU fuser
  • Common classes like TensorDesc and the base fusion function class are in jit/fusers/common
  • There is still some specialization in jit/fusers/common, but these specializations are small(-ish)
  • The build system is updated to remove the dummy interface on Windows and to minimize the use of macros

This structure should allow in-flight PRs to easily rebase while providing a clear interface to the fusers.
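The dispatch idea in the bullets above can be sketched in a few lines of Python. The actual fusers are C++; every name below is an invented stand-in for illustration, not the PR's API:

```python
class FusionHandle:
    """Base interface: a compiled, runnable fused kernel."""
    def run(self, inputs):
        raise NotImplementedError

class CPUFusionHandle(FusionHandle):
    def run(self, inputs):
        return ("cpu", inputs)  # stand-in for running compiled CPU code

class CUDAFusionHandle(FusionHandle):
    def run(self, inputs):
        return ("cuda", inputs)  # stand-in for launching a CUDA kernel

# Registry keyed by device type; the real split keeps the backends in
# separate CPU and CUDA fuser directories behind a common interface.
_FUSERS = {
    "cpu": CPUFusionHandle,
    "cuda": CUDAFusionHandle,
}

def get_fusion_handle(device):
    """Request a fusion handle; dispatch is determined by device."""
    try:
        return _FUSERS[device]()
    except KeyError:
        raise RuntimeError("no fuser registered for device: " + device)
```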

@mruberry
Collaborator Author

Looks like all tests are passing. fyi @zdevito @apaszke

Contributor

@apaszke apaszke left a comment

Looks much better than the initial PR. Some naming requests and this should be ready to land. I'd also like @ezyang to double check the use of macros, as I'm not sure if it fits well with our build philosophy.

ezyang previously requested changes Sep 5, 2018
Contributor

@ezyang ezyang left a comment

Macro treatment is problematic.

Contributor

@apaszke apaszke left a comment

This LGTM. @ezyang can you please take a look at macros again?

@mruberry
Collaborator Author

mruberry commented Sep 7, 2018

Naming and config changes are in. I also took the three relevant PRs that updated the fusion_compiler and ported them to the split.

I am unsure what's going on with the two failures. CircleCI appears to be re-running itself (?), and pytorch-linux-trusty-py2.7.9 has had JIT problems (there's a PR to disable a test for it), but in this case there are three errors, not just test_scalar_fusion.


@apaszke
Contributor

apaszke commented Sep 10, 2018

@mruberry CI failure looks related

@apaszke
Contributor

apaszke commented Sep 10, 2018

Hmm that might be a transient issue. I can't reproduce this locally. Let's see if a retest fixes it.

@apaszke
Contributor

apaszke commented Sep 10, 2018

Nope, this seems to consistently fail 😕

@zou3519
Contributor

zou3519 commented Sep 10, 2018

CPU fuser tests are flaky and possibly broken (even on master); see #11360 for some discussion. I'm not sure why, but I think the pr/pytorch-linux-trusty-py2.7 machine runs out of memory. This needs more investigation.

We are looking into it.

@mruberry
Collaborator Author

As mentioned above, it is strange that three tests were failing consistently (test_scalar_fusion was also failing until it was disabled). I suspect these failures are revealed by this PR rather than caused by it.

@apaszke
Contributor

apaszke commented Sep 11, 2018

I agree, but unfortunately we can't merge this until we resolve the problem. It's likely that we will disable the CPU fuser by default, but we don't want to disable its tests entirely, to avoid undetected breakages of that code path. @zou3519 is looking into that.

@weiyangfb weiyangfb added the oncall: jit label Sep 11, 2018
@ezyang
Contributor

ezyang commented Sep 12, 2018

@pytorchbot retest this please

@zou3519
Contributor

zou3519 commented Sep 12, 2018

I'm still looking into it, sorry for the delay. Some strange things are going on.

@t-vi t-vi mentioned this pull request Sep 12, 2018
@zou3519
Contributor

zou3519 commented Sep 12, 2018

I posted an update in #11360 about what I've found. The summary is that test_jit.py has high peak memory usage that causes the graph fuser to fail when it runs the fork() syscall, despite fork having copy-on-write semantics.

The fastest solution to unblock this PR would be to move the fuser tests to run first in test_jit.py, so they execute before test_jit starts using a lot of memory. A more robust solution (which I am looking into right now) is to figure out why test_jit.py uses so much memory (more than 4 GB) and fix that.
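The two facts behind this failure can be illustrated with a small Python sketch (not PyTorch code): peak RSS is a high-water mark that stays elevated after frees, and fork() must account for the parent's whole address space up front even though copy-on-write defers the actual copying, so a memory-heavy parent can fail to fork:

```python
import resource

def peak_rss():
    # ru_maxrss is the process's high-water memory mark
    # (reported in KiB on Linux, bytes on macOS)
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss()
blob = bytearray(64 * 1024 * 1024)  # allocate and touch ~64 MiB
grown = peak_rss()                  # the peak has risen
del blob                            # freeing does not lower the peak
after_free = peak_rss()             # still at the high-water mark
```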

zou3519 and others added 2 commits September 13, 2018 10:28
Run TestEndToEndHybridFrontendModels last. It has high peak memory usage
that causes unrelated CPU fuser test failures if those tests run after.
A more robust fix for this issue is being tracked in pytorch#11360
@zou3519
Contributor

zou3519 commented Sep 13, 2018

@mruberry @apaszke I pushed a change to make TestEndToEndHybridFrontendModels run last in test_jit.py, which should avoid the CPU fuser test failures and unblock this PR. I am investigating a more robust solution but this PR should be good to go after the CI runs

Contributor

@facebook-github-bot facebook-github-bot left a comment

apaszke has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@apaszke
Contributor

apaszke commented Sep 14, 2018

Hmm, while OSS tests are passing, this has unfortunately triggered a failure in some internal tests... I'll push some code tomorrow morning that disables the fuser by default (except for the tests).

Also, disable grad mode in _check_trace, which greatly decreases
peak memory usage when inputs with requires_grad are used to trace.
@apaszke
Contributor

apaszke commented Sep 14, 2018

I have disabled the CPU fuser by default, except for the few tests that should actually exercise it. The memory usage regression should now be addressed thanks to @zou3519, who noticed that we run _check_trace with grad mode enabled (which forces us to keep two graphs in memory). We still don't have a definite answer for why the peak RSS persists in many environments, but it looks like free is simply very unwilling to return memory to the OS.
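The grad-mode fix is easiest to see as a pattern. Below is a minimal sketch with invented names (not the real torch.no_grad or _check_trace): while grad mode is on, every op records a node, so checking a trace with it enabled keeps a second graph alive; disabling it during the check records nothing:

```python
from contextlib import contextmanager

class GradMode:
    enabled = True  # global flag, mimicking autograd's grad mode

@contextmanager
def no_grad():
    # Temporarily disable recording, restoring the prior state on exit
    prev = GradMode.enabled
    GradMode.enabled = False
    try:
        yield
    finally:
        GradMode.enabled = prev

graph = []  # stand-in for autograd's recorded graph

def op(x):
    if GradMode.enabled:
        graph.append(("add1", x))  # record a node only in grad mode
    return x + 1

traced = op(1)        # grad mode on: records a node
with no_grad():
    checked = op(2)   # grad mode off: same result, nothing recorded
```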

Contributor

@facebook-github-bot facebook-github-bot left a comment

apaszke has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Sep 20, 2018
Summary:
This patch adds fused forward and backward for clamp to the jit.
This is one item of #11118. If it's OK, I'd be happy to also add some more of #11118.

The patch depends on #11150, which I merged into master as a base. I'll rebase it when that or #10981 is merged.

This is my first serious jit patch; thank you, ngimel and the others, for your guidance. All errors are my own.
Pull Request resolved: #11574

Differential Revision: D9943090

Pulled By: apaszke

fbshipit-source-id: c40954b8c28c374baab8d3bd89acc9250580dc67
@mruberry mruberry deleted the fusion_compiler_split branch September 25, 2018 16:42

Labels

oncall: jit, open source


7 participants