Don't flatten output lists in the JIT IR#10949

Closed

apaszke wants to merge 12 commits into pytorch:master from apaszke:jit_chunk_outputs

Contributor

apaszke commented Aug 28, 2018

Operators like aten::chunk used to return a number of tensors, but
now return a list. To make it easier to do shape prop through
aten::chunk and fuse it, I've also introduced prim::ConstantChunk,
which behaves like the previous implementation (has a variable length
output list).
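
To illustrate why a chunk op with a constant number of chunks and a constant dimension is convenient for shape propagation, here is a minimal sketch of the arithmetic, following the documented behaviour of `torch.chunk`. It is not code from this PR: `chunk_output_shapes` is a hypothetical helper written in plain Python, and it assumes the input has at least one dimension and a non-empty size along `dim`.

```python
from typing import List, Tuple


def chunk_output_shapes(
    shape: Tuple[int, ...], chunks: int, dim: int
) -> List[Tuple[int, ...]]:
    """Hypothetical helper: shapes of the pieces torch.chunk would return.

    Assumes len(shape) >= 1 and shape[dim] > 0. This only models the
    shape arithmetic; it does not touch any PyTorch API.
    """
    if chunks <= 0:
        raise ValueError("chunks must be positive")
    dim = dim % len(shape)  # allow negative dimensions, as torch does
    size = shape[dim]
    # Each chunk has ceil(size / chunks) elements along `dim`,
    # except possibly the last one, which holds the remainder.
    split_size = (size + chunks - 1) // chunks
    shapes = []
    start = 0
    while start < size:
        length = min(split_size, size - start)
        shapes.append(shape[:dim] + (length,) + shape[dim + 1:])
        start += length
    return shapes


# Four equal gate slices, as in an LSTM cell: four shapes of (4, 2).
print(chunk_output_shapes((4, 8), 4, 1))
# Uneven split: the last chunk is smaller.
print(chunk_output_shapes((5, 3), 3, 0))
```

When `chunks` and `dim` are known statically, every output shape follows from the input shape alone, which is the property shape propagation and fusion need. Note that, as with `torch.chunk`, the helper may return fewer pieces than requested when the size does not divide suitably.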

The downside of this PR is that introducing more lists to the IR causes the graph executor to consider the LSTM and MiLSTM graphs non-differentiable. I verified that they are still optimized correctly, and my next patch (which changes how the specializations/differentiation work) will restore those.

apaszke requested review from colesbury, ezyang, gchanan, soumith and zdevito as code owners

August 28, 2018 15:34

zou3519 added the oncall: jit label

zou3519 reviewed

test/expect/TestJit.test_decompose_addmm.expect (outdated, resolved)

apaszke mentioned this pull request

Change specialization rules in GraphExecutors #10977

Closed

zdevito approved these changes

Contributor

zdevito left a comment

This looks correct. I have some minor things to change to keep things clean.

torch/csrc/jit/register_prim_ops.cpp (outdated, resolved)

torch/csrc/jit/tracer.h (outdated, resolved)

torch/csrc/jit/ir.h (outdated, resolved)

torch/csrc/jit/passes/canonicalize_ops.cpp (outdated, resolved)

torch/csrc/jit/register_prim_ops.cpp (outdated, resolved)

apaszke force-pushed the jit_chunk_outputs branch from 281c854 to 4d9a4ac Compare

August 29, 2018 15:14

facebook-github-bot reviewed

Contributor

facebook-github-bot left a comment

apaszke has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

apaszke requested review from Yangqing, anderspapitto, bddppq, dzhulgakov, houseroad, jamesr66a and smessmer as code owners

August 29, 2018 22:23

apaszke force-pushed the jit_chunk_outputs branch 2 times, most recently from 673bb10 to a645b8f Compare

August 30, 2018 04:49

apaszke added 5 commits

August 30, 2018 13:59


          Don't flatten output lists in the JIT IR

2eca727

Operators like aten::chunk used to return a number of tensors, but
now return a list. To make it easier to do shape prop through
aten::chunk and fuse it, I've also introduced prim::ConstantChunk,
which behaves like the previous implementation (has a variable length
output list).


          Review comments

78779d2


          Fix for insane scoping rules in Python 2

03c1a6d


          Remove TORCH_API from templated functions

209dd53


          Fix ONNX problems + review comments

35d3fdb

apaszke added 7 commits

August 30, 2018 13:59


          Fix ConstantChunk op

deecec1


          Rebase fixes

3e20552


          Fix for Werror build

fea68cb


          Another fix for macOS CI

928ee96


          Fixes for ONNX export

3d22265


          Fix expect file

b43e78d


          Rebase fixes

ee4f855

apaszke force-pushed the jit_chunk_outputs branch from 2528423 to ee4f855 Compare

August 30, 2018 21:12

facebook-github-bot closed this in

f3c3127

apaszke deleted the jit_chunk_outputs branch

August 31, 2018 04:35

facebook-github-bot pushed a commit that referenced this pull request


          Change specialization rules in GraphExecutors (#10977)

00df09b

Summary:
**Review last commit only.** Stacked on top of #10949.

This commit fixes a number of issues connected to caching
differentiability status of graphs inside graph executors,
and changes the rules for optimization of differentiable subgraphs.
Previously every one of those was instantiated as a separate graph
executor, but now they are simply heavier-optimized graph regions,
and graph executors are only instantiated for their backward.

zdevito
Pull Request resolved: #10977

Differential Revision: D9600626

Pulled By: apaszke

fbshipit-source-id: dad09a0f586e396afbd5406319c1cd54fbb8a3d3

apaszke mentioned this pull request

[JIT] Fixing split/chunk and other operators that return lists #10793

Closed

PenghuiCheng pushed a commit to PenghuiCheng/pytorch that referenced this pull request


          Don't flatten output lists in the JIT IR (pytorch#10949)

09ee738

Summary:
Operators like aten::chunk used to return a number of tensors, but
now return a list. To make it easier to do shape prop through
aten::chunk and fuse it, I've also introduced prim::ConstantChunk,
which behaves like the previous implementation (has a variable length
output list).

The downside of this PR is that introducing more lists to the IR causes the graph executor to consider the LSTM and MiLSTM graphs non-differentiable. I verified that they are still optimized correctly, and my next patch (which changes how the specializations/differentiation work) will restore those.

zdevito
Pull Request resolved: pytorch#10949

Reviewed By: zdevito

Differential Revision: D9556823

Pulled By: apaszke

fbshipit-source-id: 33e63b17fc7247cac6cfc05eb7eb9bf069b499ee

PenghuiCheng pushed a commit to PenghuiCheng/pytorch that referenced this pull request


          Change specialization rules in GraphExecutors (pytorch#10977)

f599697

Summary:
**Review last commit only.** Stacked on top of pytorch#10949.

This commit fixes a number of issues connected to caching
differentiability status of graphs inside graph executors,
and changes the rules for optimization of differentiable subgraphs.
Previously every one of those was instantiated as a separate graph
executor, but now they are simply heavier-optimized graph regions,
and graph executors are only instantiated for their backward.

zdevito
Pull Request resolved: pytorch#10977

Differential Revision: D9600626

Pulled By: apaszke

fbshipit-source-id: dad09a0f586e396afbd5406319c1cd54fbb8a3d3

ezyang added the open source and merged labels

Reviewers

zou3519 left review comments

facebook-github-bot left review comments

zdevito approved these changes

colesbury: awaiting requested review

ezyang: awaiting requested review

gchanan: awaiting requested review

soumith: awaiting requested review

anderspapitto: awaiting requested review

bddppq: awaiting requested review

dzhulgakov: awaiting requested review

houseroad: awaiting requested review

jamesr66a: awaiting requested review

smessmer: awaiting requested review

Yangqing: awaiting requested review

Labels

oncall: jit, open source