add autodiff expressions for common operations #11832
zou3519 wants to merge 18 commits into pytorch:master from
Conversation
zou3519 force-pushed from 201f382 to c14e20d
There is something really fishy going on with LSTM's double backward (that is the test that is failing). Disabling the autodiff formula for mul(Tensor, Scalar) causes the test to pass; re-enabling it causes the test to fail. As far as I can tell the autodiff formula is correct, so I'm not sure what is going on. (It's possible that it's a numeric precision issue, but the gradients are off by ~100, and they are on the order of magnitude of 1000.)
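As a quick out-of-band sanity check of the formula under suspicion (plain Python finite differences, not torch, and not part of the test suite): for f(x) = x * c, the gradient with respect to x should be exactly c.

```python
# Hypothetical numeric check of the mul(Tensor, Scalar) autodiff rule:
# d(x * c)/dx = c. Uses plain Python so it stands alone without torch.
def numeric_grad(fn, x, eps=1e-6):
    # central finite difference for f'(x)
    return (fn(x + eps) - fn(x - eps)) / (2 * eps)

c = 2.5
analytic = c  # what the autodiff formula produces for mul(Tensor, Scalar)
numeric = numeric_grad(lambda x: x * c, 4.0)
assert abs(analytic - numeric) < 1e-5
```

If the symbolic formula is right, the kind of ~100 discrepancy described above would point at precision or graph-construction issues rather than the formula itself.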
@apaszke thank you for pointing that out! (I would have gone on a long search for the cause otherwise...)
zou3519 force-pushed from c14e20d to 1a5d564
Previously test_jit.py only tested autograd on backward graphs. This is because we borrow from test_autograd and construct graphs with a small number of nodes. Because the number of nodes is small (typically 1-2), those graphs do not end up containing autodiff subgraphs, so autodiff never gets tested.

This PR enables autodiff testing by doing the following:
- added a disableDebugAutodiffSubgraphInlining fn to graph_executor to disable autodiff subgraph inlining.
- (implementation) added autodiffSubgraphNodeThreshold and autodiffSubgraphInlineThreshold. These are set to their default values (2, 5), but disableDebugAutodiffSubgraphInlining() sets both to 1, disabling subgraph inlining and allowing 1-node autodiff subgraphs.
- The relevant backward jit tests disable autodiff subgraph inlining so they will test the autodiff versions of the operators instead of autograd whenever an autodiff variant exists.
- We don't run the tests that do inline autodiff subgraphs anymore. This has no impact on testing correctness because the assumption is that autograd functions are correct and are tested in test_autograd.py.
- Added div
- Added scalar-tensor and tensor-scalar overloads
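Conceptually, each autodiff expression is a closed-form gradient rule per op overload. The sketch below models that idea in plain Python; the names and the table are illustrative only (the real registrations are C++ in the JIT, not a Python dict):

```python
# Conceptual model of per-op autodiff formulas (illustrative; the actual
# formulas live in the JIT's C++ sources). Each entry maps an op signature
# to a function from (grad_output, inputs) to input gradients.
AUTODIFF_FORMULAS = {
    # tensor-scalar overloads: the scalar operand needs no gradient
    "mul(Tensor, Scalar)": lambda grad_out, x, c: grad_out * c,
    "div(Tensor, Scalar)": lambda grad_out, x, c: grad_out / c,
    # tensor-tensor div: d(a/b)/da = 1/b, d(a/b)/db = -a/b^2
    "div(Tensor, Tensor)": lambda grad_out, a, b: (grad_out / b,
                                                   -grad_out * a / (b * b)),
}

# Plain floats stand in for tensors just to exercise the formulas.
g = AUTODIFF_FORMULAS["mul(Tensor, Scalar)"](1.0, 2.0, 3.5)
assert g == 3.5
da, db = AUTODIFF_FORMULAS["div(Tensor, Tensor)"](1.0, 6.0, 2.0)
assert (da, db) == (0.5, -1.5)
```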
zou3519 force-pushed from 1a5d564 to 969a0ab
Should be ready for review now. I excluded some advanced indexing tests from running because they fail autodiff tests (they were previously incorrect, but this PR exposed them), and I don't see an easy fix for them. The root cause is #12024.
apaszke left a comment:
Nice! Haven't reviewed all AD formulas, but they should be tested anyway.
test/test_jit.py (Outdated)

```python
def assert_fully_differentiable_graph(graph):
    nodes = list(graph.nodes())
    if not (len(nodes) == 1 and 'prim::DifferentiableGraph' in str(nodes[0])):
```
torch/csrc/jit/graph_executor.cpp (Outdated)

```cpp
friend struct GraphExecutor;

size_t autodiffSubgraphNodeThreshold = 2;
size_t autodiffSubgraphInlineThreshold = 5;
```
facebook-github-bot left a comment:
zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
* upstream/master: (117 commits)
  - Add full namespace resolution in CAFFE_DURATION (pytorch#12065)
  - T33898723: Simple put operators for caffe2 stats (pytorch#12057)
  - add narrow() support for sparse tensors re: pytorch#8853 (pytorch#11342)
  - Fix ONNX bug, add symbolic for full
  - Enable tracing of tensor factories with an out argument
  - Fix warnings emitted when testing distributions (pytorch#12038)
  - Unify versions across setup.py, libtorch, and libcaffe2 (pytorch#12053)
  - add autodiff expressions for common operations (pytorch#11832)
  - Blob doesn't allow access to destroyCall anymore (pytorch#11548)
  - IValue can store Blob (pytorch#11414)
  - Move Blob to ATen/core (pytorch#11924)
  - Use tempfile during serialized test comparison (pytorch#12021)
  - fix segfault when grad to a hook fn is None (pytorch#12028)
  - Fallback CreateMutex/AtomicIter operators for mkl-dnn
  - Unify all *_EXPORT and *_IMPORT macros across c++ backend (pytorch#12019)
  - Add safety asserts for methods on TensorImpl which don't work on Variable. (pytorch#12058)
  - Make USE_IDEEP work again (pytorch#12026)
  - Fix "identifier following the 'template' keyword does not refer to a template" (pytorch#12037)
  - Delete some unused variables. (pytorch#12059)
  - Support TypeIdentifier::name() (pytorch#12036)
  - ...
This PR does a few things:
Enable testing autodiff in test_jit.py
Previously test_jit.py only tested autograd on backward graphs. This is because we borrow from test_autograd and construct graphs with a small number of nodes. Because the number of nodes is small (typically 1-2), those graphs do not end up containing autodiff subgraphs, so autodiff never gets tested.

This PR enables autodiff testing by doing the following:
- added a disableDebugAutodiffSubgraphInlining fn to graph_executor to disable autodiff subgraph inlining.
- (implementation) added autodiffSubgraphNodeThreshold and autodiffSubgraphInlineThreshold. These are set to their default values (2, 5), but disableDebugAutodiffSubgraphInlining() sets both to 1, disabling subgraph inlining and allowing 1-node autodiff subgraphs.
- The relevant backward jit tests disable autodiff subgraph inlining so they will test the autodiff versions of the operators instead of autograd whenever an autodiff variant exists.
- We don't run the tests that do inline autodiff subgraphs anymore. This has no impact on testing correctness because the assumption is that autograd functions are correct and are tested in test_autograd.py.
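A rough sketch of the thresholding behavior described above, in plain Python. This is not the actual C++ implementation in graph_executor.cpp; the run-grouping is deliberately simplified to just the two threshold checks:

```python
# Simplified model of autodiff subgraph creation and inlining (illustrative
# only; the real logic lives in the JIT's C++ passes).
def plan_subgraphs(differentiable_runs, node_threshold=2, inline_threshold=5):
    """differentiable_runs: lengths of consecutive runs of differentiable nodes.
    Returns the run lengths that survive as DifferentiableGraph subgraphs."""
    # A run only becomes a subgraph if it has at least node_threshold nodes...
    subgraphs = [n for n in differentiable_runs if n >= node_threshold]
    # ...and subgraphs smaller than inline_threshold get inlined back out.
    return [n for n in subgraphs if n >= inline_threshold]

# Default thresholds (2, 5): 1-node runs never form subgraphs, and a 3-node
# subgraph is inlined away, so only the 7-node subgraph survives.
assert plan_subgraphs([1, 3, 7]) == [7]
# Debug mode sets both thresholds to 1, so even 1-node subgraphs survive.
assert plan_subgraphs([1, 3, 7], node_threshold=1, inline_threshold=1) == [1, 3, 7]
```

This is why the debug switch is needed for testing: with the defaults, the tiny 1-2 node test graphs never exercise the autodiff path at all.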
Add autodiff for a lot of ops
This allows the graph fuser to work better, because many of these ops were previously fusible but not autodiff-compatible. As a concrete example, LSTM backward contains a lot of tensor-scalar operations; these autodiff formulas help its double backward pass.
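One reason tensor-scalar ops are pleasant for double backward: their gradient in the tensor input is a constant, so the second derivative is zero. A plain-Python illustration (no torch; the helper name is made up for this sketch):

```python
# For a tensor-scalar op like f(x) = x * c or f(x) = x / c, df/dx is a
# constant (c or 1/c), so d2f/dx2 should vanish. Checked numerically below.
def second_derivative(fn, x, eps=1e-3):
    # central finite difference for f''(x)
    return (fn(x + eps) - 2 * fn(x) + fn(x - eps)) / (eps * eps)

c = 3.0
for f in (lambda x: x * c, lambda x: x / c):
    assert abs(second_derivative(f, 2.0)) < 1e-6
```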
Included:
Test Plan
TestJitGenerated tests autodiff for all of the added operations.
cc @apaszke @zdevito