[PyTorch] [gradcheck] change backward() to grad() #7710
Conversation
… Replace unnecessary create_graph=True with retain_graph=True
The current gradgradcheck is broken because the inputs, containing the generated non-contiguous grad_outputs, end up as copies that are disconnected from the function the check differentiates, so both the numerical and the analytical gradient with respect to the grad_output come out as zero:

>>> # print numerical and analytical grad wrt the grad_output
>>> torch.autograd.gradgradcheck(lambda x: x, [torch.randn(3).requires_grad_()], gen_non_contig_grad_outputs=False)
(tensor([[ 1.0133,  0.0000,  0.0000],
         [ 0.0000,  1.0133,  0.0000],
         [ 0.0000,  0.0000,  1.0133]]), tensor([[ 1.,  0.,  0.],
         [ 0.,  1.,  0.],
         [ 0.,  0.,  1.]]))
>>> torch.autograd.gradgradcheck(lambda x: x, [torch.randn(3).requires_grad_()], gen_non_contig_grad_outputs=True)
(tensor([[ 0.,  0.,  0.],
         [ 0.,  0.,  0.],
         [ 0.,  0.,  0.]]), tensor([[ 0.,  0.,  0.],
         [ 0.,  0.,  0.],
         [ 0.,  0.,  0.]]))

Also fixing this.
torch/autograd/gradcheck.py (outdated):
    input = contiguous(input)
    target = contiguous(target)
    if not all(t.is_contiguous() for t in iter_tensors(target)):
        raise RuntimeError("target must only contain contiguous Tensors")
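For reference, a compressed, self-contained sketch of the check being discussed; iter_tensors is the helper name taken from the diff, and its body here is an assumption about its behaviour, not the actual gradcheck source:

    import torch

    def iter_tensors(x):
        # Assumed behaviour of the helper: yield every Tensor in a
        # possibly nested input structure.
        if isinstance(x, torch.Tensor):
            yield x
        elif isinstance(x, (list, tuple)):
            for elem in x:
                yield from iter_tensors(elem)

    def check_target(target):
        # The enforcement shown in the diff: every tensor in `target`
        # has to be contiguous, otherwise the check raises.
        if not all(t.is_contiguous() for t in iter_tensors(target)):
            raise RuntimeError("target must only contain contiguous Tensors")

    check_target(torch.randn(3))              # fine
    # check_target(torch.randn(3, 4).t())     # would raise: .t() is non-contiguous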
torch/autograd/gradcheck.py (outdated):
    input_args = tuple(x for x in input_args if isinstance(x, torch.Tensor) and x.requires_grad)
    grad_inputs = torch.autograd.grad(outputs, input_args, grad_outputs, create_graph=True)
    grad_inputs = torch.autograd.grad(outputs, input_args, grad_outputs,
                                      create_graph=True, only_inputs=True, allow_unused=True)
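This call is the heart of the gradgradcheck path: the first-order gradients have to come from torch.autograd.grad with create_graph=True so they can be differentiated a second time. A minimal sketch (the tensor names and the function are illustrative, not the gradcheck internals):

    import torch

    x = torch.randn(3, dtype=torch.double, requires_grad=True)
    y = (x * x).sum()
    gO = torch.tensor(1., dtype=torch.double, requires_grad=True)  # grad_output

    # create_graph=True records the backward pass itself;
    # allow_unused=True tolerates inputs that do not influence the output.
    gx, = torch.autograd.grad(y, (x,), gO, create_graph=True, allow_unused=True)

    # gx == 2 * x * gO is itself differentiable, so a second grad() call works.
    ggx, = torch.autograd.grad(gx.sum(), (x,))
    print(ggx)  # 2.0 for every element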
torch/autograd/gradcheck.py (outdated):
@@ -200,18 +197,19 @@ def fn(input):
        zero_gradients(inputs)
I just got rid of the contiguity enforcement altogether. Hope that everything still works.
    ggO,
    ggW_maybe_squeeze.expand_as(gO) * gO * nonpositive_mask,
    (ggI * gO * nonpositive_mask).sum()
    (ggI * gO * nonpositive_mask).sum().expand_as(weight)
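The change on the last line addresses the "PReLU double backward always returning ggW as a scalar" item from the commit message: with a single shared PReLU weight the reduction collapses to a 0-dim tensor and has to be expanded back to the weight's shape. A standalone illustration with made-up shapes (not the actual double-backward code):

    import torch

    weight = torch.randn(1)                 # PReLU with one shared parameter
    gO = torch.randn(4, 3)                  # incoming gradient w.r.t. the output
    ggI = torch.randn(4, 3)                 # grad w.r.t. the computed grad_input
    nonpositive_mask = (torch.randn(4, 3) <= 0).type_as(gO)

    term = (ggI * gO * nonpositive_mask).sum()
    print(term.shape)                       # torch.Size([]) -- a 0-dim scalar

    # Expanding restores the shape expected for a gradient of `weight`.
    print(term.expand_as(weight).shape)     # torch.Size([1])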
@apaszke This is ready for review.
apaszke left a comment:
LGTM. One concern about gradgradcheck that would be good to clarify before merging.
    # need data here to get around the version check because without .data,
    # the following code updates version but doesn't change content
    x_tensor = x_tensor.data
    for d_idx, x_idx in enumerate(product(*[range(m) for m in x_tensor.size()])):
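For context on the .data trick above, here is a minimal central-difference sketch of estimating a gradient by perturbing one element at a time; it is an illustration under simplified assumptions, not the actual gradcheck code:

    import torch
    from itertools import product

    def numerical_grad(fn, x, eps=1e-6):
        # Perturb through `.data` so the in-place edits do not bump the
        # autograd version counter of `x` (the point of the trick in the diff).
        x_tensor = x.data
        grad = torch.zeros_like(x_tensor)
        for idx in product(*[range(m) for m in x_tensor.size()]):
            orig = x_tensor[idx].item()
            x_tensor[idx] = orig + eps
            f_plus = fn(x).item()
            x_tensor[idx] = orig - eps
            f_minus = fn(x).item()
            x_tensor[idx] = orig
            grad[idx] = (f_plus - f_minus) / (2 * eps)
        return grad

    x = torch.randn(3, 4, dtype=torch.double, requires_grad=True)
    print(numerical_grad(lambda t: (t * t).sum(), x))  # close to 2 * x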
torch/autograd/gradcheck.py (outdated):
    var.requires_grad = True
    return var
    y = torch.testing.make_non_contiguous(y).requires_grad_()
    return y
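Why the .requires_grad_() on the new line matters: torch.testing.make_non_contiguous (the helper used in the diff; it may not be available in recent PyTorch releases) returns a fresh tensor, and if grad is not re-enabled on it the generated grad_output is a constant to autograd, which is exactly the all-zero gradgradcheck result shown earlier. A small sketch:

    import torch

    x = torch.randn(3, requires_grad=True)

    # Without re-enabling grad, the non-contiguous grad_output is a constant,
    # so the returned gradient cannot be differentiated a second time.
    gO = torch.testing.make_non_contiguous(torch.randn(3))
    g, = torch.autograd.grad(x * 2, x, gO, create_graph=True)
    print(g.requires_grad)   # False

    # With .requires_grad_(), the double-backward graph stays connected.
    gO = torch.testing.make_non_contiguous(torch.randn(3)).requires_grad_()
    g, = torch.autograd.grad(x * 2, x, gO, create_graph=True)
    print(g.requires_grad)   # True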
    def new_func(*args):
        input_args = args[:-num_outputs]
        grad_outputs = args[-num_outputs:]
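The splitting above is how gradgradcheck reduces to an ordinary gradcheck: the wrapped function takes the original inputs followed by the grad_outputs and returns the first-order gradients. A rough, self-contained sketch of that wrapping (not the real implementation):

    import torch

    def gradgradcheck_sketch(func, inputs, grad_outputs):
        num_outputs = len(grad_outputs)

        def new_func(*args):
            input_args = args[:-num_outputs]
            grad_outputs_ = args[-num_outputs:]
            outputs = func(*input_args)
            outputs = (outputs,) if isinstance(outputs, torch.Tensor) else outputs
            diff_inputs = tuple(x for x in input_args
                                if isinstance(x, torch.Tensor) and x.requires_grad)
            # Differentiable first-order gradients become the outputs to check.
            return torch.autograd.grad(outputs, diff_inputs, grad_outputs_,
                                       create_graph=True, allow_unused=True)

        return torch.autograd.gradcheck(new_func, tuple(inputs) + tuple(grad_outputs))

    x = torch.randn(3, dtype=torch.double, requires_grad=True)
    gO = torch.randn(3, dtype=torch.double, requires_grad=True)
    print(gradgradcheck_sketch(lambda t: t * t, (x,), (gO,)))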
…e2_core_hip

* 'caffe2_core_hip' of github.com:petrex/pytorch: (24 commits)
  Allow empty storage for the 'Edge' class. (pytorch#7595)
  Process group base class and Gloo implementation (pytorch#7628)
  _LRSchedulers getstate include optimizer info (pytorch#7757)
  [PyTorch] [gradcheck] change backward() to grad() (pytorch#7710)
  Update test_nn.py (pytorch#7787)
  Define general default scheduler for TBB and fix ppc64le bug (pytorch#7761)
  Add support for accepting Tensor as input in clip_grad_* functions. (pytorch#7769)
  [Easy] Remove unused code (pytorch#7782)
  Update tbb (pytorch#7734)
  Add @generated annotation (pytorch#7780)
  fix legacy comment after variable tensor merge (pytorch#7771)
  Revert pytorch#7750 and pytorch#7762 to fix Windows CI on master (pytorch#7772)
  Temporarily disable build env check (pytorch#7768)
  Add missing brace (pytorch#7762)
  [C++ API] Add backward() to Tensor and Variable (pytorch#7750)
  [auto] Update onnx to d43b550 - Fix .gitignore and add missing files (onnx/onnx#1005) onnx/onnx@d43b550
  [auto] Update onnx to ea1aa13 - add tests for reduce ops (onnx/onnx#675) onnx/onnx@ea1aa13
  include cudnn_h (pytorch#7749)
  [C++ API] Using new registration mechanism (pytorch#7663)
  [auto] Update onnx to 5dd68e6 - Add a util function: polish_model (onnx/onnx#1000) onnx/onnx@5dd68e6
  ...
* Change backward calls to grad to avoid memory leak from pytorch#7343; replace unnecessary create_graph=True with retain_graph=True
* fix gradgradcheck use of make_non_contiguous
* allow non-contiguous target
* remove unnecessary .grad.zero_()
* remove contiguous_detach
* fix PReLU double backward always returning ggW as a scalar
* let noncontig gO require grad
* move requires_grad to return
Change backward calls to grad to avoid memory leak from #7343
Replace unnecessary create_graph=True with retain_graph=True
The memory leak is blocking #7270.
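For readers skimming the diff, the practical difference between the two calls in the test utilities, as a hedged sketch (the actual leak mechanics are discussed in #7343):

    import torch

    x = torch.randn(5, requires_grad=True)
    y = (x ** 2).sum()

    # backward() accumulates into x.grad, so test code has to zero or ignore
    # the leaves between iterations (the zero_gradients() pattern being removed).
    y.backward(retain_graph=True)
    print(x.grad)

    # grad() hands the gradients back directly and leaves x.grad untouched,
    # which is the form the gradcheck/gradgradcheck code now uses.
    g, = torch.autograd.grad(y, x)
    print(g.equal(x.grad))  # True: same values, but nothing accumulated on x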