Expose gradients w.r.t. input & weight for conv1d, conv2d, conv3d in Python #5408
ezyang merged 7 commits into pytorch:master from vedanuj:conv_2d_backward
Conversation
So, the reservation from Adam is valid, but not in the way he expressed it. I think the right design would be a separate lookup table that branches one step further. For example: This way, all this extra API is encapsulated in the
@ezyang Yes, makes sense. If I have understood correctly, we should add all the gradient functions we will expose (now or in the future) via a separate module called
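As an illustrative sketch of that layout (all names here are hypothetical, not the actual PyTorch API), the gradient helpers could be grouped behind a single `grad` namespace object so only one new attribute is added to the functional module:

```python
from types import SimpleNamespace

def conv2d_input(input_size, weight, grad_output):
    """Placeholder: gradient of conv2d w.r.t. its input."""
    raise NotImplementedError("a real version would use conv_transpose2d")

def conv2d_weight(input_size, grad_output, x):
    """Placeholder: gradient of conv2d w.r.t. its weight."""
    raise NotImplementedError("a real version would use conv2d with swapped strides/dilations")

# Callers then write F.grad.conv2d_input(...) rather than a flat
# F.conv2d_input(...), keeping the extra API in one place.
grad = SimpleNamespace(conv2d_input=conv2d_input, conv2d_weight=conv2d_weight)
```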
@vedanuj correct
I thought about the namespace, and it's not a significantly better solution (the complexity remains the same), but I'm ok with it. One extra thing that we might want to think of before we start adding more and more functions there is that sometimes the derivative computations for a single op can share a lot of intermediates, and so we might need to redesign the API (to avoid having a function per computed gradient).
Needs tests.
The last commit added tests for all the grad functions and addresses other comments. Tests:
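The test code itself is not shown above. As a hypothetical illustration (not the PR's actual tests), checks for gradient functions typically compare the analytic gradient against a central finite-difference estimate:

```python
import numpy as np

def numeric_grad(f, x, eps=1e-6):
    """Central finite-difference estimate of df/dx for a scalar-valued f."""
    g = np.zeros_like(x)
    for i in range(x.size):
        xp, xm = x.copy(), x.copy()
        xp.flat[i] += eps
        xm.flat[i] -= eps
        g.flat[i] = (f(xp) - f(xm)) / (2 * eps)
    return g

# Example: f(x) = sum(x**2) has analytic gradient 2*x.
x = np.array([1.0, 2.0, 3.0])
assert np.allclose(numeric_grad(lambda v: (v ** 2).sum(), x), 2 * x, atol=1e-4)
```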
@pytorchbot test this please
The commit 23a6b3a adds conv1d gradient w.r.t. input (
@pytorchbot retest this please
@pytorchbot retest this please
apaszke left a comment
LGTM, but I have some minor comments
New commits change some parameter names and remove the use of deprecated APIs. @apaszke TensorFlow's APIs for the similar methods require the full shape for input size and weight size. Should we keep it the same as TF, or change input size and weight size to only specify the spatial dimensions, like height, width, depth?
No, if we're going to accept the shape then it should be the full shape. Let's leave it as is.
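To make the "full shape" convention concrete, here is a small framework-free sketch (the helper name is mine, not the PR's): the size arguments cover the batch and channel dimensions as well as the spatial ones, as in TensorFlow.

```python
def conv2d_output_shape(input_size, weight_size, stride=(1, 1), padding=(0, 0)):
    """Compute the output shape of a 2-D convolution from FULL shapes:
    input_size = (N, C_in, H, W), weight_size = (C_out, C_in, kH, kW)."""
    n, c_in, h, w = input_size
    c_out, c_in_w, kh, kw = weight_size
    assert c_in == c_in_w, "channel dimensions must agree"
    oh = (h + 2 * padding[0] - kh) // stride[0] + 1
    ow = (w + 2 * padding[1] - kw) // stride[1] + 1
    return (n, c_out, oh, ow)

print(conv2d_output_shape((8, 3, 32, 32), (16, 3, 3, 3)))  # (8, 16, 30, 30)
```

Passing full shapes keeps the signature unambiguous: batch size and channel counts never have to be inferred from the other tensors.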
* upstream/master: (663 commits)
  Fix "command not found" error in perf test (pytorch#5982)
  add pip mkl-devel to the error message when mkl is found but mkl headers are not (pytorch#5984)
  Support batch LowerCholeskyTransform (pytorch#5980)
  Linearly interpolating upsampling fix (pytorch#5927)
  Store perf numbers in S3 (pytorch#5951)
  Modidy setup docs for Windows (pytorch#5981)
  Group Normalization (pytorch#5968)
  [distributions] Implement Power transform (pytorch#5976)
  Disable TestBottleneck test_cuda on Windows (pytorch#5977)
  Fix crash when cat-ing empty cuda tensors (pytorch#5971)
  Update no_unions flag for nanopb gen and update ONNX proto files (pytorch#5972)
  Expose gradients w.r.t. input & weight for conv1d, conv2d, conv3d in Python (pytorch#5408)
  Fixed non-determinate preprocessing on DataLoader (pytorch#4640)
  add AVX2 implementation for sigmoid function (pytorch#5010)
  Implement torch.util.bottleneck (pytorch#5216)
  Remove pragma once from cpp file (pytorch#5965)
  fix mvn docs (pytorch#5967)
  Fix incorrect rendering of Tensor.index_*_ doc examples. (pytorch#5969)
  Implement range for loop in script (pytorch#5827)
  Add windows doc (pytorch#5859)
  ...

# Conflicts:
#	aten/src/TH/generic/THTensorMath.c
#	torch/_tensor_docs.py
#	torch/csrc/generic/methods/TensorCompare.cwrap
…Python (pytorch#5408)

This PR addresses issue pytorch#5024

* Expose Conv2dBackward in python
* Separate interface for exposing gradients of operators
* Revert old changes
* Add tests
* Add conv1d gradients. Refactor tests for grad convolutions
* Refactor names and change examples
* Remove Variable from tests for conv backward
This PR addresses issue #5024 and exposes in Python the gradients with respect to input and weight for conv2d and conv3d.

Gradients with respect to the input of the convolution (F.grad.conv2d_input, F.grad.conv3d_input): For this, conv_transpose2d/conv_transpose3d are used, which are essentially the gradients of conv2d/conv3d w.r.t. input. The output_padding required for conv_transpose2d/conv_transpose3d is implicitly calculated from the shape of the input gradient, the shape of the output gradient, the kernel size, the stride and the padding.
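The transposed-convolution relationship can be sketched in a minimal 1-D, single-channel, stride-1 NumPy example (helper names are illustrative, not the PR's code): the gradient w.r.t. the input is a "full" correlation of the output gradient with the flipped kernel, which is exactly what a transposed convolution computes.

```python
import numpy as np

def conv1d(x, w):
    # Cross-correlation, stride 1, no padding (the deep-learning conv convention)
    k = len(w)
    return np.array([np.dot(x[i:i + k], w) for i in range(len(x) - k + 1)])

def conv1d_input_grad(grad_output, w, input_size):
    # Transposed convolution of grad_output with w: pad by k-1 on both
    # sides, then correlate with the flipped kernel.
    k = len(w)
    padded = np.pad(grad_output, k - 1)
    w_flipped = w[::-1]
    return np.array([np.dot(padded[i:i + k], w_flipped) for i in range(input_size)])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([1.0, 0.0, -1.0])
grad_out = np.ones_like(conv1d(x, w))          # gradient of sum(conv1d(x, w))
print(conv1d_input_grad(grad_out, w, len(x)))  # [ 1.  1.  0. -1. -1.]
```

Each input element x[j] receives a contribution w[j-i] from every output position i it participated in; the padded, flipped correlation above sums exactly those terms.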
Gradients with respect to the weight (F.grad.conv2d_weight, F.grad.conv3d_weight): For gradients with respect to the weight we use conv2d/conv3d, but with strides and dilations swapped. We reshape the output gradient and the input in such a way that the output gradient can be treated as the weight of the forward convolution.

Please review @ezyang