
[WIP] Implement twice backward of ConvNd#1555

Closed
caogang wants to merge 24 commits into pytorch:master from caogang:conv_backward

Conversation

@caogang
Contributor

@caogang caogang commented May 15, 2017

Hi, all

I have tried to implement the backward of the backward of ConvNd, and I have run an example. The example runs without error, but I am not sure the results are correct.

Please review it, and let me know if there are any problems.

caogang and others added 24 commits May 7, 2017 09:43
* master:
  Add F.normalize (pytorch#1467)
  Expose custom attributes from C++ functions (pytorch#1430)
  Add high order gradient support for Sigmoid (pytorch#1496)
* conv:
  debug the segment fault of ConvBackwardBackward
  Fix the compile Error
  Change the methods called in ConvBackwardBackward
  Add twice differentiate for cudnn_conv
  Fix linear bug
  Using mask_fill instead and modify the allocation in inplace mode
  Modify ‘norm_type’ to ‘p’ and add TODO notes
  Using expand_as instead
  Add high order support for norm operator
  Add high order support for sqrt operator
  Modify the coding-style to satisfy PEP-8
  Simplify the implementation of relu and threshold operator
  Add high order support for relu and threshold operator
  using grad_output.size() instead and no need to do zero_()
  set grad_input volatile=True
  Add case : grad_output.volatile
  Modify the return value of sigmoid
  Modify the sigmoid in torch.nn.functional.py
  Fixed : a small bug
  Add twice differentiation of sigmoid op

# Conflicts:
#	torch/autograd/_functions/reduce.py
#	torch/autograd/variable.py
#	torch/csrc/autograd/functions/init.cpp
#	torch/nn/_functions/linear.py
#	torch/nn/_functions/thnn/activation.py
Contributor

@apaszke apaszke left a comment


As you noticed, ConvBackwardBackward is pretty much the same as ConvForward modulo some minor parameter changes. For this reason it's better to use the forward function to implement it, because if you do it this way, it will be differentiable as many times as you want. This implementation would only work for grad of grad.
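The relationship described above can be seen in a tiny pure-Python sketch (the helper names here are illustrative, not PyTorch API): for a 1-D "valid" cross-correlation, the gradient with respect to the input is a "full" cross-correlation of grad_output with the flipped kernel. Because the backward is itself just another conv, expressing it through the forward op keeps it differentiable to any order.

```python
# Minimal 1-D sketch: the backward of a convolution is itself a convolution.
# All names here are hypothetical helpers, not the PR's actual C++ code.

def corr1d_valid(x, w):
    """'Valid' cross-correlation: output length = len(x) - len(w) + 1."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k))
            for i in range(len(x) - k + 1)]

def corr1d_full(x, w):
    """'Full' cross-correlation: pad x with len(w)-1 zeros on each side."""
    k = len(w)
    padded = [0.0] * (k - 1) + list(x) + [0.0] * (k - 1)
    return corr1d_valid(padded, w)

def grad_input(grad_output, w):
    """dL/dx for y = corr1d_valid(x, w), written as a *forward* conv:
    a full cross-correlation of grad_output with the flipped kernel."""
    return corr1d_full(grad_output, w[::-1])

x = [1.0, 2.0, 3.0, 4.0]
w = [1.0, -1.0]
y = corr1d_valid(x, w)      # forward pass -> [-1.0, -1.0, -1.0]
go = [1.0, 1.0, 1.0]        # some incoming grad_output
gx = grad_input(go, w)      # backward pass -> [1.0, 0.0, 0.0, -1.0]
```

Since `grad_input` is built entirely from forward-style conv ops, differentiating it again just produces more convs, which is the point of routing double-backward through ConvForward.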

@caogang
Contributor Author

caogang commented May 15, 2017

Hi @apaszke,

OK, I will change it to use ConvForward. But before making that change, I want to confirm that the logic of this implementation is correct. Could you check its correctness?

@albanD
Collaborator

albanD commented May 15, 2017

@caogang you can use the gradcheck tool from the autograd package here to make sure that the backward is computed correctly.

@caogang
Contributor Author

caogang commented May 15, 2017

@albanD Can gradcheck also verify that the grad of grad is right? I think it only checks the grad, not the grad of grad.

@albanD
Collaborator

albanD commented May 15, 2017

calling gradcheck(torch._C._functions.ConvNd(some_args), inp) will check the grad, calling gradcheck(torch._C._functions.ConvNdBackward(some_args), inp) will check the grad of the backward function :)
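For intuition, what gradcheck does under the hood is compare the analytical gradient against a central finite difference, so running the same numeric test on the backward function itself validates the grad of grad. A scalar sketch in plain Python (not the actual gradcheck implementation):

```python
# Sketch of the gradcheck idea: analytical gradient vs. finite difference.
# torch.autograd.gradcheck automates this for tensor functions; applying it
# to the *backward* function, as suggested above, checks the grad of grad.

def numerical_grad(f, x, eps=1e-6):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

f = lambda x: x ** 3        # forward function
df = lambda x: 3 * x ** 2   # analytical gradient (the "backward")
ddf = lambda x: 6 * x       # analytical second gradient

x = 1.5
# check the grad: numerical derivative of f vs. analytical df
assert abs(numerical_grad(f, x) - df(x)) < 1e-5
# check the grad of grad: run the same test on the backward function
assert abs(numerical_grad(df, x) - ddf(x)) < 1e-5
```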

@caogang
Contributor Author

caogang commented May 15, 2017

@albanD Oh, that's great. I will check it. Thanks!

@apaszke
Contributor

apaszke commented May 15, 2017

You'll need to add a constructor for ConvNdBackward (right now it's NoCtor) if you're going to construct it from Python. You might be able to turn ConvCtor to a template that constructs either ConvForward or ConvBackward depending on the argument (I think their params are the same).

@caogang
Contributor Author

caogang commented May 16, 2017

Hi all, @albanD @apaszke
I have a problem. Assume the forward computation is ConvForward -> ConvForward -> ConvBackward -> ConvBackward. Then the backward pass is ConvBackwardBackward -> ConvBackwardBackward -> ConvBackward -> ConvBackward (ConvBackwardBackward is almost the same as ConvForward). Should I accumulate the grad of weight and bias in every backward unit, or only in the ConvBackward units?
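For what it's worth, the usual autograd convention is that a shared parameter's gradient accumulates one contribution from every backward unit that touches it. A minimal scalar sketch (hypothetical names, not the PR's code):

```python
# Sketch of gradient accumulation: if w is used by two ops in the forward
# pass, dL/dw is the sum of the contributions from both uses, so each
# backward unit adds (+=) into the shared grad rather than overwriting it.

w = 2.0
x1, x2 = 3.0, 5.0
y = w * x1 + w * x2     # forward: two ops share the same weight; L = y

w_grad = 0.0
w_grad += x1            # contribution from the first use of w
w_grad += x2            # contribution from the second use of w
assert w_grad == x1 + x2   # dL/dw = x1 + x2
```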

@caogang
Contributor Author

caogang commented May 16, 2017

Please refer to the PR #1569 for further discussion.

