Forward AD formulas batch 1 #57768

Closed
albanD wants to merge 13 commits into gh/albanD/90/base from gh/albanD/90/head

Collaborator

albanD commented May 6, 2021

Note that this PR implements formulas only for ops that are supported by OpInfo.
Slow gradcheck also passes for this PR; see #57976.

Stack from ghstack:

Differential Revision: D28387766

[ghstack-poisoned]
Contributor

facebook-github-bot commented May 6, 2021

💊 CI failures summary and remediations

As of commit 10a4089 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_macos_10_13_py3_build (1/1)

Step: "Spin up environment"

Waiting for a VM assignment: ...
Build-agent version 1.0.63541-3d9f91b1 (2021-05-21T07:54:49+0000)
Creating a dedicated VM with xcode:12.0 image
Waiting for a VM assignment: ...

We timed out preparing a VM for this build, potentially due to our infrastructure or cloud provider.  Please retry the build in a few minutes

Unexpected capacity error: error caused by capacity


albanD added a commit that referenced this pull request May 6, 2021
ghstack-source-id: fabd1e8
Pull Request resolved: #57768
albanD requested a review from zou3519 May 6, 2021 20:48
albanD added a commit that referenced this pull request May 7, 2021
ghstack-source-id: 720308c
Pull Request resolved: #57768
dgl-intel pushed a commit to dgl-intel/pytorch that referenced this pull request May 7, 2021
ghstack-source-id: f9da430
Pull Request resolved: pytorch#57768
albanD added 2 commits May 7, 2021 18:39
albanD added a commit to albanD/pytorch that referenced this pull request May 10, 2021
ghstack-source-id: 8b9396e
Pull Request resolved: pytorch#57768
albanD added a commit that referenced this pull request May 10, 2021
ghstack-source-id: 8b9396e
Pull Request resolved: #57768
albanD added a commit to albanD/pytorch that referenced this pull request May 11, 2021
ghstack-source-id: e313cea
Pull Request resolved: pytorch#57768
albanD added 2 commits May 12, 2021 10:09
Collaborator Author

albanD commented May 12, 2021

@albanD has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

self: handle_r_to_c(self.scalar_type(), grad)
tensor1: handle_r_to_c(tensor1.scalar_type(), grad * (value / tensor2).conj())
tensor2: handle_r_to_c(tensor2.scalar_type(), -grad * (value * tensor1 / (tensor2 * tensor2)).conj())
result: self_t + maybe_multiply(tensor1_t / tensor2_p, value) - maybe_multiply(tensor2_t * (tensor1_p / tensor2_p) / tensor2_p, value)
Contributor

(no action required) Could you actually "auto-elementwise" this and other pointwise operations? It looks like the formula (for the real case at least) is just the sum of the backward formulas for self, tensor1, and tensor2, with each grad replaced by the corresponding tangent.

Collaborator Author

Yes, we could do it here. Will add it if it shows up again.
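The observation in this thread can be checked numerically. The following is a minimal sketch for the real-valued scalar case only, assuming the snippet above describes out = self + value * tensor1 / tensor2. The helper names (`backward_formulas`, `forward_formula`, `forward_from_backward`, `finite_difference`) are hypothetical and are not PyTorch APIs; `_p` and `_t` follow the snippet's primal and tangent suffixes.

```python
# Sketch: real-valued scalar check that the forward formula equals the sum of
# the backward formulas with each grad replaced by that input's tangent.
# All helper names here are hypothetical, not PyTorch functions.

def addcdiv(self_p, tensor1_p, tensor2_p, value):
    return self_p + value * tensor1_p / tensor2_p

def backward_formulas(grad, tensor1_p, tensor2_p, value):
    # Real-valued versions of the three backward entries (conj is a no-op).
    grad_self = grad
    grad_tensor1 = grad * (value / tensor2_p)
    grad_tensor2 = -grad * (value * tensor1_p / (tensor2_p * tensor2_p))
    return grad_self, grad_tensor1, grad_tensor2

def forward_formula(self_t, tensor1_t, tensor2_t, tensor1_p, tensor2_p, value):
    # Transcription of the `result:` line for scalars.
    return (self_t
            + value * (tensor1_t / tensor2_p)
            - value * (tensor2_t * (tensor1_p / tensor2_p) / tensor2_p))

def forward_from_backward(self_t, tensor1_t, tensor2_t, tensor1_p, tensor2_p, value):
    # Evaluate each backward formula with grad replaced by the matching tangent.
    from_self, _, _ = backward_formulas(self_t, tensor1_p, tensor2_p, value)
    _, from_tensor1, _ = backward_formulas(tensor1_t, tensor1_p, tensor2_p, value)
    _, _, from_tensor2 = backward_formulas(tensor2_t, tensor1_p, tensor2_p, value)
    return from_self + from_tensor1 + from_tensor2

def finite_difference(self_p, tensor1_p, tensor2_p, self_t, tensor1_t, tensor2_t,
                      value, h=1e-6):
    # Central difference of addcdiv along the tangent direction.
    plus = addcdiv(self_p + h * self_t, tensor1_p + h * tensor1_t,
                   tensor2_p + h * tensor2_t, value)
    minus = addcdiv(self_p - h * self_t, tensor1_p - h * tensor1_t,
                    tensor2_p - h * tensor2_t, value)
    return (plus - minus) / (2 * h)

primals = (0.5, 2.0, 4.0)    # self_p, tensor1_p, tensor2_p
tangents = (1.0, -2.0, 0.5)  # self_t, tensor1_t, tensor2_t
value = 3.0

explicit = forward_formula(*tangents, primals[1], primals[2], value)
summed = forward_from_backward(*tangents, primals[1], primals[2], value)
numeric = finite_difference(*primals, *tangents, value)
print(explicit, summed, numeric)
```

The sketch says nothing about the complex case, where the `conj()` calls in the backward formulas have to be accounted for.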

self: maybe_multiply(grad, beta.conj())
mat1: mm_mat1_backward(grad, mat2, mat1.sizes(), mat1.strides(), alpha)
mat2: mm_mat2_backward(grad, mat1, mat2.sizes(), mat2.strides(), alpha)
result: maybe_multiply(self_t, beta) + maybe_multiply(mat1_t.mm(mat2_p), alpha) + maybe_multiply(mat1_p.mm(mat2_t), alpha)
Contributor

It's interesting to note that this is just maybe_multiply(self_t, beta) plus maybe_multiply(formula_for_mm, alpha). Is there any chance we would want to dedup code between this and the mm formula in the future?

Collaborator Author

The main problem with such formulas is that they are not element-wise, so adding the formulas won't work.
They are also affine (not linear), so we would need to pass extra arguments to a smarter auto_affine to handle this, which feels like a dangerous step to take.

Contributor

Yeah, I agree the design for this would be tricky.
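To make the structure discussed in this thread concrete, here is a small pure-Python sketch (nested lists instead of tensors) of the addmm tangent written as beta times the tangent of self plus alpha times the mm tangent. It assumes addmm computes beta * self + alpha * (mat1 @ mat2); the helpers `matmul`, `add`, `scale`, `mm_tangent`, and `addmm_tangent` are hypothetical names for illustration, not PyTorch functions.

```python
# Sketch: addmm forward-mode tangent built on top of the mm tangent.
# Matrices are plain nested lists; helper names are hypothetical.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def scale(a, s):
    return [[s * x for x in row] for row in a]

def mm_tangent(mat1_p, mat1_t, mat2_p, mat2_t):
    # Product rule for matrix multiplication; each output entry mixes a whole
    # row and a whole column, so this is not an element-wise formula.
    return add(matmul(mat1_t, mat2_p), matmul(mat1_p, mat2_t))

def addmm(self_p, mat1_p, mat2_p, beta, alpha):
    return add(scale(self_p, beta), scale(matmul(mat1_p, mat2_p), alpha))

def addmm_tangent(self_t, mat1_p, mat1_t, mat2_p, mat2_t, beta, alpha):
    return add(scale(self_t, beta),
               scale(mm_tangent(mat1_p, mat1_t, mat2_p, mat2_t), alpha))

self_p = [[1.0, 2.0], [3.0, 4.0]]
mat1_p = [[1.0, 0.0], [2.0, 1.0]]
mat2_p = [[0.0, 1.0], [1.0, 1.0]]
self_t = [[1.0, 0.0], [0.0, 1.0]]
mat1_t = [[0.0, 1.0], [0.0, 0.0]]
mat2_t = [[1.0, 0.0], [0.0, 0.0]]
beta, alpha = 2.0, 3.0

tangent = addmm_tangent(self_t, mat1_p, mat1_t, mat2_p, mat2_t, beta, alpha)

# Central finite difference along the tangent direction as a cross-check.
h = 1e-4
plus = addmm(add(self_p, scale(self_t, h)), add(mat1_p, scale(mat1_t, h)),
             add(mat2_p, scale(mat2_t, h)), beta, alpha)
minus = addmm(add(self_p, scale(self_t, -h)), add(mat1_p, scale(mat1_t, -h)),
              add(mat2_p, scale(mat2_t, -h)), beta, alpha)
numeric = scale(add(plus, scale(minus, -1.0)), 1.0 / (2 * h))
print(tangent)
print(numeric)
```

Sharing `mm_tangent` by hand like this is straightforward. The thread's concern is about generating such formulas automatically, which this sketch does not attempt.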

Contributor

zou3519 left a comment

The formulas lgtm from a real-number perspective, but I am not sure how to derive them for complex numbers.

Contributor

zou3519 left a comment

Offline, Alban walked me through the derivation of the complex forward-mode AD derivatives for torch.sin and torch.conj, and that helped me understand enough to derive the complex formulas as well.

NB: we should update the one example above that wasn't updated for this PR, but other than that things lgtm
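The offline derivation is not recorded in this thread. As a rough illustration of the kind of check involved, the sketch below uses Python's `cmath` on scalars and assumes the forward-mode tangent is the directional derivative along the input tangent with a real step. The names `sin_tangent`, `conj_tangent`, `directional_derivative`, and `tangent_via_backward` are hypothetical. The last helper shows one way a backward formula containing `conj()` can be turned into a tangent formula (conjugate the tangent going in and the result coming out); it is an assumption for illustration, not a statement about how PyTorch generates its formulas.

```python
import cmath

# Sketch: complex forward-mode tangents for sin and conj on Python scalars.
# All helper names are hypothetical and only illustrate the math.

def sin_tangent(z_p, z_t):
    # sin is holomorphic, so the tangent is the complex derivative times z_t.
    return cmath.cos(z_p) * z_t

def conj_tangent(z_p, z_t):
    # conj is not holomorphic; its directional derivative along z_t is conj(z_t).
    return z_t.conjugate()

def directional_derivative(f, z, t, h=1e-6):
    # Central difference along direction t with a real step h.
    return (f(z + h * t) - f(z - h * t)) / (2 * h)

def sin_backward(grad, z_p):
    return grad * cmath.cos(z_p).conjugate()

def conj_backward(grad, z_p):
    return grad.conjugate()

def tangent_via_backward(backward, z_p, z_t):
    # Feed the conjugated tangent in place of grad, then conjugate the result.
    return backward(z_t.conjugate(), z_p).conjugate()

z = 0.3 + 0.7j
t = 1.0 - 2.0j

print(sin_tangent(z, t), directional_derivative(cmath.sin, z, t))
print(conj_tangent(z, t), directional_derivative(lambda w: w.conjugate(), z, t))
print(tangent_via_backward(sin_backward, z, t),
      tangent_via_backward(conj_backward, z, t))
```

For these two functions the conjugate-in, conjugate-out substitution reproduces the direct tangents, which fits the earlier remark that plain substitution of grads by tangents was only observed for the real case.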

Collaborator Author

albanD commented May 24, 2021

@albanD has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@albanD merged this pull request in 09a1b1c.

facebook-github-bot deleted the gh/albanD/90/head branch May 29, 2021 14:17
deniskokarev pushed a commit to deniskokarev/pytorch that referenced this pull request Jun 9, 2021
Summary:
Pull Request resolved: pytorch#57768

Note that this PR implements formulas only for ops that are supported by OpInfo.

Test Plan: Imported from OSS

Reviewed By: zou3519, malfet

Differential Revision: D28387766

Pulled By: albanD

fbshipit-source-id: b4ba1cf1ac1dfd46cdd889385c9c2d5df3cf7a71