Use blocks machinery to simplify bookkeeping in autodiff #5036
Merged
ezyang merged 3 commits into pytorch:master on Feb 5, 2018
Conversation
Using @ezyang's suggestion, this change uses a block rather than staging annotations to represent the reverse pass. This allows us to reuse the machinery for copying graphs/blocks to extract the reverse pass concisely. This also changes the input order of Gradient's df to: [output vjps][temporary vjps][captures]. In addition to being simpler to generate in this order, it will also allow ExecutionPlan to append the captures onto the already-existing input list of vjps that are given by the autograd, rather than having to prepend them, which should be slightly cheaper.
This changes the Gradient struct to enforce that input captures appear before output captures in the capture list, which makes it easier to use in ExecutionPlan.
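The cost argument above can be sketched in a few lines. This is a hypothetical illustration (the function name and element types are not from the PR): with captures last, df's input list is built by appending to the vjp list the autograd already produced, which avoids shifting every existing element the way a prepend would.

```cpp
#include <vector>

// Hypothetical sketch: assemble df's inputs in the order
// [output vjps][temporary vjps][captures].
// vector::insert at end() is amortized O(#captures); inserting at
// begin() would be O(#vjps + #captures) due to element shifting.
std::vector<int> assembleDfInputs(std::vector<int> vjps,
                                  const std::vector<int>& captures) {
    vjps.insert(vjps.end(), captures.begin(), captures.end());
    return vjps;
}
```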
71b4c33 to 6f27a41
Contributor
I did only a cursory look but it all seems fine.
Contributor
@ezyang thanks for letting me take a look :P
apaszke reviewed Feb 6, 2018
// note: reverse_node is intentionally not inserted to avoid
// accidentally acting on it (e.g. in eliminate dead code);
// use std::cout << *reverse_node to view its state.
auto reverse_node = graph.create("Reverse"_sym, 0);
primal_outputs.emplace_back(capture_val);
grad_desc.df_input_captures.emplace_back(Capture::Kind::Output,
                                         primal_outputs.size() - 1);
// we need to create a new temporary output for this capture because it wasn't available.
// XXX: Take care when handling outputs - they can be duplicated!
Gradient grad_desc;
WithInsertPoint guard(*grad_desc.f, grad_desc.f->block());
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request on Apr 24, 2026
* Remove addValues and use WithInsertPoint
* Use blocks to simplify differentiate
Using @ezyang's suggestion, this change uses a block rather than staging annotations to represent the reverse pass. This allows us to reuse the machinery for copying graphs/blocks to extract the reverse pass concisely. This also changes the input order of Gradient's df to: [output vjps][temporary vjps][captures]. In addition to being simpler to generate in this order, it will also allow ExecutionPlan to append the captures onto the already-existing input list of vjps that are given by the autograd, rather than having to prepend them, which should be slightly cheaper.
* Enforce that input captures are before outputs
This changes the Gradient struct to enforce that input captures appear before output captures in the capture list, which makes it easier to use in ExecutionPlan.
Using @ezyang's suggestion, this change uses a block rather than
staging annotations to represent the reverse pass. This allows us to reuse the machinery to copy graphs/blocks to extract the reverse pass concisely, eliminating ~50 lines of code.
This also changes the input order of Gradient's df to:
[output vjps][temporary vjps][captures]
In addition to being simpler to generate in this order, it also
will allow ExecutionPlan to append the captures onto the already-
existing input list of vjps that are given by the autograd,
rather than have to prepend them, which should be slightly cheaper.
This also changes the Gradient struct to enforce that input
captures appear before output captures in the capture list,
which makes it easier to use in ExecutionPlan.
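The ordering invariant described above can be sketched as a small check. This is an assumed illustration, not the PR's actual implementation: the Capture and Kind names mirror the diff, but the struct layout and the helper function are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of the Gradient capture-list invariant:
// every Input capture must appear before every Output capture.
struct Capture {
    enum class Kind { Input, Output };
    Kind kind;
    std::size_t offset;  // index into the primal inputs or outputs
};

// Returns true iff no Input capture appears after an Output capture,
// i.e. the list is [Input...][Output...].
bool inputsPrecedeOutputs(const std::vector<Capture>& captures) {
    bool seen_output = false;
    for (const auto& c : captures) {
        if (c.kind == Capture::Kind::Output) {
            seen_output = true;
        } else if (seen_output) {
            return false;  // Input found after an Output: invariant broken
        }
    }
    return true;
}
```

Keeping the list in this fixed shape is what lets ExecutionPlan consume input captures and output captures as two contiguous runs rather than filtering by kind.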