Skip to content

Retry - [JIT] Propagate profiled information to DifferentiableGraph outputs#79498

Closed
davidberard98 wants to merge 5 commits intogh/davidberard98/115/basefrom
gh/davidberard98/115/head
Closed

Retry - [JIT] Propagate profiled information to DifferentiableGraph outputs#79498
davidberard98 wants to merge 5 commits intogh/davidberard98/115/basefrom
gh/davidberard98/115/head

Conversation

@davidberard98
Copy link
Copy Markdown
Contributor

@davidberard98 davidberard98 commented Jun 14, 2022

Stack from ghstack:

Without profiled outputs, autodiff can't tell whether or not the outputs of a DifferentiableGraph should requires_grad. Autodiff would default to requires_grad=True if there was no profiled information, causing autodiff to mark tensors as requires_grad when they shouldn't have. This adds requires_grad info onto the type of the output, if it can be found in later uses of the output.

Adds a test for correct autodiff requires_grad behavior and also a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.

Changes from previous revision: previously, an output node would sometimes be profiled within a no_grad region, meaning that the profiled value would have the wrong requires_grad value (i.e. it would be profiled as requires_grad=False in the no_grad region; but at the time of output, it would have been profiled as requires_grad=True). To fix this, we keep track of which context each node is part of, and only consider profile nodes that are within the same context region. Tested on both of the internal models. (the memory leak and no_grad model).

Differential Revision: D37131946

…utputs

Without profiled outputs, autodiff can't tell whether or not the outputs of a DifferentiableGraph should requires_grad. Autodiff would default to requires_grad=True if there was no profiled information, causing autodiff to mark tensors as requires_grad when they shouldn't have. This adds requires_grad info onto the type of the output, if it can be found in later uses of the output.

Adds a test for correct autodiff requires_grad behavior and also a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.

[ghstack-poisoned]
@facebook-github-bot
Copy link
Copy Markdown
Contributor

facebook-github-bot commented Jun 14, 2022

🔗 Helpful links

✅ No Failures (0 Pending)

As of commit 3b1521a (more details on the Dr. CI page):

Expand to see more

💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

davidberard98 added a commit that referenced this pull request Jun 14, 2022
…utputs

Without profiled outputs, autodiff can't tell whether or not the outputs of a DifferentiableGraph should requires_grad. Autodiff would default to requires_grad=True if there was no profiled information, causing autodiff to mark tensors as requires_grad when they shouldn't have. This adds requires_grad info onto the type of the output, if it can be found in later uses of the output.

Adds a test for correct autodiff requires_grad behavior and also a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.

ghstack-source-id: 53436e2
Pull Request resolved: #79498
@facebook-github-bot facebook-github-bot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Jun 14, 2022
…ableGraph outputs"

Without profiled outputs, autodiff can't tell whether or not the outputs of a DifferentiableGraph should requires_grad. Autodiff would default to requires_grad=True if there was no profiled information, causing autodiff to mark tensors as requires_grad when they shouldn't have. This adds requires_grad info onto the type of the output, if it can be found in later uses of the output.

Adds a test for correct autodiff requires_grad behavior and also a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.

[ghstack-poisoned]
@davidberard98
Copy link
Copy Markdown
Contributor Author

@davidberard98 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

1 similar comment
@davidberard98
Copy link
Copy Markdown
Contributor Author

@davidberard98 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

…ableGraph outputs"

Without profiled outputs, autodiff can't tell whether or not the outputs of a DifferentiableGraph should requires_grad. Autodiff would default to requires_grad=True if there was no profiled information, causing autodiff to mark tensors as requires_grad when they shouldn't have. This adds requires_grad info onto the type of the output, if it can be found in later uses of the output.

Adds a test for correct autodiff requires_grad behavior and also a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.

Differential Revision: [D37131946](https://our.internmc.facebook.com/intern/diff/D37131946)

[ghstack-poisoned]
davidberard98 added a commit that referenced this pull request Jun 14, 2022
…utputs

Without profiled outputs, autodiff can't tell whether or not the outputs of a DifferentiableGraph should requires_grad. Autodiff would default to requires_grad=True if there was no profiled information, causing autodiff to mark tensors as requires_grad when they shouldn't have. This adds requires_grad info onto the type of the output, if it can be found in later uses of the output.

Adds a test for correct autodiff requires_grad behavior and also a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.

ghstack-source-id: 6297ca4
Pull Request resolved: #79498
@davidberard98
Copy link
Copy Markdown
Contributor Author

@davidberard98 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

}

if (!requiresGradResults.empty()) {
bool requires_grad = std::any_of(
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eellison does this seem like a safe fix to the nograd issue?

I can think of one case where the output should requires_grad but this won't do so: if all the uses of an diff. graph output that requires_grad are used in nograd sections, then all the profiled values will record requires_grad=False. But I think that's probably fine because then I don't think it matters that it requires_grad?

…ableGraph outputs"

Without profiled outputs, autodiff can't tell whether or not the outputs of a DifferentiableGraph should requires_grad. Autodiff would default to requires_grad=True if there was no profiled information, causing autodiff to mark tensors as requires_grad when they shouldn't have. This adds requires_grad info onto the type of the output, if it can be found in later uses of the output.

Adds a test for correct autodiff requires_grad behavior and also a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.

Differential Revision: [D37131946](https://our.internmc.facebook.com/intern/diff/D37131946)

[ghstack-poisoned]
davidberard98 added a commit that referenced this pull request Jun 15, 2022
…utputs

Without profiled outputs, autodiff can't tell whether or not the outputs of a DifferentiableGraph should requires_grad. Autodiff would default to requires_grad=True if there was no profiled information, causing autodiff to mark tensors as requires_grad when they shouldn't have. This adds requires_grad info onto the type of the output, if it can be found in later uses of the output.

Adds a test for correct autodiff requires_grad behavior and also a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.

ghstack-source-id: f2c6290
Pull Request resolved: #79498
@davidberard98
Copy link
Copy Markdown
Contributor Author

@davidberard98 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

…ableGraph outputs"

Without profiled outputs, autodiff can't tell whether or not the outputs of a DifferentiableGraph should requires_grad. Autodiff would default to requires_grad=True if there was no profiled information, causing autodiff to mark tensors as requires_grad when they shouldn't have. This adds requires_grad info onto the type of the output, if it can be found in later uses of the output.

Adds a test for correct autodiff requires_grad behavior and also a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.

Differential Revision: [D37131946](https://our.internmc.facebook.com/intern/diff/D37131946)

[ghstack-poisoned]
davidberard98 added a commit that referenced this pull request Jun 18, 2022
…utputs

Without profiled outputs, autodiff can't tell whether or not the outputs of a DifferentiableGraph should requires_grad. Autodiff would default to requires_grad=True if there was no profiled information, causing autodiff to mark tensors as requires_grad when they shouldn't have. This adds requires_grad info onto the type of the output, if it can be found in later uses of the output.

Adds a test for correct autodiff requires_grad behavior and also a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.

ghstack-source-id: 2d2bdf2
Pull Request resolved: #79498
@davidberard98
Copy link
Copy Markdown
Contributor Author

@davidberard98 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Copy Markdown
Contributor

@eellison eellison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!!!

@davidberard98
Copy link
Copy Markdown
Contributor Author

@pytorchbot merge

@pytorchmergebot
Copy link
Copy Markdown
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

@pytorchmergebot
Copy link
Copy Markdown
Collaborator

@davidberard98 your PR has been successfully merged.

@github-actions
Copy link
Copy Markdown
Contributor

Hey @davidberard98.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot pushed a commit that referenced this pull request Jun 22, 2022
…utputs (#79498)

Summary:
Without profiled outputs, autodiff can't tell whether or not the outputs of a DifferentiableGraph should requires_grad. Autodiff would default to requires_grad=True if there was no profiled information, causing autodiff to mark tensors as requires_grad when they shouldn't have. This adds requires_grad info onto the type of the output, if it can be found in later uses of the output.

Adds a test for correct autodiff requires_grad behavior and also a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.

Pull Request resolved: #79498

Approved by: https://github.com/eellison

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/268ced51045f59c4a36c32edcad736a754c61288

Reviewed By: eellison

Differential Revision: D37131946

Pulled By: davidberard98

fbshipit-source-id: 7dcfa99f65048d59eaf0628f89ecabe123f16a77
@facebook-github-bot facebook-github-bot deleted the gh/davidberard98/115/head branch June 25, 2022 14:16
miladm pushed a commit to miladm/pytorch that referenced this pull request Jun 27, 2022
…utputs

Without profiled outputs, autodiff can't tell whether or not the outputs of a DifferentiableGraph should requires_grad. Autodiff would default to requires_grad=True if there was no profiled information, causing autodiff to mark tensors as requires_grad when they shouldn't have. This adds requires_grad info onto the type of the output, if it can be found in later uses of the output.

Adds a test for correct autodiff requires_grad behavior and also a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.

Pull Request resolved: pytorch#79498

Approved by: https://github.com/eellison
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 25, 2026
…utputs

Without profiled outputs, autodiff can't tell whether or not the outputs of a DifferentiableGraph should requires_grad. Autodiff would default to requires_grad=True if there was no profiled information, causing autodiff to mark tensors as requires_grad when they shouldn't have. This adds requires_grad info onto the type of the output, if it can be found in later uses of the output.

Adds a test for correct autodiff requires_grad behavior and also a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.

Pull Request resolved: pytorch#79498

Approved by: https://github.com/eellison
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla signed Merged oncall: jit Add this issue/PR to JIT oncall triage queue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants