
fix grad thrashing of shape analysis#40939

Closed
eellison wants to merge 2 commits into gh/eellison/85/base from gh/eellison/85/head

Conversation

Contributor

@eellison eellison commented Jul 2, 2020

Stack from ghstack:

Previously, when we ran shape analysis by executing the op with representative inputs, we would always set the grad property to false. This led to incorrect static analysis when we created differentiable subgraphs, propagated shapes without also propagating requires_grad, and then uninlined them.

Differential Revision: D22394676
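
The failure mode is easy to reproduce in eager PyTorch. This is a minimal sketch of the general problem (not the JIT pass itself): a representative input created only for shape propagation carries requires_grad=False, so any property read off the probe's output disagrees with real execution.

```python
import torch

# Real input to the program: it participates in autograd.
x = torch.tensor([0.39], requires_grad=True)

# A representative "probe" input built only to learn shapes; by default
# it does not carry the requires_grad flag of the real input.
probe = torch.zeros_like(x)

assert (x - 1).requires_grad is True       # real execution needs grad
assert (probe - 1).requires_grad is False  # inference from the probe disagrees
```

Hence the fix: shape analysis should take shapes from the probe run but leave the previously inferred requires_grad untouched.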

@eellison eellison requested a review from apaszke as a code owner July 2, 2020 20:25
eellison pushed a commit that referenced this pull request Jul 2, 2020
ghstack-source-id: 279e469
Pull Request resolved: #40939
@facebook-github-bot facebook-github-bot added the "oncall: jit" (Add this issue/PR to JIT oncall triage queue) label Jul 2, 2020

dr-ci Bot commented Jul 2, 2020

💊 CI failures summary and remediations

As of commit 9e2e6d2 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



@eellison eellison requested a review from Krovatkin July 2, 2020 20:29
// its most constrained form.
if (stack[i].isTensor())
  node->outputs()[i]->inferTypeFrom(stack[i].toTensor());
auto tensor_type = node->outputs()[i]->type()->cast<TensorType>();
Contributor


If we get rid of inferTypeFrom we might lose some useful information about tensor properties, which is probably the reason we tried running the operation in the first place. Shouldn't we do inferTypeFrom()->withRequiresGrad()?

Contributor Author


Yes, that’s what we’re doing with TensorType::create(stack[i].toTensor())

Contributor Author


inferTypeFrom calls TensorType::create

Contributor Author


I don’t think we can call withRequiresGrad because that would discard grad analysis that has already set it to false. I think we just want to preserve whatever our previous grad inference was.
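
The behavior Elias describes (take shapes from the concrete run but preserve the earlier grad inference) can be sketched in plain Python; the names below are hypothetical stand-ins, not the actual TensorType API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class TensorTypeInfo:
    """Hypothetical stand-in for the JIT's TensorType."""
    shape: Tuple[int, ...]
    requires_grad: Optional[bool]  # None means "not yet analyzed"

def refresh_from_probe(prev: TensorTypeInfo,
                       probe_shape: Tuple[int, ...]) -> TensorTypeInfo:
    # Take the shape from the concrete probe output, but keep whatever
    # grad inference we had before: the probe's own flag is meaningless.
    return TensorTypeInfo(shape=probe_shape, requires_grad=prev.requires_grad)

before = TensorTypeInfo(shape=(), requires_grad=None)
after = refresh_from_probe(before, probe_shape=(1,))
assert after.requires_grad is None  # earlier analysis is not clobbered
assert after.shape == (1,)          # shape information is still refined
```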

Comment thread: test/test_jit.py
torch._C._jit_pass_complete_shape_analysis(foo.graph, (torch.tensor([0.39]),), False)

# requires_grad property shouldn't be accidentally set by shape analysis
self.assertTrue(foo.graph.findNode("aten::sub").output().requiresGrad() is None)
Contributor


couldn't we actually infer in this particular case that aten::sub doesn't need requires_grad?

Contributor Author


No, we haven’t done grad analysis yet, so we don’t know whether it requires grad or not.

Contributor


oh got it! thanks!

Contributor

@Krovatkin Krovatkin left a comment


:shipit:

eellison pushed a commit that referenced this pull request Jul 6, 2020
ghstack-source-id: dd8742b
Pull Request resolved: #40939
@facebook-github-bot
Contributor

@eellison merged this pull request in 37a572f.

csarofeen pushed a commit to csarofeen/pytorch that referenced this pull request Jul 7, 2020
Summary:
Pull Request resolved: pytorch#40939

Previously, when we would do shape analysis by running the op with representative inputs, we would always set the grad property to false. This led to a wrong static analysis when we would create differentiable subgraphs, and propagate shapes without also propagating requires_grad, and then uninline them.

Test Plan: Imported from OSS

Differential Revision: D22394676

Pulled By: eellison

fbshipit-source-id: 254e6e9f964b40d160befe0e125abe1b7aa2bd5e
malfet pushed a commit that referenced this pull request Jul 7, 2020
* [JIT] fix unfold shape analysis (#40749)

Summary:
unfold on a 0-dimensioned tensor returns a 1-dim tensor
Pull Request resolved: #40749

Differential Revision: D22361481

Pulled By: eellison

fbshipit-source-id: 621597e5f97f6e39953eb86f8b85bb4142527a9f

* shape analysis fix for default dtype

ghstack-source-id: 723aa27
Pull Request resolved: #40938

* fix grad thrashing of shape analysis

ghstack-source-id: dd8742b
Pull Request resolved: #40939

Co-authored-by: Elias Ellison <eellison@fb.com>
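
The unfold behavior the first commit above corrects for can be checked in eager mode (a small sketch of the runtime semantics, not the shape-analysis code):

```python
import torch

t = torch.tensor(1.0)   # 0-dimensional tensor
u = t.unfold(0, 1, 1)   # unfold treats the scalar as a size-1 dimension

assert t.dim() == 0
assert u.dim() == 1     # result is 1-dimensional, as the fix encodes
```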
@facebook-github-bot facebook-github-bot deleted the gh/eellison/85/head branch July 10, 2020 14:18
csarofeen added a commit to csarofeen/pytorch that referenced this pull request Aug 16, 2020
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026

Labels

Merged · oncall: jit (Add this issue/PR to JIT oncall triage queue)


4 participants