
compiled autograd: default fakeify backward inputs with static shapes instead of duck sizing#133581

Closed
bdhirsh wants to merge 1 commit into gh/bdhirsh/607/base from gh/bdhirsh/607/head

Conversation

@bdhirsh
Collaborator

@bdhirsh bdhirsh commented Aug 15, 2024

@pytorch-bot

pytorch-bot bot commented Aug 15, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133581

Note: Links to docs will display an error until the docs builds have been completed.

❌ 40 New Failures, 1 Cancelled Job, 1 Unrelated Failure

As of commit 64d034e with merge base 454713f (image):

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@xmfan
Member

xmfan commented Aug 15, 2024

I still don't fully understand the change; changing the fakification logic shouldn't do anything:

  • Compiled autograd's logic for marking things as dynamic all lives here: https://github.com/pytorch/pytorch/blob/main/torch/csrc/dynamo/python_compiled_autograd.cpp#L210. We don't use these fake tensors to determine dynamism; they only bind to proxies as we trace ops into the graph.
  • After tracing, we call torch.compile using the graph module and the real inputs. The fake tensors created by compiled autograd are just thrown away. Dynamo logic then dictates what should be marked dynamic.
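To make the distinction in the PR title concrete, here is a pure-Python sketch (not the PyTorch implementation) contrasting "duck sizing", where dimensions with equal concrete sizes are assigned the same symbol, with the static default this PR proposes, where sizes simply stay concrete. The function names and symbol format are illustrative assumptions.

```python
def duck_size(shape, table=None):
    """Duck sizing sketch: dims with equal concrete sizes share one symbol."""
    table = {} if table is None else table
    out = []
    for dim in shape:
        if dim not in table:
            table[dim] = f"s{len(table)}"  # fresh symbol per distinct size
        out.append(table[dim])
    return out, table

def static_size(shape):
    """Static fakeification sketch: every size stays concrete."""
    return list(shape)

syms, _ = duck_size((4, 4, 8))
print(syms)                    # ['s0', 's0', 's1'] -- equal dims share a symbol
print(static_size((4, 4, 8)))  # [4, 4, 8] -- no symbols introduced
```

Under duck sizing, the two size-4 dims above would be forced to stay equal in the traced graph; with static shapes, no such guard is created at trace time and dynamism is decided later (e.g. by Dynamo).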

@ezyang
Contributor

ezyang commented Aug 18, 2024

@bdhirsh I have actually completely forgotten what exactly we talked about in our 1:1, so you had better write it down here :P

@xmfan
Member

xmfan commented Aug 21, 2024

The compiled autograd failures are both due to the GmWrapper inputs not being unflattened properly in the previous PR. Tests passed for me if you do something like `out = PropagateUnbackedSymInts(mod_).run(*mod.unflatten_fn(args))`

If the traced graph is always the same between tracing with static FakeTensors vs. dynamic ones (is this true?), then this approach should be okay.
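The flatten/unflatten pattern behind the fix above can be sketched as follows. This is a hedged illustration, not PyTorch's GmWrapper: the idea is that the wrapper flattens nested inputs into one flat argument list, and an `unflatten_fn` (the name is taken from the comment; the spec format here is an assumption) re-nests them before re-running the module.

```python
def make_unflatten_fn(spec):
    """Build an unflatten_fn from a list of group lengths.

    E.g. spec [2, 3] turns [a, b, c, d, e] into [[a, b], [c, d, e]],
    mimicking how a GmWrapper-style unflatten restores nested inputs.
    """
    def unflatten_fn(flat):
        out, i = [], 0
        for n in spec:
            out.append(list(flat[i:i + n]))
            i += n
        return out
    return unflatten_fn

unflatten = make_unflatten_fn([2, 3])
print(unflatten([1, 2, 3, 4, 5]))  # [[1, 2], [3, 4, 5]]
```

Forgetting this re-nesting step (i.e. running the module on the flat list directly) is exactly the kind of mismatch the comment describes.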

@albanD albanD removed their request for review August 21, 2024 21:38
@github-actions
Contributor

github-actions bot commented Nov 5, 2024

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Nov 5, 2024
@github-actions github-actions bot closed this Dec 5, 2024
@github-actions github-actions bot deleted the gh/bdhirsh/607/head branch January 5, 2025 02:09