Add torch.empty_permuted#95069

Closed
ezyang wants to merge 6 commits into gh/ezyang/1827/base from gh/ezyang/1827/head

Conversation

Contributor

@ezyang commented Feb 17, 2023

Stack from ghstack (oldest at bottom):

`torch.empty_permuted` is a generalized version of `torch.empty(memory_format=...)` where you can pass an arbitrary physical layout as a tuple of dims, allowing you to set up dense, non-overlapping tensors with a non-standard memory format. Check the docblock for a full description of the semantics.

The initial motivation for this PR is guard-free handling of unbacked SymInts. Traditionally, the way we allocate dense tensors with an arbitrary layout is with `empty_strided`. However, `empty_strided` does not know that the given strides are actually contiguous, and must test this manually to find out whether that is the case. With `empty_permuted`, this is known statically to be the case, which helps us skip some 0/1 guards.

However, I also think `torch.empty_permuted` is a useful API in its own right. It is technically possible to simulate this with an `empty` followed by a `permute`; however, there are some downsides:

  • The manual incantation is tricky to work out. To allocate an NHWC tensor, the invocation is `torch.empty(N, H, W, C).permute(0, 3, 1, 2)`; the permute call has to take NHWC to NCHW, and is the *inverse* of the permutation people typically have in mind when they talk about NHWC (0, 2, 3, 1). Instead, `torch.empty_permuted` lets you say `torch.empty_permuted((N, C, H, W), (0, 2, 3, 1))`, providing the intuitive permutation directly. It can literally be read off as NHWC if you assign N=0, C=1, H=2, W=3. (See the sketch after this list.)
  • An `empty(requires_grad=True).permute()` is no longer a leaf tensor. You can force it to be a leaf with a `detach()`, but it is more straightforward and less error-prone to directly allocate a tensor with the correct permutation.
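To make the stride behavior concrete, here is a minimal sketch of the NHWC case from the first bullet (assuming a PyTorch build that includes this change; the shape values are arbitrary):

```python
import torch

N, C, H, W = 2, 3, 4, 5

# Logical shape (N, C, H, W), physical NHWC layout given as the
# intuitive permutation (0, 2, 3, 1).
a = torch.empty_permuted((N, C, H, W), (0, 2, 3, 1))

# The equivalent, but trickier, empty + permute spelling.
b = torch.empty(N, H, W, C).permute(0, 3, 1, 2)

assert a.shape == b.shape        # both report logical shape (N, C, H, W)
assert a.stride() == b.stride()  # both are physically NHWC
assert a.is_contiguous(memory_format=torch.channels_last)
```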

It is also technically possible to simulate this with `empty_strided`. However, this requires the user to manually compute the contiguous output strides, and it is bad from a guard-reduction perspective. For what it's worth, this is one of the more common uses of `as_strided` in the wild, and it would be nice to get rid of it.
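For comparison, the stride arithmetic that `empty_strided` forces on the user looks something like this (a sketch; `nhwc_strides` is a hypothetical helper written for illustration, not part of the PR):

```python
import torch

def nhwc_strides(N, C, H, W):
    # Hand-computed contiguous strides for physical NHWC order,
    # reported in logical NCHW order: this is the bookkeeping
    # that empty_permuted does for you.
    return (H * W * C, 1, W * C, C)

N, C, H, W = 2, 3, 4, 5
s = torch.empty_strided((N, C, H, W), nhwc_strides(N, C, H, W))
p = torch.empty_permuted((N, C, H, W), (0, 2, 3, 1))
assert s.stride() == p.stride()
```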

A nice enhancement of this feature would be to accept physical_layout anywhere memory_format is accepted. However, this would be a pretty involved change, so I'm doing the easy thing instead.

cc @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

[ghstack-poisoned]

pytorch-bot bot commented Feb 17, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/95069

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Failure, 1 Pending

As of commit eaa50db:

BROKEN TRUNK - The following jobs failed but were present on the merge base c16b291:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ezyang added a commit that referenced this pull request Feb 17, 2023
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: 3fd9577
Pull Request resolved: #95069
@ezyang added the keep-going, release notes: python_frontend, and topic: new features labels (Feb 17, 2023)
@ezyang requested a review from dagitses (February 17, 2023 18:02)
@ezyang removed the keep-going label (Feb 17, 2023)
ezyang added a commit that referenced this pull request Feb 17, 2023
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: 62631ba
Pull Request resolved: #95069
Collaborator

@albanD left a comment


SGTM!

return result;
}

Tensor empty_permuted_symint(SymIntArrayRef size, IntArrayRef physical_layout, c10::optional<ScalarType> dtype_opt,
Collaborator


do you want any error checking that `physical_layout` is a permutation of `range(len(physical_layout))`?

Contributor Author


Hmm, that's a good idea...
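For reference, the validation being discussed amounts to a check like the following (a minimal Python sketch only; the actual check would live in the C++ implementation and may differ):

```python
def check_physical_layout(physical_layout, ndim):
    # physical_layout must be a permutation of range(ndim): it has
    # length ndim and mentions every dim index exactly once.
    if len(physical_layout) != ndim:
        raise ValueError(f"expected {ndim} dims, got {len(physical_layout)}")
    seen = [False] * ndim
    for d in physical_layout:
        if not (0 <= d < ndim) or seen[d]:
            raise ValueError(
                f"physical_layout must be a permutation of range({ndim}), "
                f"got {tuple(physical_layout)}"
            )
        seen[d] = True
```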

ezyang added a commit that referenced this pull request Feb 17, 2023
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: 4e1401d
Pull Request resolved: #95069
ezyang added a commit that referenced this pull request Feb 19, 2023
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: 7ada284
Pull Request resolved: #95069
Contributor Author

ezyang commented Feb 19, 2023

@pytorchbot merge

@pytorch-bot added the ciflow/trunk label (Feb 19, 2023)
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@jeanschmidt
Contributor

jeanschmidt commented Feb 21, 2023

@pytorchbot revert -m "Breaking internal builds. More in https://fburl.com/phabricator/ztrxrroq" -c ghfirst

@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Collaborator

@ezyang your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request Feb 21, 2023
This reverts commit bedeb1f.

Reverted #95069 on behalf of https://github.com/jeanschmidt due to Breaking internal builds. More in https://fburl.com/phabricator/ztrxrroq
ezyang added a commit that referenced this pull request Feb 21, 2023
This reverts commit 92e03cd.

[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this pull request Feb 21, 2023
cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Mar 5, 2023
pruthvistony added a commit to ROCm/pytorch that referenced this pull request May 2, 2023
@facebook-github-bot deleted the gh/ezyang/1827/head branch (June 8, 2023 16:50)
jhavukainen pushed a commit to kulinseth/pytorch that referenced this pull request Mar 15, 2024
Pull Request resolved: pytorch#95069
Approved by: https://github.com/malfet, https://github.com/ngimel, https://github.com/albanD, https://github.com/dagitses

Labels

ciflow/inductor, ciflow/trunk, Merged, module: inductor, release notes: python_frontend, Reverted, topic: new features


7 participants