
Only truncate leading 1s if the value is too big. #94521

Closed
ezyang wants to merge 3 commits into gh/ezyang/1805/base from gh/ezyang/1805/head

Conversation

@ezyang
Contributor

@ezyang ezyang commented Feb 9, 2023

Stack from ghstack (oldest at bottom):

If it's just right, broadcasting will do the right thing
automatically.

This helps with unbacked SymInts as I can avoid testing one
equality on the inside.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
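The rule in the description can be sketched in plain Python with shape tuples only (a hypothetical standalone helper for illustration, not the actual ATen implementation):

```python
def strip_extra_leading_ones(value_shape, slice_shape):
    # Strip leading 1s from value_shape only while it has more
    # dims than slice_shape, i.e. only when the value is "too
    # big". Once the ranks match, broadcasting handles any
    # remaining size-1 dims automatically, so no == 1 test is
    # needed on the inside.
    vs = list(value_shape)
    while len(vs) > len(slice_shape):
        if vs[0] != 1:
            raise ValueError(
                f"value of shape {tuple(value_shape)} is too big "
                f"for slice of shape {tuple(slice_shape)}")
        vs.pop(0)
    return tuple(vs)

# Ranks already match: nothing is stripped, no equality tested.
assert strip_extra_leading_ones((1, 4), (3, 4)) == (1, 4)
# Value has extra leading dims: 1s are peeled until ranks match.
assert strip_extra_leading_ones((1, 1, 3, 4), (3, 4)) == (3, 4)
```

The point of deferring the `== 1` test this way is that it only ever fires when the ranks genuinely differ, which matters for symbolic sizes.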

@pytorch-bot

pytorch-bot bot commented Feb 9, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/94521

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 Failures, 1 Pending

As of commit a7eee5a:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ezyang added a commit that referenced this pull request Feb 9, 2023
If it's just right, broadcasting will do the right thing
automatically.

This helps with unbacked SymInts as I can avoid testing one
equality on the inside.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: dbc8b03
Pull Request resolved: #94521
ezyang added a commit that referenced this pull request Feb 9, 2023
…SymInts"


This adds support for unbacked SymInts from nonzero, and then gets this working end-to-end with advanced indexing and boolean masking. I needed to apply a variety of fixes to get this to work; these fixes are emblematic of the kind of enablement work we will need to do for unbacked SymInts as they are used more heavily.

* When we allocate a contiguous tensor, hard code relevant fields (e.g., `is_contiguous_`) true, instead of keeping the complicated guardless implementation. As Horace notes, all guardless implementations really do is push off the evaluation problem until later. We do poke contiguity reasonably often, but we also typically allocate these with torch.empty so we know statically they are contiguous.
* Allow boolean tensors into the index.Tensor meta, and redirect fake tensor to call the meta function directly (so that when it calls nonzero, it decomposes to the fake tensor nonzero)
* Don't rewrap tensors in TensorMeta; it does nothing but hits `empty_strided` which is generally not unbacked SymInt friendly
* Refactor `compute_elementwise_output_strides` to return the output permutation. I then use this to rewrite the `empty_like` reference to avoid calling `empty_strided`; instead, it allocates a contiguous tensor and permutes it.
* Refactor `_broadcast_shapes` to test for more common cases (shapes are equal) before broadcasting case, which is enough to get the broadcasting in advanced indexing going without adding an unnecessary == 1 guard.
* Make nonzero return an unbacked symint in size, and put enough assumptions on it so that the boolean masking end to end works
* Don't remove leading 1s in setitem unless the value tensor is too big for the input tensor. Submitted separately at #94521
* Don't record `memory_format` on TensorMeta if there are unbacked SymInts. Actually, we can probably do a bit better here; sometimes a tensor will have unbacked SymInt but is obviously contiguous and we ought to report that. Leaving that for future work.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

[ghstack-poisoned]
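The `_broadcast_shapes` refactor listed above — test the common all-shapes-equal case before falling back to general broadcasting — can be sketched as follows (a minimal illustration with a hypothetical helper name; the real reference also handles scalars and symbolic sizes):

```python
def broadcast_shapes_fastpath(*shapes):
    # Fast path: if all shapes are identical, the result is that
    # shape and no per-dim == 1 guard is ever evaluated. This is
    # what lets advanced indexing broadcast unbacked SymInts
    # without adding an unnecessary equality guard.
    if all(s == shapes[0] for s in shapes):
        return tuple(shapes[0])
    # General case: right-align the shapes and apply the usual
    # broadcasting rules, which do inspect size-1 dims.
    ndim = max(len(s) for s in shapes)
    out = [1] * ndim
    for s in shapes:
        for i, d in enumerate(reversed(s)):
            j = ndim - 1 - i
            if d != 1:
                if out[j] not in (1, d):
                    raise ValueError("shapes are not broadcastable")
                out[j] = d
    return tuple(out)
```

Only the slow path inspects individual dims, so equal symbolic shapes never force a guard.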
ezyang added a commit that referenced this pull request Feb 9, 2023
ezyang added a commit that referenced this pull request Feb 10, 2023
ezyang added a commit that referenced this pull request Feb 10, 2023
ezyang added a commit that referenced this pull request Feb 10, 2023
ezyang added a commit that referenced this pull request Feb 10, 2023
ezyang added a commit that referenced this pull request Feb 10, 2023
ghstack-source-id: c4aaa3e
Pull Request resolved: #94521
@ezyang ezyang added release notes: composability release notes category topic: not user facing topic category labels Feb 10, 2023
@albanD albanD removed their request for review February 10, 2023 18:11
ezyang added a commit that referenced this pull request Feb 10, 2023
ezyang added a commit that referenced this pull request Feb 10, 2023
@ezyang ezyang added the ciflow/trunk Trigger trunk jobs on your pull request label Feb 10, 2023
ezyang added a commit that referenced this pull request Feb 11, 2023
ezyang added a commit that referenced this pull request Feb 11, 2023
@ezyang
Contributor Author

ezyang commented Feb 19, 2023

@pytorchbot revert -c nosignal -m "fails internal tests"

@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Collaborator

@ezyang your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request Feb 19, 2023
This reverts commit 03f4a63.

Reverted #94521 on behalf of https://github.com/ezyang due to fails internal tests
@ezyang ezyang reopened this Feb 19, 2023
ezyang added a commit that referenced this pull request Feb 19, 2023
ghstack-source-id: 905f645
Pull Request resolved: #94521
@ezyang ezyang closed this Feb 19, 2023
ezyang added a commit that referenced this pull request Feb 19, 2023
If it's just right, broadcasting will do the right thing automatically.

This helps with unbacked SymInts as I can avoid testing one equality on the inside.

In the previous attempt at #94521, I got the logic a bit wrong. I need to compute the difference between the data to be set and the post-slice space for the values, but I incorrectly compared against the *pre-slice* space in the original PR. Another wrong version of this PR compared against `variableIndices.size()`; but remember that in advanced indexing with tensors/lists, each individual index specifies what coordinates to read out of its dimension, so to get the post-slice space you have to look at the dim of the advanced index itself! There is now a test for this.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

[ghstack-poisoned]
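The pre-slice vs. post-slice distinction in the commit message above can be made concrete with shape arithmetic alone (illustrative values; `nnz` is a hypothetical nonzero count):

```python
# x has shape (5, 3) and is assigned as x[mask] = value, where
# mask is a 1-D boolean index over dim 0 with nnz true entries.
# The slice x[mask] has shape (nnz, 3): the advanced index
# contributes its own dim, so the post-slice space has 2 dims
# even though the pre-slice space is the full (5, 3) tensor.
x_shape = (5, 3)
nnz = 2  # hypothetical nonzero count
post_slice_shape = (nnz,) + x_shape[1:]
assert post_slice_shape == (2, 3)

# A value of shape (1, nnz, 3) has one dim more than the
# post-slice space, so exactly one leading 1 must be stripped;
# comparing against the pre-slice space instead gets this wrong.
value_shape = (1, nnz, 3)
assert len(value_shape) - len(post_slice_shape) == 1
```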
Collaborator

@ngimel ngimel left a comment


Yeah hard to do it fully correctly, but this is better than before.

@ezyang
Contributor Author

ezyang commented Feb 20, 2023

better version at #95141

ezyang added a commit that referenced this pull request Feb 20, 2023
…cked"


This prevents us from guarding on leading unbacked SymInts.

In the previous attempt at #94521, I got the logic a bit wrong. My idea there was to avoid slicing when the values to be set have low enough dimensionality that they definitely aren't too long. To do this, I need to compute the difference between the data to be set and the post-slice space for the values, but I incorrectly compared against the *pre-slice* space in the original PR. Another wrong version of this PR compared against `variableIndices.size()`; but remember that in advanced indexing with tensors/lists, each individual index specifies what coordinates to read out of its dimension! A third incorrect attempt tested `variableIndices[0].dim()`, which is only correct if you don't broadcast one of the later variable indices, and if there are enough variableIndices to cover all dims. This is all quite complicated, so I went for a simpler solution: check whether the leading dim has a hint before testing whether it is not equal to one.

BTW, there was previously no test for this one-stripping behavior; there is now a test, based off the real code that caused the problem.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

[ghstack-poisoned]
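The simpler solution described above — only test the leading dim against 1 when it has a hint — can be sketched as a tiny predicate (a hypothetical helper; in PyTorch the hint actually lives on the SymInt's underlying symbolic node):

```python
def strips_leading_one(hint):
    # hint is the concrete value of the leading dim when known,
    # or None for an unbacked SymInt. With no hint we cannot
    # test == 1 without installing a guard on the unbacked
    # symbol, so we conservatively keep the dimension.
    return hint is not None and hint == 1

assert strips_leading_one(1) is True      # backed and equal to one
assert strips_leading_one(5) is False     # backed, not one
assert strips_leading_one(None) is False  # unbacked: never guard
```

Being conservative here trades a possible missed optimization for never guarding on an unbacked symbol.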
ezyang added a commit that referenced this pull request Feb 20, 2023
ezyang added a commit that referenced this pull request Feb 20, 2023
ezyang added a commit that referenced this pull request Feb 20, 2023
ezyang added a commit that referenced this pull request Feb 20, 2023
ezyang added a commit that referenced this pull request Feb 20, 2023
pytorchmergebot pushed a commit that referenced this pull request Feb 21, 2023
This prevents us from guarding on leading unbacked SymInts.

In the previous attempt at #94521 I got the logic a bit wrong. My idea there was to avoid slicing when the values to be set have low enough dimensionality that they definitely aren't too long. To do this, I need to compute the difference between the data to be set and the post-slice space for the values, but in the original PR I incorrectly compared against the *pre-slice* space. Another wrong version of this PR compared against `variableIndices.size()`; but remember that in advanced indexing with tensors/lists, each of the individual indices specifies which coordinates to read out of each dimension! A third incorrect attempt tested `variableIndices[0].dim()`, which is only correct if none of the later variable indices are broadcast and there are enough variableIndices to cover all dims. This is all quite complicated, so I went for a simpler solution: check whether the leading dim has a hint before testing whether it is not equal to one.

BTW, there was previously no test for this leading-1 stripping behavior. There is now a test for it, based off the real code that caused the problem.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: #95141
Approved by: https://github.com/ngimel
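The "simpler solution" in the commit message above — only test a leading dim against 1 when it has a concrete hint — can be sketched as follows (hypothetical helper names, not the real ATen code; an unbacked SymInt is modeled as a size whose hint is `None`):

```python
def strip_hinted_leading_ones(sizes, hints):
    # sizes: possibly-symbolic sizes; hints[i] is the concrete value
    # when known, or None for an unbacked SymInt. Peel leading dims
    # only while a hint *proves* they equal 1 -- an unhinted leading
    # dim is left alone rather than guarded on.
    i = 0
    while i < len(sizes) - 1 and hints[i] == 1:
        i += 1
    return sizes[i:]

assert strip_hinted_leading_ones([1, 1, 5], [1, 1, 5]) == [5]
# Leading unbacked dim ("u0" stands in for a symbolic size): untouched.
assert strip_hinted_leading_ones(["u0", 5], [None, 5]) == ["u0", 5]
```

The point is that the `== 1` comparison is only ever evaluated on sizes we already know concretely, so no guard is ever recorded on an unbacked leading dimension.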
pruthvistony added a commit to ROCm/pytorch that referenced this pull request May 2, 2023
pruthvistony added a commit to ROCm/pytorch that referenced this pull request May 2, 2023
@facebook-github-bot facebook-github-bot deleted the gh/ezyang/1805/head branch June 8, 2023 16:49
jhavukainen pushed a commit to kulinseth/pytorch that referenced this pull request Mar 15, 2024
If it's just right, broadcasting will do the right thing
automatically.

This helps with unbacked SymInts as I can avoid testing one
equality on the inside.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: pytorch#94521
Approved by: https://github.com/voznesenskym
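The idea in the cherry-picked commit message above can be illustrated outside ATen (a hypothetical NumPy sketch, not the real setitem code): leading 1s on the value only need to be peeled when the value has *more* dims than the destination; when the ranks already match — "just right" — broadcasting lines things up with no `== 1` test at all.

```python
import numpy as np

def peel_excess_leading_ones(value, dest_ndim):
    # Drop leading size-1 dims only while the value is "too big",
    # i.e. has more dims than the destination. At equal rank,
    # broadcasting does the right thing automatically.
    while value.ndim > dest_ndim and value.shape[0] == 1:
        value = value.reshape(value.shape[1:])
    return value

dest = np.zeros((3, 4))
dest[...] = peel_excess_leading_ones(np.ones((1, 1, 3, 4)), dest.ndim)
assert dest.all()
```

Because an equal-rank value never enters the loop, its leading dim is never compared against 1 — which is what avoids the equality test on an unbacked SymInt.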
jhavukainen pushed a commit to kulinseth/pytorch that referenced this pull request Mar 15, 2024
jhavukainen pushed a commit to kulinseth/pytorch that referenced this pull request Mar 15, 2024

Labels

ciflow/trunk, Merged, Reverted, release notes: composability, topic: not user facing
