[dynamo] TENSOR_SUBCLASS_METADATA_MATCH don't copy SymInts#175596

Closed
azahed98 wants to merge 5 commits into gh/azahed98/6/base from gh/azahed98/6/head
azahed98 (Contributor) commented Feb 24, 2026

Stack from ghstack (oldest at bottom):

Fixes an error in the `TENSOR_SUBCLASS_METADATA_MATCH` guard when a tensor subclass has a SymInt in its metadata. In that case, `deepcopy` of the metadata follows the SymInt down to the ShapeEnv, the FakeMode, and then the FakeTensors, and the copy fails because the FakeTensors have no data pointer.
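The failure mode can be sketched without PyTorch. The snippet below is a minimal illustration, not the actual Dynamo code: `SymbolicInt`, `Env`, and `Backend` are hypothetical stand-ins for a SymInt, its ShapeEnv, and an object that cannot be copied. It only shows how `copy.deepcopy` walks every reachable reference, so one symbolic value in otherwise plain metadata is enough to drag uncopyable state into the copy.

```python
import copy


class Backend:
    """Hypothetical stand-in for an object that refuses to be copied."""

    def __deepcopy__(self, memo):
        raise RuntimeError("cannot copy: no data pointer")


class Env:
    """Hypothetical stand-in for a shared environment owning such objects."""

    def __init__(self):
        self.resources = [Backend()]


class SymbolicInt:
    """Hypothetical stand-in for a symbolic int that references its environment."""

    def __init__(self, hint, env):
        self.hint = hint
        self.env = env


def try_deepcopy(metadata):
    # deepcopy recurses through dict values and instance attributes alike.
    try:
        copy.deepcopy(metadata)
        return "ok"
    except RuntimeError as exc:
        return f"failed: {exc}"


env = Env()
plain_metadata = {"block_size": 32}
symbolic_metadata = {"block_size": SymbolicInt(32, env)}

print(try_deepcopy(plain_metadata))
print(try_deepcopy(symbolic_metadata))
```

The first call succeeds because the metadata holds only plain values; the second fails because the copy reaches `Backend` through `SymbolicInt.env`.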

This PR replaces SymInts in the metadata with an `_AnyCompare` object that always returns `True` for equality checks. This assumes that the dynamic shapes checks will handle correctness for those values.
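A rough sketch of the idea, under stated assumptions: the class and helper below are illustrative names chosen here (`AnyCompare`, `SymbolicInt`, `scrub_symbolic`) and are not taken from the PR's diff. The sketch replaces symbolic entries with a wildcard before the metadata is stored, so a later equality check only constrains the non-symbolic fields.

```python
class AnyCompare:
    """Illustrative wildcard: compares equal to any value."""

    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False

    def __hash__(self):
        return 0

    def __repr__(self):
        return "<any>"


class SymbolicInt:
    """Hypothetical stand-in for a symbolic integer."""

    def __init__(self, hint):
        self.hint = hint


def scrub_symbolic(value):
    """Return a copy of `value` with symbolic entries replaced by wildcards."""
    if isinstance(value, SymbolicInt):
        return AnyCompare()
    if isinstance(value, dict):
        return {k: scrub_symbolic(v) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub_symbolic(v) for v in value]
    if isinstance(value, tuple):
        return tuple(scrub_symbolic(v) for v in value)
    return value


saved = scrub_symbolic(
    {"block_size": SymbolicInt(32), "layout": "row_major", "dims": (SymbolicInt(8), 4)}
)

# Symbolic slots accept any value; concrete slots must still match.
same_layout = {"block_size": 64, "layout": "row_major", "dims": (16, 4)}
other_layout = {"block_size": 64, "layout": "col_major", "dims": (16, 4)}
print(saved == same_layout, saved == other_layout)
```

Because the wildcard never inspects the other operand, nothing in the stored metadata references the symbolic value or its environment, and the check relies entirely on some other mechanism to validate the symbolic sizes.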

**Test Plan:** The original error can be reproduced with [this script](https://gist.github.com/sayakpaul/929678132809874c5dbf9c5215460d33) (if run on the previous commit from this stack). This PR adds a regression test that manually injects a SymInt into the metadata, compiles with `fullgraph=True`, and checks that no recompiles occur.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @kadeng @chauhang @amjames @Lucaskabela @jataylo

pytorch-bot Bot commented Feb 24, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/175596

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit b44fa06 with merge base f72a552:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot Bot commented Feb 24, 2026

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

azahed98 added a commit that referenced this pull request Feb 24, 2026
azahed98 added a commit that referenced this pull request Mar 12, 2026
ghstack-source-id: 3c31491
Pull-Request: #175596

sandy-gags pushed a commit to sandy-gags/pytorch that referenced this pull request Mar 12, 2026
pytorch-bot Bot added the ciflow/torchtitan label (Run TorchTitan integration tests) Mar 12, 2026
azahed98 added the topic: not user facing label (topic category) Mar 13, 2026
@pytorchmergebot (Collaborator)

Starting merge as part of PR stack under #175660

pytorchmergebot pushed a commit that referenced this pull request Mar 13, 2026
…sure refcycle (#175660)

Fixes a potential reference cycle that can block `swap_tensors` during or after compile. The cycle arises because the nested function `_empty_create_subclass`, defined inside `MetaConverter.empty_create_subclass`, closes over the `MetaConverter` object.

This PR makes `_empty_create_subclass` a method of `MetaConverter` instead, adding arguments and moving imports as needed.

**Test Plan:** The original error can be reproduced with [this script](https://gist.github.com/sayakpaul/929678132809874c5dbf9c5215460d33) (if run on the previous commit from this stack). This PR adds a unit test that checks that weakrefs created by `MetaConverter` are cleaned up when it is manually deleted, even if garbage collection is disabled.
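One way such a cycle can arise, shown as a self-contained sketch rather than the real `MetaConverter` code: a nested function that calls itself is stored in its own closure cell, so the function and the cell keep each other alive, and anything else the function closes over (here `self`) stays alive with them until the cyclic collector runs. The class names below are hypothetical, and the behavior shown relies on CPython's reference counting.

```python
import gc
import weakref


class Payload:
    """Object whose lifetime is observed through a weak reference."""


class ClosureConverter:
    def __init__(self):
        self.payload = Payload()

    def convert(self, depth):
        # `_helper` refers to itself and to `self` via closure cells:
        # function -> cell -> function is a cycle that also pins `self`.
        def _helper(d):
            return self.payload if d == 0 else _helper(d - 1)

        return _helper(depth) is self.payload


class MethodConverter:
    def __init__(self):
        self.payload = Payload()

    def _helper(self, d):
        # A method is looked up on demand, so no lasting closure is created.
        return self.payload if d == 0 else self._helper(d - 1)

    def convert(self, depth):
        return self._helper(depth) is self.payload


def payload_freed_without_gc(converter_cls):
    """Report whether deleting the converter frees its payload with GC off."""
    was_enabled = gc.isenabled()
    gc.collect()
    gc.disable()
    try:
        conv = converter_cls()
        ref = weakref.ref(conv.payload)
        conv.convert(2)
        del conv
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()


print(payload_freed_without_gc(ClosureConverter))
print(payload_freed_without_gc(MethodConverter))
```

With the closure version, the payload outlives `del conv` until a garbage-collection pass; with the method version, reference counting alone releases it, which is the property the unit test described above checks for.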

Pull Request resolved: #175660
Approved by: https://github.com/anijain2305
ghstack dependencies: #175397, #175596
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026
…75596)

Pull Request resolved: pytorch#175596
Approved by: https://github.com/anijain2305
ghstack dependencies: pytorch#175397
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026
…sure refcycle (pytorch#175660)

AaronWang04 pushed a commit to AaronWang04/pytorch that referenced this pull request Mar 31, 2026
…75596)

AaronWang04 pushed a commit to AaronWang04/pytorch that referenced this pull request Mar 31, 2026
…sure refcycle (pytorch#175660)

@github-actions github-actions Bot deleted the gh/azahed98/6/head branch April 13, 2026 02:26