Use comparison key in OpSchema to avoid duplicate work between `__hash__` and `__eq__` #161234

swolchok wants to merge 2 commits into gh/swolchok/789/base

Conversation
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/161234

✅ No failures as of commit 513deee with merge base a85711d.
```python
self._comparison_key = (self.op, args_to_hash)

def __hash__(self) -> int:
    return hash(self._comparison_key)
```
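The pattern under review — one precomputed key shared by `__hash__` and `__eq__` — can be sketched in isolation. `Schema`, its fields, and the cached-strategy dict below are illustrative stand-ins, not the real `OpSchema`:

```python
from dataclasses import dataclass

@dataclass(eq=False)
class Schema:
    """Toy stand-in for OpSchema (field names are illustrative)."""
    op: str
    args_schema: tuple

    def __post_init__(self) -> None:
        # Built once, so __hash__ and __eq__ no longer each walk the args
        # on every dict lookup.
        self._comparison_key = (self.op, self.args_schema)

    def __hash__(self) -> int:
        return hash(self._comparison_key)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Schema):
            return NotImplemented
        return self._comparison_key == other._comparison_key

cache = {Schema("aten.add", (1, 2)): "cached strategy"}
assert cache[Schema("aten.add", (1, 2))] == "cached strategy"
```

Note `eq=False` so the dataclass machinery leaves the hand-written `__eq__`/`__hash__` pair alone.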
> NOTE: this must be used as a read only data class

If we take this literally, then I'd say: why not precompute the hash value and store that instead of storing the tuple? Are you just hedging here, in case someone is mutating fields?
I think this is likely due to the function `_inplace_rewrap_schema_suggestion`, where args and kwargs are mutated. This function is quite old. We can mark a TODO here and come back when we clean up the DTensor prop logic and make `OpSchema` immutable.
OK, this sounds reasonable to me; let's do it. I'm stamping. @XilunWu can you open a PR to just add this TODO?
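For reference, the alternative floated above — precomputing the hash *value* rather than the key tuple — would look roughly like this. `HashCached` is a hypothetical sketch, not code from this PR:

```python
class HashCached:
    """Hypothetical: cache the hash int, keep the key for __eq__."""

    def __init__(self, op, args):
        self.op = op
        self.args = args
        self._key = (op, args)        # still needed for equality checks
        self._hash = hash(self._key)  # __hash__ becomes a field read

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        if not isinstance(other, HashCached):
            return NotImplemented
        # Cheap hash reject before the full tuple comparison.
        return self._hash == other._hash and self._key == other._key
```

It saves little over hashing a cached tuple, and it goes just as stale if fields are mutated — hence the read-only caveat in this thread.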
```python
if len(self.args_schema) != len(other.args_schema):
    return False

# compare each element and early return if any of them is different
```
Although the code here isn't directly moved into the comparison key, I confirmed that the `args_to_hash` and `kwargs_to_hash` members are equivalent to the logic below. LGTM.
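The equivalence being confirmed leans on tuple semantics: comparing two tuples with `==` already performs the length check and element-wise early exit that the hand-written loop did. An illustrative check (not the DTensor code itself):

```python
def old_style_eq(a, b):
    # Shape of the removed hand-rolled comparison.
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:  # early return on first mismatch
            return False
    return True

cases = [((1, 2, 3), (1, 2, 3)), ((1, 2), (1, 2, 3)), ((1, 9), (1, 2))]
# Built-in tuple equality gives the same answer in every case.
assert all(old_style_eq(a, b) == (a == b) for a, b in cases)
```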
```python
new_arg_schema.append(arg)
self.args_schema = tuple(new_arg_schema)
self.kwargs_schema = origin_schema.kwargs_schema
self._recompute_comparison_key()
```
Hmm, how did you confirm these are the only two places that call recompute? Ideally we wouldn't have to call it anywhere but `__post_init__`, but this makes me wonder.
It needs to be called anywhere somebody mutates `args_schema` or `kwargs_schema`. The class has already been "supposed to be read-only" for years, and I took that as a given. If you want it validated exhaustively, then we will have to skip this optimization (but note that for any Python class, somebody could break your invariants if they wanted to by messing with stuff they've been asked not to touch).
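To illustrate why the read-only assumption matters: if a mutation site forgets to recompute, the cached key silently goes stale. A toy model, not `OpSchema` itself:

```python
class Keyed:
    """Toy model of a comparison-key cache over mutable fields."""

    def __init__(self, args):
        self.args = args
        self._recompute_comparison_key()

    def _recompute_comparison_key(self):
        self._comparison_key = tuple(self.args)

    def __hash__(self):
        return hash(self._comparison_key)

    def __eq__(self, other):
        return (isinstance(other, Keyed)
                and self._comparison_key == other._comparison_key)

k = Keyed([1, 2])
k.args.append(3)               # mutation the class was not designed for
assert k == Keyed([1, 2])      # stale: still compares as the OLD contents
assert k != Keyed([1, 2, 3])   # the mutated value is invisible
k._recompute_comparison_key()  # what every mutation site must remember
assert k == Keyed([1, 2, 3])
```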
I am a bit concerned about this: `_inplace_rewrap_schema_suggestion`, where the `OpSchema` object gets mutated, is used in `torch/distributed/tensor/_sharding_prop.py` (lines 361 to 367 at 47d2673). Though, we don't need the hashing for `suggestion_schema` afterwards.
I was going to suggest adding `self._recompute_comparison_key()` at the end of `_inplace_rewrap_schema_suggestion` for now to be safe, but this will certainly introduce extra overhead compared to before.
> suggest add `self._recompute_comparison_key()` at the end of `_inplace_rewrap_schema_suggestion` for now to be safe

Great suggestion. We are replying to a comment thread about the line that implements it. :)
Starting merge as part of PR stack under #161285
`self is other` means the same thing as `id(self) == id(other)`, but it's one operator instead of 3. Pull Request resolved: #161235 Approved by: https://github.com/wconstab, https://github.com/zpcore, https://github.com/fduwjj ghstack dependencies: #161231, #161234
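The `is`/`id` equivalence claimed in #161235 is easy to check; the class here is illustrative, not the PyTorch code:

```python
class Tensorish:
    def __eq__(self, other):
        # One identity check instead of two id() calls plus a comparison.
        if self is other:
            return True
        return NotImplemented

a, b = Tensorish(), Tensorish()
assert (a is b) == (id(a) == id(b))  # both False: distinct objects
assert (a is a) == (id(a) == id(a))  # both True: same object
assert a == a
```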
…161240) get_write_alias() call count reduction explained briefly in code comment. We don't need to check write_aliases against None in the final outs_to_return calculation because we just did that check. Pull Request resolved: #161240 Approved by: https://github.com/wconstab ghstack dependencies: #161231, #161234, #161235
…ly_alias_match (#161284) Containers are truthy iff they're non-empty. Pull Request resolved: #161284 Approved by: https://github.com/Skylion007, https://github.com/wconstab ghstack dependencies: #161231, #161234, #161235, #161240
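The truthiness rule cited in #161284 ("containers are truthy iff they're non-empty") is what lets a plain `if not seq:` replace `len(seq) == 0`-style checks:

```python
# Empty built-in containers are falsy; non-empty ones are truthy.
for empty in ([], (), {}, set(), "", range(0)):
    assert not empty
for nonempty in ([0], (None,), {"k": 0}, {0}, " ", range(1)):
    assert nonempty
```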
…h__` and `__eq__` (pytorch#161234) The performance cost of `dict` lookups keyed by `OpSchema` is a significant minority of DTensor overhead. With this change we shave a net ~1% off the total running time of the benchmark from pytorch#160580, as measured by using cProfile and comparing cumulative time spent in propagate + OpSchema's `__post_init__`. (`__post_init__` grew from 2.5% to 6.4% (+3.9%) and propagate shrank from 12.5% to 7.8% (-4.7%)). Pull Request resolved: pytorch#161234 Approved by: https://github.com/wconstab ghstack dependencies: pytorch#161231
…#161285) Drives down the overhead of return_and_correct_storage_aliasing slightly. Hopefully you'll agree it doesn't compromise readability. Pull Request resolved: pytorch#161285 Approved by: https://github.com/wconstab ghstack dependencies: pytorch#161231, pytorch#161234, pytorch#161235, pytorch#161240, pytorch#161284
Stack from ghstack (oldest at bottom):

- Use `is`, not `==`, to check exact type matches in _python_dispatch #161304
- Use comparison key in OpSchema to avoid duplicate work between `__hash__` and `__eq__` #161234 (this PR)

The performance cost of `dict` lookups keyed by `OpSchema` is a significant minority of DTensor overhead. With this change we shave a net ~1% off the total running time of the benchmark from #160580, as measured by using cProfile and comparing cumulative time spent in propagate + OpSchema's `__post_init__`. (`__post_init__` grew from 2.5% to 6.4% (+3.9%) and propagate shrank from 12.5% to 7.8% (-4.7%).)
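The cProfile methodology described above — comparing cumulative time of specific functions across runs — can be sketched like this; `workload` is a stand-in for the #160580 benchmark, not the actual DTensor code:

```python
import cProfile
import pstats

def workload():
    # Stand-in for the DTensor benchmark being profiled.
    return sum(i * i for i in range(100_000))

with cProfile.Profile() as prof:  # context-manager form, Python 3.8+
    workload()

stats = pstats.Stats(prof)
# Cumulative time per function is what the percentages above compare
# (cumtime of propagate vs. OpSchema.__post_init__ in the real run).
stats.sort_stats("cumulative").print_stats(10)
```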
cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta