Skip to content

[DTensor] single-dim foreach strategy#170631

Closed
pianpwk wants to merge 4 commits intogh/pianpwk/46/basefrom
gh/pianpwk/46/head
Closed

[DTensor] single-dim foreach strategy#170631
pianpwk wants to merge 4 commits intogh/pianpwk/46/basefrom
gh/pianpwk/46/head

Conversation

@pianpwk
Copy link
Contributor

@pianpwk pianpwk commented Dec 17, 2025

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
@pytorch-bot
Copy link

pytorch-bot bot commented Dec 17, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/170631

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit b2ec35b with merge base 61622da (image):

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

[ghstack-poisoned]
pianpwk added a commit that referenced this pull request Dec 17, 2025
ghstack-source-id: a22949b
Pull-Request: #170631
return OpSchema(
target_op, # type: ignore[arg-type]
args_schema=tuple(target_args),
kwargs_schema=op_schema.kwargs_schema,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: I don't know of if there are any cases where kwargs_schema could contain a TupleStrategy and that would need to also be translated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just changed to tree_map instead

child_strategies: list[StrategyType] = []
for tensorlist_i in range(tensorlist_len):
per_index_schema = _translate_foreach_op_schema(op_schema, tensorlist_i)
per_index_strategy = _expanded_strategy(per_index_schema)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wondering if it is worth doing LRU cache on this helper. It seems like a very common case to have 100s or 1000s of tensors in the foreach list, but all of them sharing the same placements, or only a handful of patterns for placements.

Copy link
Contributor Author

@pianpwk pianpwk Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I tried this but it seems like it needs more work; the OpSchema._comparison_key that's used for caching doesn't include placements, so out-of-box it's over-caching and incorrect

I added a todo there

Copy link
Contributor

@wconstab wconstab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks pretty good to me! thanks!

i think we can land it asap, i will work on landing the earlier PRs in the stack

[ghstack-poisoned]
pianpwk added a commit that referenced this pull request Dec 18, 2025
ghstack-source-id: 83854c3
Pull-Request: #170631
@pianpwk pianpwk changed the title foreach single-dim strategy [DTensor] single-dim foreach strategy Dec 18, 2025
@pianpwk pianpwk added the release notes: distributed (dtensor) release notes category label Dec 18, 2025
@pianpwk pianpwk marked this pull request as ready for review December 18, 2025 19:15

# TODO maybe this could be helped by adding a new 'tag' to the OpOverload?
# Also, i'm guessing that i'll need more info from the registration callsite
# about which inputs are expected to be lists vs tensors. But maybe I can just infer it all from the runtime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess you already handled this part (Also..) so you can delete it

Copy link
Contributor

@wconstab wconstab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! i think we can land it and continue to iterate on the todos.

up to you whether to land my pointwise PR too, which I've rebased recently and made safe to land, or just copy its contents into your test for now

pytorchmergebot pushed a commit that referenced this pull request Dec 20, 2025
This PR adds a dummy version of a single-dim pointwise strategy, and it shouldn't be used yet for real but it is being used by unit tests in #170631 so it is useful to land.

Pull Request resolved: #168115
Approved by: https://github.com/pianpwk

Co-authored-by: Pian Pawakapan <pianpwk@meta.com>
[ghstack-poisoned]
pianpwk added a commit that referenced this pull request Dec 21, 2025
ghstack-source-id: 06a7151
Pull-Request: #170631
@pianpwk
Copy link
Contributor Author

pianpwk commented Dec 22, 2025

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 22, 2025
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

xgz2 pushed a commit that referenced this pull request Dec 22, 2025
This PR adds a dummy version of a single-dim pointwise strategy, and it shouldn't be used yet for real but it is being used by unit tests in #170631 so it is useful to land.

Pull Request resolved: #168115
Approved by: https://github.com/pianpwk

Co-authored-by: Pian Pawakapan <pianpwk@meta.com>
krastogi-in pushed a commit to krastogi-in/pytorch that referenced this pull request Jan 9, 2026
This PR adds a dummy version of a single-dim pointwise strategy, and it shouldn't be used yet for real but it is being used by unit tests in pytorch#170631 so it is useful to land.

Pull Request resolved: pytorch#168115
Approved by: https://github.com/pianpwk

Co-authored-by: Pian Pawakapan <pianpwk@meta.com>
krastogi-in pushed a commit to krastogi-in/pytorch that referenced this pull request Jan 9, 2026
@github-actions github-actions bot deleted the gh/pianpwk/46/head branch January 22, 2026 02:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/inductor ciflow/trunk Trigger trunk jobs on your pull request Merged release notes: distributed (dtensor) release notes category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants