
[xpu][test] port some distributed tensor test files for Intel GPU #161703

Closed
wincent8 wants to merge 22 commits into pytorch:main from wincent8:wliao2/add_tensor_3

Conversation

@wincent8
Contributor

@wincent8 wincent8 commented Aug 28, 2025

This is another PR that ports distributed tensor tests to Intel GPU; the companion PR is #161604.
We enable Intel GPU with the following methods, keeping the original code style as far as possible:

Use torch.accelerator for device-generic GPU code
Skip cases that have known issues when running on XPU
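
The two methods above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the helper names `current_device_type` and `skip_if_xpu`, the test class, and the skip reason are all hypothetical, and the sketch assumes a PyTorch build recent enough to ship `torch.accelerator` (it falls back to CPU otherwise).

```python
import unittest

try:
    import torch
except ImportError:  # keep the sketch importable on machines without PyTorch
    torch = None


def current_device_type() -> str:
    """Return the active accelerator type ("cuda", "xpu", ...) or "cpu"."""
    # torch.accelerator is only present in recent PyTorch releases,
    # so look it up defensively instead of assuming it exists.
    accelerator = getattr(torch, "accelerator", None)
    if accelerator is not None and accelerator.is_available():
        return accelerator.current_accelerator().type
    return "cpu"


DEVICE_TYPE = current_device_type()


def skip_if_xpu(reason: str):
    """Hypothetical decorator: skip a test only when the active device is XPU."""
    return unittest.skipIf(DEVICE_TYPE == "xpu", reason)


class ExampleDeviceGenericTest(unittest.TestCase):
    @unittest.skipIf(torch is None, "PyTorch is not installed")
    def test_tensor_lands_on_active_device(self):
        # No hard-coded "cuda" string: the device comes from torch.accelerator.
        x = torch.ones(4, device=DEVICE_TYPE)
        self.assertEqual(x.device.type, DEVICE_TYPE)

    @skip_if_xpu("placeholder reason: known issue on XPU")
    def test_case_with_known_xpu_issue(self):
        # Runs on every backend except XPU, where it is reported as skipped.
        self.assertEqual(1 + 1, 2)
```

Deriving the device string once and reusing it keeps the test bodies unchanged across backends, which matches the stated goal of preserving the original code style.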

cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta @ezyang @msaroufim @dcci @tianyu-l @XilunWu @SherlockNoMad

pytorch-bot added the oncall: distributed label Aug 28, 2025
@pytorch-bot

pytorch-bot Bot commented Aug 28, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/161703

Note: Links to docs will display an error until the docs builds have been completed.

⏳ 2 Pending, 1 Unrelated Failure

As of commit 610f9b4 with merge base 79317dc:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

wincent8 changed the title from "port some distributed tensor test files for Intel GPU" to "[WIP] port some distributed tensor test files for Intel GPU" Aug 28, 2025
wincent8 force-pushed the wliao2/add_tensor_3 branch from 104f48d to 5441a55 September 2, 2025 06:26
wincent8 changed the title from "[WIP] port some distributed tensor test files for Intel GPU" to "port some distributed tensor test files for Intel GPU" Sep 2, 2025
@wincent8
Contributor Author

wincent8 commented Sep 2, 2025

@pytorchbot label "topic: not user facing"

pytorch-bot added the topic: not user facing label Sep 2, 2025
Comment thread test/distributed/tensor/test_redistribute.py Outdated
Comment thread test/distributed/tensor/test_dtensor.py Outdated
albanD added the triaged label Sep 2, 2025
Comment thread test/distributed/tensor/test_dtensor.py Outdated
guangyey
guangyey previously approved these changes Sep 8, 2025
Collaborator

@guangyey guangyey left a comment

LGTM.

@guangyey
Collaborator

guangyey commented Sep 8, 2025

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased wliao2/add_tensor_3 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout wliao2/add_tensor_3 && git pull --rebase)

@guangyey
Collaborator

guangyey commented Sep 8, 2025

please fix lint.

@wincent8
Contributor Author

wincent8 commented Sep 8, 2025

> please fix lint.

fixed

@guangyey guangyey requested a review from d4l3k September 9, 2025 03:30
d4l3k
d4l3k previously approved these changes Sep 11, 2025
Member

@d4l3k d4l3k left a comment

LGTM

@wincent8
Contributor Author

@pytorchbot merge

@pytorch-bot

pytorch-bot Bot commented Sep 12, 2025

Pull workflow has not been scheduled for the PR yet. It could be because author doesn't have permissions to run those or skip-checks keywords were added to PR/commits, aborting merge. Please get/give approval for the workflows and/or remove skip ci decorators before next merge attempt. If you think this is a mistake, please contact PyTorch Dev Infra.

@guangyey
Collaborator

@pytorchbot merge

pytorch-bot added the ciflow/trunk and ciflow/inductor labels Sep 12, 2025
@pytorch-bot

pytorch-bot Bot commented Sep 12, 2025

To add the ciflow label ciflow/inductor please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorchmergebot
Collaborator

Successfully rebased wliao2/add_tensor_3 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout wliao2/add_tensor_3 && git pull --rebase)

pytorch-bot removed the ciflow/trunk label Nov 14, 2025
@guangyey
Collaborator

@pytorchbot merge

@pytorch-bot

pytorch-bot Bot commented Nov 15, 2025

This PR has pending changes requested. Please address the comments and update the PR before merging.

@guangyey
Collaborator

@albanD May I know if you have any other comments?

Collaborator

@albanD albanD left a comment

Looks good!

@guangyey
Collaborator

Thanks!
@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here


Labels

ci-no-td, ciflow/h100-distributed, ciflow/trunk, ciflow/xpu, Merged, module: dtensor, oncall: distributed, open source, Reverted, topic: not user facing, triaged

Projects

Archived in project

9 participants