[SPMD][DTensor] introduce xla_distribute_module for DTensor integration#6683

Merged
yeounoh merged 1 commit into master from xla_distribute_module
Mar 7, 2024

Conversation

@yeounoh
Contributor

@yeounoh yeounoh commented Mar 7, 2024

This is to support pytorch/pytorch#92909
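
For context, `xla_distribute_module` mirrors DTensor's `distribute_module` API on XLA. A minimal usage sketch, assuming the `torch_xla.distributed.spmd` import path and a DTensor-style `partition_fn(name, module, device_mesh)` signature (the exact names and return value here may differ from the merged code):

```python
import torch.nn as nn
import torch_xla.core.xla_model as xm
import torch_xla.runtime as xr
from torch_xla.distributed.spmd import Mesh, mark_sharding, xla_distribute_module

xr.use_spmd()  # enable the XLA SPMD execution mode

# Build a 1-D device mesh over all addressable devices.
num_devices = xr.global_runtime_device_count()
mesh = Mesh(list(range(num_devices)), (num_devices,), ('data',))

def shard_params(name, module, device_mesh):
  # Illustrative partition_fn: shard each Linear weight along the
  # 'data' mesh axis and leave everything else replicated.
  if isinstance(module, nn.Linear):
    mark_sharding(module.weight, device_mesh, ('data', None))

model = nn.Linear(16, 16).to(xm.xla_device())
model = xla_distribute_module(model, mesh, partition_fn=shard_params)
```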

@yeounoh yeounoh added the distributed SPMD and other distributed things. label Mar 7, 2024
@yeounoh yeounoh self-assigned this Mar 7, 2024
@yeounoh yeounoh force-pushed the xla_distribute_module branch from c88e189 to a157894 on March 7, 2024 00:39
@yeounoh
Contributor Author

yeounoh commented Mar 7, 2024

This needs to land for the experimental release of the auto-sharding API (#6322).

@yeounoh yeounoh requested review from alanwaketan and wanchaol March 7, 2024 00:41
@yeounoh yeounoh force-pushed the xla_distribute_module branch 2 times, most recently from 26fc3a8 to 30850e1, on March 7, 2024 07:11
@yeounoh
Contributor Author

yeounoh commented Mar 7, 2024

cc @baoleai for visibility

@yeounoh
Contributor Author

yeounoh commented Mar 7, 2024

CI turned green, and it looks good locally on both TPU and CPU:

python test/spmd/test_dtensor_integration.py
...
----------------------------------------------------------------------
Ran 3 tests in 3.740s

OK

Collaborator

@alanwaketan alanwaketan left a comment

LGTM, but how does that work with auto_sharding? You still shard the inputs in the test case.

@yeounoh
Copy link
Copy Markdown
Contributor Author

yeounoh commented Mar 7, 2024

> LGTM, but how does that work with auto_sharding? You still shard the inputs in the test case.

Good question. We were thinking about introducing a pre-defined partition_fn for auto-sharding, e.g., torch_xla.distributed.auto_sharding_policy (subject to change). It would just be calling use_spmd(auto=True), though.
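
A minimal sketch of that idea, assuming the provisional `auto_sharding_policy` name above (explicitly subject to change) and the `use_spmd(auto=True)` switch proposed in #6322:

```python
import torch_xla.runtime as xr

def auto_sharding_policy(name, module, device_mesh):
  # Hypothetical pre-defined partition_fn: rather than annotating any
  # parameters itself, it enables XLA's auto-sharding pass and lets
  # the compiler decide how to partition the module.
  del name, module, device_mesh  # no manual sharding annotations
  xr.use_spmd(auto=True)
```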

@yeounoh yeounoh force-pushed the xla_distribute_module branch from 4caa123 to e659a76 on March 7, 2024 18:31
@yeounoh yeounoh merged commit b6b9c6d into master Mar 7, 2024

Labels

distributed SPMD and other distributed things.

2 participants