[SPMD] support DTensor API integration #5776
Conversation
Force-pushed from 6cef706 to ccb2c4c.
Tested locally. @JackCaoG @jonb377 @alanwaketan, I am moving/grouping the SPMD API under
Force-pushed from e60ce71 to 19fd03a.
Had a chat with @JackCaoG; we will move this out of experimental and into distributed from BETA.
```python
log = logging.getLogger(__name__)

TORCH_XLA_INITIALIZED = False
```
I don't quite understand this TORCH_XLA_INITIALIZED flag
Ah, we don't need this anymore, since this code now lives in torch_xla.
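For context, a guard flag like this typically comes from code that has to run whether or not torch_xla is installed. A minimal sketch of that pattern, assuming the code originally lived outside torch_xla (the surrounding structure is an assumption, not the PR's actual code):

```python
# Sketch of an import-guard flag: set once at import time so later code
# can branch on torch_xla availability. Inside torch_xla itself, as the
# comment above notes, this flag is unnecessary.
try:
    import torch_xla  # noqa: F401
    TORCH_XLA_INITIALIZED = True
except ImportError:
    TORCH_XLA_INITIALIZED = False
```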
```python
) -> None:
  if TORCH_XLA_INITIALIZED:
    # TODO(yeounoh) replace this with xr.use_spmd() when we deprecate the flag.
    os.environ["XLA_USE_SPMD"] = "1"
```
Why not just call xr.use_spmd() now?
There could be a case where the caller has already initialized some tensors before entering the API. We may want to support both, at least in the implementation. I will leave this as a TODO here.
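The fallback being discussed can be sketched as follows. The helper name and the dict parameter are illustrative assumptions for testability, not torch_xla's actual implementation:

```python
import os


def enable_spmd_via_env(environ=os.environ):
    """Illustrative sketch: enable SPMD through the legacy XLA_USE_SPMD
    environment flag rather than calling xr.use_spmd() directly, so the
    setting also applies to callers that initialized tensors beforehand.

    TODO: switch to xr.use_spmd() once the flag is deprecated.
    """
    environ["XLA_USE_SPMD"] = "1"
    return environ
```

Accepting a mapping instead of touching `os.environ` directly keeps the sketch testable without mutating the real process environment.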
JackCaoG
left a comment
I didn't read the whole thing too carefully, since it seems most of the changes are just moving code around. Is that true?
```diff
 import torch_xla.runtime as xr
-import torch_xla.experimental.xla_sharding as xs
-from torch_xla.experimental.xla_sharding import Mesh
+import torch_xla.distributed.spmd.xla_sharding as xs
```
What do you think about making the import path just `import torch_xla.distributed.spmd as xs`? Not sure if `xs` is still the right abbreviation in that case (maybe we can think of it as xla spmd), but it would make the spmd package the single entrypoint in userspace.
Good point, let's do that.
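One common way to keep the old import path working while making the new package the single entrypoint is a deprecated alias module using PEP 562 module-level `__getattr__`. This is a generic Python sketch of that technique; the helper and module names are hypothetical, not the shim actually used in the PR:

```python
import sys
import types
import warnings


def make_alias_module(old_name, new_module):
    """Register `old_name` in sys.modules as a deprecated alias of
    `new_module`. Attribute reads are forwarded to the new module and
    emit a DeprecationWarning pointing users at the new import path."""
    alias = types.ModuleType(old_name)

    def __getattr__(attr):
        warnings.warn(
            f"{old_name} is deprecated; import {new_module.__name__} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(new_module, attr)

    # PEP 562: modules consult a module-level __getattr__ for missing names.
    alias.__getattr__ = __getattr__
    sys.modules[old_name] = alias
    return alias
```

With a shim like this, old code that does `from old_home import Mesh` keeps working while warning users to switch to the new location.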
Force-pushed from be48d90 to c0cdd77.
Force-pushed from c7b779a to 794cbc7.
alanwaketan
left a comment
Moving forward, should we use dtensor directly even within our codebase?
@jonb377 do you want to update HF to reflect this change? We can maybe wait a couple of days in case this gets reverted :).
@alanwaketan for HF my major concern is backward compatibility. I think we should at least wait until the 2.2 release is out before updating.
Sounds good to me.
* [SPMD] move SPMD package to torch_xla/experimental/spmd, introduce shadow xla DTensor API.
* support backward compatibility of the old imports
* Move spmd out of experimental
* Update spmd.md for distributed/spmd
In response to the changes in pytorch/xla#5776 and #92909. Pull Request resolved: #113214. Approved by: https://github.com/wanchaol
This is a follow-up to pytorch/pytorch#92909, DTensor & XLA integration. In this PR, we address moving the xla DTensor API to `torch_xla/distributed/spmd` so the DTensor implementation can access it.