Move PJRT Python APIs out of torch_xla.experimental.* #5011

will-cromar merged 25 commits into master.
@@ -11,14 +11,15 @@
 import torch_xla.core.xla_env_vars as xenv
should we rename these experimental files?
Yes, good catch.
# TODO(wcromar): Detect GPU device too


def device_type() -> Optional[str]:
I think we have similar functions in torch_xla.core.xla_model; do we want to do some cleanup?
I want to take this chance to do some cleanup. I am always confused about which function to call for local ordinal, global ordinal, world_size, etc., and what they really mean in a pod context. If we can restructure those APIs a bit and maybe have them in this runtime module instead, that would be nice... (random idea, might need more thinking)
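For reference, the usual relationship between these quantities in a multi-host pod can be sketched as follows. This is illustrative arithmetic only, not torch_xla's API; all function names here are hypothetical:

```python
# Illustrative only: how local ordinal, global ordinal, and world size
# typically relate in a multi-host ("pod") setup. Names are hypothetical.


def global_ordinal(host_index: int, local_ordinal: int,
                   devices_per_host: int) -> int:
  """Pod-wide device index: each host contributes devices_per_host devices."""
  return host_index * devices_per_host + local_ordinal


def world_size(num_hosts: int, devices_per_host: int) -> int:
  """Total number of devices participating across all hosts."""
  return num_hosts * devices_per_host


# Device 2 on host 1 of a 4-host pod with 4 devices per host:
print(global_ordinal(1, 2, 4))  # → 6
print(world_size(4, 4))         # → 16
```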
xla_model has become a kitchen sink where we put random things; if we can move all the runtime-related bits into this file, that is actually nicer.
Discussed this offline. We'll start to move APIs that interact directly with the runtime to a new module, and leave any modeling-related APIs in xla_model. I moved the PJRT version of rendezvous back to xla_model, and the old rendezvous will be an alias of that implementation when PJRT is enabled.
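The aliasing approach described here can be sketched roughly as follows. This is a minimal illustration, not the actual torch_xla implementation; `deprecated_alias` and the stand-in `rendezvous` are hypothetical:

```python
import functools
import warnings


def deprecated_alias(new_fn, old_name):
  """Return a wrapper that forwards to new_fn but warns when called via old_name."""

  @functools.wraps(new_fn)
  def wrapper(*args, **kwargs):
    warnings.warn(
        f'{old_name} is deprecated; use '
        f'{new_fn.__module__}.{new_fn.__name__} instead',
        DeprecationWarning,
        stacklevel=2)
    return new_fn(*args, **kwargs)

  return wrapper


# Stand-in for the canonical implementation that now lives in xla_model.
def rendezvous(tag: str) -> str:
  return f'rendezvous:{tag}'


# The old experimental name becomes a warning alias of the new one.
legacy_rendezvous = deprecated_alias(
    rendezvous, 'torch_xla.experimental.pjrt.rendezvous')
```

Calling `legacy_rendezvous('init')` emits a `DeprecationWarning` and returns the same result as calling `rendezvous('init')` directly.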
Force-pushed from 8fa857f to 660a4cf.
Force-pushed from 93def62 to de2f117.
    return

  logging.warning(
      'XRT configuration not detected. Defaulting to preview PJRT '
do we want to change preview to stable here?
Good catch. Done.
JackCaoG left a comment:

LGTM, but I prefer to merge on Monday.
Force-pushed from dc1130e to f11be83.
Reorganize the experimental PJRT Python APIs:

- Created an `_internal` module for APIs that are well-tested, but likely to change. I moved device-specific logic here, since I expect to rework it in the near future. All of these functions are mainly used for framework development; in general, users shouldn't have to call them directly.
- Moved stable runtime APIs to the new `torch_xla.runtime` module.
- Added a `deprecation` module to register deprecated aliases for all public functions that are moving out into other parts of `torch_xla`.

Summary of new modules:

- `torch_xla.runtime`
- `torch_xla._internal.tpu`
- `torch_xla._internal.gpu`
- `torch_xla._internal.pjrt`