Minimal support for calling JAX from PyTorch/XLA #8781

Merged: tengyifei merged 6 commits into master from hanq_jax_torchxla_0302 (Mar 4, 2025)

Conversation

@tengyifei
Collaborator

The trick is to turn JAX into HLO and weave that into an existing PyTorch/XLA lazy tensor graph.

So far we just test a simple numerical function.

qihqi and others added 4 commits March 3, 2025 10:40
The key is to use `as_serialized_hlo_module_proto`. Our hand-rolled
`_xla_computation_text_to_proto` causes an undefined op reference error.
@tengyifei tengyifei requested a review from qihqi March 4, 2025 00:55
@tengyifei tengyifei marked this pull request as ready for review March 4, 2025 00:55
@tengyifei tengyifei enabled auto-merge (squash) March 4, 2025 03:50
@tengyifei
Collaborator Author

Our CPU CI doesn't have JAX and torchax, so I'm installing them like this: https://github.com/pytorch/xla/pull/8781/files#diff-d30e144e1a94b9125c97915d4bd55eeb00d136257f96c9c753c255b66b1c00b4

@tengyifei tengyifei merged commit 17270e2 into master Mar 4, 2025
@miladm miladm self-assigned this Mar 5, 2025
@miladm
Collaborator

miladm commented Mar 5, 2025

What happens if we tie torch.aotautograd.grad() into this? Won't it work?

@miladm
Collaborator

miladm commented Mar 5, 2025

I guess we are skipping this op :)
This is a great exploration.

  • I'd love to understand how non-functional code behaves in this code path.
  • How do we make tracing decisions on data-dependent conditional operations?

@qihqi @tengyifei

@tengyifei
Collaborator Author

tengyifei commented Mar 5, 2025

What happens if we tie torch.aotautograd.grad() into this? Won't it work?

We have the option to either use this with AOTAutograd or avoid AOTAutograd.

If we would like to use AOTAutograd, we can use AOTAutograd to turn a PyTorch function into an FX graph, then use torchax to turn the FX graph into a JAX function, then use call_jax to turn the JAX function into a LazyTensor node.

If we would like to avoid AOTAutograd, we'll use torchax to turn a PyTorch function into a JAX function, use jax.grad to get the backward pass (also as a JAX function), then use call_jax to turn the JAX function into a LazyTensor node.

In the latter approach, we'll be able to use powerful JAX remat features like checkpoint_name (checkpoint or offload an arbitrary tensor).
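Here's a minimal sketch of that latter path. It assumes torchax exposes something like `torchax.interop.jax_view` for viewing a PyTorch function as a JAX function, and that this PR's helper is reachable as `torch_xla.core.xla_builder.call_jax`; the exact names, import paths, and signatures may differ:

```python
import jax
import torch
import torch_xla
import torch_xla.core.xla_builder as xb  # assumes this PR exposes the helper as xb.call_jax
from torchax.interop import jax_view     # assumed torchax entry point; real name may differ

def torch_loss(w, x):
    # A purely functional PyTorch computation.
    return torch.sum(torch.sin(w @ x))

jax_loss = jax_view(torch_loss)   # view the PyTorch function as a JAX function
jax_grad = jax.grad(jax_loss)     # derive the backward pass with JAX

device = torch_xla.device()
w = torch.randn(4, 4, device=device)
x = torch.randn(4, 8, device=device)

# Each call lowers the JAX function to HLO and splices it into the
# lazy tensor graph as a single node.
loss = xb.call_jax(jax_loss, (w, x))
grad_w = xb.call_jax(jax_grad, (w, x))
```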

I'd love to understand how non-functional code behaves in this code path.

In the general case, this won't support non-functional code. For example, if some PyTorch function inserts a tensor into a global list, we'll just end up inserting a JAX tracer into the global list. But AOTAutograd has the same constraint. This limitation does not exist when only using LazyTensor, because that framework blurs the boundary between tracing and execution, at the cost of accidental graph breaks.
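A toy illustration of that failure mode in plain JAX (no PyTorch involved): whatever a side effect stores during tracing is a tracer, not a concrete array.

```python
import jax
import jax.numpy as jnp

captured = []  # global state mutated from inside the traced function

@jax.jit
def f(x):
    y = jnp.sin(x)
    captured.append(y)  # runs once, at trace time; stores a tracer
    return y * 2

f(jnp.ones(3))
# captured[0] is a leaked JAX tracer, not a concrete array;
# using it in a later computation raises an UnexpectedTracerError.
print(type(captured[0]))
```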

My hypothesis is that most high performance models (esp. ones in torchprime) won't do weird stuff like that.

We'll need to find some other solution to cover the long tail, e.g. use Dynamo.

How do we make tracing decisions on data-dependent conditional operations?

We can't do that (e.g. `if tensor: ...`) in jax.jit; it'll raise a Python exception. To be clear, we also can't do that efficiently with LazyTensor either, since it'd cause a graph break.

Similarly, my hypothesis is that high-performance models won't do data-dependent branches. They'd need to rewrite the `if` into a `jax.lax.cond`, as in the sketch below.
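For example, a Python-level branch on a traced value raises an error under `jax.jit`; the rewrite looks roughly like this:

```python
import jax
import jax.numpy as jnp

@jax.jit
def bad(x):
    if x.sum() > 0:   # data-dependent Python branch: errors under jax.jit
        return x * 2
    return x - 1

@jax.jit
def good(x):
    # Same logic expressed as an XLA conditional via jax.lax.cond.
    return jax.lax.cond(x.sum() > 0,
                        lambda v: v * 2,
                        lambda v: v - 1,
                        x)

good(jnp.arange(3.0))  # works; bad(...) would raise a tracer error
```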
