Cache HLO in xb.call_jax and support non-tensor args by tengyifei · Pull Request #8878 · pytorch/xla

tengyifei · 2025-03-24T20:28:26Z

The main purpose is to replace the clunky manual `XlaComputation` object caching at https://github.com/AI-Hypercomputer/torchprime/blob/b0bd47e3c732c56e75d8d2b315f05e06d485dd22/torchprime/torch_xla_models/experimental/custom_kernel.py#L16, so that callers can just write `xb.call_jax(some_jax_func)` and avoid repeated tracing there.

We can't reuse the tracing cache in `jax.jit` because we jit a wrapper, not `jax_func` itself. `as_serialized_hlo_module_proto` also has overhead of its own, so it would be nice to avoid calling it repeatedly.

This PR also improves `xb.call_jax` to support non-tensor arguments. These arguments are passed from `xb.call_jax` to the JAX function unchanged. They are treated as "static arguments" and are baked into the HLO.

Because they are static arguments, the JAX function is re-traced whenever their values change.

Fixes #8795.


tengyifei marked this pull request as ready for review March 24, 2025 20:28

tengyifei requested review from bhavya01, qihqi and zpcore March 24, 2025 20:31

qihqi reviewed Mar 24, 2025


Comment thread torch_xla/core/xla_builder.py

qihqi approved these changes Mar 24, 2025


Address comments

1eb8f16

tengyifei merged commit a3ef52e into master Mar 24, 2025
23 checks passed

zpcore pushed a commit that referenced this pull request Mar 26, 2025

Cache HLO in xb.call_jax and support non-tensor args (#8878)

d803946

zpcore mentioned this pull request Mar 26, 2025

2.7 backport PR request list #8829

Closed

zpcore reviewed Mar 31, 2025


Comment thread torch_xla/core/xla_builder.py

zpcore mentioned this pull request Mar 31, 2025

Adapt Splash Attention from TorchPrime #8911

Merged

tengyifei merged 2 commits into master from yifeit/call-jax-cache
