Support gmm and tgmm trace_pallas caching #7921
Conversation
Still need to add a test for the cache miss case.
    global trace_pallas_arg_to_payload
    # implicit assumption here that everything in kwargs is hashable and not a tensor,
    # which is true for the gmm and tgmm.
    hash_key = (kernel, static_argnums, tuple(static_argnames), tuple(jax_args),
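For context, the caching pattern under discussion can be sketched roughly as below. This is a hedged illustration, not the PR's actual implementation: only the `trace_pallas_arg_to_payload` name and the general shape of the hash key come from the diff above; `trace_fn` and the `(shape, dtype)` metadata helper are stand-ins for the real Pallas tracing step and the JAX meta tensors.

```python
# Minimal sketch of the caching idea (see assumptions in the lead-in above).

trace_pallas_arg_to_payload = {}  # global cache: hash key -> traced payload


def describe(tensor):
    # Key tensors by metadata only (shape and dtype), never by value or identity.
    return (tuple(tensor.shape), str(tensor.dtype))


def trace_pallas_cached(trace_fn, kernel, *tensors,
                        static_argnums=(), static_argnames=(), **kwargs):
    meta_args = tuple(describe(t) for t in tensors)
    # As in the diff: kwargs are assumed to be hashable and tensor-free,
    # which holds for gmm and tgmm.
    hash_key = (kernel, tuple(static_argnums), tuple(static_argnames),
                meta_args, tuple(sorted(kwargs.items())))
    if hash_key not in trace_pallas_arg_to_payload:
        # Cache miss: pay the tracing cost once per kernel/shape combination.
        trace_pallas_arg_to_payload[hash_key] = trace_fn(
            kernel, *tensors, static_argnums=static_argnums,
            static_argnames=static_argnames, **kwargs)
    return trace_pallas_arg_to_payload[hash_key]
```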
How does this work with different objects but with the same size, dtype and device?
jax_args are just meta tensors; I verified that the same size will always map to the same hash. We are not hashing id(static_argnames), so as long as the values are the same it will generate the same hash.
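As a hedged illustration of that point (using the plain `(shape, dtype)` metadata from the sketch above rather than real JAX meta tensors): two distinct tensors of the same size produce equal keys, so they hit the same cache entry.

```python
import torch

a = torch.empty(128, 256, dtype=torch.bfloat16)
b = torch.empty(128, 256, dtype=torch.bfloat16)  # different object, same metadata

key_a = (tuple(a.shape), str(a.dtype))
key_b = (tuple(b.shape), str(b.dtype))

assert a is not b                    # distinct tensor objects
assert key_a == key_b                # identical metadata keys
assert hash(key_a) == hash(key_b)    # so both map to the same cache entry
```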
That's interesting. I guess if it works it works. Then why not just use @cache?
My understanding is that @cache keys on the inputs, and the inputs of this function are XLA tensors; I suspect @cache would try to access the values of those tensors. Here I only cache on the JAX meta tensors.
Also, let me re-verify this with the real MoE models.
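One way to see the mismatch, as a hedged aside (plain CPU tensors here, not XLA tensors): functools' @cache keys on the argument objects themselves, so two tensors with identical shapes do not share an entry.

```python
import functools
import torch

calls = 0

@functools.cache  # keys on the tensor objects passed in
def fake_trace(t):
    global calls
    calls += 1    # count how often we actually "trace"
    return tuple(t.shape)

a = torch.empty(128, 256)
b = torch.empty(128, 256)  # same shape, different object

fake_trace(a)
fake_trace(b)
assert calls == 2  # both calls traced: same-shape tensors did not share a cache entry
```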
Verified in the profile that I was able to reduce the tracing time of gmm from 6 ms to 2.4 ms.

