Let grad ckpt apply opt-barrier to all params and buffers #7206

Merged
alanwaketan merged 5 commits into master from alanwaketan/grad_ckpt on Jun 6, 2024
Conversation

@alanwaketan (Collaborator)

Summary:
Let grad ckpt apply opt-barrier to all params and buffers.

Test Plan:
python test/test_operations.py -v -k test_opt_barrier
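
For context, a minimal sketch of what the test plan exercises, assuming torch_xla's gradient-checkpoint wrapper (torch_xla.utils.checkpoint.checkpoint) is the "grad ckpt" in question. The Toy module, shapes, and input below are illustrative assumptions, not the test's actual code:

import torch
import torch.nn as nn
import torch_xla
import torch_xla.core.xla_model as xm
from torch_xla.utils.checkpoint import checkpoint

device = xm.xla_device()

class Toy(nn.Module):  # hypothetical module, not the one used in the test
  def __init__(self):
    super().__init__()
    self.x = nn.Linear(128, 128)

  def forward(self, inp):
    return self.x(inp)

model = Toy().to(device)
# requires_grad on the input so the checkpointed backward actually runs
inp = torch.randn(4, 128, device=device, requires_grad=True)
# Checkpoint the forward; with this change the optimization barrier should
# fence the module's params and buffers, not just the activations.
output = checkpoint(model, inp)
torch.sum(output).backward()
# Dump the pending HLO and look for the opt-barrier instruction.
hlo = torch_xla._XLAC._get_xla_tensors_hlo([model.x.weight.grad])
print('opt-barrier' in hlo)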

@alanwaketan alanwaketan requested a review from JackCaoG June 5, 2024 23:38
@alanwaketan alanwaketan self-assigned this Jun 5, 2024
Comment thread on test/test_operations.py
output = torch.sum(output)
output.backward()

hlo = torch_xla._XLAC._get_xla_tensors_hlo([model.x.weight.grad])
@JackCaoG (Collaborator)

Didn't your change include the weight and buffer? The test seems to only check the HLO of weight.grad.

@alanwaketan (Collaborator, Author) Jun 5, 2024

That's the way to get the full HLO. As long as the opt-barrier contains all the tensors, it's fine.
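
To illustrate the point, a rough sketch reusing the names from the sketch above (those names are assumptions): any live XLA tensor from the backward graph pulls in the whole pending HLO, so the barrier's operands can be checked even though only weight.grad is passed in.

# One gradient tensor is enough to dump the entire pending graph.
hlo = torch_xla._XLAC._get_xla_tensors_hlo([model.x.weight.grad])
# The opt-barrier appears as a single HLO instruction; if the change works,
# every parameter and buffer shape should show up among its operands.
barrier_lines = [line for line in hlo.split('\n') if 'opt-barrier' in line]
for line in barrier_lines:
  print(line)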

@alanwaketan (Collaborator, Author)

Thanks, Jack.

@alanwaketan alanwaketan force-pushed the alanwaketan/grad_ckpt branch from 1cfb469 to b287944 on June 6, 2024 03:24
Comment thread on test/test_operations.py
self.assertEqual(opt_barrier.count("f32[128,128]"), 6)
self.assertEqual(opt_barrier.count("f32[128]"), 2)
self.assertEqual(opt_barrier.count("f32[64,64]"), 2)
# Somehow the CPU/GPU CI will not have the opt-barrier.
@JackCaoG (Collaborator)

Ehh, this is weird... I can look into this tomorrow...

@alanwaketan (Collaborator, Author)

Skip the GPU CI to move fast.
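
For reference, one way such a skip could be expressed (a sketch under assumptions; the guard and message are hypothetical, not necessarily what the merged test uses):

import unittest
import torch_xla.runtime as xr

class OptBarrierTest(unittest.TestCase):

  # Hypothetical guard: per the thread, the opt-barrier does not show up
  # in the HLO dumped on the CPU/GPU CI, so only assert on TPU.
  @unittest.skipIf(xr.device_type() != 'TPU',
                   'opt-barrier does not appear in the CPU/GPU CI HLO')
  def test_opt_barrier(self):
    ...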

@alanwaketan alanwaketan merged commit aec2730 into master Jun 6, 2024
@alanwaketan alanwaketan deleted the alanwaketan/grad_ckpt branch June 6, 2024 06:28