
[benchmarks] Fix AMP setup for torchbench models. #7067

Merged
JackCaoG merged 4 commits into master from ysiraichi/fix-benchmark-amp-setup
May 16, 2024

Conversation

@ysiraichi
Collaborator

Fix: #6556 (and, possibly #6833)

This PR fixes the benchmarks script when running with AMP. Previously, we were calling torch.amp.autocast(..., device_type="xla") for both XLA:CUDA and XLA:TPU. However, we should be using torch.cuda.amp.autocast for XLA:CUDA (see this for more details).

Context: after #6518, Super_Slomo inference started being run using AMP. However, due to #6511, that PR tried to mimic torch_xla.amp.autocast behavior, using torch.amp.autocast.

cc @miladm @JackCaoG @vanbasten23 @zpcore

@ysiraichi
Collaborator Author

Confirmed it also fixes #6833.

@ysiraichi ysiraichi requested a review from vanbasten23 May 15, 2024 20:59
@JackCaoG
Collaborator

Hmm according to https://github.com/pytorch/xla/blob/master/docs/amp.md we should be able to use autocast for both TPU and GPU, is that no longer the case?

@ysiraichi
Collaborator Author

That document is correct. The problem is that I didn't notice XLA:CUDA is supposed to run with CUDA autocast, i.e. torch.amp.autocast("cuda"). Instead, I was running both XLA:CUDA and XLA:TPU with XLA autocast, i.e. torch.amp.autocast("xla"). This behavior is implemented in torch_xla.amp.autocast. However, since that currently doesn't work with dynamo (#6511), I was using torch.amp.autocast directly.
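The selection described above can be sketched as follows. This is a simplified illustration, not verbatim code from the benchmark script: the function name `autocast_entry_point` and the accelerator strings are assumptions made for the example; only the mapping (CUDA autocast for XLA:CUDA, XLA autocast for XLA:TPU) comes from the discussion.

```python
def autocast_entry_point(accelerator: str) -> str:
    """Map a benchmark accelerator to the autocast API it should use.

    Hypothetical helper mirroring the PR's logic: XLA:CUDA (like inductor
    on CUDA) should use CUDA autocast, while XLA:TPU keeps XLA autocast.
    """
    if accelerator in ("cuda", "xla:cuda"):
        # CUDA autocast, i.e. torch.cuda.amp.autocast
        # (equivalent to torch.amp.autocast("cuda")).
        return "torch.cuda.amp.autocast"
    if accelerator == "xla:tpu":
        # XLA autocast, i.e. torch.amp.autocast("xla") -- what
        # torch_xla.amp.autocast would select on TPU.
        return 'torch.amp.autocast("xla")'
    raise ValueError(f"unsupported accelerator: {accelerator}")
```

The bug this PR fixes was, in effect, returning the `"xla"` branch for both backends.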

@JackCaoG JackCaoG merged commit aeed89e into master May 16, 2024
# https://github.com/pytorch/xla/issues/6511
if self.is_accelerator_cuda():
    # For inductor and XLA:CUDA, we use CUDA autocast.
    autocast = torch.cuda.amp.autocast
Collaborator


I guess torch.cuda.amp.autocast is the same as torch.amp.autocast("cuda")?

# https://github.com/pytorch/xla/issues/6511
if self.is_accelerator_cuda():
    # For inductor and XLA:CUDA, we use CUDA autocast.
    autocast = torch.cuda.amp.autocast
Collaborator


do you need to set kwargs["device_type"] = "xla" for XLA:GPU case?

Collaborator Author


Not really. torch.cuda.amp.autocast already does that.

