Skip to content

[ROCm] enabling miopen_batch_norm lowering in inductor#105740

Closed
jataylo wants to merge 12 commits intopytorch:mainfrom
jataylo:miopen_batch_norm_lowering
Closed

[ROCm] enabling miopen_batch_norm lowering in inductor#105740
jataylo wants to merge 12 commits intopytorch:mainfrom
jataylo:miopen_batch_norm_lowering

Conversation

@jataylo
Copy link
Collaborator

@jataylo jataylo commented Jul 21, 2023

Enabling miopen_batch_norm lowering for inductor only.

This is to avoid errors observed in some models and perf difference is very close from initial benchmarks.

LoweringException: RuntimeError: Expected contiguous tensor, but got non-contiguous tensor for argument #1 'input' (while checking arguments for miopen_batch_norm)
  target: aten.miopen_batch_norm.default

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @hongxiayang @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov

@jataylo jataylo added module: rocm AMD GPU support for Pytorch ciflow/trunk Trigger trunk jobs on your pull request ciflow/periodic Trigger jobs ran periodically on master (periodic.yml) on the PR ciflow/inductor rocm This tag is for PRs from ROCm team ciflow/unstable Run all experimental or flaky jobs on PyTorch unstable workflow ciflow/slow labels Jul 21, 2023
@pytorch-bot
Copy link

pytorch-bot bot commented Jul 21, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/105740

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 4 Unrelated Failures

As of commit 82faf5b:

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following job failed but were present on the merge base 29f856e:

👉 Rebase onto the `viable/strict` branch to avoid these failures

UNSTABLE - The following jobs failed but were likely due to flakiness present on trunk and has been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@jataylo jataylo added topic: not user facing topic category keep-going Don't stop on first failure, keep running tests until the end labels Jul 21, 2023
@jithunnair-amd
Copy link
Collaborator

@jataylo Are the perf issues ( due to which the lowering was disabled) resolved in ROCm5.6 specifically?

@jataylo
Copy link
Collaborator Author

jataylo commented Jul 21, 2023

@jataylo Are the perf issues ( due to which the lowering was disabled) resolved in ROCm5.6 specifically?

@jithunnair-amd Not from ROCm5.6 specifically seems to mostly be python/triton updates that is closing the gap here.

The primary motivation for pushing this change is a failure faced in a few models that use batch_norm from some pytorch change.

LoweringException: RuntimeError: Expected contiguous tensor, but got non-contiguous tensor for argument #1 'input' (while checking arguments for miopen_batch_norm)
  target: aten.miopen_batch_norm.default

We will keep tabs on performance and revert if deemed necessary cc: @dllehr-amd

@jataylo jataylo removed the keep-going Don't stop on first failure, keep running tests until the end label Jul 26, 2023
@jataylo jataylo requested a review from malfet July 26, 2023 15:33
@jataylo jataylo marked this pull request as ready for review July 26, 2023 15:33
@jataylo jataylo added the rocm priority high priority ROCm PRs from performance or other aspects label Jul 26, 2023
@jataylo jataylo requested a review from jithunnair-amd July 26, 2023 15:36
@jataylo
Copy link
Collaborator Author

jataylo commented Jul 26, 2023

Hey @malfet enabling lowering for miopen_batch_norm on inductor to avoid some failures and unskipped the affected UTs. Could you help us approve this?

UTs failures are unrelated to this change.

The batch norm related UTs pass e.g.
2023-07-26T10:08:31.3428967Z inductor/test_torchinductor.py::CudaTests::test_batch_norm_2d_cuda PASSED [2.4299s] [ 58%]

cc: @jithunnair-amd

@jataylo jataylo requested a review from ngimel July 27, 2023 13:26
@jataylo jataylo requested a review from huydhn August 1, 2023 13:48
@jataylo
Copy link
Collaborator Author

jataylo commented Aug 1, 2023

Hi @malfet @huydhn @ngimel this is causing some pt2 model breakages for us until merged if you have some time to help us review.

The failures are unrelated but please let me know if I should rebase to get this green.

@malfet
Copy link
Contributor

malfet commented Aug 1, 2023

@pytorchbot merge -i

@pytorchmergebot
Copy link
Collaborator

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/inductor ciflow/periodic Trigger jobs ran periodically on master (periodic.yml) on the PR ciflow/slow ciflow/trunk Trigger trunk jobs on your pull request ciflow/unstable Run all experimental or flaky jobs on PyTorch unstable workflow Merged module: inductor module: rocm AMD GPU support for Pytorch open source rocm priority high priority ROCm PRs from performance or other aspects rocm This tag is for PRs from ROCm team topic: not user facing topic category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants