
Allow ORT actively fallback CUDAExecutionProvider to ROCMExecutionProvider#16895

Closed
cloudhan wants to merge 2 commits into main from guangyunhan/fallback-cuda-to-rocm

Conversation

@cloudhan
Contributor

In the wild, for example, PyTorch and Hugging Face (PyTorch pipelines) use `cuda` for AMD GPUs. Their users can switch from CUDA devices to ROCm devices essentially painlessly. That is, in the PyTorch world, `cuda` falls back to ROCm.

When switching to the Hugging Face ORT backend, the pipeline automatically populates the string `CUDAExecutionProvider` to pin down the provider. ORT then does not play well in this case, because the framework falls back from CUDA to CPU instead.

This disparity creates a lot of headaches when benchmarking the ROCm EP with scripts written for CUDA.

This PR addresses it by allowing CUDA to fall back to ROCm.
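The intended behavior can be sketched roughly as follows: before a session is constructed, an opt-in environment variable rewrites any requested `CUDAExecutionProvider` to `ROCMExecutionProvider`. This is a minimal illustrative sketch, not the PR's exact implementation; the function name is hypothetical, and only the env-var name and provider strings come from the PR.

```python
import os

def maybe_fallback_cuda_to_rocm(providers):
    """Hypothetical sketch: rewrite CUDA EP requests to the ROCm EP
    when the opt-in env var from this PR is set to "1"."""
    if os.environ.get("ORT_FALLBACK_CUDA_EP_TO_ROCM_EP", "0") != "1":
        return providers  # fallback disabled: leave the list untouched
    return [
        "ROCMExecutionProvider" if p == "CUDAExecutionProvider" else p
        for p in providers
    ]
```

With this, a CUDA-oriented benchmark script that passes `["CUDAExecutionProvider", "CPUExecutionProvider"]` would transparently get the ROCm EP on an AMD box once the variable is exported.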

@cloudhan cloudhan force-pushed the guangyunhan/fallback-cuda-to-rocm branch from 6e24e9c to a9cb7ce on July 28, 2023 05:59

def set_provider_options(name, options):
    if (
        os.environ.get("ORT_FALLBACK_CUDA_EP_TO_ROCM_EP", "0") == "1"
Contributor

We've avoided environment variables for configuration so far. Let's continue this convention.

Contributor Author

This is not config; it is a workaround. This fallback logic happens right before the construction of the Session, so the entry ORT_FALLBACK_CUDA_EP_TO_ROCM_EP fits into neither session options nor provider options.

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>


3 participants