[WebNN EP] Optimize model partitioning #23332
4dc1715 to
2558060
Compare
|
@fdwr, PTAL, thanks! |
fdwr
left a comment
Interesting - I hadn't considered that disconnected graph chunks could end up in the same partition. I am good with it if Wanming is.
|
/azp run ONNX Runtime Web CI Pipeline,Windows GPU CI Pipeline,Linux Android Emulator QNN CI Pipeline |
|
/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline |
|
/azp run Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline |
|
/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,Windows x64 QNN CI Pipeline,Big Models |
|
Azure Pipelines successfully started running 2 pipeline(s). |
|
Azure Pipelines successfully started running 3 pipeline(s). |
|
Azure Pipelines successfully started running 4 pipeline(s). |
|
Azure Pipelines successfully started running 9 pipeline(s). |
Honry
left a comment
LGTM % a nit.
7b8f570 to
03df4f3
Compare
|
/azp run ONNX Runtime Web CI Pipeline,Windows GPU CI Pipeline,Linux Android Emulator QNN CI Pipeline |
|
/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline |
|
/azp run Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI |
|
/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,Windows x64 QNN CI Pipeline,Big Models |
|
Azure Pipelines successfully started running 2 pipeline(s). |
|
Azure Pipelines successfully started running 4 pipeline(s). |
|
Azure Pipelines successfully started running 4 pipeline(s). |
|
Azure Pipelines successfully started running 9 pipeline(s). |
|
Merge conflicts :/.
|
The old GetCapability function of the WebNN EP is a very simple search for groups of nodes that can be handled.
It does not work well on the following example graph:
 A     B
 |     |
\|/   \|/
 C  -> D
The topological order of this graph is A, B, C, D, and the WebNN EP supports only A and C.
Previously, the partitioning result was four partitions: {A}, {B}, {C}, {D}.
The optimized result is two partitions: {A, C} and {B, D}.
Therefore, we improve the partitioning by reusing utils::CreateSupportedPartitions,
which walks the edges of each node that the EP can handle as the nodes are iterated in topological order.
This guarantees that all connected nodes that can be handled are grouped together.
Correspondingly, we modify the webnn::GetSupportedNodes function to return the supported nodes rather than the already-grouped supported partitions.
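For contrast, the behaviour being replaced can be sketched as follows. This is only an illustration, assuming the old search simply collected runs of consecutive supported nodes in topological order, which is what the example implies. `GroupConsecutiveRuns` and `CheckExample` are hypothetical names, not onnxruntime functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of onnxruntime.
// supported[i] tells whether the i-th node in topological order can be handled.
// Returns groups of node positions; a group ends at the first unsupported node.
std::vector<std::vector<std::size_t>> GroupConsecutiveRuns(const std::vector<bool>& supported) {
  std::vector<std::vector<std::size_t>> groups;
  std::vector<std::size_t> run;
  for (std::size_t i = 0; i < supported.size(); ++i) {
    if (supported[i]) {
      run.push_back(i);
    } else if (!run.empty()) {
      // An unsupported node interrupts the run, even if it is unrelated to it.
      groups.push_back(run);
      run.clear();
    }
  }
  if (!run.empty()) groups.push_back(run);
  return groups;
}

// Example from the commit message: order A, B, C, D with only A and C supported.
void CheckExample() {
  const auto groups = GroupConsecutiveRuns({true, false, true, false});
  assert(groups.size() == 2);  // {A} and {C} stay separate because B sits between them
}
```

Because the scan looks only at position in the topological order and never at edges, A and C are split even though C directly consumes A.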
Update onnxruntime/core/providers/webnn/builders/helper.cc
Co-authored-by: Dwayne Robinson <fdwr@hotmail.com>
03df4f3 to
852690d
Compare
|
Oh...🤦 Just rebased code. Please help to re-trigger the tests. @fdwr, thanks. |
|
/azp run ONNX Runtime Web CI Pipeline,Windows GPU CI Pipeline,Linux Android Emulator QNN CI Pipeline |
|
/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline |
|
/azp run Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI |
|
/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,Windows x64 QNN CI Pipeline,Big Models |
|
Azure Pipelines successfully started running 2 pipeline(s). |
|
Azure Pipelines successfully started running 4 pipeline(s). |
|
Azure Pipelines successfully started running 4 pipeline(s). |
|
Azure Pipelines successfully started running 9 pipeline(s). |
fdwr
left a comment
✅ Reapproved.
### Description
<!-- Describe your changes. -->
The old `GetCapability` function of the WebNN EP is a very simple
search for groups of nodes that can be handled. It does not work well
on the following example graph, where A and D could be handled by the
EP but B sits between them in the topological order, so you get two
single-node capabilities. It may also be advantageous when C and
E can be handled by the EP, since they would be combined with D even
though they are not connected to it.
```
A  B  C
| /   |
D     E
|     |
```
Therefore, we improve the partitioning results by reusing
`utils::CreateSupportedPartitions`, which walks the edges of each node
that the EP can handle as the nodes are iterated in topological order.
This guarantees that all connected nodes that can be handled are grouped
together. Correspondingly, we modify the `webnn::GetSupportedNodes`
function to return the supported nodes rather than the already-grouped
supported partitions.
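
The edge-walking idea can be illustrated with a small standalone sketch. It is not the onnxruntime implementation of `utils::CreateSupportedPartitions`; it is a simplified model that assumes a group is closed only when an unsupported node directly consumes a node already in the group. The `Graph` struct and the names `GroupSupported` and `CheckDescriptionGraph` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>
#include <vector>

// Hypothetical minimal DAG: edges[i] lists the consumers of node i.
struct Graph {
  std::vector<std::vector<std::size_t>> edges;
};

// Simplified sketch: walk the graph in topological order (Kahn-style) and
// collect supported nodes into groups. An unsupported node is deferred only
// if it consumes an output of the current group (it is on the group "border").
std::vector<std::vector<std::size_t>> GroupSupported(const Graph& g,
                                                     const std::vector<bool>& supported) {
  const std::size_t n = g.edges.size();
  assert(supported.size() == n);

  std::vector<std::size_t> in_degree(n, 0);
  for (const auto& outs : g.edges)
    for (std::size_t to : outs) ++in_degree[to];

  std::deque<std::size_t> ready, deferred;
  for (std::size_t i = 0; i < n; ++i)
    if (in_degree[i] == 0) ready.push_back(i);

  std::vector<std::vector<std::size_t>> groups;
  std::vector<std::size_t> current;
  std::set<std::size_t> border;  // consumers of nodes already in `current`

  auto close_group = [&]() {
    if (!current.empty()) groups.push_back(current);
    current.clear();
    border.clear();
  };

  while (!ready.empty() || !deferred.empty()) {
    if (ready.empty()) {  // nothing more can join this group: start a new one
      close_group();
      ready.swap(deferred);
      continue;
    }
    const std::size_t node = ready.front();
    ready.pop_front();

    if (!supported[node] && border.count(node) > 0) {
      deferred.push_back(node);  // handled after the current group is closed
      continue;
    }
    if (supported[node]) {
      current.push_back(node);
      border.erase(node);
      for (std::size_t to : g.edges[node]) border.insert(to);
    }
    for (std::size_t to : g.edges[node])
      if (--in_degree[to] == 0) ready.push_back(to);
  }
  close_group();
  return groups;
}

// Graph from the description: A=0, B=1, C=2, D=3, E=4; A->D, B->D, C->E.
void CheckDescriptionGraph() {
  Graph g;
  g.edges = {{3}, {3}, {4}, {}, {}};
  // Only A and D supported: they land in one group although B precedes D.
  assert(GroupSupported(g, {true, false, false, true, false}).size() == 1);
}
```

In this model, marking C and E as supported as well also yields a single group containing A, C, D and E, which mirrors the remark that disconnected chunks can share a partition.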
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Co-authored-by: Dwayne Robinson <fdwr@hotmail.com>