Add FP16 and I64 support for wasi-nn WinML backend. #8964
abrown merged 10 commits into bytecodealliance:main
Conversation
Force-pushed from 613cae7 to 95657d4
    impl BackendExecutionContext for WinMLExecutionContext {
        fn set_input(&mut self, id: Id, tensor: &Tensor) -> Result<(), BackendError> {
            // TODO: Clear previous bindings when needed.
WinML may report an error:

    self.binding.Bind("input", tensor of shape [10]);
    self.binding.Bind("input", tensor of shape [11]); // <-- error

But this works:

    self.binding.Bind("input", tensor of shape [10]);
    self.binding.Clear();
    self.binding.Bind("input", tensor of shape [11]);
Ok, so this needs to be fixed then in this PR?
No, not in this one. It cannot be fixed simply by adding a self.binding.Clear() here, because a model may have multiple input features; in that case the application calls set_input more than once.
I think I understand what you're saying about Clear: it erases all the bindings, even for other model inputs. We can't have that. But what happens in this conceivable sequence?
- The Wasm guest calls set_input on input N with shape A
- The Wasm guest again calls set_input on input N with shape B
This is valid, though silly. The user should not have to face an error from wasi-nn in this case, right? But, if WinML is going to raise an error, then should we protect this some other way, e.g., by checking that the tensor shape is what the model expects (either A or B)?
That's a variable input that accepts both A and B (like a string of varying length). I feel this is either a bug in WinML or an incorrect calling flow on our side. Clear fixes the issue, but I'm not sure it's the only solution, so I'm not adding Clear at this time.
    let tensor = match tensor_kind {
        TensorKind::Float16 => {
            let output_tensor = inspectable.cast::<TensorFloat16Bit>()?;
            let dimensions = dimensions_as_u32(&output_tensor.Shape()?)?;
We can do this once at the top of the function.
output_tensor's type is unknown at the top of the function.
How about:

    let itensor = inspectable.cast::<ITensor>()?;
    let dimensions = dimensions_as_u32(&itensor.Shape()?)?;

or something like that?
It works, but GetAsVectorView is a method of TensorFloat16Bit, so we would still need to cast itensor again to the specific tensor type. The change above makes the code cleaner, but would casting twice be a performance issue?
Some devices may not support FP32. prtest:full