Skip to content

Add FP16 and I64 support for wasi-nn WinML backend.#8964

Merged
abrown merged 10 commits intobytecodealliance:mainfrom
jianjunz:f16-i64
Aug 8, 2024
Merged

Add FP16 and I64 support for wasi-nn WinML backend.#8964
abrown merged 10 commits intobytecodealliance:mainfrom
jianjunz:f16-i64

Conversation

@jianjunz
Copy link
Copy Markdown
Contributor

Some devices may not support FP32.

@jianjunz jianjunz force-pushed the f16-i64 branch 4 times, most recently from 613cae7 to 95657d4 Compare July 17, 2024 07:21
@jianjunz jianjunz marked this pull request as ready for review July 17, 2024 08:28
@jianjunz jianjunz requested review from a team as code owners July 17, 2024 08:29
@jianjunz jianjunz requested review from alexcrichton and removed request for a team July 17, 2024 08:29
@alexcrichton alexcrichton requested review from abrown and removed request for a team and alexcrichton July 17, 2024 14:30

impl BackendExecutionContext for WinMLExecutionContext {
fn set_input(&mut self, id: Id, tensor: &Tensor) -> Result<(), BackendError> {
// TODO: Clear previous bindings when needed.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is this needed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WinML may report an error

self.binding.Bind("input", tensor of shape [10]);
self.binding.Bind("input", tensor of shape [11]);  <-- error

But it works

self.binding.Bind("input", tensor of shape [10]);
self.binding.Clear();
self.binding.Bind("input", tensor of shape [11]);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so this needs to be fixed then in this PR?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not in this one. It cannot be easily fixed by adding a self.binding.Clear() here because a model may have multiple input features. In this case, application calls set_input more than once.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand what you're saying about Clear: it erases all the bindings, even for other model inputs. We can't have that. But what happens in this conceivable sequence?

  • Wasm guest calls set_input on input N with shape A
  • Wasm guest again calls set_input on input N with shape B

This is valid, though silly. The user should not have to face an error from wasi-nn in this case, right? But, if WinML is going to raise an error, then should we protect this some other way, e.g., by checking that the tensor shape is what the model expects (either A or B)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a variable input, accepts both A and B (like a string with different length). I feel like this is a bug of WinML, or the calling flow is incorrect. Clear fixes the issue but I'm not sure if that's the only solution, so I'm not adding Clear at this time.

let tensor = match tensor_kind {
TensorKind::Float16 => {
let output_tensor = inspectable.cast::<TensorFloat16Bit>()?;
let dimensions = dimensions_as_u32(&output_tensor.Shape()?)?;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do this once at the top of the function.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

output_tensor's type is unknown at the top of the function.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

let itensor = inspectable.cast<ITensor>()?;
let dimensions = dimensions_as_u32(&itensor.Shape()?)?;

or something like that?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works, but GetAsVectorView is a method of TensorFloat16Bit. Then we'll need to cast itensor again to specific tensor type. The change above makes code clean but will it be a performance issue to cast twice?


impl BackendExecutionContext for WinMLExecutionContext {
fn set_input(&mut self, id: Id, tensor: &Tensor) -> Result<(), BackendError> {
// TODO: Clear previous bindings when needed.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so this needs to be fixed then in this PR?

@abrown abrown added this pull request to the merge queue Aug 8, 2024
Merged via the queue into bytecodealliance:main with commit 6907868 Aug 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants