Add FP16 and I64 support for wasi-nn WinML backend. #8964
abrown merged 10 commits into bytecodealliance:main
Conversation
Force-pushed from 613cae7 to 95657d4
    impl BackendExecutionContext for WinMLExecutionContext {
        fn set_input(&mut self, id: Id, tensor: &Tensor) -> Result<(), BackendError> {
            // TODO: Clear previous bindings when needed.
WinML may report an error:

    self.binding.Bind("input", tensor of shape [10]);
    self.binding.Bind("input", tensor of shape [11]); // <-- error

But this works:

    self.binding.Bind("input", tensor of shape [10]);
    self.binding.Clear();
    self.binding.Bind("input", tensor of shape [11]);
Ok, so this needs to be fixed then in this PR?
No, not in this one. It cannot be fixed simply by adding a self.binding.Clear() here, because a model may have multiple input features; in that case the application calls set_input more than once.
I think I understand what you're saying about Clear: it erases all the bindings, even for other model inputs. We can't have that. But what happens in this conceivable sequence?
- The Wasm guest calls set_input on input N with shape A
- The Wasm guest again calls set_input on input N with shape B
This is valid, though silly. The user should not have to face an error from wasi-nn in this case, right? But, if WinML is going to raise an error, then should we protect this some other way, e.g., by checking that the tensor shape is what the model expects (either A or B)?
That's a variable input that accepts both A and B (like a string of varying length). I feel this is either a bug in WinML or an incorrect calling flow on our side. Clear fixes the issue, but I'm not sure it's the only solution, so I'm not adding Clear at this time.
    let tensor = match tensor_kind {
        TensorKind::Float16 => {
            let output_tensor = inspectable.cast::<TensorFloat16Bit>()?;
            let dimensions = dimensions_as_u32(&output_tensor.Shape()?)?;
We can do this once at the top of the function.
output_tensor's type is unknown at the top of the function.
How about:

    let itensor = inspectable.cast::<ITensor>()?;
    let dimensions = dimensions_as_u32(&itensor.Shape()?)?;

or something like that?
It works, but GetAsVectorView is a method of TensorFloat16Bit, so we would still need to cast itensor again to the specific tensor type. The change above makes the code cleaner, but would casting twice be a performance issue?
Some devices may not support FP32. prtest:full