-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Reject requests for nonexistent kernels. #2364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| constexpr const char* kBrainSliceExecutionProvider = "BrainSliceExecutionProvider"; | ||
| constexpr const char* kTensorrtExecutionProvider = "TensorrtExecutionProvider"; | ||
|
|
||
| constexpr int kMaximumAiOnnxVersionSupported = 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why it is 10?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note this is branch rel-1905, which is a much older snapshot of ORT where 11 support was only partial (IIRC only a few ops out of them all), which was before the ONNX 1.6 specification was even published and finalized. So the little bit of 11 support contained in this branch is an uncertain state relative to the spec. We want to say cleanly that we support the published version of ONNX 1.4 (opset 10) in Vibranium.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now you need to find a way to make the tests pass :-(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I found a way to make them pass (at least onnxruntime_test_all.exe runs locally without errors) by applying the same rules to the other GetLatestOpsetVersions overload, which is what the Model class indirectly used to determine its default opset when no specific opset was passed to the constructor (it was using 11 by default). Also updated to testdata .onnx files which were set to opset 11 unnecessarily (moving them back down to 10).
Note that in master, I since saw that Scott McKay added explicit versions in a number of places so that we continue to still get testing on older op versions, and so that may not apply quite the same there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alas, I see that onnxruntime_test_all.exe doesn't run all the test_models, and there are still a number of opset 11 models in the CI tests complaining about failing to an 11 kernel, which is many to update O_o. I don't know if I'll make it in time before the bug bar rises to pacify the tests. I'll at least undo the latest change, but I'm considering abandoning this PR and keeping the logic exclusively in WinML for now.
2019-11-14 14:24:55.967545903 [E:onnxruntime:Default, runner.cc:42 RunTestCase] Test raw failed:Load model from test_models/test_abs_uint32/raw/model.onnx failed:This is an invalid model. Error in Node: : No Op registered for Abs with domain_version of 11
https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=96833
|
Are you saying: |
Correct, I loaded and executed an Add-13 model without any warning/error. |
|
Hmm, CI tests: "No Op registered for Identity with domain_version of 11" |
| constexpr const char* kBrainSliceExecutionProvider = "BrainSliceExecutionProvider"; | ||
| constexpr const char* kTensorrtExecutionProvider = "TensorrtExecutionProvider"; | ||
|
|
||
| constexpr int kMaximumAiOnnxVersionSupported = 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we be using
DomainToVersionRange from ONNX
see: https://github.com/onnx/onnx/blob/master/onnx/defs/schema.h#L687
rather than hardcoding it in ORT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing that out. I like the idea of using it. The issue here is that ONNX and ORT have different versions (note this fix targets the rel-1905 Vibranium branch, and so this might not be applicable to the master branch) as we took an older snapshot of ORT that only supported opset 10, but its ONNX submodule has 11 since ONNX 1.6 had already started (but wasn't finalized and published yet). The ORT opset 11 support in this branch was preliminary and partial (before the ONNX 1.6 spec was published, meaning ops could still change behavior). So we don't really want 11.
A pending question is what this should ultimately look like in master. I opened a GitHub issue: #2371
|
Changes look ok. Can you add a test case? |
@pranavsharma : Test case added. Ack, there are many other model tests that fail. I think I'll have to give up on this change to make the window and revert to just having this check only in WinML, even though it's not as complete there. |
| registries.push_front(registry); | ||
| } | ||
|
|
||
| static int GetActualLastSupportedOpSet( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this relate to the "DomainToVersionMap" in graph and model ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I fear my latest iteration last night (in an attempt to pacify the test code which calls this function to indirectly figure out what opset it should use for dynamically generated models by default) will break app registered schemas :/, and we do want to support that.
|
We'll make a modified change targeting master (incorporating George Wu's feedback) with the same policy (and keep this specific change local for Vibranium only). |
Description: ORT silently substitutes v10 operators when higher version operators (say v13 or v42..) are declared as opset dependencies by the model. This is problematic because it will appear to the user that the model was successfully executed when in reality the code has no idea if the results are correct because it can't predict the future - the semantics between v10 and v13 could change in the specification. It's better to fail than to return the incorrect results. Applications can always register custom schema and newer opset versions to fill in missing gaps, in cases where ORT is behind.
Note this branch of ORT is an older intermediate snapshot which only supported opset 10. Some 11 ops were implemented (BitShift, CumSum, Reduce*, DynamicQuantizeLinear, Pad, Resize, Slice), but none of the slice operators.
Motivation and Context
#2371