-
Notifications
You must be signed in to change notification settings - Fork 790
[wast-parser] Add parse_binary_modules option #2583
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[wast-parser] Add parse_binary_modules option #2583
Conversation
include/wabt/feature.def
Outdated
| WABT_FEATURE(extended_const, "extended-const", false, "Extended constant expressions") | ||
| WABT_FEATURE(relaxed_simd, "relaxed-simd", false, "Relaxed SIMD") | ||
| WABT_FEATURE(custom_page_sizes, "custom-page-sizes", false, "Custom page sizes") | ||
| WABT_FEATURE(validate_binary_modules, "validate-binary-modules", true, "Validate binary modules") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I think about it, perhaps this is best placed in a field of WastParserOptions instead? It would not make sense for a user to specify this flag on the command line, I expect it would break assumptions in the various tools (e.g. if the WastParser returns OK, the Module in a BinaryScriptModule should be populated).
f1b1c05 to
ee2d3b2
Compare
| &errors, module); | ||
| if (options_->validate_binary_modules) { | ||
| ReadBinaryIr(filename, bsm->data.data(), bsm->data.size(), options, | ||
| &errors, module); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the code change, this is more like ignoring the module. Not validating usually means the module is processed, but the errors are ignored.
(Note: I am not a reviewer)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The script_module and its binary data are still made available (see the unit test for example usage), it's just the structured wabt::Module that is not provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case perhaps a better name for this option might be "parse_binary_modules" or "no_parse_binary_modules"?
BTW, how do invalid binaries modules in the test suite currently work without this option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to "parse_binary_modules".
Currently invalid modules are parsed and fail validation in WABT, which causes the test to pass. This change does not alter that behavior.
The use case is when you're using WABT to parse WAST files, but you don't want WABT to reject invalid binary modules. Maybe you want to send that invalid module data to a WASM engine that you're building. With this change, you can set parse_binary_modules to false and extract that invalid data out without WABT rejecting it.
Enabled by default, this option implements the current behavior of WABT when it encounters "(module binary ...)" script modules. That is, it parses the binary module, reporting any errors that it sees. When disabled, the binary module will be passed back to the WAST parser as unvalidated raw bytes. This allows for the use of WABT in places which want to provide invalid modules to module loaders (e.g. spec test runners). Fixes WebAssembly#2561.
ee2d3b2 to
65a087d
Compare
sbc100
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with title updated.
|
Thank you. Love WABT btw, it's really nice to work with! |
| TEST(WastParser, InvalidBinaryModule) { | ||
| std::string text = R"((module binary "\00asm\bc\0a\00\00"))"; | ||
| std::vector<uint8_t> expected_data = { | ||
| 0, 'a', 's', 'm', static_cast<uint8_t>('\xbc'), '\x0a', 0, 0}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This static_cast seems a little odd. Is there a better way to write this? Is this needed because C chars are signed by default? Does 0xbc work here instead?
Enabled by default, this option implements the current behavior of WABT when it encounters "(module binary ...)" script modules. That is, it parses the binary module, reporting any errors that it sees.
When disabled, the binary module will be passed back to the WAST parser as unvalidated raw bytes. This allows for the use of WABT in places which want to provide invalid modules to module loaders (e.g. spec test runners).
Fixes #2561.