Skip to content

Conversation

@sjamesr
Copy link
Contributor

@sjamesr sjamesr commented Apr 12, 2025

Enabled by default, this option implements the current behavior of WABT when it encounters "(module binary ...)" script modules. That is, it parses the binary module, reporting any errors that it sees.

When disabled, the binary module will be passed back to the WAST parser as unvalidated raw bytes. This allows for the use of WABT in places which want to provide invalid modules to module loaders (e.g. spec test runners).

Fixes #2561.

WABT_FEATURE(extended_const, "extended-const", false, "Extended constant expressions")
WABT_FEATURE(relaxed_simd, "relaxed-simd", false, "Relaxed SIMD")
WABT_FEATURE(custom_page_sizes, "custom-page-sizes", false, "Custom page sizes")
WABT_FEATURE(validate_binary_modules, "validate-binary-modules", true, "Validate binary modules")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think about it, perhaps this is best placed in a field of WastParserOptions instead? It would not make sense for a user to specify this flag on the command line, I expect it would break assumptions in the various tools (e.g. if the WastParser returns OK, the Module in a BinaryScriptModule should be populated).

@sjamesr sjamesr force-pushed the add_validate_binary_modules_feature branch from f1b1c05 to ee2d3b2 Compare April 12, 2025 02:45
@sjamesr sjamesr changed the title Add validate_binary_modules feature. Add validate_binary_modules option. Apr 12, 2025
&errors, module);
if (options_->validate_binary_modules) {
ReadBinaryIr(filename, bsm->data.data(), bsm->data.size(), options,
&errors, module);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the code change, this is more like ignoring the module. Not validating usually means the module is processed, but the errors are ignored.
(Note: I am not a reviewer)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script_module and its binary data are still made available (see the unit test for example usage), it's just the structured wabt::Module that is not provided.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case perhaps a better name for this option might be "parse_binary_modules" or "no_parse_binary_modules"?

BTW, how do invalid binaries modules in the test suite currently work without this option?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to "parse_binary_modules".

Currently invalid modules are parsed and fail validation in WABT, which causes the test to pass. This change does not alter that behavior.

The use case is when you're using WABT to parse WAST files, but you don't want WABT to reject invalid binary modules. Maybe you want to send that invalid module data to a WASM engine that you're building. With this change, you can set parse_binary_modules to false and extract that invalid data out without WABT rejecting it.

Enabled by default, this option implements the current behavior of WABT
when it encounters "(module binary ...)" script modules. That is, it
parses the binary module, reporting any errors that it sees.

When disabled, the binary module will be passed back to the WAST parser
as unvalidated raw bytes. This allows for the use of WABT in places
which want to provide invalid modules to module loaders (e.g. spec test
runners).

Fixes WebAssembly#2561.
@sjamesr sjamesr force-pushed the add_validate_binary_modules_feature branch from ee2d3b2 to 65a087d Compare April 14, 2025 17:05
Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with title updated.

@sjamesr sjamesr changed the title Add validate_binary_modules option. Add parse_binary_modules option. Apr 14, 2025
@sjamesr
Copy link
Contributor Author

sjamesr commented Apr 14, 2025

Thank you. Love WABT btw, it's really nice to work with!

TEST(WastParser, InvalidBinaryModule) {
std::string text = R"((module binary "\00asm\bc\0a\00\00"))";
std::vector<uint8_t> expected_data = {
0, 'a', 's', 'm', static_cast<uint8_t>('\xbc'), '\x0a', 0, 0};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This static_cast seems a little odd. Is there a better way to write this? Is this needed because C chars are signed by default? Does 0xbc work here instead?

@sbc100 sbc100 changed the title Add parse_binary_modules option. [wast-parser] Add parse_binary_modules option Apr 14, 2025
@sbc100 sbc100 enabled auto-merge (squash) April 14, 2025 17:19
@sbc100 sbc100 merged commit 5b65a58 into WebAssembly:main Apr 14, 2025
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Parse invalid binary modules

3 participants