wasmparser: provide a better error if multiple modules/components concatenated #2293
This patch is motivated by wasm-ld's concatenation of custom sections with the same name. wit-component and related tools store component binaries in a module's custom section, and if these sections are unintentionally duplicated and then concatenated by wasm-ld, we get a strange error when parsing the corrupted binary.
There is no way to detect that a wasm module or component contains more data than its original length, because the wasm binary format is a fixed header followed by any number of sections.
If two valid wasm modules or components are concatenated into a single binary, the start of the second module will be parsed as another section. The wasm magic number is `\0asm`: the `\0` is parsed as the marker for a custom section, `a` is parsed as an leb section length of 97, and `s` is parsed as an leb length of 115 for the custom section's string name. Because that string does not fit in the section, this "section" will always fail to parse. Currently, it gives the error message `unexpected end-of-file`, which is somewhat misleading: it wasn't actually the end of the file that was encountered; that ends up just being an artifact of how wasmparser represents sections to readers.
So, we now do a 4-byte lookahead at the start of parsing each section and give this error case a more useful message.
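As a rough illustration of the idea, and not the actual wasmparser change, the lookahead can be sketched in standalone Rust. The names `check_section_start`, `misparse_as_section`, and `LookaheadError` are hypothetical and exist only for this sketch; it assumes the caller already knows the byte offset at which the next section would begin.

```rust
/// The 4-byte magic number that begins every wasm module or component.
const WASM_MAGIC: &[u8] = b"\0asm";

/// Hypothetical error type for this sketch; wasmparser's real error type differs.
#[derive(Debug, PartialEq)]
enum LookaheadError {
    /// A second wasm header was found where a section was expected.
    ConcatenatedBinary { offset: usize },
}

/// Peek at the bytes where the next section would start. If they are the
/// wasm magic number, report a concatenated binary instead of trying to
/// parse them as a section.
fn check_section_start(data: &[u8], offset: usize) -> Result<(), LookaheadError> {
    match data.get(offset..) {
        Some(rest) if rest.starts_with(WASM_MAGIC) => {
            Err(LookaheadError::ConcatenatedBinary { offset })
        }
        _ => Ok(()),
    }
}

/// Show how the first three header bytes would be read without the lookahead:
/// (section id, section length, name length). Only single-byte LEB values
/// (below 0x80) are handled, which is all the magic number needs.
fn misparse_as_section(header: &[u8]) -> Option<(u8, u8, u8)> {
    match header {
        [id, size, name_len, ..] if *size < 0x80 && *name_len < 0x80 => {
            Some((*id, *size, *name_len))
        }
        _ => None,
    }
}
```

For two empty modules placed back to back, `check_section_start` at offset 8 reports the concatenation, while `misparse_as_section` on the same bytes yields a custom section (id 0) of length 97 whose name claims 115 bytes, which is the mismatch that previously surfaced as `unexpected end-of-file`.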