Skip to content

Conversation

@keithw
Copy link
Member

@keithw keithw commented Jun 8, 2023

Before memory64, the "offset" in a load/store expression was a u32, and we enforced this in the WastParser and BinaryReader. After memory64, the "offset" becomes a u64 syntactically, and the validator checks that it's <= UINT32_MAX for i32 memories.

We hadn't been correctly allowing these very large offsets in the text format (even when memory64 was enabled and the memory was i64).

(This change also eliminates the "memories" member in the BinaryReader. The BinaryReader no longer needs to keep track of the memories and their types to check well-formedness.)

This PR is logically dependent on #2252 (I don't think the tests will pass until #2252 is merged).

All of this is a long prelude to eventually fixing #2192 with a good regression test!

Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comments

Opcode,
Var memidx,
Address align,
Address offset);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if would make sense to bundle memidx, align, offset into some kind of MemLocation struct? Are there other places in the code that want to reason about these three as a unit?

Copy link
Member Author

@keithw keithw Jun 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make a lot of sense, but it would be a pretty intensive refactor.

There's a lot of callsites in the BinaryReader (and its delegates, e.g. BinaryReaderIR/Interp/etc.) that pass these three in the form where memidx is an Index and align is in the binary (log2) form.

And there's a lot of callsites in the SharedValidator and in the IR expr constructors that pass these three in the form where memidx is a Var and align is in the expanded form.

So I think the most elegant/comprehensive treatment might have some sort of (name TBD) MemLocationRaw, used in all the binary readers, and a (name also TBD) MemLocationCooked that's used in the SharedValidator and IR, with one codepath to convert from one to the other? Or we can just do one of these and leave the other unchanged.

Happy to do this now if you want but it might make sense for a separate PR.

@keithw keithw force-pushed the validate-offset-pr branch from d6b8159 to 44633e7 Compare June 12, 2023 07:38
keithw added 2 commits June 12, 2023 12:18
Before memory64, the "offset" in a load/store expression was
a u32, and we enforced this in the WastParser and BinaryReader.
After memory64, the "offset" becomes a u64 syntactically, and the
validator checks that it's <= UINT32_MAX for i32 memories.

We hadn't been correctly allowing these very large offsets
in the text format (even when memory64 was enabled and the memory
was i64).

(This change also eliminates the "memories" member in the
BinaryReader. The BinaryReader no longer needs to keep track
of the memories and their types to check well-formedness.)
@keithw keithw force-pushed the validate-offset-pr branch from 44633e7 to 62ed39a Compare June 12, 2023 19:18
@keithw keithw enabled auto-merge (squash) June 12, 2023 19:19
@keithw keithw merged commit 5edb241 into main Jun 12, 2023
@keithw keithw deleted the validate-offset-pr branch June 12, 2023 19:37
@keithw keithw mentioned this pull request Oct 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants