Skip to content

Conversation

@yhdengh
Copy link
Contributor

@yhdengh yhdengh commented Nov 4, 2021

This PR adds support for multi-memory proposal.

  • Added multi_memory as a feature. Added memidx(s) to IR representations of memory instructions.
  • Added support for parsing and writing memidx in both binary and text format, and also wasm2c.
  • Updated testsuite to WebAssembly/testsuite@f1eaa43 which includes testsuite for multi-memory proposal. Rebased tests. Skipped the test directory for wasm2c as suggested by wasm2c no longer passes many spec tests #1737

@yhdengh
Copy link
Contributor Author

yhdengh commented Nov 15, 2021

@sbc100 @kripken Would you be able to take a look at this?

@sbc100 sbc100 requested review from binji and sbc100 November 15, 2021 18:52
Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Thanks for working on it.

Quite a few comments but mostly minor stuff.

Address alignment_log2,
Address offset) override;
Address offset,
Index memidx) override;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be make logical sense to put the memidx first (right after the opcode) since that is where it appear in the binary and text format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memidx appears right after the opcode in text format, but it appears after alignment and offset in binary format. I could update it such that for headers related to binary format, memidx comes last and for headers related to text format, memidx comes first, but I think it would be better if the ordering is the same for all headers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that they should all be the same, but I also think we should follow the text format and have memidx come right after the opcode in all the APIs. The fact that it comes last in the binary format is kind of an accident of history whereas the text format rightly puts it first where I think it makes more sense, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The binary format uses bit 6 of alignment to indicate whether the current operation has a memidx immediate. Bit 6 is 1 if and only if there is a memidx immediate. Therefore, parser has to parse alignment first to figure out whether it is going to parse memidx, and that's why memidx is placed at the last. But I agree that it makes more logical sense to put memidx right after opcode. I will update the APIs.

return reader_->name(opcode, alignment_log2, offset); \
}

#define DEFINE_CUSTOM_LOAD_STORE_OPCODE(name) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "CUSTOM" mean here?

Result BinaryReader::ReadMemidx(Index* memidx, const char* desc) {
CHECK_RESULT(ReadIndex(memidx, desc));
ERROR_UNLESS(*memidx < memories.size(),
"load/store memory %u out of range %zu", *memidx,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about just "memory index %u out of range"?

if (value >= 128) {
PrintError("invalid %s: %u", desc, value);
return Result::Error;
} else if (value >= 32 && !options_.features.multi_memory_enabled()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about combining these two into a single condition if (value >= 128 || value >= 32 && !options_.features.multi_memory_enabled())?

if (options_.features.multi_memory_enabled()) {
result |= CheckMemoryIndex(memidx, &mt);
} else {
result |= CheckMemoryIndex(Var(0,loc), &mt);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just us CheckMemoryIndex(memidx, &mt); in both cases? Won't memidx above be zero anyway for the single-memory case?

Same below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was kept for the same reason as OnOpcodeUint32Uint32Uint32. For load and store expressions, it is fine to only keep CheckMemoryIndex(memidx). However, for other memory instructions (memory.copy, memory.fill, etc.), only keeping CheckMemoryIndex(memidx) makes the program failed to pass some tests because the error messages are different.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I'm curious to see what how the error message are different. It could be that we simple accept the new error message and rebase the test expectations? Can you given an example of the differences.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is an example of the test output, essentially CheckMemoryIndex(memidx) attaches a location prior to the error message:

test/spec/memory.txt
STDOUT MISMATCH:
--- expected
+++ actual
@@ -22,7 +22,7 @@
out/test/spec/memory.wast:37: assert_invalid passed:
000001d: error: load/store memory 0 out of range 0
out/test/spec/memory.wast:41: assert_invalid passed:

  • error: memory variable out of range: 0 (max 0)
  • 0000000: error: memory variable out of range: 0 (max 0)
    0000019: error: OnMemorySizeExpr callback failed
    out/test/spec/memory.wast:45: assert_invalid passed:
    error: memory variable out of range: 0 (max 0)

Again, we could also rebase the test output if the change doesn't affect any existing parts of wabt.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think rebasing the output for this kind of change is fine.

if (!options_->features.multi_memory_enabled()) {
Error(loc, "Specifying memory variable is not allowed");
return Result::Error;
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No else after return.

CHECK_RESULT(ParseMemidx(loc, &memidx));
CHECK_RESULT(ParseVar(&var));
out_expr->reset(new T(var, memidx, loc));
return Result::Ok;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this return since the it already exists at the outer level.

Same for the blocks below.

WriteName(var.name(), next_char);
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert this?

@yhdengh yhdengh force-pushed the multimemory branch 3 times, most recently from a92e8b4 to 3bfa638 Compare November 16, 2021 22:24
@yhdengh
Copy link
Contributor Author

yhdengh commented Nov 16, 2021

Thanks for all the feedback! I have updated commits in the pull request to address some of those.

For the if/else duplications, we kept them so that wabt passes all the existing tests when the multi-memory feature is not enabled. Most of them could be removed if we change existing tests for multi-memory disabled cases. Is that the preferred way to approach it?

The only exception is for reserved memidx of memory instructions in binary format, which requires a change in WebAssembly/testsuite.

DEFINE_INDEX_DESC(OnGlobalGetExpr, "index")
DEFINE_INDEX_DESC(OnGlobalSetExpr, "index")
DEFINE_LOAD_STORE_OPCODE(OnLoadExpr);
DEFINE_MEMORY_LOAD_STORE_OPCODE(OnLoadExpr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to rename this? i.e. Is the old DEFINE_LOAD_STORE_OPCODE still used anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is also used by other expressions with type LoadStoreExpr. I think any operations that have alignment and offset are defined as LoadStoreExpr, including operations that are not load/store related.

if (memidx == 0) {
CALLBACK(OnOpcodeUint32Uint32, alignment_log2, offset);
} else {
CALLBACK(OnOpcodeUint32Uint32Uint32, alignment_log2, offset, memidx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its fine for the output of this test to be rebased if it avoid this extra code complexity.

if (memidx == 0) {
CALLBACK(OnOpcodeUint32Uint32, alignment_log2, offset);
} else {
CALLBACK(OnOpcodeUint32Uint32Uint32, alignment_log2, offset, memidx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

CHECK_RESULT(ReadMemidx(&memidx, "memory.grow memidx"));
CALLBACK(OnMemoryGrowExpr, memidx);
CALLBACK(OnOpcodeUint32, memidx);
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

How about at least we do something like this to minimize duplication and make it clear precisely what the differences in the two code paths are:

Index memidx;
if (!options_.features.multi_memory_enabled())  {
  CHECK_RESULT(ReadU8(&reserved, "memory.grow reserved"));
  ERROR_UNLESS(memidx == 0, "memory.grow memory index must be 0");
} else {
  CHECK_RESULT(ReadMemidx(&memidx, "memory.grow memidx"));
}
CALLBACK(OnMemoryGrowExpr, memidx);
CALLBACK(OnOpcodeUint32, memidx);

CHECK_RESULT(ReadMemidx(&memidx, "memory.grow memidx"));
CALLBACK(OnMemoryGrowExpr, memidx);
CALLBACK(OnOpcodeUint32, memidx);
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use this pattern here and elsewhere and it should reduce the size of this patch.

Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise looks good to me!

Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, can you also update the README.md to say that we support this proposal?

@yhdengh yhdengh force-pushed the multimemory branch 2 times, most recently from d1fb575 to 87d7e9d Compare November 20, 2021 23:58
@yhdengh
Copy link
Contributor Author

yhdengh commented Nov 21, 2021

For reserved byte related logic, I did something like:

Index memidx = 0;
if (!options_.features.multi_memory_enabled())  {
  unit8_t reserved;
  CHECK_RESULT(ReadU8(&reserved, "memory.grow reserved"));
  ERROR_UNLESS(reserved == 0, "memory.grow memory index must be 0");
} else {
  CHECK_RESULT(ReadMemidx(&memidx, "memory.grow memidx"));
}
CALLBACK(OnMemoryGrowExpr, memidx);
CALLBACK(OnOpcodeUint32, memidx);

since I am not sure ReadU8(&memidx) gives the right result with different endianness. Is that correct?

Also, would you prefer to have all changes for the pull request squashed into a single commit or have them as multiple commits?

@sbc100
Copy link
Member

sbc100 commented Nov 21, 2021

For reserved byte related logic, I did something like:

Index memidx = 0;
if (!options_.features.multi_memory_enabled())  {
  unit8_t reserved;
  CHECK_RESULT(ReadU8(&reserved, "memory.grow reserved"));
  ERROR_UNLESS(reserved == 0, "memory.grow memory index must be 0");
} else {
  CHECK_RESULT(ReadMemidx(&memidx, "memory.grow memidx"));
}
CALLBACK(OnMemoryGrowExpr, memidx);
CALLBACK(OnOpcodeUint32, memidx);

since I am not sure ReadU8(&memidx) gives the right result with different endianness. Is that correct?

Also, would you prefer to have all changes for the pull request squashed into a single commit or have them as multiple commits?

We always squash the PR before landing it. Its up to you if you want to squash during the review process or not. I tend not to look at individual commits myself.

00001b func[0]:
00001c: 41 00 | i32.const 0
00001e: 28 02 00 | i32.load 2 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, looks at these changes now, it does seem a little odd to show the extra zero in the disassembly.

Can we under this part with minimal code changes? Perhaps something like:

if (memidx) {
  CALLBACK(OnOpcodeUint32Uint32Uint32, alignment_log2, offset, memidx);
} else {
  CALLBACK(OnOpcodeUint32Uint32, alignment_log2, offset);
}

Sorry for making you go back and forth on this.

Also, should we make the memidx for the first the Uint32 arguments for the load stores so it is shown first in the disassembly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes more sense to keep memidx as the last argument. The disassembly should be the same as the binary format which has memidx as the last argument, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I don't feel strongly about it being the last argument or not, but lets completely elide it when its zero (or at least when multi memory is not enabled).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just updated that. Tests related to disassembly/objdump are the same as how they are in the main branch now.

Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

@sbc100 sbc100 enabled auto-merge (squash) November 30, 2021 21:49
@sbc100 sbc100 merged commit cf1e138 into WebAssembly:main Nov 30, 2021
@yhdengh yhdengh deleted the multimemory branch December 2, 2021 19:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants