x64: Add memory operand support to EVEX instructions#6416

Merged
alexcrichton merged 5 commits into bytecodealliance:main from alexcrichton:x64-evex-memory
May 24, 2023
Conversation

@alexcrichton
Member

Currently load-sinking is enabled for EVEX instructions (aka AVX512
instructions) but the encoding of these instructions is a `todo!()`
which can cause a panic for some wasms if the right features are
enabled. This commit fills out the support for memory operands in the
same manner as VEX-encoded instructions. The main stickler here was that
EVEX instructions always use a scaled 8-bit offset which needed extra
handling to ensure that the correct offset is emitted.
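The scaled 8-bit offset mentioned above is EVEX's "compressed displacement" (disp8*N): the 8-bit displacement is implicitly multiplied by a factor N derived from the instruction's tuple type, so only offsets that are exact multiples of N fit the short form. A minimal sketch of that selection logic (the `Disp` and `choose_displacement` names are illustrative, not Cranelift's actual API):

```rust
/// Illustrative displacement forms for a memory operand.
#[derive(Debug, PartialEq)]
enum Disp {
    /// 8-bit displacement, stored pre-divided by the scaling factor N.
    Disp8(i8),
    /// Full 32-bit displacement, used when disp8*N cannot represent the offset.
    Disp32(i32),
}

/// Pick the smallest encoding for `offset` given the EVEX scaling factor `n`
/// (e.g. 16 for a 128-bit "Full Mem" tuple).
fn choose_displacement(offset: i32, n: i32) -> Disp {
    if offset % n == 0 {
        if let Ok(d8) = i8::try_from(offset / n) {
            return Disp::Disp8(d8);
        }
    }
    Disp::Disp32(offset)
}
```

With n = 16, an offset of 32 encodes as `Disp8(2)`, while 8 (not a multiple of 16) and 4096 (scaled value out of i8 range) both fall back to `Disp32`.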
@alexcrichton alexcrichton marked this pull request as ready for review May 19, 2023 20:57
@alexcrichton alexcrichton requested a review from a team as a code owner May 19, 2023 20:57
@alexcrichton alexcrichton requested review from abrown and removed request for a team May 19, 2023 20:57
@github-actions github-actions bot added the cranelift (Issues related to the Cranelift code generator) and cranelift:area:x64 (Issues related to x64 codegen) labels May 19, 2023
```rust
};
let bits = (self.bits >> Self::LL.start()) & ((1 << Self::LL.len()) - 1);
assert_eq!(bits, u32::from(b128.bits()));
let scaling = 16;
```
Member

I think I'm convinced from the tests that this scaling factor is correct for `vpabs`, but I'm not sure this is set up well for the future (or even for all of the possible EVEX-encoded instructions that can currently be emitted). Section 2.7.7 talks about the kinds of instructions that do not have embedded broadcast (single scalar result, explicit broadcast in opcode, pure load/store) — these instructions all have scaling != 16 for VL = 128. I think this means that if any future or current instructions are of this kind, we could get a miscompilation (see the expected scaling factors in table 2-35). I wish there were a way to throw an error of some kind here instead. (The same kind of logic applies for the instructions that are affected by embedded broadcast: instructions could expect scalings of 8 and 4 unless we do more checks, e.g., on `EVEX.b` and `EVEX.W`.)

My argument here is "let's prevent errors" when someone tries to add a new EVEX-encoded instruction in the future. I looked into adding a new function to `Avx512Opcode` to signal which tuple type each opcode has, but that might not totally work: `vpopcnt`, e.g., can be emitted as both "Full Mem" and "Full" (we use the "Full Mem" one, apparently!). `vcvtudq2ps` uses "Full", which only uses scaling = 16 when `EVEX.b` = 0 and `EVEX.W` = 1. Maybe `EvexInstruction` needs to gain a tuple field which we force users to specify somehow in emit.rs. Not sure I have the best solution here, but hopefully you get what I mean about protecting against future errors.

Member Author

Oh, my read was that `EvexContext` was what controlled all these bits, where whatever sets `EvexContext` guides the scaling, but it sounds like that's not right. I see now the listing on each instruction and its "Tuple Type"!

Looking at `vpopcnt`, if we have a separate opcode per size (which I think we do today anyway), it looks like that means the tuple type is per-opcode still? If that's possible, then I could add something like:

```rust
impl Avx512Opcode {
    fn tuple_type(&self) -> Avx512TupleType { /* ... */ }
}
```

which could get fed into the `EvexInstruction` encoder, which would then further handle all the `EVEX.{b,W}` cases? That way, if more tuple types were added, they would result in a non-exhaustive match and require handling.
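The shape of that proposal might look like the following sketch, with hypothetical names (the real Cranelift types differ; the scaling values follow the Intel SDM's tables 2-34/2-35 for VL = 128):

```rust
/// Hypothetical subset of EVEX tuple types; the SDM defines several more.
enum Avx512TupleType {
    Full,
    FullMem,
}

/// disp8 scaling factor N for a 128-bit vector length, given the tuple type
/// and the EVEX.b / EVEX.W bits. A new tuple-type variant would make this
/// match non-exhaustive and force explicit handling.
fn disp8_scaling(tuple: Avx512TupleType, evex_b: bool, evex_w: bool) -> u8 {
    match tuple {
        // "Full" shrinks to the broadcast element size when EVEX.b is set.
        Avx512TupleType::Full => match (evex_b, evex_w) {
            (true, true) => 8,
            (true, false) => 4,
            (false, _) => 16,
        },
        // "Full Mem" has no embedded-broadcast form; always the vector width.
        Avx512TupleType::FullMem => 16,
    }
}
```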

Member Author

OK, I've attempted to implement this, but a double-check is certainly in order.

```rust
    if amode.can_trap() {
        sink.add_trap(TrapCode::HeapOutOfBounds);
    }
}
```
Member

It's unfortunate that `sink` has to get more specific... I had always hoped this code could be used for some kind of assembler. But this trap is necessary, and if we eventually create an assembler this can get refactored.

Member Author

Oh, the other main purpose for a `MachBuffer` is this use of `use_label_at_offset` for RIP-relative addressing modes.

@abrown abrown left a comment (Member)

LGTM. From my reading, we now map the tuple type to the values in tables 2-34 and 2-35 for "Full" and "Full Mem", so we shouldn't have problems with the existing instructions. And any additions to `Avx512Opcode` will be forced to update `tuple_type()`.

@abrown abrown added this pull request to the merge queue May 24, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 24, 2023
@alexcrichton alexcrichton enabled auto-merge May 24, 2023 18:30
@alexcrichton alexcrichton added this pull request to the merge queue May 24, 2023
Merged via the queue into bytecodealliance:main with commit 797e769 May 24, 2023
@alexcrichton alexcrichton deleted the x64-evex-memory branch May 24, 2023 19:32