Skip to content

Winch aarch64 extend, wrap, popcnt, promote & demote#9114

Merged
saulecabrera merged 14 commits intobytecodealliance:mainfrom
vulc41n:winch-aarch64-extend-wrap
Aug 14, 2024
Merged

Winch aarch64 extend, wrap, popcnt, promote & demote#9114
saulecabrera merged 14 commits intobytecodealliance:mainfrom
vulc41n:winch-aarch64-extend-wrap

Conversation

@vulc41n
Copy link
Copy Markdown
Contributor

@vulc41n vulc41n commented Aug 12, 2024

Hey 👋

This PR implements extend, wrap, popcnt, promote & demote instructions for winch targeting aarch64.

#8321

@vulc41n vulc41n requested review from a team as code owners August 12, 2024 16:28
@vulc41n vulc41n requested review from abrown and alexcrichton and removed request for a team August 12, 2024 16:28
@vulc41n vulc41n force-pushed the winch-aarch64-extend-wrap branch from 5a2afc0 to 638b4a5 Compare August 12, 2024 16:47
@saulecabrera
Copy link
Copy Markdown
Member

I can help reviewing this one.

@saulecabrera saulecabrera requested review from saulecabrera and removed request for abrown and alexcrichton August 13, 2024 11:52
Comment on lines +605 to +613
let (signed, from_bits, to_bits) = match kind {
I64ExtendI32S => (true, 32, 64),
I64ExtendI32U => (false, 32, 64),
I32Extend8S => (true, 8, 32),
I32Extend16S => (true, 16, 32),
I64Extend8S => (true, 8, 64),
I64Extend16S => (true, 16, 64),
I64Extend32S => (true, 32, 64),
};
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add some methods to the ExtendKind implementation instead? Something like:

impl ExtendKind {
   fn signed(&self) -> bool { ... }
   fn from_bits(&self) -> u8 { ... }
   fn to_bits(&self) -> u8 { ... }
}

todo!()
fn popcnt(&mut self, context: &mut CodeGenContext, size: OperandSize) {
let src = context.pop_to_reg(self, None);
let tmp = context.reg_for_class(RegClass::Float, self);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reduce register pressure, could we use the scratch float register here?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of aarch64 I believe that the fallback test is redundant i.e., since we are not generating an alternative sequence of instructions like in x64.

Comment on lines +439 to +440
OperandSize::S8 => {}
OperandSize::S16 => self.asm.addp_rrr(tmp, tmp, tmp, VectorSize::Size8x8),
Copy link
Copy Markdown
Member

@saulecabrera saulecabrera Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not wrong, Wasm only defines {i32, i64}_popcnt so I believe we only need to handle the S32 | S64 case and handle all the other operand sizes with unreachable!() instead of unimplemented, which also means that we can probably drop the addp_rrr implementation.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I used lower.isle as a reference but CLIF is different from WASM.

handle all the other operand sizes with unreachable!() instead of unimplemented

I just removed the match, please tell me if you think an assertion is needed

Copy link
Copy Markdown
Member

@saulecabrera saulecabrera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor nit and this should be good to go.

self.asm.addv(tmp, tmp, VectorSize::Size8x8);
self.asm.mov_from_vec(tmp, src.into(), 0, OperandSize::S8);
context.stack.push(src.into());
context.free_reg(tmp);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scratch register is manually tracked i.e., free_reg is a no-op in this case, so we can remove this call.

@vulc41n vulc41n force-pushed the winch-aarch64-extend-wrap branch from d896ad5 to f2e2425 Compare August 14, 2024 10:09
Copy link
Copy Markdown
Member

@saulecabrera saulecabrera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@saulecabrera saulecabrera added this pull request to the merge queue Aug 14, 2024
Merged via the queue into bytecodealliance:main with commit 3bd6708 Aug 14, 2024
@vulc41n vulc41n deleted the winch-aarch64-extend-wrap branch August 14, 2024 12:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants