add the llvm.x86.sse42.crc32.32.32 intrinsic #1488
    CInlineAsmOperand::InOut {
        reg: InlineAsmRegOrRegClass::Reg(InlineAsmReg::X86(X86InlineAsmReg::ax)),
        _late: true,
        in_value: crc,
        out_place: Some(ret),
    },
    CInlineAsmOperand::In {
        reg: InlineAsmRegOrRegClass::Reg(InlineAsmReg::X86(X86InlineAsmReg::dx)),
        value: v,
    },
Is there any particular reason to choose these register names here? This seemed to be the ordering used in some of the other examples.
There is no particular reason for the choice of registers.
bjorn3 left a comment
> I can add the other variations of this instruction (having a differently-sized v argument) but those are not used by zlib-rs so some dedicated testing would presumably be required for them?

Running the stdarch test suite should cover this. I can run it for you if you want to add the other variants, but implementing them as needed is fine too.

> how should this be tested? I exercised the instruction by compiling and running (part of) the test suite of zlib-rs, is that sufficient?

example/std_example.rs has various SIMD tests. You can copy the test from stdarch there and call it as a regular function in test_simd.
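For local testing outside of zlib-rs, one option is to compare the hardware result against a bitwise software model of the same CRC32-C step. The following is only a sketch under assumptions: the helper names (`crc32c_u8_soft`, `crc32c_soft`, `crc32c_u8_hw`) are hypothetical and not part of this PR or of stdarch, and it only covers the `u8` variant. It relies on the SSE4.2 `crc32` instruction using the Castagnoli polynomial (0x82F63B78 in reflected form) without any initial or final inversion.

```rust
/// Software model of a single CRC32-C update step for one byte
/// (reflected Castagnoli polynomial, no init/final inversion).
fn crc32c_u8_soft(crc: u32, v: u8) -> u32 {
    let mut crc = crc ^ u32::from(v);
    for _ in 0..8 {
        crc = if crc & 1 != 0 { (crc >> 1) ^ 0x82F6_3B78 } else { crc >> 1 };
    }
    crc
}

/// Full CRC-32C of a buffer: start from all ones, invert at the end.
fn crc32c_soft(data: &[u8]) -> u32 {
    !data.iter().fold(!0u32, |crc, &b| crc32c_u8_soft(crc, b))
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse4.2")]
unsafe fn crc32c_u8_hw_impl(crc: u32, v: u8) -> u32 {
    std::arch::x86_64::_mm_crc32_u8(crc, v)
}

/// Hardware result, or `None` when not on x86_64 or SSE4.2 is unavailable.
fn crc32c_u8_hw(crc: u32, v: u8) -> Option<u32> {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse4.2") {
            // Sound: the required target feature was detected at runtime.
            return Some(unsafe { crc32c_u8_hw_impl(crc, v) });
        }
    }
    let _ = (crc, v);
    None
}
```

Comparing `crc32c_u8_hw(crc, v)` with `crc32c_u8_soft(crc, v)` for a handful of inputs exercises the intrinsic without depending on hard-coded expected values; the wider variants would presumably be modelled by feeding the bytes of `v` in little-endian order.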
Co-authored-by: bjorn3 <17426603+bjorn3@users.noreply.github.com>
Thanks!
That testing approach does not quite work because so then I guess local testing just has to suffice (for now)? For future reference, here is the test:
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse4.2")]
unsafe fn test_crc32() {
assert!(is_x86_feature_detected!("sse4.2"));
let a = 42u32;
let b = 0xdeadbeefu64;
assert_eq!(_mm_crc32_u8(a, b as u8), 4135334616);
assert_eq!(_mm_crc32_u16(a, b as u16), 1200687288);
assert_eq!(_mm_crc32_u32(a, b as u32), 2543798776);
assert_eq!(_mm_crc32_u64(a as u64, b as u64), 241952147);
}
Adding
part of #1487
how should this be tested? I exercised the instruction by compiling and running (part of) the test suite of zlib-rs, is that sufficient?
I can add the other variations of this instruction (having a differently-sized v argument) but those are not used by zlib-rs so some dedicated testing would presumably be required for them?