Accelerate Vector512 ConvertToSingle and ConvertToInt32. #84932
tannergooding merged 2 commits into dotnet:main
cc @dotnet/jit-contrib @dotnet/avx512-contrib
    switch (simdSize)
    {
        case 16:
            intrinsic = NI_SSE2_ConvertToVector128Int32WithTruncation;
            break;
        case 32:
            intrinsic = NI_AVX_ConvertToVector256Int32WithTruncation;
            break;
        case 64:
            intrinsic = NI_AVX512F_ConvertToVector512Int32WithTruncation;
            break;
        default:
            unreached();
    }
nit: We might not want to switch on the size, because we sometimes see simdSize == 8 and simdSize == 12 for some nodes.
I don't think we currently have any for ConvertTo, but it's probably better to be "safe" and do `if (simdSize == 64) { } else if (simdSize == 32) { } else { }` like the other paths are doing.
@tannergooding - so for simdSize == 12 and simdSize == 8 we will end up with NI_SSE2_ConvertToVector128Int32WithTruncation which is ok?
@anthonycanino - can you address this?
Right. We won't see it today, so it's not required to fix it "right now". But there are some optimizations we'd like to do in the future that would need it, so we'd want to at least fix it at that time.
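For reference, the if/else chain suggested in this thread could look roughly like the sketch below. It is an illustration only, not the code from the follow-up change: the enum is a stand-in for the JIT's intrinsic IDs (only the three names from the diff are modeled), and `SelectConvertToInt32Intrinsic` is a hypothetical helper name invented for this example.

```cpp
#include <cassert>

// Hypothetical stand-in for the JIT's intrinsic IDs; only the three
// identifiers quoted in the diff are modeled here.
enum NamedIntrinsic
{
    NI_SSE2_ConvertToVector128Int32WithTruncation,
    NI_AVX_ConvertToVector256Int32WithTruncation,
    NI_AVX512F_ConvertToVector512Int32WithTruncation,
};

// Hypothetical helper (not a function in the runtime): selects the intrinsic
// with an if/else chain so every size below 32 maps to the 128-bit form.
NamedIntrinsic SelectConvertToInt32Intrinsic(unsigned simdSize)
{
    if (simdSize == 64)
    {
        return NI_AVX512F_ConvertToVector512Int32WithTruncation;
    }
    else if (simdSize == 32)
    {
        return NI_AVX_ConvertToVector256Int32WithTruncation;
    }
    else
    {
        // Covers 16 as well as the 8- and 12-byte sizes mentioned in the review.
        assert(simdSize == 8 || simdSize == 12 || simdSize == 16);
        return NI_SSE2_ConvertToVector128Int32WithTruncation;
    }
}
```

Unlike the switch, this shape does not hit `unreached()` for simdSize == 8 or simdSize == 12; those sizes fall into the final branch and get the Vector128 intrinsic, which is the behavior confirmed as acceptable above.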
Merging. @anthonycanino, if you could address the one minor feedback in a follow-up PR, that'd be much appreciated!
This PR accelerates ConvertToSingle and ConvertToInt32 for Vector512; both are already accelerated for Vector128 and Vector256. With AVX512, the remaining ConvertToX functions can be accelerated for Vector128, Vector256, and Vector512, which will be done in a follow-up PR as an additional optimization.