Accelerate Vector512 ConvertToSingle and ConvertToInt32.#84932

Merged
tannergooding merged 2 commits into dotnet:main from anthonycanino:avx512-convert-to-single
Apr 20, 2023

@anthonycanino
Contributor

This PR accelerates ConvertToSingle and ConvertToInt32 for Vector512; both are already accelerated for Vector128 and Vector256. With AVX512, the remaining ConvertToX functions can also be accelerated for Vector128, Vector256, and Vector512, which will be done in a follow-up PR as an additional optimization.

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 17, 2023
@anthonycanino anthonycanino marked this pull request as ready for review April 17, 2023 17:58
@BruceForstall BruceForstall added the avx512 Related to the AVX-512 architecture label Apr 17, 2023
@BruceForstall
Contributor

cc @dotnet/jit-contrib @dotnet/avx512-contrib

@anthonycanino anthonycanino force-pushed the avx512-convert-to-single branch from 7b3c160 to 62ebb42 Compare April 18, 2023 22:04
Comment on lines +1012 to +1025
    switch (simdSize)
    {
        case 16:
            intrinsic = NI_SSE2_ConvertToVector128Int32WithTruncation;
            break;
        case 32:
            intrinsic = NI_AVX_ConvertToVector256Int32WithTruncation;
            break;
        case 64:
            intrinsic = NI_AVX512F_ConvertToVector512Int32WithTruncation;
            break;
        default:
            unreached();
    }
Member

nit: We might not want to switch on the size because we sometimes see simdSize == 8 and simdSize == 12 for some nodes.

I don't think we currently have any for ConvertTo, but it's probably better to be "safe" and do if (simdSize == 64) { } else if (simdSize == 32) { } else { } like the other paths are doing.
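
The difference between the two shapes can be sketched in isolation. This is a minimal, standalone illustration and not the JIT's actual code: the `Intrinsic` enum and both function names are hypothetical stand-ins for the `NI_*` identifiers in the diff, and the `default:` case returns `Illegal` here instead of calling `unreached()` so that the behavior can be observed without aborting.

```cpp
// Hypothetical stand-ins for the JIT's NI_* intrinsic ids (not the real definitions).
enum class Intrinsic
{
    Illegal,
    ConvertToVector128Int32WithTruncation,
    ConvertToVector256Int32WithTruncation,
    ConvertToVector512Int32WithTruncation,
};

// Switch-based selection, mirroring the shape of the diff under review.
// Any size other than 16, 32, or 64 lands in default; the diff calls
// unreached() there, which this sketch models by returning Illegal.
Intrinsic selectWithSwitch(unsigned simdSize)
{
    switch (simdSize)
    {
        case 16:
            return Intrinsic::ConvertToVector128Int32WithTruncation;
        case 32:
            return Intrinsic::ConvertToVector256Int32WithTruncation;
        case 64:
            return Intrinsic::ConvertToVector512Int32WithTruncation;
        default:
            return Intrinsic::Illegal;
    }
}

// Chain-based selection, following the shape suggested in the review.
// Every size that is not 64 or 32 (including 8 and 12) falls through
// to the 128-bit intrinsic.
Intrinsic selectWithChain(unsigned simdSize)
{
    if (simdSize == 64)
    {
        return Intrinsic::ConvertToVector512Int32WithTruncation;
    }
    else if (simdSize == 32)
    {
        return Intrinsic::ConvertToVector256Int32WithTruncation;
    }
    else
    {
        return Intrinsic::ConvertToVector128Int32WithTruncation;
    }
}
```

Both functions agree for sizes 16, 32, and 64. They differ only for the smaller sizes mentioned above: the chain maps 8 and 12 to the 128-bit intrinsic, while the switch treats them as unreachable.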

Contributor

@tannergooding - so for simdSize == 12 and simdSize == 8 we will end up with NI_SSE2_ConvertToVector128Int32WithTruncation which is ok?

@anthonycanino - can you address this?

Member

Right. We won't see it today, so it's not required to fix it "right now". But there are some optimizations we'd like to do in the future that would need it, so we'd want to at least fix it at that time.

Contributor

@kunalspathak kunalspathak left a comment

LGTM

@tannergooding
Member

Merging. @anthonycanino, if you could address the one minor piece of feedback in a follow-up PR, that'd be much appreciated!

@tannergooding tannergooding merged commit 9df61d9 into dotnet:main Apr 20, 2023
@ghost ghost locked as resolved and limited conversation to collaborators May 20, 2023