Add SIMD acceleration for Matrix4x4.Invert #34394#36323
Add SIMD acceleration for Matrix4x4.Invert #34394#36323tannergooding merged 11 commits intodotnet:masterfrom eanova:SIMDMatrix4x4
Conversation
Fix for #34394. Added SIMD hardware acceleration support to the Matrix4x4.Invert function.
Added the link to Microsoft/DirectXMath source code and appended license to THIRD-PARTY-NOTICES.TXT
Given a Matrix4x4 of only rank 3 test to see the matrix is non-invertable.
src/libraries/System.Private.CoreLib/src/System/Numerics/Matrix4x4.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Numerics/Matrix4x4.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Numerics/Matrix4x4.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Numerics/Matrix4x4.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Numerics/Matrix4x4.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Numerics/Matrix4x4.cs
Outdated
Show resolved
Hide resolved
|
Overall, LGTM. I just had a few code cleanup and codegen quality nits. |
Update containing all suggested fixes.
src/libraries/System.Private.CoreLib/src/System/Numerics/Matrix4x4.cs
Outdated
Show resolved
Hide resolved
Added missing using statement for Internal.Runtime.CompilerServices.Unknown static object.
| } | ||
|
|
||
| // Create Vector128<float> copy of the determinant and invert them. | ||
| Vector128<float> ones = Vector128.Create(1.0f, 1.0f, 1.0f, 1.0f); |
There was a problem hiding this comment.
| Vector128<float> ones = Vector128.Create(1.0f, 1.0f, 1.0f, 1.0f); | |
| Vector128<float> ones = Vector128.Create(1.0f); |
src/libraries/System.Private.CoreLib/src/System/Numerics/Matrix4x4.cs
Outdated
Show resolved
Hide resolved
BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.836 (1909/November2018Update/19H2)
Intel Core i9-9900K CPU 3.60GHz (Coffee Lake), 1 CPU, 16 logical and 8 physical cores
.NET Core SDK=5.0.100-preview.3.20216.6
[Host] : .NET Core 5.0.0 (CoreCLR 5.0.20.21406, CoreFX 5.0.20.21406), X64 RyuJIT
DefaultJob : .NET Core 5.0.0 (CoreCLR 5.0.20.21406, CoreFX 5.0.20.21406), X64 RyuJIT
These are the results corresponding to the current commit, which uses the CompilerServices Unknown class to store the final output using SSE Store intrinsics. I followed the the benchmarking guidelines in the docs to generate the tests. |
|
|
||
|
|
||
|
|
||
| private static unsafe bool SseImpl(Matrix4x4 matrix, out Matrix4x4 result) |
There was a problem hiding this comment.
This and the other should still be made static local functions as I called out here: #36323 (comment)
Otherwise they may conflict with other methods we also accelerate.
There was a problem hiding this comment.
Everything thing else LGTM and I believe it is otherwise ready to merge
There was a problem hiding this comment.
Sorry, wasn't familiar with local functions. I thought you were pointing out the public modifier instead. Fixed!
There was a problem hiding this comment.
No worries, I should have clarified 😄
Moved the SSE implementation and SoftwareFallback to local functions of Invert.
|
CC. @pgovind if you want to give this a pair of eyes before it gets merged |
|
Thanks for all your help Tanner. Really appreciate it! |
Fix for #34394.
Added SIMD hardware acceleration support to the Matrix4x4.Invert function.