Optimise 'Math.CopySign' and 'MathF.CopySign' #40782
john-h-k wants to merge 3 commits into dotnet:master
|
Thanks for the contribution @john-h-k. Could you reopen this targeting the files in dotnet/coreclr? The shared files generally need to be modified there first as they are built, tested, and shipped as part of System.Private.CoreLib. |
|
Let me know if you need any assistance retargeting this. |
|
I was also going to suggest a microbenchmark, but usage of these APIs seems very low (https://apisof.net/catalog/System.MathF.CopySign(Single,Single)) and the guidance at https://github.com/dotnet/performance/blob/master/docs/microbenchmark-design-guidelines.md points out that microbenchmarks are only for common cases. |
|
It's low because it was only exposed in .NET Core 3. It is used internally in a few places and is fairly perf critical for optimizing other math algorithms. |
|
Well @john-h-k, perhaps you might consider adding a benchmark (such as some form of the one you're using here) to https://github.com/dotnet/performance as well? There's excellent documentation there, and it's easy to use. That would protect your change. |
Optimise 'Math.CopySign' and 'MathF.CopySign'. Both of these methods can be improved over their current implementation. The new implementation uses SSE intrinsics, which are faster, and also has a faster intrinsic-free fallback.
The SSE pathway is branch free and, unlike the others, doesn't spill to the stack. The fallback is also branch free but does spill to the stack (the current implementation both spills and contains a branch). The new code is faster in every scenario, except that the non-SSE fallback is marginally (10%) slower on x64 in the Same scenario. However, Windows (and x64) requires SSE2, so it is unlikely that the fallback will ever be run on x86. I haven't profiled on ARM as I don't have access to an ARM system.
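To make the two pathways concrete, here is a minimal sketch of a branch-free single-precision copy-sign with an SSE path and an intrinsic-free fallback. It is not the code from this PR: the class name `CopySignSketch` and both method names are hypothetical, and the exact instruction sequence the JIT emits is an assumption.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical sketch, not the PR's implementation.
public static class CopySignSketch
{
    // Bit 31 of an IEEE 754 single is the sign bit.
    private const int SignMask = unchecked((int)0x80000000);

    // Intrinsic-free path: reinterpret both floats as integers, keep the
    // magnitude bits of x and the sign bit of y. No branch on the values.
    public static float CopySignBits(float x, float y)
    {
        int xbits = BitConverter.SingleToInt32Bits(x);
        int ybits = BitConverter.SingleToInt32Bits(y);
        return BitConverter.Int32BitsToSingle((xbits & ~SignMask) | (ybits & SignMask));
    }

    // SSE path: the same mask-and-combine, done on vector registers.
    // Sse.IsSupported is a JIT-time constant, so this check is not a
    // data-dependent branch.
    public static float CopySignSse(float x, float y)
    {
        if (!Sse.IsSupported)
        {
            return CopySignBits(x, y);
        }

        // -0.0f has only the sign bit set, so it serves as the mask.
        Vector128<float> mask = Vector128.CreateScalarUnsafe(-0.0f);
        Vector128<float> vx = Vector128.CreateScalarUnsafe(x);
        Vector128<float> vy = Vector128.CreateScalarUnsafe(y);

        // (y & mask) | (~mask & x); Sse.AndNot(a, b) computes ~a & b.
        Vector128<float> result = Sse.Or(Sse.And(vy, mask), Sse.AndNot(mask, vx));
        return result.ToScalar();
    }
}
```

For example, `CopySignSketch.CopySignBits(1f, -2f)` returns `-1f` and `CopySignSketch.CopySignBits(-3f, 2f)` returns `3f`. The double-precision variant would follow the same shape with 64-bit masks and the SSE2 intrinsics.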
Scenarios:
Same - every sign is the same (e.g. `x == 1f, y == 2f`)
Different - every sign is different (e.g. `x == 1f, y == -2f`)
Alternating - alternates between same and different
Random - randomly same or different
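The four scenarios could be turned into benchmark inputs along the following lines. This is only a sketch: the `SignScenario` enum, the `ScenarioData.Create` helper, the fixed magnitudes, and the seed are all assumptions for illustration, not the harness actually used for the measurements.

```csharp
using System;

// Hypothetical names; the PR does not show its benchmark harness.
public enum SignScenario { Same, Different, Alternating, Random }

public static class ScenarioData
{
    // Builds paired inputs where xs[i] is always positive and the sign of
    // ys[i] matches or differs from it according to the scenario.
    public static (float[] xs, float[] ys) Create(SignScenario scenario, int count, int seed = 42)
    {
        var rng = new Random(seed);
        var xs = new float[count];
        var ys = new float[count];

        for (int i = 0; i < count; i++)
        {
            xs[i] = 1f;

            bool same = scenario switch
            {
                SignScenario.Same => true,
                SignScenario.Different => false,
                SignScenario.Alternating => i % 2 == 0, // same, different, same, ...
                _ => rng.Next(2) == 0,                  // Random: coin flip per element
            };

            ys[i] = same ? 2f : -2f;
        }

        return (xs, ys);
    }
}
```

A benchmark would then loop over `xs` and `ys` and call `MathF.CopySign(xs[i], ys[i])`. The Alternating and Random cases are the ones that exercise branch prediction in an implementation that branches on the sign.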
Single precision (`MathF`):
Double precision (`Math`):