This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Optimise 'Math.CopySign' and 'MathF.CopySign' #40782

Closed
john-h-k wants to merge 3 commits into dotnet:master from john-h-k:copysign-opt

@john-h-k

john-h-k commented Sep 3, 2019

Optimise 'Math.CopySign' and 'MathF.CopySign'. Both methods can be made faster than their current implementation. The new implementation adds an SSE intrinsics path and also replaces the intrinsic-free fallback with a faster one.

The SSE path is branch-free and, unlike the other implementations, doesn't spill to the stack. The fallback is also branch-free but does spill to the stack (the current implementation both spills and contains a branch). The new code is faster in every scenario, with one exception: the non-SSE fallback is marginally slower (roughly 10 to 13%) on x64 in the Same scenario. However, Windows (and x64 in general) requires SSE2, so the fallback is unlikely ever to run on x86 hardware. I haven't profiled on ARM as I don't have access to an ARM system.
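The PR's diff is not part of this conversation, so the following is only a minimal sketch of the general sign-mask technique that a branch-free `CopySign` can use, not the author's actual code. The class and method names (`CopySignSketch`, `CopySignFallback`, `CopySignSse`) are hypothetical. The sketch assumes .NET Core 3.0 or later for `System.Runtime.Intrinsics`.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical illustration only; not the code from this pull request.
public static class CopySignSketch
{
    // Bit 31 is the IEEE 754 sign bit of a 32-bit float.
    private const int SignMask = unchecked((int)0x80000000);

    // Intrinsic-free, branch-free variant: reinterpret the floats as integers,
    // keep the magnitude bits of x and the sign bit of y.
    public static float CopySignFallback(float x, float y)
    {
        int xBits = BitConverter.SingleToInt32Bits(x);
        int yBits = BitConverter.SingleToInt32Bits(y);
        int result = (xBits & ~SignMask) | (yBits & SignMask);
        return BitConverter.Int32BitsToSingle(result);
    }

    // SSE variant: the same masking done with bitwise vector operations.
    // Falls back to the integer version when SSE is not available.
    public static float CopySignSse(float x, float y)
    {
        if (!Sse.IsSupported)
        {
            return CopySignFallback(x, y);
        }

        // -0.0f has only the sign bit set, so it serves as the mask.
        Vector128<float> mask = Vector128.CreateScalarUnsafe(BitConverter.Int32BitsToSingle(SignMask));
        Vector128<float> vx = Vector128.CreateScalarUnsafe(x);
        Vector128<float> vy = Vector128.CreateScalarUnsafe(y);

        Vector128<float> sign = Sse.And(vy, mask);          // sign bit of y
        Vector128<float> magnitude = Sse.AndNot(mask, vx);  // (~mask) & x
        return Sse.Or(sign, magnitude).ToScalar();
    }
}
```

Neither variant contains a data-dependent branch, which would explain why timings like those reported below barely change between the Same, Different, Alternating and Random scenarios, while a branching implementation suffers when the signs are unpredictable.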

Scenarios:
Same - every sign is the same (e.g x == 1f, y == 2f)
Different - every sign is different (e.g x == 1f, y == -2f)
Alternating - alternates between same and different
Random - randomly same or different
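The benchmark harness itself is not shown in this conversation. As a rough sketch of how inputs for the four scenarios could be generated, assuming `x` is held at `1f` and only the sign of `y` varies, something like the following would work. The `Scenario` enum, the `ScenarioData` class, and the fixed seed are all hypothetical.

```csharp
using System;

// Hypothetical names; the PR's actual benchmark code is not shown.
public enum Scenario { Same, Different, Alternating, Random }

public static class ScenarioData
{
    // Produces the y operands for a given scenario, assuming x is always positive.
    // A negative y means the signs of x and y differ for that element.
    public static float[] CreateY(Scenario scenario, int count, int seed = 42)
    {
        var rng = new Random(seed); // fixed seed so runs are repeatable
        var y = new float[count];

        for (int i = 0; i < count; i++)
        {
            bool negative = scenario switch
            {
                Scenario.Same => false,               // every sign matches x
                Scenario.Different => true,           // every sign differs from x
                Scenario.Alternating => (i & 1) == 1, // same, different, same, ...
                Scenario.Random => rng.Next(2) == 1,  // unpredictable for the branch predictor
                _ => throw new ArgumentOutOfRangeException(nameof(scenario)),
            };
            y[i] = negative ? -2f : 2f;
        }

        return y;
    }
}
```

The Random scenario is the interesting one for a branching implementation, since the branch predictor cannot learn the pattern.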

Single precision (MathF):

| Method | Scenario | Mean | Error | StdDev |
|---|---|---|---|---|
| Standard | Random | 181.14 us | 1.0300 us | 0.8601 us |
| John | Random | 47.49 us | 0.1613 us | 0.1347 us |
| John_Intrinsic | Random | 39.79 us | 0.4956 us | 0.4636 us |
| Standard | Same | 43.59 us | 0.2158 us | 0.2019 us |
| John | Same | 49.11 us | 0.6777 us | 0.6339 us |
| John_Intrinsic | Same | 39.74 us | 0.2084 us | 0.1949 us |
| Standard | Different | 56.04 us | 0.6402 us | 0.5988 us |
| John | Different | 49.58 us | 0.9119 us | 0.8529 us |
| John_Intrinsic | Different | 39.80 us | 0.2419 us | 0.1889 us |
| Standard | Alternating | 48.89 us | 0.1422 us | 0.1330 us |
| John | Alternating | 47.48 us | 0.0766 us | 0.0717 us |
| John_Intrinsic | Alternating | 39.05 us | 0.0289 us | 0.0226 us |

Double precision (Math):

| Method | Scenario | Mean | Error | StdDev |
|---|---|---|---|---|
| Standard | Random | 176.77 us | 0.3537 us | 0.2954 us |
| John | Random | 53.74 us | 0.1543 us | 0.1443 us |
| John_Intrinsic | Random | 41.22 us | 0.1321 us | 0.1236 us |
| Standard | Same | 48.64 us | 0.1322 us | 0.1237 us |
| John | Same | 53.85 us | 0.1499 us | 0.1402 us |
| John_Intrinsic | Same | 41.44 us | 0.2931 us | 0.2742 us |
| Standard | Different | 59.91 us | 0.0642 us | 0.0569 us |
| John | Different | 53.72 us | 0.1779 us | 0.1664 us |
| John_Intrinsic | Different | 41.09 us | 0.0346 us | 0.0289 us |
| Standard | Alternating | 54.06 us | 0.1293 us | 0.1210 us |
| John | Alternating | 53.84 us | 0.1772 us | 0.1658 us |
| John_Intrinsic | Alternating | 41.11 us | 0.0681 us | 0.0604 us |

@john-h-k
Author

john-h-k commented Sep 3, 2019

cc @tannergooding

@tannergooding
Member

Thanks for the contribution @john-h-k. Could you reopen this targeting the files in dotnet/coreclr?

The shared files generally need to be modified there first as they are built, tested, and shipped as part of System.Private.CoreLib.

@tannergooding
Member

Let me know if you need any assistance retargeting this.

@danmoseley
Member

I was also going to suggest adding a microbenchmark, but usage of these APIs seems very low (https://apisof.net/catalog/System.MathF.CopySign(Single,Single)), and the guidance at https://github.com/dotnet/performance/blob/master/docs/microbenchmark-design-guidelines.md points out that microbenchmarks are only for common cases.

@tannergooding
Member

It's low because it was only exposed in .NET Core 3. It is used internally in a few places and is fairly perf critical for optimizing other math algorithms.

@danmoseley
Member

Well, @john-h-k, perhaps you might consider adding a benchmark (such as some form of the one you're using here) to https://github.com/dotnet/performance as well? There's excellent documentation there, and it's easy to use. That would protect your change against regressions.
