-
Notifications
You must be signed in to change notification settings - Fork 5.3k
[ARM] Use Math[F].Round implementation in managed code #97964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas Issue DetailsFixes #97922 I decided to go for managed implementation to make it trimmable.
|
|
Does this work with Scanner? Can the implementation just call Math.Round instead? I do not understand why we need these JIT intrinsics in the first place. They seem to be only used on Arm and only to morph System.Math.Round. What would break if these JIT intrinsics are deleted? cc @dotnet/jit-contrib |
|
ARM doesn't have native instructions to perform the rounding (unlike eg. ARM64). I didn't realize there is actually already a managed implementation in Math.cs. I suppose it can be used. |
Even if there were no managed impl, it could still just go the InternalCall path like most Math APIs. I wasn't able to track who and why added those, looks like it happened before the |
The |
Does doing fesetround(FE_TONEAREST);
return rintf(f);instead not work? |
|
|
It is likely a left over from the times before we had support for recognizing managed methods as JIT intrinsics where all JIT intrinsics had to be FCalls. |
| CORINFO_HELP_DBL2ULNG_OVF, | ||
| CORINFO_HELP_FLTREM, | ||
| CORINFO_HELP_DBLREM, | ||
| CORINFO_HELP_FLTROUND, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs new JIT/EE interface GUID
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I can remove it without removing the R2R helpers due to how it's mapped in the VM code. If the CI passes I can add comment for the next person to remove it.
|
Worth noting as well that However, I also agree that having this in managed is ultimately better. Our currently implementation is notably not super efficient and we could improve it, but that can be a separate work item. |
|
Build break during crossgen: |
This reverts commit c4c5b3f.
…nerates CORINFO_HELP_FLTROUND/CORINFO_HELP_DBLROUND
81e7f93 to
9278cfc
Compare
|
I think it is also safe to delete the references in |
Diff results for #97964Assembly diffsAssembly diffs for linux/arm64 ran on windows/x64Diffs are based on 134,774 contexts (18,569 MinOpts, 116,205 FullOpts). MISSED contexts: base: 1 (0.00%), diff: 111,892 (44.67%) Overall (-65,128 bytes)
MinOpts (-15,224 bytes)
FullOpts (-49,904 bytes)
Assembly diffs for linux/x64 ran on windows/x64Diffs are based on 136,302 contexts (18,734 MinOpts, 117,568 FullOpts). MISSED contexts: base: 2 (0.00%), diff: 116,512 (45.38%) Overall (-60,804 bytes)
MinOpts (-15,022 bytes)
FullOpts (-45,782 bytes)
Assembly diffs for osx/arm64 ran on windows/x64Diffs are based on 143,750 contexts (21,411 MinOpts, 122,339 FullOpts). MISSED contexts: base: 4 (0.00%), diff: 124,661 (45.84%) Overall (-40,932 bytes)
MinOpts (-17,532 bytes)
FullOpts (-23,400 bytes)
Assembly diffs for windows/arm64 ran on windows/x64Diffs are based on 132,266 contexts (15,289 MinOpts, 116,977 FullOpts). MISSED contexts: base: 3 (0.00%), diff: 115,590 (45.92%) Overall (-61,336 bytes)
MinOpts (-13,448 bytes)
FullOpts (-47,888 bytes)
Assembly diffs for windows/x64 ran on windows/x64Diffs are based on 135,677 contexts (21,464 MinOpts, 114,213 FullOpts). MISSED contexts: base: 1 (0.00%), diff: 120,821 (46.32%) Overall (-82,504 bytes)
MinOpts (-18,807 bytes)
FullOpts (-63,697 bytes)
Details here Assembly diffs for linux/arm ran on windows/x86Diffs are based on 104,802 contexts (10,727 MinOpts, 94,075 FullOpts). MISSED contexts: base: 826 (0.42%), diff: 90,575 (45.53%) Overall (-47,582 bytes)
MinOpts (-7,502 bytes)
FullOpts (-40,080 bytes)
Assembly diffs for windows/x86 ran on windows/x86Diffs are based on 113,735 contexts (7,938 MinOpts, 105,797 FullOpts). MISSED contexts: base: 47 (0.02%), diff: 100,776 (46.21%) Overall (-28,228 bytes)
MinOpts (-4,257 bytes)
FullOpts (-23,971 bytes)
Details here Throughput diffsThroughput diffs for linux/arm64 ran on linux/x64Overall (-2.07% to -0.04%)
MinOpts (-0.68% to +0.00%)
FullOpts (-2.07% to -0.03%)
Throughput diffs for linux/x64 ran on linux/x64Overall (-2.70% to -0.04%)
MinOpts (-0.69% to 0.00%)
FullOpts (-2.70% to -0.02%)
Throughput diffs for osx/arm64 ran on linux/x64Overall (-1.99% to -0.04%)
MinOpts (-0.70% to +0.00%)
FullOpts (-1.99% to -0.04%)
Throughput diffs for windows/arm64 ran on linux/x64Overall (-3.23% to -0.05%)
MinOpts (-0.69% to +0.00%)
FullOpts (-3.23% to -0.03%)
Throughput diffs for windows/x64 ran on linux/x64Overall (-6.05% to -0.46%)
MinOpts (-78.15% to 0.00%)
FullOpts (-6.05% to -0.20%)
Details here |
This will take me a little while to understand and resolve. |
|
@filipnavara I think you need to take out runtime/src/coreclr/jit/importercalls.cpp Lines 6884 to 6890 in bb2376c
|
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Diff results for #97964Throughput diffsThroughput diffs for linux/arm64 ran on linux/x64Overall (-5.00% to -0.15%)
MinOpts (-2.88% to 0.00%)
FullOpts (-5.00% to +0.00%)
Throughput diffs for linux/x64 ran on linux/x64Overall (-5.74% to -0.01%)
MinOpts (-0.67% to 0.00%)
FullOpts (-5.74% to +0.01%)
Details here Throughput diffs for linux/arm ran on windows/x86Overall (-2.94% to -0.03%)
MinOpts (-0.63% to 0.00%)
FullOpts (-2.94% to -0.01%)
Throughput diffs for windows/x86 ran on windows/x86Overall (-1.93% to -0.05%)
MinOpts (-0.56% to 0.00%)
FullOpts (-1.93% to -0.05%)
Details here |
jkotas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
Diff results for #97964Assembly diffsAssembly diffs for linux/arm ran on windows/x86Diffs are based on 1,450,643 contexts (345,734 MinOpts, 1,104,909 FullOpts). MISSED contexts: base: 53,862 (3.58%), diff: 54,343 (3.61%) Overall (-4,886 bytes)
FullOpts (-4,886 bytes)
Details here Throughput diffsThroughput diffs for linux/arm ran on windows/x86Overall (+0.03%)
MinOpts (+0.02% to +0.04%)
FullOpts (+0.03% to +0.04%)
Details here Throughput diffs for linux/arm64 ran on linux/x64Overall (+0.00% to +0.01%)
FullOpts (+0.00% to +0.01%)
Details here |
Diff results for #97964Assembly diffsAssembly diffs for linux/arm ran on windows/x86Diffs are based on 1,450,643 contexts (345,734 MinOpts, 1,104,909 FullOpts). MISSED contexts: base: 53,862 (3.58%), diff: 54,343 (3.61%) Overall (-4,886 bytes)
FullOpts (-4,886 bytes)
Details here Throughput diffsThroughput diffs for linux/arm ran on windows/x86Overall (+0.03%)
MinOpts (+0.02% to +0.04%)
FullOpts (+0.03% to +0.04%)
Details here Throughput diffs for linux/arm64 ran on linux/x64Overall (+0.00% to +0.01%)
FullOpts (+0.00% to +0.01%)
Details here |
Fixes #97922
I decided to go for managed implementation to make it trimmable.