Vectorize Atanh for Vector64/128/256/512 and TensorPrimitives #126420
stephentoub wants to merge 2 commits into main
Conversation
Add vectorized Atanh implementations for float and double across all SIMD vector types.
- AtanhDouble: AMD atanh.c [5,5] rational polynomial + log1p identity for large values
- AtanhSingle: widens to double, uses AMD atanhf.c [2,2] rational polynomial
- Hook up TensorPrimitives.Atanh to use vectorized implementations

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
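For reference, the large-value path relies on the identity atanh(x) = ½·log((1+x)/(1−x)) = ½·log1p(2x/(1−x)). A hedged scalar sketch in C# of the two-path structure (the 0.5 cutoff and the helper name are illustrative, not the PR's exact code):

```csharp
using System;

static double AtanhSketch(double x)
{
    double ax = Math.Abs(x);

    // Small inputs: the real kernel evaluates a [5,5] rational polynomial
    // approximation here; Math.Atanh stands in for it in this sketch.
    if (ax < 0.5) // illustrative cutoff, not necessarily the kernel's
    {
        return Math.Atanh(x);
    }

    // Large inputs: atanh(x) = 0.5 * log1p(2|x| / (1 - |x|)),
    // with the sign of x restored at the end.
    double ratio = 2.0 * ax / (1.0 - ax);
    return Math.CopySign(0.5 * double.Log1P(ratio), x);
}
```

`double.Log1P` is the generic-math static available on `double` since .NET 7.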
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
Pull request overview
Adds SIMD-optimized Atanh for float/double across Vector64/128/256/512 and enables corresponding vectorization in TensorPrimitives.Atanh.
Changes:
- Exposes new `Vector64/128/256/512.Atanh` APIs (and updates the `System.Runtime.Intrinsics` ref surface).
- Implements the shared vectorized math kernels in `VectorMath` for `atanh` (double core + float via widen/narrow).
- Enables `TensorPrimitives.Atanh` vectorization for `float`/`double` (and adjusts test tolerances accordingly).
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/libraries/System.Runtime.Intrinsics/ref/System.Runtime.Intrinsics.cs | Adds Atanh to the public ref surface for Vector64/128/256/512. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/VectorMath.cs | Introduces vectorized AtanhDouble and AtanhSingle implementations used by all vector types. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector64.cs | Adds Vector64.Atanh (double/float) and scalar fallback helper. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector128.cs | Adds Vector128.Atanh (double/float), delegating to VectorMath or splitting to Vector64. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector256.cs | Adds Vector256.Atanh (double/float), delegating to VectorMath or splitting to Vector128. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector512.cs | Adds Vector512.Atanh (double/float), delegating to VectorMath or splitting to Vector256. |
| src/libraries/System.Numerics.Tensors/tests/TensorPrimitives.Generic.cs | Updates Atanh test tolerance to account for vectorized implementation accuracy. |
| src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.Atanh.cs | Enables vectorization for Atanh when T is float/double (NET11+), using the new vector APIs. |
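Once the new APIs ship, calling them is a one-liner. A usage sketch, assuming a runtime build that includes this PR (the input values are illustrative):

```csharp
using System.Runtime.Intrinsics;

Vector128<float> input = Vector128.Create(-0.5f, 0.0f, 0.5f, 0.9f);

// Element-wise inverse hyperbolic tangent; dispatches to the
// vectorized VectorMath kernel when hardware-accelerated.
Vector128<float> result = Vector128.Atanh(input);
```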
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector256.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector128.cs
Show resolved
Hide resolved
@copilot please evaluate and address the feedback
Code Review: Atanh Vectorization

❌ Bug: `Vector256.Atanh(float)` is missing the `Vector512.IsHardwareAccelerated` check that every other float implementation in `VectorMath` performs:
| Function | Line | Has Vector512.IsHardwareAccelerated check? |
|---|---|---|
| Asin | 840 | ✅ |
| Cos | 915 | ✅ |
| Exp | 1820 | ✅ |
| Hypot | 2072 | ✅ |
| Sin | 3873 | ✅ |
| SinCos | 3917 | ✅ |
| Atanh | 879 | ❌ Missing |
Note that the PR's own sibling methods Vector64.Atanh(float) (checks Vector128.IsHardwareAccelerated) and Vector128.Atanh(float) (checks Vector256.IsHardwareAccelerated) correctly follow the pattern — only the Vector256 case was missed.
Suggested fix: Add the guard and provide a Vector256<double> fallback, matching the Asin pattern directly above:
```csharp
if (IsHardwareAccelerated)
{
    if (Vector512.IsHardwareAccelerated)
    {
        return VectorMath.AtanhSingle<Vector256<float>, Vector256<int>, Vector256<uint>,
                                      Vector512<double>, Vector512<long>, Vector512<ulong>>(vector);
    }
    else
    {
        return VectorMath.AtanhSingle<Vector256<float>, Vector256<int>, Vector256<uint>,
                                      Vector256<double>, Vector256<long>, Vector256<ulong>>(vector);
    }
}
```

Everything else reviewed looks correct:
- ✅ `AtanhDouble`/`AtanhSingleCoreDouble` numerical formulas, sign/infinity/NaN handling, and `ConditionalSelect` ordering
- ✅ `log(1+ratio)` instead of `log1p(ratio)` is safe since the large path guarantees `1+ratio ≥ 3`
- ✅ Polynomial coefficients sourced from AMD AOCL-libm
- ✅ Ref assembly, Vector64/128/512 entry points, and test tolerance updates
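To see why the `log(1+ratio)` substitution is safe: assuming the large path covers |x| ≥ 0.5 (the usual AOCL-libm cutoff; an assumption here, not stated in the PR), `ratio = 2|x|/(1−|x|)` is increasing in |x|, so its minimum is at |x| = 0.5. A quick check of that minimum:

```csharp
using System;

// ratio(|x|) = 2|x| / (1 - |x|) is monotonically increasing on [0, 1),
// so over the assumed large path |x| >= 0.5 its minimum is at |x| = 0.5.
double ratio = 2.0 * 0.5 / (1.0 - 0.5); // 2
double arg = 1.0 + ratio;               // 3

// log1p only gains precision over log(1 + y) when 1 + y is near 1;
// at 3 and above, plain log is just as accurate.
Console.WriteLine($"1 + ratio >= {arg}");
```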
Generated by Code Review for issue #126420
…unit tests

Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/5d519063-12be-4ef4-b760-f38c1b0e57ef
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Addressed both feedback items in 046a5e3:
Add vectorized Atanh implementations for float and double across all SIMD vector types.
Description
- Exposes new `Vector64/128/256/512.Atanh` APIs and updates the `System.Runtime.Intrinsics` ref surface.
- Implements the shared vectorized math kernels in `VectorMath` for `atanh` (double core + float via widen/narrow).
- Enables `TensorPrimitives.Atanh` vectorization for `float`/`double` and adjusts test tolerances accordingly.
- Updates `Vector256.Atanh(float)` to branch on `Vector512.IsHardwareAccelerated` and fall back to `Vector256<double>` when Vector512 isn't available, matching the pattern used by `Asin`, `Cos`, `Exp`, and other float implementations.
- Extends `GenericMathTestMemberData` with `AtanhDouble` and `AtanhSingle` data sources covering standard mathematical constants.
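At the `TensorPrimitives` level the call site is unchanged; only the implementation becomes vectorized. A usage sketch with illustrative values:

```csharp
using System.Numerics.Tensors;

float[] x = { -0.5f, 0.0f, 0.5f };
float[] destination = new float[x.Length];

// Computes element-wise atanh over the span; with this PR it
// dispatches to the vectorized kernels when T is float or double.
TensorPrimitives.Atanh<float>(x, destination);
```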