
Vectorize Asinh for Vector64/128/256/512 and TensorPrimitives #126419

Open
stephentoub wants to merge 4 commits into main from copilot/vectorize-asinh

Conversation

@stephentoub
Member

@stephentoub stephentoub commented Apr 1, 2026

Description

Add vectorized Asinh implementations for float and double across all SIMD vector types, with comprehensive unit tests.

Implementation

  • Exposes new Vector64/128/256/512.Asinh(float|double) public APIs and wires them to VectorMath implementations when hardware-accelerated.
  • Implements vectorized asinh core logic in VectorMath for double and float (via widening to double and polynomial/range selection).
  • Enables TensorPrimitives.Asinh vectorization for float/double on NET11_0_OR_GREATER and adjusts test tolerances accordingly.
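The widen-to-double strategy for the single-precision path can be sketched in scalar Python (a sketch only; `to_f32` and `asinh_f32_via_f64` are hypothetical helper names, and `math.asinh` stands in for the polynomial core used in VectorMath):

```python
import math
import struct

def to_f32(x: float) -> float:
    """Round a Python double to the nearest float32 (the 'narrow' step)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def asinh_f32_via_f64(x32: float) -> float:
    """Widen to double, compute in double precision, narrow the result.
    Computing in double keeps the narrowed float32 result accurate."""
    return to_f32(math.asinh(x32))

assert asinh_f32_via_f64(to_f32(1.5)) == to_f32(math.asinh(1.5))
```

The vectorized code does the same lane-wise: a `Vector128<float>` is widened to two `Vector128<double>` halves, the double-precision core runs on each, and the results are narrowed back.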

Tests

  • Added AsinhDouble and AsinhSingle test data to GenericMathTestMemberData with 31 test cases each, covering special values (±0, ±∞, NaN) and representative mathematical constants (±π, ±e, ±ln(10), ±π/2, ±log₂(e), ±√2, ±1, ±π/4, ±1/√2, ±ln(2), ±2/π, ±log₁₀(e), ±1/π) with appropriate variance tolerances.
  • Added AsinhDoubleTest and AsinhSingleTest methods to Vector64Tests, Vector128Tests, Vector256Tests, and Vector512Tests, following the same [Theory]/[MemberData] pattern as existing Asin tests.

Add vectorized Asinh implementations for float and double across all SIMD vector types.
- AsinhDouble: uses log identity (log(|x| + sqrt(x^2 + 1))) for vectorization
- AsinhSingle: widens to double, uses AMD asinhf.c polynomial approximation
- Hook up TensorPrimitives.Asinh to use vectorized implementations
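In scalar form, the AsinhDouble log-identity strategy reduces to the following (a sketch, not the actual VectorMath code; the vectorized version applies the same steps lane-wise with masks):

```python
import math

def asinh_via_log(x: float) -> float:
    """asinh(x) = sign(x) * log(|x| + sqrt(x^2 + 1))."""
    ax = abs(x)
    return math.copysign(math.log(ax + math.sqrt(ax * ax + 1.0)), x)

# Agrees with the accurate library asinh across sign and magnitude:
for v in (0.5, -2.0, 100.0, -1e6):
    assert math.isclose(asinh_via_log(v), math.asinh(v), rel_tol=1e-12)
```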

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 1, 2026 18:10
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

Adds vectorized Asinh implementations for float/double across all SIMD vector widths and enables vectorization for TensorPrimitives.Asinh.

Changes:

  • Exposes new Vector64/128/256/512.Asinh(float|double) public APIs and wires them to VectorMath implementations when hardware-accelerated.
  • Implements vectorized asinh core logic in VectorMath for double and float (via widening to double and polynomial/range selection).
  • Enables TensorPrimitives.Asinh vectorization for float/double on NET11_0_OR_GREATER and adjusts test tolerances accordingly.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/libraries/System.Runtime.Intrinsics/ref/System.Runtime.Intrinsics.cs Adds Asinh API surface to the reference assembly for Vector64/128/256/512.
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/VectorMath.cs Introduces the vectorized AsinhDouble/AsinhSingle implementations used by the SIMD APIs.
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector64.cs Adds Vector64.Asinh public APIs plus scalar fallback helper.
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector128.cs Adds Vector128.Asinh public APIs with accelerated and fallback paths.
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector256.cs Adds Vector256.Asinh public APIs with accelerated and fallback paths.
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector512.cs Adds Vector512.Asinh public APIs with accelerated and fallback paths.
src/libraries/System.Numerics.Tensors/tests/TensorPrimitives.Generic.cs Updates Asinh test entry to use explicit tolerances for float/double.
src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.Asinh.cs Enables vectorization for TensorPrimitives.Asinh and adds vector invoke implementations.

@stephentoub
Member Author

@copilot please evaluate and address the feedback


Copilot AI and others added 2 commits April 1, 2026 19:10
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/75feae7d-528c-4fd6-aee8-88ccfa9c7841

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/75feae7d-528c-4fd6-aee8-88ccfa9c7841

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 1, 2026 19:11
Contributor

Copilot AI commented Apr 1, 2026

@copilot please evaluate and address the feedback

Added AsinhDouble and AsinhSingle test data to GenericMathTestMemberData.cs with representative inputs (±0, ±∞, NaN, and mathematical constants like pi, e, ln(2), sqrt(2), etc.) covering small, medium, and large magnitudes. Added corresponding AsinhDoubleTest and AsinhSingleTest methods to all four Vector test files (Vector64, Vector128, Vector256, Vector512), following the same pattern as existing Asin tests. Commit e63184e.

Note: Runtime test execution is blocked by a pre-existing SIGABRT crash in AppContext.Setup that affects all tests (including existing Asin tests), but the new tests compile successfully and follow the established patterns.

…t coverage

- Fix redundant 'absx * absx' computation in AsinhSingleCoreDouble by
  reusing the existing 'x2' local variable.
- Add test cases covering all code paths and thresholds:
  - Subnormals (5E-324 double, float.Epsilon)
  - Near-tiny values (1E-10 double, 1E-06f float)
  - AsinhSingleCoreDouble range B: 2 < |x| <= 4 (3.0f, 4.0f)
  - Log path: 4 < |x| <= 4096 (10.0, 100.0, 4096.0)
  - Large path: |x| > 4096 (10000.0f, 1E+10f)
  - AsinhDouble large threshold at 2^28 (268435456.0)
  - Very large values (1E+10 double)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 2, 2026 19:11
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.

@github-actions
Contributor

github-actions Bot commented Apr 2, 2026

🤖 Copilot Code Review — PR #126419

Note

This review was generated by GitHub Copilot using claude-opus-4.6 and gpt-5.4 models.

Holistic Assessment

Motivation: Vectorizing Asinh across SIMD vector types is a natural continuation of the existing vectorized math pattern (Asin, Cos, Sin, etc.) and enables TensorPrimitives.Asinh to use hardware acceleration. The motivation is clear and well-justified.

Approach: The PR uses two distinct strategies: a naive log-identity formula for double precision (log(|x| + sqrt(x²+1))) and AMD AOCL-LibM polynomial approximations (widened to double) for single precision. Both approaches are reasonable trade-offs for SIMD vectorization. The structure matches existing sibling implementations closely.

Summary: ⚠️ Needs Human Review. The implementation appears functionally correct — special values (NaN, ±∞, ±0, subnormals) are handled properly, the polynomial coefficients are sourced from AMD's validated AOCL library, and the code follows established patterns. However, I have two unresolved concerns for human judgment: (1) I could not verify API approval for the 8 new public APIs, and (2) there is a precision gap in the double path for intermediate values that the test data does not exercise.


Detailed Findings

⚠️ API Approval — Could not verify api-approved issue

This PR adds 8 new public static methods to the ref assembly (Vector64/128/256/512.Asinh for both float and double). Per dotnet/runtime policy, new public APIs require a linked issue with the api-approved label. No issue reference was found in the commit messages or PR description.

Mitigating factor: The PR author is @stephentoub (a principal maintainer who participates in API reviews), and these APIs follow the exact same pattern as the previously-approved Asin APIs. It's possible the approval was handled as part of a batch Vector math API approval. A human reviewer should confirm this.

⚠️ Precision — AsinhDouble uses simpler formula than AsinhSingleCoreDouble

The double-precision path (AsinhDouble) uses the mathematical identity log(|x| + sqrt(x²+1)) directly, while the single-precision path (AsinhSingleCoreDouble) uses range-based Sollya-generated polynomial approximations for |x| ≤ 4.0 and falls back to the log formula only for larger values.

I investigated whether the naive formula suffers from precision loss for intermediate values. The LogDouble implementation internally decomposes its argument into 1 + f (where f ∈ [-1/3, +1/3]) and uses a 20-degree polynomial approximation, which effectively implements log1p — this mitigates the worst cancellation concerns. However, for values just above the tiny threshold (|x| ≈ 2^-25 ≈ 3×10⁻⁸), the intermediate computation |x| + sqrt(x²+1) ≈ 1 + 3×10⁻⁸ causes the subtraction m - 1 inside LogDouble's range reduction to lose ~25 bits, leaving only ~28 bits of precision (~8.4 decimal digits vs. double's 15.9).

The current test data jumps from the tiny path (1E-10 → returns x directly) to 0.318 (1/π), so the formula is not exercised in the accuracy-sensitive range [2^-25, ~0.1]. Consider adding a few test values in this range (e.g., 0.001, 0.01, 0.1) to validate the actual precision.

This may be an acceptable trade-off for a vectorized implementation (the comment acknowledges the design choice), but a human reviewer should confirm the precision is sufficient for the double-precision path.
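The cancellation described above can be reproduced in scalar Python (a sketch of the numeric effect, not the VectorMath code; `math.asinh` serves as the accurate reference):

```python
import math

# For |x| just above 2^-25, the argument of log is 1 + tiny, so a log
# implementation's internal m - 1 range reduction cancels.
x = 2.0 ** -25
naive = math.log(x + math.sqrt(x * x + 1.0))
rel_err = abs(naive - math.asinh(x)) / math.asinh(x)
assert rel_err < 1e-6  # loose bound; accuracy is limited by rounding of 1 + tiny

# The standard log1p rewriting avoids forming 1 + tiny explicitly:
# asinh(x) = log1p(x + x^2 / (1 + sqrt(1 + x^2)))
safe = math.log1p(x + x * x / (1.0 + math.sqrt(1.0 + x * x)))
assert math.isclose(safe, math.asinh(x), rel_tol=1e-14)
```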

⚠️ Test Gap — No double test cases between 1E-10 and 0.318

The AsinhDouble test data covers:

  • Tiny/subnormal values (5E-324, 1E-10) → these take the return x path
  • Values ≥ 0.318 (1/π) → these are well within the accurate range of the log formula

But there are no test cases for |x| in the range (2^-25, 0.3) — exactly where the naive formula's precision is most stressed. Adding values like 0.001 (expected: ≈0.000999999833...) and 0.01 (expected: ≈0.009999833...) would exercise this code path and validate the actual precision achieved. The AsinhSingle test data similarly has a gap between 1E-06 and 0.318, though this matters less since the single path uses polynomial approximations.
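The quoted reference values can be checked against the small-argument Maclaurin series asinh(x) = x − x³/6 + 3x⁵/40 − … (a sketch for generating expected values, not part of the PR's test data):

```python
import math

def asinh_series(x: float) -> float:
    """Truncated Maclaurin series; adequate as a reference for |x| << 1."""
    return x - x**3 / 6.0 + 3.0 * x**5 / 40.0

# The expected values quoted above for the proposed test inputs:
assert math.isclose(asinh_series(0.001), 0.000999999833, rel_tol=1e-8)
assert math.isclose(asinh_series(0.01), 0.009999833, rel_tol=1e-7)
# At 0.1 the truncated series still agrees with the library to ~5e-8:
assert math.isclose(asinh_series(0.1), math.asinh(0.1), rel_tol=1e-5)
```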

💡 Unused Type Parameters — TVectorInt32 and TVectorUInt32 in AsinhSingle

AsinhSingle declares 6 type parameters (TVectorSingle, TVectorInt32, TVectorUInt32, TVectorDouble, TVectorInt64, TVectorUInt64) but only uses 4 in its body (TVectorSingle, TVectorDouble, TVectorInt64, TVectorUInt64). TVectorInt32 and TVectorUInt32 are passed from callers but never referenced.

However, I verified that the existing AsinSingle also declares an unused TVectorInt32, so this appears to be part of the established convention (maintaining a consistent type parameter signature across vector math methods for future extensibility). No action needed, but worth noting.

💡 Minor Inconsistency — < vs <= in tiny threshold checks

AsinhDouble uses LessThanOrEqual for its tiny check while AsinhSingle uses LessThan (strict). This is functionally negligible (the boundary values round to the same result either way), but the inconsistency is worth noting for style consistency.

✅ Special Value Handling — Correct

Verified all special cases:

  • NaN: Propagates correctly through all paths (NaN comparisons yield false, so neither tiny nor large masks trigger; NaN flows through the computation and is returned)
  • ±∞: Large mask triggers → LN2 + log(∞) = ∞ → sign restored via XOR ✓
  • ±0: Tiny mask triggers → returns |0| = 0 → sign restored via XOR → ±0
  • Subnormals: Below tiny threshold → returns x directly ✓
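The special-value behavior summarized above can be spot-checked with a scalar version of the log identity (a sketch; the SIMD code achieves the same results via masks and a sign XOR rather than `copysign`):

```python
import math

def asinh_via_log(x: float) -> float:
    ax = abs(x)
    return math.copysign(math.log(ax + math.sqrt(ax * ax + 1.0)), x)

assert math.isnan(asinh_via_log(float("nan")))          # NaN propagates
assert asinh_via_log(math.inf) == math.inf              # +inf -> +inf
assert asinh_via_log(-math.inf) == -math.inf            # sign restored
assert math.copysign(1.0, asinh_via_log(-0.0)) == -1.0  # -0.0 -> -0.0
assert asinh_via_log(0.0) == 0.0                        # +0.0 -> +0.0
```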

✅ Pattern Consistency — Follows established conventions

The implementation correctly follows the established Vector math patterns:

  • Vector128.Asinh has full XML doc; Vector256/512 use inheritdoc — matches Asin pattern
  • The widening strategy (float→double→compute→narrow) matches AsinSingle
  • The fallback chain (Vector512 → Vector256 → Vector128 → Vector64 → scalar loop) is correct
  • TensorPrimitives.AsinhOperator correctly uses #if NET11_0_OR_GREATER guards with Debug.Assert(typeof(T) == typeof(float)) in the else branch — matches existing operators

✅ Ref Assembly — Correctly placed

The 8 new API entries are placed after Asin and before AsByte in all four vector types, consistent with the existing ordering convention.

✅ TensorPrimitives — Tolerance correctly added

The TensorPrimitives.Generic.cs change correctly adds tolerance to the vectorized Asinh test (doubleTolerance: 1e-14, floatTolerance: 1e-6f), acknowledging that the vectorized implementation may have slightly different rounding than the scalar path.

✅ Polynomial Correctness — AsinhSingleCoreDouble coefficients verified

The polynomial coefficients in AsinhSingleCoreDouble are sourced from AMD's AOCL-LibM asinhf.c (Sollya-generated), with hex float comments that match the decimal values. The rational polynomial form x + x³ * P(x²)/Q(x²) correctly computes asinh(x) = x + x³/6 - ... for small arguments, avoiding the cancellation issue entirely.
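The structural point — carrying the leading x term exactly and applying a polynomial only to the x³ correction — can be illustrated with Maclaurin coefficients standing in for AMD's minimax coefficients (hypothetical stand-ins, NOT the AOCL values):

```python
import math

def asinh_poly_form(x: float) -> float:
    """x + x^3 * R(x^2): the exact leading term avoids cancellation for
    small arguments. R here uses Maclaurin coefficients, not the
    Sollya-generated minimax coefficients from AOCL-LibM asinhf.c."""
    t = x * x
    r = -1.0 / 6.0 + t * (3.0 / 40.0 - t * (15.0 / 336.0))
    return x + x * t * r

# Accurate to ~1e-12 relative error for small arguments:
for v in (1e-4, 0.01, 0.05):
    assert math.isclose(asinh_poly_form(v), math.asinh(v), rel_tol=1e-9)
```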

Generated by Code Review for issue #126419
