
Vectorize Asinh for Vector64/128/256/512 and TensorPrimitives #126419

Open
stephentoub wants to merge 4 commits into main from copilot/vectorize-asinh

Conversation

@stephentoub
Member

@stephentoub stephentoub commented Apr 1, 2026

Description

Add vectorized Asinh implementations for float and double across all SIMD vector types, with comprehensive unit tests.

Implementation

  • Exposes new Vector64/128/256/512.Asinh(float|double) public APIs and wires them to VectorMath implementations when hardware-accelerated.
  • Implements vectorized asinh core logic in VectorMath for double and float (via widening to double and polynomial/range selection).
  • Enables TensorPrimitives.Asinh vectorization for float/double on NET11_0_OR_GREATER and adjusts test tolerances accordingly.
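The widen-to-double strategy for the single-precision path can be sketched in scalar Python (a sketch only; `to_f32` and `asinh_f32_via_f64` are hypothetical helper names, and `math.asinh` stands in for the polynomial core used in VectorMath):

```python
import math
import struct

def to_f32(x: float) -> float:
    """Round a Python double to the nearest float32 (the 'narrow' step)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def asinh_f32_via_f64(x32: float) -> float:
    """Widen to double, compute in double precision, narrow the result.
    Computing in double keeps the narrowed float32 result accurate."""
    return to_f32(math.asinh(x32))

assert asinh_f32_via_f64(to_f32(1.5)) == to_f32(math.asinh(1.5))
```

The vectorized code does the same lane-wise: a `Vector128<float>` is widened to two `Vector128<double>` halves, the double-precision core runs on each, and the results are narrowed back.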

Tests

  • Added AsinhDouble and AsinhSingle test data to GenericMathTestMemberData with 31 test cases each, covering special values (±0, ±∞, NaN) and representative mathematical constants (±π, ±e, ±ln(10), ±π/2, ±log₂(e), ±√2, ±1, ±π/4, ±1/√2, ±ln(2), ±2/π, ±log₁₀(e), ±1/π) with appropriate variance tolerances.
  • Added AsinhDoubleTest and AsinhSingleTest methods to Vector64Tests, Vector128Tests, Vector256Tests, and Vector512Tests, following the same [Theory]/[MemberData] pattern as existing Asin tests.

Add vectorized Asinh implementations for float and double across all SIMD vector types.
- AsinhDouble: uses log identity (log(|x| + sqrt(x^2 + 1))) for vectorization
- AsinhSingle: widens to double, uses AMD asinhf.c polynomial approximation
- Hook up TensorPrimitives.Asinh to use vectorized implementations
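In scalar form, the AsinhDouble log-identity strategy reduces to the following (a sketch, not the actual VectorMath code; the vectorized version applies the same steps lane-wise with masks):

```python
import math

def asinh_via_log(x: float) -> float:
    """asinh(x) = sign(x) * log(|x| + sqrt(x^2 + 1))."""
    ax = abs(x)
    return math.copysign(math.log(ax + math.sqrt(ax * ax + 1.0)), x)

# Agrees with the accurate library asinh across sign and magnitude:
for v in (0.5, -2.0, 100.0, -1e6):
    assert math.isclose(asinh_via_log(v), math.asinh(v), rel_tol=1e-12)
```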

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 1, 2026 18:10
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

Adds vectorized Asinh implementations for float/double across all SIMD vector widths and enables vectorization for TensorPrimitives.Asinh.

Changes:

  • Exposes new Vector64/128/256/512.Asinh(float|double) public APIs and wires them to VectorMath implementations when hardware-accelerated.
  • Implements vectorized asinh core logic in VectorMath for double and float (via widening to double and polynomial/range selection).
  • Enables TensorPrimitives.Asinh vectorization for float/double on NET11_0_OR_GREATER and adjusts test tolerances accordingly.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/libraries/System.Runtime.Intrinsics/ref/System.Runtime.Intrinsics.cs Adds Asinh API surface to the reference assembly for Vector64/128/256/512.
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/VectorMath.cs Introduces the vectorized AsinhDouble/AsinhSingle implementations used by the SIMD APIs.
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector64.cs Adds Vector64.Asinh public APIs plus scalar fallback helper.
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector128.cs Adds Vector128.Asinh public APIs with accelerated and fallback paths.
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector256.cs Adds Vector256.Asinh public APIs with accelerated and fallback paths.
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector512.cs Adds Vector512.Asinh public APIs with accelerated and fallback paths.
src/libraries/System.Numerics.Tensors/tests/TensorPrimitives.Generic.cs Updates Asinh test entry to use explicit tolerances for float/double.
src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.Asinh.cs Enables vectorization for TensorPrimitives.Asinh and adds vector invoke implementations.

@stephentoub
Member Author

@copilot please evaluate and address the feedback


Copilot AI and others added 2 commits April 1, 2026 19:10
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/75feae7d-528c-4fd6-aee8-88ccfa9c7841

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/75feae7d-528c-4fd6-aee8-88ccfa9c7841

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 1, 2026 19:11
Contributor

Copilot AI commented Apr 1, 2026

@copilot please evaluate and address the feedback

Added AsinhDouble and AsinhSingle test data to GenericMathTestMemberData.cs with representative inputs (±0, ±∞, NaN, and mathematical constants like pi, e, ln(2), sqrt(2), etc.) covering small, medium, and large magnitudes. Added corresponding AsinhDoubleTest and AsinhSingleTest methods to all four Vector test files (Vector64, Vector128, Vector256, Vector512), following the same pattern as existing Asin tests. Commit e63184e.

Note: Runtime test execution is blocked by a pre-existing SIGABRT crash in AppContext.Setup that affects all tests (including existing Asin tests), but the new tests compile successfully and follow the established patterns.

…t coverage

- Fix redundant 'absx * absx' computation in AsinhSingleCoreDouble by
  reusing the existing 'x2' local variable.
- Add test cases covering all code paths and thresholds:
  - Subnormals (5E-324 double, float.Epsilon)
  - Near-tiny values (1E-10 double, 1E-06f float)
  - AsinhSingleCoreDouble range B: 2 < |x| <= 4 (3.0f, 4.0f)
  - Log path: 4 < |x| <= 4096 (10.0, 100.0, 4096.0)
  - Large path: |x| > 4096 (10000.0f, 1E+10f)
  - AsinhDouble large threshold at 2^28 (268435456.0)
  - Very large values (1E+10 double)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 2, 2026 19:11
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.

@github-actions
Contributor

github-actions Bot commented Apr 2, 2026

🤖 Copilot Code Review — PR #126419

Note

This review was generated by GitHub Copilot using claude-opus-4.6 and gpt-5.4 models.

Holistic Assessment

Motivation: Vectorizing Asinh across SIMD vector types is a natural continuation of the existing vectorized math pattern (Asin, Cos, Sin, etc.) and enables TensorPrimitives.Asinh to use hardware acceleration. The motivation is clear and well-justified.

Approach: The PR uses two distinct strategies: a naive log-identity formula for double precision (log(|x| + sqrt(x²+1))) and AMD AOCL-LibM polynomial approximations (widened to double) for single precision. Both approaches are reasonable trade-offs for SIMD vectorization. The structure matches existing sibling implementations closely.

Summary: ⚠️ Needs Human Review. The implementation appears functionally correct — special values (NaN, ±∞, ±0, subnormals) are handled properly, the polynomial coefficients are sourced from AMD's validated AOCL library, and the code follows established patterns. However, I have two unresolved concerns for human judgment: (1) I could not verify API approval for the 8 new public APIs, and (2) there is a precision gap in the double path for intermediate values that the test data does not exercise.


Detailed Findings

⚠️ API Approval — Could not verify api-approved issue

This PR adds 8 new public static methods to the ref assembly (Vector64/128/256/512.Asinh for both float and double). Per dotnet/runtime policy, new public APIs require a linked issue with the api-approved label. No issue reference was found in the commit messages or PR description.

Mitigating factor: The PR author is @stephentoub (a principal maintainer who participates in API reviews), and these APIs follow the exact same pattern as the previously-approved Asin APIs. It's possible the approval was handled as part of a batch Vector math API approval. A human reviewer should confirm this.

⚠️ Precision — AsinhDouble uses simpler formula than AsinhSingleCoreDouble

The double-precision path (AsinhDouble) uses the mathematical identity log(|x| + sqrt(x²+1)) directly, while the single-precision path (AsinhSingleCoreDouble) uses range-based Sollya-generated polynomial approximations for |x| ≤ 4.0 and falls back to the log formula only for larger values.

I investigated whether the naive formula suffers from precision loss for intermediate values. The LogDouble implementation internally decomposes its argument into 1 + f (where f ∈ [-1/3, +1/3]) and uses a 20-degree polynomial approximation, which effectively implements log1p — this mitigates the worst cancellation concerns. However, for values just above the tiny threshold (|x| ≈ 2^-25 ≈ 3×10⁻⁸), the intermediate computation |x| + sqrt(x²+1) ≈ 1 + 3×10⁻⁸ causes the subtraction m - 1 inside LogDouble's range reduction to lose ~25 bits, leaving only ~28 bits of precision (~8.4 decimal digits vs. double's 15.9).

The current test data jumps from the tiny path (1E-10 → returns x directly) to 0.318 (1/π), so the formula is not exercised in the accuracy-sensitive range [2^-25, ~0.1]. Consider adding a few test values in this range (e.g., 0.001, 0.01, 0.1) to validate the actual precision.

This may be an acceptable trade-off for a vectorized implementation (the comment acknowledges the design choice), but a human reviewer should confirm the precision is sufficient for the double-precision path.
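The cancellation described above can be reproduced in scalar Python (a sketch of the numeric effect, not the VectorMath code; `math.asinh` serves as the accurate reference):

```python
import math

# For |x| just above 2^-25, the argument of log is 1 + tiny, so a log
# implementation's internal m - 1 range reduction cancels.
x = 2.0 ** -25
naive = math.log(x + math.sqrt(x * x + 1.0))
rel_err = abs(naive - math.asinh(x)) / math.asinh(x)
assert rel_err < 1e-6  # loose bound; accuracy is limited by rounding of 1 + tiny

# The standard log1p rewriting avoids forming 1 + tiny explicitly:
# asinh(x) = log1p(x + x^2 / (1 + sqrt(1 + x^2)))
safe = math.log1p(x + x * x / (1.0 + math.sqrt(1.0 + x * x)))
assert math.isclose(safe, math.asinh(x), rel_tol=1e-14)
```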

⚠️ Test Gap — No double test cases between 1E-10 and 0.318

The AsinhDouble test data covers:

  • Tiny/subnormal values (5E-324, 1E-10) → these take the return x path
  • Values ≥ 0.318 (1/π) → these are well within the accurate range of the log formula

But there are no test cases for |x| in the range (2^-25, 0.3) — exactly where the naive formula's precision is most stressed. Adding values like 0.001 (expected: ≈0.000999999833...) and 0.01 (expected: ≈0.009999833...) would exercise this code path and validate the actual precision achieved. The AsinhSingle test data similarly has a gap between 1E-06 and 0.318, though this matters less since the single path uses polynomial approximations.
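The quoted reference values can be checked against the small-argument Maclaurin series asinh(x) = x − x³/6 + 3x⁵/40 − … (a sketch for generating expected values, not part of the PR's test data):

```python
import math

def asinh_series(x: float) -> float:
    """Truncated Maclaurin series; adequate as a reference for |x| << 1."""
    return x - x**3 / 6.0 + 3.0 * x**5 / 40.0

# The expected values quoted above for the proposed test inputs:
assert math.isclose(asinh_series(0.001), 0.000999999833, rel_tol=1e-8)
assert math.isclose(asinh_series(0.01), 0.009999833, rel_tol=1e-7)
# At 0.1 the truncated series still agrees with the library to ~5e-8:
assert math.isclose(asinh_series(0.1), math.asinh(0.1), rel_tol=1e-5)
```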

💡 Unused Type Parameters — TVectorInt32 and TVectorUInt32 in AsinhSingle

AsinhSingle declares 6 type parameters (TVectorSingle, TVectorInt32, TVectorUInt32, TVectorDouble, TVectorInt64, TVectorUInt64) but only uses 4 in its body (TVectorSingle, TVectorDouble, TVectorInt64, TVectorUInt64). TVectorInt32 and TVectorUInt32 are passed from callers but never referenced.

However, I verified that the existing AsinSingle also declares an unused TVectorInt32, so this appears to be part of the established convention (maintaining a consistent type parameter signature across vector math methods for future extensibility). No action needed, but worth noting.

💡 Minor Inconsistency — < vs <= in tiny threshold checks

AsinhDouble uses LessThanOrEqual for its tiny check while AsinhSingle uses LessThan (strict). This is functionally negligible (the boundary values round to the same result either way), but the inconsistency is worth noting for style consistency.

✅ Special Value Handling — Correct

Verified all special cases:

  • NaN: Propagates correctly through all paths (NaN comparisons yield false, so neither tiny nor large masks trigger; NaN flows through the computation and is returned)
  • ±∞: Large mask triggers → LN2 + log(∞) = ∞ → sign restored via XOR ✓
  • ±0: Tiny mask triggers → returns |0| = 0 → sign restored via XOR → ±0
  • Subnormals: Below tiny threshold → returns x directly ✓
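The special-value behavior summarized above can be spot-checked with a scalar version of the log identity (a sketch; the SIMD code achieves the same results via masks and a sign XOR rather than `copysign`):

```python
import math

def asinh_via_log(x: float) -> float:
    ax = abs(x)
    return math.copysign(math.log(ax + math.sqrt(ax * ax + 1.0)), x)

assert math.isnan(asinh_via_log(float("nan")))          # NaN propagates
assert asinh_via_log(math.inf) == math.inf              # +inf -> +inf
assert asinh_via_log(-math.inf) == -math.inf            # sign restored
assert math.copysign(1.0, asinh_via_log(-0.0)) == -1.0  # -0.0 -> -0.0
assert asinh_via_log(0.0) == 0.0                        # +0.0 -> +0.0
```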

✅ Pattern Consistency — Follows established conventions

The implementation correctly follows the established Vector math patterns:

  • Vector128.Asinh has full XML doc; Vector256/512 use inheritdoc — matches Asin pattern
  • The widening strategy (float→double→compute→narrow) matches AsinSingle
  • The fallback chain (Vector512 → Vector256 → Vector128 → Vector64 → scalar loop) is correct
  • TensorPrimitives.AsinhOperator correctly uses #if NET11_0_OR_GREATER guards with Debug.Assert(typeof(T) == typeof(float)) in the else branch — matches existing operators

✅ Ref Assembly — Correctly placed

The 8 new API entries are placed after Asin and before AsByte in all four vector types, consistent with the existing ordering convention.

✅ TensorPrimitives — Tolerance correctly added

The TensorPrimitives.Generic.cs change correctly adds tolerance to the vectorized Asinh test (doubleTolerance: 1e-14, floatTolerance: 1e-6f), acknowledging that the vectorized implementation may have slightly different rounding than the scalar path.

✅ Polynomial Correctness — AsinhSingleCoreDouble coefficients verified

The polynomial coefficients in AsinhSingleCoreDouble are sourced from AMD's AOCL-LibM asinhf.c (Sollya-generated), with hex float comments that match the decimal values. The rational polynomial form x + x³ * P(x²)/Q(x²) correctly computes asinh(x) = x + x³/6 - ... for small arguments, avoiding the cancellation issue entirely.
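The structural point — carrying the leading x term exactly and applying a polynomial only to the x³ correction — can be illustrated with Maclaurin coefficients standing in for AMD's minimax coefficients (hypothetical stand-ins, NOT the AOCL values):

```python
import math

def asinh_poly_form(x: float) -> float:
    """x + x^3 * R(x^2): the exact leading term avoids cancellation for
    small arguments. R here uses Maclaurin coefficients, not the
    Sollya-generated minimax coefficients from AOCL-LibM asinhf.c."""
    t = x * x
    r = -1.0 / 6.0 + t * (3.0 / 40.0 - t * (15.0 / 336.0))
    return x + x * t * r

# Accurate to ~1e-12 relative error for small arguments:
for v in (1e-4, 0.01, 0.05):
    assert math.isclose(asinh_poly_form(v), math.asinh(v), rel_tol=1e-9)
```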

Generated by Code Review for issue #126419
