Fix Avx.CompareScalar and Avx512F.Compare incorrectly folded with True/False comparison modes#125179
Fix Avx.CompareScalar and Avx512F.Compare incorrectly folded with True/False comparison modes#125179
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
…gression test CompareScalar preserves upper elements from the first operand, so folding to a full zero or AllBitsSet vector is incorrect. Skip folding for NI_AVX_CompareScalar with True/False modes so the hardware instruction produces the correct result. Fixes #125160 Co-authored-by: EgorBo <523221+EgorBo@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes a JIT constant-folding regression introduced in net10.0 where Avx.CompareScalar with OrderedFalse* or UnorderedTrue* comparison modes was incorrectly folded to a full zero or AllBitsSet vector, discarding the upper elements that should be preserved from the first operand.
Changes:
- Skip constant folding for
NI_AVX_CompareScalarin theOrderedFalse*andUnorderedTrue*cases ingtFoldExprHWIntrinsic, sinceCompareScalaronly affects element 0 and must preserve upper elements fromop1 - Add regression test covering all four True/False modes for both
floatanddouble, asserting correct scalar result in element 0 and upper element preservation
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
src/coreclr/jit/gentree.cpp |
Adds an early break guard for NI_AVX_CompareScalar in both OrderedFalse* and UnorderedTrue* comparison mode cases to prevent incorrect whole-vector folding |
src/tests/JIT/Regression/JitBlue/Runtime_125160/Runtime_125160.cs |
New regression test covering all four True/False comparison modes for both float and double, verifying scalar result in element 0 and upper element preservation |
src/tests/JIT/Regression/Regression_ro_2.csproj |
Registers the new regression test in the merged test runner project |
|
@copilot there's also an issue with static Vector512<float> CompareFalse(Vector512<float> x, Vector512<float> y)
=> Avx512F.Compare(x, y, FloatComparisonMode.OrderedFalseSignaling); |
1 similar comment
|
@copilot there's also an issue with static Vector512<float> CompareFalse(Vector512<float> x, Vector512<float> y)
=> Avx512F.Compare(x, y, FloatComparisonMode.OrderedFalseSignaling); |
…2 test For NI_AVX512_CompareMask, retType is TYP_MASK which gtNewZeroConNode and gtNewAllBitsSetConNode don't handle (hitting unreached()). Use the SIMD vector type instead; the existing varTypeIsMask conversion at the end of gtFoldExprHWIntrinsic will wrap the result with CvtVectorToMask. Co-authored-by: EgorBo <523221+EgorBo@users.noreply.github.com>
... Fixed in 01f88f3. |
|
@copilot address the CI failures, e.g. |
…redTrue assertions Vector512.Create(-1f) has bit pattern 0xBF800000 per element, but UnorderedTrue comparison modes produce all-bits-set (0xFFFFFFFF per element, which is NaN). Use Vector512<T>.AllBitsSet to correctly match the expected result. Co-authored-by: EgorBo <523221+EgorBo@users.noreply.github.com>
Fixed in 1826aad. The Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
Description
Fixes two issues in JIT constant folding of comparison intrinsics with
OrderedFalse*/UnorderedTrue*modes ingtFoldExprHWIntrinsic:Avx.CompareScalar:CompareScalarpreserves upper elements from the first operand, but the folding replaced the entire vector with zero or AllBitsSet, discarding those upper elements.Avx512F.Compare(NI_AVX512_CompareMask):CompareMaskreturnsTYP_MASK, butgtNewZeroConNode/gtNewAllBitsSetConNodedon't handle mask types, causing anunreachedassertion failure.Changes
src/coreclr/jit/gentree.cpp:ni == NI_AVX_CompareScalar, sinceCompareScalarmust preserve upper elements fromop1.NI_AVX512_CompareMask, usegetSIMDTypeForSize(simdSize)instead ofretTypewhen creating zero/AllBitsSet constants, sinceretTypeisTYP_MASK. The existingvarTypeIsMaskconversion at the end of the function wraps the result withCvtVectorToMask.Runtime_125160): Covers all four True/False modes for bothfloatanddoublescalar comparisons (asserting lowest element result and upper element preservation), and all four True/False modes forAvx512F.ComparewithVector512<float>andVector512<double>(usingVector512<T>.AllBitsSetforUnorderedTrueassertions since all-bits-set produces0xFFFFFFFFper element, not-1f).Original prompt
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.