Vec256 Test cases #42685

Closed
quickwritereader wants to merge 36 commits into pytorch:master from quickwritereader:vec256_test_issue15676

@quickwritereader
Contributor

@quickwritereader quickwritereader commented Aug 6, 2020

Tests for Vec256 classes #15676

Testing
Current list:

  • Blends
  • Memory: UnAlignedLoadStore
  • Arithmetics: Plus, Minus, Multiplication, Division
  • Bitwise: BitAnd, BitOr, BitXor
  • Comparison: Equal, NotEqual, Greater, Less, GreaterEqual, LessEqual
  • MinMax: Minimum, Maximum, ClampMin, ClampMax, Clamp
  • SignManipulation: Absolute, Negate
  • Interleave: Interleave, DeInterleave
  • Rounding: Round, Ceil, Floor, Trunc
  • Mask: ZeroMask
  • SqrtAndReciprocal: Sqrt, RSqrt, Reciprocal
  • Trigonometric: Sin, Cos, Tan
  • Hyperbolic: Tanh, Sinh, Cosh
  • InverseTrigonometric: Asin, ACos, ATan, ATan2
  • Logarithm: Log, Log2, Log10, Log1p
  • Exponents: Exp, Expm1
  • ErrorFunctions: Erf, Erfc, Erfinv
  • Pow: Pow
  • LGamma: LGamma
  • Quantization: quantize, dequantize, requantize_from_int
  • Quantization: widening_subtract, relu, relu6

Missing:

  • Constructors, initializations
  • Conversion, Cast
  • Additional: imag, conj, angle (note: imag and conj only checked for float complex)

Notes on tests and testing framework

  • Some math functions are tested only within a restricted domain.
  • For the most part, the framework feeds random inputs and compares the result against the std implementation, either over the test domain or, for some math functions, over the domain the implementation supports.
  • Some functions are tested against a local reference version instead of std.
  • round is tested against PyTorch's at::native::round_impl. On VSX, vec_round failed for double at (even)+0.5 values; this was solved by using vec_rint.
  • Complex types are now tested. After enabling them, some of the complex functions failed on both VSX and x86 AVX because of precision and domain issues. I will either test them against a local implementation or check them within an accepted domain.
  • Quantization is now tested: quantize, dequantize, requantize_from_int, relu, relu6, and widening_subtract.
  • The testing framework should be improved further.
  • Vec256 test cases are built for each CPU_CAPABILITY (the earlier plan to reuse -DBUILD_MOBILE_TEST=ON for Vec256Test no longer applies).
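
The random, domain-restricted comparison described in these notes can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: nearly_equal, count_mismatches, and approx_sin are hypothetical names, and approx_sin (a truncated Taylor series) merely stands in for a vectorized kernel that is accurate only on part of the real line.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>

// Hypothetical helper: true when a and b agree within an absolute tolerance.
template <typename T>
bool nearly_equal(T a, T b, T abs_err) {
  if (std::isnan(a) || std::isnan(b)) return std::isnan(a) && std::isnan(b);
  if (a == b) return true;  // also covers equal infinities
  return std::fabs(a - b) <= abs_err;
}

// Draws `trials` random inputs from [lo, hi) and counts the inputs for which
// `actual` and `expected` differ by more than abs_err.
template <typename T, typename Actual, typename Expected>
std::size_t count_mismatches(Actual actual, Expected expected, T lo, T hi,
                             T abs_err, std::size_t trials,
                             unsigned seed = 42) {
  assert(lo < hi);
  std::mt19937 gen(seed);
  std::uniform_real_distribution<T> dist(lo, hi);
  std::size_t mismatches = 0;
  for (std::size_t i = 0; i < trials; ++i) {
    const T x = dist(gen);
    if (!nearly_equal(actual(x), expected(x), abs_err)) ++mismatches;
  }
  return mismatches;
}

// Stand-in for a vectorized kernel (assumption for this sketch only):
// x - x^3/6 + x^5/120 - x^7/5040, accurate only close to zero.
inline double approx_sin(double x) {
  const double x2 = x * x;
  return x * (1.0 - x2 / 6.0 * (1.0 - x2 / 20.0 * (1.0 - x2 / 42.0)));
}
```

With these definitions, comparing approx_sin against std::sin over { -1, 1 } with an error epsilon of 1e-05 finds no mismatch, while widening the domain to { -10, 10 } produces many. That is the motivation for testing some functions only inside a chosen domain.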

Fixes: #15676

@quickwritereader
Contributor Author

For now, both AVX2 and VSX fail these tests:

[ FAILED ] InverseTrigonometric/2.Asin, where TypeParam = at::vec256::(anonymous namespace)::Vec256<c10::complex<float> >
[ FAILED ] InverseTrigonometric/2.ACos, where TypeParam = at::vec256::(anonymous namespace)::Vec256<c10::complex<float> >
[ FAILED ] InverseTrigonometric/2.ATan, where TypeParam = at::vec256::(anonymous namespace)::Vec256<c10::complex<float> >
[ FAILED ] InverseTrigonometric/3.Asin, where TypeParam = at::vec256::(anonymous namespace)::Vec256<c10::complex<double> >
[ FAILED ] InverseTrigonometric/3.ACos, where TypeParam = at::vec256::(anonymous namespace)::Vec256<c10::complex<double> >
[ FAILED ] InverseTrigonometric/3.ATan, where TypeParam = at::vec256::(anonymous namespace)::Vec256<c10::complex<double> >

Additionally, VSX fails Multiplication because of precision:
[ FAILED ] Arithmetics/2.Multiplication, where TypeParam = at::vec256::(anonymous namespace)::Vec256<c10::complex >

The failures above are precision related; those tests can be checked within a restricted domain and with lower precision.
ATan, however, has a sign-related error on both VSX and AVX, since the VSX complex code is simply a translation of the AVX code for complex numbers:

[ RUN      ] InverseTrigonometric/2.ATan
Total Trial Count:131070
Domain:
{ -10, 10 }
Error epsilon: 1e-05
../../../aten/src/ATen/test/Vec256Test.h:491: Failure
Expected equality of these values:
  nearlyEqual(exp.real(), act.real(), absErr.real())
    Which is: false
  true
-1.5707963705062866 1.5707963705062866
atan: {
vec[(-5.22354,-4), (-4.52348,2.63532), (-5.75149,0.258563), (-0,-1.43798)]
vec_exp:vec[(-1.44969,-0.0913255), (-1.40576,0.0938578), (-1.39898,0.00757274), (-1.5708,-0.858373)]
vec_act:vec[(-1.44969,-0.0913255), (-1.40576,0.0938578), (-1.39898,0.00757276), (1.5708,-0.858373)]
}
[  FAILED  ] InverseTrigonometric/2.ATan, where TypeParam = at::vec256::(anonymous namespace)::Vec256<c10::complex<float> > (34 ms)
[----------] 1 test from InverseTrigonometric/2 (34 ms total)

[----------] 1 test from InverseTrigonometric/3, where TypeParam = at::vec256::(anonymous namespace)::Vec256<c10::complex<double> >
[ RUN      ] InverseTrigonometric/3.ATan
Total Trial Count:131070
Domain:
{ -10, 10 }
Error epsilon: 1e-05
../../../aten/src/ATen/test/Vec256Test.h:460: Failure
Expected equality of these values:
  nearlyEqual(exp.real(), act.real(), absErr.real())
    Which is: false
  true
1.5707963267948966 -1.5707963267948966
atan: {
vec[(-5,2.47906), (0,6.06085)]
vec_exp:vec[(-1.41065,0.0777398), (1.5708,0.166516)]
vec_act:vec[(-1.41065,0.0777398), (-1.5708,0.166516)]
}
[  FAILED  ] InverseTrigonometric/3.ATan, where TypeParam = at::vec256::(anonymous namespace)::Vec256<c10::complex<double> > (1 ms)
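
In both failing lanes the real part of the input is a zero (-0 in the float run, 0 in the double run) and the imaginary part has magnitude above 1, so the input sits on the branch cut of atan and only the sign of that zero decides between -pi/2 and +pi/2. The sketch below illustrates that effect with the textbook identity Re(atan(x + iy)) = 0.5 * atan2(2x, 1 - x^2 - y^2). It assumes IEEE-754 signed zeros, atan_real_part is a hypothetical helper, and it is not taken from the Vec256 code; the thread does not state the actual root cause.

```cpp
#include <cassert>
#include <cmath>

// Real part of atan(re + i*im) via the identity
//   Re(atan(x + iy)) = 0.5 * atan2(2x, 1 - x^2 - y^2).
// When re is a zero and |im| > 1, the second atan2 argument is negative and
// the result is +pi/2 or -pi/2 depending only on the sign of the zero.
inline double atan_real_part(double re, double im) {
  return 0.5 * std::atan2(2.0 * re, 1.0 - re * re - im * im);
}
```

Evaluated at the inputs from the log, atan_real_part(-0.0, -1.43798) is about -1.5708 and atan_real_part(0.0, 6.06085) is about +1.5708, matching vec_exp. An implementation that loses the sign of the zero real part somewhere in its formula would land on the opposite side, as vec_act does.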

@dr-ci

dr-ci Bot commented Aug 7, 2020

💊 CI failures summary and remediations

As of commit 14e2ef9 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



@ezyang ezyang added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Aug 7, 2020
@ezyang
Contributor

ezyang commented Aug 7, 2020

@colesbury @VitalyFedyunin let me know if you need help finding other people to review

@VitalyFedyunin
Contributor

Thank you for separating this out and for the valuable find about precision. I will make the review my highest priority on Monday.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Contributor

@VitalyFedyunin VitalyFedyunin left a comment

The overall test structure and testing approach look good to me, but it will take time to review the tests one by one.

@VitalyFedyunin VitalyFedyunin added the module: vectorization Related to SIMD vectorization, e.g., Vec256 label Aug 11, 2020
Contributor

@glaringlee glaringlee left a comment

@quickwritereader @VitalyFedyunin
One thing I want to mention: as I recall, TYPED_TEST_CASE is going to be deprecated and TYPED_TEST_SUITE is the new macro to use, with no syntax change, I believe. I think we should use TYPED_TEST_SUITE if possible.

I will start reviewing the tests in this code tomorrow.

@quickwritereader
Contributor Author

@glaringlee I wanted to rename it, but gtest in the third_party folder is old

Contributor

@glaringlee glaringlee left a comment

@quickwritereader
Sorry for the late update; the code is a bit long. I made some comments.
Let's keep TYPED_TEST_CASE for now (upgrading the googletest version for PyTorch is not that trivial within Facebook).

@VitalyFedyunin
std::cout is used within the test; should we comment out all the std::cout lines?

@glaringlee
Contributor

glaringlee commented Sep 10, 2020

> @glaringlee thanks for clarifying. So maybe I should add those AVX checks inside the test too.
> But it is still strange how CMake detects AVX availability then.
> I will check; maybe I messed something up.

I don't think you need to add those AVX checks in the test. FB internal actually uses a different mechanism to build PyTorch, not CMake (it is similar to Bazel but not the same; you can see some TARGET files, which are used internally). PyTorch relies on preprocessor macros to detect AVX (CPU_CAPABILITY_DEFAULT/VSX/AVX2, etc.), and those flags are all undefined in the FB internal test. Without those flags, all vec functions fall back to vec256_base.h, which uses std instead. This is as designed. So once we fix the overflow/precision-loss issues in local_xxxx, I think we will be able to pass all the tests.

@quickwritereader
Contributor Author

I will try to compare against std, then, with a reduced domain.
For asin and acos that would be really tough to do, but I will see.

@glaringlee
Contributor

glaringlee commented Sep 10, 2020

> I will try to compare against std, then, with a reduced domain.
> For asin and acos that would be really tough to do, but I will see.

Ah, haha, there might be some misunderstanding here. I don't think you need to reduce your domain.
How about changing local_abs and local_sqrt the way I did in my godbolt example? https://godbolt.org/z/f4xvn9
The problem here is overflow and precision loss because the floating-point type is not large enough. So how about using long double (or just double, with a reduced domain in that case) to store intermediate values? Note that float * float overflows float when the input is large, and the same goes for double. I think this will fix all the failures (abs, acos, asin, etc.).
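
The overflow being described can be reproduced with a small sketch. The helper names below are hypothetical and this is not the code from the godbolt link or from the PR; it only contrasts float intermediates with wider ones, assuming an IEEE-754 float that is evaluated at float precision (as on x86-64).

```cpp
#include <cassert>
#include <cmath>

// Naive magnitude: the squares are formed in float and overflow to +inf
// for large inputs even though the true magnitude is representable.
inline float naive_abs(float re, float im) {
  const float rr = re * re;
  const float ii = im * im;
  return std::sqrt(rr + ii);
}

// Same formula with double intermediates, as suggested in the comment above:
// the squares of float inputs always fit in a double.
inline float wide_abs(float re, float im) {
  const double r = re;
  const double i = im;
  return static_cast<float>(std::sqrt(r * r + i * i));
}

// std::hypot is specified to avoid undue intermediate overflow.
inline float hypot_abs(float re, float im) {
  return std::hypot(re, im);
}
```

For re = im = 1e30f, naive_abs returns +inf, while wide_abs and hypot_abs both return a finite value near 1.414e30.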

@quickwritereader
Contributor Author

Well, you are replicating the std behavior, whereas I chose to test against the local PyTorch complex AVX behavior to make the tests correct (excluding the GCC fp-contract mess).

@quickwritereader
Contributor Author

I can actually fix the inf vs. big number case easily, but not asin and acos; for those I need to decrease the precision.

@glaringlee
Contributor

> I can actually fix the inf vs. big number case easily, but not asin and acos; for those I need to decrease the precision.

Ah, got you. Agreed!

@quickwritereader
Contributor Author

quickwritereader commented Sep 14, 2020

@glaringlee I will try to add an ifdef block to fall back when the CPU_CAPABILITY_* macros are not defined.
Do you think something like this could fix the internal build tests?

#if defined(CPU_CAPABILITY_DEFAULT) || defined(_MSC_VER)
#define TEST_AGAINST_DEFAULT 1
#elif !defined(CPU_CAPABILITY_AVX) &&  !defined(CPU_CAPABILITY_AVX2) && !defined(CPU_CAPABILITY_VSX)
#define TEST_AGAINST_DEFAULT 1
#else
#undef TEST_AGAINST_DEFAULT
#endif
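
As a rough sketch of how such a macro might then be consumed, the snippet below repeats the block and selects a reference label from it. reference_kind is a hypothetical helper introduced only for illustration; the thread does not show how the PR's test code actually uses TEST_AGAINST_DEFAULT.

```cpp
#include <cassert>
#include <string>

// Same detection logic as in the comment above: fall back to the default
// (std-based) path when no vector capability macro is defined.
#if defined(CPU_CAPABILITY_DEFAULT) || defined(_MSC_VER)
#define TEST_AGAINST_DEFAULT 1
#elif !defined(CPU_CAPABILITY_AVX) && !defined(CPU_CAPABILITY_AVX2) && !defined(CPU_CAPABILITY_VSX)
#define TEST_AGAINST_DEFAULT 1
#else
#undef TEST_AGAINST_DEFAULT
#endif

// Hypothetical helper: reports which reference a test would compare against.
inline std::string reference_kind() {
#if defined(TEST_AGAINST_DEFAULT)
  return "std";    // base implementation, so compare against std
#else
  return "local";  // vectorized build, so compare against a local reference
#endif
}
```

In a build where none of the CPU_CAPABILITY_* macros is defined, which is the internal-build situation described earlier in the thread, reference_kind() returns "std".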

Contributor

@facebook-github-bot facebook-github-bot left a comment

@glaringlee has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@glaringlee
Contributor

glaringlee commented Sep 15, 2020

> @glaringlee I will try to add an ifdef block to fall back when the CPU_CAPABILITY_* macros are not defined.
> Do you think something like this could fix the internal build tests?
>
> #if defined(CPU_CAPABILITY_DEFAULT) || defined(_MSC_VER)
> #define TEST_AGAINST_DEFAULT 1
> #elif !defined(CPU_CAPABILITY_AVX) &&  !defined(CPU_CAPABILITY_AVX2) && !defined(CPU_CAPABILITY_VSX)
> #define TEST_AGAINST_DEFAULT 1
> #else
> #undef TEST_AGAINST_DEFAULT
> #endif

@quickwritereader
Sorry, I was off work earlier today and did not see your update until now. I just imported this change into FB internal. I think it should work; let's see. I'm also wondering what our time difference is, so I can pay attention to this at the right time. I am in the EST time zone; how about you?

@quickwritereader
Contributor Author

GMT+4

@glaringlee
Contributor

glaringlee commented Sep 15, 2020

@quickwritereader This worked in FB internal; all tests passed. I am running a GitHub CI test now (#44712) and will update here later.

I have just one question:
https://github.com/pytorch/pytorch/pull/42685/files?file-filters%5B%5D=.cmake&file-filters%5B%5D=.h&file-filters%5B%5D=.txt#diff-c218972661375812ff646eaec7b7ddd7R1118
(The file might be folded; this is vec256_test_all_types.h, line 1118.)

That piece is now hit only when CPU_CAPABILITY_DEFAULT is defined and the AVX/AVX2 macros are not defined under Clang or GNU. And T rr = real * real; can still overflow when real is large for its type (e.g. float with real = 1e+38). Are we sure this won't be a problem anymore?

@quickwritereader
Contributor Author

@glaringlee
Hi, I'm glad it seems to be working.
The abs operation is checked relatively, so an inf vs. big number case should compare equal. I believe there should not be any problem.
Since human work is error-prone, I will keep an eye on it in case any issue comes up.

@glaringlee
Contributor

> @glaringlee
> Hi, I'm glad it seems to be working.
> The abs operation is checked relatively, so an inf vs. big number case should compare equal. I believe there should not be any problem.
> Since human work is error-prone, I will keep an eye on it in case any issue comes up.

Understood! Thanks a lot. The GitHub CI test is still running; so far so good.

Contributor

@glaringlee glaringlee left a comment

@quickwritereader GitHub CI finished and everything passed. I think we are good now. I'll go ahead and land this.
Thank you so much for working on this huge test!

Contributor

@facebook-github-bot facebook-github-bot left a comment

@glaringlee has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

xuzhao9 pushed a commit that referenced this pull request Sep 18, 2020
Pull Request resolved: #42685

Reviewed By: malfet

Differential Revision: D23034406

Pulled By: glaringlee

fbshipit-source-id: d1bf03acdfa271c88744c5d0235eeb8b77288ef8
glaringlee pushed a commit that referenced this pull request Sep 18, 2020
This is to add vec256 test (introduced in #42685) into linux CI system.
The whole test will last 50 to 60 seconds.

Differential Revision: [D23772923](https://our.internmc.facebook.com/intern/diff/D23772923)

[ghstack-poisoned]
@ezyang ezyang added the merged label Sep 28, 2020
facebook-github-bot referenced this pull request Dec 10, 2020
Summary:
### Pytorch Vec256 ppc64le support
implemented types:

- double
- float
- int16
- int32
- int64
- qint32
- qint8
- quint8
- complex_float
- complex_double

Notes:
All basic vector operations are implemented. There are a few problems:
- NaN propagation for minimum and maximum is missing on ppc64le and was not checked
- complex multiplication, division, sqrt, and abs are implemented the same way as in PyTorch on x86. They can overflow and are less precise than the std versions, which is why they were either excluded or tested over a smaller domain
- precision of the implemented float math functions

~~Besides, I added CPU_CAPABILITY for power. but as because of  quantization errors for DEFAULT I had to undef and  use vsx for DEFAULT too~~

#### Details
##### Supported math functions

Legend:
- `+` means vectorized and `-` means missing; implementation notes are given in parentheses. For example, -(both) means the function is also missing on the x86 side
- f(func_name) means the vectorization uses func_name
- sleef means the call is redirected to Sleef
- unsupported means the function is not supported for that type

| function_name | float | double | complex float | complex double |
| -- | -- | -- | -- | -- |
| acos | sleef | sleef | f(asin) | f(asin) |
| asin | sleef | sleef | +(pytorch impl) | +(pytorch impl) |
| atan | sleef | sleef | f(log) | f(log) |
| atan2 | sleef | sleef | unsupported | unsupported |
| cos | +(ppc64le:avx_mathfun) | sleef | -(both) | -(both) |
| cosh | f(exp) | -(both) | -(both) | |
| erf | sleef | sleef | unsupported | unsupported |
| erfc | sleef | sleef | unsupported | unsupported |
| erfinv | -(both) | -(both) | unsupported | unsupported |
| exp | + | sleef | -(x86:f()) | -(x86:f()) |
| expm1 | f(exp) | sleef | unsupported | unsupported |
| lgamma | sleef | sleef | | |
| log | + | sleef | -(both) | -(both) |
| log10 | f(log) | sleef | f(log) | f(log) |
| log1p | f(log) | sleef | unsupported | unsupported |
| log2 | f(log) | sleef | f(log) | f(log) |
| pow | + f(exp) | sleef | -(both) | -(both) |
| sin | +(ppc64le:avx_mathfun) | sleef | -(both) | -(both) |
| sinh | f(exp) | sleef | -(both) | -(both) |
| tan | sleef | sleef | -(both) | -(both) |
| tanh | f(exp) | sleef | -(both) | -(both) |
| hypot | sleef | sleef | -(both) | -(both) |
| nextafter | sleef | sleef | -(both) | -(both) |
| fmod | sleef | sleef | -(both) | -(both) |

[Vec256 Test cases Pr https://github.com/pytorch/pytorch/issues/42685](https://github.com/pytorch/pytorch/pull/42685)

Pull Request resolved: #41541

Reviewed By: zhangguanheng66

Differential Revision: D23922049

Pulled By: VitalyFedyunin

fbshipit-source-id: bca25110afccecbb362cea57c705f3ce02f26098