Fixing bug with comparison of v_int64x2 or v_uint64x2 by ChipKerchner · Pull Request #15738 · opencv/opencv

ChipKerchner · 2019-10-18T15:22:44Z

Casting v_int64x2 or v_uint64x2 to v_float64x2 and comparing does NOT work in all cases. Rewrite using epi64 instructions - faster too.

Here is an example that does NOT work

v_int64x2 a = v_setall_s64(-1);
v_int64x2 b = v_setall_s64(-1);

if (v_reinterpret_as_f64(a) == v_reinterpret_as_f64(b))

// Always fails: the all-ones bit pattern reinterpreted as an f64 is a NaN, and a NaN never compares equal to any number, not even itself.
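
A scalar sketch of the same failure, for readers who want to reproduce it without OpenCV. It assumes IEEE 754 doubles; the helper names (`bits_as_f64`, `eq_via_f64`, `eq_via_int`) are hypothetical and only stand in for one lane of `v_reinterpret_as_f64` and the comparison operators. It also shows a second mismatch that follows from the same cast: `+0.0` and `-0.0` compare equal as doubles although their integer bit patterns differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a 64-bit integer as a double.
// Hypothetical scalar stand-in for one lane of v_reinterpret_as_f64.
inline double bits_as_f64(std::int64_t v)
{
    static_assert(sizeof(double) == sizeof(std::int64_t), "expects a 64-bit double");
    double d;
    std::memcpy(&d, &v, sizeof d);
    return d;
}

// Equality the way the old code path did it: cast to f64, then compare.
inline bool eq_via_f64(std::int64_t a, std::int64_t b)
{
    return bits_as_f64(a) == bits_as_f64(b);
}

// Equality on the integer bits themselves, which is what the caller wants.
inline bool eq_via_int(std::int64_t a, std::int64_t b)
{
    return a == b;
}

// Small self-check covering both kinds of mismatch.
inline void check_f64_cast_pitfalls()
{
    assert(std::isnan(bits_as_f64(-1)));   // all-ones bits decode as a NaN
    assert(!eq_via_f64(-1, -1));           // NaN != NaN, so "equal" is missed
    assert(eq_via_f64(0, INT64_MIN));      // +0.0 == -0.0, so "different" is missed
    assert(eq_via_int(-1, -1) && !eq_via_int(0, INT64_MIN));
}
```

The integer comparison gives the expected answer in both cases, which is the motivation for comparing with integer instructions instead of casting.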

force_builders=Linux AVX2,Custom
buildworker:Custom=linux-3
build_image:Custom=ubuntu:18.04
CPU_BASELINE:Custom=AVX512_SKX
disable_ipp=ON

ChipKerchner · 2019-10-18T16:15:49Z

The WASM version needs updating as well. I'm not sure I can do this.

It seems like the changes are something like this:

--- a/modules/core/include/opencv2/core/hal/intrin_wasm.hpp
+++ b/modules/core/include/opencv2/core/hal/intrin_wasm.hpp
@@ -2742,6 +2742,8 @@ OPENCV_HAL_IMPL_WASM_INIT_CMP_OP(v_uint16x8, u16x8, i16x8)
 OPENCV_HAL_IMPL_WASM_INIT_CMP_OP(v_int16x8, i16x8, i16x8)
 OPENCV_HAL_IMPL_WASM_INIT_CMP_OP(v_uint32x4, u32x4, i32x4)
 OPENCV_HAL_IMPL_WASM_INIT_CMP_OP(v_int32x4, i32x4, i32x4)
+OPENCV_HAL_IMPL_WASM_INIT_CMP_OP(v_uint64x2, u64x2, i64x2)
+OPENCV_HAL_IMPL_WASM_INIT_CMP_OP(v_int64x2, i64x2, i64x2)
 OPENCV_HAL_IMPL_WASM_INIT_CMP_OP(v_float32x4, f32x4, f32x4)

 #ifdef __wasm_unimplemented_simd128__
@@ -2762,15 +2764,6 @@ OPENCV_HAL_IMPL_INIT_FALLBACK_CMP_OP(v_float64x2, <=)
 OPENCV_HAL_IMPL_INIT_FALLBACK_CMP_OP(v_float64x2, >=)
 #endif

-#define OPENCV_HAL_IMPL_WASM_64BIT_CMP_OP(_Tpvec, cast) \
-inline _Tpvec operator == (const _Tpvec& a, const _Tpvec& b) \
-{ return cast(v_reinterpret_as_f64(a) == v_reinterpret_as_f64(b)); } \
-inline _Tpvec operator != (const _Tpvec& a, const _Tpvec& b) \
-{ return cast(v_reinterpret_as_f64(a) != v_reinterpret_as_f64(b)); }
-
-OPENCV_HAL_IMPL_WASM_64BIT_CMP_OP(v_uint64x2, v_reinterpret_as_u64)
-OPENCV_HAL_IMPL_WASM_64BIT_CMP_OP(v_int64x2, v_reinterpret_as_s64)
-
 inline v_float32x4 v_not_nan(const v_float32x4& a)
 {
     v128_t z = wasm_i32x4_splat(0x7fffffff);

alalek

Nice catch!

alalek · 2019-10-18T16:17:36Z

modules/core/include/opencv2/core/hal/intrin_sse.hpp

+#define OPENCV_HAL_IMPL_SSE_64BIT_CMP_OP(_Tpvec) \
+inline _Tpvec operator == (const _Tpvec& a, const _Tpvec& b) \
+{ __m128i cmp = _mm_cmpeq_epi32(a.val, b.val); \
+  return _Tpvec(_mm_or_si128(cmp, _mm_shuffle_epi32(cmp, _MM_SHUFFLE(2, 3, 0, 1)))); } \


or

low 32-bits are equal "or" high 32-bits are equal?

Probably operator != should be defined first through "or".

Actually, if either the low or the upper half is set, the whole 64-bit register needs to be set. Otherwise, functions like v_check_any/all will not work.

Not quite sure what you are suggesting in your above comment.

Changing _mm_or_si128 to _mm_and_si128() is enough.

Example (the low 32 bits match, the high 32 bits differ):

v_int64x2 a = v_setall_s64(0x111223344);
v_int64x2 b = v_setall_s64(0x011223344);

You're right. It should be AND and not OR.

I'll add some test cases.
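
The AND-versus-OR point from this thread can be checked with a portable scalar model. This is only a sketch: the register is modelled as four 32-bit lanes, and `Lanes32`, `from_u64x2`, `cmpeq32`, `swap_halves`, `eq64_or` and `eq64_and` are hypothetical names that imitate `_mm_cmpeq_epi32`, `_mm_shuffle_epi32(cmp, _MM_SHUFFLE(2, 3, 0, 1))`, `_mm_or_si128` and `_mm_and_si128`; it is not the code from the PR.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// One 128-bit register modelled as four 32-bit lanes. Lanes {0,1} hold the
// low/high halves of the first 64-bit element, lanes {2,3} those of the second.
using Lanes32 = std::array<std::uint32_t, 4>;

inline Lanes32 from_u64x2(std::uint64_t e0, std::uint64_t e1)
{
    return { static_cast<std::uint32_t>(e0), static_cast<std::uint32_t>(e0 >> 32),
             static_cast<std::uint32_t>(e1), static_cast<std::uint32_t>(e1 >> 32) };
}

// Model of _mm_cmpeq_epi32: all-ones where the lanes match, zero otherwise.
inline Lanes32 cmpeq32(const Lanes32& a, const Lanes32& b)
{
    Lanes32 r{};
    for (int i = 0; i < 4; ++i)
        r[i] = (a[i] == b[i]) ? 0xFFFFFFFFu : 0u;
    return r;
}

// Model of _mm_shuffle_epi32(x, _MM_SHUFFLE(2, 3, 0, 1)):
// swap the two 32-bit halves inside each 64-bit element.
inline Lanes32 swap_halves(const Lanes32& x)
{
    return { x[1], x[0], x[3], x[2] };
}

// Variant first proposed: combine the half results with OR (incorrect).
inline Lanes32 eq64_or(const Lanes32& a, const Lanes32& b)
{
    const Lanes32 cmp = cmpeq32(a, b), sw = swap_halves(cmp);
    Lanes32 r{};
    for (int i = 0; i < 4; ++i) r[i] = cmp[i] | sw[i];
    return r;
}

// Corrected variant: a 64-bit element is equal only if BOTH halves are equal.
inline Lanes32 eq64_and(const Lanes32& a, const Lanes32& b)
{
    const Lanes32 cmp = cmpeq32(a, b), sw = swap_halves(cmp);
    Lanes32 r{};
    for (int i = 0; i < 4; ++i) r[i] = cmp[i] & sw[i];
    return r;
}
```

With the reviewer's values `0x111223344` and `0x011223344`, the low halves match and the high halves differ, so the OR variant sets every lane (reports "equal") while the AND variant clears every lane. In both variants the two halves of each 64-bit element end up identical, which is what mask consumers such as v_check_any/all rely on.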

alalek

BTW, it would be nice to add a test case for that.

alalek · 2019-10-18T19:39:41Z

WASM

It looks like there is no wasm_i64x2_eq() yet:
https://github.com/emscripten-core/emscripten/blob/incoming/system/include/wasm_simd128.h

Let's update the WASM backend in a separate PR (a test would help catch the error there too).

alalek · 2019-10-18T21:27:38Z

merge conflict

Your branch is based on a very old commit from 2019-08-08.
Please avoid moving code: put fresh code near the end of the file.

# add upstream (once)
git remote add upstream https://github.com/opencv/opencv.git
# very useful for resolving conflicts: git config --global merge.conflictstyle diff3

# rebase current branch on latest code
git fetch -t upstream master:master 3.4:3.4
git rebase -i upstream/3.4
# in case of conflict: edit files, "git add <modified files>", "git rebase --continue"

# update with --force
git push origin <your_branch_name_like_bugInt64x2Comparison> -f

# In the future: starting a new bugfix branch:
git fetch -t upstream master:master 3.4:3.4
git checkout -B my_new_fix upstream/3.4

alalek · 2019-10-18T22:11:18Z

Please ignore errors from Custom/AVX512 builder - they are not related to this PR.

alalek · 2019-10-20T13:10:11Z

modules/core/test/test_intrin_utils.hpp

        .test_loadstore()
        .test_addsub()
+#if CV_SIMD_64F
+        .test_cmp64()


Perhaps we should avoid this check for v_uint64 types.
@terfendail Could you take a look?

The problem is that, on NEON, comparison of v_int64x2 and v_uint64x2 does NOT exist unless CV_SIMD_64F is set.

You are right.
In theory, CV_SIMD_64F should guard the use of vectors of doubles and their operations.
Additionally, on NEON it guards some 64u/64s operations and some 32f operations (like div).

Perhaps we should address that in a separate PR.

At first glance, there seems to be no reason to guard 64u/64s for NEON. These intrinsics don't use any 64f-specific instructions. Maybe it makes sense to change the scope of the NEON guard?

The underlying instructions are available on AARCH64 only. Removing the guard breaks ARMv7 builds.

We probably need a dedicated macro for this functionality (btw, == can be emulated through 32-bit comparisons, as with SSE2/3, but less/greater comparisons are not easy to re-implement).
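
To illustrate why ordered comparisons are harder to rebuild from 32-bit pieces than equality, here is a scalar sketch of a signed 64-bit "greater than" built only from 32-bit halves. The function names are hypothetical and this is not OpenCV or NEON code. Equality needs one compare per half plus an AND, whereas the ordered case needs a sign-aware compare on the high half, an equality test on the high half, and an unsigned compare on the low half.

```cpp
#include <cassert>
#include <cstdint>

// Equality from halves: both 32-bit halves must match.
inline bool eq64_from_halves(std::int64_t a, std::int64_t b)
{
    const std::uint64_t ua = static_cast<std::uint64_t>(a);
    const std::uint64_t ub = static_cast<std::uint64_t>(b);
    const bool lo_eq = static_cast<std::uint32_t>(ua) == static_cast<std::uint32_t>(ub);
    const bool hi_eq = static_cast<std::uint32_t>(ua >> 32) == static_cast<std::uint32_t>(ub >> 32);
    return lo_eq && hi_eq;
}

// Signed greater-than from halves.
// Flipping the top bit of the high half turns the signed order of the high
// halves into an unsigned order, so only unsigned 32-bit compares are needed.
inline bool gt64_from_halves(std::int64_t a, std::int64_t b)
{
    const std::uint64_t ua = static_cast<std::uint64_t>(a);
    const std::uint64_t ub = static_cast<std::uint64_t>(b);
    const std::uint32_t alo = static_cast<std::uint32_t>(ua);
    const std::uint32_t blo = static_cast<std::uint32_t>(ub);
    const std::uint32_t ahi = static_cast<std::uint32_t>(ua >> 32) ^ 0x80000000u;
    const std::uint32_t bhi = static_cast<std::uint32_t>(ub >> 32) ^ 0x80000000u;
    // The high halves decide unless they tie; then the low halves decide.
    return (ahi > bhi) || (ahi == bhi && alo > blo);
}
```

In a vector version each of these scalar steps would become its own instruction plus a shuffle to spread the per-half results across the 64-bit lane, which is one way to read the remark that less/greater are "not easy to re-implement".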

alalek

Well done! Thank you 👍

ChipKerchner added 2 commits October 18, 2019 10:11

Casting v_uint64x2 to v_float64x2 and comparing does NOT work in all …

4fc6af4

…cases. Rewrite using epi64 instructions - faster too.

Fix bad merge.

567a1d0

ChipKerchner mentioned this pull request Oct 18, 2019

Vectorize minMaxIdx functions #15488

Merged

alalek reviewed Oct 18, 2019

ChipKerchner added 2 commits October 18, 2019 15:45

Fix equal comparsion for non-SSE4.1. Add test cases for v_int64x2 com…

2f840de

…parisons.

Try to fix merge conflict.

8c35512

ChipKerchner added 2 commits October 18, 2019 16:35

Only test v_int64x2 comparisons if CV_SIMD_64F

aedbac7

Fix compiler warning.

fba4aa0

alalek reviewed Oct 20, 2019

alalek approved these changes Oct 22, 2019

terfendail approved these changes Oct 22, 2019

alalek assigned terfendail Oct 22, 2019

alalek merged commit 5a6a494 into opencv:3.4 Oct 22, 2019

alalek mentioned this pull request Oct 24, 2019

Merge 3.4 #15771

Merged

ChipKerchner deleted the bugInt64x2Comparison branch November 5, 2019 17:54

This was referenced Feb 26, 2023

finiteMask() and doubles for patchNaNs() #23098

Merged

core(simd): 64-bit integer EQ/NE without misused 64F guard #23307

Merged
