core: fix simd emulator code #16236

Merged
alalek merged 5 commits into opencv:3.4 from alalek:fix_core_simd_emulator on Jan 10, 2020

@alalek alalek commented Dec 25, 2019

Merge with contrib: opencv/opencv_contrib#2403

Usage:

cmake -DOPENCV_EXTRA_FLAGS="-DCV_FORCE_SIMD128_CPP=1" ...
force_builders=Custom
buildworker:Custom=linux-1
build_image:Custom=simd-emulator

#test_timeout:Linux x64 Debug=1200
#test_maxtime:Linux x64 Debug=14400

 static inline Tvec r(const Tvec& a, const Tvec& b)
 {
-    const Tvec v_zero = Tvec();
+    const Tvec v_zero = vx_setall<typename Tvec::lane_type>(0);
Contributor

IMO it makes sense to add a vx_setzero template as well and use it here instead.

Member Author

Will be added in a separate PR.
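For readers following the thread, here is a minimal sketch of the idea under discussion. It assumes a toy register type whose default constructor leaves its lanes uninitialized, which is one plausible reason a default-constructed `Tvec()` cannot stand in for zero. `ToyVec4`, `toy_setall`, and `toy_setzero` are hypothetical names made up for this illustration; they are not OpenCV's `v_reg`, `vx_setall`, or the proposed `vx_setzero`.

```cpp
#include <cassert>

// Toy 4-lane register used only for illustration (not OpenCV's v_reg).
template <typename T>
struct ToyVec4 {
    typedef T lane_type;
    enum { nlanes = 4 };
    T s[nlanes];
    ToyVec4() {}  // deliberately leaves the lanes uninitialized
};

// Broadcast one scalar to every lane (plays the role of a "setall" call).
template <typename T>
ToyVec4<T> toy_setall(T v) {
    ToyVec4<T> r;
    for (int i = 0; i < ToyVec4<T>::nlanes; ++i) r.s[i] = v;
    return r;
}

// Hypothetical zero helper: derives the lane type from the vector type,
// so the caller does not have to spell out Tvec::lane_type.
template <typename Tvec>
Tvec toy_setzero() {
    return toy_setall<typename Tvec::lane_type>(0);
}

// Small self-check showing the intended usage.
inline void toy_setzero_demo() {
    ToyVec4<float> z = toy_setzero<ToyVec4<float> >();
    for (int i = 0; i < 4; ++i) assert(z.s[i] == 0.0f);
}
```

With such a helper, generic code could write `toy_setzero<Tvec>()` instead of repeating the `typename Tvec::lane_type` spelling at each call site.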

 yf0 = v_fma(v_cvt_f32(yi0), vln2, yf0);

-v_float32 delta = v_reinterpret_as_f32(h0 == vx_setall_s32(510)) & vshift;
+v_float32 delta = v_select(v_reinterpret_as_f32(h0 == vx_setall_s32(510)), vshift, vx_setall<float>(0));
Contributor

This version may be slower on SSE2 and SSE3. However, I think the compiler can optimize this line.
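The two forms in the diff can be compared on a single 32-bit lane. The sketch below assumes the comparison mask is either all ones or all zeros, as an integer lane comparison would produce. The helper names are hypothetical and plain scalar C++ stands in for the vector intrinsics. Under that assumption, ANDing the value with the mask and selecting between the value and zero give the same result. The select form simply avoids relying on a bitwise AND applied to float data. The concern about SSE2 and SSE3 is consistent with the fact that the dedicated blend instruction (`blendvps`) only arrived with SSE4.1, so a select on older targets typically needs several bitwise operations.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Bit-cast helpers for one 32-bit lane.
static inline uint32_t f32_bits(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}
static inline float bits_f32(uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Old form: AND the value's bits with the mask (all ones or all zeros).
static inline float lane_mask_and(uint32_t mask, float value) {
    return bits_f32(mask & f32_bits(value));
}

// New form: take the value where the mask is set, otherwise the fallback.
static inline float lane_select(uint32_t mask, float value, float otherwise) {
    return mask ? value : otherwise;
}

// Small self-check: both forms agree for the two valid mask values.
inline void lane_forms_demo() {
    const float shift = 1.5f;  // arbitrary sample value
    assert(lane_mask_and(0xFFFFFFFFu, shift) == lane_select(0xFFFFFFFFu, shift, 0.0f));
    assert(lane_mask_and(0u, shift) == lane_select(0u, shift, 0.0f));
}
```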

@alalek alalek merged commit e180cc0 into opencv:3.4 Jan 10, 2020
@alalek alalek mentioned this pull request Jan 12, 2020