Fix hfloat conflicts of v_func in merging 4.x to 5.x #26369

Merged
asmorkalov merged 4 commits into opencv:5.x from WanliZhong:5x_fix_hfloat_vfunc
Oct 26, 2024

@WanliZhong
Member

This PR resolves the conflicts that arose when merging 4.x into 5.x (#26358):

  1. Explicitly convert the input values passed to v_setall_ to hfloat.
  2. Loosen the threshold for the v_sincos test (related issue: Test test_sincos_fp16 fails accuracy check #26362).
  3. Remove the new but temporary API: template <> inline v_float16x8 v_setall_(float v) { return v_setall_f16((hfloat)v); }

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@asmorkalov
Contributor

@vpisarev @WanliZhong I think that 1e-2 is too loose a requirement for the sin/cos test. What do you think?

@asmorkalov asmorkalov added bug optimization platform: arm ARM boards related issues: RPi, NVIDIA TK/TX, etc labels Oct 26, 2024
@WanliZhong
Member Author

The error is shown below. 1e-2 is only a relative threshold; the real difference is 3e-5, which is still tiny for an fp16 number. Besides, the fp16 implementation is the same as the fp32 one: I didn't do any specific optimization of the constants for fp16.

2024-10-24T09:51:49.3740700Z [ RUN      ] hal_intrin128.float16x8_FP16
2024-10-24T09:51:49.4363000Z SIMD128: void opencv_test::hal::intrin128::opt_FP16::test_hal_intrin_float16()
2024-10-24T09:51:49.4364060Z test_loadstore_fp16_f32 ...
2024-10-24T09:51:49.4365710Z /Users/opencv-cn/GHA-OCV-2/_work/opencv/opencv/opencv/modules/core/test/test_intrin_utils.hpp:2159: Failure
2024-10-24T09:51:49.4368000Z Expected: (std::abs(resCos[j] - std_cos)) < (diff_thr * (std::abs(std_cos) + flt_min * 100)), actual: 1.14441e-05 vs 1.05023e-05
2024-10-24T09:51:49.4369380Z Google Test trace:
2024-10-24T09:51:49.4371000Z /Users/opencv-cn/GHA-OCV-2/_work/opencv/opencv/opencv/modules/core/test/test_intrin_utils.hpp:2146: Random test value: 344.000000
2024-10-24T09:51:49.4373490Z /Users/opencv-cn/GHA-OCV-2/_work/opencv/opencv/opencv/modules/core/test/test_intrin_utils.hpp:2159: Failure
2024-10-24T09:51:49.4375780Z Expected: (std::abs(resCos[j] - std_cos)) < (diff_thr * (std::abs(std_cos) + flt_min * 100)), actual: 2.28882e-05 vs 1.93254e-05
...
2024-10-24T09:51:49.7652680Z [  FAILED  ] hal_intrin128.float16x8_FP16 (118 ms)
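
For readers following the log, the failing assertion compares an absolute error against a threshold that scales with the reference value. In the first failure above, the error 1.14441e-05 exceeded the allowed 1.05023e-05. The sketch below reproduces only the shape of that inequality. The function name and the constant kHalfEpsilon are hypothetical, and flt_min is passed in as a parameter because its definition in test_intrin_utils.hpp is not shown here.

```cpp
#include <cassert>
#include <cmath>

// Machine epsilon of IEEE 754 binary16 (10 fraction bits): 2^-10.
// Included only to give a sense of scale for the relative thresholds discussed.
constexpr double kHalfEpsilon = 1.0 / 1024.0;

// Same shape as the assertion printed in the log:
//   |result - reference| < diff_thr * (|reference| + flt_min * 100)
// Hypothetical helper, not part of OpenCV. flt_min is an input because its
// actual value in the real test is an assumption here.
inline bool within_relative_threshold(double result, double reference,
                                      double diff_thr, double flt_min)
{
    assert(diff_thr > 0.0 && flt_min >= 0.0);
    const double abs_err = std::abs(result - reference);
    const double allowed = diff_thr * (std::abs(reference) + flt_min * 100.0);
    return abs_err < allowed;
}
```

Because the allowed error is proportional to |reference|, the check becomes very strict where the cosine is close to zero, which is where a small absolute error can still fail. For scale, a relative threshold of 4e-3 is roughly four times kHalfEpsilon.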

@WanliZhong
Member Author

It's my mistake, I should use 4e-3 instead of 1e-2 😂

@asmorkalov asmorkalov added this to the 5.0-alpha milestone Oct 26, 2024
@asmorkalov asmorkalov self-assigned this Oct 26, 2024
@asmorkalov asmorkalov merged commit 29e712e into opencv:5.x Oct 26, 2024
