highgui: wayland: fix to pass highgui test #25551
- Optimize Mat to XRGB8888 conversion with SIMD.
- Extend to support CV_8S/16U/16S/32F/64F.
- Extend to support 1/4 channels.
- Fix the timing of value updates.
- Initialize the slider_ value if value is not nullptr.
- Update the user-ptr value and call the on_change() function if cv_wl_trackbar::draw() is not called.
- Update usage of the WAYLAND/XDG macros to avoid referencing undefined macros.
- Update documents.
This patch contains performance tuning.

### Result

I compared the count of instruction references.
The differences between SSE3 and SSE4.1 come from the intrinsic implementation.

opencv/modules/core/include/opencv2/core/hal/intrin_sse.hpp Lines 2422 to 2453 in dad8af6

### Test

Source code is here.

```cpp
// g++ main.cpp -o a.out -I /usr/local/include/opencv4 -lopencv_core -lopencv_highgui -lopencv_imgcodecs
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgcodecs.hpp>
#include <iostream>
#include <string>

int main(int argc, char *argv[])
{
    std::cout << "cv::currentUIFramework() returns " << cv::currentUIFramework() << std::endl;
    cv::Mat src;
    src = cv::imread("opencv-logo.png");
    cv::namedWindow("src");
    cv::imshow("src", src);
    (void)cv::waitKey(1000);
    return 0;
}
```

Command is here.

```shell
valgrind --tool=callgrind ./a.out
callgrind_annotate callgrind.out.[PID] | grep 8888 | head -1
```
asmorkalov
left a comment
👍 Tested manually with Kubuntu 24.04 and Wayland session.
Proposed todo: Add support for RISC-V RVV and other scalable vector intrinsics. This requires using the CV_SIMD_SCALABLE macro and a run-time step value in loops.
Thank you for your proposal! I'll try it this weekend. I think the current implementation could be refactored along the lines of the split function. For example (this is only my imagination):

```cpp
template<typename T, typename VecT> static void
vecwrite_T_to_xrgb8888( const T* src, T* dst, int len, int scn )
{
    const int VECSZ = VTraits<VecT>::vlanes();
    const int dcn = 4; // XRGB
    :
    :
    else if( scn == 3 )
    {
        for( i = 0; i < len; i += VECSZ )
        {
            if( i > len - VECSZ )
            {
                i = len - VECSZ;
                mode = hal::STORE_UNALIGNED;
            }
            VecT b,g,r;
            v_load_deinterleave(src + i*scn, b, g, r);
            v_store_interleave (dst + i*dcn, b, g, r, r, mode);
            if( i < i0 )
            {
                i = i0 - VECSZ;
                mode = hal::STORE_ALIGNED_NOCACHE;
            }
        }
    }
```
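The tail handling in the loop above (stepping `i` back to `len - VECSZ` so that the last block ends exactly at `len`) can be sketched without OpenCV. This is only an illustration under assumptions: `kBlock` stands in for `VECSZ`, a scalar loop stands in for `v_load_deinterleave` / `v_store_interleave`, and `convert_block` and `bgr_to_bgrx_overlapped` are hypothetical names that do not appear in the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: a fixed block size standing in for VECSZ (the vector lane count).
constexpr int kBlock = 8;

// Hypothetical stand-in for one vector load/store pair:
// converts exactly kBlock BGR pixels to B,G,R,X.
static void convert_block(const std::uint8_t* src, std::uint8_t* dst)
{
    for (int k = 0; k < kBlock; ++k) {
        dst[k * 4 + 0] = src[k * 3 + 0]; // B
        dst[k * 4 + 1] = src[k * 3 + 1]; // G
        dst[k * 4 + 2] = src[k * 3 + 2]; // R
        dst[k * 4 + 3] = 0xFF;           // X (unused by the compositor)
    }
}

// Hypothetical helper: converts packed BGR to packed BGRX block by block.
std::vector<std::uint8_t> bgr_to_bgrx_overlapped(const std::vector<std::uint8_t>& bgr)
{
    const int len = static_cast<int>(bgr.size() / 3); // pixel count
    std::vector<std::uint8_t> out(static_cast<std::size_t>(len) * 4);

    if (len < kBlock) {
        // Too short for even one full block: plain scalar fallback.
        for (int i = 0; i < len; ++i) {
            out[i * 4 + 0] = bgr[i * 3 + 0];
            out[i * 4 + 1] = bgr[i * 3 + 1];
            out[i * 4 + 2] = bgr[i * 3 + 2];
            out[i * 4 + 3] = 0xFF;
        }
        return out;
    }

    for (int i = 0; i < len; i += kBlock) {
        // If a full block would run past the end, step back so the last
        // block ends exactly at len. Some pixels are converted twice,
        // which is harmless because src and dst are separate buffers.
        if (i > len - kBlock)
            i = len - kBlock;
        convert_block(&bgr[static_cast<std::size_t>(i) * 3],
                      &out[static_cast<std::size_t>(i) * 4]);
    }
    return out;
}
```

With 10 input pixels and `kBlock == 8`, the loop converts pixels 0..7 and then pixels 2..9, so no separate scalar tail loop is needed once the input holds at least one full block.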
I updated the code to support CV_SIMD_SCALABLE and tested it on VMware (AVX2) and Raspberry Pi 4 (NEON) with Ubuntu 24.04. opencv_test_highgui passes, and it calls the vector implementation. The logic is simple. I also added AVX512_SKX, LASX and RVV because I expect them to be effective.
I looked at the cvtColor implementation, and I have a second idea: use cvtColor() with cv::COLOR_BGR2BGRA or cv::COLOR_GRAY2BGRA instead of this SIMD implementation. I'll try it. Wayland requests [B8:G8:R8:X8], not [B8:G8:R8:A8]. But I noticed that the X channel is not used, which means there is no problem even if a non-transparent alpha value is stored there. Via cvtColor() we can get the performance improvements provided by OpenCL, IPP and multithreading. Furthermore, the maintainability of the code is also improved.
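To make the [B8:G8:R8:X8] layout concrete, here is a minimal scalar sketch of the byte order for 1-, 3- and 4-channel 8-bit input. It is not the code in this patch and does not call OpenCV: `to_bgrx8888` is a hypothetical helper, and writing 255 into the X byte is an assumption chosen to mirror the opaque alpha that a BGR-to-BGRA conversion would store.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not an OpenCV or Wayland API): expands packed 8-bit
// gray (scn == 1), BGR (scn == 3) or BGRA (scn == 4) pixels into 4-byte
// B,G,R,X pixels.
std::vector<std::uint8_t> to_bgrx8888(const std::vector<std::uint8_t>& src, int scn)
{
    if (scn != 1 && scn != 3 && scn != 4)
        throw std::invalid_argument("scn must be 1, 3 or 4");
    const std::size_t step = static_cast<std::size_t>(scn);
    if (src.size() % step != 0)
        throw std::invalid_argument("src size is not a multiple of scn");

    const std::size_t pixels = src.size() / step;
    std::vector<std::uint8_t> dst(pixels * 4);

    for (std::size_t i = 0; i < pixels; ++i) {
        const std::uint8_t* s = &src[i * step];
        std::uint8_t* d = &dst[i * 4];
        if (scn == 1) {
            // Gray: replicate the single value into B, G and R.
            d[0] = d[1] = d[2] = s[0];
        } else {
            // BGR or BGRA: copy the colour bytes; any source alpha is dropped.
            d[0] = s[0];
            d[1] = s[1];
            d[2] = s[2];
        }
        // The X byte is ignored for XRGB8888, so any value works here.
        d[3] = 0xFF;
    }
    return dst;
}
```

For example, the BGR bytes `{1, 2, 3}` become `{1, 2, 3, 255}`, and a single gray byte `{7}` becomes `{7, 7, 7, 255}`.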
- In comments, use "Wayland" instead of "wayland".
- Remove redundant newline.
highgui: wayland: fix to pass highgui test opencv#25551

Close opencv#25550

- optimize Mat to XRGB8888 conversion with OpenCV functions
- extend to support CV_8S/16U/16S/32F/64F
- extend to support 1/4 channels
- fix to update value timing
- initialize slider_ value if value is not nullptr.
- Update user-ptr value and call on_change() function if cv_wl_trackbar::draw() is not called.
- Update usage of WAYLAND/XDG macro to avoid reference undefined macro.
- Update documents

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Close #25550