Add simd256 intrinsics to demosaicing #26012

Closed
Burnside999 wants to merge 6 commits into opencv:4.x from Burnside999:simd-demosaicing

@Burnside999
Contributor

This PR adds simd256 intrinsics to demosaicing. In my measurements it gives a 1.5x to 2x speedup, and it passes all accuracy tests.

Related issue: #25724

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@asmorkalov
Contributor

modules/imgproc/src/demosaicing.cpp:229: trailing whitespace.
+            
modules/imgproc/src/demosaicing.cpp:232: trailing whitespace.
+            
modules/imgproc/src/demosaicing.cpp:429: trailing whitespace.
+            
modules/imgproc/src/demosaicing.cpp:1477: trailing whitespace.

@asmorkalov requested a review from mshabunin (August 8, 2024 18:12)
@mshabunin
Contributor

Do you have numbers from performance tests? (https://github.com/opencv/opencv/wiki/HowToUsePerfTests)

@Burnside999
Contributor Author

> Do you have numbers from performance tests? (https://github.com/opencv/opencv/wiki/HowToUsePerfTests)

Sure, here they are.

Geometric mean (ms)

                            Name of Test                              imgproc imgproc  imgproc  
                                                                      simd128 simd256  simd256  
                                                                                          vs    
                                                                                       imgproc  
                                                                                       simd128  
                                                                                      (x-factor)
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR)       0.005   0.005     1.02   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGRA)      0.005   0.004     1.42   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR_VNG)   0.066   0.038     1.71   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2GRAY)      0.004   0.002     1.92   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR)       0.005   0.005     1.02   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGRA)      0.005   0.004     1.40   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR_VNG)   0.066   0.038     1.72   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2GRAY)      0.005   0.002     2.01   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR)       0.005   0.005     1.03   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGRA)      0.005   0.004     1.42   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR_VNG)   0.065   0.038     1.70   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2GRAY)      0.004   0.002     1.92   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR)       0.005   0.005     1.01   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGRA)      0.005   0.004     1.41   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR_VNG)   0.066   0.050     1.32   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2GRAY)      0.004   0.004     1.19   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR)      0.068   0.069     0.98   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGRA)     0.060   0.039     1.53   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR_VNG)  2.936   1.418     2.07   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2GRAY)     0.034   0.021     1.64   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR)      0.056   0.043     1.31   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGRA)     0.060   0.040     1.52   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR_VNG)  3.042   1.402     2.17   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2GRAY)     0.055   0.019     2.90   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR)      0.063   0.052     1.22   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGRA)     0.071   0.040     1.79   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR_VNG)  2.876   1.351     2.13   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2GRAY)     0.028   0.016     1.73   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR)      0.038   0.060     0.63   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGRA)     0.044   0.036     1.24   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR_VNG)  2.677   1.440     1.86   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2GRAY)     0.027   0.019     1.43   

The conversion to BGR is, unsurprisingly, slower. This is because most SIMD instructions work on 64-bit units, so processing a three-channel image incurs extra overhead to strip out the padding zeros.

I use a blend operation when storing results with simd256, which is a major reason this path is slow, but I haven't found a better solution yet because of the peculiarities of three-channel storage.
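To illustrate why three-channel storage is awkward, here is a minimal Python sketch of the byte layout only. It is not the code from this PR. It models one 256-bit register as 32 bytes and checks how packed BGR and BGRA pixels fall across consecutive 32-byte output vectors. The helper name `store_patterns` and the choice of 32 pixels are assumptions made for the illustration.

```python
VECTOR_BYTES = 32  # one 256-bit register holds 32 uint8 lanes


def store_patterns(n_channels, n_pixels=32):
    """Return the channel index of every output byte, split into 32-byte vectors."""
    # Packed layout: pixel 0 channels, then pixel 1 channels, and so on.
    channel_of_byte = [c for _ in range(n_pixels) for c in range(n_channels)]
    return [channel_of_byte[i:i + VECTOR_BYTES]
            for i in range(0, len(channel_of_byte), VECTOR_BYTES)]


bgra = store_patterns(4)  # 32 pixels * 4 bytes = 128 bytes
bgr = store_patterns(3)   # 32 pixels * 3 bytes = 96 bytes

# Count how many different byte arrangements the output vectors need.
distinct_bgra = {tuple(v) for v in bgra}
distinct_bgr = {tuple(v) for v in bgr}

print("BGRA vectors:", len(bgra), "distinct patterns:", len(distinct_bgra))
print("BGR vectors: ", len(bgr), "distinct patterns:", len(distinct_bgr))
```

With four channels every output vector starts on a pixel boundary and has the same arrangement. With three channels, 32 is not a multiple of 3, so pixels straddle vector boundaries and each output vector starts on a different channel. Handling that per-vector arrangement is presumably where the extra shuffle and blend work comes from.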

@Burnside999
Contributor Author

To make the timing differences between conversion types easier to see, I've regrouped the rows of the table by conversion type.

The values are unchanged; only the row order differs.

Geometric mean (ms)

                            Name of Test                              imgproc imgproc  imgproc  
                                                                      simd128 simd256  simd256  
                                                                                          vs    
                                                                                       imgproc  
                                                                                       simd128  
                                                                                      (x-factor)
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR)       0.005   0.005     1.02   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR)       0.005   0.005     1.02   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR)       0.005   0.005     1.03   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR)       0.005   0.005     1.01   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR)      0.068   0.069     0.98   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR)      0.056   0.043     1.31   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR)      0.063   0.052     1.22   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR)      0.038   0.060     0.63   
-----------------------------------------------------------------------------------------------
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGRA)      0.005   0.004     1.42   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGRA)      0.005   0.004     1.40   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGRA)      0.005   0.004     1.42   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGRA)      0.005   0.004     1.41   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGRA)     0.060   0.039     1.53   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGRA)     0.060   0.040     1.52   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGRA)     0.071   0.040     1.79   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGRA)     0.044   0.036     1.24   
-----------------------------------------------------------------------------------------------
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR_VNG)   0.066   0.038     1.71   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR_VNG)   0.066   0.038     1.72   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR_VNG)   0.065   0.038     1.70   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR_VNG)   0.066   0.050     1.32   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR_VNG)  2.936   1.418     2.07   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR_VNG)  3.042   1.402     2.17   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR_VNG)  2.876   1.351     2.13   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR_VNG)  2.677   1.440     1.86   
-----------------------------------------------------------------------------------------------
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2GRAY)      0.004   0.002     1.92   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2GRAY)      0.005   0.002     2.01   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2GRAY)      0.004   0.002     1.92   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2GRAY)      0.004   0.004     1.19   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2GRAY)     0.034   0.021     1.64   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2GRAY)     0.055   0.019     2.90   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2GRAY)     0.028   0.016     1.73   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2GRAY)     0.027   0.019     1.43   

@opencv-alalek
Contributor

Could you double check this case (re-measure)?

cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR)      0.038   0.060     0.63   
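
For reference, the x-factor column appears to be the simd128 time divided by the simd256 time, so values below 1.0 indicate a slowdown. That definition is inferred from the rows in the tables above rather than taken from the tool's documentation. A small Python sketch that recomputes it from the displayed values (the function name `x_factor` is hypothetical):

```python
def x_factor(baseline_ms, candidate_ms):
    """Speedup of the candidate over the baseline; below 1.0 means slower."""
    return baseline_ms / candidate_ms


# (simd128 ms, simd256 ms) copied from the table; reported x-factor in the comment.
rows = {
    "640x480 COLOR_BayerRG2BGR": (0.038, 0.060),      # reported 0.63
    "640x480 COLOR_BayerBG2BGR_VNG": (2.936, 1.418),  # reported 2.07
}

for name, (simd128_ms, simd256_ms) in rows.items():
    print(f"{name}: {x_factor(simd128_ms, simd256_ms):.2f}")
```

Note that the displayed times are rounded to three decimals, so for very short tests (for example 0.005 vs 0.004 ms) a ratio recomputed from the table will not match the reported x-factor exactly.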

@Burnside999
Contributor Author

> Could you double check this case (re-measure)?
>
> cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR)      0.038   0.060     0.63

Okay. I'll retest it later.

@Burnside999
Contributor Author

@opencv-alalek @mshabunin I ran a separate test for that target (BGR), and it doesn't seem to perform very well.

I should figure out how to optimize it further. The extra processing in the store step does seem excessive.

Geometric mean (ms)

                          Name of Test                            imgproc imgproc  imgproc  
                                                                  simd128 simd256  simd256  
                                                                                      vs    
                                                                                   imgproc  
                                                                                   simd128  
                                                                                  (x-factor)
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR)   0.004   0.005     0.84   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR)   0.004   0.005     0.85   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR)   0.004   0.005     0.85   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR)   0.004   0.005     0.84   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR)  0.038   0.037     1.01   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR)  0.037   0.035     1.04   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR)  0.040   0.042     0.95   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR)  0.031   0.061     0.51

@Burnside999
Contributor Author

Burnside999 commented Aug 13, 2024

I recently tried two other approaches, but neither worked well. In the end, this rather strange, brute-force method turned out to be the best option.

The test is somewhat unstable, so the numbers occasionally fluctuate.

Geometric mean (ms)

                           Name of Test                             simd128 simd256  simd256  
                                                                                        vs    
                                                                                     simd128  
                                                                                    (x-factor)
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR)     0.004   0.004     0.91   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR)     0.004   0.004     0.92   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR)     0.004   0.004     0.91   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR)     0.004   0.004     0.90   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR)    0.046   0.034     1.33   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR)    0.035   0.030     1.18   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR)    0.038   0.031     1.21   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR)    0.045   0.031     1.42   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerBG2BGR)  0.224   0.193     1.16   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerGB2BGR)  0.334   0.140     2.38   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerGR2BGR)  0.251   0.158     1.59   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerRG2BGR)  0.225   0.206     1.09

P.S. I added the 1080p test locally and did not submit it. This code passes the accuracy tests.

@asmorkalov
Contributor

I ran experiments on my AMD Ryzen 7 PRO 3700 and see a speedup for 2GRAY conversions only. Other cases provide the same performance or worse.
Example:

cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2GRAY)                                                                                          0.004    0.003      1.22   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2GRAY)                                                                                          0.004    0.003      1.21   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2GRAY)                                                                                          0.004    0.003      1.21   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2GRAY)                                                                                          0.004    0.003      1.21   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2GRAY)                                                                                         0.034    0.023      1.50   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2GRAY)                                                                                         0.034    0.027      1.29   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2GRAY)                                                                                         0.030    0.027      1.12   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2GRAY)                                                                                         0.047    0.026      1.77   

@asmorkalov
Contributor

Maybe the resolution is too small and the numbers are too volatile to make a decision.

@FantasqueX
Contributor

FantasqueX commented Aug 26, 2024

Just a notice, since #25968 has been merged, bayer2gray needs to adapt the new test case.

@Burnside999
Contributor Author

> Other cases provide the same performance or worse.

@asmorkalov Thank you for taking the time to test, but something is strange. In theory, 2GRAY should indeed be the fastest, as it uses the fewest instructions and has the simplest implementation. For 2BGRA and 2BGR_VNG, however, I did nothing beyond modifying the boundary conditions and replacing the 128-bit instructions with 256-bit ones. They should give some speedup, though not a full 2x. (My tests suggest around 1.5x.)

BGR is the hardest to deal with. I inevitably need more instructions to store it, so its efficiency will probably stay the same or even get a little worse.

I'm at a bit of a loss as to what to do now. I think the other conversions have reached their theoretical efficiency limit. The exception is 2BGR, for which I haven't found a faster solution.
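
One way to see why doubling the vector width need not double throughput is an Amdahl's-law style estimate. The Python sketch below is purely illustrative: the fractions are assumed values, not measurements from this PR, and `overall_speedup` is a hypothetical helper.

```python
def overall_speedup(widened_fraction, width_gain=2.0):
    """Amdahl-style estimate: only `widened_fraction` of the runtime runs
    `width_gain` times faster; the rest (loads, stores, borders) is unchanged."""
    return 1.0 / ((1.0 - widened_fraction) + widened_fraction / width_gain)


# Assumed fractions of runtime that benefit from 256-bit instructions.
for fraction in (0.5, 0.67, 0.9, 1.0):
    print(f"{fraction:.2f} of runtime widened -> {overall_speedup(fraction):.2f}x")
```

Under this simple model a 2x result requires the whole runtime to scale with vector width, and any part that does not scale pulls the figure below 2x.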

@Burnside999
Contributor Author

> Just a notice, since #25968 has been merged, bayer2gray needs to adapt the new test case.

@FantasqueX Thank you, I will take note and modify my 2GRAY appropriately!

@mshabunin
Contributor

For better performance tests stability you can increase number of test iterations (./opencv_perf_imgproc --perf_min_samples=100 --perf_force_samples=100) and tune machine parameters (disable Hyper Threading and Turbo Boost, set fixed frequency, etc.) (see also - https://pyperf.readthedocs.io/en/latest/system.html).
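
As a rough illustration of why more samples help, the Python sketch below computes a geometric mean (the statistic reported in the tables above) over 10 and over 100 samples that each contain a single disturbed measurement. The timing values are made up for the example, and the real perf framework may also filter outliers, which this sketch does not model.

```python
from statistics import geometric_mean

steady_ms = 0.040  # hypothetical steady-state time
spike_ms = 0.080   # hypothetical disturbed sample (e.g. a frequency change)

few = [steady_ms] * 9 + [spike_ms]    # 10 samples, one disturbed
many = [steady_ms] * 99 + [spike_ms]  # 100 samples, one disturbed

gmean_few = geometric_mean(few)
gmean_many = geometric_mean(many)

print(f"10 samples:  {gmean_few:.4f} ms")
print(f"100 samples: {gmean_many:.4f} ms")
```

With more samples, a single disturbed measurement shifts the reported mean less, which is why small tests in the microsecond range benefit most from a higher sample count.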

@Burnside999
Contributor Author

> For better performance tests stability you can increase number of test iterations (./opencv_perf_imgproc --perf_min_samples=100 --perf_force_samples=100) and tune machine parameters (disable Hyper Threading and Turbo Boost, set fixed frequency, etc.) (see also - https://pyperf.readthedocs.io/en/latest/system.html).

Thanks, I'll try it later.

@Burnside999
Contributor Author

Because of a busy work schedule, I will get back to this unfinished PR on September 7th and 8th. I apologize for the long wait.

@asmorkalov
Contributor

I ran an experiment with the forced sample count set to 500 on my Ryzen 7 3700:

cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR)          0.006      0.006       1.00    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGRA)         0.005      0.005       1.00    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR_VNG)      0.046      0.046       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2GRAY)         0.004      0.004       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR)          0.005      0.005       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGRA)         0.005      0.005       1.00    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR_VNG)      0.046      0.046       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2GRAY)         0.004      0.004       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR)          0.005      0.005       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGRA)         0.005      0.005       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR_VNG)      0.045      0.046       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2GRAY)         0.004      0.004       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR)          0.005      0.005       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGRA)         0.005      0.005       1.00    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR_VNG)      0.045      0.045       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2GRAY)         0.004      0.004       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR)         0.049      0.041       1.18    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGRA)        0.041      0.042       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR_VNG)     1.820      1.835       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2GRAY)        0.027      0.028       0.98    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR)         0.041      0.041       1.00    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGRA)        0.044      0.042       1.04    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR_VNG)     1.837      1.823       1.01    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2GRAY)        0.027      0.028       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR)         0.041      0.040       1.01    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGRA)        0.043      0.041       1.03    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR_VNG)     1.836      1.825       1.01    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2GRAY)        0.027      0.028       0.99    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR)         0.041      0.040       1.01    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGRA)        0.043      0.042       1.02    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR_VNG)     1.835      1.823       1.01    
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2GRAY)        0.027      0.028       0.98  

Force serialized (1 thread):

cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR)             0.005             0.005               0.99       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGRA)            0.005             0.005               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR_VNG)         0.044             0.044               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2GRAY)            0.004             0.004               0.93       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR)             0.005             0.005               0.99       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGRA)            0.005             0.005               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR_VNG)         0.044             0.044               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2GRAY)            0.004             0.004               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR)             0.005             0.005               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGRA)            0.005             0.005               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR_VNG)         0.044             0.044               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2GRAY)            0.004             0.004               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR)             0.005             0.005               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGRA)            0.005             0.005               0.99       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR_VNG)         0.044             0.044               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2GRAY)            0.004             0.004               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR)            0.180             0.180               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGRA)           0.178             0.178               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR_VNG)        1.810             1.812               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2GRAY)           0.122             0.122               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR)            0.180             0.180               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGRA)           0.178             0.179               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR_VNG)        1.810             1.807               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2GRAY)           0.122             0.122               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR)            0.180             0.180               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGRA)           0.177             0.178               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR_VNG)        1.810             1.808               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2GRAY)           0.121             0.122               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR)            0.180             0.180               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGRA)           0.178             0.179               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR_VNG)        1.810             1.810               1.00       
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2GRAY)           0.122             0.122               1.00 

@asmorkalov
Contributor

I'll try an Intel platform, but I do not expect a significant difference.

@Burnside999
Contributor Author

@asmorkalov Thank you for testing. Would the results be a little better at a higher resolution, such as 1920x1080?

@asmorkalov
Contributor

I do not know. It may depend on cache size.
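
For a rough sense of scale, the Python sketch below estimates how many bytes each tested resolution touches (one 8-bit Bayer input plane plus the packed 8-bit output). It assumes nothing about any particular CPU's cache sizes, and `working_set_bytes` is a hypothetical helper that ignores temporary buffers.

```python
def working_set_bytes(width, height, out_channels):
    """Bytes for a 1-channel 8-bit Bayer input plus a packed 8-bit output."""
    return width * height * (1 + out_channels)


for width, height in ((127, 61), (640, 480), (1920, 1080)):
    for name, channels in (("GRAY", 1), ("BGR", 3), ("BGRA", 4)):
        kib = working_set_bytes(width, height, channels) / 1024
        print(f"{width}x{height} -> {name}: {kib:.0f} KiB")
```

Comparing these figures with the cache sizes of the test machine gives a first hint about whether a given resolution is likely to run from cache or from main memory.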

@Burnside999
Contributor Author

I used this command to test:

taskset -c 0 python modules/ts/misc/run.py build -t imgproc --gtest_filter=Size_CvtMode_Bayer_cvtColorBayer8u.* --perf_min_samples=200 --perf_force_samples=200

I also added a 1080p resolution to the test cases. On my Intel Core i7-11800H I got the following results:

Geometric mean (ms)

                             Name of Test                               simd128 simd256  simd256  
                                                                                            vs    
                                                                                         simd128  
                                                                                        (x-factor)
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR)         0.003   0.004     0.92   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGRA)        0.003   0.003     1.11   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR_VNG)     0.044   0.032     1.37   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2GRAY)        0.003   0.003     1.34   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR)         0.003   0.004     0.91   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGRA)        0.003   0.003     1.10   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR_VNG)     0.045   0.033     1.37   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2GRAY)        0.003   0.003     1.32   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR)         0.003   0.004     0.91   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGRA)        0.003   0.003     1.10   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR_VNG)     0.045   0.033     1.37   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2GRAY)        0.003   0.003     1.31   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR)         0.003   0.004     0.92   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGRA)        0.003   0.003     1.11   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR_VNG)     0.048   0.033     1.48   
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2GRAY)        0.004   0.003     1.58   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR)        0.145   0.133     1.09   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGRA)       0.114   0.091     1.26   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR_VNG)    1.946   1.111     1.75   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2GRAY)       0.108   0.088     1.22   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR)        0.117   0.132     0.89   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGRA)       0.102   0.091     1.13   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR_VNG)    1.935   1.119     1.73   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2GRAY)       0.108   0.088     1.23   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR)        0.115   0.132     0.87   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGRA)       0.102   0.104     0.98   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR_VNG)    1.932   1.135     1.70   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2GRAY)       0.106   0.087     1.22   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR)        0.115   0.132     0.87   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGRA)       0.102   0.091     1.12   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR_VNG)    1.911   1.192     1.60   
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2GRAY)       0.110   0.104     1.06   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerBG2BGR)      0.897   0.884     1.01   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerBG2BGRA)     0.800   0.610     1.31   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerBG2BGR_VNG) 14.204   7.563     1.88   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerBG2GRAY)     0.750   0.567     1.32   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerGB2BGR)      0.924   0.869     1.06   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerGB2BGRA)     0.725   0.585     1.24   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerGB2BGR_VNG) 13.647   7.501     1.82   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerGB2GRAY)     0.717   0.564     1.27   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerGR2BGR)      0.783   0.878     0.89   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerGR2BGRA)     0.733   0.580     1.26   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerGR2BGR_VNG) 13.128   7.464     1.76   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerGR2GRAY)     0.753   0.562     1.34   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerRG2BGR)      0.784   0.877     0.89   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerRG2BGRA)     0.689   0.613     1.12   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerRG2BGR_VNG) 13.090   7.419     1.76   
cvtColorBayer8u::Size_CvtMode_Bayer::(1920x1080, COLOR_BayerRG2GRAY)     0.712   0.567     1.26

Apart from the Bayer2BGR case, which is indeed difficult to solve, the remaining conversions show some performance improvement.

@asmorkalov
Contributor

asmorkalov commented Sep 10, 2024

OK, I found the cause of the issue. OpenCV does not enable AVX2 (256-bit) flags by default, in order to stay cross-platform, so the new branch is not used in the default build. Building with -DCPU_BASELINE=AVX2 enables it, and I get (4.x default vs. patch with AVX2 forced, Ryzen 3700):
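A minimal sketch of why a default build never reaches a 256-bit branch: the choice is made at compile time, so if the compiler is not told that AVX2 is available, the wider code is not compiled in at all. The example uses the compiler's own `__AVX2__` macro as a stand-in; it is an assumption that the patch's guard behaves the same way, and `selected_kernel` is a hypothetical name, not an OpenCV function.

```cpp
#include <string>

// Hypothetical stand-in for a kernel with width-specific branches.
// __AVX2__ is defined by GCC/Clang under -mavx2 and by MSVC under /arch:AVX2.
// Assumption: OpenCV's internal guard for the 256-bit code is enabled only
// when AVX2 is part of the build baseline, much like this macro.
std::string selected_kernel() {
#if defined(__AVX2__)
    return "256-bit branch";            // compiled only when AVX2 is enabled
#else
    return "128-bit or scalar branch";  // what a default build falls back to
#endif
}
```

Compiling this file without any AVX2 flag and then with one changes the returned string, which mirrors the difference between the default build and the -DCPU_BASELINE=AVX2 build described above.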

cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR)          0.006        0.006             0.90      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGRA)         0.005        0.005             1.15      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR_VNG)      0.047        0.037             1.27      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2GRAY)         0.004        0.003             1.29      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR)          0.006        0.006             0.91      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGRA)         0.005        0.005             1.15      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR_VNG)      0.047        0.037             1.27      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2GRAY)         0.004        0.003             1.31      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR)          0.006        0.006             0.91      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGRA)         0.005        0.005             1.14      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR_VNG)      0.047        0.037             1.27      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2GRAY)         0.004        0.003             1.29      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR)          0.006        0.006             0.91      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGRA)         0.005        0.005             1.14      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR_VNG)      0.046        0.037             1.27      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2GRAY)         0.004        0.003             1.31      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR)         0.053        0.061             0.86      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGRA)        0.042        0.043             0.97      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR_VNG)     1.865        1.160             1.61      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2GRAY)        0.028        0.027             1.01      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR)         0.041        0.052             0.78      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGRA)        0.041        0.035             1.17      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR_VNG)     1.858        1.152             1.61      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2GRAY)        0.028        0.022             1.25      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR)         0.041        0.047             0.87      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGRA)        0.041        0.037             1.10      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR_VNG)     1.848        1.146             1.61      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2GRAY)        0.028        0.021             1.30      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR)         0.041        0.047             0.87      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGRA)        0.041        0.037             1.10      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR_VNG)     1.845        1.146             1.61      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2GRAY)        0.028        0.021             1.29 

@asmorkalov
Contributor

More accurate benchmark, 4.x AVX2 vs patch AVX2:

cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR)            0.005           0.006             0.82      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGRA)           0.004           0.005             0.89      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2BGR_VNG)        0.036           0.037             0.98      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerBG2GRAY)           0.004           0.003             1.11      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR)            0.005           0.006             0.83      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGRA)           0.004           0.005             0.89      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2BGR_VNG)        0.036           0.037             0.98      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGB2GRAY)           0.004           0.003             1.12      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR)            0.005           0.006             0.82      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGRA)           0.004           0.005             0.89      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2BGR_VNG)        0.036           0.037             0.98      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerGR2GRAY)           0.004           0.003             1.11      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR)            0.005           0.006             0.83      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGRA)           0.004           0.005             0.89      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2BGR_VNG)        0.036           0.037             0.98      
cvtColorBayer8u::Size_CvtMode_Bayer::(127x61, COLOR_BayerRG2GRAY)           0.004           0.003             1.12      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR)           0.046           0.061             0.74      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGRA)          0.034           0.043             0.80      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2BGR_VNG)       1.415           1.160             1.22      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerBG2GRAY)          0.023           0.027             0.83      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR)           0.037           0.052             0.70      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGRA)          0.034           0.035             0.98      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2BGR_VNG)       1.406           1.152             1.22      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGB2GRAY)          0.022           0.022             1.00      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR)           0.037           0.047             0.78      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGRA)          0.037           0.037             1.00      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2BGR_VNG)       1.400           1.146             1.22      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerGR2GRAY)          0.022           0.021             1.04      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR)           0.050           0.047             1.06      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGRA)          0.041           0.037             1.10      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2BGR_VNG)       1.400           1.146             1.22      
cvtColorBayer8u::Size_CvtMode_Bayer::(640x480, COLOR_BayerRG2GRAY)          0.022           0.021             1.04    

@Burnside999
Contributor Author

More accurate benchmark, 4.x AVX2 vs patch AVX2:

Was the 4.x build here also compiled with -DCPU_BASELINE=AVX2?

I'm not quite sure what happens in that case: does the 4.x build execute the simd128 functions or the non-SIMD ones?

@FantasqueX
Contributor

As stated in #25019, universal intrinsics are preferred. I just looked through SIMDBayerInterpolator_8u::bayer2Gray, and the SIMD256 code is similar to the existing SIMD128 code. So I suggest changing the existing code to use universal intrinsics, which would also allow optimization with SIMD512 and SIMD_SCALABLE and make the code cleaner.

@Burnside999
Contributor Author

@FantasqueX This is great. I've been unable to find the performance bottleneck in COLOR_BayerGR2BGRA, and hopefully it will be back to normal after switching to universal intrinsics.

I'll be working on this on October 5th & October 6th.

@mshabunin
Contributor

Currently I'm observing an accuracy test failure with this PR (we don't have AVX2 builders in precommit):

[ RUN      ] ImgProc_BayerEdgeAwareDemosaicing.accuracy
/work/opencv/modules/imgproc/test/test_color.cpp:3027: Failure
Expected equality of these values:
  countNonZero(diff.reshape(1) > 1)
    Which is: 760986
  0
/work/opencv/modules/ts/src/ts.cpp:612: Failure
Failed

	failure reason: Bad accuracy
	test case #-1
	seed: 0000000000000000
-----------------------------------
	SUM: 
Reference value: 127
Actual value: 137
(y, x): (0, 0)
Channel pos: 0
Pattern: bg
Bayer image type: CV_8U
-----------------------------------

[  FAILED  ] ImgProc_BayerEdgeAwareDemosaicing.accuracy (8 ms)
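To read the log: the assertion expects zero channel values to differ from the reference by more than 1, and 760986 do. The sketch below is a plain C++ analogue of that count, not the OpenCV test code; it assumes `diff` in the test is the per-channel absolute difference between the actual and reference images, and `count_mismatches` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Count channel values whose absolute difference from the reference exceeds
// `tolerance`. Plain C++ analogue of countNonZero(diff.reshape(1) > 1),
// assuming diff holds per-channel absolute differences.
std::size_t count_mismatches(const std::vector<std::uint8_t>& reference,
                             const std::vector<std::uint8_t>& actual,
                             int tolerance = 1) {
    // Compare only the overlapping part if the sizes differ.
    const std::size_t n =
        reference.size() < actual.size() ? reference.size() : actual.size();
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const int d = std::abs(static_cast<int>(reference[i]) -
                               static_cast<int>(actual[i]));
        if (d > tolerance) ++count;  // off-by-one differences are tolerated
    }
    return count;
}
```

The first mismatch reported in the log (reference 127, actual 137 at (0, 0), channel 0) differs by 10, so it would be counted; a pair such as 50 and 51 would not.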

@Burnside999
Contributor Author

Okay. I've been a little busy lately, but I'll keep this in mind. Thanks.

@asmorkalov asmorkalov modified the milestones: 4.11.0, 4.12.0 Dec 20, 2024
@asmorkalov
Contributor

asmorkalov commented Mar 11, 2025

Is the PR still relevant after #26868?

@asmorkalov asmorkalov closed this Mar 17, 2025