bit-exact cuda::equalizeHist#18136
@nglee Good job! I'll test the solution on older cards and report back.
|
We also need the following PR to pass performance tests: |
|
Hello @nglee, please take a look at the CI status: there is a build failure on the Ubuntu 14.04 + CUDA configuration. I see the same failure on my Jetson TK1, whose rootfs is also based on Ubuntu 14.04. You also need to update the performance test regression data so that the performance test passes. Follow these instructions to do it: https://github.com/opencv/opencv/wiki/HowToUsePerfTests#how-to-update-perf-data
|
@nglee I updated the CI instructions to take your test data patch into account. It should be handled by the CI bot automatically. Please take a look at the build issue.
Force-pushed from 115fcf3 to f617f18.
|
@asmorkalov I've fixed the build error, and also some compiler warnings. |
It should be handled automatically on the public CI without extra instructions (it checks for a branch with the same name in opencv_extra).
|
Great to see this in, thanks very much @nglee. Can someone please provide an estimate on when this would be merged into the |
|
@rgov It is now merged. |
Merge with extra: opencv/opencv_extra#797
This PR aims to make cuda::equalizeHist bit-exact with its CPU counterpart.
resolves #18035
resolves #10330
When building a lookup table from a histogram, the CPU implementation does this:
(from #226 (comment))
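The referenced snippet did not survive on this page. As a rough illustration only, the sketch below shows how a lookup table of this kind is typically built on the CPU: skip the leading empty bins, scale by 255 / (total - first non-empty bin count), then round and saturate the running sum. The helper name buildEqualizeLut and its signature are hypothetical and are not part of OpenCV; std::lrint stands in for OpenCV's rounding, under the assumption that the default round-to-nearest-even mode is active.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not an OpenCV API): builds an equalization LUT from a
// 256-bin histogram in the style of the CPU path described in the PR text.
inline std::array<std::uint8_t, 256> buildEqualizeLut(const std::array<int, 256>& hist,
                                                      int total)
{
    assert(total > 0);
    std::array<std::uint8_t, 256> lut{};

    // Find the first non-empty bin; it is excluded from the cumulative sum.
    int i = 0;
    while (i < 256 && hist[i] == 0) ++i;
    assert(i < 256);

    // Degenerate case: the image has a single gray level, so keep that level.
    if (hist[i] == total) {
        lut.fill(static_cast<std::uint8_t>(i));
        return lut;
    }

    const float scale = 255.0f / static_cast<float>(total - hist[i]);
    int sum = 0;
    lut[i] = 0;
    for (++i; i < 256; ++i) {
        sum += hist[i];
        // Round to nearest, then clamp to [0, 255] (assumed to mirror saturate_cast<uchar>).
        long v = std::lrint(static_cast<float>(sum) * scale);
        if (v < 0) v = 0;
        if (v > 255) v = 255;
        lut[i] = static_cast<std::uint8_t>(v);
    }
    return lut;
}
```

For a four-pixel image with one pixel at 0, one at 128 and two at 255, this maps 0 to 0, 128 to 85 and 255 to 255.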
However, what the CUDA implementation did was this:
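The original snippet is also missing here. One common way for a GPU path to diverge from the CPU result is to scale the full cumulative histogram by 255 / total, without excluding the first non-empty bin. Whether the previous CUDA code did exactly this is an assumption; the sketch below only illustrates that variant, and the helper name buildLutFullCdf is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not an OpenCV API): scales the full cumulative
// histogram by 255 / total. Assumed variant, shown only for comparison.
inline std::array<std::uint8_t, 256> buildLutFullCdf(const std::array<int, 256>& hist,
                                                     int total)
{
    assert(total > 0);
    std::array<std::uint8_t, 256> lut{};
    const float scale = 255.0f / static_cast<float>(total);
    int sum = 0;
    for (int i = 0; i < 256; ++i) {
        sum += hist[i];  // inclusive cumulative sum, first bin included
        long v = std::lrint(static_cast<float>(sum) * scale);
        if (v < 0) v = 0;
        if (v > 255) v = 255;
        lut[i] = static_cast<std::uint8_t>(v);
    }
    return lut;
}
```

With a four-pixel image holding one pixel at 0, one at 128 and two at 255, this variant maps gray level 0 to 64, whereas a table that excludes the first non-empty bin maps it to 0. Differences of this kind are what a non-zero test tolerance has to absorb.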
This PR implements a CUDA kernel to make the CUDA implementation bit-exact.
For tests, now we can change this:
to this:
The performance of the new CUDA implementation is similar to that of the previous one, and it is faster than the CPU implementation.
Tested on RTX 2080 Ti and CUDA 11.
The patch to opencv_extra has the same branch name.