Closed
Labels
bug, category: gpu/cuda (contrib)
Description
System information (version)
- OpenCV => master
- Operating System / Platform => Windows 10 Pro 64bit / CUDA 8.0
- Compiler => Visual Studio 2015 Win64
- Operating System / Platform => Ubuntu 64bit / Jetson TX1 / CUDA 8.0
- Compiler => GCC 5.4.0
Detailed description
- The following tests from opencv_test_cudaarithm fail:
[==========] 10629 tests from 65 test cases ran. (34990 ms total)
[ PASSED ] 10625 tests.
[ FAILED ] 4 tests, listed below:
[ FAILED ] CUDA_Arithm/MeanStdDev.Async/0, where GetParam() = (NVIDIA Tegra X1, 128x128, whole matrix)
[ FAILED ] CUDA_Arithm/MeanStdDev.Async/1, where GetParam() = (NVIDIA Tegra X1, 128x128, sub matrix)
[ FAILED ] CUDA_Arithm/MeanStdDev.Async/2, where GetParam() = (NVIDIA Tegra X1, 113x113, whole matrix)
[ FAILED ] CUDA_Arithm/MeanStdDev.Async/3, where GetParam() = (NVIDIA Tegra X1, 113x113, sub matrix)
4 FAILED TESTS
- Also confirmed with Jetson TX2, GeForce GTX 650M, GeForce GTX 1060
- I extracted a minimal example from the OpenCV code, but the crash didn't reproduce with it.
- I searched for the message "Microsoft C++ exception: NppStatus at memory location", which appeared in Visual Studio, and found this post from someone at NVIDIA:
It seems that any time you issue a nppSetStream() call that in fact changes the underlying stream, you will need to issue a cudaDeviceSynchronize() call (first), before issuing the nppSetStream() call.
- This is exactly what is happening inside OpenCV
- I added a macro to call cudaStreamSynchronize before nppSetStream, and this solved the issue
- References
For this reason it is recommended that cudaDeviceSynchronize (or at least cudaStreamSynchronize) be called before making an nppSetStream call to change to a new stream ID. This will insure that any internal function calls that have not yet occurred will be completed using the current stream ID before it changes to a new ID.
- I'll send a PR later
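The workaround described above (synchronize the stream being left, then switch) can be sketched as a small scope guard. This is only an illustration, not the actual patch: `NppStreamGuard` and `RecordingApi` are hypothetical names, and the CUDA/NPP calls are hidden behind a template parameter so the sketch compiles without the CUDA toolkit. In a real build, `getStream`, `setStream` and `synchronize` would be assumed to forward to `nppGetStream`, `nppSetStream` and `cudaStreamSynchronize`.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the CUDA/NPP calls. It only records the call
// order. A real implementation is assumed to forward to nppGetStream(),
// nppSetStream() and cudaStreamSynchronize().
struct RecordingApi {
    using Stream = int;
    Stream current = 0;
    std::vector<std::string> log;

    Stream getStream() const { return current; }
    void synchronize(Stream s) { log.push_back("sync " + std::to_string(s)); }
    void setStream(Stream s) {
        log.push_back("set " + std::to_string(s));
        current = s;
    }
};

// Scope guard: switches the NPP stream for the lifetime of the object and
// synchronizes the stream being left before every actual switch.
template <typename Api>
class NppStreamGuard {
public:
    NppStreamGuard(Api& api, typename Api::Stream next)
        : api_(api), previous_(api.getStream()) {
        switchTo(next);
    }

    // Restore the stream that was active when the guard was created.
    ~NppStreamGuard() { switchTo(previous_); }

    NppStreamGuard(const NppStreamGuard&) = delete;
    NppStreamGuard& operator=(const NppStreamGuard&) = delete;

private:
    void switchTo(typename Api::Stream target) {
        const typename Api::Stream active = api_.getStream();
        if (active == target) return;  // stream does not change, nothing to flush
        api_.synchronize(active);      // let pending work finish on the old stream
        api_.setStream(target);
    }

    Api& api_;
    typename Api::Stream previous_;
};
```

With the recording stand-in, creating a guard for stream 7 while stream 0 is active and then leaving the scope produces the call order sync 0, set 7, sync 7, set 0. Skipping the synchronization when the stream does not change is a design choice of this sketch, based on the quoted advice that the problem arises when the call changes the underlying stream.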
Steps to reproduce
- run opencv_test_cudaarithm --gtest_filter=*MeanStdDev*Async*