Skip to content

cudaarithm: make the asynchronous call to NPP safe#11064

Merged
opencv-pushbot merged 2 commits intoopencv:masterfrom
tomoaki0705:fixCudaStreamAsync
Mar 23, 2018
Merged

cudaarithm: make the asynchronous call to NPP safe#11064
opencv-pushbot merged 2 commits intoopencv:masterfrom
tomoaki0705:fixCudaStreamAsync

Conversation

@tomoaki0705
Copy link
Copy Markdown
Contributor

@tomoaki0705 tomoaki0705 commented Mar 13, 2018

resolves #11063

force_builders=Custom
docker_image:Custom=ubuntu-cuda:16.04
allow_multiple_commits=1

{
oldStream = nppGetStream();
nppSetStream(StreamAccessor::getStream(newStream));
safeNppSetStream(oldStream, StreamAccessor::getStream(newStream));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should bailout if newStream equals to oldStream

nppSafeSetStream(oldStream, newStream);
}

inline ~NppStreamHandler()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the destructor should also use the macro.

@tomoaki0705
Copy link
Copy Markdown
Contributor Author

As I wrote in #11063, this patch is insufficient to avoid the crash.
I switched to cudaDeviceSynchronize, but it didn't work.
I tried adding cudaDeviceSynchronize before the deconstructor (as @nglee suggested) but it didn't work.

Please wait merge until I find a solid fix for this.

@tomoaki0705 tomoaki0705 force-pushed the fixCudaStreamAsync branch 2 times, most recently from ef1ae71 to 6e31b2b Compare March 15, 2018 09:41
Copy link
Copy Markdown
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@opencv-pushbot opencv-pushbot merged commit f4e5d77 into opencv:master Mar 23, 2018
@tomoaki0705 tomoaki0705 deleted the fixCudaStreamAsync branch March 23, 2018 13:13
@nglee
Copy link
Copy Markdown
Contributor

nglee commented May 18, 2018

@tomoaki0705
Hi, I'd like to suggest changing nppSetStream to nppSafeSetStream in the deconstructor of class NppStreamHandler.

That is, from here,

inline ~NppStreamHandler()
{
    nppSetStream(oldStream);
}

we might want to change above code to:

inline ~NppStreamHandler()
{
    nppSafeSetStream(oldStream);
}

IMHO, the npp kernels may still be running when the NppStreamHandler object is being destroyed, and we should wait for the kernel to finish before switching npp stream in such cases.

@tomoaki0705
Copy link
Copy Markdown
Contributor Author

Ahhh!!!! Very nice catch @nglee,
At last, I could be released from this two month duration headache....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cudaarithm: async call to NPP fails

4 participants