add feature: cuda::Stream(cudaStream_t&&) constructor #19285

@diablodale

Description

I propose adding a cuda::Stream constructor that takes ownership of a cudaStream_t. When that Stream is destructed, it also calls cudaStreamDestroy() on the cudaStream_t. This proposal is driven by the needs of a multi-threaded application that decodes image frames using Nvidia's decoding SDKs like nvJpeg.

System information (version)
  • OpenCV => 4.5.1
  • Operating System / Platform => Microsoft Windows [Version 10.0.19042.685] 64-bit
  • Compiler => Visual Studio 2019 Community v16.8.3
Detailed description
  • Each thread needs its own CUDA stream. It can use the single CUDA default stream or
    create unique CUDA streams per-thread. And as is written in OpenCV documentation,
    "Please use different Stream objects for different CPU threads."
  • It is not possible in OpenCV to construct a cuda::Stream from a cudaStream_t.
    There is no way to write new cuda::Stream(cudaStream_t).
  • cuda::StreamAccessor::wrapStream(cudaStream_t stream) is not a constructor; it is a
    static function. Therefore, it cannot be used to construct a cuda::Stream and return
    a pointer to it. It cannot be used with shared_ptr, concurrency::combinable, etc.
  • It is unreasonable to create a shared_ptr with a custom deleter lambda that manages
    a custom struct created with new, which in turn holds the Stream returned from
    wrapStream(), etc. This is not a good approach.

Therefore, I propose a new constructor for cuda::Stream. It is a small and focused new
feature, easy to use in code, and meets all my needs. Use of C++11 move semantics makes
it clear that ownership of the cudaStream_t is passed to the cuda::Stream. And when the
Stream is destructed, it cleanly deallocates the cudaStream_t using already existing code.
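
To make the ownership transfer concrete, here is a minimal sketch that compiles without CUDA or OpenCV. Everything in it is a hypothetical stand-in: FakeStreamHandle plays the role of cudaStream_t, fakeStreamDestroy() plays the role of cudaStreamDestroy(), and OwningStream is an illustrative wrapper, not cv::cuda::Stream and not the code in the PR. It is move-only for simplicity and nulls the caller's handle to make the handover visible; the actual constructor need not do either.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-ins so the sketch builds without CUDA or OpenCV.
using FakeStreamHandle = void*;   // plays the role of cudaStream_t
int g_destroyCount = 0;           // counts calls to the fake destroy function

// Plays the role of cudaStreamDestroy().
void fakeStreamDestroy(FakeStreamHandle) { ++g_destroyCount; }

// Hypothetical owning wrapper (assumption: NOT cv::cuda::Stream).
class OwningStream {
public:
    // The rvalue reference signals that the caller gives up the handle.
    explicit OwningStream(FakeStreamHandle&& handle) noexcept : handle_(handle) {
        handle = nullptr;  // make the handover visible to the caller
    }
    OwningStream(OwningStream&& other) noexcept : handle_(other.handle_) {
        other.handle_ = nullptr;
    }
    OwningStream(const OwningStream&) = delete;
    OwningStream& operator=(const OwningStream&) = delete;

    // Destroys the handle exactly once, if this object still owns it.
    ~OwningStream() {
        if (handle_) fakeStreamDestroy(handle_);
    }

    FakeStreamHandle get() const noexcept { return handle_; }

private:
    FakeStreamHandle handle_ = nullptr;
};

// Small demonstration: the handle is destroyed when the wrapper leaves scope.
void demoOwnership() {
    int dummy = 0;
    FakeStreamHandle raw = &dummy;
    const int before = g_destroyCount;
    {
        OwningStream s(std::move(raw));
        assert(s.get() == &dummy);
        assert(raw == nullptr);
    }
    assert(g_destroyCount == before + 1);
}
```

Because the parameter is an rvalue reference, passing a named handle requires std::move(), which is what makes the transfer of ownership visible at the call site.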

Personally, I would prefer that the constructor accept a cudaStream_t type. However, it is
abundantly clear that the authors of the cuda module do not want to include the needed
CUDA headers in that module. Therefore, the constructor accepts a void*. This also
aligns with the return of Stream::cudaPtr().

I have a PR ready and will submit it after this issue is written. It includes the new
constructor, expands the signature of the Stream::Impl as is needed, includes a test,
and documentation.

Proposed signature and usage

Imagine an app that has a pool of threads. Perhaps it is a pipeline, flow graph, or other mechanism that does not manually create a thread with std::thread(func()). Threads are created, destroyed, and scaled up or down automatically by a managed system. Yet every thread needs to initialize some thread-local storage/data, such as a CUDA stream. This creates a need to wrap that data in a class, use TLS tools like concurrency::combinable to initialize it only once per thread, etc., and in turn a need for a Stream constructor that supports this.

cv::cuda::Stream::Stream(void*&& cudaStream);

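#include <stdexcept>
#include <utility>
#include <cuda_runtime.h>
#include <opencv2/core/cuda.hpp>

// Creates a non-blocking CUDA stream and hands ownership to the returned cv::cuda::Stream.
cv::cuda::Stream threadInitLocalData() {
    cudaStream_t stream;
    if (cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking) != cudaSuccess)
        throw std::runtime_error("Cannot create CUDA stream");
    // The cast yields a void* temporary that the proposed constructor takes ownership of.
    return cv::cuda::Stream(std::move((void*)stream));
}
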
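
As a portable alternative to concurrency::combinable for the once-per-thread initialization described above, standard C++11 thread_local could be used. The following is only a sketch under assumptions: PerThreadContext, makeContext(), and localContext() are hypothetical names. In a real application makeContext() would be something like threadInitLocalData() and the stored object would be a cv::cuda::Stream.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a per-thread resource such as a stream wrapper.
struct PerThreadContext {
    int id;
};

// Counts how many contexts have been created across all threads.
std::atomic<int> g_contextsCreated{0};

// Hypothetical factory, playing the role of threadInitLocalData().
PerThreadContext makeContext() {
    return PerThreadContext{ ++g_contextsCreated };
}

// Returns the calling thread's context, creating it on first use in that thread.
PerThreadContext& localContext() {
    thread_local PerThreadContext ctx = makeContext();
    return ctx;
}

// Small demonstration: repeated calls on one thread reuse the same object.
void demoLocalContext() {
    PerThreadContext& first = localContext();
    const int created = g_contextsCreated.load();
    PerThreadContext& second = localContext();
    assert(&first == &second);
    assert(g_contextsCreated.load() == created);
}
```

Each thread that calls localContext() gets its own instance, and that instance is destroyed when the thread exits, which is the point at which an owning wrapper would release its stream.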
Issue submission checklist
  • I report the issue, it's not a question
  • I checked the problem with documentation, FAQ, open issues,
    answers.opencv.org, Stack Overflow, etc and have not found solution
  • I updated to latest OpenCV version and the issue is still there
  • There is reproducer code and related data files: videos, images, onnx, etc
