
Add CUDA backend for LSTM layer #20938

Merged
opencv-pushbot merged 1 commit into opencv:4.x from JulieBar:lstm_cuda2 on Apr 1, 2022

@JulieBar
Contributor

@JulieBar JulieBar commented Oct 25, 2021

Add CUDA backend for LSTM layer

force_builders=Custom
buildworker:Custom=linux-4,linux-6
Xbuild_image:Custom=ubuntu-cuda:18.04
build_image:Custom=ubuntu-cuda-cc52:18.04
test_modules:Custom=dnn

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on code under the GPL or another license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@alalek
Member

alalek commented Oct 28, 2021

BTW, Warning from https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html#torch.nn.LSTM :

There are known non-determinism issues for RNN functions on some versions of cuDNN and CUDA. You can enforce deterministic behavior by setting the following environment variables:

On CUDA 10.1, set environment variable CUDA_LAUNCH_BLOCKING=1. This may affect performance.

On CUDA 10.2 or later, set environment variable (note the leading colon symbol) CUBLAS_WORKSPACE_CONFIG=:16:8 or CUBLAS_WORKSPACE_CONFIG=:4096:2.

See the cuDNN 8 Release Notes for more information.
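
As a rough illustration of the quoted warning, the following minimal Python sketch maps a CUDA version to the environment variables the PyTorch documentation suggests. The helper names `deterministic_rnn_env` and `apply_env` are hypothetical and are not part of PyTorch, OpenCV, or this PR. Whether these settings also affect the OpenCV CUDA LSTM backend is not stated in this thread.

```python
import os


def deterministic_rnn_env(cuda_version, workspace=":4096:2"):
    """Return the env vars suggested by the quoted PyTorch warning.

    cuda_version is a (major, minor) tuple such as (10, 2).
    This helper is hypothetical; it only encodes the rules quoted above.
    """
    # The warning lists exactly two workspace settings (note the leading colon).
    if workspace not in (":16:8", ":4096:2"):
        raise ValueError("workspace must be ':16:8' or ':4096:2'")
    if cuda_version >= (10, 2):
        # CUDA 10.2 or later: configure the cuBLAS workspace.
        return {"CUBLAS_WORKSPACE_CONFIG": workspace}
    if cuda_version == (10, 1):
        # CUDA 10.1: blocking kernel launches; this may affect performance.
        return {"CUDA_LAUNCH_BLOCKING": "1"}
    # The warning gives no recommendation for other versions.
    return {}


def apply_env(settings, environ=os.environ):
    """Copy the settings into an environment mapping (os.environ by default)."""
    environ.update(settings)
    return environ
```

A typical use would be `apply_env(deterministic_rnn_env((11, 4)))` at the very start of a script, on the assumption that the variables need to be in place before any CUDA library is initialized.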

@YashasSamaga
Contributor

YashasSamaga commented Oct 29, 2021

Does the CI build with both cuDNN 7 and cuDNN 8?

(Sorry, didn't notice this was a draft)

@JulieBar
Contributor Author

> Does the CI build with both cuDNN 7 and cuDNN 8?
>
> (Sorry, didn't notice this was a draft)

I'm going to ask for your review anyway :) Thank you!
I'll address your comments, correct what's left, and ping you again later if you don't mind.

@asmorkalov asmorkalov requested a review from rogday December 17, 2021 10:34
@rogday rogday changed the base branch from 4.x to 3.4 March 1, 2022 18:22
@rogday rogday changed the base branch from 3.4 to master March 1, 2022 18:22
@rogday rogday force-pushed the lstm_cuda2 branch 2 times, most recently from 45e3073 to f52c761 Compare March 1, 2022 23:22
@rogday rogday changed the base branch from master to 4.x March 2, 2022 08:19
@rogday rogday mentioned this pull request Mar 4, 2022
6 tasks
@rogday rogday marked this pull request as ready for review March 30, 2022 13:35
Member

@rogday rogday left a comment


LGTM! 👍

@rogday
Member

rogday commented Mar 30, 2022

@YashasSamaga, could you please take a look?

Co-authored-by: Julia Bareeva <jbareeva@gmail.com>