
Enable cusolver potrf batched for Cholesky decomposition when cuda >= 11.3 #57788

Closed
xwang233 wants to merge 4 commits into master from ci-all/cusolver-cholesky-batched_cuda11.3

Conversation

@xwang233
Collaborator

@xwang233 xwang233 commented May 7, 2021

This PR enables the usage of cusolver potrf batched as the backend of Cholesky decomposition (torch.linalg.cholesky and torch.linalg.cholesky_ex) when cuda version is greater than or equal to 11.3.

Benchmark available at https://github.com/xwang233/code-snippet/tree/master/linalg/cholesky-new. It is seen that cusolver potrf batched performs better than magma potrf batched in most cases.

cholesky dispatch heuristics:

before:

  • batch size == 1: cusolver potrf
  • batch size > 1: magma xpotrf batched

after:

cuda >= 11.3:

  • batch size == 1: cusolver potrf
  • batch size > 1: cusolver potrf batched

cuda < 11.3 (not changed):

  • batch size == 1: cusolver potrf
  • batch size > 1: magma xpotrf batched

See also #42666 #47953 #53104 #53879

@facebook-github-bot
Contributor

facebook-github-bot commented May 7, 2021

💊 CI failures summary and remediations

As of commit 003fe38 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

@xwang233
Collaborator Author

xwang233 commented May 7, 2021

reserved

@codecov

codecov Bot commented May 7, 2021

Codecov Report

Merging #57788 (003fe38) into master (747312b) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #57788      +/-   ##
==========================================
- Coverage   76.83%   76.83%   -0.01%     
==========================================
  Files        1986     1986              
  Lines      197430   197430              
==========================================
- Hits       151691   151690       -1     
- Misses      45739    45740       +1     

@ngimel ngimel added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) May 7, 2021
@IvanYashchuk
Collaborator

If both cuSOLVER and MAGMA are available and the CUDA version is < 11.3, we should continue using batched MAGMA, as it has better performance than the single-input cuSOLVER variant called in a loop, right? This PR modifies the behavior for versions < 11.3 to use looped cuSOLVER instead of batched MAGMA.

Besides that dispatch issue, everything looks good.

@xwang233
Copy link
Copy Markdown
Collaborator Author

Ohhh, yes, you're right. Let me fix that dispatch logic. 😄


// Implementation of Cholesky decomposition using batched cusolverDn<T>potrfBatched
// Warning: cusolverDn<T>potrfBatched doesn't work quite well when matrix size or batch size is zero.
// If you write your own C++ extension and use this function, make sure you do a zero numel check for the input.
Collaborator


Nice note

#define USE_CUSOLVER
#endif

// cusolverDn<T>potrfBatched may have numerical issue before cuda 11.3 release,
Collaborator


Great comment

Collaborator

@mruberry mruberry left a comment


An excellent CUDA performance PR to cap the many performance improvements realized for the release of torch.linalg in PyTorch 1.9.

cc @ptrblck

@facebook-github-bot
Contributor

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mruberry merged this pull request in 7faac08.

krshrimali pushed a commit to krshrimali/pytorch that referenced this pull request May 19, 2021
… 11.3 (pytorch#57788)

Summary:
This PR enables the usage of cusolver potrf batched as the backend of Cholesky decomposition (`torch.linalg.cholesky` and `torch.linalg.cholesky_ex`) when cuda version is greater than or equal to 11.3.

Benchmark available at https://github.com/xwang233/code-snippet/tree/master/linalg/cholesky-new. It is seen that cusolver potrf batched performs better than magma potrf batched in most cases.

## cholesky dispatch heuristics:

### before:

- batch size == 1: cusolver potrf
- batch size > 1: magma xpotrf batched

### after:

cuda >= 11.3:
- batch size == 1: cusolver potrf
- batch size > 1: cusolver potrf batched

cuda < 11.3 (not changed):
- batch size == 1: cusolver potrf
- batch size > 1: magma xpotrf batched

 ---

See also pytorch#42666 pytorch#47953 pytorch#53104 pytorch#53879

Pull Request resolved: pytorch#57788

Reviewed By: ngimel

Differential Revision: D28345530

Pulled By: mruberry

fbshipit-source-id: 3022cf73b2750e1953c0e00a9e8b093dfc551f61
@github-actions github-actions Bot deleted the ci-all/cusolver-cholesky-batched_cuda11.3 branch February 11, 2024 01:57
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 25, 2026
… 11.3 (pytorch#57788)

Labels: cla signed, Merged, open source, triaged

6 participants