
Fix test_inverse_singular for cublas path; fix cusolver inverse multi-stream issue #47026

Closed
xwang233 wants to merge 3 commits into pytorch:master from xwang233:fix-test-inverse-singular-cublas

Conversation

@xwang233
Collaborator

@xwang233 xwang233 commented Oct 28, 2020

test_inverse_singular for cublas failure

Related
#46616 (comment)
https://app.circleci.com/pipelines/github/pytorch/pytorch/232112/workflows/4131d4ca-cd51-44e3-8e6c-b1c3555c62fa/jobs/8523970/tests

The CUDA 11.1 CI container doesn't have the MAGMA library, so the cuBLAS matrix inverse path is enabled.

```
Oct 27 23:13:47 -- MAGMA not found. Compiling without MAGMA support
```

The test_inverse_singular test was introduced in #46625, but I forgot to fix that functionality for the cuBLAS path as well.
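For context, this is roughly what that fix amounts to on the cuBLAS path: getrfBatched reports a zero pivot through its per-matrix info array, and the inverse path must turn a positive info into a singular-matrix error instead of ignoring it. A minimal sketch; the helper name and pointer plumbing are hypothetical, and the real change lives inside ATen's batched inverse kernels rather than a standalone function:

```
// Hypothetical helper for illustration only; the actual fix is inside
// ATen's batched inverse path, not a free function like this.
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <stdexcept>
#include <string>
#include <vector>

void check_lu_singular(cublasHandle_t handle, int n,
                       float** d_Aarray,  // device array of device pointers
                       int* d_pivots,     // n * batch_size ints on device
                       int* d_info,       // batch_size ints on device
                       int batch_size) {
  // LU-factorize every matrix in the batch in place.
  cublasSgetrfBatched(handle, n, d_Aarray, n, d_pivots, d_info, batch_size);

  // The per-matrix status array lives on the device; copy it back.
  std::vector<int> h_info(batch_size);
  cudaMemcpy(h_info.data(), d_info, sizeof(int) * batch_size,
             cudaMemcpyDeviceToHost);

  for (int i = 0; i < batch_size; ++i) {
    // info > 0 means U(info, info) is exactly zero: matrix i is singular
    // and has no inverse. This is the condition test_inverse_singular
    // expects to be surfaced as an error.
    if (h_info[i] > 0) {
      throw std::runtime_error(
          "inverse: matrix " + std::to_string(i) + " is singular: U(" +
          std::to_string(h_info[i]) + "," + std::to_string(h_info[i]) +
          ") is zero");
    }
  }
}
```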

cusolver inverse multi-stream failure

fix #47272

The original CUDA event record / stream-block ordering was wrong, which could cause NaN in the output tensor.
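In CUDA terms, the event has to be recorded on the stream that produces the data, and the consuming stream has to wait on that event before reading. A minimal sketch of that ordering with the plain runtime API; this is the general pattern, not the actual ATen code touched by this PR, and the two kernels are trivial stand-ins:

```
#include <cuda_runtime.h>

__global__ void produce(float* buf) { buf[0] = 42.0f; }  // writes the data
__global__ void consume(const float* buf, float* out) { out[0] = buf[0]; }

// General record-then-wait pattern for handing a result from one stream
// to another.
void ordered_handoff(float* buf, float* out,
                     cudaStream_t producer, cudaStream_t consumer) {
  cudaEvent_t done;
  cudaEventCreateWithFlags(&done, cudaEventDisableTiming);

  produce<<<1, 1, 0, producer>>>(buf);

  // Record on the PRODUCING stream, after the producing work is enqueued...
  cudaEventRecord(done, producer);
  // ...and make the CONSUMING stream wait before any of its later work runs.
  // Recording on the wrong stream, or skipping this wait, lets `consume`
  // read `buf` before it is written -- i.e. garbage/NaN in the output.
  cudaStreamWaitEvent(consumer, done, 0);

  consume<<<1, 1, 0, consumer>>>(buf, out);

  cudaEventDestroy(done);
}
```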

On my machine, the original code produces a NaN within roughly 50k to 500k loops. After this change, no NaN is observed in more than 2.5M loops.
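For a sense of what "loops" means here, a stress harness along these lines would do; the loop count echoes the numbers above, but the code itself is an illustrative libtorch sketch (intended to hit the batched cusolver inverse path), not the script used for this PR:

```
#include <ATen/ATen.h>
#include <iostream>

// Illustrative stress loop: repeatedly invert a well-conditioned batch-2
// input and stop at the first NaN. A correct inverse never produces NaN
// for this input.
int main() {
  at::Tensor a = at::eye(64, at::kCUDA).repeat({2, 1, 1}) +
                 0.1 * at::randn({2, 64, 64}, at::kCUDA);
  for (long i = 0; i < 2500000; ++i) {
    at::Tensor inv = at::inverse(a);
    if (at::isnan(inv).any().item<bool>()) {
      std::cout << "NaN after " << i << " iterations\n";
      return 1;
    }
  }
  std::cout << "no NaN observed\n";
  return 0;
}
```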

The performance of batch-2 matrix inverse is still the same as in #42403.

@xwang233 xwang233 requested review from ngimel and zasdfgbnm October 28, 2020 22:28
@xwang233
Collaborator Author

cc @ptrblck @malfet

@albanD albanD added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Oct 29, 2020
@facebook-github-bot
Contributor

Hi @xwang233!

Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but we do not have a signature on file.

In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!


@xwang233 xwang233 changed the title from "Fix test_inverse_singular for cublas path" to "Fix test_inverse_singular for cublas path; fix cusolver inverse multi-stream issue" Nov 6, 2020
```
auto dataPtr = allocator.allocate(sizeof(int) * batch_size * n);
int* ipiv_array = reinterpret_cast<int*>(dataPtr.get());

Tensor _info1 = at::zeros({batch_size}, self.options().dtype(at::kInt));
```
Collaborator


why are you allocating infos here again?

Contributor

@facebook-github-bot facebook-github-bot left a comment


@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@ngimel merged this pull request in a1f494c.

emcastillo pushed a commit to emcastillo/pytorch that referenced this pull request Mar 16, 2022
…-stream issue (pytorch#47026)


Pull Request resolved: pytorch#47026

Reviewed By: mruberry

Differential Revision: D24838546

Pulled By: ngimel

fbshipit-source-id: 3b83e4ab8e6b47a8273cba277251765bd6d97911
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
…-stream issue (pytorch#47026)


Labels

cla signed · Merged · open source · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)


Development

Successfully merging this pull request may close these issues.

The return of torch.inverse contains nan sometime

5 participants