[CUDA10 fixes] Adds max plan number for CUDA 10 cufft plan cache array #12553
syed-ahmed wants to merge 2 commits into pytorch:master
Conversation
facebook-github-bot
left a comment
soumith is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@@ -394,12 +393,12 @@ class CuFFTParamsLRUCache {
  // in CUDA 10. Hence, when compiling with CUDA 10, just
  // don't do the erase.
  #if CUDA_VERSION < 10000
    auto cur_size = _usage_list.size();
    for (size_t i = 0; i < cur_size - _max_size; i++) {
      delete_it--;
      _cache_map.erase(delete_it->first);
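The eviction pattern in the diff can be sketched in isolation as follows. This is a minimal illustration, not the actual `CuFFTParamsLRUCache`: the `LRUCache`, `Key`, and `Plan` names are placeholders (the real cache stores cuFFT plan objects keyed by FFT parameters), and it assumes each key is inserted at most once.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Sketch of an LRU cache with enforced max capacity. Most recently used
// entries sit at the front of the usage list; when the cache grows past
// max_size, the oldest entries are erased from the back, together with
// their lookup-map entries (mirroring the erase loop in the diff above).
struct LRUCache {
  using Key = std::string;
  using Plan = int;  // stand-in for a cuFFT plan handle
  using UsageList = std::list<std::pair<Key, Plan>>;

  std::size_t max_size;
  UsageList usage_list;
  std::unordered_map<Key, UsageList::iterator> cache_map;

  explicit LRUCache(std::size_t n) : max_size(n) {}

  // Assumes k is not already cached, for brevity.
  void insert(const Key& k, Plan p) {
    usage_list.emplace_front(k, p);
    cache_map[k] = usage_list.begin();

    // Enforce capacity: walk backwards from the end of the usage list,
    // dropping map entries, then erase the evicted tail in one call.
    auto cur_size = usage_list.size();
    if (cur_size > max_size) {
      auto delete_it = usage_list.end();
      for (std::size_t i = 0; i < cur_size - max_size; i++) {
        delete_it--;
        cache_map.erase(delete_it->first);
      }
      usage_list.erase(delete_it, usage_list.end());
    }
  }
};
```

With `max_size = 2`, inserting a third key evicts the least recently used one, keeping the list and the map consistent with each other.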
ssnl
left a comment
The max capacity is not enforced, but it should be, considering that cuFFT plans may use GPU memory.
@ssnl Thanks for the review and explanation! I have pushed the requested changes.

Thanks!
facebook-github-bot
left a comment
SsnL is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: SsnL As per your review in pytorch/pytorch#12017, I added a max plan number for the CUDA 10 path. Our internal cuFFT team couldn't suggest a number since the limit depends on host/device memory. That is, a plan allocates some buffers on the device and also creates objects for the plans on the host side. I raised this number to 4x arbitrarily per your suggestion. Pull Request resolved: pytorch/pytorch#12553 Differential Revision: D10320832 Pulled By: SsnL fbshipit-source-id: 3148d45cd280dffb2039756e2f6a74fbc7aa086d
@ssnl As per your review in #12017, I added a max plan number for the CUDA 10 path. Our internal cuFFT team couldn't suggest a number since the limit depends on host/device memory. That is, a plan allocates some buffers on the device and also creates objects for the plans on the host side. I raised this number to 4x arbitrarily per your suggestion.
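The CUDA-10-gated cap described above could look roughly like the following. This is a hedged sketch: the constant names (`kBasePlanLimit`, `kMaxPlanNum`) and the base value are assumptions for illustration, not the identifiers or values used in the PR; only the 4x factor and the `CUDA_VERSION` gating come from the discussion.

```cpp
#include <cassert>
#include <cstddef>

// Sketch only: assume CUDA 10 when no toolkit header defines CUDA_VERSION.
#ifndef CUDA_VERSION
#define CUDA_VERSION 10000
#endif

// Hypothetical base limit; the real value would be tuned empirically.
constexpr std::size_t kBasePlanLimit = 1024;

#if CUDA_VERSION >= 10000
// CUDA 10 path: cached plans are not erased, so allow a larger cache,
// capped at 4x the base limit (the 4x factor was chosen arbitrarily
// per the review discussion).
constexpr std::size_t kMaxPlanNum = 4 * kBasePlanLimit;
#else
constexpr std::size_t kMaxPlanNum = kBasePlanLimit;
#endif
```

Because the true ceiling depends on both host and device memory (plans allocate device buffers and host-side objects), any compile-time constant here is necessarily a heuristic rather than a derived bound.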