Conversation
cupy_setup_build.py
Outdated
| if module['name'] == 'cub': | ||
| # for <cupy/complex.cuh> | ||
| cupy_header = os.path.join( | ||
| os.path.dirname(os.path.realpath(__file__)), | ||
| 'cupy/core/include') | ||
| settings['include_dirs'].append(cupy_header) |
There was a problem hiding this comment.
Hi @anaruse, is this needed? I thought the CuPy header path is already added in get_compiler_setting() (a few lines above before preconfigure_modules() is called).
By the way, I intend to support CUDA 11 CUB in #2584. Could you comment on my approach there (see my changes in install/build.py)? Thanks!
There was a problem hiding this comment.
I'm sorry I couldn't mention this until CUDA11 was officially released, but starting with CUDA11, CUB is now included in CUDA toolkit.
There was a problem hiding this comment.
@anaruse I think it's semi-public info now...I already knew this from the release announcements and PRs in thrust/cub 😇 So, from the implementation in thrust/thrust, I inferred that the CUB headers will be available in /usr/local/cuda/include/cub/, and made the changes accordingly.
There was a problem hiding this comment.
With CUDA 11.0, you don't need to set the CUB_PATH, so I modified preconfigure_modules() to also add the CuPy header path here regardless of the CUB_PATH setting. But I agree that adding CuPy header path here is not appropriate, so I'll make sure the include path is set correctly in get_compiler_setting() for CUDA 11.0 as well.
There was a problem hiding this comment.
BTW, I did not know that you knew this already 😓
There was a problem hiding this comment.
It's open now:
cub (High performance primitives for CUDA)
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
It's just very odd that this is the only mention of CUB throughout the whole documentation; no other references or standalone sections (ex: Thrust has its own section). I wouldn't know what it is if I had not knew it already...
|
|
Thanks for the feedback, @leofang and @kmaehashi . I think I've incorporated all the feedback into the source code, so could you check it again? |
There was a problem hiding this comment.
Left two notes on the CUB thing.
Could you also update this function for NVRTC support, please? 🙏
Lines 60 to 79 in a04e38e
I think you'll also need to touch
tests/cupy_tests/cuda_tests/test_compiler.py due to this change.
| '../cupy/core/include') | ||
| cub_path = None | ||
| if cuda_path: | ||
| cub_path = os.path.join(cuda_path, 'include', 'cub') |
There was a problem hiding this comment.
Just wanna confirm: this line means util_namespace.cuh will live under /usr/local/cuda/cub/cub/ instead of /usr/local/cuda/cub/. Is this correct?
There was a problem hiding this comment.
Yes, that is correct.
[ CUDA 11.0 ]
$ ls /usr/local/cuda/include/cub/util_namespace.cuh
/usr/local/cuda/include/cub/util_namespace.cuh
install/build.py
Outdated
| cub_path = os.environ.get('CUB_PATH', '') | ||
| if os.path.exists(cub_path): | ||
| include_dirs.append(cupy_header) | ||
| include_dirs.append(cub_path) |
There was a problem hiding this comment.
Leaving a note: This search precedence is 1. CUDA-bundled CUB (for CUDA 11+), 2 user-provided CUB. This is different from what's done in my CUB-bundle PR (#2584). If this PR is merged first, we can move the discussion there.
There was a problem hiding this comment.
Thank you for pointing that out. I hadn't considered the case of using a user-specified CUB after CUDA 11, but it might be better to allow user-specified CUB to be used in CUDA 11 or later. I will change the search precedence.
|
It is failing to build in the Travis CI, but id doesn't seem to be related to my fix... Could you check it? |
|
I think something might go wrong with Sphinx, so Travis keeps complaining today. Possibly the latest release is backward incompatible? |
leofang
left a comment
There was a problem hiding this comment.
The changes in compiler.py LGTM. Just found one bug in the CUB search path.
install/build.py
Outdated
| include_dirs.append(cupy_header) | ||
| include_dirs.append(cub_path) |
There was a problem hiding this comment.
This part will need to be changed (do prepend/insert instead of append), because cub_path would otherwise be searched after cuda_path, so for CUDA 11+ we'd still end up with the built-in CUB.
There was a problem hiding this comment.
That's exactly right. I will fix it. Thanks!
| '../cupy/core/include') | ||
| cub_path = None | ||
| if cuda_path: | ||
| cub_path = os.path.join(cuda_path, 'include', 'cub') |
|
LGTM! pfnCI, test this please. |
|
Successfully created a job for commit c1cf00f: |
|
Jenkins CI test (for commit c1cf00f, target branch master) failed with status FAILURE. |
|
Test failures are caused by SciPy 1.5.0 and not related to this PR. |
|
Thanks for the support @anaruse 👍 |
|
@kmaehashi This pull-request is marked as |
|
@kmaehashi This pull-request is marked as |
3 similar comments
|
@kmaehashi This pull-request is marked as |
|
@kmaehashi This pull-request is marked as |
|
@kmaehashi This pull-request is marked as |
|
Is there anything I can help with this PR backport? |
|
@kmaehashi This pull-request is marked as |
Support CUDA 11.0
This PR is to support CUDA 11.0.