Skip to content

Support CUDA 11.0#3405

Merged
kmaehashi merged 20 commits intocupy:masterfrom
anaruse:cuda-11.0
Jun 22, 2020
Merged

Support CUDA 11.0#3405
kmaehashi merged 20 commits intocupy:masterfrom
anaruse:cuda-11.0

Conversation

@anaruse
Copy link
Copy Markdown
Contributor

@anaruse anaruse commented Jun 8, 2020

This PR is to support CUDA 11.0.

Comment on lines +359 to +364
if module['name'] == 'cub':
# for <cupy/complex.cuh>
cupy_header = os.path.join(
os.path.dirname(os.path.realpath(__file__)),
'cupy/core/include')
settings['include_dirs'].append(cupy_header)
Copy link
Copy Markdown
Member

@leofang leofang Jun 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @anaruse, is this needed? I thought the CuPy header path is already added in get_compiler_setting() (a few lines above before preconfigure_modules() is called).

By the way, I intend to support CUDA 11 CUB in #2584. Could you comment on my approach there (see my changes in install/build.py)? Thanks!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry I couldn't mention this until CUDA11 was officially released, but starting with CUDA11, CUB is now included in CUDA toolkit.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anaruse I think it's semi-public info now...I already knew this from the release announcements and PRs in thrust/cub 😇 So, from the implementation in thrust/thrust, I inferred that the CUB headers will be available in /usr/local/cuda/include/cub/, and made the changes accordingly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With CUDA 11.0, you don't need to set the CUB_PATH, so I modified preconfigure_modules() to also add the CuPy header path here regardless of the CUB_PATH setting. But I agree that adding CuPy header path here is not appropriate, so I'll make sure the include path is set correctly in get_compiler_setting() for CUDA 11.0 as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I did not know that you knew this already 😓

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's open now:

cub (High performance primitives for CUDA)

https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
It's just very odd that this is the only mention of CUB throughout the whole documentation; no other references or standalone sections (ex: Thrust has its own section). I wouldn't know what it is if I had not knew it already...

@kmaehashi kmaehashi mentioned this pull request Jun 8, 2020
27 tasks
@kmaehashi
Copy link
Copy Markdown
Member

  • Could you add FP16 headers under cupy/core/include/cupy/_cuda/ and update cupy/core/core.pyx (compile_with_cache)?
  • Could you add compute_80 support in cupy/cuda/compiler.py and cupy_setup_build.py?
  • I guess it's better to cache check_availability results as it's call is now frequent. So far I'm unsure if how much it affects performance, so I think it's ok to discuss this in separate issue orPR.

@anaruse
Copy link
Copy Markdown
Contributor Author

anaruse commented Jun 8, 2020

Thanks for the feedback, @leofang and @kmaehashi . I think I've incorporated all the feedback into the source code, so could you check it again?

Copy link
Copy Markdown
Member

@leofang leofang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left two notes on the CUB thing.

Could you also update this function for NVRTC support, please? 🙏

@util.memoize(for_each_device=True)
def _get_arch():
# See Supported Compile Options section of NVRTC User Guide for
# the maximum value allowed for `--gpu-architecture`.
major, minor = _get_nvrtc_version()
if major < 9:
# CUDA 8.0
_nvrtc_max_compute_capability = '52'
elif major < 10 or (major == 10 and minor == 0):
# CUDA 9.x / 10.0
_nvrtc_max_compute_capability = '70'
else:
# CUDA 10.1 / 10.2
_nvrtc_max_compute_capability = '75'
arch = device.Device().compute_capability
if arch in _tegra_archs:
return arch
else:
return min(arch, _nvrtc_max_compute_capability)

I think you'll also need to touch tests/cupy_tests/cuda_tests/test_compiler.py due to this change.

'../cupy/core/include')
cub_path = None
if cuda_path:
cub_path = os.path.join(cuda_path, 'include', 'cub')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanna confirm: this line means util_namespace.cuh will live under /usr/local/cuda/cub/cub/ instead of /usr/local/cuda/cub/. Is this correct?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct.

    [ CUDA 11.0 ]
    $ ls /usr/local/cuda/include/cub/util_namespace.cuh 
    /usr/local/cuda/include/cub/util_namespace.cuh

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

install/build.py Outdated
cub_path = os.environ.get('CUB_PATH', '')
if os.path.exists(cub_path):
include_dirs.append(cupy_header)
include_dirs.append(cub_path)
Copy link
Copy Markdown
Member

@leofang leofang Jun 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving a note: This search precedence is 1. CUDA-bundled CUB (for CUDA 11+), 2 user-provided CUB. This is different from what's done in my CUB-bundle PR (#2584). If this PR is merged first, we can move the discussion there.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing that out. I hadn't considered the case of using a user-specified CUB after CUDA 11, but it might be better to allow user-specified CUB to be used in CUDA 11 or later. I will change the search precedence.

@anaruse
Copy link
Copy Markdown
Contributor Author

anaruse commented Jun 9, 2020

It is failing to build in the Travis CI, but id doesn't seem to be related to my fix... Could you check it?

@leofang
Copy link
Copy Markdown
Member

leofang commented Jun 9, 2020

I think something might go wrong with Sphinx, so Travis keeps complaining today. Possibly the latest release is backward incompatible?

Copy link
Copy Markdown
Member

@leofang leofang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in compiler.py LGTM. Just found one bug in the CUB search path.

install/build.py Outdated
Comment on lines 172 to 173
include_dirs.append(cupy_header)
include_dirs.append(cub_path)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part will need to be changed (do prepend/insert instead of append), because cub_path would otherwise be searched after cuda_path, so for CUDA 11+ we'd still end up with the built-in CUB.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's exactly right. I will fix it. Thanks!

'../cupy/core/include')
cub_path = None
if cuda_path:
cub_path = os.path.join(cuda_path, 'include', 'cub')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@kmaehashi kmaehashi added this to the v8.0.0b4 milestone Jun 11, 2020
@leofang
Copy link
Copy Markdown
Member

leofang commented Jun 15, 2020

Hi @anaruse, I realized that the changes I made in #3430 need attention for CUDA 11 support. Would you like me to help you here?

@anaruse
Copy link
Copy Markdown
Contributor Author

anaruse commented Jun 16, 2020

Thank you for letting me know that, @leofang.
I think I resolved the conflicts to the updated master branch and also added csc2csrEx2() to be consistent with the change in #3430.
Could you check if this is OK?

@kmaehashi
Copy link
Copy Markdown
Member

LGTM!
I tested the build locally with CUDA 11.0 RC.

pfnCI, test this please.

@pfn-ci-bot
Copy link
Copy Markdown
Collaborator

Successfully created a job for commit c1cf00f:

@chainer-ci
Copy link
Copy Markdown
Member

Jenkins CI test (for commit c1cf00f, target branch master) failed with status FAILURE.

@kmaehashi
Copy link
Copy Markdown
Member

Test failures are caused by SciPy 1.5.0 and not related to this PR.

@kmaehashi kmaehashi merged commit e9b193a into cupy:master Jun 22, 2020
@kmaehashi kmaehashi added cat:feature New features/APIs to-be-backported Pull-requests to be backported to stable branch labels Jun 22, 2020
@leofang
Copy link
Copy Markdown
Member

leofang commented Jun 22, 2020

Thanks for the support @anaruse 👍

@chainer-ci
Copy link
Copy Markdown
Member

@kmaehashi This pull-request is marked as to-be-backported, but the corresponding backport PR could not be found. Could you check?

@chainer-ci
Copy link
Copy Markdown
Member

@kmaehashi This pull-request is marked as to-be-backported, but the corresponding backport PR could not be found. Could you check?

3 similar comments
@chainer-ci
Copy link
Copy Markdown
Member

@kmaehashi This pull-request is marked as to-be-backported, but the corresponding backport PR could not be found. Could you check?

@chainer-ci
Copy link
Copy Markdown
Member

@kmaehashi This pull-request is marked as to-be-backported, but the corresponding backport PR could not be found. Could you check?

@chainer-ci
Copy link
Copy Markdown
Member

@kmaehashi This pull-request is marked as to-be-backported, but the corresponding backport PR could not be found. Could you check?

@anaruse
Copy link
Copy Markdown
Contributor Author

anaruse commented Jul 29, 2020

Is there anything I can help with this PR backport?
I'm not familiar with the backport procedure in CuPy, but can I squash this PR commits, merge it into the v7 branch, and then resolve the conflicts?

@chainer-ci
Copy link
Copy Markdown
Member

@kmaehashi This pull-request is marked as to-be-backported, but the corresponding backport PR could not be found. Could you check?

kmaehashi added a commit to kmaehashi/cupy that referenced this pull request Aug 4, 2020
kmaehashi added a commit that referenced this pull request Aug 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cat:feature New features/APIs to-be-backported Pull-requests to be backported to stable branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants