Skip to content

Merge CUPY_CUB_BLOCK_REDUCTION_DISABLED and CUB_DISABLED#3461

Merged
kmaehashi merged 25 commits intocupy:masterfrom
asi1024:core-backends
Jul 9, 2020
Merged

Merge CUPY_CUB_BLOCK_REDUCTION_DISABLED and CUB_DISABLED#3461
kmaehashi merged 25 commits intocupy:masterfrom
asi1024:core-backends

Conversation

@asi1024
Copy link
Copy Markdown
Member

@asi1024 asi1024 commented Jun 19, 2020

Part of #3445.

  • Merges CUPY_CUB_BLOCK_REDUCTION_DISABLED and CUB_DISABLED into CUPY_CUB_ENABLED.
  • Supports interfaces to switch to use each backend, or define their priorities.
    • Environment variable: CUPY_BACKENDS.
    • cupy.core.set_reduction_backends and cupy.core.routine_backends (These functions are for debug use. We will leave them undocumented.)

@leofang leofang mentioned this pull request Jun 20, 2020
3 tasks
@kmaehashi kmaehashi self-assigned this Jun 22, 2020
@kmaehashi kmaehashi added this to the v8.0.0b4 milestone Jun 22, 2020
@asi1024 asi1024 marked this pull request as ready for review June 24, 2020 04:46
@asi1024
Copy link
Copy Markdown
Member Author

asi1024 commented Jun 24, 2020

PTAL.

@asi1024 asi1024 modified the milestones: v8.0.0b4, v8.0.0b5 Jun 24, 2020
Copy link
Copy Markdown
Member

@leofang leofang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again, @asi1024, for working on this! I've done a quick pass and left some comments. The major change needed here is to handle scan properly. I think we can use CUB for both faster indexing (which I intended to do but overlooked for some reason!) and cumsum/cumprod, but the current implementation introduces a regression for the latter.

Comment on lines 40 to 44
cub_enabled = False
if int(os.getenv('CUB_DISABLED', 0)) == 0:
if int(os.getenv('CUPY_CUB_ENABLED', 0)) == 1:
try:
from cupy.cuda import cub # NOQA
cub_enabled = True
Copy link
Copy Markdown
Member

@leofang leofang Jun 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this entire block (including the two lines below) should be changed to support CUPY_BACKENDS? At least cupy.cuda.cub_enabled should be removed as per #3445 (comment), but you'll need to fix every place that it appears in, such as the tests added in #2598 and #3428.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in ccd71b0.

Comment on lines +399 to +408
for backend in _backend._routine_backends:
if backend == _backend.BACKEND_CUB:
# result will be None if the scan is not compatible with CUB
if op == scan_op.SCAN_SUM:
cub_op = cub.CUPY_CUB_CUMSUM
else:
cub_op = cub.CUPY_CUB_CUMPROD
res = cub.cub_scan(out, cub_op)
if res is not None:
return out
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably need this here (for fast indexing) as well as in scan_core() (where this came from), because cupy/math/sumprod.py calls the latter directly. Moreover, this early appearance does not take care:

  1. copy input: scan is an in-place operation (whether using CUB or not), so the input a must be copied to out.
  2. ravel input: CUB scan needs a 1D contiguous input.
  3. axis: CUB only handles non-batched scans (axis is None).

So at least for cupy.cumsum() and cupy.cumprod() this won't work.

Copy link
Copy Markdown
Member Author

@asi1024 asi1024 Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix in 6d12fdc. Thanks!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

Comment on lines +502 to +516
if cupy.cuda.cub_enabled:
# result will be None if the scan is not compatible with CUB
if op == scan_op.SCAN_SUM:
cub_op = cub.CUPY_CUB_CUMSUM
else:
cub_op = cub.CUPY_CUB_CUMPROD
res = cub.cub_scan(result, cub_op)
if not cupy.cuda.cub_enabled or res is None:
scan(result, op, dtype, result)
scan(result, op, dtype, result)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment above for CUB scan.

asi1024 and others added 3 commits June 25, 2020 05:48
@emcastillo emcastillo modified the milestone: v8.0.0b5 Jun 25, 2020
@asi1024
Copy link
Copy Markdown
Member Author

asi1024 commented Jun 25, 2020

PTAL.

Copy link
Copy Markdown
Member

@leofang leofang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for a few small nitpicks. I think you can apply all suggestions at once by switching to the "Files changed" tab and adding them to the batch. This way we won't generate excessive commits.

@unittest.skipUnless(
cupy.core._backend.get_routine_backends() == ['cub'],
'The CUB routine is not enabled')
class TestFcontiguousReduction(unittest.TestCase):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why the class name was changed? Note both C- & F- contiguity get tested.

Suggested change
class TestFcontiguousReduction(unittest.TestCase):
class TestCUBreduction(unittest.TestCase):

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I thought the skip condition is not needed and renamed before, and I forgot to undo after this tests don't pass without skip condition.

@unittest.skipIf(cupy.cuda.cub_enabled is False, 'The CUB module is not built')
class TestCUBreduction(unittest.TestCase):
@unittest.skipUnless(
cupy.core._backend.get_routine_backends() == ['cub'],
Copy link
Copy Markdown
Member

@leofang leofang Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as long as CUB is set enabled (or cupy.cuda.cub is built), it should be fine that other backends exist?

Suggested change
cupy.core._backend.get_routine_backends() == ['cub'],
'cub' in cupy.core._backend.get_routine_backends(),

Copy link
Copy Markdown
Member Author

@asi1024 asi1024 Jun 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, another backend may be chosen. However, cupy.core._backend.get_routine_backends()[:1] == ['cub'] may be better because the get_routine_backends indicates the priority of backends.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. That sounds good. Either [0] or [:1] is fine.

Copy link
Copy Markdown
Member

@leofang leofang Jun 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @asi1024 I thought about this again, and I think the rationale here was that if cupy.cuda.cub is built it should get tested (regardless how backends were set). If one of the CIs would be set up with CUPY_BACKENDS="cub," then this is OK; otherwise, once #2584 is up, we'll need to ensure these tests are run (by removing skipUnless() and using setUp() and tearDown() to temporarily overwrite the backends).

Let me know if you'd configure the CI or I should address the skips in #2584, thanks!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree to set backends in setUp and tearDown. Fixed in 43b2da6.

@testing.gpu
@unittest.skipIf(cupy.cuda.cub_enabled is False, 'The CUB module is not built')
@unittest.skipUnless(
cupy.core._backend.get_routine_backends() == ['cub'],
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cupy.core._backend.get_routine_backends() == ['cub'],
'cub' in cupy.core._backend.get_routine_backends(),

@testing.gpu
@unittest.skipIf(cupy.cuda.cub_enabled is False, 'The CUB module is not built')
@unittest.skipUnless(
cupy.core._backend.get_routine_backends() == ['cub'],
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cupy.core._backend.get_routine_backends() == ['cub'],
'cub' in cupy.core._backend.get_routine_backends(),

@testing.gpu
@unittest.skipIf(cupy.cuda.cub_enabled is False, 'The CUB module is not built')
@unittest.skipUnless(
cupy.core._backend.get_routine_backends() == ['cub'],
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cupy.core._backend.get_routine_backends() == ['cub'],
'cub' in cupy.core._backend.get_routine_backends(),

Comment on lines -40 to -46
cub_enabled = False
if int(os.getenv('CUB_DISABLED', 0)) == 0:
try:
from cupy.cuda import cub # NOQA
cub_enabled = True
except ImportError:
pass
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I thought this change would disable from cupy.cuda import cub but it seems still working. Interesting...

res = cub.cub_scan(out, cub_op)
if res is not None:
break
else:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GJ on using for-else 👍

if cupy.cuda.cub_enabled:

try:
from cupy.cuda.cub import device_csrmv
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this violates the dev guideline...Let's fix it properly:

Suggested change
from cupy.cuda.cub import device_csrmv
from cupy.cuda import cub

try:
from cupy.cuda.cub import device_csrmv
except ImportError:
pass
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pass
cub = None

for backend in _backend.get_routine_backends():
if (backend == _backend.BACKEND_CUB
and other.flags.c_contiguous):
return device_csrmv(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return device_csrmv(
return cub.device_csrmv(

@asi1024
Copy link
Copy Markdown
Member Author

asi1024 commented Jul 7, 2020

Jenkins, test this please.

@pfn-ci-bot
Copy link
Copy Markdown
Collaborator

Successfully created a job for commit a4ab70e:

@asi1024
Copy link
Copy Markdown
Member Author

asi1024 commented Jul 7, 2020

Jenkins, test this please.

@pfn-ci-bot
Copy link
Copy Markdown
Collaborator

Successfully created a job for commit a4ab70e:

@chainer-ci
Copy link
Copy Markdown
Member

Jenkins CI test (for commit a4ab70e, target branch master) failed with status FAILURE.

@asi1024
Copy link
Copy Markdown
Member Author

asi1024 commented Jul 7, 2020

I should trigger after chainer/chainer-test#586 is merged. I canceled the Jenkins CI.

@asi1024
Copy link
Copy Markdown
Member Author

asi1024 commented Jul 8, 2020

Jenkins, test this please.

@pfn-ci-bot
Copy link
Copy Markdown
Collaborator

Successfully created a job for commit a4ab70e:

@chainer-ci
Copy link
Copy Markdown
Member

Jenkins CI test (for commit a4ab70e, target branch master) failed with status FAILURE.

Copy link
Copy Markdown
Member

@leofang leofang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops! This line in test_csr.py:

func = 'cupyx.scipy.sparse.csr.device_csrmv'

needs to be changed to

func = 'cupyx.scipy.sparse.csr.cub.device_csrmv'

@asi1024
Copy link
Copy Markdown
Member Author

asi1024 commented Jul 8, 2020

Jenkins, test this please.

@pfn-ci-bot
Copy link
Copy Markdown
Collaborator

Successfully created a job for commit 41f4e45:

@chainer-ci
Copy link
Copy Markdown
Member

Jenkins CI test (for commit 41f4e45, target branch master) succeeded!

@leofang
Copy link
Copy Markdown
Member

leofang commented Jul 8, 2020

@asi1024 Do you know why some CUB tests (I think) were skipped in cupy-py35? For example, search test_sumprod.py here:
https://jenkins.preferred.jp/job/chainer/job/cupy_pr/1493/TEST=cupy-py35,label=mn1-p100/console
Other CIs do not have this issue.

@asi1024
Copy link
Copy Markdown
Member Author

asi1024 commented Jul 8, 2020

@leofang cupy-py35 uses NumPy 1.15.4, so 4 tests with @testing.with_requires('numpy>=1.16') in test_sumprod.py seems to have been skipped.

@asi1024
Copy link
Copy Markdown
Member Author

asi1024 commented Jul 9, 2020

Jenkins, test this please.

@pfn-ci-bot
Copy link
Copy Markdown
Collaborator

Successfully created a job for commit 8a8feea:

@chainer-ci
Copy link
Copy Markdown
Member

Jenkins CI test (for commit 8a8feea, target branch master) succeeded!

@kmaehashi kmaehashi added the cat:enhancement Improvements to existing features label Jul 9, 2020
@kmaehashi kmaehashi merged commit 96fc0e8 into cupy:master Jul 9, 2020
@kmaehashi
Copy link
Copy Markdown
Member

LGTM!

@asi1024 asi1024 deleted the core-backends branch July 9, 2020 16:26
@asi1024 asi1024 added the no-compat Disrupts backward compatibility label Jul 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cat:enhancement Improvements to existing features no-compat Disrupts backward compatibility

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants