Conversation
| return a.argmax() | ||
|
|
||
|
|
||
| # This class compares CUB results against CuPy's original implementation |
There was a problem hiding this comment.
Currently we don't have a performance benchmark in our test code base.
How about adding it in examples/ directory like as done in cuTENSOR?
There was a problem hiding this comment.
Right, I noticed that too. This part was meant to detect performance regression. CUB is generally faster, but there could be exceptions. Do you think we don’t need to worry about it for now?
There was a problem hiding this comment.
Let me clarify, what do you mean by "performance regression"? Do you want to find which parameter combinations are slower when CUB is enabled (ie, find "exceptions") ? or you want to detect regression between versions/commits?
There was a problem hiding this comment.
This class has the former (finding exceptions) in mind. It compares with the old reduction kernels.
I don’t think the latter is possible without having some persistent database recording the performance data. It’s not easy to set this up...
There was a problem hiding this comment.
I removed the performance tests when refactoring the tests. Let me know if you want me to bring them back in an alternative form. Presumably it could also be another PR.
This comment has been minimized.
This comment has been minimized.
|
@grlee77 I wonder if you are able to reproduce this error? |
This comment has been minimized.
This comment has been minimized.
|
I am currently on a branch where I had merged your bug fixes and am not seeing the error. I can try on current master, though and see if it occurs there. |
|
Let me also try a fresh build. Perhaps I didn't do it right... |
This comment has been minimized.
This comment has been minimized.
|
I was unable to reproduce it |
|
Thanks @grlee77. I am 99% sure this is a false alert. Sorry for the noise. Had a bad day here... |
This comment has been minimized.
This comment has been minimized.
|
@kmaehashi PTAL when you start reviewing #2584. Thanks. |
|
@kmaehashi To pass the CI for this you need to have CUB installed (#2584). I can confirm locally this PR passes with the master branch. |
|
Sorry for the review delay in this PR 🙇 |
|
I think it's better to merge this PR first, then merge #2584 after verifying CUDA 11.0 support. |
|
Thanks for revisiting this, @kmaehashi!
Agreed. It makes sense (not sure what you meant on CUDA 11).
I have been testing CUB module extensively recently and didn't see any error. I just rebuilt locally (against the master) and applied this patch, and the three touched tests still passed. Could you post a snippet of error logs so that I can look into which test fails (and how it fails), please? |
Ping @kmaehashi |
|
Sorry, I retested and confirmed all tests pass. It seems my build was broken. |
|
Successfully created a job for commit 85900bc: |
|
Jenkins CI test (for commit 85900bc, target branch master) succeeded! |
style in line with the existing tests and cupy#2598
|
LGTM! Thank you for your great work! |
|
Thanks, @kmaehashi! By the way, I just added CUB spmv tests in #3428. |
I think it's better to have a separate PR for CUB tests.
This PR will hopefully help #2579 once the CI is set up to test CUB. Preferably this PR should get reviewed after #2584 and #2596 are merged, but it is not blocked by those two PRs for two reasons:
cupy.cuda.cubis not builtcupy.argmin()andcupy.argmax()would be tested instead, if the CUB support from Add CUB support forargmax()andargmin()#2596 is not ready.Note: using the
TestCUBperformanceclass here I found that CUB may be slower for a few cases (such as small workloads), so I think the performance warning/documentation discussed with @emcastillo in #2562 may still be needed.