fix: cuda atomicMin and atomicMax for <double>s #3545
Related question: how is the CUDA backend tested? If there's CI I would add a test too, but I don't see any... yet
@ariostas has opened a PR for CUDA integration testing
The GPU CI is live now!
@ariostas - many thanks! I think, we need to add
Oh okay, I missed that part. I assumed that it only generated CPU kernels
@all-contributors please add @Moelf for code
I've put up a pull request to add @Moelf! 🎉
fix #3528
The root cause is that the trick for `float` does not work for `double` due to, eh, IEEE754 ... My understanding is you can't cast `double` into `long long` and still have a total order like you would have for `float -> int`. See also: https://stackoverflow.com/questions/55140908/can-anybody-help-me-with-atomicmin-function-syntax-for-cuda/55145948#55145948
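For context, one common way to avoid relying on the ordering of reinterpreted bit patterns is a compare-and-swap loop that compares the values as `double` and only uses the integer view to move bits. The sketch below illustrates that general pattern and is not necessarily what this PR implements. The names `atomic_min_double`, `min_kernel`, and `run_demo` are hypothetical and chosen for illustration only.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper (not taken from this PR): atomic minimum for double,
// built on the 64-bit atomicCAS that CUDA provides for unsigned long long.
__device__ double atomic_min_double(double* address, double val) {
    unsigned long long* addr = reinterpret_cast<unsigned long long*>(address);
    unsigned long long old = *addr;
    unsigned long long assumed;
    do {
        assumed = old;
        // Compare as doubles, so the result never depends on how the
        // bit patterns would order when viewed as integers.
        if (__longlong_as_double(assumed) <= val) {
            break;  // stored value is already small enough
        }
        old = atomicCAS(addr, assumed,
                        static_cast<unsigned long long>(__double_as_longlong(val)));
    } while (assumed != old);  // another thread changed the value: retry
    return __longlong_as_double(old);
}

// Each thread folds one input element into the shared minimum.
__global__ void min_kernel(const double* in, int n, double* out) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomic_min_double(out, in[i]);
    }
}

// Host-side driver with mixed-sign inputs; returns the reduced minimum.
double run_demo() {
    const int n = 4;
    double h_in[n] = {2.0, -3.5, 7.25, -1.0};
    double h_out = 1e300;  // large starting value for the minimum
    double* d_in = nullptr;
    double* d_out = nullptr;
    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_out, sizeof(double));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    cudaMemcpy(d_out, &h_out, sizeof(double), cudaMemcpyHostToDevice);
    min_kernel<<<1, 32>>>(d_in, n, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

An `atomicMax` variant would follow the same shape with the comparison flipped to `>=`. The loop trades some throughput under contention for correctness that does not depend on the sign of the operands.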