Support compilation via nvcc in RawKernel #1928
Closed
Description
Currently, RawKernel utilises NVRTC to compile device code into cubins. This is great for code that only contains device code.
There are useful C++ CUDA libraries such as cub and trove that contain device-, block- and warp-wide primitives (scan, transpose, reduce etc.) that are useful for CUDA programmers in general.
Unfortunately it is not currently possible to use these libraries with NVRTC, as it does not automatically include system headers. For example, see:
- https://devtalk.nvidia.com/default/topic/1028233/include-header-in-nvrtc/
- https://stackoverflow.com/questions/50565200/including-c-standard-headers-in-cuda-nvrtc-code
There's an issue requesting NVRTC compatibility from cub (https://github.com/NVlabs/cub/issues/131) due to these problems, but in the meantime it would be useful to work around this by allowing RawKernel to compile code via nvcc (perhaps using the functionality in install/*.py).
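To make the request concrete, here is a rough sketch of what an nvcc-based compile step might look like, using only the Python standard library. The helper names `build_nvcc_command` and `compile_with_nvcc`, the default `sm_60` architecture and the `-std=c++11` flag are all assumptions for illustration; none of this is existing CuPy API, and it does not reuse anything from install/*.py.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def build_nvcc_command(src_path, out_path, arch="sm_60", include_dirs=()):
    """Return the nvcc argument list that compiles one .cu file to a cubin.

    Hypothetical helper: the arch default and C++ standard are assumptions.
    """
    cmd = ["nvcc", "--cubin", f"-arch={arch}", "-std=c++11"]
    # Extra include paths, e.g. a checkout of cub or trove.
    for inc in include_dirs:
        cmd.append(f"-I{inc}")
    cmd += ["-o", str(out_path), str(src_path)]
    return cmd


def compile_with_nvcc(source, arch="sm_60", include_dirs=()):
    """Compile CUDA source text with nvcc and return the cubin as bytes.

    Unlike NVRTC, nvcc resolves system headers through the host toolchain.
    """
    if shutil.which("nvcc") is None:
        raise RuntimeError("nvcc not found on PATH")
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "kernel.cu"
        out = Path(tmp) / "kernel.cubin"
        src.write_text(source)
        # check=True raises CalledProcessError if compilation fails.
        subprocess.run(
            build_nvcc_command(src, out, arch, include_dirs), check=True
        )
        return out.read_bytes()
```

Kernels compiled this way would need to be declared `extern "C"` so that they can be looked up by their unmangled name when the cubin is loaded. How the resulting cubin would be handed to RawKernel is left open here.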