Use explicit templates in gpu_kernel_with_scalars (pytorch#40992)
Summary:
This trick should have no effect on performance, but it reduces the object size of kernels that use the template by about 10%.
For example, BinaryMulDivKernel.cu.o compiled with the CUDA-10.1 toolchain for sm_75 was 4.2 MB before the change and 3.8 MB after.
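The commit message does not include the diff, so the following is only a rough host-side C++ sketch of what "explicit templates" could mean here: replacing a capturing lambda, which binds a scalar to one argument of a binary functor, with a named functor template. All names (apply_unary, BoundFirstArg, with_scalar_lambda, with_scalar_functor, Mul) are hypothetical and are not taken from the PyTorch sources. The sketch only shows that the two forms compute the same result and makes no claim about why the object file shrinks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a unary elementwise launcher.
// Assumption: `out` is preallocated to the same size as `in`.
template <typename func_t>
void apply_unary(const std::vector<float>& in, std::vector<float>& out, func_t f) {
  assert(in.size() == out.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = f(in[i]);
  }
}

// Explicit functor template: binds scalar `a` as the first argument of a
// binary functor, yielding a unary functor with a named type.
template <typename func_t, typename scalar_t>
struct BoundFirstArg {
  BoundFirstArg(func_t f, scalar_t a) : f_(f), a_(a) {}
  scalar_t operator()(scalar_t b) const { return f_(a_, b); }

 private:
  func_t f_;
  scalar_t a_;
};

// Lambda-based form: the closure type is unnamed and compiler-generated.
template <typename func_t>
void with_scalar_lambda(float a, const std::vector<float>& in,
                        std::vector<float>& out, func_t f) {
  auto af = [=](float b) { return f(a, b); };
  apply_unary(in, out, af);
}

// Explicit-template form: same behavior, but the unary functor has a named type.
template <typename func_t>
void with_scalar_functor(float a, const std::vector<float>& in,
                         std::vector<float>& out, func_t f) {
  apply_unary(in, out, BoundFirstArg<func_t, float>(f, a));
}

// Example binary functor used to exercise both forms.
struct Mul {
  float operator()(float a, float b) const { return a * b; }
};
```

Calling with_scalar_lambda(2.0f, in, out, Mul{}) and with_scalar_functor(2.0f, in, out, Mul{}) on the same input fills out with identical values; the only difference is the type passed down to the launcher.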
Pull Request resolved: pytorch#40992
Differential Revision: D22398733
Pulled By: malfet
fbshipit-source-id: 6576f4da00dc5fc2575b2313577f52c6571d5e6f