improper use of __CUDA_ARCH__ in host code (in dev)?

in dev, in common.hpp, there are tests on __CUDA_ARCH__.

as per:
http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#virtual-architecture-identification-macro

it seems host code should not rely on this macro. while it all seems a bit convoluted to me, i don't think this is a pedantic point: the host code is compiled once, with __CUDA_ARCH__ undefined (or at least not guaranteed to have any particular value), while the device code is compiled (potentially) multiple times with different values of __CUDA_ARCH__. mind you, i'm a bit fuzzy on the exact way the source code is split between host and device ...

AFAIK, at run time, depending on the GPU's compute capability, one of these per-architecture versions of each device function will actually be used for kernel launches -- maybe multiple ones for the same process when multiple GPUs are in play. so it's not really possible for the host code to have any single correct _or_ static value of the arch.

for caffe, the net result seems to be that the block size is effectively chosen as a static 512, as opposed to 1024 as seems to be the intent (not that the intent is a good idea, mind you). this happens because the code that uses the value (indirectly) is host code (i.e. in a host function), despite being in a .cu file.
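to make the mechanism concrete, here is a minimal sketch of what i believe happens in the host pass. the shape of the `#if` and the names `kBlockSize` / `kArchDefined` are my assumptions for illustration, not the actual contents of common.hpp. an undefined macro evaluates to 0 inside `#if`, so a host-only compile silently takes the `#else` branch:

```cpp
#include <cassert>

// Assumed shape of the arch test; kBlockSize is a hypothetical name.
// In the host pass __CUDA_ARCH__ is undefined, and an undefined
// identifier evaluates to 0 in #if, so the #else branch is chosen.
#if __CUDA_ARCH__ >= 200
const int kBlockSize = 1024;
#else
const int kBlockSize = 512;
#endif

// Records whether this compilation pass saw the macro at all.
#ifdef __CUDA_ARCH__
const bool kArchDefined = true;
#else
const bool kArchDefined = false;
#endif
```

any host function that reads `kBlockSize` to set up a kernel launch would then see 512, regardless of which GPU is installed.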

i'd assume that if one wanted to make the block size depend on the arch, one would need to use cudaGetDeviceProperties() and case-split at least per cuda device (there are various ways to do something like this).
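a rough sketch of the per-device approach, under stated assumptions: `DeviceInfo`, `block_size_for` and `query_device` are hypothetical names, and the "1024 for compute capability 2.x and up, else 512" rule is only my reading of the apparent intent. the struct stands in for the two `cudaDeviceProp` fields used (`major` and `maxThreadsPerBlock`) so the selection logic builds without the CUDA toolkit; the real runtime query is only compiled under nvcc.

```cpp
#include <cassert>

// Hypothetical stand-in for the two cudaDeviceProp fields used here.
struct DeviceInfo {
  int cc_major;               // would come from cudaDeviceProp::major
  int max_threads_per_block;  // would come from cudaDeviceProp::maxThreadsPerBlock
};

// Hypothetical helper: choose a block size for one device at run time,
// never exceeding what that device reports as its per-block limit.
inline int block_size_for(const DeviceInfo& dev) {
  const int wanted = (dev.cc_major >= 2) ? 1024 : 512;  // assumed policy
  return wanted <= dev.max_threads_per_block ? wanted
                                             : dev.max_threads_per_block;
}

#ifdef __CUDACC__
#include <cuda_runtime.h>
// Only built under nvcc: fill DeviceInfo from the runtime query.
inline bool query_device(int device, DeviceInfo* out) {
  cudaDeviceProp prop;
  if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return false;
  out->cc_major = prop.major;
  out->max_threads_per_block = prop.maxThreadsPerBlock;
  return true;
}
#endif
```

with multiple GPUs, the host would call something like this once per device and keep the result alongside that device's other state, rather than baking a single constant into a header.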

sticking this cpp code prior to the #if that uses __CUDA_ARCH__ in common.hpp was illustrative for me. note that the warnings get printed many times per compilation of a single .cu file:

```
#ifndef __CUDA_ARCH__
#warning( "CA undef" )
#else
#warning( "CA def" )
#endif
```

