@longjon longjon commented Nov 5, 2015

See #3281, #418, and https://groups.google.com/forum/#!topic/caffe-users/4abF674UaYY/discussion.

The preprocessor check added by #62 is not correct. Although the code appears to check the CUDA architecture version, the __CUDA_ARCH__ macro is not defined in host code, so CAFFE_CUDA_NUM_THREADS always gets set to 512.
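To make the failure mode concrete, here is a minimal sketch of the pattern, compiled as ordinary host C++. The names `kSketchNumThreads`, `kArchMacroDefined`, and `sketch_num_threads` are hypothetical and only illustrate the shape of the check; they are not the identifiers from the Caffe source. The sketch relies on the standard preprocessor rule that an undefined identifier in an `#if` expression evaluates to 0.

```cpp
// Hypothetical sketch; identifiers are illustrative, not Caffe's own.
//
// __CUDA_ARCH__ is only defined while compiling device code. In host code
// it is undefined, and the preprocessor replaces an undefined identifier
// in an #if expression with 0, so this test reads as `0 >= 200`.
#if __CUDA_ARCH__ >= 200
constexpr int kSketchNumThreads = 1024;  // intended for newer architectures
#else
constexpr int kSketchNumThreads = 512;   // what host code always sees
#endif

// Records whether the macro is visible in this translation unit.
#ifdef __CUDA_ARCH__
constexpr bool kArchMacroDefined = true;
#else
constexpr bool kArchMacroDefined = false;
#endif

// Host-side accessor, standing in for code that picks a launch configuration.
inline int sketch_num_threads() { return kSketchNumThreads; }
```

Since kernel launch configurations are chosen in host code, the value that matters is the one the host pass sees, which here is always the `#else` branch.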

This patch simply removes the misleading dead code and keeps CAFFE_CUDA_NUM_THREADS at 512. We've been using this value for 21 months, and to my knowledge there's no a priori compelling reason why it ought to be maxed out, so I don't see any reason to add code right now to restore the intended behavior. That said, I'd welcome a future PR that does so if it makes a compelling performance difference, which I don't expect.

__CUDA_ARCH__ is not defined in host code; the #if was vacuous and
misleading.
shelhamer added a commit that referenced this pull request Dec 2, 2015
Remove dead preprocessor code for number of CUDA threads
@shelhamer shelhamer merged commit 71abb92 into BVLC:master Dec 2, 2015