Add Timer class unifying CPU and GPU timer and use it in net_speed_benchmark #136
|
@mavenlin, I observed no significant difference among the pure CPU timer, the CPU timer with cudaDeviceSynchronize, and the cudaEvent_t-based GPU timer. Are these the weird results you mentioned? |
|
Reopening to simplify future benchmarking work. |
|
Please rebase on the latest dev and we'll merge. Thanks. |
|
@shelhamer, it has been rebased and polished with the newly added cpplint. |
|
So sorry, but this needs another rebase because of a complicated merge. If it's any consolation, this has prepared the reconciliation of the MKL and non-MKL versions of Caffe and brought in support for DAGs, improved documentation, and a better organization of the project. We are adopting a new development strategy that will not have a constant need for rebasing. It will be documented shortly, but the bottom line is that we will no longer rewrite the history of |
|
I am glad that Caffe is approaching version 1.0. The workflow that I used to rebase on the most recent merge is as follows. Absolutely clean history. |
|
@shelhamer, this utility class is rebased and tested again. Please merge it for those who are interested in benchmarking run time. Thanks! |
|
@kloudkl Thanks! We're catching up on PRs now, so hope to merge lots of the new developments soon. |
Add Timer class unifying CPU and GPU timer and use it in net_speed_benchmark
This resolves the concern about timing CUDA code raised in the discussion of #128.
http://devblogs.nvidia.com/parallelforall/how-implement-performance-metrics-cuda-cc/
http://docs.nvidia.com/cuda/cuda-c-best-practices-guide/#performance-metrics
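For readers who want a concrete picture of what a unified CPU/GPU timer can look like, here is a minimal sketch. It is not the Timer class added by this PR: the class name `SketchTimer`, the `Mode` enum, and the `SKETCH_USE_CUDA` macro are all hypothetical names chosen for illustration. The GPU path follows the cudaEvent_t pattern described in the NVIDIA links above and is only compiled when the macro is defined; otherwise the sketch falls back to a host clock so it builds without CUDA.

```cpp
#include <cassert>
#include <chrono>

#ifdef SKETCH_USE_CUDA  // hypothetical macro: define it when building against CUDA
#include <cuda_runtime.h>
#endif

// Hypothetical sketch of a timer with one interface for host and device timing.
class SketchTimer {
 public:
  enum class Mode { kCPU, kGPU };

  explicit SketchTimer(Mode mode) : mode_(mode) {
#ifdef SKETCH_USE_CUDA
    if (mode_ == Mode::kGPU) {
      cudaEventCreate(&start_gpu_);
      cudaEventCreate(&stop_gpu_);
    }
#else
    mode_ = Mode::kCPU;  // built without CUDA: always use the host clock
#endif
  }

  ~SketchTimer() {
#ifdef SKETCH_USE_CUDA
    if (mode_ == Mode::kGPU) {
      cudaEventDestroy(start_gpu_);
      cudaEventDestroy(stop_gpu_);
    }
#endif
  }

  SketchTimer(const SketchTimer&) = delete;
  SketchTimer& operator=(const SketchTimer&) = delete;

  void Start() {
    running_ = true;
#ifdef SKETCH_USE_CUDA
    if (mode_ == Mode::kGPU) {
      cudaEventRecord(start_gpu_, 0);  // enqueue the start marker on stream 0
      return;
    }
#endif
    start_cpu_ = Clock::now();
  }

  void Stop() {
    assert(running_ && "Stop() called before Start()");
    running_ = false;
#ifdef SKETCH_USE_CUDA
    if (mode_ == Mode::kGPU) {
      cudaEventRecord(stop_gpu_, 0);
      cudaEventSynchronize(stop_gpu_);  // wait until the stop marker is reached
      float gpu_ms = 0.0f;
      cudaEventElapsedTime(&gpu_ms, start_gpu_, stop_gpu_);
      elapsed_ms_ = gpu_ms;
      return;
    }
#endif
    elapsed_ms_ = std::chrono::duration<double, std::milli>(
        Clock::now() - start_cpu_).count();
  }

  // Elapsed time of the last Start()/Stop() pair, in milliseconds.
  double MilliSeconds() const { return elapsed_ms_; }
  bool running() const { return running_; }
  Mode mode() const { return mode_; }

 private:
  using Clock = std::chrono::steady_clock;

  Mode mode_;
  bool running_ = false;
  double elapsed_ms_ = 0.0;
  Clock::time_point start_cpu_;
#ifdef SKETCH_USE_CUDA
  cudaEvent_t start_gpu_;
  cudaEvent_t stop_gpu_;
#endif
};
```

A caller would construct `SketchTimer timer(SketchTimer::Mode::kCPU);`, wrap the code under test in `timer.Start()` and `timer.Stop()`, and read `timer.MilliSeconds()`. Note that when a host clock is used to time GPU work, kernel launches are asynchronous, so a `cudaDeviceSynchronize()` is needed before `Stop()`; the event-based path handles this with `cudaEventSynchronize`.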