-
Notifications
You must be signed in to change notification settings - Fork 18.6k
Performance related update of im2col() and col2im() functions #3536
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
What is the performance impact of this on a model like AlexNet or VGG? I see ~10% speedup for the forward pass on VGG-16 after patching this in, is that in line with your expectations? |
|
Hi, I made only some general improvements of both mentioned functions on master. The changes affected Average Forward-Backward time: using CPU only, on platform with Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz, compiled with MKL, I got x1.189 speedup for AlexNet, x1.077 for GoogleNet and x1.217 for CaffeNet; while compiled with OpenBLAS (single threaded): x1.095 for AlexNet and x1.056 for GoogleNet. |
|
Here are some results from this layerwise benchmark with the new PR: https://github.com/soumith/convnet-benchmarks#layer-wise-benchmarking-last-updated-april-2015 Configuration is as previous: CPU_ONLY set to 1, with MKL and platform Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz (36 threads each on separate core). Results captured before and after the modification are attached: Below are listed Average Forward-Backward times [ms] before and after CPU optimizations and improvement ratio: run_imagenet.sh run_forcegradinput.sh run_nogradinput.sh |
src/caffe/util/im2col.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of these arguments can remain const, right?
|
Thanks @intelmm, this looks like a nice boost. I've made a few comments as noted. Two points I'd like to clear up before merge:
|
|
Thank you @longjon for review and comments! I'll commit updated version very soon. Regarding your questions, here are some hints in general:
Why is this code faster?
What could we do next?
|
|
Thanks for the explanation, that makes this clearer (and might be a useful reference for future optimization). The An improved |
|
Dear @longjon, all the changes have been committed according to your comments. |
src/caffe/util/im2col.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be a two-space indent per Google style guide (not sure why lint doesn't catch it).
|
The latest version seems understandable and the diff is much cleaner, thanks! I've commented on one space error, and then please squash into one commit before merge. |
|
Dear @longjon thank you for review. The commit is ready. |
Performance related update of im2col() and col2im() functions
|
Clean diff, clean history, clear code... merging! Thanks, and looking forward to future CPU optimizations! |
Improvement of im2col() and col2im() functions on CPU.