Python3 vs C vs JavaScript performance comparison
This article was inspired by a presentation given by Andrew Ng, a famous machine learning champion and educator. In the video he explains how using the numpy library can dramatically speed up Python3 code. I decided to give it a try on my Mac. I created a similar test file in Python3, which is available on my GitHub page. There are a couple of things to note about this code:
- On my machine the Python3 interpreter is located at /usr/local/bin/python3, so the first line of vecvsfor.py is the shebang #!/usr/local/bin/python3. If you want to try this on your machine, the path may be different.
- Because multiplying two vectors with numpy takes so little time, the measured time varies significantly from run to run. To make it stable, I added a loop that runs the vectorized version 1000 times and then averages the time per run.
So here is the result on my MacBook Pro i7 notebook:
Very impressive indeed. The vectorized version runs almost three times faster on my MacBook Pro than in the Jupyter notebook from the presentation.
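For readers who do not have vecvsfor.py at hand, here is a minimal sketch of this kind of measurement. It is not the original file. The vector length of 1,000,000, the helper names dot_loop and average_ms, and the use of time.perf_counter are assumptions made for illustration. Only the idea of averaging 1000 vectorized runs comes from the description above.

```python
import time

import numpy as np

N = 1_000_000            # vector length (an assumption, not taken from vecvsfor.py)
VECTORIZED_RUNS = 1000   # repetitions to average, as described in the article


def dot_loop(a, b):
    """Multiply two vectors element by element in an explicit for loop."""
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total


def average_ms(func, runs):
    """Call func() `runs` times and return the mean wall-clock time per call in ms."""
    start = time.perf_counter()
    for _ in range(runs):
        func()
    return (time.perf_counter() - start) * 1000.0 / runs


def main():
    rng = np.random.default_rng(0)
    a = rng.random(N)
    b = rng.random(N)

    # The loop is slow enough that a single run gives a usable number.
    loop_ms = average_ms(lambda: dot_loop(a, b), runs=1)
    # The vectorized call is so fast that it is averaged over many runs.
    vec_ms = average_ms(lambda: np.dot(a, b), runs=VECTORIZED_RUNS)

    print(f"for loop:   {loop_ms:.3f} ms")
    print(f"vectorized: {vec_ms:.3f} ms (mean of {VECTORIZED_RUNS} runs)")


if __name__ == "__main__":
    main()
```

The absolute numbers depend on the machine and on the numpy build, so treat the output as a relative comparison only.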
Now let’s check how this compares to C. I wrote a similar program that creates two arrays and multiplies them in a for loop. The code is also available on my GitHub page. Here is the result:
Now wait a second. The same multiplication in a for loop that takes 420 ms in Python3 takes about as much time in the optimized C version as the numpy multiplication does. I can turn this into a C++ program by renaming the file to vecvsfor.cpp, and the result is about the same:
I decided not to stop there and wrote a JavaScript program that performs the same multiplication in a for loop. Here again I specified the path to the Node.js executable at the top of the file. It may be different on your system. And here is the result:
Very good. It’s just twice as slow as the heavily optimized C and C++ versions. Compare this with 420 ms in Python3. Now let’s see all the results when the programs are executed one after another. Actually, you can see them at the top of this page.
In this exercise we found that Python3 is a very slow scripting language. It runs multiplications in a for loop approximately 600 times slower than C or C++ and 300 times slower than JavaScript on the Node.js engine. The numpy library is indeed very efficient and allows developers to use Python3 without significant performance degradation.
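As a quick sanity check on these ratios, the sketch below works backwards from the figures quoted in the text (420 ms, 600 times, 300 times) to the loop times they imply. The resulting values are derived, not measured, and the function name implied_ms is made up for this illustration.

```python
# Figures quoted in the article.
PYTHON_LOOP_MS = 420.0   # for loop in Python3
SLOWDOWN_VS_C = 600.0    # "approximately 600 times slower than C or C++"
SLOWDOWN_VS_JS = 300.0   # "300 times slower than JavaScript"


def implied_ms(reference_ms, slowdown):
    """Time implied for the faster program, given the slower time and the ratio."""
    return reference_ms / slowdown


c_ms = implied_ms(PYTHON_LOOP_MS, SLOWDOWN_VS_C)
js_ms = implied_ms(PYTHON_LOOP_MS, SLOWDOWN_VS_JS)

print(f"implied C/C++ loop time:      {c_ms:.2f} ms")
print(f"implied JavaScript loop time: {js_ms:.2f} ms")
print(f"JavaScript / C ratio:         {js_ms / c_ms:.1f}x")
```

The ratio between the two implied times agrees with the earlier observation that the JavaScript version is about twice as slow as the C and C++ versions.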
Does this mean that using matrix libraries in C or C++ does not make sense, since the language itself is very efficient? Of course not. There are libraries that can parallelize matrix multiplication across multiple CPU cores, provide efficient algorithms to speed up sparse matrix multiplication, etc. There is a great course on this topic at MIT OCW, 6.172: Performance Engineering of Software Systems.
Feel free to follow up. Follow me on Twitter: @AlexSmallet
P.S. After I wrote this article I realized that the numpy library is open source and available on GitHub. Sure enough, the core is written in C.
This is pretty insightful! It may be interesting to also compare using Armadillo's C++ library which uses OpenBLAS to optimize vector and matrix calculations. Makers of Armadillo report very fast performance on vector and matrix operations. See here for more details: http://arma.sourceforge.net/speed.html