We've seen a dramatic (about 25%) performance loss in our numerical model tests when switching from numpy 1.16.4 to 1.17.
The code is fairly standard numerics on small but numerous arrays at frequent time steps. After profiling, it turned out that both the number of function calls under np.clip and the cumtime spent in it had increased drastically.
Microbenchmarks on np.clip did not show any significant change until I found this one:
In [1]: import numpy as np
In [2]: np.__version__
Out[2]: '1.16.4'
In [3]: %timeit np.clip(10, 0, None)
2.99 µs ± 26.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
To be compared to:
In [1]: import numpy as np
In [2]: np.__version__
Out[2]: '1.17.0'
In [3]: %timeit np.clip(10, 0, None)
15.1 µs ± 117 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
So, about 5 times slower.
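For anyone who wants to repeat the comparison outside IPython, here is a minimal sketch using the standard-library `timeit` module. The helper name `time_scalar_clip` is made up for illustration, and it reports the best of several repeats rather than the mean ± std. dev. that `%timeit` prints, so the numbers will not match the ones above exactly.

```python
import timeit

import numpy as np


def time_scalar_clip(number=10_000, repeat=5):
    """Return the best per-call time, in microseconds, for np.clip(10, 0, None).

    Hypothetical helper, not part of NumPy.
    """
    timer = timeit.Timer(lambda: np.clip(10, 0, None))
    # Take the minimum over the repeats to reduce noise from other processes.
    best_total = min(timer.repeat(repeat=repeat, number=number))
    return best_total / number * 1e6


if __name__ == "__main__":
    print(f"numpy {np.__version__}: {time_scalar_clip():.2f} µs per call")
```

Running the same script in one environment with numpy 1.16.4 and in another with 1.17.0 should show whether the slowdown reproduces on a given machine.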
I made a simple test and profiled it:
def test_clip():
    for i in range(1000000):
        np.clip(1, 0, None)
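A comparable profile can probably be produced with the standard-library `cProfile` and `pstats` modules. This is only a sketch: the function names `clip_loop` and `profile_clip` are illustrative, the loop count is reduced to keep the run short, and it is an assumption that this matches the tool that generated the `.prof` files in this report.

```python
import cProfile
import io
import pstats

import numpy as np


def clip_loop(n):
    """Call np.clip on plain Python scalars n times."""
    for _ in range(n):
        np.clip(1, 0, None)


def profile_clip(n=100_000):
    """Profile clip_loop and return the numpy-only report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    clip_loop(n)
    profiler.disable()

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only entries whose
    # "filename:lineno(function)" string matches the regex "numpy".
    stats.sort_stats("cumulative").print_stats("numpy")
    return stream.getvalue()


if __name__ == "__main__":
    print(f"numpy {np.__version__}")
    print(profile_clip())
```

The `"numpy"` restriction passed to `print_stats` is what yields the "List reduced from ... due to restriction <'numpy'>" line in the report.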
And here are the profiler outputs (trimmed to show only numpy entries):
numpy 1.16.4
Thu Aug 15 11:14:12 2019 prof_16/prof/test_clip.prof
8002900 function calls (8002830 primitive calls) in 4.223 seconds
Ordered by: cumulative time
List reduced from 382 to 6 due to restriction <'numpy'>
ncalls tottime percall cumtime percall filename:lineno(function)
1000000 0.352 0.000 3.949 0.000 (...)/numpy/core/fromnumeric.py:1903(clip)
1000000 0.770 0.000 3.597 0.000 (...)/numpy/core/fromnumeric.py:54(_wrapfunc)
1000000 0.903 0.000 2.565 0.000 (...)/numpy/core/fromnumeric.py:41(_wrapit)
1000000 0.305 0.000 0.873 0.000 (...)/numpy/core/numeric.py:469(asarray)
1000000 0.704 0.000 0.704 0.000 {method 'clip' of 'numpy.ndarray' objects}
1000000 0.568 0.000 0.568 0.000 {built-in method numpy.array}
numpy 1.17
Thu Aug 15 11:15:39 2019 prof_17/prof/test_clip.prof
35003584 function calls (33003457 primitive calls) in 21.119 seconds
Ordered by: cumulative time
List reduced from 448 to 14 due to restriction <'numpy'>
ncalls tottime percall cumtime percall filename:lineno(function)
3000000/1000000 0.928 0.000 19.951 0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
1000000 0.565 0.000 19.577 0.000 (...)/numpy/core/fromnumeric.py:1974(clip)
1000000 0.620 0.000 19.011 0.000 (...)/numpy/core/fromnumeric.py:55(_wrapfunc)
1000000 1.134 0.000 18.039 0.000 (...)/numpy/core/fromnumeric.py:42(_wrapit)
1000000 0.431 0.000 15.865 0.000 {method 'clip' of 'numpy.ndarray' objects}
1000000 1.445 0.000 15.434 0.000 (...)/numpy/core/_methods.py:97(_clip)
2000000 4.990 0.000 11.738 0.000 (...)/numpy/core/_methods.py:63(_clip_dep_is_scalar_nan)
2000000 1.603 0.000 3.749 0.000 (...)/numpy/core/fromnumeric.py:2986(ndim)
3000000 0.945 0.000 3.076 0.000 (...)/numpy/core/_asarray.py:16(asarray)
3000000 2.130 0.000 2.130 0.000 {built-in method numpy.array}
1000000 1.434 0.000 1.434 0.000 (...)/numpy/core/_methods.py:78(_clip_dep_invoke_with_casting)
2000000 0.597 0.000 0.816 0.000 (...)/numpy/core/_methods.py:73(_clip_dep_is_byte_swapped)
2000000 0.186 0.000 0.186 0.000 (...)/numpy/core/fromnumeric.py:2982(_ndim_dispatcher)
1000000 0.122 0.000 0.122 0.000 (...)/numpy/core/fromnumeric.py:1970(_clip_dispatcher)
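These are very different! The total number of function calls goes from about 8 million to about 35 million, and the cumtime of clip rises from 3.949 s to 19.577 s. In 1.17, the ndarray clip method goes through a Python-level _clip in _methods.py, and _clip_dep_is_scalar_nan alone accounts for 11.738 s of cumulative time.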
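Until this is resolved, one possible stopgap for code that calls np.clip on plain Python numbers in a hot loop is to handle those scalars in pure Python and leave everything else to NumPy. The sketch below is an assumption, not something taken from NumPy: the name `clip_scalar_or_array` is hypothetical, the bounds are assumed to be scalars or None, the fast path returns a Python number instead of a NumPy scalar, and NaN handling may differ from np.clip.

```python
import numpy as np


def clip_scalar_or_array(value, lower=None, upper=None):
    """Clip value to [lower, upper], skipping np.clip for plain Python numbers.

    Hypothetical helper; bounds are assumed to be scalars or None.
    """
    if isinstance(value, (int, float)):
        # Fast path: plain comparisons, no array conversion or dispatch.
        if lower is not None and value < lower:
            value = lower
        if upper is not None and value > upper:
            value = upper
        return value
    # Arrays and anything else still go through NumPy.
    return np.clip(value, lower, upper)
```

For example, `clip_scalar_or_array(10, 0, None)` takes the fast path, while `clip_scalar_or_array(np.array([-1.0, 0.5, 2.0]), 0, 1)` is passed on to np.clip unchanged.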