-
Notifications
You must be signed in to change notification settings - Fork 311
Optimize windowed #809
Copy link
Copy link
Closed
Description
Intro
I have been working towards increasing the performance of the windowed method and have made some good headway. Main performance gains come from using built in itertools methods and using islice to "fast forward" the windows.
At low list sizes the performance does take a small hit but I did try to minimize that as much as possible.
As the list size, n, or step get larger the benefits become clearer gaining approximately a 25-50% performance gain.
Testing
Test:
from more_itertools import windowed, consume
%timeit -n 100000 consume(windowed([1]*10, 3, step=1))
%timeit -n 100000 consume(windowed([1]*10, 3, step=2))
%timeit -n 100000 consume(windowed([1]*10, 3, step=4))
%timeit -n 100000 consume(windowed([1]*10, 3, step=6))
print()
%timeit -n 100000 consume(windowed([1]*20, 5, step=1))
%timeit -n 100000 consume(windowed([1]*20, 5, step=2))
%timeit -n 100000 consume(windowed([1]*20, 5, step=4))
%timeit -n 100000 consume(windowed([1]*20, 5, step=8))
%timeit -n 100000 consume(windowed([1]*20, 5, step=12))
print()
%timeit -n 100000 consume(windowed([1]*40, 10, step=1))
%timeit -n 100000 consume(windowed([1]*40, 10, step=4))
%timeit -n 100000 consume(windowed([1]*40, 10, step=7))
%timeit -n 100000 consume(windowed([1]*40, 10, step=9))
%timeit -n 100000 consume(windowed([1]*40, 10, step=15))
%timeit -n 100000 consume(windowed([1]*40, 10, step=25))
print()
%timeit -n 100000 consume(windowed([1]*80, 20, step=1))
%timeit -n 100000 consume(windowed([1]*80, 20, step=6))
%timeit -n 100000 consume(windowed([1]*80, 20, step=12))
%timeit -n 100000 consume(windowed([1]*80, 20, step=19))
%timeit -n 100000 consume(windowed([1]*80, 20, step=25))
%timeit -n 100000 consume(windowed([1]*80, 20, step=35))
print()
%timeit -n 10000 consume(windowed([1]*1000, 30, step=1))
%timeit -n 10000 consume(windowed([1]*1000, 30, step=2))
%timeit -n 10000 consume(windowed([1]*1000, 30, step=3))
%timeit -n 10000 consume(windowed([1]*1000, 30, step=4))
%timeit -n 10000 consume(windowed([1]*1000, 30, step=8))
%timeit -n 10000 consume(windowed([1]*1000, 30, step=12))
%timeit -n 10000 consume(windowed([1]*1000, 30, step=18))
%timeit -n 10000 consume(windowed([1]*1000, 30, step=25))
%timeit -n 10000 consume(windowed([1]*1000, 30, step=40))
%timeit -n 10000 consume(windowed([1]*1000, 30, step=80))Before:
2.19 µs ± 18.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.01 µs ± 24.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
1.85 µs ± 19.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
1.65 µs ± 14.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
3.39 µs ± 27 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.87 µs ± 36.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.47 µs ± 28.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.26 µs ± 21.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.1 µs ± 32.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
6.08 µs ± 63.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
3.93 µs ± 38.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
3.58 µs ± 35.4 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
3.44 µs ± 26.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
3.05 µs ± 26.4 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.97 µs ± 36.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
14 µs ± 111 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
6.25 µs ± 73.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
5.39 µs ± 43.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
5.48 µs ± 29.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
5.31 µs ± 51.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
5.07 µs ± 47.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
227 µs ± 901 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
135 µs ± 762 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
106 µs ± 1.54 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
92.3 µs ± 669 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
67.7 µs ± 957 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
60.3 µs ± 1.25 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
53.6 µs ± 375 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
50.7 µs ± 225 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
47.5 µs ± 556 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
45.3 µs ± 487 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
After:
2.43 µs ± 15.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.16 µs ± 40.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
1.46 µs ± 8.91 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
1.25 µs ± 14.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
3.34 µs ± 48 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.78 µs ± 46.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.42 µs ± 6.42 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
1.54 µs ± 18.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
1.34 µs ± 34.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
5.56 µs ± 42.4 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
3.4 µs ± 101 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
3.03 µs ± 88.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.99 µs ± 127 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
1.65 µs ± 14.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
1.44 µs ± 11 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
12.4 µs ± 70.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
4.86 µs ± 37.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
4.2 µs ± 19 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
4.19 µs ± 22.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.53 µs ± 19.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.23 µs ± 5.48 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
204 µs ± 782 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
112 µs ± 472 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
82.2 µs ± 661 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
67.4 µs ± 505 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
44.5 µs ± 486 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
37.8 µs ± 534 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
33.3 µs ± 1.17 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
29.8 µs ± 471 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
15.5 µs ± 314 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
10.8 µs ± 89.8 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels