Suggestion: Sliding Window Function #7753
Description
Using np.lib.stride_tricks.as_strided, one can very efficiently create a sliding window that segments an array as a preprocessing step for vectorized applications. For example, a moving average with window length 3 and stepsize 1:
import numpy
a = numpy.arange(10)
a_strided = numpy.lib.stride_tricks.as_strided(
    a, shape=(8, 3), strides=(8, 8)  # 8 bytes per element, assuming 64-bit integers
)
print(numpy.mean(a_strided, axis=1))
This is very performant but hard to get right, because the shape and strides parameters are difficult to understand.
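To make the two parameters less opaque, here is a minimal sketch of how they can be derived for the one-dimensional case instead of being hard-coded. The variable names (size, n_windows, step_bytes) are illustrative and not part of any proposed API; writeable=False is added only as a safety measure, because the windows overlap in memory.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(10)
size = 3                           # window length

# With stepsize 1 there is one window per valid start index.
n_windows = a.size - size + 1      # 10 - 3 + 1 = 8

# Bytes between neighbouring elements; read from the array rather than
# assuming 8, so the code also works for other dtypes.
step_bytes = a.strides[0]

windows = as_strided(
    a,
    shape=(n_windows, size),           # (number of windows, window length)
    strides=(step_bytes, step_bytes),  # next window and next element are both one item apart
    writeable=False,                   # overlapping memory: keep the view read-only
)

means = windows.mean(axis=1)
print(means)  # -> [1. 2. 3. 4. 5. 6. 7. 8.]
```

Both strides are equal because moving to the next window and moving to the next element inside a window each advance by exactly one item.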
I suggest implementing a simple sliding_window function that does the job of figuring out those two parameters for you.
The implementation I have been using for years allows the above to be replaced by
a = numpy.arange(10)
a_strided = sliding_window(a, size=3, stepsize=1)
print(numpy.mean(a_strided, axis=1))
@teoliphant also has an implementation that would change the above to
a = numpy.arange(10)
a_strided = array_for_sliding_window(a, 3)
print(numpy.mean(a_strided, axis=1))
Both make the code much more readable.
Since this is a common use case in vectorized computing, I suggest we put a similar function into NumPy itself.
As for which implementation to follow, the two make different assumptions but ultimately let you do the same things:
sliding_window
- slides over one axis only
- allows setting windowsize and stepsize
- returns array with dimension n+1
- sliding over several axes requires one call per axis (which comes for free, as no memory is reordered)
- has a superfluous copy parameter that can be removed and replaced by appending .copy() after the call
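The properties listed above can be sketched as follows. This is not the implementation referred to in this issue; the name sliding_window_sketch, its defaults, and the choice to append the window dimension as the last axis are all assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def sliding_window_sketch(a, size, stepsize=1, axis=-1):
    """Hypothetical one-axis sliding window; returns a read-only view with a.ndim + 1 dimensions."""
    a = np.asarray(a)
    if not -a.ndim <= axis < a.ndim:
        raise ValueError("axis out of range")
    axis = axis % a.ndim  # normalise negative axes
    if size < 1 or stepsize < 1:
        raise ValueError("size and stepsize must be positive")
    if size > a.shape[axis]:
        raise ValueError("window is longer than the axis")

    # Number of window start positions along the chosen axis.
    n_windows = (a.shape[axis] - size) // stepsize + 1

    shape = list(a.shape)
    strides = list(a.strides)
    shape[axis] = n_windows
    strides[axis] = a.strides[axis] * stepsize  # jump stepsize items per window

    # Assumption: the window contents go into a new trailing axis.
    shape.append(size)
    strides.append(a.strides[axis])

    return as_strided(a, shape=shape, strides=strides, writeable=False)


print(sliding_window_sketch(np.arange(10), size=3, stepsize=2))
# -> [[0 1 2]
#     [2 3 4]
#     [4 5 6]
#     [6 7 8]]
```

Because only shape and strides change, no data is copied; a caller who needs an independent array can append .copy() to the result.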
array_for_sliding_window
- slides over all axes simultaneously; window lengths are given as a tuple parameter
- assumes a stepsize of one in all directions
- returns array with dimension n*2
- stepsize not equal to one requires slicing of output data (unsure if this implies copying data)
- disabling sliding over axis[n] requires you to set the argument wshape[n] = 1 or wshape[n] = a.shape[n]
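For comparison, the all-axes behaviour described in this list can be sketched in a few lines. This is not @teoliphant's code; the name window_all_axes_sketch and the output layout (window positions first, window contents last) are assumptions chosen to match the "dimension n*2" description.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def window_all_axes_sketch(a, wshape):
    """Hypothetical all-axes sliding window with stepsize 1; returns a read-only view with 2 * a.ndim dimensions."""
    a = np.asarray(a)
    if isinstance(wshape, (int, np.integer)):
        wshape = (int(wshape),)
    else:
        wshape = tuple(int(w) for w in wshape)
    if len(wshape) != a.ndim:
        raise ValueError("wshape needs one entry per axis")
    if any(w < 1 or w > n for w, n in zip(wshape, a.shape)):
        raise ValueError("each window length must be between 1 and the axis length")

    # Number of window positions along every axis (stepsize 1).
    outer = tuple(n - w + 1 for n, w in zip(a.shape, wshape))

    # Position axes and window axes both step through the array with its own strides.
    return as_strided(
        a, shape=outer + wshape, strides=a.strides + a.strides, writeable=False
    )


b = np.arange(25).reshape(5, 5)
print(window_all_axes_sketch(b, (3, 2)).shape)  # -> (3, 4, 3, 2)
print(window_all_axes_sketch(b, (1, 3)).shape)  # -> (5, 3, 1, 3)
```

With a window length of 1 on an axis, that axis is effectively not windowed, which leaves a length-1 dimension that can be squeezed away afterwards.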
This means that for a flexible stepsize the following are equivalent (apart from a minor bug in sliding_window):
a = numpy.arange(10)
print(sliding_window(a, size=3, stepsize=2))
a = numpy.arange(10)
print(array_for_sliding_window(a, 3)[::2, :])  # Stepsize 2 by dropping every 2nd row
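The equivalence of the two approaches can be checked directly with as_strided, without either helper function. The sketch below builds the stepsize-2 windows once by slicing a stepsize-1 view and once by encoding the step in the strides, then compares them. np.shares_memory is used to test whether the sliced result still refers to the original buffer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(10)
item = a.strides[0]  # bytes per element

# Stepsize 1 windows, then keep every 2nd row.
step1 = as_strided(a, shape=(8, 3), strides=(item, item), writeable=False)
by_slicing = step1[::2, :]

# Stepsize 2 encoded directly in the first stride.
by_strides = as_strided(a, shape=(4, 3), strides=(2 * item, item), writeable=False)

same_values = np.array_equal(by_slicing, by_strides)
is_view = np.shares_memory(by_slicing, a)

print(same_values)  # -> True
print(is_view)      # -> True
print(by_slicing.strides == by_strides.strides)  # -> True
```

Basic slicing with a step only adjusts the strides of the result, so in this example the sliced windows remain a view onto a and end up with the same strides as the directly constructed version.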
For sliding over one axis, the following are equivalent (with some transposing and squeezing):
a = numpy.arange(25).reshape(5, 5)
print(sliding_window(a, size=3, axis=1))
a = numpy.arange(25).reshape(5, 5)
print(array_for_sliding_window(a, (1, 3)))
And for sliding over two axes, the following are equivalent:
a = numpy.arange(25).reshape(5, 5)
print(sliding_window(sliding_window(a, size=3, axis=0), size=2, axis=1))
a = numpy.arange(25).reshape(5, 5)
print(array_for_sliding_window(a, (3, 2)))
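To illustrate why chaining two one-axis windows can give the same result as windowing both axes at once, here is a sketch using as_strided directly. The axis ordering (window positions first, window contents last) is an assumption; either of the implementations discussed here may order the output axes differently.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(25).reshape(5, 5)
s0, s1 = a.strides

# Step 1: windows of length 3 along axis 0 -> shape (3, 5, 3).
step = as_strided(a, shape=(3, 5, 3), strides=(s0, s1, s0), writeable=False)

# Step 2: windows of length 2 along axis 1 of the intermediate view -> shape (3, 4, 3, 2).
t0, t1, t2 = step.strides
chained = as_strided(
    step, shape=(3, 4, 3, 2), strides=(t0, t1, t2, t1), writeable=False
)

# Both axes at once: 3 x 2 windows at every valid position.
direct = as_strided(
    a, shape=(3, 4, 3, 2), strides=(s0, s1, s0, s1), writeable=False
)

print(np.array_equal(chained, direct))             # -> True
print(np.array_equal(direct[1, 2], a[1:4, 2:4]))   # -> True
```

In this layout, element [i, j] of the result is the 3 x 2 block of a whose top-left corner is at row i, column j.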
This issue is meant to spark discussion about:
- Do we need such a function?
- Which features are required?
- Which interface should we pursue?
After the discussion, I am willing to draft a pull request with the implementation we agree on.