Suggestion: Sliding Window Function #7753

@nils-werner

Using numpy.lib.stride_tricks.as_strided, one can very efficiently create a sliding window that segments an array as a preprocessing step for vectorized applications. For example, a moving average with window length 3 and stepsize 1:

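import numpy

a = numpy.arange(10)
a_strided = numpy.lib.stride_tricks.as_strided(
    # strides are in bytes, so use the item size instead of a hard-coded 8
    a, shape=(8, 3), strides=(a.itemsize, a.itemsize)
)
print(numpy.mean(a_strided, axis=1))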

This is very performant but hard to get right, because the shape and strides parameters are difficult to understand.
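To make the two parameters concrete, here is a minimal sketch of how they could be derived for a 1-D array. The helper name window_params is hypothetical and only serves as illustration; it is not part of NumPy or of either implementation discussed in this issue.

```python
import numpy
from numpy.lib.stride_tricks import as_strided


def window_params(a, size, stepsize=1):
    """Return (shape, strides) for sliding a window over a 1-D array.

    Hypothetical helper for illustration only.
    """
    if a.ndim != 1:
        raise ValueError("this sketch only handles 1-D input")
    if size < 1 or stepsize < 1 or size > a.shape[0]:
        raise ValueError("need 1 <= size <= len(a) and stepsize >= 1")
    # Number of complete windows that fit into the array
    n_windows = (a.shape[0] - size) // stepsize + 1
    # Distance in bytes between two neighbouring elements
    step_bytes = a.strides[0]
    shape = (n_windows, size)
    # Moving to the next window skips `stepsize` elements,
    # moving inside a window skips one element
    strides = (stepsize * step_bytes, step_bytes)
    return shape, strides


a = numpy.arange(10)
shape, strides = window_params(a, size=3)
a_strided = as_strided(a, shape=shape, strides=strides)
print(shape)
print(numpy.mean(a_strided, axis=1))
```

The first stride says how far to jump in memory to reach the next window, the second how far to jump to reach the next element inside a window. Both are byte counts, which is why hard-coding them ties the code to one dtype.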

I suggest implementing a simple sliding_window function that figures out those two parameters for you.

The implementation I have been using for years allows the above to be replaced by

a = numpy.arange(10)
a_strided = sliding_window(a, size=3, stepsize=1)
print(numpy.mean(a_strided, axis=1))

@teoliphant also has an implementation that would change the above to

a = numpy.arange(10)
a_strided = array_for_sliding_window(a, 3)
print(numpy.mean(a_strided, axis=1))

Both make it much more readable.

Since this is a common use case in vectorized computing, I suggest we put a similar function into NumPy itself.

As for which implementation to follow: they make different assumptions, but ultimately allow you to do the same things:

sliding_window

  • slides over one axis only
  • allows setting windowsize and stepsize
  • returns an array with n+1 dimensions
  • sliding over several axes requires one call per axis (which comes for free, as no memory is reordered)
  • has a superfluous copy parameter that can be removed and replaced by appending .copy() after the call
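The single-axis behaviour listed above could be sketched roughly as follows. This is not the implementation referred to in this issue: the name sliding_window_sketch, the parameter defaults, and the choice to put the window contents into a new last axis are all assumptions made for illustration.

```python
import numpy
from numpy.lib.stride_tricks import as_strided


def sliding_window_sketch(a, size, stepsize=1, axis=-1):
    """Slide a window of length `size` along one axis of `a`.

    The window contents end up in a new last axis, so the result has
    a.ndim + 1 dimensions. This layout is an assumption of the sketch.
    Requires an input with at least one dimension.
    """
    a = numpy.asarray(a)
    axis = axis % a.ndim  # normalise negative axes
    if not 1 <= size <= a.shape[axis] or stepsize < 1:
        raise ValueError("invalid size or stepsize")
    n_windows = (a.shape[axis] - size) // stepsize + 1
    # The slid axis now counts windows; the new last axis walks through
    # one window, element by element
    shape = a.shape[:axis] + (n_windows,) + a.shape[axis + 1:] + (size,)
    strides = (a.strides[:axis] + (stepsize * a.strides[axis],)
               + a.strides[axis + 1:] + (a.strides[axis],))
    # writeable=False guards against writes to overlapping memory
    return as_strided(a, shape=shape, strides=strides, writeable=False)


a = numpy.arange(25).reshape(5, 5)
w = sliding_window_sketch(a, size=3, axis=1)
print(w.shape)

# Sliding over two axes: one call per axis, still without copying
ww = sliding_window_sketch(sliding_window_sketch(a, size=3, axis=0),
                           size=2, axis=1)
print(ww.shape)
print(numpy.shares_memory(ww, a))
```

Because the sketch builds the new strides from the strides of its input, it can be applied to its own output, which is what makes the one-call-per-axis pattern work.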

array_for_sliding_window

  • slides over all axes simultaneously; window lengths are given as a tuple parameter
  • assumes a stepsize of one in all directions
  • returns an array with n*2 dimensions
  • a stepsize not equal to one requires slicing the output data (unsure if this implies copying data)
  • disabling sliding over axis n requires setting wshape[n] = 1 or wshape[n] = a.shape[n]
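For comparison, the all-axes behaviour listed above could be sketched like this. Again, this is not @teoliphant's code: the name all_axes_windows_sketch and the ordering of the output axes (window positions first, window contents second) are assumptions made for illustration.

```python
import numpy
from numpy.lib.stride_tricks import as_strided


def all_axes_windows_sketch(a, wshape):
    """Slide a window of shape `wshape` over every axis with stepsize 1.

    Returns a read-only view with 2 * a.ndim dimensions: first the
    window positions, then the window contents (an assumed layout).
    """
    a = numpy.asarray(a)
    wshape = (wshape,) if numpy.isscalar(wshape) else tuple(wshape)
    if len(wshape) != a.ndim:
        raise ValueError("need one window length per axis")
    if any(not 1 <= w <= n for w, n in zip(wshape, a.shape)):
        raise ValueError("window lengths must fit inside the array")
    # Number of window positions along each axis for stepsize 1
    positions = tuple(n - w + 1 for n, w in zip(a.shape, wshape))
    # With stepsize 1, moving the window and moving inside the window
    # use the same byte strides, so they are simply repeated
    return as_strided(a, shape=positions + wshape,
                      strides=a.strides + a.strides, writeable=False)


a = numpy.arange(25).reshape(5, 5)
w = all_axes_windows_sketch(a, (3, 2))
print(w.shape)

# A window length of 1 along axis 0 effectively disables sliding there
row_windows = all_axes_windows_sketch(a, (1, 3))
print(row_windows.shape)
```

In this layout w[i, j] is the 3x2 block of a whose top-left corner sits at (i, j).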

This means that, for a flexible stepsize, the following are equivalent (with a minor bug in sliding_window there):

a = numpy.arange(10)
print(sliding_window(a, size=3, stepsize=2))

a = numpy.arange(10)
print(array_for_sliding_window(a, 3)[::2, :])  # Stepsize 2 by dropping every 2nd row
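On the question of whether slicing the output copies data: basic slicing with a step returns a view in NumPy, so dropping rows this way should not copy. A small check, with the step-1 windows built directly through as_strided so that the snippet does not depend on either implementation:

```python
import numpy
from numpy.lib.stride_tricks import as_strided

a = numpy.arange(10)
item = a.strides[0]

# All step-1 windows of length 3, built directly with as_strided
windows = as_strided(a, shape=(8, 3), strides=(item, item), writeable=False)

# Keep every second window, which corresponds to a stepsize of 2
every_second = windows[::2, :]

print(every_second)
# True if the sliced result still points into the memory of `a`
print(numpy.shares_memory(every_second, a))
# The slice only doubles the stride of the first axis
print(every_second.strides == (2 * item, item))
```

Advanced (integer-array or boolean) indexing, by contrast, would produce a copy.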

For sliding over one axis, the following are equivalent (with some transposing and squeezing):

a = numpy.arange(25).reshape(5, 5)
print(sliding_window(a, size=3, axis=1))

a = numpy.arange(25).reshape(5, 5)
print(array_for_sliding_window(a, (1, 3)))

And for sliding over two axes, the following are equivalent:

a = numpy.arange(25).reshape(5, 5)
print(sliding_window(sliding_window(a, size=3, axis=0), size=2, axis=1))

a = numpy.arange(25).reshape(5, 5)
print(array_for_sliding_window(a, (3, 2)))

This issue is meant to spark discussion about:

  • Do we need such a function?
  • Which features are required?
  • Which interface should we pursue?

After discussion, I am willing to draft a pull request with the implementation we agree on.
