slice – A Passionate Techie

Example:

>>>import numpy as np >>>arr = np.array(range(42)).reshape(6,7) >>>arr array([[ 0, 1, 2, 3, 4, 5, 6], [ 7, 8, 9, 10, 11, 12, 13], [14, 15, 16, 17, 18, 19, 20], [21, 22, 23, 24, 25, 26, 27], [28, 29, 30, 31, 32, 33, 34], [35, 36, 37, 38, 39, 40, 41]]) >>># What I want to do: >>>select_random_windows(arr, window_size=3) array([[ 1, 2, 3], [11, 12, 13], [14, 15, 16], [22, 23, 24], [38, 39, 40]])

What I’m going with right now is terribly slow:

def select_random_windows(arr, wndow_size): result = [] offsets = np.random.randint(0, arr.shape[0]-window_size, size = arr.shape[1]) for row, offset in enumerate(start_indices): result.append(arr[row][offset: offset + window_size]) return np.array(result)

Sure, I could do the same with a list comprehension (and get a minimal speed boost), but I was wondering wether there is some super smart numpy vectorized way to do this.

Solution:

def random_windows_per_row_strided(arr, W=3): idx = np.random.randint(0,arr.shape[1]-W+1, arr.shape[0]) strided = np.lib.stride_tricks.as_strided m,n = arr.shape s0,s1 = arr.strides windows = strided(arr, shape=(m,n-W+1,W), strides=(s0,s1,s1)) return windows[np.arange(len(idx)), idx]

Runtime test on bigger array with 10,000 rows –

In [469]: arr = np.random.rand(100000,100) # @Psidom's soln In [470]: %timeit select_random_windows(arr, window_size=3) 100 loops, best of 3: 7.41 ms per loop In [471]: %timeit random_windows_per_row_strided(arr, W=3) 100 loops, best of 3: 6.84 ms per loop # @Psidom's soln In [472]: %timeit select_random_windows(arr, window_size=30) 10 loops, best of 3: 26.8 ms per loop In [473]: %timeit random_windows_per_row_strided(arr, W=30) 100 loops, best of 3: 9.65 ms per loop # @Psidom's soln In [474]: %timeit select_random_windows(arr, window_size=50) 10 loops, best of 3: 41.8 ms per loop In [475]: %timeit random_windows_per_row_strided(arr, W=50) 100 loops, best of 3: 10 ms per loop

I’ve come up with this question while trying to apply a Cesar Cipher to a matrix with different shift values for each row, i.e. given a matrix X

array([[1, 0, 8],
   [5, 1, 4],
   [2, 1, 1]])

with shift values of S = array([0, 1, 1]), the output needs to be

array([[1, 0, 8],
   [1, 4, 5],
   [1, 1, 2]])

This is easy to implement by the following code:

Y = []
for i in range(X.shape[0]):
    if (S[i] > 0):
        Y.append( X[i,S[i]::].tolist() + X[i,:S[i]:].tolist() )
    else:
        Y.append(X[i,:].tolist())
Y = np.array(Y)

This is a left-cycle-shift. I wonder how to do this in a more efficient way using numpy arrays?

Update: This example applies the shift to the columns of a matrix. Suppose that we have a 3D array

array([[[8, 1, 8],
        [8, 6, 2],
        [5, 3, 7]],

       [[4, 1, 0],
        [5, 9, 5],
        [5, 1, 7]],

       [[9, 8, 6],
        [5, 1, 0],
        [5, 5, 4]]])

Then, the cyclic right shift of S = array([0, 0, 1]) over the columns leads to

array([[[8, 1, 7],
        [8, 6, 8],
        [5, 3, 2]],

       [[4, 1, 7],
        [5, 9, 0],
        [5, 1, 5]],

       [[9, 8, 4],
        [5, 1, 6],
        [5, 5, 0]]])

Solution:

Approach #1 : Use modulus to implement the cyclic pattern and get the new column indices and then simply use advanced-indexing to extract the elements, giving us a vectorized solution, like so –

def cyclic_slice(X, S):
    m,n = X.shape
    idx = np.mod(np.arange(n) + S[:,None],n)
    return X[np.arange(m)[:,None], idx]

Approach #2 : We can also leverage the power of strides for further speedup. The idea would be to concatenate the sliced off portion from the start and append it at the end, then create sliding windows of lengths same as the number of cols and finally index into the appropriate window numbers to get the same rolled over effect. The implementation would be like so –

def cyclic_slice_strided(X, S):
    X2 = np.column_stack((X,X[:,:-1]))
    s0,s1 = X2.strides
    strided = np.lib.stride_tricks.as_strided 

    m,n1 = X.shape
    n2 = X2.shape[1]
    X2_3D = strided(X2, shape=(m,n2-n1+1,n1), strides=(s0,s1,s1))
    return X2_3D[np.arange(len(S)),S]

Sample run –

In [34]: X
Out[34]: 
array([[1, 0, 8],
       [5, 1, 4],
       [2, 1, 1]])

In [35]: S
Out[35]: array([0, 1, 1])

In [36]: cyclic_slice(X, S)
Out[36]: 
array([[1, 0, 8],
       [1, 4, 5],
       [1, 1, 2]])

Runtime test –

In [75]: X = np.random.rand(10000,100)
    ...: S = np.random.randint(0,100,(10000))

# @Moses Koledoye's soln
In [76]: %%timeit
    ...: Y = []
    ...: for i, x in zip(S, X):
    ...:     Y.append(np.roll(x, -i))
10 loops, best of 3: 108 ms per loop

In [77]: %timeit cyclic_slice(X, S)
100 loops, best of 3: 14.1 ms per loop

In [78]: %timeit cyclic_slice_strided(X, S)
100 loops, best of 3: 4.3 ms per loop

Adaption for 3D case

Adapting approach #1 for the 3D case, we would have –

shift = 'left'
axis = 1 # axis along which S is to be used (axis=1 for rows)
n = X.shape[axis]
if shift == 'left':
    Sa = S
else:
    Sa = -S    

# For rows
idx = np.mod(np.arange(n)[:,None] + Sa,n)
out = X[:,idx, np.arange(len(S))]

# For columns
idx = np.mod(Sa[:,None] + np.arange(n),n)
out = X[:,np.arange(len(S))[:,None], idx]

# For axis=0
idx = np.mod(np.arange(n)[:,None] + Sa,n)
out = X[idx, np.arange(len(S))]

There could be a way to have a generic solution for a generic axis, but I will keep it to this point.

	gamejudilebaran.word… on Chef: Roles and Environme…
	WARN: Waiting for se… on OSSEC start problem due to…
	aratik711 on Ansible issues
	aratik711 on Chef: Roles and Environme…
	situs judi on Chef: Roles and Environme…

Tag: slice

Selecting Random Windows from Multidimensional Numpy Array Rows

Example:

What an ideal solution would look like to me:

What I’m going with right now is terribly slow:

Cycling Slicing in Python

Example:

What an ideal solution would look like to me:

What I’m going with right now is terribly slow:

Rate this:

Share this:

Rate this:

Share this: