Vectorized 2D moving window in numpy, including edges

Question

I realize my question is fairly similar to "Vectorized moving window on 2D array in numpy", but the answers there don't quite satisfy my needs.

Is it possible to do a vectorized 2D moving window (rolling window) that includes so-called edge effects? What would be the most efficient way to do this?

That is, I would like to slide the center of a moving window across my grid so that the center visits every cell in the grid. Along the margins of the grid, the operation should return only the portion of the window that overlaps the grid. Where the window lies entirely within the grid, the full window is returned. For example, if I have the grid:

array([[1,2,3,4],
       [2,3,4,5],
       [3,4,5,6],
       [4,5,6,7]])

...and I want to sample each point in this grid using a 3x3 window centered at that point, the operation should return a series of arrays or, ideally, a series of views into the original array, as follows:

array([[1,2],    array([[1,2,3],    array([[2,3,4],    array([[3,4],
       [2,3]])          [2,3,4]])          [3,4,5]])          [4,5]])

array([[1,2],    array([[1,2,3],    array([[2,3,4],    array([[3,4],
       [2,3],           [2,3,4],           [3,4,5],           [4,5],
       [3,4]])          [3,4,5]])          [4,5,6]])          [5,6]])

array([[2,3],    array([[2,3,4],    array([[3,4,5],    array([[4,5],
       [3,4],           [3,4,5],           [4,5,6],           [5,6],
       [4,5]])          [4,5,6]])          [5,6,7]])          [6,7]])

array([[3,4],    array([[3,4,5],    array([[4,5,6],    array([[5,6],
       [4,5]])          [4,5,6]])          [5,6,7]])          [6,7]])

Because I need to perform this operation many times, speed is important, and the ideal solution would be a vectorized operation.

Recommended answer

You could define a generator function and use that. The window half-width along each axis is the floor of the desired window size divided by 2, and the trick is simply to slice the array with that half-width, clipped to the array bounds, as you move along the rows and columns.

import numpy as np

def window(arr, shape=(3, 3)):
    # Half-widths of the window along rows and columns
    r_win = shape[0] // 2
    c_win = shape[1] // 2
    x, y = arr.shape
    for i in range(x):
        # Clip the row range to the array bounds
        xmin = max(0, i - r_win)
        xmax = min(x, i + r_win + 1)
        for j in range(y):
            # Clip the column range to the array bounds
            ymin = max(0, j - c_win)
            ymax = min(y, j + c_win + 1)
            # Basic slicing, so each window is a view into arr
            yield arr[xmin:xmax, ymin:ymax]

You could use this function like so:

arr = np.array([[1,2,3,4],
               [2,3,4,5],
               [3,4,5,6],
               [4,5,6,7]])
gen = window(arr)
next(gen)
array([[1, 2],
       [2, 3]])

Iterating through the generator produces all of the windows in your example, in row-major order.
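To make that concrete, here is a minimal sketch that collects every window from the generator and inspects the results. It restates the window function so it runs on its own; the names windows, shapes and all_views are introduced here only for illustration.

```python
import numpy as np

def window(arr, shape=(3, 3)):
    """Yield the edge-clipped window centred on each cell, row by row."""
    r_win, c_win = shape[0] // 2, shape[1] // 2
    n_rows, n_cols = arr.shape
    for i in range(n_rows):
        r0, r1 = max(0, i - r_win), min(n_rows, i + r_win + 1)
        for j in range(n_cols):
            c0, c1 = max(0, j - c_win), min(n_cols, j + c_win + 1)
            yield arr[r0:r1, c0:c1]

arr = np.array([[1, 2, 3, 4],
                [2, 3, 4, 5],
                [3, 4, 5, 6],
                [4, 5, 6, 7]])

# One window per cell, in row-major order
windows = list(window(arr))

# Corner windows are 2x2, edge windows 2x3 or 3x2, interior windows 3x3
shapes = [w.shape for w in windows]

# Basic slicing returns views, so no window copies data out of arr
all_views = all(np.shares_memory(w, arr) for w in windows)

print(len(windows))
print(shapes[:4])
print(all_views)
```

Since each window is a view, writing into one modifies arr, which may or may not be what you want downstream.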

It's not vectorized, but I'm not sure there is an existing vectorized function that returns arrays of different sizes. As @PaulPanzer points out, you could pad your array to the size you need and use np.lib.stride_tricks.as_strided to generate a view of the slices. Something like so:

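def rolling_window(a, shape):
    # Output shape: (window positions per axis) + (window shape)
    s = (a.shape[0] - shape[0] + 1,) + (a.shape[1] - shape[1] + 1,) + shape
    # Reusing the strides for both pairs of axes makes window (i, j) start at a[i, j]
    strides = a.strides + a.strides
    return np.lib.stride_tricks.as_strided(a, shape=s, strides=strides)

def window2(arr, shape=(3, 3)):
    r_extra = shape[0] // 2
    c_extra = shape[1] // 2
    # NaN-filled float array with a border of half a window on every side
    out = np.full((arr.shape[0] + 2 * r_extra, arr.shape[1] + 2 * c_extra), np.nan)
    # Explicit end indices also work when an extra is 0 (window size 1)
    out[r_extra:r_extra + arr.shape[0], c_extra:c_extra + arr.shape[1]] = arr
    view = rolling_window(out, shape)
    return view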

window2(arr, (3,3))
array([[[[ nan,  nan,  nan],
         [ nan,   1.,   2.],
         [ nan,   2.,   3.]],

        [[ nan,  nan,  nan],
         [  1.,   2.,   3.],
         [  2.,   3.,   4.]],

        [[ nan,  nan,  nan],
         [  2.,   3.,   4.],
         [  3.,   4.,   5.]],

        [[ nan,  nan,  nan],
         [  3.,   4.,  nan],
         [  4.,   5.,  nan]]],


       [[[ nan,   1.,   2.],
         [ nan,   2.,   3.],
         [ nan,   3.,   4.]],

        [[  1.,   2.,   3.],
         [  2.,   3.,   4.],
         [  3.,   4.,   5.]],

        [[  2.,   3.,   4.],
         [  3.,   4.,   5.],
         [  4.,   5.,   6.]],

        [[  3.,   4.,  nan],
         [  4.,   5.,  nan],
         [  5.,   6.,  nan]]],


       [[[ nan,   2.,   3.],
         [ nan,   3.,   4.],
         [ nan,   4.,   5.]],

        [[  2.,   3.,   4.],
         [  3.,   4.,   5.],
         [  4.,   5.,   6.]],

        [[  3.,   4.,   5.],
         [  4.,   5.,   6.],
         [  5.,   6.,   7.]],

        [[  4.,   5.,  nan],
         [  5.,   6.,  nan],
         [  6.,   7.,  nan]]],


       [[[ nan,   3.,   4.],
         [ nan,   4.,   5.],
         [ nan,  nan,  nan]],

        [[  3.,   4.,   5.],
         [  4.,   5.,   6.],
         [ nan,  nan,  nan]],

        [[  4.,   5.,   6.],
         [  5.,   6.,   7.],
         [ nan,  nan,  nan]],

        [[  5.,   6.,  nan],
         [  6.,   7.,  nan],
         [ nan,  nan,  nan]]]])

This version pads the edges with np.nan to avoid confusion with any other values in your array. Because NaN requires a floating-point array, the output is float even for integer input. It is about 3x faster than the window function on the given array, but I am not sure how the padded output will affect whatever you want to do downstream.
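As one possible way of working with the padded output downstream, the sketch below reduces each window with NaN-aware functions so the padding is ignored. It is not part of the original answer: it swaps in np.pad and sliding_window_view (available in NumPy 1.20 and later) for the hand-written padding and as_strided call, and it assumes an odd window size and input data that contains no NaN of its own. The names padded_windows, window_means and window_counts are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

def padded_windows(arr, shape=(3, 3)):
    """Return a read-only (rows, cols, win_rows, win_cols) view of NaN-padded windows.

    Assumes odd window sizes, so every cell is the centre of exactly one window.
    """
    r_extra, c_extra = shape[0] // 2, shape[1] // 2
    # NaN only exists for floats, so convert before padding
    padded = np.pad(arr.astype(float),
                    ((r_extra, r_extra), (c_extra, c_extra)),
                    mode="constant", constant_values=np.nan)
    return sliding_window_view(padded, shape)

arr = np.array([[1, 2, 3, 4],
                [2, 3, 4, 5],
                [3, 4, 5, 6],
                [4, 5, 6, 7]])

view = padded_windows(arr)

# Mean over each window, ignoring the NaN padding at the edges
window_means = np.nanmean(view, axis=(-2, -1))

# Number of real (non-padding) cells in each window
window_counts = np.sum(~np.isnan(view), axis=(-2, -1))

print(view.shape)
print(window_means)
print(window_counts)
```

Reductions such as np.nansum, np.nanmin and np.nanmax follow the same pattern. If the data can legitimately contain NaN, a separate boolean mask padded alongside the data would be needed to tell padding apart from missing values.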
