How to understand NumPy strides in layman's terms?

Problem description

I am currently going through NumPy and there is a topic in NumPy called "strides". I understand what it is. But how does it work? I did not find any useful information online. Can anyone help me understand it in layman's terms?

Solution

The actual data of a NumPy array is stored in a homogeneous and contiguous block of memory called the data buffer. For more information see NumPy internals. With the (default) row-major order, a 2D array is stored row by row: all items of row 0 come first, then all items of row 1, and so on.
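As a small sketch of that row-major layout, the snippet below flattens a 3x3 array in storage order. The dtype is fixed to int32 here as an assumption, so the byte counts do not depend on the platform's default integer.

```python
import numpy as np

# Assumption: int32 is chosen explicitly so that every item is 4 bytes.
a = np.arange(1, 10, dtype=np.int32).reshape(3, 3)

# For a C-contiguous array, ravel() walks the items in buffer order.
flat = a.ravel()
print(flat.tolist())   # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# One contiguous block: 9 items * 4 bytes each.
print(a.flags['C_CONTIGUOUS'], a.nbytes)   # True 36
```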

To map the indices i, j, k, ... of a multidimensional array to a position in the data buffer (the offset, in bytes), NumPy uses the notion of strides. Strides are the number of bytes to jump over in memory in order to get from one item to the next item along each direction/dimension of the array. In other words, a stride is the byte separation between consecutive items along a given dimension.
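The index-to-offset mapping can be sketched as a sum of index times stride over all dimensions. The helper `byte_offset` below is a hypothetical name introduced only for illustration (it is not a NumPy function), and the int32 dtype is an assumption.

```python
import numpy as np

def byte_offset(index, strides):
    """Hypothetical helper: offset in bytes = sum of index[k] * strides[k]."""
    return sum(i * s for i, s in zip(index, strides))

a = np.arange(1, 10, dtype=np.int32).reshape(3, 3)
buf = a.tobytes()   # a copy of the raw data, in C (row-major) order

i, j = 2, 1
off = byte_offset((i, j), a.strides)   # 2*12 + 1*4 = 28

# Read one int32 directly from the raw bytes at that offset.
value = int(np.frombuffer(buf, dtype=np.int32, count=1, offset=off)[0])
print(off, value, int(a[i, j]))   # 28 8 8
```

Reading from the raw bytes at the computed offset gives the same item as ordinary indexing, which is all that strides encode.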

For example:

>>> import numpy as np
>>> a = np.arange(1, 10).reshape(3, 3)
>>> a
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

This 2D array has two directions, axis-0 (running vertically downwards across rows) and axis-1 (running horizontally across columns), with each item having size:

>>> a.itemsize  # in bytes; 4 assumes a 32-bit integer dtype (8 with a 64-bit default)
4

So to go from a[0, 0] -> a[0, 1] (moving horizontally along the 0th row, from the 0th column to the 1st column) the byte-step in the data buffer is 4. The same holds for a[0, 1] -> a[0, 2], a[1, 0] -> a[1, 1], etc. This means that the stride for the horizontal direction (axis-1) is 4 bytes.

However, to go from a[0, 0] -> a[1, 0] (moving vertically along the 0th column, from the 0th row to the 1st row), you first need to traverse all the remaining items of the 0th row to reach the start of the 1st row, where a[1, 0] is stored, i.e. a[0, 0] -> a[0, 1] -> a[0, 2] -> a[1, 0]. Therefore the stride for the vertical direction (axis-0) is 3*4 = 12 bytes. Note that going from a[0, 2] -> a[1, 0], and in general from the last item of the i-th row to the first item of the (i+1)-th row, is also 4 bytes because the array a is stored in row-major order.

That's why

>>> a.strides  # (strides[0], strides[1])
(12, 4)  
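The same reasoning extends to any number of dimensions: for a row-major array, the last axis has a stride equal to the item size, and each earlier axis multiplies that by the lengths of the axes after it. The function `c_order_strides` below is a hypothetical helper written for this sketch, and the dtypes are assumptions.

```python
import numpy as np

def c_order_strides(shape, itemsize):
    """Hypothetical helper: strides of a C-contiguous array, built right to left."""
    strides = []
    step = itemsize
    for n in reversed(shape):
        strides.append(step)   # bytes between neighbours along this axis
        step *= n              # one step along the next-outer axis skips n of them
    return tuple(reversed(strides))

a = np.arange(1, 10, dtype=np.int32).reshape(3, 3)
print(c_order_strides(a.shape, a.itemsize), a.strides)   # (12, 4) (12, 4)

c = np.zeros((2, 3, 4), dtype=np.float64)
print(c_order_strides(c.shape, c.itemsize), c.strides)   # (96, 32, 8) (96, 32, 8)
```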

Here's another example showing that the stride in the horizontal direction (axis-1), strides[1], of a 2D array is not necessarily equal to the item size (e.g. for an array in column-major order):

>>> b = np.array([[1, 4, 7],
                  [2, 5, 8],
                  [3, 6, 9]]).T
>>> b
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

>>> b.strides
(4, 12)

Here strides[1] is a multiple of the item size. Although the array b looks identical to the array a, it is a different array: internally b is stored as |1|4|7|2|5|8|3|6|9| (because transposing doesn't affect the data buffer but only swaps the strides and the shape), whereas a is stored as |1|2|3|4|5|6|7|8|9|. What makes them look alike is their different strides. That is, the byte-step for b[0, 0] -> b[0, 1] is 3*4 = 12 bytes and for b[0, 0] -> b[1, 0] it is 4 bytes, whereas for a[0, 0] -> a[0, 1] it is 4 bytes and for a[0, 0] -> a[1, 0] it is 12 bytes.
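A quick way to check this claim is to compare the two arrays by value and then look at the buffers behind them. The name `orig` is introduced here only for illustration, and int32 is again an assumption so the strides come out as 4 and 12.

```python
import numpy as np

a = np.arange(1, 10, dtype=np.int32).reshape(3, 3)
orig = np.array([[1, 4, 7],
                 [2, 5, 8],
                 [3, 6, 9]], dtype=np.int32)
b = orig.T   # a view on orig's buffer; no data is copied

print(np.array_equal(a, b))       # True: same values at the same indices
print(np.shares_memory(b, orig))  # True: b reuses orig's buffer
print(orig.ravel().tolist())      # [1, 4, 7, 2, 5, 8, 3, 6, 9]
print(a.ravel().tolist())         # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(b.shape, b.strides)         # (3, 3) (4, 12)

# b is column-major (Fortran-contiguous), not row-major.
print(b.flags['F_CONTIGUOUS'], b.flags['C_CONTIGUOUS'])   # True False
```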

Last but not least, NumPy allows creating views of existing arrays with the option of modifying the strides and the shape; see stride tricks. For example:

>>> np.lib.stride_tricks.as_strided(a, shape=a.shape[::-1], strides=a.strides[::-1])
array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])

which is equivalent to transposing the array a.
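The equivalence with the transpose can be checked directly. This sketch assumes an int32 array. Note that as_strided performs no bounds checking, so a shape or stride that does not fit the buffer can read or write unrelated memory.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(1, 10, dtype=np.int32).reshape(3, 3)

# Swap shape and strides by hand; this is what transposing does.
t = as_strided(a, shape=a.shape[::-1], strides=a.strides[::-1])

print(np.array_equal(t, a.T))   # True
print(np.shares_memory(t, a))   # True: t is a view, not a copy

# Because t is a view, a write through a is visible through t.
a[0, 1] = 20
print(int(t[1, 0]))             # 20
```

For everyday use, a.T is the safer way to get the same view.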

Let me just add, without going into much detail, that one can even define strides that are not multiples of the item size. Here's an example:

>>> a = np.lib.stride_tricks.as_strided(np.array([1, 512, 0, 3], dtype=np.int16), 
                                        shape=(3,), strides=(3,))
>>> a
array([1, 2, 3], dtype=int16)

>>> a.strides[0]
3

>>> a.itemsize
2
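Why this prints 1, 2, 3 becomes clearer at the byte level. The sketch below assumes little-endian int16 storage (the dtype '<i2' makes that explicit) and decodes two bytes at each of the offsets 0, 3 and 6 by hand. The names `src`, `raw`, `decoded` and `view` are introduced only for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Assumption: little-endian 16-bit integers, made explicit with '<i2'.
src = np.array([1, 512, 0, 3], dtype='<i2')
raw = src.tobytes()
print(raw.hex(' '))   # 01 00 00 02 00 00 03 00

# A stride of 3 bytes starts items at offsets 0, 3 and 6; each item is 2 bytes.
decoded = [int.from_bytes(raw[off:off + 2], 'little', signed=True)
           for off in (0, 3, 6)]
print(decoded)        # [1, 2, 3]

view = as_strided(src, shape=(3,), strides=(3,))
print(view.tolist())  # [1, 2, 3]
```

The second item straddles the high byte of 512 (0x02) and the low byte of 0, which is why it reads as 2. On a big-endian layout the same trick would give different values.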


09-18 04:42