How to vectorize numpy.searchsorted over the rows of a 2D array

Problem description

I have a 2D array (a) to search in and a 1D array (v) of values. For each row a[i], I want the index at which v[i] should be inserted:

import numpy as np

# [EDIT] Add more records which contain NaNs
a = np.array(
[[0., 923.9943, 996.8978, 1063.9064, 1125.639, 1184.3985, 1259.9854, 1339.6107, 1503.4462, 2035.6527],
 [0., 1593.6196, 1885.2442, 2152.956, 2419.0038, 2843.517, 3551.225, 5423.009, 18930.8694, 70472.4002],
 [0., 1593.6196, 1885.2442, 2152.956, 2419.0038, 2843.517, 3551.225, 5423.009, 18930.8694, 70472.4002],
 [0., 1084.8388, 1132.6918, 1172.2278, 1215.7986, 1259.062, 1334.4778, 1430.738, 1650.4502, 3966.1578],
 [0., 1084.8388, 1132.6918, 1172.2278, 1215.7986, 1259.062, 1334.4778, 1430.738, 1650.4502, 3966.1578],
 [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
 [0., 923.9943, 996.8978, 1063.9064, 1125.639, 1184.3985, 1259.9854, 1339.6107, 1503.4462, 2035.6527],
 [0., 1593.6196, 1885.2442, 2152.956, 2419.0038, 2843.517, 3551.225, 5423.009, 18930.8694, 70472.4002],
 [0., 1593.6196, 1885.2442, 2152.956, 2419.0038, 2843.517, 3551.225, 5423.009, 18930.8694, 70472.4002],
 [0., 1084.8388, 1132.6918, 1172.2278, 1215.7986, 1259.062, 1334.4778, 1430.738, 1650.4502, 3966.1578],
 [0., 1084.8388, 1132.6918, 1172.2278, 1215.7986, 1259.062, 1334.4778, 1430.738, 1650.4502, 3966.1578]])

v = np.array([641.954, 56554.498, 168078.307, 1331.692, 2233.327, 1120.03, 641.954, 56554.498, 168078.307, 1331.692, 2233.327])

This is the result I want to get:

[1, 9, 10, 6, 9, 0, 1, 9, 10, 6, 9]

Obviously, with a for loop I can index the arrays a and v row by row like this:

for i, _ in enumerate(a):
    print(np.searchsorted(a[i], v[i]))

Is there a vectorized way to do this more efficiently?

Recommended answer

Inspired by "Vectorized searchsorted numpy" for the underlying idea, here is a version that works between a 2D array and a 1D array:

def searchsorted2d(a,b):
    # Inputs : a is (m,n) 2D array and b is (m,) 1D array.
    # Finds np.searchsorted(a[i], b[i]) for every row i in a vectorized way by
    # scaling/offsetting both inputs and then using searchsorted

    # Get scaling offset and then scale inputs
    s = np.r_[0,(np.maximum(a.max(1)-a.min(1)+1,b)+1).cumsum()[:-1]]
    a_scaled = (a+s[:,None]).ravel()
    b_scaled = b+s

    # Use searchsorted on scaled ones and then subtract offsets
    return np.searchsorted(a_scaled,b_scaled)-np.arange(len(s))*a.shape[1]
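The offset trick works because each row is shifted into its own value band, so the flattened array is globally sorted and a single searchsorted call handles every row. The per-row offsets in the function above are enough for data like the sample (rows starting at 0, non-negative lookup values), but as far as I can tell they can leave the flattened array unsorted when row minima differ a lot. The sketch below is a variant, not the answer's exact formula: it uses one band width shared by all rows. The name searchsorted_rows is hypothetical, and it assumes every row of a is sorted and that neither input contains NaN.

```python
import numpy as np

def searchsorted_rows(a, b):
    """Row-wise np.searchsorted(a[i], b[i]) using one flat searchsorted call.

    Hypothetical helper: assumes each row of `a` is sorted and that
    neither `a` nor `b` contains NaN.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    m, n = a.shape

    # One band width for all rows, strictly larger than the combined value range.
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    width = hi - lo + 1.0

    # Shift row i (and its lookup value) by i * width so the bands never overlap.
    offsets = width * np.arange(m)
    flat = (a + offsets[:, None]).ravel()   # globally sorted 1D array
    idx = np.searchsorted(flat, b + offsets)

    # Convert flat positions back to per-row insertion indices.
    return idx - n * np.arange(m)

# Rows with very different minima, including negative values.
a_demo = np.array([[100., 101., 105.],
                   [0., 1., 2.],
                   [-5., 0., 50.]])
b_demo = np.array([102., -1., 60.])

expected = np.array([np.searchsorted(row, x) for row, x in zip(a_demo, b_demo)])
result = searchsorted_rows(a_demo, b_demo)
print(result)    # [2 0 3]
print(expected)  # [2 0 3]
```

One design caveat: the offsets grow with the number of rows, so with many rows and a wide value range the added offset can eat into floating-point precision for closely spaced values.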

Output for the given sample (the original five rows, before the rows containing NaNs were added):

In [101]: searchsorted2d(a,v)
Out[101]: array([ 1,  9, 10,  6,  9])

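Handling all-NaN rows

To make this work when some rows are all NaN, a few more steps are needed. Rows containing NaN are masked out and keep the index 0, and only the remaining rows go through searchsorted2d: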

valid_mask = ~np.isnan(a).any(1)
out = np.zeros(len(a), dtype=int)
out[valid_mask] = searchsorted2d(a[valid_mask],v[valid_mask])
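To check the masking step end to end, here is a small self-contained sketch that wraps it in a function and compares the result with the plain loop. The wrapper name searchsorted2d_nan and the small arrays a_small and v_small are made up for illustration; searchsorted2d is copied from the answer above.

```python
import numpy as np

def searchsorted2d(a, b):
    # Offset trick from the answer: shift each row into its own band,
    # flatten, search once, then subtract the per-row start positions.
    s = np.r_[0, (np.maximum(a.max(1) - a.min(1) + 1, b) + 1).cumsum()[:-1]]
    a_scaled = (a + s[:, None]).ravel()
    b_scaled = b + s
    return np.searchsorted(a_scaled, b_scaled) - np.arange(len(s)) * a.shape[1]

def searchsorted2d_nan(a, v):
    # Hypothetical wrapper: rows containing any NaN are skipped and keep index 0.
    valid_mask = ~np.isnan(a).any(1)
    out = np.zeros(len(a), dtype=int)
    if valid_mask.any():
        out[valid_mask] = searchsorted2d(a[valid_mask], v[valid_mask])
    return out

# Illustrative data: the middle row is all NaN.
a_small = np.array([[0., 10., 20., 30.],
                    [np.nan, np.nan, np.nan, np.nan],
                    [0., 5., 6., 7.]])
v_small = np.array([15., 3., 100.])

# Reference result from the row-by-row loop.
looped = np.array([np.searchsorted(row, x) for row, x in zip(a_small, v_small)])
vectorized = searchsorted2d_nan(a_small, v_small)

print(vectorized)  # [2 0 4]
print(looped)      # [2 0 4]
```

The all-NaN row agrees with the loop because np.searchsorted treats NaN as larger than any number, so a finite value is inserted at position 0.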
