How can I find the unique vectors of a 2-d array along a particular axis in a vectorized way?


Question

I have an array of shape (n, t) which I'd like to treat as a time series of n-vectors.

I'd like to know the unique n-vector values that occur along the t-dimension, as well as the associated t-indices for each unique vector. I'm happy to use any reasonable definition of equality (e.g. exact comparison is fine; numpy.unique accepts floats).

This is easy with a Python loop over t, but I'm hoping for a vectorized approach.

In some special cases it can be done by collapsing the n-vectors into scalars and using numpy.unique on the 1-d result. For example, if you had booleans, you could use a vectorized dot with the (2**k) vector to convert the boolean vectors to integers. However, I'm looking for a fairly general solution.
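
The boolean special case described above could be sketched roughly as follows. The array `b` and the names `weights`, `codes` and `t_indices` are made up for illustration, and the approach assumes n is small enough that the integer codes fit in a 64-bit integer.

```python
import numpy as np

# Hypothetical demo data: n = 3 boolean components, t = 5 time steps.
# Each column is one n-vector.
b = np.array([[True,  False, True,  True,  False],
              [False, False, False, True,  False],
              [True,  True,  True,  False, True]])

n = b.shape[0]

# One power-of-two weight per component: [1, 2, 4].
weights = 2 ** np.arange(n)

# Collapse every column into a single integer code with one dot product.
codes = weights @ b.astype(np.int64)      # shape (t,)

# inverse[j] is the position of column j's code in unique_codes.
unique_codes, inverse = np.unique(codes, return_inverse=True)

# t-indices at which each unique vector occurs.
t_indices = [np.flatnonzero(inverse == k) for k in range(len(unique_codes))]

print(codes)                              # [5 4 5 3 4]
print(unique_codes)                       # [3 4 5]
print([t.tolist() for t in t_indices])    # [[3], [1, 4], [0, 2]]
```

The list comprehension at the end still loops over the unique values, not over t, so it stays cheap when there are few distinct vectors.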

Recommended answer

If the shape of your array were (t, n), so that the data for each n-vector is contiguous in memory, you could create a view of the 2-d array as a 1-d structured array and then use numpy.unique on this view.

If you can change the storage convention of your array, or if you don't mind making a copy of the transposed array, this could work for you.

Here is an example:

import numpy as np

# Demo data.
x = np.array([[1,2,3],
              [2,0,0],
              [1,2,3],
              [3,2,2],
              [2,0,0],
              [2,1,2],
              [3,2,1],
              [2,0,0]])

# View each row as a structure, with field names 'a', 'b' and 'c'.
dt = np.dtype([('a', x.dtype), ('b', x.dtype), ('c', x.dtype)])
y = x.view(dtype=dt).squeeze()

# Now np.unique can be used.  See the `unique` docstring for
# a description of the options.  You might not need `idx` or `inv`.
u, idx, inv = np.unique(y, return_index=True, return_inverse=True)

print("Unique vectors")
print(u)
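
For an array stored in the question's (n, t) layout, the same idea might look like the sketch below. It copies the transpose so that each n-vector is contiguous, builds the structured view, and then groups the t-indices by unique vector. The array `a` is the transpose of the demo data above, and the field names and the variables `rows`, `groups` and `unique_vectors` are illustrative choices, not part of the original answer. The last lines show the `axis` argument of `np.unique` (available in NumPy 1.13 and later) as an alternative that avoids the manual view.

```python
import numpy as np

# Hypothetical demo data in (n, t) layout: n = 3, t = 8.
# Each column is one n-vector (the transpose of the demo array above).
a = np.array([[1, 2, 1, 3, 2, 2, 3, 2],
              [2, 0, 2, 2, 0, 1, 2, 0],
              [3, 0, 3, 2, 0, 2, 1, 0]])

# Copy the transpose so that each n-vector is contiguous in memory.
rows = np.ascontiguousarray(a.T)                  # shape (t, n)

# One structured element per row; the field names are arbitrary.
dt = np.dtype([(f"f{i}", rows.dtype) for i in range(rows.shape[1])])
y = rows.view(dt).ravel()                         # shape (t,)

u, inv = np.unique(y, return_inverse=True)
inv = inv.ravel()

# Group the t-indices by unique vector without looping over t:
# a stable sort of `inv` puts equal labels next to each other.
order = np.argsort(inv, kind="stable")
counts = np.bincount(inv)
groups = np.split(order, np.cumsum(counts)[:-1])

# Convert the structured result back to an ordinary 2-d array.
unique_vectors = u.view(rows.dtype).reshape(-1, rows.shape[1])

print(unique_vectors.tolist())
# [[1, 2, 3], [2, 0, 0], [2, 1, 2], [3, 2, 1], [3, 2, 2]]
print([g.tolist() for g in groups])
# [[0, 2], [1, 4, 7], [5], [6], [3]]

# Alternative: let np.unique compare whole columns directly.
u2, inv2 = np.unique(a, axis=1, return_inverse=True)
inv2 = inv2.ravel()                               # u2 has shape (n, number of unique vectors)
```

Here `groups[k]` holds the t-indices at which `unique_vectors[k]` occurs, which is the pairing the question asks for.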
