如何对在较大矩阵的子集上运行函数的代码进行矢量化处理?

本文介绍了如何对在较大矩阵的子集上运行函数的代码进行矢量化处理?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

假设我有以下9 x 5矩阵:

Let's assume I have the following 9 x 5 matrix:

myArray = [
   54.7    8.1   81.7   55.0   22.5
   29.6   92.9   79.4   62.2   17.0
   74.4   77.5   64.4   58.7   22.7
   18.8   48.6   37.8   20.7   43.5
   68.6   43.5   81.1   30.1   31.1
   18.3   44.6   53.2   47.0   92.3
   36.8   30.6   35.0   23.0   43.0
   62.5   50.8   93.9   84.4   18.4
   78.0   51.0   87.5   19.4   90.4
];

我对此矩阵有11个子集"，我需要在每个这些子集上运行一个函数(比如说max).可以使用以下逻辑矩阵(按列而不是按行)标识子集:

I have 11 "subsets" of this matrix and I need to run a function (let's say max) on each of these subsets. The subsets can be identified with the following matirx of logicals (identified column-wise, not row-wise):

myLogicals = logical([
    0 1 0 1 1
    1 1 0 1 1
    1 1 0 0 0
    0 1 0 1 1
    1 0 1 1 1
    1 1 1 1 0
    0 1 1 0 1
    1 1 0 0 1
    1 1 0 0 1
]);

或通过线性索引:

starts = [2 5 8 10 15 23 28 31 37 40 43]; #%index start of each subset
ends =   [3 6 9 13 18 25 29 33 38 41 45]; #%index end of each subset

第一个子集是2:3，第二个子集是5:6，依此类推.

such that the first subset is 2:3, the second is 5:6, and so on.

我可以找到每个子集的max并将其存储在向量中，如下所示:

I can find the max of each subset and store it in a vector as follows:

finalAnswers = NaN(11,1); 
for n=1:length(starts) #%i.e. 1 through the number of subsets
    finalAnswers(n) = max(myArray(starts(n):ends(n)));
end

循环运行后，finalAnswers包含每个数据子集的最大值:

After the loop runs, finalAnswers contains the maximum value of each of the data subsets:

74.4  68.6  78.0  92.9  51.0  81.1  62.2  47.0  22.5  43.5  90.4

是否可以在不使用for循环的情况下获得相同的结果?换句话说，可以将此代码向量化吗?这样的方法会比当前的方法更有效吗?

Is it possible to obtain the same result without the use of a for loop? In other words, can this code be vectorized? Would such an approach be more efficient than the current one?

我对建议的解决方案进行了一些测试.我使用的数据是一个1,510 x 2,185的矩阵，其中包含10,103个子集，其长度在2到916之间变化，子集长度的标准偏差为101.92.

I did some testing of the proposed solutions. The data I used was a 1,510 x 2,185 matrix with 10,103 subsets that varied in length from 2 to 916 with a standard deviation of subset length of 101.92.

我将每个解决方案都包裹在tic;for k=1:1000 [code here] end; toc;中，结果如下:

I wrapped each solution in tic;for k=1:1000 [code here] end; toc; and here are the results:

for循环方法--- Elapsed time is 16.237400 seconds.
Shai的方法--- Elapsed time is 153.707076 seconds.
丹的方法--- Elapsed time is 44.774121 seconds.
Divakar的方法#2 --- Elapsed time is 127.621515 seconds.

for loop approach --- Elapsed time is 16.237400 seconds.
Shai's approach --- Elapsed time is 153.707076 seconds.
Dan's approach --- Elapsed time is 44.774121 seconds.
Divakar's approach #2 --- Elapsed time is 127.621515 seconds.

注意:

我还尝试通过将k=1:1000 for循环仅包裹在accumarray行周围来对Dan的方法进行基准测试(因为其余的可能是理论上只运行一次).在这种情况下，时间为28.29秒.
以Shai的方法为基准，同时保留lb = ...行k循环的时间为113.48秒.
运行Divakar的代码时，在bsxfun行中出现Non-singleton dimensions of the twoinput arrays must match each other.错误.我通过使用共轭换位(撇号)来解决"这个问题trade_starts(1:starts_extent)上的运算符')和intv(1:starts_extent)在调用bsxfun的代码行中.我是不知道为什么会发生此错误...

I also tried benchmarking Dan's approach by wrapping the k=1:1000 for loop around just the accumarray line (since the rest could betheoretically run just once). In this case the time was 28.29seconds.
Benchmarking Shai's approach, while leaving the lb = ... line outof the k loop, the time was 113.48 seconds.
When I ran Divakar's code, I got Non-singleton dimensions of the twoinput arrays must match each other. errors for the bsxfun lines. I "fixed" this by using conjugate transposition (the apostropheoperator ') on trade_starts(1:starts_extent) andintv(1:starts_extent) in the lines of code calling bsxfun. I'mnot sure why this error was occuring...

我不确定我的基准测试设置是否正确，但是在这种情况下，for循环实际上运行最快.

I'm not sure if my benchmarking setup is correct, but it appears that the for loop actually runs the fastest in this case.

推荐答案

一种方法是使用accumarray.不幸的是，为了做到这一点，我们首先需要标记"您的逻辑矩阵.如果您没有图像处理工具箱，这是一种复杂的操作方式:

One approach is to use accumarray. Unfortunately in order to do that we first need to "label" your logical matrix. Here is a convoluted way of doing that if you don't have the image processing toolbox:

sz=size(myLogicals);
s_ind(sz(1),sz(2))=0;
%// OR: s_ind = zeros(size(myLogicals))

s_ind(starts) = 1;
labelled = cumsum(s_ind(:)).*myLogicals(:);

所以Shai的bwlabeln实现就是这样做的(但是形状是1 -by- numel(myLogicals)，而不是形状的size(myLogicals))

So that just does what Shai's bwlabeln implementation does (but this will be 1-by-numel(myLogicals) in shape as opposed to size(myLogicals) in shape)

现在您可以使用accumarray:

accumarray(labelled(myLogicals), myArray(myLogicals), [], @max)

否则尝试起来可能会更快

or else it may be faster to try

result = accumarray(labelled+1, myArray(:), [], @max);
result = result(2:end)

这是完全矢量化的，但是值得吗?您必须针对您的循环解决方案进行速度测试才能知道.

This is fully vectorized, but is it worth it? You'll have to do speed tests against your loop solution to know.

这篇关于如何对在较大矩阵的子集上运行函数的代码进行矢量化处理?的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！