Numpy: finding the indices of mask edges
Question
I'm trying to find the indices of masked segments. For example:
mask = [1, 0, 0, 1, 1, 1, 0, 0]
segments = [(0, 0), (3, 5)]
My current solution looks like this (and it is very slow, because my mask contains millions of elements):
segments = []
start = 0
for i in range(len(mask) - 1):
    e1 = mask[i]
    e2 = mask[i + 1]
    if e1 == 0 and e2 == 1:
        start = i + 1
    elif e1 == 1 and e2 == 0:
        segments.append((start, i))
Is there a way to do this efficiently with numpy?
The only thing I've managed to find by googling is numpy.ma.notmasked_edges, but it doesn't look like what I need.
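To illustrate why numpy.ma.notmasked_edges does not fit here, the following minimal sketch applies it to the example mask. It assumes the ones are the entries of interest, so the zeros are hidden with numpy.ma.masked_where; that setup is an assumption for illustration and is not part of the original question. For a 1-D input the function reports only the first and last unmasked positions, not one pair per segment.

```python
import numpy as np

mask = np.array([1, 0, 0, 1, 1, 1, 0, 0])

# Assumption for illustration: hide the zeros so the ones are "not masked".
marr = np.ma.masked_where(mask == 0, mask)

# Only the outermost unmasked positions come back, so the gap between
# the two segments (indices 1 and 2) is lost.
edges = np.ma.notmasked_edges(marr)
print(edges)  # [0 5]
```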
Recommended answer
Here's one approach:
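import numpy as np

def start_stop(a, trigger_val):
    # Pad the boolean mask with False sentinels on both ends so that runs
    # touching either edge of the array still register as transitions
    mask = np.r_[False, np.equal(a, trigger_val), False]
    # Indices where the mask flips: even positions are run starts,
    # odd positions are one past the run ends
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    # Pair each start with its inclusive end; tolist() yields plain Python ints
    # and list() materializes the pairs under Python 3
    return list(zip(idx[::2].tolist(), (idx[1::2] - 1).tolist()))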
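To make the sentinel trick easier to follow, here is a step-by-step sketch of the same idea on the example mask, with each intermediate array written out in a comment. The variable names (padded, starts, stops) are chosen here for illustration only.

```python
import numpy as np

a = np.array([1, 0, 0, 1, 1, 1, 0, 0])

# Boolean mask of the trigger value, padded with False on both sides.
padded = np.r_[False, a == 1, False]
# padded -> [F, T, F, F, T, T, T, F, F, F]

# Positions where neighbouring entries differ.
idx = np.flatnonzero(padded[1:] != padded[:-1])
# idx -> [0, 1, 3, 6]

# Because the padded mask starts and ends with False, flips always come in
# pairs: a rising edge (run start) followed by a falling edge (one past the end).
starts = idx[::2]       # [0, 3]
stops = idx[1::2] - 1   # [0, 5]

print(list(zip(starts.tolist(), stops.tolist())))  # [(0, 0), (3, 5)]
```

Without the two sentinels, a run that begins at index 0 or ends at the last index would produce only one flip, and the start/stop pairing would fall out of step.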
Sample run:
In [216]: mask = [1, 0, 0, 1, 1, 1, 0, 0]
In [217]: start_stop(mask, trigger_val=1)
Out[217]: [(0, 0), (3, 5)]
Use it to get the segments of 0s:
In [218]: start_stop(mask, trigger_val=0)
Out[218]: [(1, 2), (6, 7)]
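Edge cases are worth checking separately. The loop in the question appends a segment only on a 1 to 0 transition, so a run of ones that reaches the end of the array is never reported, whereas the sentinel approach closes it. The sketch below compares a copy of the vectorized function against a slow pure-Python reference; both helper names (start_stop_sketch, runs_reference) are hypothetical and introduced only for this check.

```python
import numpy as np
from itertools import groupby

def start_stop_sketch(a, trigger_val):
    # Same sentinel-and-compare idea as the answer (hypothetical name).
    mask = np.r_[False, np.equal(a, trigger_val), False]
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    return list(zip(idx[::2].tolist(), (idx[1::2] - 1).tolist()))

def runs_reference(a, trigger_val):
    # Slow reference: walk consecutive groups of equal values.
    out, pos = [], 0
    for value, group in groupby(a):
        n = len(list(group))
        if value == trigger_val:
            out.append((pos, pos + n - 1))
        pos += n
    return out

# Runs touching the start, the end, both, or not present at all.
cases = [[1, 1, 0, 1], [0, 0, 0], [1], [], [0, 1, 1]]
for case in cases:
    assert start_stop_sketch(case, 1) == runs_reference(case, 1)

print(start_stop_sketch([0, 1, 1], 1))  # [(1, 2)]
```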
Timings on a dataset scaled up 100000x:
In [226]: mask = [1, 0, 0, 1, 1, 1, 0, 0]
In [227]: mask = np.repeat(mask,100000)
# Original solution
In [230]: %%timeit
     ...: segments = []
     ...: start = 0
     ...: for i in range(len(mask) - 1):
     ...:     e1 = mask[i]
     ...:     e2 = mask[i + 1]
     ...:     if e1 == 0 and e2 == 1:
     ...:         start = i + 1
     ...:     elif e1 == 1 and e2 == 0:
     ...:         segments.append((start, i))
1 loop, best of 3: 401 ms per loop
# @Yakym Pirozhenko's solution
In [231]: %%timeit
     ...: slices = np.ma.clump_masked(np.ma.masked_where(mask, mask))
     ...: result = [(s.start, s.stop - 1) for s in slices]
100 loops, best of 3: 4.8 ms per loop
In [232]: %timeit start_stop(mask, trigger_val=1)
1000 loops, best of 3: 1.41 ms per loop
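The scaled-up mask above contains only two segments, so building the list of tuples costs almost nothing in that benchmark. When a mask produces a very large number of segments, it may be preferable to keep the result in numpy. The following is a sketch of such a variant; the name start_stop_array is hypothetical, and no timings are claimed for it.

```python
import numpy as np

def start_stop_array(a, trigger_val):
    # Hypothetical variant: returns an (n_segments, 2) integer array of
    # inclusive [start, stop] rows instead of a list of tuples.
    mask = np.r_[False, np.equal(a, trigger_val), False]
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    # Flips come in (start, one-past-stop) pairs; subtract 1 from the
    # second column to make the stop inclusive.
    return idx.reshape(-1, 2) - [0, 1]

big = np.repeat([1, 0, 0, 1, 1, 1, 0, 0], 100000)
print(start_stop_array(big, 1))
# [[     0  99999]
#  [300000 599999]]
```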