How can I efficiently concatenate multiple arange calls in numpy?

I want to vectorize calls like numpy.arange(0, cnt_i) over a vector of cnt values and concatenate the results, as in this snippet:
import numpy
cnts = [1,2,3]
numpy.concatenate([numpy.arange(cnt) for cnt in cnts])

array([0, 0, 1, 0, 1, 2])

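Unfortunately, the code above is very inefficient because of the temporary arrays and the list-comprehension loop.

Is there a more efficient way to do this in numpy?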

Solution

Here is a fully vectorized function:
import numpy as np

def multirange(counts):
    counts = np.asarray(counts)
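    # Remove the following line if counts is always strictly positive.
    counts = counts[counts != 0]
    # Nothing to generate for empty input; incr[0] below would raise otherwise.
    if counts.size == 0:
        return np.empty(0, dtype=int)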

    counts1 = counts[:-1]
    reset_index = np.cumsum(counts1)

    incr = np.ones(counts.sum(),dtype=int)
    incr[0] = 0
    incr[reset_index] = 1 - counts1

    # Reuse the incr array for the final result.
    incr.cumsum(out=incr)
    return incr
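
To make the cumulative-sum trick above easier to follow, here is a minimal sketch that traces the intermediate arrays for counts = [1, 2, 3]. The names steps and result are introduced only for this illustration; the logic mirrors the function above without the in-place reuse of incr.

```python
import numpy as np

counts = np.array([1, 2, 3])

# Positions in the flat output where a new range begins (all but the first).
reset_index = np.cumsum(counts[:-1])      # [1, 3]

# Start with a step of +1 everywhere, and 0 for the very first element.
incr = np.ones(counts.sum(), dtype=int)
incr[0] = 0

# At each reset position, step down by (previous count - 1) so that the
# running sum returns to 0 exactly where the next range starts.
incr[reset_index] = 1 - counts[:-1]
steps = incr.copy()                       # [0, 0, 1, -1, 1, 1]

# The running sum of the steps is the concatenated ranges.
result = np.cumsum(incr)                  # [0, 0, 1, 0, 1, 2]

print(steps)
print(result)
```

The only per-range work is the fancy-indexed assignment into incr, so no Python-level loop over the counts is needed.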

Here is a variation on @Developer's answer that calls arange only once:

def multirange_loop(counts):
    counts = np.asarray(counts)
    ranges = np.empty(counts.sum(),dtype=int)
    seq = np.arange(counts.max())
    starts = np.zeros(len(counts),dtype=int)
    starts[1:] = np.cumsum(counts[:-1])
    for start,count in zip(starts,counts):
        ranges[start:start + count] = seq[:count]
    return ranges

Here is the original version, written as a function:

def multirange_original(counts):
    ranges = np.concatenate([np.arange(count) for count in counts])
    return ranges

Demo:

In [296]: multirange_original([1, 2, 3])
Out[296]: array([0, 0, 1, 0, 1, 2])

In [297]: multirange_loop([1, 2, 3])
Out[297]: array([0, 0, 1, 0, 1, 2])

In [298]: multirange([1, 2, 3])
Out[298]: array([0, 0, 1, 0, 1, 2])
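
The demo checks a single input. As a hedged sketch, the same result can also be obtained with np.repeat and cross-checked against the concatenate baseline on random inputs. Note that multirange_repeat and multirange_reference are hypothetical names introduced here, and the np.repeat formulation is a different technique from the cumsum-based function in the answer.

```python
import numpy as np

def multirange_repeat(counts):
    """Hypothetical helper: concatenated aranges built with np.repeat."""
    counts = np.asarray(counts, dtype=int)
    # Offset of each range within the flat output: [0, c0, c0 + c1, ...]
    starts = np.cumsum(counts) - counts
    # Subtract each range's start offset from a global 0..N-1 index.
    # Zero counts contribute nothing because np.repeat drops them.
    return np.arange(counts.sum()) - np.repeat(starts, counts)

def multirange_reference(counts):
    """Baseline from the question: one arange per count, then concatenate."""
    return np.concatenate([np.arange(c) for c in counts])

# Cross-check on random inputs, including counts of zero.
rng = np.random.default_rng(0)
for _ in range(100):
    counts = rng.integers(0, 10, size=rng.integers(1, 20))
    assert np.array_equal(multirange_repeat(counts), multirange_reference(counts))

print(multirange_repeat([1, 2, 3]))
```

This version needs no special handling of zero counts, at the cost of allocating two temporaries of the output length.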

Timing comparison with a larger number of counts:

In [299]: counts = np.random.randint(1,50,size=50)

In [300]: %timeit multirange_original(counts)
10000 loops,best of 3: 114 µs per loop

In [301]: %timeit multirange_loop(counts)
10000 loops,best of 3: 76.2 µs per loop

In [302]: %timeit multirange(counts)
10000 loops,best of 3: 26.4 µs per loop
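
The timings above come from an IPython session. Outside IPython, a rough equivalent can be sketched with the standard timeit module. This is an assumption-laden sketch: the function bodies are condensed copies of the ones above, the timings dictionary is a name introduced here, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

def multirange_original(counts):
    # Baseline: one arange per count, then concatenate.
    return np.concatenate([np.arange(count) for count in counts])

def multirange(counts):
    # Vectorized cumsum version (assumes strictly positive counts).
    counts = np.asarray(counts)
    counts1 = counts[:-1]
    incr = np.ones(counts.sum(), dtype=int)
    incr[0] = 0
    incr[np.cumsum(counts1)] = 1 - counts1
    incr.cumsum(out=incr)
    return incr

counts = np.random.default_rng(0).integers(1, 50, size=50)

timings = {}
for fn in (multirange_original, multirange):
    # Best of 3 repeats of 1000 calls each, reported in microseconds per call.
    best = min(timeit.repeat(lambda: fn(counts), number=1000, repeat=3))
    timings[fn.__name__] = best / 1000 * 1e6
    print(f"{fn.__name__}: {timings[fn.__name__]:.1f} µs per call")
```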
