python – Generating a 1D NumPy array from chunks of random length

I need to generate a 1D array in which a repeated sequence of integers is separated by runs of zeros of random length.

So far I am using the following code:

    from random import normalvariate

    import numpy as np

    # np.int was removed in NumPy 1.24; the builtin int is the equivalent dtype
    regular_sequence = np.array([1, 2, 3, 4, 5], dtype=int)
    n_iter = 10
    lag_mean = 10  # mean length of zeros sequence
    lag_sd = 1     # standard deviation of zeros sequence length

    # Sequence of lag lengths
    lag_seq = [int(round(normalvariate(lag_mean, lag_sd))) for x in range(n_iter)]

    # Generate list of concatenated zeros and regular sequences
    seq = [np.concatenate((np.zeros(x, dtype=int), regular_sequence)) for x in lag_seq]
    seq = np.concatenate(seq)

It works, but it seems slow when I need many long sequences. How can I optimize it?

Solution

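You can precompute the indices at which the repeated regular_sequence elements belong, and then assign regular_sequence to those indices in a single vectorized step. To precompute the indices, use np.cumsum on the lag lengths to get the start of each regular_sequence block (adding the lengths of the blocks that precede it), then add a range of consecutive integers spanning the size of regular_sequence to get every index that has to be updated. The implementation looks like this:
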
    # Size of regular_sequence
    N = regular_sequence.size

    # Use cumsum to pre-compute start of every occurrence of regular_sequence
    offset_arr = np.cumsum(lag_seq)
    idx = np.arange(offset_arr.size)*N + offset_arr

    # Setup output array
    out = np.zeros(idx.max() + N, dtype=regular_sequence.dtype)

    # Broadcast the start indices to include entire length of regular_sequence
    # to get all positions where regular_sequence elements are to be set
    np.put(out, idx[:,None] + np.arange(N), regular_sequence)
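
To make the index arithmetic concrete, here is a small worked sketch. The inputs (`lag_seq = [2, 1, 3]` and a three-element `regular_sequence`) and the intermediate name `positions` are made up for illustration and do not come from the question; the steps themselves mirror the code above.

```python
import numpy as np

# Hypothetical small inputs, chosen only so the result is easy to read.
regular_sequence = np.array([1, 2, 3])
lag_seq = [2, 1, 3]

N = regular_sequence.size

# Running total of zeros that precede each block.
offset_arr = np.cumsum(lag_seq)                     # [2, 3, 6]

# Block k also has k earlier blocks of length N in front of it.
idx = np.arange(offset_arr.size) * N + offset_arr   # [2, 6, 12]

# Broadcasting a column of starts against a row of offsets gives one
# row of target positions per block: [[2,3,4], [6,7,8], [12,13,14]].
positions = idx[:, None] + np.arange(N)

out = np.zeros(idx.max() + N, dtype=regular_sequence.dtype)

# np.put writes into the flattened positions and repeats
# regular_sequence as often as needed to fill them all.
np.put(out, positions, regular_sequence)

print(out)  # [0 0 1 2 3 0 1 2 3 0 0 0 1 2 3]
```

Note that the output always ends with a copy of `regular_sequence`, which matches the loop-based version where each chunk is zeros followed by the sequence.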

Runtime test:

    def original_app(lag_seq, regular_sequence):
        seq = [np.concatenate((np.zeros(x, dtype=int), regular_sequence)) for x in lag_seq]
        return np.concatenate(seq)

    def vectorized_app(lag_seq, regular_sequence):
        N = regular_sequence.size
        offset_arr = np.cumsum(lag_seq)
        idx = np.arange(offset_arr.size)*N + offset_arr
        out = np.zeros(idx.max() + N, dtype=regular_sequence.dtype)
        np.put(out, idx[:,None] + np.arange(N), regular_sequence)
        return out

    In [64]: # Setup inputs
        ...: regular_sequence = np.array([1, 2, 3, 4, 5], dtype=int)
        ...: n_iter = 1000
        ...: lag_mean = 10 # mean length of zeros sequence
        ...: lag_sd = 1 # standard deviation of zeros sequence length
        ...:
        ...: # Sequence of lag lengths
        ...: lag_seq = [int(round(normalvariate(lag_mean, lag_sd))) for x in range(n_iter)]
        ...:

    In [65]: out1 = original_app(lag_seq, regular_sequence)

    In [66]: out2 = vectorized_app(lag_seq, regular_sequence)

    In [67]: %timeit original_app(lag_seq, regular_sequence)
    100 loops, best of 3: 4.28 ms per loop

    In [68]: %timeit vectorized_app(lag_seq, regular_sequence)
    1000 loops, best of 3: 294 µs per loop
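
The session above computes both results but does not show a comparison between them. A minimal, self-contained sketch of such a check follows. The function names `loop_version` and `vectorized_version`, the seed, and the variable `same` are assumptions introduced here for illustration; the function bodies follow the two approaches described above.

```python
import random

import numpy as np


def loop_version(lag_seq, regular_sequence):
    # Build each chunk (zeros followed by the sequence), then join them.
    chunks = [
        np.concatenate((np.zeros(x, dtype=regular_sequence.dtype), regular_sequence))
        for x in lag_seq
    ]
    return np.concatenate(chunks)


def vectorized_version(lag_seq, regular_sequence):
    # Compute every target index up front and assign in one call.
    N = regular_sequence.size
    offset_arr = np.cumsum(lag_seq)
    idx = np.arange(offset_arr.size) * N + offset_arr
    out = np.zeros(idx.max() + N, dtype=regular_sequence.dtype)
    np.put(out, idx[:, None] + np.arange(N), regular_sequence)
    return out


random.seed(0)  # arbitrary seed, only to make the run repeatable
regular_sequence = np.array([1, 2, 3, 4, 5])
lag_seq = [int(round(random.normalvariate(10, 1))) for _ in range(1000)]

out1 = loop_version(lag_seq, regular_sequence)
out2 = vectorized_version(lag_seq, regular_sequence)

# Both arrays should have the same length and the same contents.
same = np.array_equal(out1, out2)
print(same)
```

Both versions assume every lag is non-negative. With a mean of 10 and a standard deviation of 1 that holds in practice, but a wider distribution would need the lags clipped at zero first.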
