Consider a histogram calculation on a numpy array that returns percentages:
import numpy as np

# 500 random numbers between 0 and 10,000
values = np.random.uniform(0, 10000, 500)
# Histogram using e.g. 200 buckets; each value is weighted 100/N so the bins hold percentages
perc, edges = np.histogram(values, bins=200, weights=np.zeros_like(values) + 100.0/values.size)
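The above returns two arrays:
> perc, containing the percentage of the total number of values that falls between each pair of consecutive edges, edges[ix] and edges[ix+1].
> edges, of length len(perc)+1.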
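As a quick sanity check of the two properties listed above, here is a minimal sketch. The seed and the use of `np.full` for the weights are choices made here for illustration, not part of the original question.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
values = rng.uniform(0, 10000, 500)

# Each value contributes 100/N, so every bin holds a percentage of the total
weights = np.full(values.shape, 100.0 / values.size)
perc, edges = np.histogram(values, bins=200, weights=weights)

print(len(perc), len(edges))        # 200 201
print(np.isclose(perc.sum(), 100))  # True
```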
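Now, say that I want to filter perc and edges so that I end up with only the percentages and edges for values contained in a new range [m, M].
That is, I want to work with the sub-arrays of perc and edges that correspond to the interval of values within [m, M]. Needless to say, the new array of percentages should still refer to fractions of the total count of the input array. We only want to filter perc and edges to end up with the correct sub-arrays.
How can I post-process perc and edges to do this?
The values of m and M can of course be any numbers. In the example above we can assume, for instance, m = 0 and M = 200.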
Best answer
m = 0; M = 200
mask = (m < edges) & (edges < M)
>>> edges[mask]
array([ 37.4789683,87.07491593,136.67086357,186.2668112 ])
Let's work with a smaller dataset so that it is easier to follow:
np.random.seed(0)
values = np.random.uniform(0,100,10)
values.sort()
>>> values
array([ 38.34415188,42.36547993,43.75872113,54.4883183,54.88135039,60.27633761,64.58941131,71.51893664,89.17730008,96.36627605])
# Histogram using e.g. 10 buckets
perc, edges = np.histogram(values, bins=10, weights=np.zeros_like(values) + 100./values.size)
>>> perc
array([ 30.,  0., 20., 10., 10., 10.,  0.,  0., 10., 10.])
>>> edges
array([ 38.34415188,44.1463643,49.94857672,55.75078913,61.55300155,67.35521397,73.15742638,78.9596388,84.76185122,90.56406363,96.36627605])
m = 0; M = 50
mask = (m <= edges) & (edges < M)
>>> mask
array([ True, True, True, False, False, False, False, False, False, False, False], dtype=bool)
>>> edges[mask]
array([ 38.34415188, 44.1463643, 49.94857672])
>>> perc[mask[:-1]][:-1]
array([ 30.,0.])
m = 40; M = 60
mask = (m < edges) & (edges < M)
>>> edges[mask]
array([ 44.1463643, 49.94857672, 55.75078913])
>>> perc[mask[:-1]][:-1]
array([ 0.,20.])
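One caveat: the perc[mask[:-1]][:-1] slicing appears to drop one bin too many when the selected range reaches the last edge (for example m = 0, M = 100 here). A possible alternative, sketched below, selects bins directly by checking both of their edges. The helper name `filter_histogram` is hypothetical, and it assumes inclusive bounds (a bin is kept when it lies entirely inside [m, M]), which differs slightly from the strict comparisons used above.

```python
import numpy as np

def filter_histogram(perc, edges, m, M):
    """Keep only the bins lying entirely inside [m, M] (hypothetical helper)."""
    # Bin i spans edges[i] .. edges[i + 1]; keep it when both ends are in range
    keep = (edges[:-1] >= m) & (edges[1:] <= M)
    idx = np.flatnonzero(keep)
    if idx.size == 0:
        # No bin fits inside the range: return empty sub-arrays
        return perc[:0], edges[:0]
    # Edges are sorted, so the kept bins are contiguous; their edges form
    # one slice that is exactly one element longer than the percentages
    return perc[idx[0]:idx[-1] + 1], edges[idx[0]:idx[-1] + 2]

np.random.seed(0)
values = np.sort(np.random.uniform(0, 100, 10))
perc, edges = np.histogram(values, bins=10,
                           weights=np.zeros_like(values) + 100. / values.size)

sub_perc, sub_edges = filter_histogram(perc, edges, 40, 60)
print(sub_perc)   # percentages of the two bins between 44.15 and 55.75
print(sub_edges)  # the three edges delimiting those bins
```

With this approach the returned edges always have one more entry than the returned percentages whenever at least one bin is kept, and the percentages are untouched, so they still refer to the full input array.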