Problem description
You can use np.where. If cond is a boolean array, and A and B are arrays, then

    C = np.where(cond, A, B)

defines C to be equal to A where cond is True, and B where cond is False.

    import numpy as np
    import pandas as pd

    a = [['10', '1.2', '4.2'], ['15', '70', '0.03'], ['8', '5', '0']]
    df = pd.DataFrame(a, columns=['one', 'two', 'three'])

    # Keep df['one'] where both comparisons hold; otherwise use NaN.
    df['que'] = np.where((df['one'] >= df['two']) & (df['one'] <= df['three']),
                         df['one'], np.nan)
yields

      one  two three  que
    0  10  1.2   4.2   10
    1  15   70  0.03  NaN
    2   8    5     0  NaN

If you have more than one condition, you can use np.select instead. For example, if you want df['que'] to equal df['two'] when df['one'] < df['two'], then

    conditions = [
        (df['one'] >= df['two']) & (df['one'] <= df['three']),
        df['one'] < df['two']]
    choices = [df['one'], df['two']]

    df['que'] = np.select(conditions, choices, default=np.nan)
yields

      one  two three  que
    0  10  1.2   4.2   10
    1  15   70  0.03   70
    2   8    5     0  NaN

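When several conditions are True for the same element, np.select takes the choice that belongs to the first matching condition, so the order of the conditions matters. The following is a minimal sketch of that behaviour; the array x and the numeric choices are made up for illustration and are not part of the original example.

```python
import numpy as np

# Hypothetical data, for illustration only.
x = np.array([1, 5, 10])

# For x == 10 both conditions are True; np.select uses the first match.
conditions = [x >= 5, x >= 10]
choices = [100, 200]

result = np.select(conditions, choices, default=0)
print(result)
```

Because x >= 5 is listed first, the element 10 receives 100 rather than 200. Swapping the order of the two conditions (and of the two choices) would give it 200.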
If we can assume that df['one'] >= df['two'] whenever df['one'] < df['two'] is False, then the conditions and choices can be simplified to

    conditions = [
        df['one'] < df['two'],
        df['one'] <= df['three']]
    choices = [df['two'], df['one']]
(The assumption may not hold if df['one'] or df['two'] contain NaNs.)
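The NaN caveat can be checked directly, because any comparison against NaN evaluates to False. The sketch below uses a made-up single-row float DataFrame (not the data from the question) in which df['two'] is NaN, and compares the original and simplified versions of the conditions.

```python
import numpy as np
import pandas as pd

# Hypothetical one-row frame with a NaN in 'two', for illustration only.
df = pd.DataFrame({'one': [3.0], 'two': [np.nan], 'three': [5.0]})

# Original conditions: both are False because 'two' is NaN.
original = np.select(
    [(df['one'] >= df['two']) & (df['one'] <= df['three']),
     df['one'] < df['two']],
    [df['one'], df['two']],
    default=np.nan)

# Simplified conditions: the second one no longer involves 'two'.
simplified = np.select(
    [df['one'] < df['two'], df['one'] <= df['three']],
    [df['two'], df['one']],
    default=np.nan)

print(original, simplified)
```

Here the original version falls through to the default NaN, while the simplified version picks df['one'], so the two are not interchangeable when NaNs are present.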
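Note that

    a = [['10', '1.2', '4.2'], ['15', '70', '0.03'], ['8', '5', '0']]
    df = pd.DataFrame(a, columns=['one', 'two', 'three'])

defines a DataFrame with string values. Since they look numeric, you might be better off converting those strings to floats:

    df2 = df.astype(float)

This changes the results, however, since strings are compared character by character, while floats are compared numerically.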

    In [61]: '10' <= '4.2'
    Out[61]: True

    In [62]: 10 <= 4.2
    Out[62]: False

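To see how the conversion affects the outcome, here is a sketch that reruns the np.select example from above on the float version of the data. It assumes the same a, conditions and choices as before, applied to df2 instead of df.

```python
import numpy as np
import pandas as pd

a = [['10', '1.2', '4.2'], ['15', '70', '0.03'], ['8', '5', '0']]
df = pd.DataFrame(a, columns=['one', 'two', 'three'])

# Convert the string columns to floats so comparisons are numeric.
df2 = df.astype(float)

conditions = [
    (df2['one'] >= df2['two']) & (df2['one'] <= df2['three']),
    df2['one'] < df2['two']]
choices = [df2['one'], df2['two']]

df2['que'] = np.select(conditions, choices, default=np.nan)
print(df2)
```

With numeric comparison, row 0 no longer satisfies the first condition (10 <= 4.2 is False), so only row 1 receives a value (70.0) and the other two rows are NaN.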
Solution
Starting from this:

    a = [['10', '1.2', '4.2'], ['15', '70', '0.03'], ['8', '5', '0']]
    df = pd.DataFrame(a, columns=['one', 'two', 'three'])

    Out[8]:
      one  two three
    0  10  1.2   4.2
    1  15   70  0.03
    2   8    5     0

I want to use something like an if statement in pandas:

    if df['one'] >= df['two'] and df['one'] <= df['three']:
        df['que'] = df['one']

Basically, check each row with the if statement and create a new column. The docs say to use .all, but there is no example…