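I have a data frame with two columns. I want to create a new column that holds, for each row, whichever of the two columns contains the longer string. So: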
   column_a       column_b         column_c
0  'dog is fast'  'dog is faster'  'dog is faster'  (desired output)
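I tried the code below, but I get an error saying that int is not iterable. My idea was to build a Series and then merge it into the df. I'm not sure how to write the result straight into a column of the df.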
column_c = pd.Series()
for i in len(df.column_a):
    if len(df.column_a.iloc[i]) >= len(df.column_b.iloc[0]):
        column_c.append(df.column_a.iloc[i])
    else:
        column_c.append(df.column_b.iloc[i])
Any help is appreciated.
Best answer
Use pandas.DataFrame.apply:
Given the sample data
import pandas as pd
df = pd.DataFrame([['fast','faster'],['slower','slow']])
        0       1
0    fast  faster
1  slower    slow
df['column_c'] = df.apply(lambda x: max(x, key=len), axis=1)
Output:
        0       1 column_c
0    fast  faster   faster
1  slower    slow   slower
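As for why the loop in the question fails: `for i in len(...)` tries to iterate over an integer, which raises the "int is not iterable" error; it needs `range(len(...))`. The comparison also reads `df.column_b.iloc[0]` rather than `iloc[i]`, and `Series.append` returns a new Series instead of modifying `column_c` in place (it was removed entirely in pandas 2.0). Below is a minimal sketch of a corrected loop, plus a vectorized alternative using `numpy.where` with `Series.str.len`. The column names `column_a` and `column_b` come from the question; the second sample row and the `column_c_loop` name are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data: first row is from the question, second row is illustrative.
df = pd.DataFrame({
    "column_a": ["dog is fast", "slower"],
    "column_b": ["dog is faster", "slow"],
})

# Corrected loop: iterate over range(len(df)), compare the same row i in
# both columns, and collect the results in a plain list.
longest = []
for i in range(len(df)):
    a = df["column_a"].iloc[i]
    b = df["column_b"].iloc[i]
    longest.append(a if len(a) >= len(b) else b)
df["column_c_loop"] = longest  # hypothetical column name, for comparison only

# Vectorized alternative: compare string lengths column-wise and pick
# column_a where it is at least as long, otherwise column_b.
df["column_c"] = np.where(
    df["column_a"].str.len() >= df["column_b"].str.len(),
    df["column_a"],
    df["column_b"],
)

print(df)
```

On ties both versions keep the value from `column_a`, which matches `max(x, key=len)` returning the first of several equally long strings. The vectorized form assumes every cell is a string; missing values would need to be handled first.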