I created a pandas DataFrame mn from the following input:
keyA  state  n1   n2    d1  d2
key1  CA     100  1000  1   2
key2  FL     200  2000  2   4
key1  CA     300  3000  3   6
key1  AL     400  4000  4   8
key2  FL     500  5000  5   2
key1  NY     600  6000  6   4
key2  CA     700  7000  7   6
I then created a summed object as follows:
s = mn.groupby(['keyA','state'],as_index=False).sum()
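How can I iterate over the summed object to get the output below?
The v1 column in the result is computed as s['n1'] / s['d1'].
The v2 column in the result is computed as s['n2'] / s['d2'].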
keyA state v1 v2
'key1','AL',100,500
'key1','CA',100,500
'key1','NY',100,1500
'key2','CA',100,1166
'key2','FL',100,1166
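Write it almost exactly like your pseudocode. No explicit iteration is needed, because dividing one column by another works row by row.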
In [14]: s = mn.groupby(['keyA','state'],as_index=False).sum()
In [15]: s['v1'] = s['n1'] / s['d1']
In [16]: s['v2'] = s['n2'] / s['d2']
In [17]: s[['keyA','state','v1','v2']]
Out[17]:
   keyA state   v1           v2
0  key1    AL  100   500.000000
1  key1    CA  100   500.000000
2  key1    NY  100  1500.000000
3  key2    CA  100  1166.666667
4  key2    FL  100  1166.666667

[5 rows x 4 columns]
By the way, I think there is a typo in your sample data: the second n1 header should be n2.
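For anyone who wants to reproduce this end to end, here is a minimal sketch that builds the sample frame and computes both ratios. It assumes the second numerator column is named n2 (per the note above). The name result and the use of assign are my own choices for illustration, not something from the question.

```python
import pandas as pd

# Sample data from the question; the second numerator column is assumed to be n2.
mn = pd.DataFrame({
    "keyA":  ["key1", "key2", "key1", "key1", "key2", "key1", "key2"],
    "state": ["CA", "FL", "CA", "AL", "FL", "NY", "CA"],
    "n1": [100, 200, 300, 400, 500, 600, 700],
    "n2": [1000, 2000, 3000, 4000, 5000, 6000, 7000],
    "d1": [1, 2, 3, 4, 5, 6, 7],
    "d2": [2, 4, 6, 8, 2, 4, 6],
})

# Sum n1, n2, d1, d2 within each (keyA, state) group, keeping the keys as columns.
s = mn.groupby(["keyA", "state"], as_index=False).sum()

# Column-wise division is element-wise, so no explicit loop is required.
# "result" is a hypothetical name used only in this sketch.
result = s.assign(v1=s["n1"] / s["d1"], v2=s["n2"] / s["d2"])[
    ["keyA", "state", "v1", "v2"]
]
print(result)

# If the rows really are needed one at a time, itertuples yields them in order.
for row in result.itertuples(index=False):
    print(row.keyA, row.state, row.v1, row.v2)
```

Using assign leaves s itself unchanged, which can be handy if the summed columns are needed again later. Assigning s['v1'] and s['v2'] directly, as in the session above, works just as well.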