pandas: for all duplicate entries in a particular column, get some information

Question

I have a large DataFrame that looks similar to this:

     ID_Code    Status1    Status2
0      A          Done       Not
1      A          Done       Done
2      B          Not        Not
3      B          Not        Done
4      C          Not        Not
5      C          Not        Not
6      C          Done       Done

For each set of duplicate ID codes, I want to calculate the percentage of entries that are Not-Not, i.e. [# of Not-Not entries / # of total entries] * 100.

I'm struggling to do this with groupby and can't seem to get the syntax right.

Recommended answer

I may have misunderstood the question, but you appear to be referring to rows where Status1 and Status2 are both Not, correct? If that's the case, you can do something like:

df.groupby('ID_Code').apply(lambda x: (x[['Status1', 'Status2']] == 'Not').all(axis=1).sum() / len(x) * 100)

ID_Code
A     0.000000
B    50.000000
C    66.666667
dtype: float64
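
For comparison, here is a minimal sketch of the same calculation without apply, assuming the sample data from the question. The names both_not and pct_not_not are illustrative and do not come from the original answer. It relies on the fact that the mean of a boolean Series is the fraction of True values.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "ID_Code": ["A", "A", "B", "B", "C", "C", "C"],
    "Status1": ["Done", "Done", "Not", "Not", "Not", "Not", "Done"],
    "Status2": ["Not", "Done", "Not", "Done", "Not", "Not", "Done"],
})

# True where both status columns are 'Not' in the same row.
both_not = (df[["Status1", "Status2"]] == "Not").all(axis=1)

# The mean of a boolean Series is the fraction of True values,
# so grouping by ID_Code and scaling by 100 gives the percentage.
pct_not_not = both_not.groupby(df["ID_Code"]).mean() * 100

print(pct_not_not)
```

On the sample data this should give the same three values as the apply version. Building the mask once and grouping it also avoids calling a Python lambda for every group, which may matter on a large DataFrame.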

