DataFrame, apply, lambda, and list comprehensions in pandas
Problem description
I'm trying to clean up some data sets. I can accomplish the task with a few for loops, but I'd like a more Pythonic, pandas-idiomatic ("pandorable") way to do it.
This is the code I came up with. The data is not real, but the code should work:
import pandas as pd

# This is a dataframe containing the correct values
correct = pd.DataFrame([{"letters":"abc","data":1},{"letters":"ast","data":2},{"letters":"bkgf","data":3}])

# This is the dataframe containing source data
source = pd.DataFrame([{"c":"ab"},{"c":"kh"},{"c":"bkg"}])

# Series.items() replaces iteritems(), which was removed in pandas 2.0
for i, word in source["c"].items():
    for j, row in correct.iterrows():
        # Replace the source value with the first matching row's "data"
        if word in row["letters"]:
            source.at[i, "c"] = row["data"]
            break
This is my attempt at a pandorable way, but it fails because the comprehension inside apply is parsed as a generator expression, so apply receives a generator instead of a function:
source["c"] = source["c"].apply(
    lambda x: row["data"] if x in row["letters"] else x for row in
    correct.iterrows()
)
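To see why this call cannot work, it may help to check what Python actually passes as the argument. The sketch below uses a hypothetical stand-in function, describe_argument, in place of Series.apply (it is not part of pandas) and only reports the type of what it receives:

```python
import types

def describe_argument(func):
    # Hypothetical stand-in for Series.apply: report what was actually passed in.
    return type(func), callable(func)

# Same shape as the failing call: an unparenthesized "lambda ... for ... in ..."
# as the only argument is parsed as a generator expression that yields lambdas.
arg_type, is_callable = describe_argument(lambda x: x * n for n in range(3))

print(arg_type is types.GeneratorType)  # True
print(is_callable)                      # False
```

Because the argument is a generator and not a callable, apply has no function to run on each element. Separately, correct.iterrows() yields (index, row) tuples, so row would need to be unpacked before row["letters"] could be used.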
Recommended answer
Here's one solution using pd.Series.apply with next and a generator expression:
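def update_value(x):
    # Return the "data" value of the first row whose "letters" contains x,
    # or x itself when no row matches
    return next((k for k, v in correct.set_index('data')['letters'].items() if x in v), x)

source['c'] = source['c'].apply(update_value)

print(source)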
    c
0   1
1  kh
2   3
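The answer's function rebuilds the lookup with set_index on every call. As a possible variation (a sketch, not part of the original answer), the (letters, data) pairs can be built once up front. The name pairs is introduced here for illustration, and the sketch assumes the first matching row should win, as in the original loop with break:

```python
import pandas as pd

correct = pd.DataFrame({"letters": ["abc", "ast", "bkgf"], "data": [1, 2, 3]})
source = pd.DataFrame({"c": ["ab", "kh", "bkg"]})

# Build the (letters, data) pairs once instead of on every call.
pairs = list(zip(correct["letters"], correct["data"]))

def update_value(x):
    # First match wins; fall back to the original value when nothing matches.
    return next((data for letters, data in pairs if x in letters), x)

source["c"] = source["c"].apply(update_value)
result = source["c"].tolist()
print(result)
```

The values should match the output shown above: "ab" maps to 1, "kh" is left unchanged, and "bkg" maps to 3.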