Problem description
Consider data, which contains some NaN values, as shown below:
Column-1 Column-2 Column-3 Column-4 Column-5
0 NaN 15.0 63.0 8.0 40.0
1 60.0 51.0 NaN 54.0 31.0
2 15.0 17.0 55.0 80.0 NaN
3 54.0 43.0 70.0 16.0 73.0
4 94.0 31.0 94.0 29.0 53.0
5 99.0 52.0 77.0 91.0 58.0
6 84.0 19.0 36.0 NaN 97.0
7 41.0 91.0 62.0 67.0 68.0
8 44.0 38.0 27.0 53.0 37.0
9 58.0 NaN 63.0 57.0 28.0
10 66.0 68.0 89.0 36.0 47.0
11 7.0 81.0 5.0 99.0 16.0
12 43.0 55.0 64.0 88.0 NaN
13 8.0 90.0 91.0 44.0 4.0
14 29.0 52.0 94.0 71.0 47.0
15 22.0 21.0 68.0 61.0 38.0
16 76.0 36.0 70.0 99.0 50.0
17 38.0 31.0 66.0 79.0 99.0
18 94.0 22.0 92.0 39.0 58.0
I want to replace the NaN values in data using sklearn.impute.IterativeImputer. A friend helped me with the code below:
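import numpy as np
import pandas as pd
# IterativeImputer is experimental and must be enabled explicitly before import
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

imp = IterativeImputer(missing_values=np.nan, sample_posterior=False,
                       max_iter=10, tol=0.001,
                       n_nearest_features=4, initial_strategy='median')
imp.fit(data)
imputed_data = pd.DataFrame(data=imp.transform(data),
                            columns=['Column-1', 'Column-2', 'Column-3', 'Column-4', 'Column-5'],
                            dtype='int')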
imputed_data is:
Column-1 Column-2 Column-3 Column-4 Column-5
0 59 15 63 8 40
1 60 51 66 54 31
2 15 17 55 80 48
3 54 43 70 16 73
4 94 31 94 29 53
5 99 52 77 91 58
6 84 19 36 59 97
7 41 91 62 67 68
8 44 38 27 53 37
9 58 46 63 57 28
10 66 68 89 36 47
11 7 81 5 99 16
12 43 55 64 88 47
13 8 90 91 44 4
14 29 52 94 71 47
15 22 21 68 61 38
16 76 36 70 99 50
17 38 31 66 79 99
18 94 22 92 39 58
According to the IterativeImputer documentation, the default estimator is BayesianRidge(). But if I use another estimator, such as estimator=ExtraTreesRegressor(n_estimators=10, random_state=0), as in the code below, it returns a warning message. The code:
from sklearn.ensemble import ExtraTreesRegressor

imp = IterativeImputer(estimator=ExtraTreesRegressor(n_estimators=10, random_state=0),
                       missing_values=np.nan, sample_posterior=False,
                       max_iter=10, tol=0.001,
                       n_nearest_features=4, initial_strategy='median')
imp.fit(data)
The message:
C:\Users\...\sklearn\impute\_iterative.py:599: ConvergenceWarning: [IterativeImputer] Early stopping criterion not reached.
  " reached.", ConvergenceWarning)
My question: is this a correct approach, or should I do something to fix the warning message?
Thank you.
Recommended answer
They are having the same issue here:
https://github.com/scikit-learn/scikit-learn/issues/14338
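For context, the warning is raised when the largest change in imputed values between two consecutive rounds has not dropped below the tolerance (derived from tol) by the time max_iter rounds have run. It does not by itself mean the imputed values are unusable. Below is a minimal sketch, not a definitive fix, that rebuilds the data from the question and reports how many rounds ran and whether the warning fired for a couple of settings. The helper name fit_and_report, the second (max_iter, tol) pair, and random_state=0 on the imputer are my own assumptions for illustration; they are not from the question or the linked issue.

```python
import warnings

import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.exceptions import ConvergenceWarning
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

nan = np.nan
# The 19 x 5 table from the question, including its NaN entries.
data = pd.DataFrame(
    [
        [nan, 15, 63, 8, 40], [60, 51, nan, 54, 31], [15, 17, 55, 80, nan],
        [54, 43, 70, 16, 73], [94, 31, 94, 29, 53], [99, 52, 77, 91, 58],
        [84, 19, 36, nan, 97], [41, 91, 62, 67, 68], [44, 38, 27, 53, 37],
        [58, nan, 63, 57, 28], [66, 68, 89, 36, 47], [7, 81, 5, 99, 16],
        [43, 55, 64, 88, nan], [8, 90, 91, 44, 4], [29, 52, 94, 71, 47],
        [22, 21, 68, 61, 38], [76, 36, 70, 99, 50], [38, 31, 66, 79, 99],
        [94, 22, 92, 39, 58],
    ],
    columns=[f"Column-{i}" for i in range(1, 6)],
    dtype=float,
)


def fit_and_report(max_iter, tol):
    """Hypothetical helper: fit the imputer and report convergence details."""
    imp = IterativeImputer(
        estimator=ExtraTreesRegressor(n_estimators=10, random_state=0),
        missing_values=np.nan, sample_posterior=False,
        max_iter=max_iter, tol=tol,
        n_nearest_features=4, initial_strategy="median",
        random_state=0,  # assumption: added only to make runs repeatable
    )
    # Record warnings instead of printing them so we can inspect them.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        imputed = imp.fit_transform(data)
    warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    # n_iter_ is the number of imputation rounds that actually ran.
    return imputed, imp.n_iter_, warned


# First pair: the settings from the question. Second pair: an arbitrary
# looser setting, chosen only for comparison.
for max_iter, tol in [(10, 1e-3), (50, 1e-2)]:
    imputed, n_iter, warned = fit_and_report(max_iter, tol)
    print(f"max_iter={max_iter}, tol={tol}: rounds={n_iter}, warning={warned}")
```

If n_iter_ equals max_iter and the warning fires, the stopping criterion was not met. Raising max_iter or loosening tol are the obvious knobs to try, though whether either removes the warning with a tree-based estimator may depend on the data and on the scikit-learn version.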