Sklearn StratifiedKFold: ValueError: Supported target types are: ('binary', 'multiclass'). Got 'multilabel-indicator' instead.

Problem description

I am working with Sklearn's stratified k-fold split. When I attempt to split using multi-class targets, I receive an error (see below). When I split using binary targets, it works with no problem.

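import numpy as np
import keras
from sklearn.model_selection import StratifiedKFold

# x_train and y_train (integer class labels) are assumed to be defined earlier
num_classes = len(np.unique(y_train))
# one-hot encode the labels before splitting
y_train_categorical = keras.utils.to_categorical(y_train, num_classes)
kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=999)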

# splitting data into different folds
for i, (train_index, val_index) in enumerate(kf.split(x_train, y_train_categorical)):
    x_train_kf, x_val_kf = x_train[train_index], x_train[val_index]
    y_train_kf, y_val_kf = y_train[train_index], y_train[val_index]

ValueError: Supported target types are: ('binary', 'multiclass'). Got 'multilabel-indicator' instead.

Recommended answer

keras.utils.to_categorical produces a one-hot encoded class matrix, i.e. the multilabel-indicator mentioned in the error message. StratifiedKFold is not designed to work with such input; from the split method docs:

[...]

y : array-like, shape (n_samples,)

The target variable for supervised learning problems. Stratification is done based on the y labels.

That is, your y must be a 1-D array of your class labels.
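To see the difference between the two target formats, here is a minimal sketch using scikit-learn's type_of_target helper, which reports the same categories named in the error message. The sample labels are made up for illustration, and np.eye is used as a stand-in for keras.utils.to_categorical so the snippet runs without Keras. It also shows how np.argmax recovers 1-D labels if only the one-hot matrix is available.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical 1-D integer class labels (three classes)
y_labels = np.array([0, 1, 2, 1, 0, 2])

# One-hot matrix of shape (n_samples, n_classes);
# np.eye indexing is a stand-in for keras.utils.to_categorical
y_one_hot = np.eye(3, dtype=int)[y_labels]

label_type = type_of_target(y_labels)      # 'multiclass'
one_hot_type = type_of_target(y_one_hot)   # 'multilabel-indicator'

# If only the one-hot matrix is at hand, argmax over the class axis
# gives back 1-D labels that StratifiedKFold can stratify on
recovered = np.argmax(y_one_hot, axis=1)

print(label_type, one_hot_type, recovered)
```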

Essentially, all you have to do is invert the order of the operations: split first (using your initial y_train), and convert with to_categorical afterwards.
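The reordering could look roughly like the sketch below. The x_train and y_train arrays here are synthetic placeholders, and np.eye again stands in for keras.utils.to_categorical so the example has no Keras dependency; in the original code you would call keras.utils.to_categorical(y_train[train_index], num_classes) at the marked lines instead.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic placeholder data: 30 samples, 4 features, 3 balanced classes
rng = np.random.default_rng(0)
x_train = rng.normal(size=(30, 4))
y_train = np.repeat([0, 1, 2], 10)   # 1-D integer class labels
num_classes = len(np.unique(y_train))

kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=999)

folds = []
# Split on the 1-D labels, not on the one-hot matrix
for i, (train_index, val_index) in enumerate(kf.split(x_train, y_train)):
    x_train_kf, x_val_kf = x_train[train_index], x_train[val_index]
    # One-hot encode per fold, after the split
    # (stand-in for keras.utils.to_categorical(..., num_classes))
    y_train_kf = np.eye(num_classes)[y_train[train_index]]
    y_val_kf = np.eye(num_classes)[y_train[val_index]]
    folds.append((x_train_kf, y_train_kf, x_val_kf, y_val_kf))

print(len(folds), folds[0][1].shape, folds[0][3].shape)
```

Because stratification happens on the integer labels, each validation fold keeps the class proportions of y_train, and the model still receives one-hot targets.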

