Input dataset format for Google AutoML Natural Language multi-label text classification

Problem description

What should the input dataset look like for Google AutoML Natural Language multi-label text classification? I know that for multi-class classification I need one column of text and another column of labels, where the labels column holds one label per row.

I have multiple labels for each text and I want to do multi-label classification. I tried using one column per label with one-hot encoding, but I got this error message: "Max 1000 labels supported. Found 9823 labels."
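
For illustration, the one-hot layout described above can be reshaped so that each row carries its text followed by only the labels that apply to it. This is a minimal sketch, not a confirmed fix: it assumes the importer accepts CSV rows of the form text,label1,label2,..., which should be checked against the AutoML Natural Language documentation. The function names, the "text" column name, and the sample rows are hypothetical; the 1000-label figure is taken from the error message.

```python
import csv
import io

# Limit quoted in the error message; assumed to apply to distinct labels.
MAX_LABELS = 1000


def one_hot_to_label_rows(rows, text_column="text"):
    """Turn one-hot rows (dicts) into [text, label1, label2, ...] lists.

    A label is kept for a row when its one-hot cell equals 1.
    """
    converted = []
    for row in rows:
        labels = [
            name
            for name, value in row.items()
            if name != text_column and str(value).strip() == "1"
        ]
        converted.append([row[text_column], *labels])
    return converted


def distinct_labels(label_rows):
    """Collect every label that appears after the text field."""
    return {label for row in label_rows for label in row[1:]}


def to_csv(label_rows):
    """Serialize rows to CSV text; the csv module quotes text containing commas."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(label_rows)
    return buffer.getvalue()


# Hypothetical sample data in the one-hot layout from the question.
one_hot = [
    {"text": "Great battery, poor screen", "battery": 1, "screen": 1, "price": 0},
    {"text": "Too expensive", "battery": 0, "screen": 0, "price": 1},
]

label_rows = one_hot_to_label_rows(one_hot)
csv_text = to_csv(label_rows)

if len(distinct_labels(label_rows)) > MAX_LABELS:
    print("Warning: more distinct labels than the limit in the error message")

print(csv_text)
```

Rows end up with different numbers of fields, one per applicable label. Counting distinct labels before uploading helps tell apart a dataset that really has too many labels from one where the file layout is being misread.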

Recommended answer