Binary image classification with a CNN - best practices for choosing the "negative" dataset?

Question

Say I want to train a CNN to detect whether or not an image shows a car.

What are some best practices or methods for choosing the "Not-Car" dataset?

Because this dataset could potentially be infinite (basically anything that is not a car) - is there a guideline on how big it needs to be? Should it contain objects that are very similar to cars but are not cars (planes, boats, etc.)?

Recommended answer

As in all supervised machine learning, the training set should reflect the real distribution the model is going to work with. A neural network is basically a function approximator. Your actual goal is to approximate the real-world distribution, but in practice you can only draw a sample from it, and that sample is the only thing the network will ever see. For any input far outside the training manifold, the output will be just a guess (see also this discussion on AI.SE).

So when choosing a negative dataset, the first question you should answer is: what is the likely use case of this model? E.g., if you're building a smartphone app, the negative samples should probably include street views, pictures of buildings and stores, people, indoor environments, etc. It's unlikely that an image from a smartphone camera will show a wild animal or an abstract painting, i.e., that is an improbable input in your real distribution.

Including images that look like the positive class (trucks, airplanes, boats, etc.) is a good idea: the features in the early conv layers (edges, corners) will be very similar for these images, so it's important that the network learns the right high-level features to tell them apart.
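One way to make sure look-alike images actually end up in the negative set is to reserve a share of it for them when sampling. The sketch below is only an illustration: the function name `sample_negatives`, the `(image_id, class_name)` pool format, and the 30% look-alike share are all assumptions, not something the answer prescribes.

```python
import random

def sample_negatives(pool, n_total, lookalike_classes,
                     lookalike_fraction=0.3, seed=0):
    """Pick n_total negatives, reserving a share for look-alike classes.

    pool: list of (image_id, class_name) pairs for non-car images
          (hypothetical format, chosen for this sketch).
    lookalike_fraction: assumed share of look-alikes; tune for your data.
    """
    rng = random.Random(seed)
    lookalikes = [item for item in pool if item[1] in lookalike_classes]
    others = [item for item in pool if item[1] not in lookalike_classes]

    # Cap each group at what the pool can actually provide.
    n_look = min(len(lookalikes), round(n_total * lookalike_fraction))
    n_other = min(len(others), n_total - n_look)

    chosen = rng.sample(lookalikes, n_look) + rng.sample(others, n_other)
    rng.shuffle(chosen)
    return chosen
```

The rest of the pool should still follow the use-case distribution described above; the reserved share only guarantees that the hard cases are not crowded out.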

In general, I'd use 5-10x more negative images than positive ones. CIFAR-10 is a good starting point: out of 50000 training images, 5000 are cars, 5000 are planes, etc. In fact, building a 10-class classifier is not a bad idea. In that case, you turn the CNN into a binary classifier by thresholding its certainty that the inferred class is a car. Anything the CNN isn't certain about is interpreted as not a car.
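The thresholding step can be sketched as follows. This is a minimal illustration, assuming the 10-class network outputs raw logits in CIFAR-10's standard label order (where "automobile" is index 1); the helper names and the 0.8 threshold are assumptions to be tuned on a validation set.

```python
import math

CAR_CLASS = 1  # "automobile" in CIFAR-10's standard label order

def softmax(logits):
    """Convert raw scores to probabilities (shifted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def is_car(logits, threshold=0.8):
    """Binary decision from a 10-class output.

    Returns True only if the car probability reaches the threshold;
    anything the network is unsure about counts as "not a car".
    The 0.8 default is an assumed value, not a recommendation.
    """
    probs = softmax(logits)
    return probs[CAR_CLASS] >= threshold
```

Note that car being the most likely class is not enough: a prediction where car narrowly beats truck still falls below the threshold and is rejected. Relabeling CIFAR-10 this way also gives 5000 car images against 45000 others, a 9:1 ratio that sits inside the 5-10x range.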


07-25 12:26