This article discusses random cropping as a data-augmentation technique for training convolutional neural networks. It should be a useful reference if you are tackling the same problem.

Problem Description


I am training a convolutional neural network, but I have a relatively small dataset, so I am implementing techniques to augment it. This is the first time I am working on a core computer vision problem, so I am relatively new to it. For augmentation I read about many techniques, and one that is mentioned a lot in papers is random cropping. Now I'm trying to implement it; I've searched a lot about this technique but couldn't find a proper explanation. So I have a few queries:

How does random cropping actually help with data augmentation? Is there any Python library (e.g. OpenCV, PIL, scikit-image, SciPy) that implements random cropping out of the box? If not, how should I implement it?

Solution

In my opinion, the reason random cropping helps with data augmentation is that while the semantics of the image are preserved (unless you pick a really bad crop, but let's assume you set up your random cropping so that this has very low probability), the activation values you get in your conv net are different. In effect, the conv net learns to associate a broader range of spatial activation statistics with a given class label, so data augmentation via random cropping helps improve the robustness of the feature detectors in the network. In the same vein, each random crop produces different intermediate activation values and a different forward pass, so it acts like a "new training point."
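To the library part of the question (which the answer does not address directly): several common toolkits do ship a ready-made random-crop transform. Below is a minimal sketch assuming PyTorch/torchvision is available (the filename is made up for illustration); because the crop is re-drawn every time the transform is applied, each epoch effectively sees a slightly different view of the same image, i.e. a "new training point":

from PIL import Image
from torchvision import transforms

# Random 224x224 crop followed by tensor conversion. The crop is drawn
# fresh on every call, so repeated applications give different views.
augment = transforms.Compose([
    transforms.RandomCrop(224),
    transforms.ToTensor(),
])

img = Image.open("example.jpg")   # hypothetical 256x256 training image
x1 = augment(img)                 # one random 224x224 crop
x2 = augment(img)                 # a different random crop (almost surely)
print(x1.shape)                   # torch.Size([3, 224, 224])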

This is also not a trivial effect. See the recent work on adversarial examples in neural networks (from relatively shallow networks up to AlexNet-sized ones): images that look more or less the same semantically can yield drastically different class probabilities when passed through a neural net with a softmax classifier on top. So changes that are subtle from a semantic point of view can still produce very different forward passes through a conv net. For more details, see "Intriguing properties of neural networks."

To answer the last part of your question: I usually just write my own random-cropping script. Say my images are (3, 256, 256) (3 RGB channels, 256x256 spatial size); you can code up a loop that takes 224x224 random crops of the image by randomly selecting a valid corner point. I typically compute an array of valid corner points, and if I want 10 random crops, I randomly select 10 different corner points from this set. Say I choose (x0, y0) as my upper-left corner point; I then take the crop X[:, x0:x0+224, y0:y0+224] (keeping all channels), something like this, as in the sketch below. I personally like to choose randomly from a pre-computed set of valid corner points instead of drawing one corner at a time, because this way I am guaranteed not to get a duplicate crop, though in reality the probability of a duplicate is probably low anyway.
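A minimal NumPy sketch of the approach described above (the function name and the use of numpy.random are my own; the answer only describes the idea in prose):

import numpy as np

def random_crops(X, crop_size=224, n_crops=10, rng=None):
    # Take n_crops random crops of shape (C, crop_size, crop_size) from a
    # (C, H, W) image by sampling distinct valid upper-left corner points.
    rng = np.random.default_rng() if rng is None else rng
    _, H, W = X.shape
    # Every upper-left corner from which a full crop still fits in the image.
    valid = [(x0, y0)
             for x0 in range(H - crop_size + 1)
             for y0 in range(W - crop_size + 1)]
    # Sample corners without replacement so no two crops are identical.
    idx = rng.choice(len(valid), size=n_crops, replace=False)
    crops = [X[:, x0:x0 + crop_size, y0:y0 + crop_size]
             for x0, y0 in (valid[i] for i in idx)]
    return np.stack(crops)           # (n_crops, C, crop_size, crop_size)

# Example: ten 224x224 crops from a (3, 256, 256) image.
img = np.random.rand(3, 256, 256).astype(np.float32)
crops = random_crops(img)
print(crops.shape)                   # (10, 3, 224, 224)

Pre-computing the valid corners and sampling without replacement reflects the design choice mentioned above: it guarantees that the 10 crops are all distinct.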

This concludes this article on random cropping as data augmentation for convolutional neural networks. We hope the answer above is helpful.
