How to feed multiple N-D arrays to a net in caffe?

Problem description

I want to create a custom loss layer for semantic segmentation in caffe that requires multiple inputs. I would like this loss function to take an additional input factor in order to penalize missed detections of small objects.

To do that, I have created a ground-truth (GT) image that contains a weight for each pixel. If a pixel belongs to a small object, its weight is high.

I am a newbie in caffe and I do not know how to feed my net with three 2-D signals at the same time (image, gt-mask and the per-pixel weights). I am unsure how caffe keeps the rgb data and the gt data in correspondence.
I want to extend this to two gt inputs: one for the class-label image and the other to bring this factor into the loss function.

Can you give some hints on how to achieve that?

Thanks,

Recommended answer

You want caffe to use several N-D signals for each training sample. You are concerned that the default "Data" layer can only handle one image per training sample.
There are several solutions to this concern:

  1. Using several "Data" layers (as was done in the model you linked to). In order to sync between the three "Data" layers you'll have, you need to know that caffe reads the samples from the underlying LMDB sequentially. So, if you prepare your three LMDBs in the same order, caffe will read one sample at a time from each of the LMDBs in the order in which the samples were put there, and the three inputs will be in sync during training/validation.
    Note that convert_imageset has a 'shuffle' flag. Do NOT use it, as it will shuffle your samples differently in each of the three LMDBs and you will have no sync. You are strongly advised to shuffle the samples yourself before preparing the LMDBs, but in such a way that the same "shuffle" is applied to all three inputs, leaving them in sync with each other.
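
    One way to apply the same "shuffle" to all three inputs is to shuffle whole (image, gt, weight) triplets once and then write one list file per LMDB from that single order. The following is only a minimal sketch: the function name write_synced_lists and all file names are hypothetical, and it assumes the gt masks and weight maps are stored as image files that convert_imageset can read.

```python
import random


def write_synced_lists(samples, out_paths, seed=0):
    """Write three list files that share one shuffled order.

    samples:   list of (rgb_path, gt_path, weight_path) tuples
    out_paths: three output list files, one per LMDB to be built
    Returns the shuffled list of triplets.
    """
    order = list(samples)
    # Shuffle whole triplets, so the three columns can never drift apart.
    random.Random(seed).shuffle(order)
    for column, out_path in enumerate(out_paths):
        with open(out_path, "w") as f:
            for triplet in order:
                # convert_imageset expects "<path> <integer label>" per line;
                # the label is unused here, so a dummy 0 is written.
                f.write("%s 0\n" % triplet[column])
    return order
```

    Each list file would then be passed to convert_imageset without the shuffle flag, so that line i of every list ends up as sample i of its LMDB. The three "Data" layers should also use the same batch_size, otherwise they advance through their LMDBs at different rates.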

  2. Using 5-channel input. caffe can store N-D data in LMDB, not only color/gray images. You can use python to create an LMDB in which each "image" is a 5-channel array: the first three channels are the image's RGB, and the last two are the ground-truth labels and the per-pixel loss weights.
    In your model you only need to add a "Slice" layer on top of your "Data" layer:

layer {
  name: "slice_input"
  type: "Slice"
  bottom: "raw_input" # 5-channel "image" stored in LMDB
  top: "rgb"
  top: "gt"
  top: "weight"
  slice_param {
    axis: 1
    slice_point: 3
    slice_point: 4
  }
}
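
    The 5-channel LMDB for this option could be prepared along the following lines. This is a sketch under assumptions, not a tested pipeline: pack_sample and write_lmdb are hypothetical helper names, the inputs are assumed to be numpy arrays of matching height and width, and write_lmdb needs the third-party lmdb package and caffe's python bindings.

```python
import numpy as np


def pack_sample(rgb, gt, weight):
    """Stack an H x W x 3 image, an H x W label map and an H x W weight map
    into one 5 x H x W float32 array (channels first, as caffe stores blobs)."""
    rgb = np.asarray(rgb, dtype=np.float32).transpose(2, 0, 1)  # 3 x H x W
    gt = np.asarray(gt, dtype=np.float32)[np.newaxis]           # 1 x H x W
    weight = np.asarray(weight, dtype=np.float32)[np.newaxis]   # 1 x H x W
    return np.concatenate([rgb, gt, weight], axis=0)


def write_lmdb(path, samples, map_size=1 << 30):
    """Write (rgb, gt, weight) triplets to an LMDB, one Datum per sample."""
    # Imported here so that pack_sample can be used without these packages.
    import lmdb
    import caffe

    env = lmdb.open(path, map_size=map_size)
    with env.begin(write=True) as txn:
        for i, (rgb, gt, weight) in enumerate(samples):
            datum = caffe.io.array_to_datum(pack_sample(rgb, gt, weight))
            # Zero-padded keys keep the samples in insertion order.
            txn.put("{:08d}".format(i).encode("ascii"),
                    datum.SerializeToString())
    env.close()
```

    With this layout, channels 0-2 come out of the "Slice" layer as "rgb", channel 3 as "gt" and channel 4 as "weight", matching slice_point 3 and 4 above. Any transform_param on the "Data" layer would presumably be applied to all five channels, labels included, so scaling or mean subtraction is probably better done when packing the array.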

  3. Using an "HDF5Data" layer (my personal favorite). You can store your inputs in a binary hdf5 format and have caffe read from these files. Using "HDF5Data" is much more flexible in caffe and allows you to shape the inputs as much as you like. In your case you need to prepare a binary hdf5 file with three "datasets": 'rgb', 'gt' and 'weight'. You need to make sure the samples are synced when you create the hdf5 file(s). Once you have them ready, you can have an "HDF5Data" layer with three "top"s ready to be used.
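
    As a rough illustration of the hdf5 preparation, the sketch below stacks the samples into three arrays that share the same sample index and writes them as the datasets 'rgb', 'gt' and 'weight'. The helper names stack_samples and write_hdf5 and the file paths are hypothetical; write_hdf5 assumes the third-party h5py package is installed.

```python
import numpy as np


def stack_samples(samples):
    """samples: iterable of (rgb H x W x 3, gt H x W, weight H x W) arrays,
    all with the same height and width.
    Returns a dict of float32 arrays indexed by the same sample number."""
    rgb, gt, weight = [], [], []
    for image, labels, weights in samples:
        rgb.append(np.asarray(image, dtype=np.float32).transpose(2, 0, 1))
        gt.append(np.asarray(labels, dtype=np.float32)[np.newaxis])
        weight.append(np.asarray(weights, dtype=np.float32)[np.newaxis])
    return {
        "rgb": np.stack(rgb),        # N x 3 x H x W
        "gt": np.stack(gt),          # N x 1 x H x W
        "weight": np.stack(weight),  # N x 1 x H x W
    }


def write_hdf5(h5_path, list_path, samples):
    """Write one hdf5 file plus the text file that lists it."""
    import h5py  # imported here so stack_samples works without h5py

    with h5py.File(h5_path, "w") as f:
        for name, array in stack_samples(samples).items():
            f.create_dataset(name, data=array)
    # The "HDF5Data" layer's source is a text file listing hdf5 file paths.
    with open(list_path, "w") as f:
        f.write(h5_path + "\n")
```

    The "HDF5Data" layer would then declare the tops "rgb", "gt" and "weight", since top names are matched to dataset names, and its hdf5_data_param source would point at the list file rather than at the hdf5 file itself.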

  4. Writing your own "Python" input layer. I will not go into the details here, but you can implement your own input layer in python. See this thread for more details.
