I ran into some questions while reading the ssd-caffe code, and I would really appreciate your help.

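  • Native Caffe only supports classification; its data layers are typically used to read an LMDB database or to read images directly for training.
  • To support multiple labels and annotated bounding boxes as input, I decided to use ssd-caffe, which adds an AnnotatedDataLayer to native Caffe. This new layer supports multiple labels and annotation boxes, but it has a limitation: the data it reads must still be stored in LMDB.
  • We now need to read samples from the dataset in random order. From what I have found, LMDB is a B+ tree structure that can only be read sequentially through an iterator, so we would like to replace LMDB with reading images directly. However, native Caffe's direct image reading does not support multiple labels or annotation boxes. How can I modify Caffe's image_data_layer to support annotation boxes as input? (Could I follow the approach of AnnotatedDataLayer to solve this?)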
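
To make the question concrete, here is a minimal sketch of the kind of list file an annotation-aware image layer could read. Native ImageDataLayer reads a text file of "path label" lines and can shuffle it; the extended line format below ("path num_boxes, then label x1 y1 x2 y2 per box") and the names Box, Sample, parse_line and load_shuffled are hypothetical and not part of Caffe or ssd-caffe.

```cpp
#include <algorithm>
#include <istream>
#include <random>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record types: one image path plus its annotated boxes.
struct Box { int label; float x1, y1, x2, y2; };
struct Sample { std::string path; std::vector<Box> boxes; };

// Parses one line of the assumed format:
//   <image path> <num boxes> {<label> <x1> <y1> <x2> <y2>} ...
// Returns false if the line is malformed or truncated.
bool parse_line(const std::string& line, Sample* out) {
  std::istringstream iss(line);
  int n = 0;
  if (!(iss >> out->path >> n) || n < 0) return false;
  out->boxes.clear();
  for (int i = 0; i < n; ++i) {
    Box b;
    if (!(iss >> b.label >> b.x1 >> b.y1 >> b.x2 >> b.y2)) return false;
    out->boxes.push_back(b);
  }
  return true;
}

// Reads every well-formed line (malformed lines are skipped) and shuffles
// the in-memory list, which gives random access order without LMDB.
std::vector<Sample> load_shuffled(std::istream& in, unsigned seed) {
  std::vector<Sample> samples;
  std::string line;
  while (std::getline(in, line)) {
    Sample s;
    if (parse_line(line, &s)) samples.push_back(s);
  }
  std::mt19937 rng(seed);
  std::shuffle(samples.begin(), samples.end(), rng);
  return samples;
}
```

Only the small list is held in memory and shuffled; the images themselves would still be decoded one by one when a batch is assembled.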

  • Notes:
  • Modified ssd-caffe source code: https://github.com/eric612/MobileNet-YOLO
  • File path of the newly added annotation-box layer: /MobileNet-YOLO/src/caffe/layers/annotated_data_layer.cpp
  • File path of the native Caffe layer that reads images directly: /MobileNet-YOLO/src/caffe/layers/image_data_layer.cpp

Best answer

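    The Data layer makes it possible to read random data from disk asynchronously (it uses 2 threads: one reads the data and the other delivers it to the neural network). Its top blobs consist of the data and the label. Unfortunately, the label is one-dimensional. To work around this, you can organize the LMDB database in a special order; then, after reading the data and before passing it to the neural network, you transform it to fit your problem. The example below demonstrates this. First, I write an LMDB database containing 10 different images (it is actually the same image, but we pretend they are different), each with a set of random bounding boxes (7 per image in the code) and a random label of dimension 3 for each box.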
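
The record layout described above can be sketched without Caffe. This is only an illustration under assumptions: the Layout struct and the pack, box_coord and label_value helpers are hypothetical names, and the order (B plane, G plane, R plane, boxes, labels) follows the order in which the answer's code appends values to the Datum.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Flat record layout:
// [ B plane | G plane | R plane | boxes (4 floats each) | labels (dim_label each) ]
struct Layout {
  int rows, cols, max_num_bbx, dim_label;
  std::size_t image_size() const {
    return static_cast<std::size_t>(3) * rows * cols;
  }
  std::size_t bbx_offset() const { return image_size(); }
  std::size_t label_offset() const {
    return bbx_offset() + static_cast<std::size_t>(4) * max_num_bbx;
  }
  std::size_t total() const {
    return label_offset() + static_cast<std::size_t>(dim_label) * max_num_bbx;
  }
};

// Concatenates the three parts into one flat float record.
std::vector<float> pack(const Layout& lo,
                        const std::vector<float>& planar_bgr,
                        const std::vector<float>& bbx,
                        const std::vector<float>& labels) {
  assert(planar_bgr.size() == lo.image_size());
  assert(bbx.size() == static_cast<std::size_t>(4) * lo.max_num_bbx);
  assert(labels.size() ==
         static_cast<std::size_t>(lo.dim_label) * lo.max_num_bbx);
  std::vector<float> rec;
  rec.reserve(lo.total());
  rec.insert(rec.end(), planar_bgr.begin(), planar_bgr.end());
  rec.insert(rec.end(), bbx.begin(), bbx.end());
  rec.insert(rec.end(), labels.begin(), labels.end());
  return rec;
}

// Coordinate k (0..3 = x1, y1, x2, y2) of box i inside a packed record.
float box_coord(const Layout& lo, const std::vector<float>& rec, int i, int k) {
  return rec[lo.bbx_offset() + 4 * i + k];
}

// Component j of the label that belongs to box i inside a packed record.
float label_value(const Layout& lo, const std::vector<float>& rec, int i, int j) {
  return rec[lo.label_offset() + lo.dim_label * i + j];
}
```

Because every record has the same total() length, a batch of such records can be treated as a blob of width total() and cut at the offsets computed here.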

    Note: to reproduce the code below you must have Caffe installed. If you have only compiled the Caffe folder, create the folder root_caffe/examples/new_folder, put the code in it, and build with make.

    #include <caffe/caffe.hpp>
    #include "caffe/proto/caffe.pb.h"
    #include "caffe/util/db.hpp"
    #include "boost/scoped_ptr.hpp"
    #include <opencv2/imgcodecs.hpp>
    #include <iostream>
    #include <stdlib.h>
    
    
    using namespace caffe;
    using boost::scoped_ptr;
    
    
    std::vector<float> generate_random_boxes(const int max_num_bbx){
    
            std::vector<float> bbx(4*max_num_bbx);
    
        for(int i = 0; i < max_num_bbx; i++){
    
           float scale = 500*static_cast <float> (rand()) / static_cast <float> (RAND_MAX);
           float x1 = static_cast <float> (rand()) / static_cast <float> (RAND_MAX);
           float y1 = static_cast <float> (rand()) / static_cast <float> (RAND_MAX);
           float x2 = x1 + static_cast <float> (rand()) / static_cast <float> (RAND_MAX);
           float y2 = y1 + static_cast <float> (rand()) / static_cast <float> (RAND_MAX);
           bbx[i*4] = scale*x1;
           bbx[i*4 + 1] = scale*y1;
           bbx[i*4 + 2] = scale*x2;
           bbx[i*4 + 3] = scale*y2;
    
        }
    
        return bbx;
    }
    
    std::vector<float> generate_random_labels(const int dim_label, const int max_num_bbx){
    
            std::vector<float> labels(dim_label*max_num_bbx);
    
        for(int i = 0; i < max_num_bbx; i++){
           for(int j = 0; j < dim_label; j++){
    
              labels[dim_label*i + j] = static_cast <float> (rand()) / static_cast <float> (RAND_MAX);
    
               }
        }
    
        return labels;
    }
    
    
    int main(){
    
      const std::string root_path = "/path/for/test/";
      const std::string path_lmdb = root_path + "lmdb";
      std::string rm_lmdb = std::string("rm -rf ") + path_lmdb.c_str();
      system(rm_lmdb.c_str());
      scoped_ptr<db::DB> db(db::GetDB("lmdb"));
      db->Open(path_lmdb, db::NEW);
      scoped_ptr<db::Transaction> txn(db->NewTransaction());
    
    
      int n = 10;
      int max_num_bbx = 7;
      int dim_label = 3;
      cv::Mat aux_img = cv::imread(root_path + "image.jpg");
      int rows = aux_img.rows;
      int cols = aux_img.cols;
    
      std::vector<cv::Mat> vec_img(n);
      std::vector< std::vector<float> > vec_bbx(n);
      std::vector< std::vector<float> > vec_label(n);
    
      for(int i = 0; i < n; i++){
    
         vec_img[i] = aux_img.clone();
         vec_bbx[i] = generate_random_boxes(max_num_bbx);
         vec_label[i] = generate_random_labels(dim_label, max_num_bbx);
    
      }
    
      for(int i = 0; i< n; i++){
    
         int sz = 3*rows*cols + 4*max_num_bbx + dim_label*max_num_bbx;
    
         Datum datum;
         datum.set_label(0); // not used
         datum.set_channels(1);
         datum.set_height(1);
         datum.set_width(sz);
    
         google::protobuf::RepeatedField<float>* datumFloatData = datum.mutable_float_data();
    
         //store images
         cv::Mat img = vec_img[i];
         for(int d = 0; d < 3; d++){ //BGR
            for(int r = 0; r < rows; r++){
               for(int c = 0; c < cols; c++){
    
                  cv::Vec3b pixel = img.at<cv::Vec3b>(r, c);
                  datumFloatData->Add(float(pixel[d]));
    
               }
            }
         }
    
    
        //store bounding-boxes
        std::vector<float>& bbx = vec_bbx[i];
        for(int j = 0; j < 4*max_num_bbx; j++)
           datumFloatData->Add(bbx[j]);
    
        //store labels
        std::vector<float>& label = vec_label[i];
        for(int j = 0; j < dim_label*max_num_bbx; j++)
           datumFloatData->Add(label[j]);
    
    
        //store lmdb
        std::string key_str = caffe::format_int(i);
        std::string out;
        CHECK(datum.SerializeToString(&out));
        txn->Put(key_str, out);
        txn->Commit();
        txn.reset(db->NewTransaction());
        std::cout<<"save data: "<<i<<std::endl;
    
    
      }
    
     return 0;
    
    }
    

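    Then, in the folder "/path/for/test", we will have a folder named lmdb containing our database. Now we have to read the data back and organize it in the desired order. For this I use a Slice layer, which splits the bottom input into multiple tops. The input data, a batch of flat records each holding an image, its bounding boxes and its labels, is therefore split into 5 top blobs: img_b, img_g, img_r, bbx, labels.

    #include <caffe/caffe.hpp>
    
    #include <opencv2/core.hpp>
    #include <opencv2/imgcodecs.hpp>
    #include <opencv2/highgui.hpp>
    #include <opencv2/imgproc/imgproc.hpp>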
    #include "boost/scoped_ptr.hpp"
    #include <iostream>
    #include <stdio.h>
    #include <stdlib.h>
    
    using namespace caffe;
    using boost::scoped_ptr;
    
    int main(){
    
    
      const std::string root_path = "/path/for/test/";
      const std::string path_lmdb = root_path + "lmdb";
    
    
      //parameters used to store lmdb data base
      int n = 10;
      int max_num_bbx = 7;
      int dim_label = 3;
      cv::Mat aux_img = cv::imread(root_path + "image.jpg");
      int rows = aux_img.rows;
      int cols = aux_img.cols;
    
    
      //here we build the network input
    
      NetParameter net_param;
    
      LayerParameter* db_layer_param = net_param.add_layer();
      db_layer_param->set_name("data");
      db_layer_param->set_type("Data");
      DataParameter* db_data_param = db_layer_param->mutable_data_param();
    
      db_data_param->set_batch_size(2);
      db_data_param->set_prefetch(3);
    
    
      db_data_param->set_source(path_lmdb);
      db_data_param->set_backend(DataParameter_DB_LMDB);
    
    
      db_layer_param->add_top("data");
    
      LayerParameter* slice_layer_param = net_param.add_layer();
      slice_layer_param->set_name("slice");
      slice_layer_param->set_type("Slice");
      slice_layer_param->mutable_slice_param()->set_axis(3);//slice along the width axis; the B plane starts at 0
      slice_layer_param->mutable_slice_param()->add_slice_point(rows*cols);//starting G
      slice_layer_param->mutable_slice_param()->add_slice_point(2*rows*cols);//starting R
      slice_layer_param->mutable_slice_param()->add_slice_point(3*rows*cols);//starting bbx
      slice_layer_param->mutable_slice_param()->add_slice_point(3*rows*cols + 4*max_num_bbx);//starting labels
    
    
      slice_layer_param->add_bottom("data");
    
      slice_layer_param->add_top("img_b");
      slice_layer_param->add_top("img_g");
      slice_layer_param->add_top("img_r");
      slice_layer_param->add_top("bbx");
      slice_layer_param->add_top("labels");
    
    
      //NOTE: you must add the additional layers of your model
      /*
      .
      .
      .
      .
      */
    
    
    
      //here we store and load the model
      //NOTE: in this example it is not necessary to store the model in a prototxt file
      const std::string net_file = root_path + "model.prototxt";
      Net<float> net(net_param);
      WriteProtoToTextFile(net_param,net_file);
    
    
    
    
      //here we make forward in order to read our data
      net.Forward();
    
    
    
      /*Note that in this example we read 2 images, but then we will only show the first*/
    
      //read first image
      boost::shared_ptr< Blob< float > > img_b = net.blob_by_name("img_b");
      boost::shared_ptr< Blob< float > > img_g = net.blob_by_name("img_g");
      boost::shared_ptr< Blob< float > > img_r = net.blob_by_name("img_r");
    
      cv::Mat img(rows,cols,CV_8UC3);
    
      for(int r = 0; r < rows; r++){
          for(int c = 0; c < cols; c++){
    
          img.at<cv::Vec3b>(r,c)[0] = (uchar) img_b->cpu_data()[r*cols + c];
          img.at<cv::Vec3b>(r,c)[1] = (uchar) img_g->cpu_data()[r*cols + c];
          img.at<cv::Vec3b>(r,c)[2] = (uchar) img_r->cpu_data()[r*cols + c];
          }
      }
    
    
    
      //read bounding boxes
      boost::shared_ptr< Blob< float > > bbx = net.blob_by_name("bbx");
    
      for(int i = 0; i < max_num_bbx; i++){
    
         float x1 = bbx->cpu_data()[4*i];
         float y1 = bbx->cpu_data()[4*i + 1];
         float x2 = bbx->cpu_data()[4*i + 2];
         float y2 = bbx->cpu_data()[4*i + 3];
    
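         // cv::Point takes (x, y); the boxes were stored as x1, y1, x2, y2
         cv::Point pt1(cvRound(x1), cvRound(y1));
         cv::Point pt2(cvRound(x2), cvRound(y2));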
         cv::rectangle(img, pt1, pt2, cv::Scalar(0, 255, 0));
    
      }
    
    
     //read labels
     boost::shared_ptr< Blob< float > > labels = net.blob_by_name("labels");
    
     std::cout<<"labels: "<<std::endl;
     for(int i = 0; i < max_num_bbx; i++){
        for(int j = 0; j < dim_label; j++){
    
         std::cout<<labels->cpu_data()[i*dim_label + j]<<" ";
    
        }
        std::cout<<std::endl;
     }
    
    
     cv::imshow("img", img);
     cv::waitKey(0);
    
     return 0;
    
    }
    

    The generated output is as follows:

    [image: output of the program above]

    The prototxt for the Data and Slice layers, generated with WriteProtoToTextFile(net_param, net_file), is as follows:
    layer {
      name: "data"
      type: "Data"
      top: "data"
      data_param {
        source: "/path/for/test/lmdb"
        batch_size: 2
        backend: LMDB
        prefetch: 3
      }
    }
    layer {
      name: "slice"
      type: "Slice"
      bottom: "data"
      top: "img_b"
      top: "img_g"
      top: "img_r"
      top: "bbx"
      top: "labels"
      slice_param {
        slice_point: 344000
        slice_point: 688000
        slice_point: 1032000
        slice_point: 1032028
        axis: 3
      }
    }
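
The slice_point values in the prototxt above follow directly from the sizes used when writing the database. As a small sketch (slice_points and slice_widths are hypothetical helper names, and the 430 x 800 image size mentioned afterwards is just one assumed factorization of 344000, not a value stated in the answer):

```cpp
#include <cstddef>
#include <vector>

// Cut positions along the width axis, in the order B | G | R | bbx | labels.
std::vector<int> slice_points(int rows, int cols, int max_num_bbx) {
  const int plane = rows * cols;  // number of pixels in one colour plane
  return {plane, 2 * plane, 3 * plane, 3 * plane + 4 * max_num_bbx};
}

// Width of each resulting top blob, given the full record width.
std::vector<int> slice_widths(const std::vector<int>& points, int total) {
  std::vector<int> widths;
  int prev = 0;
  for (std::size_t i = 0; i < points.size(); ++i) {
    widths.push_back(points[i] - prev);
    prev = points[i];
  }
  widths.push_back(total - prev);  // the last top takes the remainder
  return widths;
}
```

For example, slice_points(430, 800, 7) gives 344000, 688000, 1032000 and 1032028, matching the prototxt. Four slice points produce five tops, and with dim_label = 3 the last top (labels) is 21 values wide.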
    

    After the Slice layer, you will probably need to add further Reshape layers to adapt the data to the subsequent layers.
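
To see which shapes such a Reshape layer would produce, the sketch below mimics the documented dimension rules of Caffe's Reshape layer in plain C++ (0 copies the bottom dimension, -1 is inferred from the remaining element count, positive values are used as given). It is a simplified illustration, not Caffe's implementation: infer_reshape is a hypothetical name, and the axis and num_axes options are ignored.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Computes the top shape for a bottom shape and a reshape spec.
std::vector<int> infer_reshape(const std::vector<int>& bottom,
                               const std::vector<int>& spec) {
  long long count = 1;  // total number of elements in the bottom blob
  for (std::size_t i = 0; i < bottom.size(); ++i) count *= bottom[i];

  std::vector<int> top(spec.size(), 0);
  long long known = 1;  // product of all explicitly known top dimensions
  int inferred = -1;    // index of the single -1 entry, if any
  for (std::size_t i = 0; i < spec.size(); ++i) {
    if (spec[i] == -1) {
      if (inferred != -1) throw std::invalid_argument("only one -1 allowed");
      inferred = static_cast<int>(i);
      continue;
    }
    if (spec[i] == 0) {
      if (i >= bottom.size()) throw std::invalid_argument("no dim to copy");
      top[i] = bottom[i];
    } else if (spec[i] > 0) {
      top[i] = spec[i];
    } else {
      throw std::invalid_argument("invalid dim");
    }
    known *= top[i];
  }
  if (inferred >= 0) {
    if (known == 0 || count % known != 0)
      throw std::invalid_argument("cannot infer dimension");
    top[inferred] = static_cast<int>(count / known);
  } else if (known != count) {
    throw std::invalid_argument("element count mismatch");
  }
  return top;
}
```

With the batch of 2 from the example, the bbx blob of shape (2, 1, 1, 28) and the spec (0, -1, 4) yield (2, 7, 4), one row of 4 coordinates per box; the labels blob of shape (2, 1, 1, 21) with (0, -1, 3) yields (2, 7, 3).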

    Regarding "c++ - How to load images instead of LMDB in ssd-caffe", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/62307905/

    10-12 23:31