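Project Overview
This project aims to design a high-performance object detection network built on the YOLOv3 framework. The main idea is to keep the YOLOv3 structure and replace the backbone with ShuffleNetV2, one of the more efficient backbones currently available, to improve the network's performance. Compared with the original YOLOv3, prediction is 10 ms to 20 ms faster and the model is less than one eighth of the original size. Compared with YOLO-tiny, mAP improves by about 20 percentage points under the same training procedure.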
Download and installation commands
## CPU installation command
pip install -f https://paddlepaddle.org.cn/pip/oschina/cpu paddlepaddle
## GPU installation command
pip install -f https://paddlepaddle.org.cn/pip/oschina/gpu paddlepaddle-gpu
ShuffleNetV2-YOLOv3 model structure
The backbone is ShuffleNetV2. Its main building block, Inverted_residual_unit, is made up of the units shown in panels (c) and (d) of the figure below:
(a): the basic ShuffleNet-V1 unit; (b) the ShuffleNet-V1 unit for spatial down sampling (2×); (c) ShuffleNet-V2 basic unit; (d) ShuffleNet-V2 unit for spatial down sampling (2×)
The ShuffleNetV2 network structure is as follows:
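The data flow of the basic unit in panel (c) can be sketched in plain NumPy: split the channels in half, transform one half, concatenate, then shuffle the channels so the two halves mix. This is only an illustration of the tensor bookkeeping, not the project's `Inverted_residual_unit`; the names `channel_shuffle`, `basic_unit`, and `branch_fn` are hypothetical, and `branch_fn` stands in for the 1x1 conv, 3x3 depthwise conv, 1x1 conv branch described in the ShuffleNet V2 paper.

```python
import numpy as np


def channel_shuffle(x, groups=2):
    """Interleave channels across groups for an NCHW array."""
    n, c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by groups"
    # (N, C, H, W) -> (N, groups, C // groups, H, W)
    x = x.reshape(n, groups, c // groups, h, w)
    # Swap the group axis and the per-group channel axis.
    x = x.transpose(0, 2, 1, 3, 4)
    # Flatten back to (N, C, H, W); channels are now interleaved.
    return x.reshape(n, c, h, w)


def basic_unit(x, branch_fn):
    """Stride-1 unit data flow: split, transform one half, concat, shuffle.

    branch_fn is a placeholder for the convolution branch and must
    preserve the shape of its input.
    """
    c = x.shape[1]
    left, right = x[:, : c // 2], x[:, c // 2:]
    right = branch_fn(right)
    out = np.concatenate([left, right], axis=1)
    return channel_shuffle(out, groups=2)


if __name__ == "__main__":
    x = np.arange(8).reshape(1, 8, 1, 1)
    print(channel_shuffle(x).ravel())
    print(basic_unit(x, lambda t: t * 10).ravel())
```

The shuffle is what lets information cross between the untouched half and the transformed half in the next unit; without it, the left half would pass through every stride-1 unit unchanged.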
References
YOLOv3
Paper: https://arxiv.org/pdf/1804.02767v1.pdf
Reference: https://blog.csdn.net/litt1e/article/details/88907542
ShuffleNetV2
Paper: ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
Code structure
code/
├── model
├── pretrained_model      pretrained model
├── YOLOv3.py             YOLOv3 with darknet53 as the backbone (original YOLOv3)
├── shufflenet_yolo3.py   YOLOv3 with ShuffleNetV2 as the backbone
├── reader.py             data reading script
├── train.py              training script
├── freeze.py             model freezing script
├── infer.py              inference script
├── config.py             configuration parameters
├── train_list.txt        training data list
├── val_list.txt          validation data list
├── label_list            label list
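The `yolo_cfg` entry that train.py prints in the training log stores the anchors as a flat list and selects them per detection head through `anchor_mask`. The sketch below shows how that flat list could be regrouped; the anchor values, mask, input size, and class count are copied from the log, while the strides of 32, 16, and 8 are an assumption based on the standard YOLOv3 layout, and the helper names are hypothetical.

```python
# Values copied from the yolo_cfg entry in the training log.
ANCHORS = [9, 12, 15, 28, 30, 21, 28, 56, 57, 42,
           54, 110, 107, 83, 144, 183, 344, 301]
ANCHOR_MASK = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
INPUT_SIZE = 384
NUM_CLASSES = 20
# Assumption: standard YOLOv3 downsampling factors, coarsest head first.
STRIDES = [32, 16, 8]


def group_anchors(anchors, anchor_mask):
    """Turn the flat [w0, h0, w1, h1, ...] list into per-head (w, h) pairs."""
    pairs = list(zip(anchors[0::2], anchors[1::2]))
    return [[pairs[i] for i in mask] for mask in anchor_mask]


def head_summary(input_size, strides, grouped, num_classes):
    """Grid size and output channel count for each detection head."""
    summary = []
    for stride, head_anchors in zip(strides, grouped):
        summary.append({
            "grid": input_size // stride,
            # Each anchor predicts 4 box values, 1 objectness score, and class scores.
            "channels": len(head_anchors) * (5 + num_classes),
            "anchors": head_anchors,
        })
    return summary


if __name__ == "__main__":
    grouped = group_anchors(ANCHORS, ANCHOR_MASK)
    for head in head_summary(INPUT_SIZE, STRIDES, grouped, NUM_CLASSES):
        print(head)
```

Under these assumptions the largest anchors go to the coarsest grid, which is the usual pairing: large objects are predicted where each cell covers the most pixels.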
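Dataset
This project uses the Pascal VOC dataset. For a detailed introduction to the dataset, see https://blog.csdn.net/u013832707/article/details/80060327
To fit the project's needs, the Annotation files of the VOC dataset were further consolidated and written to code/train_list.txt and code/val_list.txt so that the reader can load them easily.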
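The exact layout of train_list.txt is not shown here, so the following is only a sketch of the first half of such a consolidation: reading one Pascal VOC annotation XML and extracting the class id and box for each object. The function name `voc_objects` and the sample XML are hypothetical; the label ids follow the label dictionary printed by train.py.

```python
import xml.etree.ElementTree as ET

# Subset of the label dictionary printed by train.py.
LABEL_DICT = {"dog": 11, "person": 14}

# Hypothetical annotation in the standard Pascal VOC XML layout.
SAMPLE_XML = """
<annotation>
  <filename>000001.jpg</filename>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""


def voc_objects(xml_text, label_dict):
    """Return (filename, [(class_id, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        coords = [int(float(box.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        objects.append((label_dict[name], *coords))
    return root.findtext("filename"), objects


if __name__ == "__main__":
    print(voc_objects(SAMPLE_XML, LABEL_DICT))
```

How these tuples are serialized into one line per image depends on what reader.py expects, which should be checked against the actual list files.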
# Unzip the dataset and copy the VOC training and validation lists to the right location
!cd data/data4379/ && unzip -qo pascalvoc.zip
!cd data/data4379/pascalvoc/ && rm label_list
!cp code/train_list.txt data/data4379/pascalvoc/
!cp code/val_list.txt data/data4379/pascalvoc/
!cp code/label_list data/data4379/pascalvoc/
# Model training
!cd code && python train.py
{'aeroplane': 0, 'bicycle': 1, 'bird': 2, 'boat': 3, 'bottle': 4, 'bus': 5, 'car': 6, 'cat': 7, 'chair': 8, 'cow': 9, 'diningtable': 10, 'dog': 11, 'horse': 12, 'motorbike': 13, 'person': 14, 'pottedplant': 15, 'sheep': 16, 'sofa': 17, 'train': 18, 'tvmonitor': 19}
2020-02-27 14:29:34,172 - train.py[line:429] - INFO: start train YOLOv3, train params:{'data_dir': '../data/data4379/pascalvoc', 'train_list': 'train_list.txt', 'eval_list': 'val_list.txt', 'use_filter': False, 'class_dim': 20, 'label_dict': {'aeroplane': 0, 'bicycle': 1, 'bird': 2, 'boat': 3, 'bottle': 4, 'bus': 5, 'car': 6, 'cat': 7, 'chair': 8, 'cow': 9, 'diningtable': 10, 'dog': 11, 'horse': 12, 'motorbike': 13, 'person': 14, 'pottedplant': 15, 'sheep': 16, 'sofa': 17, 'train': 18, 'tvmonitor': 19}, 'num_dict': {0: 'aeroplane', 1: 'bicycle', 2: 'bird', 3: 'boat', 4: 'bottle', 5: 'bus', 6: 'car', 7: 'cat', 8: 'chair', 9: 'cow', 10: 'diningtable', 11: 'dog', 12: 'horse', 13: 'motorbike', 14: 'person', 15: 'pottedplant', 16: 'sheep', 17: 'sofa', 18: 'train', 19: 'tvmonitor'}, 'image_count': 16551, 'continue_train': True, 'pretrained': True, 'pretrained_model_dir': 'model/pretrained_model', 'save_model_dir': 'model/model', 'model_prefix': 'yolo-v3', 'freeze_dir': 'model/freeze_model', 'use_tiny': False, 'max_box_num': 50, 'num_epochs': 1, 'train_batch_size': 16, 'use_gpu': True, 'yolo_cfg': {'input_size': [3, 384, 384], 'anchors': [9, 12, 15, 28, 30, 21, 28, 56, 57, 42, 54, 110, 107, 83, 144, 183, 344, 301], 'anchor_mask': [[6, 7, 8], [3, 4, 5], [0, 1, 2]]}, 'yolo_tiny_cfg': {'input_size': [3, 384, 384], 'anchors': [9, 12, 20, 23, 33, 51, 72, 75, 122, 150, 308, 287], 'anchor_mask': [[3, 4, 5], [0, 1, 2]]}, 'ignore_thresh': 0.7, 'mean_rgb': [127.5, 127.5, 127.5], 'mode': 'train', 'multi_data_reader_count': 4, 'apply_distort': True, 'nms_top_k': 400, 'nms_pos_k': 100, 'valid_thresh': 0.005, 'nms_thresh': 0.45, 'image_distort_strategy': {'expand_prob': 0.5, 'expand_max_ratio': 4, 'hue_prob': 0.5, 'hue_delta': 18, 'contrast_prob': 0.5, 'contrast_delta': 0.5, 'saturation_prob': 0.5, 'saturation_delta': 0.5, 'brightness_prob': 0.5, 'brightness_delta': 0.5}, 'sgd_strategy': {'learning_rate': 0.002, 'lr_epochs': [10, 45, 80, 110, 135, 160, 180], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.025, 0.004, 0.001, 0.0005]}, 'early_stop': {'sample_frequency': 50, 'rise_limit': 10, 'min_loss': 5e-08, 'min_curr_map': 0.84}}
2020-02-27 14:29:34,173 - train.py[line:431] - INFO: create place, use gpu:True
2020-02-27 14:29:34,173 - train.py[line:434] - INFO: build network and program
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/nn.py:10251: UserWarning: actual_shape will be deprecated, it is recommended to use out_shape instead of actual_shape to specify output shape dynamically.
  "actual_shape will be deprecated, it is recommended to use "
2020-02-27 14:29:34,586 - train.py[line:255] - INFO: origin learning rate: 0.002 boundaries: [10340, 46530, 82720, 113740, 139590, 165440, 186120] values: [0.002, 0.001, 0.0005, 0.0002, 5e-05, 8e-06, 2e-06, 1e-06]
2020-02-27 14:29:36,470 - train.py[line:447] - INFO: build executor and init params
W0227 14:29:37.427084 115 device_context.cc:236] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0227 14:29:37.430891 115 device_context.cc:244] device: 0, cuDNN Version: 7.3.
2020-02-27 14:29:38,703 - train.py[line:415] - INFO: load param from pretrained model
2020-02-27 14:29:38,804 - train.py[line:466] - INFO: current pass: 0, start read image
2020-02-27 14:32:15,026 - train.py[line:481] - INFO: pass 0, trainbatch 200, loss 13.787325859069824, time 0.23 sec
2020-02-27 14:34:54,558 - train.py[line:481] - INFO: pass 0, trainbatch 400, loss 32.522918701171875, time 0.22 sec
2020-02-27 14:37:32,941 - train.py[line:481] - INFO: pass 0, trainbatch 600, loss 35.67811584472656, time 0.22 sec
2020-02-27 14:40:11,286 - train.py[line:481] - INFO: pass 0, trainbatch 800, loss 21.77606964111328, time 0.41 sec
2020-02-27 14:42:47,528 - train.py[line:481] - INFO: pass 0, trainbatch 1000, loss 26.96040916442871, time 0.22 sec
2020-02-27 14:43:14,532 - train.py[line:483] - INFO: pass 0 train result, current pass mean loss: 23.394970231355675
pred (160135, 6)
2020-02-27 14:47:48,308 - train.py[line:489] - INFO: 0 epoch current pass map is 0.15955796837806702, accum_map is 0.15955796837806702
2020-02-27 14:47:48,310 - train.py[line:495] - INFO: model save 0 epcho train result, current best pass MAP 0.15955796837806702
2020-02-27 14:47:48,859 - train.py[line:499] - INFO: best pass 0 current best pass MAP is 0.15955796837806702
2020-02-27 14:47:48,861 - train.py[line:516] - INFO: end training
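The `boundaries` and `values` printed in the log can be reproduced from the `sgd_strategy` settings with a few lines of arithmetic. The sketch below assumes the step count per epoch is `image_count // train_batch_size`; that assumption matches the logged numbers, but train.py itself is not shown here, and the function names are hypothetical.

```python
from bisect import bisect_right

# Values copied from the train params in the log.
IMAGE_COUNT = 16551
BATCH_SIZE = 16
BASE_LR = 0.002
LR_EPOCHS = [10, 45, 80, 110, 135, 160, 180]
LR_DECAY = [1, 0.5, 0.25, 0.1, 0.025, 0.004, 0.001, 0.0005]


def piecewise_schedule(image_count, batch_size, base_lr, lr_epochs, lr_decay):
    """Convert epoch boundaries and decay factors into step boundaries and rates."""
    # Assumption: partial final batches are dropped from the step count.
    steps_per_epoch = image_count // batch_size
    boundaries = [epoch * steps_per_epoch for epoch in lr_epochs]
    values = [base_lr * factor for factor in lr_decay]
    return boundaries, values


def lr_at_step(step, boundaries, values):
    """Piecewise-constant lookup: values[i] applies while step < boundaries[i]."""
    return values[bisect_right(boundaries, step)]


if __name__ == "__main__":
    boundaries, values = piecewise_schedule(
        IMAGE_COUNT, BATCH_SIZE, BASE_LR, LR_EPOCHS, LR_DECAY)
    print(boundaries)
    print(lr_at_step(0, boundaries, values))
```

With `num_epochs` set to 1 in this run, training ends after roughly 1034 steps, well before the first boundary at 10340, so the learning rate stays at 0.002 throughout.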
# Freeze the model
!cd code && python freeze.py
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/nn.py:10251: UserWarning: actual_shape will be deprecated, it is recommended to use out_shape instead of actual_shape to specify output shape dynamically.
  "actual_shape will be deprecated, it is recommended to use "
freeze end
# Model inference
!cd code && python infer.py
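# Visualize the detection result
%matplotlib inline
import matplotlib.pyplot as plt
import cv2
# cv2.imread returns BGR data (or None if the file is missing); matplotlib expects RGB
detect_img = cv2.imread('code/result.jpg')
assert detect_img is not None, 'code/result.jpg not found; run infer.py first'
plt.imshow(cv2.cvtColor(detect_img, cv2.COLOR_BGR2RGB))
plt.show()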
W0227 14:47:55.996620 265 device_context.cc:236] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0227 14:47:56.000547 265 device_context.cc:244] device: 0, cuDNN Version: 7.3.
predict cost time:38.32 ms
Object detected
Detection result saved as result.jpg
infer one picture cost 38.31744194030762 ms
Click the link to try this project hands-on in AI Studio: https://aistudio.baidu.com/aistudio/projectdetail/273746