Project Overview

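This project implements the VGG-Net image classification model using PaddlePaddle's dynamic graph mode. Running it on a GPU is recommended. Details follow.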

Installation Commands

## Install command for the CPU version
pip install -f https://paddlepaddle.org.cn/pip/oschina/cpu paddlepaddle

## Install command for the GPU version
pip install -f https://paddlepaddle.org.cn/pip/oschina/gpu paddlepaddle-gpu

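Dataset

The project uses a public flower dataset. The archive contains five folders, one per flower type: daisy, dandelion, rose, sunflower, and tulip. Each class has between 690 and 890 images.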

Model Overview

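The model proposed by the Visual Geometry Group (VGG) at the University of Oxford for ILSVRC 2014 is known as the VGG model. Compared with earlier models, it makes the network both wider and deeper. Its core is five groups of convolutions, with Max-Pooling between consecutive groups to reduce the spatial dimensions. Each group applies several consecutive 3x3 convolutions. The number of filters grows from 64 in the shallowest group to 512 in the deepest, and is the same for every layer within a group. The convolutions are followed by two fully connected layers and then a classification layer. Depending on how many convolution layers each group contains, the model comes in 11-, 13-, 16-, and 19-layer variants; the figure below shows the 16-layer structure. The VGG structure is relatively simple, and many later papers built on it. For example, the first publicly reported model to surpass human-level recognition on ImageNet borrowed from the VGG structure.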
[Figure: structure of the 16-layer VGG network]
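As a rough illustration of how the 11-, 13-, 16-, and 19-layer variants differ, the sketch below lists the number of 3x3 convolutions in each of the five groups, using the configurations from the VGG paper, and checks that convolutions plus fully connected layers add up to the variant's name. The names `VGG_CONV_COUNTS`, `GROUP_FILTERS`, and `weight_layer_count` are made up for this example and are not taken from the project's code.

```python
# Number of 3x3 convolution layers in each of the five groups, per VGG variant
# (configurations as listed in the VGG paper).
VGG_CONV_COUNTS = {
    11: [1, 1, 2, 2, 2],
    13: [2, 2, 2, 2, 2],
    16: [2, 2, 3, 3, 3],
    19: [2, 2, 4, 4, 4],
}

# Filters per group: 64 in the shallowest group, 512 in the two deepest.
GROUP_FILTERS = [64, 128, 256, 512, 512]

# Two hidden fully connected layers plus the final classification layer.
NUM_FC_LAYERS = 3


def weight_layer_count(depth):
    """Count the layers that carry weights: all convolutions plus the FC layers."""
    return sum(VGG_CONV_COUNTS[depth]) + NUM_FC_LAYERS


def describe(depth):
    """Return a (num_convs, num_filters) pair for each of the five groups."""
    return list(zip(VGG_CONV_COUNTS[depth], GROUP_FILTERS))


for depth in sorted(VGG_CONV_COUNTS):
    print(depth, weight_layer_count(depth), describe(depth))
```

Pooling layers have no weights, which is why they do not count toward the 11/13/16/19 totals.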
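VGGNet showed that a stack of several small $3 \times 3$ convolutions works better than a single large $5 \times 5$ convolution: for the same receptive field, it reduces the number of parameters while deepening the network, and the additional nonlinear layers improve the model's ability to learn complex patterns.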
Reference link: VGG网络模型详解 (A Detailed Explanation of the VGG Network Model)
Original paper: Very Deep Convolutional Networks for Large-Scale Image Recognition
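The parameter saving from stacking small convolutions can be checked with a little arithmetic. The sketch below assumes stride-1 convolutions, the same number of input and output channels, and no bias terms. The helper names and the choice of 64 channels are illustrative only.

```python
def conv_weights(kernel, channels):
    """Weights in one kernel x kernel convolution with `channels` in and out (no bias)."""
    return kernel * kernel * channels * channels


def stacked_receptive_field(kernel, num_layers):
    """Receptive field of `num_layers` stacked stride-1 kernel x kernel convolutions."""
    return 1 + num_layers * (kernel - 1)


channels = 64  # illustrative channel count
two_small = 2 * conv_weights(3, channels)  # two stacked 3x3 convolutions
one_large = conv_weights(5, channels)      # one 5x5 convolution

print(stacked_receptive_field(3, 2))  # 5, the same field as a single 5x5 convolution
print(two_small, one_large)           # 73728 102400
```

Under these assumptions the two 3x3 layers use 18/25 of the weights of the single 5x5 layer, and they apply two nonlinearities instead of one.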

In[  ]
# Unzip the flower dataset
!cd data/data2815 && unzip -qo flower_photos.zip
In[  ]
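# Preprocess the data into a standard format and split it into two parts: one for training and one for estimating accuracy.
# Each line of the generated train.txt and eval.txt represents one sample,
# in the form: path/to/image \t label_id, i.e. the sample's path and its label, separated by a tab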
import codecs
import os
import random
import shutil
from PIL import Image

train_ratio = 4.0 / 5

all_file_dir = 'data/data2815'
class_list = [c for c in os.listdir(all_file_dir) if os.path.isdir(os.path.join(all_file_dir, c)) and not c.endswith('Set') and not c.startswith('.')]
class_list.sort()
print(class_list)
train_image_dir = os.path.join(all_file_dir, "trainImageSet")
if not os.path.exists(train_image_dir):
    os.makedirs(train_image_dir)

eval_image_dir = os.path.join(all_file_dir, "evalImageSet")
if not os.path.exists(eval_image_dir):
    os.makedirs(eval_image_dir)

train_file = codecs.open(os.path.join(all_file_dir, "train.txt"), 'w')
eval_file = codecs.open(os.path.join(all_file_dir, "eval.txt"), 'w')

with codecs.open(os.path.join(all_file_dir, "label_list.txt"), "w") as label_list:
    label_id = 0
    for class_dir in class_list:
        label_list.write("{0}\t{1}\n".format(label_id, class_dir))
        image_path_pre = os.path.join(all_file_dir, class_dir)
        for file in os.listdir(image_path_pre):
            try:
                img = Image.open(os.path.join(image_path_pre, file))
                if random.uniform(0, 1) <= train_ratio:
                    shutil.copyfile(os.path.join(image_path_pre, file), os.path.join(train_image_dir, file))
                    train_file.write("{0}\t{1}\n".format(os.path.join(train_image_dir, file), label_id))
                else:
                    shutil.copyfile(os.path.join(image_path_pre, file), os.path.join(eval_image_dir, file))
                    eval_file.write("{0}\t{1}\n".format(os.path.join(eval_image_dir, file), label_id))
            except Exception as e:
                # Some files cannot be opened as images; skip them as a light cleaning step
                pass
        label_id += 1

train_file.close()
eval_file.close()
['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']
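The list files written by the cell above can be read back with a few lines of Python. The sketch below parses the `path<TAB>label_id` format and counts samples per label. The function names and the sample lines are hypothetical and are not part of the project's reader.

```python
from collections import Counter


def parse_list_line(line):
    """Split one 'path<TAB>label_id' line into (path, integer label)."""
    path, label = line.rstrip("\n").rsplit("\t", 1)
    return path, int(label)


def read_list_file(list_path):
    """Read a train.txt / eval.txt style file into a list of (path, label) pairs."""
    with open(list_path) as f:
        return [parse_list_line(line) for line in f if line.strip()]


def count_per_label(samples):
    """Count how many samples carry each label id."""
    return Counter(label for _, label in samples)


# Hypothetical lines in the same format as train.txt (file names are made up).
sample_lines = [
    "data/data2815/trainImageSet/example_a.jpg\t0\n",
    "data/data2815/trainImageSet/example_b.jpg\t2\n",
    "data/data2815/trainImageSet/example_c.jpg\t2\n",
]
samples = [parse_list_line(line) for line in sample_lines]
print(count_per_label(samples))
```

Because the split above draws from `random.uniform` without a fixed seed, the per-label counts change from run to run. Counting them after preprocessing is a quick way to confirm the split is close to the intended 4:1 ratio.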
In[4]
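# Train the VGG model on the flower dataset; the configuration can be changed in work/config.py
# By default the script resumes training from the last checkpoint; to train from scratch, set the `continue_train` parameter in config.py to False (the run logged below shows 'continue_train': False)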
!python work/train.py
{'input_size': [3, 224, 224], 'class_dim': 5, 'image_count': 2896, 'label_dict': {'daisy': 0, 'dandelion': 1, 'roses': 2, 'sunflowers': 3, 'tulips': 4}, 'data_dir': 'data/data2815', 'train_file_list': 'train.txt', 'eval_file_list': 'eval.txt', 'label_file': 'label_list.txt', 'continue_train': False, 'mode': 'train', 'num_epochs': 15, 'layer': 16, 'train_batch_size': 64, 'mean_rgb': [127.5, 127.5, 127.5], 'use_gpu': True, 'image_enhance_strategy': {'need_distort': True, 'need_rotate': True, 'need_crop': True, 'need_flip': True, 'hue_prob': 0.5, 'hue_delta': 18, 'contrast_prob': 0.5, 'contrast_delta': 0.5, 'saturation_prob': 0.5, 'saturation_delta': 0.5, 'brightness_prob': 0.5, 'brightness_delta': 0.125}, 'early_stop': {'sample_frequency': 50, 'successive_limit': 3, 'good_acc1': 0.92}, 'learning_strategy': {'name': 'cosine_decay', 'batch_size': 64, 'epochs': [40, 80, 100], 'steps': [0.1, 0.01, 0.001, 0.0001]}, 'lr': 0.000125}
W0309 12:44:42.877504   153 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0309 12:44:42.880957   153 device_context.cc:245] device: 0, cuDNN Version: 7.3.
2020-03-09 12:44:45,123-INFO: Loss at epoch 0 step 0: [1.6775398], acc: [0.203125]
2020-03-09 12:45:23,170-INFO: Loss at epoch 0 step 20: [1.5772853], acc: [0.328125]
2020-03-09 12:46:00,459-INFO: Loss at epoch 0 step 40: [1.5763767], acc: [0.25]
2020-03-09 12:46:28,171-INFO: Loss at epoch 1 step 0: [1.6362255], acc: [0.171875]
2020-03-09 12:47:04,452-INFO: Loss at epoch 1 step 20: [1.5832849], acc: [0.15625]
2020-03-09 12:47:42,195-INFO: Loss at epoch 1 step 40: [1.5955092], acc: [0.28125]
2020-03-09 12:48:10,122-INFO: Loss at epoch 2 step 0: [1.6067427], acc: [0.203125]
2020-03-09 12:48:46,660-INFO: Loss at epoch 2 step 20: [1.5652078], acc: [0.296875]
2020-03-09 12:49:25,004-INFO: Loss at epoch 2 step 40: [1.5488539], acc: [0.328125]
2020-03-09 12:49:52,381-INFO: Loss at epoch 3 step 0: [1.617861], acc: [0.203125]
2020-03-09 12:50:29,478-INFO: Loss at epoch 3 step 20: [1.6220372], acc: [0.234375]
2020-03-09 12:51:06,544-INFO: Loss at epoch 3 step 40: [1.5271487], acc: [0.328125]
2020-03-09 12:51:34,110-INFO: Loss at epoch 4 step 0: [1.575762], acc: [0.328125]
2020-03-09 12:52:11,308-INFO: Loss at epoch 4 step 20: [1.5116317], acc: [0.3125]
2020-03-09 12:52:48,375-INFO: Loss at epoch 4 step 40: [1.4684483], acc: [0.40625]
2020-03-09 12:53:15,972-INFO: Loss at epoch 5 step 0: [1.4961929], acc: [0.34375]
2020-03-09 12:53:52,429-INFO: Loss at epoch 5 step 20: [1.4398606], acc: [0.328125]
2020-03-09 12:54:28,990-INFO: Loss at epoch 5 step 40: [1.4618224], acc: [0.359375]
2020-03-09 12:54:56,302-INFO: Loss at epoch 6 step 0: [1.4387238], acc: [0.359375]
2020-03-09 12:55:33,677-INFO: Loss at epoch 6 step 20: [1.4354379], acc: [0.359375]
2020-03-09 12:56:11,423-INFO: Loss at epoch 6 step 40: [1.4143186], acc: [0.359375]
2020-03-09 12:56:39,305-INFO: Loss at epoch 7 step 0: [1.4664223], acc: [0.359375]
2020-03-09 12:57:14,006-INFO: Loss at epoch 7 step 20: [1.3116063], acc: [0.453125]
2020-03-09 12:57:50,826-INFO: Loss at epoch 7 step 40: [1.3123933], acc: [0.453125]
2020-03-09 12:58:18,438-INFO: Loss at epoch 8 step 0: [1.2403197], acc: [0.4375]
2020-03-09 12:58:54,727-INFO: Loss at epoch 8 step 20: [1.2308981], acc: [0.5]
2020-03-09 12:59:31,963-INFO: Loss at epoch 8 step 40: [1.3417854], acc: [0.34375]
2020-03-09 12:59:59,786-INFO: Loss at epoch 9 step 0: [1.3701968], acc: [0.4375]
2020-03-09 13:00:37,330-INFO: Loss at epoch 9 step 20: [1.3665786], acc: [0.421875]
2020-03-09 13:01:15,666-INFO: Loss at epoch 9 step 40: [1.3561741], acc: [0.40625]
2020-03-09 13:01:43,304-INFO: Loss at epoch 10 step 0: [1.2408481], acc: [0.4375]
2020-03-09 13:02:19,777-INFO: Loss at epoch 10 step 20: [1.2982098], acc: [0.40625]
2020-03-09 13:02:57,301-INFO: Loss at epoch 10 step 40: [1.2386811], acc: [0.484375]
2020-03-09 13:03:25,039-INFO: Loss at epoch 11 step 0: [1.233202], acc: [0.421875]
2020-03-09 13:04:01,287-INFO: Loss at epoch 11 step 20: [1.2279402], acc: [0.46875]
2020-03-09 13:04:38,998-INFO: Loss at epoch 11 step 40: [1.133811], acc: [0.515625]
2020-03-09 13:05:06,614-INFO: Loss at epoch 12 step 0: [1.2067374], acc: [0.515625]
2020-03-09 13:05:43,253-INFO: Loss at epoch 12 step 20: [1.2442422], acc: [0.46875]
2020-03-09 13:06:20,466-INFO: Loss at epoch 12 step 40: [1.2749608], acc: [0.46875]
2020-03-09 13:06:47,796-INFO: Loss at epoch 13 step 0: [1.2344071], acc: [0.390625]
2020-03-09 13:07:25,013-INFO: Loss at epoch 13 step 20: [1.2967558], acc: [0.390625]
2020-03-09 13:08:01,211-INFO: Loss at epoch 13 step 40: [1.2794902], acc: [0.421875]
2020-03-09 13:08:28,084-INFO: Loss at epoch 14 step 0: [1.1525229], acc: [0.546875]
2020-03-09 13:09:05,387-INFO: Loss at epoch 14 step 20: [1.2745286], acc: [0.4375]
2020-03-09 13:09:43,239-INFO: Loss at epoch 14 step 40: [1.26039], acc: [0.4375]
2020-03-09 13:10:10,909-INFO: Final loss: [1.1711379]
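The configuration printed at the start of training names `cosine_decay` as the learning strategy, with `lr` set to 0.000125 and `num_epochs` set to 15. How work/train.py applies the schedule is not shown here, so the following is only a sketch of the usual cosine annealing formula, assuming one update per epoch. The function name and signature are illustrative.

```python
import math


def cosine_decay(base_lr, epoch, total_epochs):
    """Cosine annealing: start at base_lr and decay smoothly toward 0 at total_epochs."""
    return base_lr * 0.5 * (math.cos(epoch * math.pi / total_epochs) + 1)


base_lr = 0.000125  # 'lr' from the printed configuration
num_epochs = 15     # 'num_epochs' from the printed configuration

for epoch in range(num_epochs):
    print(epoch, cosine_decay(base_lr, epoch, num_epochs))
```

The rate starts at the base value in epoch 0, falls to half of it at the midpoint, and approaches zero toward the final epoch.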
In[5]
# Evaluate the trained model on the validation set and print its accuracy over the whole validation set
!python work/eval.py
W0309 13:12:08.470460   213 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0309 13:12:08.474128   213 device_context.cc:245] device: 0, cuDNN Version: 7.3.
0.54521966
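The number printed above is the accuracy over the whole validation set. The internals of work/eval.py are not shown, so the sketch below only illustrates one common way to combine per-batch accuracies into a dataset-level figure: weight each batch by its size, so that a smaller final batch does not skew the result. The function name and the sample batches are hypothetical.

```python
def dataset_accuracy(batches):
    """Combine (batch_accuracy, batch_size) pairs into one sample-weighted accuracy."""
    total = sum(size for _, size in batches)
    if total == 0:
        raise ValueError("no samples to evaluate")
    correct = sum(acc * size for acc, size in batches)
    return correct / total


# Hypothetical example: a full batch of 64 samples and a final batch of 16.
batches = [(0.5, 64), (1.0, 16)]
print(dataset_accuracy(batches))  # 0.6, whereas a plain mean of 0.5 and 1.0 gives 0.75
```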
In[6]
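# Run inference with the model on the validation set. Each line of the validation file contains an image path and a label,
# so in test mode the reader returns only the image, not its label. In practice, adapt the corresponding part of the reader as needed
# This is only a demo, so the program exits after predicting 20 samples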
!python work/infer.py
W0309 13:12:29.462474   282 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0309 13:12:29.465914   282 device_context.cc:245] device: 0, cuDNN Version: 7.3.
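Sample 1 predicted as: roses
Sample 2 predicted as: dandelion
Sample 3 predicted as: sunflowers
Sample 4 predicted as: tulips
Sample 5 predicted as: tulips
Sample 6 predicted as: sunflowers
Sample 7 predicted as: tulips
Sample 8 predicted as: dandelion
Sample 9 predicted as: dandelion
Sample 10 predicted as: tulips
Sample 11 predicted as: dandelion
Sample 12 predicted as: dandelion
Sample 13 predicted as: sunflowers
Sample 14 predicted as: roses
Sample 15 predicted as: sunflowers
Sample 16 predicted as: daisy
Sample 17 predicted as: daisy
Sample 18 predicted as: roses
Sample 19 predicted as: dandelion
Sample 20 predicted as: roses
Done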
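Turning a prediction into a class name such as `roses` only needs the `id<TAB>name` mapping that the preprocessing step writes to label_list.txt. The sketch below assumes the model outputs one score per class. The helper names and the example scores are made up for illustration and are not taken from work/infer.py.

```python
def load_label_names(lines):
    """Build {label_id: class_name} from 'id<TAB>name' lines, as in label_list.txt."""
    names = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        label_id, name = line.split("\t")
        names[int(label_id)] = name
    return names


def predicted_name(scores, names):
    """Return the class name whose score is highest."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return names[best]


# Lines in the format written to label_list.txt for the five flower classes.
label_lines = ["0\tdaisy", "1\tdandelion", "2\troses", "3\tsunflowers", "4\ttulips"]
names = load_label_names(label_lines)

scores = [0.05, 0.10, 0.60, 0.15, 0.10]  # hypothetical per-class scores
print(predicted_name(scores, names))  # roses
```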

Follow this link to try the project hands-on in AI Studio: https://aistudio.baidu.com/aistudio/projectdetail/204999

