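Project Overview

This project uses PaddlePaddle to train and run inference with the ShuffleNet-V2 image classification network. Running on a GPU is recommended. For the dynamic-graph version, see: "Image Classification with PaddlePaddle: ShuffleNetV2 (dynamic graph version)"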

Installation Commands

## CPU version install command
pip install -f https://paddlepaddle.org.cn/pip/oschina/cpu paddlepaddle

## GPU version install command
pip install -f https://paddlepaddle.org.cn/pip/oschina/gpu paddlepaddle-gpu

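Model Overview

Design Guidelines

The authors first measured the runtime of several lightweight network models and broke it down by component, and from this derived four guidelines for lightweight network design. The figure below compares the runtime of different models at the same FLOPs:
[Figure: runtime of models with the same FLOPs; (a) GPU, (b) ARM]
Panels (a) and (b) compare the runtime of models with the same FLOPs on GPU and on ARM respectively. Even though the FLOPs are the same, the models still differ in speed, so the runtime of each component inside the model needs to be analyzed further, as in the figure below:
[Figure: runtime breakdown by component]
As the figure shows, two models with the same FLOPs spend clearly different amounts of time in each part. This mismatch has two main causes: 1) speed depends not only on FLOPs but also on memory access cost (MAC); 2) the degree of parallelism also affects speed, and models with higher parallelism run relatively faster. Combining theory and experiment, the authors therefore derived four practical design guidelines.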

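  1. Equal channel widths minimize memory access cost: use 1×1 convolutions whose input and output channel counts are balanced.
  2. Excessive group convolution increases MAC: use group convolution with care and watch the number of groups.
  3. Network fragmentation reduces parallelism. Some networks, such as Inception, favor a "multi-path" structure, i.e. a single block contains many different small convolutions or pooling operations, which easily fragments the network and lowers parallelism. Avoid network fragmentation.
  4. Element-wise operations cannot be ignored. Operations such as ReLU and Add have small FLOPs but a large MAC. Reduce element-wise operations.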
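
Guideline 1 can be checked with a little arithmetic. The sketch below assumes a 1×1 convolution on an h×w feature map, with FLOPs B = h·w·c_in·c_out and MAC = h·w·(c_in + c_out) + c_in·c_out, which gives the bound MAC ≥ 2·sqrt(h·w·B) + B/(h·w), reached when c_in = c_out. The function names and the channel pairs are illustrative choices that keep FLOPs roughly constant; they are not taken from the code in this project.

```python
import math


def conv1x1_cost(h, w, c_in, c_out):
    """Return (FLOPs, MAC) of a 1x1 convolution on an h x w feature map."""
    flops = h * w * c_in * c_out
    # MAC = read input feature map + write output feature map + read kernel weights
    mac = h * w * (c_in + c_out) + c_in * c_out
    return flops, mac


def mac_lower_bound(h, w, flops):
    """Lower bound on MAC for a fixed FLOPs budget (reached when c_in == c_out)."""
    return 2 * math.sqrt(h * w * flops) + flops / (h * w)


h = w = 56
# Illustrative channel pairs with similar FLOPs but increasingly unbalanced widths
channel_pairs = [(128, 128), (90, 180), (52, 312), (36, 432)]
results = []
for c_in, c_out in channel_pairs:
    flops, mac = conv1x1_cost(h, w, c_in, c_out)
    results.append((c_in, c_out, flops, mac))
    print(c_in, c_out, flops, mac, round(mac_lower_bound(h, w, flops)))
```

With FLOPs held roughly constant, MAC grows as the input/output ratio moves away from 1:1, and only the balanced pair meets the lower bound.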

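Network Structure

Based on the four guidelines above, the authors analyzed the shortcomings of the ShuffleNet-V1 design and improved on it to obtain ShuffleNet-V2. The building blocks of the two are compared in the figure below:
[Figure: (a) the basic ShuffleNet-V1 unit; (b) the ShuffleNet-V1 unit for spatial down sampling (2×); (c) ShuffleNet-V2 basic unit; (d) ShuffleNet-V2 unit for spatial down sampling (2×)]
Compared with V1, ShuffleNet-V2 introduces a new operation: channel split. Concretely, the input feature map is first split along the channel dimension into two branches with $C'$ and $C - C'$ channels; in practice $C' = C / 2$. The left branch is an identity mapping. The right branch contains three consecutive convolutions with equal input and output channel counts, which follows guideline 1. The two 1x1 convolutions are no longer group convolutions, which follows guideline 2; the two branches already act as two groups. The outputs of the two branches are no longer combined with an element-wise Add but are concatenated, and the concatenated result then goes through a channel shuffle so that information flows between the two branches. In fact, the concat, the channel shuffle and the channel split of the next unit can be merged into a single element-wise operation, which follows guideline 4. The overall network structure is given in the table below:
[Table: overall ShuffleNet-V2 network structure]
Paper: ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
Reference: ShuffleNetV2:轻量级CNN网络中的桂冠 (ShuffleNetV2: the crown among lightweight CNNs)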
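
To make channel split, concat and channel shuffle concrete, here is a minimal NumPy sketch. It is an illustration rather than the Paddle implementation used later in this post: each channel holds its own index, so the reordering is easy to read, and the right branch is left as an identity where a real unit would apply its three convolutions.

```python
import numpy as np


def channel_shuffle(x, groups):
    """Interleave channels across groups for an NCHW array."""
    n, c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by groups"
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)  # swap the group axis and the per-group channel axis
    return x.reshape(n, c, h, w)


# One sample, 8 channels, 1x1 spatial size; channel i holds the value i
x = np.arange(8).reshape(1, 8, 1, 1)

# Channel split with C' = C / 2
left, right = np.split(x, 2, axis=1)

# A real unit would run the convolutions on `right`; it is left unchanged here
out = np.concatenate([left, right], axis=1)

shuffled = channel_shuffle(out, groups=2)
order = shuffled[0, :, 0, 0].tolist()
print(order)  # [0, 4, 1, 5, 2, 6, 3, 7]
```

After the shuffle, channels from the two branches alternate, so the next unit's split hands each branch a mix of both.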

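Dataset

This project uses a public flower dataset. The archive contains five folders, one per flower species: daisy, dandelion, roses, sunflowers and tulips. Each class has between 690 and 890 images.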

In[  ]
# Unzip the flower dataset
!cd data/data2815 && unzip -qo flower_photos.zip
In[  ]
# Unzip the pretrained model parameters
!cd data/data6598 && unzip -q ShuffleNetV2_pretrained.zip
 

Preprocess the data into the required format

In[  ]
# Preprocess the data into a standard format and split it in two, one part for training and one for estimating accuracy
import codecs
import os
import random
import shutil
from PIL import Image

train_ratio = 4.0 / 5

all_file_dir = 'data/data2815'
class_list = [c for c in os.listdir(all_file_dir) if os.path.isdir(os.path.join(all_file_dir, c)) and not c.endswith('Set') and not c.startswith('.')]
class_list.sort()
print(class_list)
train_image_dir = os.path.join(all_file_dir, "trainImageSet")
if not os.path.exists(train_image_dir):
    os.makedirs(train_image_dir)

eval_image_dir = os.path.join(all_file_dir, "evalImageSet")
if not os.path.exists(eval_image_dir):
    os.makedirs(eval_image_dir)

train_file = codecs.open(os.path.join(all_file_dir, "train.txt"), 'w')
eval_file = codecs.open(os.path.join(all_file_dir, "eval.txt"), 'w')

with codecs.open(os.path.join(all_file_dir, "label_list.txt"), "w") as label_list:
    label_id = 0
    for class_dir in class_list:
        label_list.write("{0}\t{1}\n".format(label_id, class_dir))
        image_path_pre = os.path.join(all_file_dir, class_dir)
        for file in os.listdir(image_path_pre):
            try:
                img = Image.open(os.path.join(image_path_pre, file))
                if random.uniform(0, 1) <= train_ratio:
                    shutil.copyfile(os.path.join(image_path_pre, file), os.path.join(train_image_dir, file))
                    train_file.write("{0}\t{1}\n".format(os.path.join(train_image_dir, file), label_id))
                else:
                    shutil.copyfile(os.path.join(image_path_pre, file), os.path.join(eval_image_dir, file))
                    eval_file.write("{0}\t{1}\n".format(os.path.join(eval_image_dir, file), label_id))
            except Exception as e:
                # Some files cannot be opened; skip them as a light cleaning step
                pass
        label_id += 1

train_file.close()
eval_file.close()
['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']
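
The cell above assigns each file to a split with an independent random draw, so the exact train/eval sizes vary from run to run. If a reproducible split is wanted, one option is a seeded shuffle followed by a fixed cut. The sketch below is an assumption for illustration, not part of the original pipeline; `split_files` and the file names are hypothetical.

```python
import random


def split_files(files, train_ratio=0.8, seed=0):
    """Return (train, eval) lists with a fixed ratio and a reproducible order."""
    rng = random.Random(seed)   # local generator, does not touch global random state
    shuffled = sorted(files)    # sort first so the result does not depend on listing order
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for one class folder
names = ["img_{:03d}.jpg".format(i) for i in range(10)]
train_names, eval_names = split_files(names)
print(len(train_names), len(eval_names))  # 8 2
```

Applying this per class folder would also keep the class proportions the same in both splits.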
 

Main Training Script

In[  ]
# -*- coding: UTF-8 -*-
"""
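Train common vision backbone networks for classification tasks.
The training images and the class file label_list.txt must be placed in the same folder.
The program first reads label_list.txt and train.txt to get the number of classes and the number of images.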
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import numpy as np
import time
import math
import paddle
import paddle.fluid as fluid
import codecs
import logging

from paddle.fluid.initializer import MSRA
from paddle.fluid.initializer import Uniform
from paddle.fluid.param_attr import ParamAttr
from PIL import Image
from PIL import ImageEnhance

train_parameters = {
    "input_size": [3, 224, 224],
    "class_dim": -1,  # Number of classes; filled in by init_train_parameters()
    "image_count": -1,  # Number of training images; filled in by init_train_parameters()
    "label_dict": {},
    "data_dir": "data/data2815",  # Directory holding the training data
    "train_file_list": "train.txt",
    "label_file": "label_list.txt",
    "save_freeze_dir": "./freeze-model",
    "save_persistable_dir": "./persistable-params",
    "continue_train": False,        # Resume from the last saved parameters; takes priority over the pretrained model
    "pretrained": True,            # Whether to load the pretrained model
    "pretrained_dir": "data/data6598/ShuffleNet",
    "mode": "train",
    "num_epochs": 120,
    "train_batch_size": 30,
    "mean_rgb": [127.5, 127.5, 127.5],  # Per-channel mean; normally computed from the training data, here just the midpoint value
    "use_gpu": True,
    "image_enhance_strategy": {  # Image augmentation settings
        "need_distort": True,  # Enable color distortion
        "need_rotate": True,   # Enable random rotation
        "need_crop": True,      # Enable random cropping
        "need_flip": True,      # Enable random horizontal flip
        "hue_prob": 0.5,
        "hue_delta": 18,
        "contrast_prob": 0.5,
        "contrast_delta": 0.5,
        "saturation_prob": 0.5,
        "saturation_delta": 0.5,
        "brightness_prob": 0.5,
        "brightness_delta": 0.125
    },
    "early_stop": {
        "sample_frequency": 50,
        "successive_limit": 3,
        "good_acc1": 0.92
    },
    "rsm_strategy": {
        "learning_rate": 0.001,
        "lr_epochs": [20, 40, 60, 80, 100],
        "lr_decay": [1, 0.5, 0.25, 0.1, 0.01, 0.002]
    },
    "momentum_strategy": {
        "learning_rate": 0.001,
        "lr_epochs": [20, 40, 60, 80, 100],
        "lr_decay": [1, 0.5, 0.25, 0.1, 0.01, 0.002]
    },
    "sgd_strategy": {
        "learning_rate": 0.001,
        "lr_epochs": [20, 40, 60, 80, 100],
        "lr_decay": [1, 0.5, 0.25, 0.1, 0.01, 0.002]
    },
    "adam_strategy": {
        "learning_rate": 0.002
    }
}


class ShuffleNetV2():
    def __init__(self, scale=1.0):
        self.scale = scale

    def net(self, input, class_dim=1000):
        scale = self.scale
        stage_repeats = [4, 8, 4]

        if scale == 0.5:
            stage_out_channels = [-1, 24,  48,  96, 192, 1024]
        elif scale == 1.0:
            stage_out_channels = [-1, 24, 116, 232, 464, 1024]
        elif scale == 1.5:
            stage_out_channels = [-1, 24, 176, 352, 704, 1024]
        elif scale == 2.0:
            stage_out_channels = [-1, 24, 224, 488, 976, 2048]
        elif scale == 8.0:
            stage_out_channels = [-1, 48, 896, 1952, 3904, 8192]
        else:
            raise ValueError(
                "scale {} is not supported; choose one of 0.5, 1.0, 1.5, 2.0, 8.0".format(scale))

        #conv1

        input_channel = stage_out_channels[1]
        conv1 = self.conv_bn_layer(input=input, filter_size=3, num_filters=input_channel, padding=1, stride=2,name='stage1_conv')
        pool1 = fluid.layers.pool2d(input=conv1, pool_size=3, pool_stride=2, pool_padding=1, pool_type='max')
        conv = pool1
        # bottleneck sequences
        for idxstage in range(len(stage_repeats)):
            numrepeat = stage_repeats[idxstage]
            output_channel = stage_out_channels[idxstage+2]
            for i in range(numrepeat):
                if i == 0:
                    conv = self.inverted_residual_unit(input=conv, num_filters=output_channel, stride=2,
                                                       benchmodel=2,name=str(idxstage+2)+'_'+str(i+1))
                else:
                    conv = self.inverted_residual_unit(input=conv, num_filters=output_channel, stride=1,
                                                       benchmodel=1,name=str(idxstage+2)+'_'+str(i+1))

        conv_last = self.conv_bn_layer(input=conv, filter_size=1, num_filters=stage_out_channels[-1],
                                       padding=0, stride=1, name='conv5')
        pool_last = fluid.layers.pool2d(input=conv_last, pool_size=7, pool_stride=1, pool_padding=0, pool_type='avg')


        output = fluid.layers.fc(input=pool_last,
                                 size=class_dim,
                                 act="softmax",
                                 param_attr=ParamAttr(initializer=MSRA(),name='fc6_weights'),
                                 bias_attr=ParamAttr(name='fc6_offset'))
        return output


    def conv_bn_layer(self,
                  input,
                  filter_size,
                  num_filters,
                  stride,
                  padding,
                  num_groups=1,
                  use_cudnn=True,
                  if_act=True,
                  name=None):
#         print(num_groups)
        conv = fluid.layers.conv2d(
            input=input,
            num_filters=num_filters,
            filter_size=filter_size,
            stride=stride,
            padding=padding,
            groups=num_groups,
            act=None,
            use_cudnn=use_cudnn,
            param_attr=ParamAttr(initializer=MSRA(),name=name+'_weights'),
            bias_attr=False)
        out = int((input.shape[2] - 1)/float(stride) + 1)
       # print(input.shape[1],(out, out), num_filters, (filter_size, filter_size), stride, 
       #       (filter_size - 1) / 2, num_groups, name)
        bn_name = name + '_bn'
        if if_act:
            return fluid.layers.batch_norm(input=conv, act='swish',
                                           param_attr = ParamAttr(name=bn_name+"_scale"),
                                           bias_attr=ParamAttr(name=bn_name+"_offset"),
                                           moving_mean_name=bn_name + '_mean',
                                           moving_variance_name=bn_name + '_variance')
        else:
            return fluid.layers.batch_norm(input=conv,
                                           param_attr = ParamAttr(name=bn_name+"_scale"),
                                           bias_attr=ParamAttr(name=bn_name+"_offset"),
                                           moving_mean_name=bn_name + '_mean',
                                           moving_variance_name=bn_name + '_variance')


    def channel_shuffle(self, x, groups):
        batchsize, num_channels, height, width = x.shape[0], x.shape[1], x.shape[2], x.shape[3]
        channels_per_group = num_channels // groups

        # reshape
        x = fluid.layers.reshape(x=x, shape=[batchsize, groups, channels_per_group, height, width])

        x = fluid.layers.transpose(x=x, perm=[0,2,1,3,4])

        # flatten
        x = fluid.layers.reshape(x=x, shape=[batchsize, num_channels, height, width])

        return x


    def inverted_residual_unit(self, input, num_filters, stride, benchmodel, name=None):
        assert stride in [1, 2], \
            "supported stride are {} but your stride is {}".format([1,2], stride)

        oup_inc = num_filters//2
        inp = input.shape[1]

        if benchmodel == 1:
            x1, x2 = fluid.layers.split(
                input, num_or_sections=[input.shape[1]//2, input.shape[1]//2], dim=1)
#             x1 = input[:, :(input.shape[1]//2), :, :]
#             x2 = input[:, (input.shape[1]//2):, :, :]

            conv_pw = self.conv_bn_layer(
                input=x2,
                num_filters=oup_inc,
                filter_size=1,
                stride=1,
                padding=0,
                num_groups=1,
                if_act=True,
                name='stage_'+name+'_conv1')

            conv_dw = self.conv_bn_layer(
                input=conv_pw,
                num_filters=oup_inc,
                filter_size=3,
                stride=stride,
                padding=1,
                num_groups=oup_inc,
                if_act=False,
                use_cudnn=False,
                name='stage_'+name+'_conv2')

            conv_linear = self.conv_bn_layer(
                input=conv_dw,
                num_filters=oup_inc,
                filter_size=1,
                stride=1,
                padding=0,
                num_groups=1,
                if_act=True,
                name='stage_'+name+'_conv3')

            out = fluid.layers.concat([x1, conv_linear], axis=1)


        else:
            #branch1
            conv_dw_1 = self.conv_bn_layer(
                input=input,
                num_filters=inp,
                filter_size=3,
                stride=stride,
                padding=1,
                num_groups=inp,
                if_act=False,
                use_cudnn=False,
                name='stage_'+name+'_conv4')

            conv_linear_1 = self.conv_bn_layer(
                input=conv_dw_1,
                num_filters=oup_inc,
                filter_size=1,
                stride=1,
                padding=0,
                num_groups=1,
                if_act=True,
                name='stage_'+name+'_conv5')

            #branch2
            conv_pw_2 = self.conv_bn_layer(
                input=input,
                num_filters=oup_inc,
                filter_size=1,
                stride=1,
                padding=0,
                num_groups=1,
                if_act=True,
                name='stage_'+name+'_conv1')

            conv_dw_2 = self.conv_bn_layer(
                input=conv_pw_2,
                num_filters=oup_inc,
                filter_size=3,
                stride=stride,
                padding=1,
                num_groups=oup_inc,
                if_act=False,
                use_cudnn=False,
                name='stage_'+name+'_conv2')

            conv_linear_2 = self.conv_bn_layer(
                input=conv_dw_2,
                num_filters=oup_inc,
                filter_size=1,
                stride=1,
                padding=0,
                num_groups=1,
                if_act=True,
                name='stage_'+name+'_conv3')
            out = fluid.layers.concat([conv_linear_1, conv_linear_2], axis=1)

        return self.channel_shuffle(out, 2)


def init_log_config():
    """
    Initialize the logging configuration
    :return:
    """
    global logger
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    log_path = os.path.join(os.getcwd(), 'logs')
    if not os.path.exists(log_path):
        os.makedirs(log_path)
    log_name = os.path.join(log_path, 'train.log')
    sh = logging.StreamHandler()
    fh = logging.FileHandler(log_name, mode='w')
    fh.setLevel(logging.DEBUG)
    formatter = logging.Formatter("%(asctime)s - %(filename)s[line:%(lineno)d] - %(levelname)s: %(message)s")
    fh.setFormatter(formatter)
    sh.setFormatter(formatter)
    logger.addHandler(sh)
    logger.addHandler(fh)


def init_train_parameters():
    """
    Initialize the training parameters, mainly the image count and the number of classes
    :return:
    """
    train_file_list = os.path.join(train_parameters['data_dir'], train_parameters['train_file_list'])
    label_list = os.path.join(train_parameters['data_dir'], train_parameters['label_file'])
    index = 0
    with codecs.open(label_list, encoding='utf-8') as flist:
        lines = [line.strip() for line in flist]
        for line in lines:
            parts = line.strip().split()
            train_parameters['label_dict'][parts[1]] = int(parts[0])
            index += 1
        train_parameters['class_dim'] = index
    with codecs.open(train_file_list, encoding='utf-8') as flist:
        lines = [line.strip() for line in flist]
        train_parameters['image_count'] = len(lines)


def resize_img(img, target_size):
    """
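    Resize the image to the target size, ignoring aspect ratio
    :param img: PIL image
    :param target_size: [channels, height, width]
    :return: resized PIL image
    """
    # PIL expects (width, height); target_size is [C, H, W]
    img = img.resize((target_size[2], target_size[1]), Image.BILINEAR)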
    return img


def random_crop(img, scale=[0.08, 1.0], ratio=[3. / 4., 4. / 3.]):
    aspect_ratio = math.sqrt(np.random.uniform(*ratio))
    w = 1. * aspect_ratio
    h = 1. / aspect_ratio

    bound = min((float(img.size[0]) / img.size[1]) / (w**2),
                (float(img.size[1]) / img.size[0]) / (h**2))
    scale_max = min(scale[1], bound)
    scale_min = min(scale[0], bound)

    target_area = img.size[0] * img.size[1] * np.random.uniform(scale_min,
                                                                scale_max)
    target_size = math.sqrt(target_area)
    w = int(target_size * w)
    h = int(target_size * h)

    i = np.random.randint(0, img.size[0] - w + 1)
    j = np.random.randint(0, img.size[1] - h + 1)

    img = img.crop((i, j, i + w, j + h))
    img = img.resize((train_parameters['input_size'][1], train_parameters['input_size'][2]), Image.BILINEAR)
    return img


def rotate_image(img):
    """
    Image augmentation: rotate by a random angle
    """
    angle = np.random.randint(-14, 15)
    img = img.rotate(angle)
    return img


def random_brightness(img):
    """
    Image augmentation: brightness adjustment
    :param img:
    :return:
    """
    prob = np.random.uniform(0, 1)
    if prob < train_parameters['image_enhance_strategy']['brightness_prob']:
        brightness_delta = train_parameters['image_enhance_strategy']['brightness_delta']
        delta = np.random.uniform(-brightness_delta, brightness_delta) + 1
        img = ImageEnhance.Brightness(img).enhance(delta)
    return img


def random_contrast(img):
    """
    Image augmentation: contrast adjustment
    :param img:
    :return:
    """
    prob = np.random.uniform(0, 1)
    if prob < train_parameters['image_enhance_strategy']['contrast_prob']:
        contrast_delta = train_parameters['image_enhance_strategy']['contrast_delta']
        delta = np.random.uniform(-contrast_delta, contrast_delta) + 1
        img = ImageEnhance.Contrast(img).enhance(delta)
    return img


def random_saturation(img):
    """
    Image augmentation: saturation adjustment
    :param img:
    :return:
    """
    prob = np.random.uniform(0, 1)
    if prob < train_parameters['image_enhance_strategy']['saturation_prob']:
        saturation_delta = train_parameters['image_enhance_strategy']['saturation_delta']
        delta = np.random.uniform(-saturation_delta, saturation_delta) + 1
        img = ImageEnhance.Color(img).enhance(delta)
    return img


def random_hue(img):
    """
    Image augmentation: hue adjustment
    :param img:
    :return:
    """
    prob = np.random.uniform(0, 1)
    if prob < train_parameters['image_enhance_strategy']['hue_prob']:
        hue_delta = train_parameters['image_enhance_strategy']['hue_delta']
        delta = np.random.uniform(-hue_delta, hue_delta)
        img_hsv = np.array(img.convert('HSV'))
        img_hsv[:, :, 0] = img_hsv[:, :, 0] + delta
        img = Image.fromarray(img_hsv, mode='HSV').convert('RGB')
    return img


def distort_color(img):
    """
    Apply the color augmentations probabilistically, in a random order
    :param img:
    :return:
    """
    prob = np.random.uniform(0, 1)
    # Apply different distort order
    if prob < 0.35:
        img = random_brightness(img)
        img = random_contrast(img)
        img = random_saturation(img)
        img = random_hue(img)
    elif prob < 0.7:
        img = random_brightness(img)
        img = random_saturation(img)
        img = random_hue(img)
        img = random_contrast(img)
    return img


def custom_image_reader(file_list, data_dir, mode):
    """
    Custom image reader; the class count and image count must be initialized first
    :param file_list:
    :param data_dir:
    :param mode:
    :return:
    """
    with codecs.open(file_list) as flist:
        lines = [line.strip() for line in flist]

    def reader():
        np.random.shuffle(lines)
        for line in lines:
            if mode == 'train' or mode == 'val':
                img_path, label = line.split()
                img = Image.open(img_path)
                try:
                    if img.mode != 'RGB':
                        img = img.convert('RGB')
                    if train_parameters['image_enhance_strategy']['need_distort'] == True:
                        img = distort_color(img)
                    if train_parameters['image_enhance_strategy']['need_rotate'] == True:
                        img = rotate_image(img)
                    if train_parameters['image_enhance_strategy']['need_crop'] == True:
                        img = random_crop(img)  # use the default scale/ratio; the output size is read from train_parameters inside random_crop
                    if train_parameters['image_enhance_strategy']['need_flip'] == True:
                        mirror = int(np.random.uniform(0, 2))
                        if mirror == 1:
                            img = img.transpose(Image.FLIP_LEFT_RIGHT)
                    # HWC--->CHW && normalized
                    img = np.array(img).astype('float32')
                    img -= train_parameters['mean_rgb']
                    img = img.transpose((2, 0, 1))  # HWC to CHW
                    img *= 0.007843                 # scale pixels to roughly [-1, 1]; 0.007843 is about 1 / 127.5
                    yield img, int(label)
                except Exception as e:
                    pass                            # skip images that fail to load or process
            elif mode == 'test':
                img_path = os.path.join(data_dir, line)
                img = Image.open(img_path)
                if img.mode != 'RGB':
                    img = img.convert('RGB')
                img = resize_img(img, train_parameters['input_size'])
                # HWC--->CHW && normalized
                img = np.array(img).astype('float32')
                img -= train_parameters['mean_rgb']
                img = img.transpose((2, 0, 1))  # HWC to CHW
                img *= 0.007843  # scale pixels to roughly [-1, 1]; 0.007843 is about 1 / 127.5
                yield img

    return reader


def optimizer_momentum_setting():
    """
    A piecewise (step) learning rate suits relatively large training sets
    """
    learning_strategy = train_parameters['momentum_strategy']
    batch_size = train_parameters["train_batch_size"]
    iters = train_parameters["image_count"] // batch_size
    lr = learning_strategy['learning_rate']

    boundaries = [i * iters for i in learning_strategy["lr_epochs"]]
    values = [i * lr for i in learning_strategy["lr_decay"]]
    learning_rate = fluid.layers.piecewise_decay(boundaries, values)
    optimizer = fluid.optimizer.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9)
    return optimizer


def optimizer_rms_setting():
    """
    A piecewise (step) learning rate suits relatively large training sets
    """
    batch_size = train_parameters["train_batch_size"]
    iters = train_parameters["image_count"] // batch_size
    learning_strategy = train_parameters['rsm_strategy']
    lr = learning_strategy['learning_rate']

    boundaries = [i * iters for i in learning_strategy["lr_epochs"]]
    values = [i * lr for i in learning_strategy["lr_decay"]]

    optimizer = fluid.optimizer.RMSProp(
        learning_rate=fluid.layers.piecewise_decay(boundaries, values))

    return optimizer


def optimizer_sgd_setting():
    """
    The loss falls relatively slowly, but the final result is good; a piecewise (step) learning rate suits relatively large training sets
    """
    learning_strategy = train_parameters['sgd_strategy']
    batch_size = train_parameters["train_batch_size"]
    iters = train_parameters["image_count"] // batch_size
    lr = learning_strategy['learning_rate']

    boundaries = [i * iters for i in learning_strategy["lr_epochs"]]
    values = [i * lr for i in learning_strategy["lr_decay"]]
    learning_rate = fluid.layers.piecewise_decay(boundaries, values)
    optimizer = fluid.optimizer.SGD(learning_rate=learning_rate)
    return optimizer


def optimizer_adam_setting():
    """
    Lowers the loss quickly, but tends to lose steam in the later stages
    """
    learning_strategy = train_parameters['adam_strategy']
    learning_rate = learning_strategy['learning_rate']
    optimizer = fluid.optimizer.Adam(learning_rate=learning_rate)
    return optimizer


def load_params(exe, program):
    if train_parameters['continue_train'] and os.path.exists(train_parameters['save_persistable_dir']):
        logger.info('load params from retrain model')
        fluid.io.load_persistables(executor=exe,
                                   dirname=train_parameters['save_persistable_dir'],
                                   main_program=program)
    elif train_parameters['pretrained'] and os.path.exists(train_parameters['pretrained_dir']):
        logger.info('load params from pretrained model')
        def if_exist(var):
            return os.path.exists(os.path.join(train_parameters['pretrained_dir'], var.name))

        fluid.io.load_vars(exe, train_parameters['pretrained_dir'], main_program=program,
                           predicate=if_exist)


def train():
    train_prog = fluid.Program()
    train_startup = fluid.Program()
    logger.info("create prog success")
    logger.info("train config: %s", str(train_parameters))
    logger.info("build input custom reader and data feeder")
    file_list = os.path.join(train_parameters['data_dir'], "train.txt")
    mode = train_parameters['mode']
    batch_reader = paddle.batch(custom_image_reader(file_list, train_parameters['data_dir'], mode),
                                batch_size=train_parameters['train_batch_size'],
                                drop_last=False)
    batch_reader = paddle.reader.shuffle(batch_reader, train_parameters['train_batch_size'])
    place = fluid.CUDAPlace(0) if train_parameters['use_gpu'] else fluid.CPUPlace()
    # Define placeholders for the input data
    img = fluid.data(name='img', shape=[-1] + train_parameters['input_size'], dtype='float32')
    label = fluid.data(name='label', shape=[-1, 1], dtype='int64')
    feeder = fluid.DataFeeder(feed_list=[img, label], place=place)

    # Build the network
    logger.info("build newwork")
    model = ShuffleNetV2()
    out = model.net(input=img, class_dim=train_parameters['class_dim'])
    cost = fluid.layers.cross_entropy(out, label)
    avg_cost = fluid.layers.mean(x=cost)
    acc_top1 = fluid.layers.accuracy(input=out, label=label, k=1)
    # Choose an optimizer
    optimizer = optimizer_rms_setting()
    # optimizer = optimizer_momentum_setting()
    # optimizer = optimizer_sgd_setting()
    # optimizer = optimizer_adam_setting()
    optimizer.minimize(avg_cost)
    exe = fluid.Executor(place)

    main_program = fluid.default_main_program()
    exe.run(fluid.default_startup_program())
    train_fetch_list = [avg_cost.name, acc_top1.name, out.name]

    load_params(exe, main_program)

    # Main training loop
    stop_strategy = train_parameters['early_stop']
    successive_limit = stop_strategy['successive_limit']
    sample_freq = stop_strategy['sample_frequency']
    good_acc1 = stop_strategy['good_acc1']
    successive_count = 0
    stop_train = False
    total_batch_count = 0
    for pass_id in range(train_parameters["num_epochs"]):
        logger.info("current pass: %d, start read image", pass_id)
        batch_id = 0
        for step_id, data in enumerate(batch_reader()):
            t1 = time.time()
            # logger.info("data size:{0}".format(len(data)))
            loss, acc1, pred_ot = exe.run(main_program,
                                          feed=feeder.feed(data),
                                          fetch_list=train_fetch_list)
            t2 = time.time()
            batch_id += 1
            total_batch_count += 1
            period = t2 - t1
            loss = np.mean(np.array(loss))
            acc1 = np.mean(np.array(acc1))
            if batch_id % 10 == 0:
                logger.info("Pass {0}, trainbatch {1}, loss {2}, acc1 {3}, time {4}".format(pass_id, batch_id, loss, acc1,
                                                                                            "%2.2f sec" % period))
            # Simple early-stopping strategy: stop once the accuracy threshold is met several batches in a row
            if acc1 >= good_acc1:
                successive_count += 1
                logger.info("current acc1 {0} meets good {1}, successive count {2}".format(acc1, good_acc1, successive_count))
                fluid.io.save_inference_model(dirname=train_parameters['save_freeze_dir'],
                                              feeded_var_names=['img'],
                                              target_vars=[out],
                                              main_program=main_program,
                                              executor=exe)
                if successive_count >= successive_limit:
                    logger.info("end training")
                    stop_train = True
                    break
            else:
                successive_count = 0

            # General checkpointing strategy to limit the loss from an unexpected interruption
            if total_batch_count % sample_freq == 0:
                logger.info("temp save {0} batch train result, current acc1 {1}".format(total_batch_count, acc1))
                fluid.io.save_persistables(dirname=train_parameters['save_persistable_dir'],
                                           main_program=main_program,
                                           executor=exe)
        if stop_train:
            break
    logger.info("training till last epoch, end training")
    fluid.io.save_persistables(dirname=train_parameters['save_persistable_dir'],
                                           main_program=main_program,
                                           executor=exe)
    fluid.io.save_inference_model(dirname=train_parameters['save_freeze_dir'],
                                              feeded_var_names=['img'],
                                              target_vars=[out],
                                              main_program=main_program,
                                              executor=exe)


if __name__ == '__main__':
    init_log_config()
    init_train_parameters()
    train()
2020-02-12 16:55:17,346-INFO: create prog success
2020-02-12 16:55:17,346 - <ipython-input-4-29f844e8c1f3>[line:596] - INFO: create prog success
2020-02-12 16:55:17,348-INFO: train config: {'input_size': [3, 224, 224], 'class_dim': 5, 'image_count': 2937, 'label_dict': {'daisy': 0, 'dandelion': 1, 'roses': 2, 'sunflowers': 3, 'tulips': 4}, 'data_dir': 'data/data2815', 'train_file_list': 'train.txt', 'label_file': 'label_list.txt', 'save_freeze_dir': './freeze-model', 'save_persistable_dir': './persistable-params', 'continue_train': False, 'pretrained': True, 'pretrained_dir': 'data/data6598/ShuffleNet', 'mode': 'train', 'num_epochs': 120, 'train_batch_size': 30, 'mean_rgb': [127.5, 127.5, 127.5], 'use_gpu': True, 'image_enhance_strategy': {'need_distort': True, 'need_rotate': True, 'need_crop': True, 'need_flip': True, 'hue_prob': 0.5, 'hue_delta': 18, 'contrast_prob': 0.5, 'contrast_delta': 0.5, 'saturation_prob': 0.5, 'saturation_delta': 0.5, 'brightness_prob': 0.5, 'brightness_delta': 0.125}, 'early_stop': {'sample_frequency': 50, 'successive_limit': 3, 'good_acc1': 0.92}, 'rsm_strategy': {'learning_rate': 0.001, 'lr_epochs': [20, 40, 60, 80, 100], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.01, 0.002]}, 'momentum_strategy': {'learning_rate': 0.001, 'lr_epochs': [20, 40, 60, 80, 100], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.01, 0.002]}, 'sgd_strategy': {'learning_rate': 0.001, 'lr_epochs': [20, 40, 60, 80, 100], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.01, 0.002]}, 'adam_strategy': {'learning_rate': 0.002}}
2020-02-12 16:55:17,348 - <ipython-input-4-29f844e8c1f3>[line:597] - INFO: train config: {'input_size': [3, 224, 224], 'class_dim': 5, 'image_count': 2937, 'label_dict': {'daisy': 0, 'dandelion': 1, 'roses': 2, 'sunflowers': 3, 'tulips': 4}, 'data_dir': 'data/data2815', 'train_file_list': 'train.txt', 'label_file': 'label_list.txt', 'save_freeze_dir': './freeze-model', 'save_persistable_dir': './persistable-params', 'continue_train': False, 'pretrained': True, 'pretrained_dir': 'data/data6598/ShuffleNet', 'mode': 'train', 'num_epochs': 120, 'train_batch_size': 30, 'mean_rgb': [127.5, 127.5, 127.5], 'use_gpu': True, 'image_enhance_strategy': {'need_distort': True, 'need_rotate': True, 'need_crop': True, 'need_flip': True, 'hue_prob': 0.5, 'hue_delta': 18, 'contrast_prob': 0.5, 'contrast_delta': 0.5, 'saturation_prob': 0.5, 'saturation_delta': 0.5, 'brightness_prob': 0.5, 'brightness_delta': 0.125}, 'early_stop': {'sample_frequency': 50, 'successive_limit': 3, 'good_acc1': 0.92}, 'rsm_strategy': {'learning_rate': 0.001, 'lr_epochs': [20, 40, 60, 80, 100], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.01, 0.002]}, 'momentum_strategy': {'learning_rate': 0.001, 'lr_epochs': [20, 40, 60, 80, 100], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.01, 0.002]}, 'sgd_strategy': {'learning_rate': 0.001, 'lr_epochs': [20, 40, 60, 80, 100], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.01, 0.002]}, 'adam_strategy': {'learning_rate': 0.002}}
2020-02-12 16:55:17,349-INFO: build input custom reader and data feeder
2020-02-12 16:55:17,349 - <ipython-input-4-29f844e8c1f3>[line:598] - INFO: build input custom reader and data feeder
2020-02-12 16:55:17,352-INFO: build newwork
2020-02-12 16:55:17,352 - <ipython-input-4-29f844e8c1f3>[line:612] - INFO: build newwork
2020-02-12 16:55:20,810-INFO: load params from pretrained model
2020-02-12 16:55:20,810 - <ipython-input-4-29f844e8c1f3>[line:585] - INFO: load params from pretrained model
2020-02-12 16:55:20,967-INFO: current pass: 0, start read image
2020-02-12 16:55:20,967 - <ipython-input-4-29f844e8c1f3>[line:641] - INFO: current pass: 0, start read image
2020-02-12 16:55:30,606-INFO: Pass 0, trainbatch 10, loss 1.40152108669281, acc1 0.6666666865348816, time 0.09 sec
2020-02-12 16:55:30,606 - <ipython-input-4-29f844e8c1f3>[line:657] - INFO: Pass 0, trainbatch 10, loss 1.40152108669281, acc1 0.6666666865348816, time 0.09 sec
2020-02-12 16:55:31,495-INFO: Pass 0, trainbatch 20, loss 0.9425716400146484, acc1 0.7333333492279053, time 0.09 sec
2020-02-12 16:55:31,495 - <ipython-input-4-29f844e8c1f3>[line:657] - INFO: Pass 0, trainbatch 20, loss 0.9425716400146484, acc1 0.7333333492279053, time 0.09 sec
2020-02-12 16:55:32,552-INFO: Pass 0, trainbatch 30, loss 0.597796618938446, acc1 0.800000011920929, time 0.09 sec
2020-02-12 16:55:32,552 - <ipython-input-4-29f844e8c1f3>[line:657] - INFO: Pass 0, trainbatch 30, loss 0.597796618938446, acc1 0.800000011920929, time 0.09 sec
2020-02-12 16:55:41,431-INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:55:41,431 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:55:41,913-INFO: Pass 0, trainbatch 40, loss 0.7412126064300537, acc1 0.8333333134651184, time 0.10 sec
2020-02-12 16:55:41,913 - <ipython-input-4-29f844e8c1f3>[line:657] - INFO: Pass 0, trainbatch 40, loss 0.7412126064300537, acc1 0.8333333134651184, time 0.10 sec
2020-02-12 16:55:43,003-INFO: Pass 0, trainbatch 50, loss 0.6937267184257507, acc1 0.7666666507720947, time 0.09 sec
2020-02-12 16:55:43,003 - <ipython-input-4-29f844e8c1f3>[line:657] - INFO: Pass 0, trainbatch 50, loss 0.6937267184257507, acc1 0.7666666507720947, time 0.09 sec
2020-02-12 16:55:43,006-INFO: temp save 50 batch train result, current acc1 0.7666666507720947
2020-02-12 16:55:43,006 - <ipython-input-4-29f844e8c1f3>[line:676] - INFO: temp save 50 batch train result, current acc1 0.7666666507720947
2020-02-12 16:55:44,334-INFO: Pass 0, trainbatch 60, loss 0.5651506185531616, acc1 0.7333333492279053, time 0.09 sec
2020-02-12 16:55:44,334 - <ipython-input-4-29f844e8c1f3>[line:657] - INFO: Pass 0, trainbatch 60, loss 0.5651506185531616, acc1 0.7333333492279053, time 0.09 sec
2020-02-12 16:55:53,202-INFO: Pass 0, trainbatch 70, loss 0.3173483908176422, acc1 0.8666666746139526, time 0.09 sec
2020-02-12 16:55:53,202 - <ipython-input-4-29f844e8c1f3>[line:657] - INFO: Pass 0, trainbatch 70, loss 0.3173483908176422, acc1 0.8666666746139526, time 0.09 sec
2020-02-12 16:55:54,097-INFO: Pass 0, trainbatch 80, loss 0.21400924026966095, acc1 0.8999999761581421, time 0.09 sec
2020-02-12 16:55:54,097 - <ipython-input-4-29f844e8c1f3>[line:657] - INFO: Pass 0, trainbatch 80, loss 0.21400924026966095, acc1 0.8999999761581421, time 0.09 sec
2020-02-12 16:55:55,165-INFO: Pass 0, trainbatch 90, loss 0.393379807472229, acc1 0.800000011920929, time 0.09 sec
2020-02-12 16:55:55,165 - <ipython-input-4-29f844e8c1f3>[line:657] - INFO: Pass 0, trainbatch 90, loss 0.393379807472229, acc1 0.800000011920929, time 0.09 sec
2020-02-12 16:55:57,695-INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:55:57,695 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:55:58,598-INFO: current pass: 1, start read image
2020-02-12 16:55:58,598 - <ipython-input-4-29f844e8c1f3>[line:641] - INFO: current pass: 1, start read image
2020-02-12 16:56:06,936-INFO: current acc1 1.0 meets good 0.92, successive count 1
2020-02-12 16:56:06,936 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 1.0 meets good 0.92, successive count 1
2020-02-12 16:56:07,430-INFO: temp save 100 batch train result, current acc1 0.8333333134651184
2020-02-12 16:56:07,430 - <ipython-input-4-29f844e8c1f3>[line:676] - INFO: temp save 100 batch train result, current acc1 0.8333333134651184
2020-02-12 16:56:08,197-INFO: current acc1 0.9666666388511658 meets good 0.92, successive count 1
2020-02-12 16:56:08,197 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9666666388511658 meets good 0.92, successive count 1
2020-02-12 16:56:08,750-INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:08,750 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:09,739-INFO: Pass 1, trainbatch 10, loss 0.7846766710281372, acc1 0.800000011920929, time 0.09 sec
2020-02-12 16:56:09,739 - <ipython-input-4-29f844e8c1f3>[line:657] - INFO: Pass 1, trainbatch 10, loss 0.7846766710281372, acc1 0.800000011920929, time 0.09 sec
2020-02-12 16:56:09,918-INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:09,918 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:10,474-INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:10,474 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:11,116-INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:11,116 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:11,955-INFO: Pass 1, trainbatch 20, loss 0.3565816581249237, acc1 0.8999999761581421, time 0.09 sec
2020-02-12 16:56:11,955 - <ipython-input-4-29f844e8c1f3>[line:657] - INFO: Pass 1, trainbatch 20, loss 0.3565816581249237, acc1 0.8999999761581421, time 0.09 sec
2020-02-12 16:56:12,137-INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:12,137 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:12,690-INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:12,690 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:13,481-INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:13,481 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 1
2020-02-12 16:56:14,296-INFO: Pass 1, trainbatch 30, loss 0.22621259093284607, acc1 0.8999999761581421, time 0.09 sec
2020-02-12 16:56:14,296 - <ipython-input-4-29f844e8c1f3>[line:657] - INFO: Pass 1, trainbatch 30, loss 0.22621259093284607, acc1 0.8999999761581421, time 0.09 sec
2020-02-12 16:56:23,558-INFO: current acc1 0.9666666388511658 meets good 0.92, successive count 1
2020-02-12 16:56:23,558 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9666666388511658 meets good 0.92, successive count 1
2020-02-12 16:56:24,091-INFO: Pass 1, trainbatch 40, loss 0.5292304158210754, acc1 0.7666666507720947, time 0.12 sec
2020-02-12 16:56:24,091 - <ipython-input-4-29f844e8c1f3>[line:657] - INFO: Pass 1, trainbatch 40, loss 0.5292304158210754, acc1 0.7666666507720947, time 0.12 sec
2020-02-12 16:56:24,970-INFO: current acc1 0.9666666388511658 meets good 0.92, successive count 1
2020-02-12 16:56:24,970 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9666666388511658 meets good 0.92, successive count 1
2020-02-12 16:56:25,463-INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 2
2020-02-12 16:56:25,463 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 2
2020-02-12 16:56:25,966-INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 3
2020-02-12 16:56:25,966 - <ipython-input-4-29f844e8c1f3>[line:661] - INFO: current acc1 0.9333333373069763 meets good 0.92, successive count 3
2020-02-12 16:56:26,371-INFO: end training
2020-02-12 16:56:26,371 - <ipython-input-4-29f844e8c1f3>[line:668] - INFO: end training
2020-02-12 16:56:26,375-INFO: training till last epcho, end training
2020-02-12 16:56:26,375 - <ipython-input-4-29f844e8c1f3>[line:682] - INFO: training till last epcho, end training
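
The log above ends once "successive count" reaches 3, which matches the 'early_stop' entry in the train config ('good_acc1': 0.92, 'successive_limit': 3). The training loop itself is not shown in this part of the article, so the following is only a minimal sketch of one plausible reading of that behaviour: count consecutive batches whose acc1 meets the threshold, and reset the count when a batch falls below it. The function name `find_stop_index` and the sample accuracy list are hypothetical.

```python
def find_stop_index(acc_history, good_acc1=0.92, successive_limit=3):
    """Return the index of the batch at which training would stop, or None.

    Assumption: the counter resets whenever a batch falls below good_acc1.
    """
    successive = 0
    for index, acc in enumerate(acc_history):
        if acc >= good_acc1:
            successive += 1
            if successive >= successive_limit:
                return index
        else:
            successive = 0  # streak broken, start counting again
    return None


# Hypothetical per-batch accuracies, loosely modelled on the log above
accs = [0.93, 0.77, 0.97, 0.93, 0.93]
print(find_stop_index(accs))  # 4
```

Under this reading, an isolated good batch never ends training; only three good batches in a row do, which is why the log shows "successive count 1" many times before it reaches 2 and 3.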
 

Load the model saved during training and validate its accuracy

In[  ]
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import codecs
import os
import time

import numpy as np
import paddle.fluid as fluid
from PIL import Image

target_size = [3, 224, 224]  # CHW: channels, height, width
mean_rgb = [127.5, 127.5, 127.5]
data_dir = "data/data2815"
eval_file = "eval.txt"
use_gpu = True
place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
exe = fluid.Executor(place)
save_freeze_dir = "./freeze-model"
# Load the frozen inference program saved at the end of training
[inference_program, feed_target_names, fetch_targets] = fluid.io.load_inference_model(dirname=save_freeze_dir, executor=exe)
# print(fetch_targets)


def crop_image(img, target_size):
    """Center-crop a PIL image to target_size (CHW)."""
    width, height = img.size
    # Integer division keeps the crop box on whole pixels
    w_start = (width - target_size[2]) // 2
    h_start = (height - target_size[1]) // 2
    w_end = w_start + target_size[2]
    h_end = h_start + target_size[1]
    img = img.crop((w_start, h_start, w_end, h_end))
    return img


def resize_img(img, target_size):
    """Resize a PIL image to target_size (CHW). Not called by read_image below."""
    # PIL expects (width, height), i.e. (target_size[2], target_size[1])
    ret = img.resize((target_size[2], target_size[1]), Image.BILINEAR)
    return ret


def read_image(img_path):
    """Read an image and convert it to a normalized 1 x C x H x W float32 array."""
    img = Image.open(img_path)
    if img.mode != 'RGB':
        img = img.convert('RGB')
    img = crop_image(img, target_size)
    img = np.array(img).astype('float32')
    img -= mean_rgb
    img = img.transpose((2, 0, 1))  # HWC to CHW
    img *= 0.007843  # roughly 1 / 127.5, scales values to about [-1, 1]
    img = img[np.newaxis, :]  # add the batch dimension
    return img


def infer(image_path):
    """Run the inference program on one image and return the predicted class index."""
    tensor_img = read_image(image_path)
    label = exe.run(inference_program, feed={feed_target_names[0]: tensor_img}, fetch_list=fetch_targets)
    return np.argmax(label)


def eval_all():
    """Evaluate every "<image path> <label>" line of the eval file and print the accuracy."""
    eval_file_path = os.path.join(data_dir, eval_file)
    total_count = 0
    right_count = 0
    with codecs.open(eval_file_path, encoding='utf-8') as flist:
        lines = [line.strip() for line in flist]
    t1 = time.time()
    for line in lines:
        if not line:
            continue  # skip blank lines instead of failing on parts[0]
        total_count += 1
        parts = line.split()
        result = infer(parts[0])
        # print("infer result:{0} answer:{1}".format(result, parts[1]))
        if str(result) == parts[1]:
            right_count += 1
    period = time.time() - t1
    # Guard against an empty eval file
    accuracy = right_count / total_count if total_count else 0.0
    print("total eval count:{0} cost time:{1} predict accuracy:{2}".format(total_count, "%2.2f sec" % period, accuracy))


if __name__ == '__main__':
    eval_all()
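
The preprocessing in read_image comes down to two pieces of arithmetic: a center-crop box and a per-pixel normalization. The sketch below reproduces only that arithmetic in plain Python, without Paddle or PIL, so the numbers are easy to check. The helper names `center_crop_box` and `normalize_pixel` are illustrative and not part of the project code, and the 500 x 375 image size is an assumed example.

```python
def center_crop_box(width, height, crop_w=224, crop_h=224):
    """Return the (left, top, right, bottom) box of a centered crop."""
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


def normalize_pixel(value, mean=127.5, scale=0.007843):
    """Apply the same (value - mean) * scale step used in read_image."""
    return (value - mean) * scale


# Assumed example: a 500 x 375 input image
print(center_crop_box(500, 375))  # (138, 75, 362, 299)

# Pixel values 0 and 255 land very close to -1 and 1
print(normalize_pixel(0), normalize_pixel(255))
```

Because the scale factor is approximately 1 / 127.5, the network sees inputs in roughly the range [-1, 1]. Note that for an image smaller than 224 pixels on either side the box extends past the image borders; as far as I know Pillow fills that area with black rather than raising an error, which may affect accuracy on small images.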

Click the link to try this project hands-on in AI Studio: https://aistudio.baidu.com/aistudio/projectdetail/169416

