Not understanding the data flow in a UNET-like architecture, and problems with the output of Conv2DTranspose layers

Question

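I have one or two questions about the input dimensions of a modified U-Net architecture. To save you time and to make my results easier to understand and reproduce, I am posting the code and the output dimensions. The modified U-Net architecture is the MultiResUNet architecture from https://github.com/nibtehaz/MultiResUNet/blob/master/MultiResUNet.py and is based on the paper https://arxiv.org/abs/1902.04049. Please don't be put off by the length of the code. You can simply copy and paste it, and it should take no more than 10 seconds to reproduce my results. You also don't need a dataset for this. Tested with TF v1.9 and Keras v2.20.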

import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Conv2DTranspose, concatenate, BatchNormalization, Activation, add
from tensorflow.keras.models import Model
from tensorflow.keras.activations import relu

# 2D Convolutional layers
#
# Arguments:
#     x {keras layer} -- input layer
#     filters {int} -- number of filters
#     num_row {int} -- number of rows in filters
#     num_col {int} -- number of columns in filters
#
# Keyword Arguments:
#     padding {str} -- mode of padding (default: {'same'})
#     strides {tuple} -- stride of convolution operation (default: {(1, 1)})
#     activation {str} -- activation function (default: {'relu'})
#     name {str} -- name of the layer (default: {None})
#
# Returns:
#     [keras layer] -- [output layer]


def conv2d_bn(x, filters, num_row, num_col, padding='same', strides=(1, 1), activation='relu', name=None):

    x = Conv2D(filters, (num_row, num_col), strides=strides, padding=padding, use_bias=False)(x)
    x = BatchNormalization(axis=3, scale=False)(x)
    if activation is None:
        return x
    x = Activation(activation, name=name)(x)

    return x

# 2D Transposed Convolutional layers (with batch normalization)
#
# Arguments:
#     x {keras layer} -- input layer
#     filters {int} -- number of filters
#     num_row {int} -- number of rows in filters
#     num_col {int} -- number of columns in filters
#
# Keyword Arguments:
#     padding {str} -- mode of padding (default: {'same'})
#     strides {tuple} -- stride of convolution operation (default: {(2, 2)})
#     name {str} -- name of the layer (default: {None})
#
# Returns:
#     [keras layer] -- [output layer]

def trans_conv2d_bn(x, filters, num_row, num_col, padding='same', strides=(2, 2), name=None):

    x = Conv2DTranspose(filters, (num_row, num_col), strides=strides, padding=padding)(x)
    x = BatchNormalization(axis=3, scale=False)(x)

    return x

# MultiRes Block
#
# Arguments:
#     U {int} -- number of filters in a corresponding UNet stage
#     inp {keras layer} -- input layer
#
# Returns:
#     [keras layer] -- [output layer]

def MultiResBlock(U, inp, alpha = 1.67):

    W = alpha * U

    shortcut = inp

    shortcut = conv2d_bn(shortcut, int(W*0.167) + int(W*0.333) +
                         int(W*0.5), 1, 1, activation=None, padding='same')

    conv3x3 = conv2d_bn(inp, int(W*0.167), 3, 3,
                        activation='relu', padding='same')

    conv5x5 = conv2d_bn(conv3x3, int(W*0.333), 3, 3,
                        activation='relu', padding='same')

    conv7x7 = conv2d_bn(conv5x5, int(W*0.5), 3, 3,
                        activation='relu', padding='same')

    out = concatenate([conv3x3, conv5x5, conv7x7], axis=3)
    out = BatchNormalization(axis=3)(out)

    out = add([shortcut, out])
    out = Activation('relu')(out)
    out = BatchNormalization(axis=3)(out)

    return out

# ResPath
#
# Arguments:
#     filters {int} -- [description]
#     length {int} -- length of ResPath
#     inp {keras layer} -- input layer
#
# Returns:
#     [keras layer] -- [output layer]



def ResPath(filters, length, inp):
    shortcut = inp
    shortcut = conv2d_bn(shortcut, filters, 1, 1,
                         activation=None, padding='same')

    out = conv2d_bn(inp, filters, 3, 3, activation='relu', padding='same')

    out = add([shortcut, out])
    out = Activation('relu')(out)
    out = BatchNormalization(axis=3)(out)

    for i in range(length-1):

        shortcut = out
        shortcut = conv2d_bn(shortcut, filters, 1, 1,
                             activation=None, padding='same')

        out = conv2d_bn(out, filters, 3, 3, activation='relu', padding='same')

        out = add([shortcut, out])
        out = Activation('relu')(out)
        out = BatchNormalization(axis=3)(out)

    return out



# MultiResUNet
#
# Arguments:
#     height {int} -- height of image
#     width {int} -- width of image
#     n_channels {int} -- number of channels in image
#
# Returns:
#     [keras model] -- MultiResUNet model




def MultiResUnet(height, width, n_channels):



    inputs = Input((height, width, n_channels))

    # downsampling part begins here

    mresblock1 = MultiResBlock(32, inputs)
    pool1 = MaxPooling2D(pool_size=(2, 2))(mresblock1)
    mresblock1 = ResPath(32, 4, mresblock1)

    mresblock2 = MultiResBlock(32*2, pool1)
    pool2 = MaxPooling2D(pool_size=(2, 2))(mresblock2)
    mresblock2 = ResPath(32*2, 3, mresblock2)

    mresblock3 = MultiResBlock(32*4, pool2)
    pool3 = MaxPooling2D(pool_size=(2, 2))(mresblock3)
    mresblock3 = ResPath(32*4, 2, mresblock3)

    mresblock4 = MultiResBlock(32*8, pool3)


    # Upsampling part

    up5 = concatenate([Conv2DTranspose(
        32*4, (2, 2), strides=(2, 2), padding='same')(mresblock4), mresblock3], axis=3)
    mresblock5 = MultiResBlock(32*8, up5)

    up6 = concatenate([Conv2DTranspose(
        32*4, (2, 2), strides=(2, 2), padding='same')(mresblock5), mresblock2], axis=3)
    mresblock6 = MultiResBlock(32*4, up6)

    up7 = concatenate([Conv2DTranspose(
        32*2, (2, 2), strides=(2, 2), padding='same')(mresblock6), mresblock1], axis=3)
    mresblock7 = MultiResBlock(32*2, up7)


    conv8 = conv2d_bn(mresblock7, 1, 1, 1, activation='sigmoid')

    model = Model(inputs=[inputs], outputs=[conv8])

    return model

Now back to my problem with mismatched input/output dimensions in the UNET architecture.

If I choose an input height/width of (128,128), (256,256) or (512,512) and run:

 model = MultiResUnet(128, 128,3)
 display(model.summary())

TensorFlow gives me a complete summary of what the whole architecture looks like. Now if I do this:

     model = MultiResUnet(36, 36,3)
     display(model.summary())

This raises an error.

Why does Conv2DTranspose give me the wrong dimensions instead of the ones I expect?

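Why does the Concat function not complain when I choose input sizes such as (128,128), (256,256), etc. (multiples of 32), but does complain for sizes that are not multiples of 32? To generalize the question: how can I make this UNET architecture work for any input size, and how should I handle Conv2DTranspose layers whose output is one short (in width/height) of the dimension actually needed, when the input size is not a multiple of 32 or is not symmetric? And what if I have variable input sizes?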

Any help would be greatly appreciated.

Cheers, h

Recommended answer

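U-Net family models (such as the MultiResUNet model above) follow an encoder-decoder architecture. The encoder is the downsampling path, which performs feature extraction, while the decoder is the upsampling path. Feature maps from the encoder are concatenated in the decoder through skip connections. These feature maps are concatenated along the last axis, the 'channel' axis (assuming the features have dimensions [batch_size, height, width, channels]). For features to be concatenated along any axis (the 'channel' axis in our case), the dimensions on all other axes must match.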
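
The matching rule can be sketched without TensorFlow. The helper below, `concat_channels_shape`, is a hypothetical name introduced purely for illustration. It only mimics the shape check that a channel-axis concatenation performs, assuming channels-last shapes of the form (batch, height, width, channels); it is not the Keras implementation.

```python
def concat_channels_shape(shape_a, shape_b):
    """Return the shape of a channel-axis concatenation of two NHWC shapes.

    Hypothetical helper for illustration only: it mimics the shape rule,
    not the actual Keras implementation.
    """
    # Every axis except the last one (channels) must agree.
    if shape_a[:-1] != shape_b[:-1]:
        raise ValueError(
            "cannot concatenate %s and %s: non-channel axes differ"
            % (shape_a, shape_b)
        )
    # The channel counts simply add up.
    return shape_a[:-1] + (shape_a[-1] + shape_b[-1],)


# Matching height/width: the concatenation is accepted.
print(concat_channels_shape((None, 16, 16, 128), (None, 16, 16, 128)))

# Mismatched height/width: the concatenation is rejected.
try:
    concat_channels_shape((None, 8, 8, 128), (None, 9, 9, 128))
except ValueError as err:
    print(err)
```

The first call yields a shape with 256 channels, while the second raises because 8 and 9 differ on the height and width axes.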

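In the model architecture above, 3 downsampling/max-pooling operations (via MaxPooling2D) are performed in the encoder path. In the decoder path, 3 upsampling/transposed-convolution operations are performed, with the aim of restoring the image to full size. However, for the concatenation (through the skip connections) to work, the height, width and batch_size of the downsampled and upsampled features must stay the same at each 'level' of the model. I will illustrate this with the examples you mentioned in the question: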

Case 1: input dimensions (128,128,3): 128 -> 64 -> 32 -> 16 -> 32 -> 64 -> 128

Case 2: input dimensions (36,36,3): 36 -> 18 -> 9 -> 4 -> 8 -> 16 -> 32

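In the second case, once the height and width of the feature map reach 9 in the encoder path, further downsampling causes a change (loss) in dimensions that cannot be recovered in the decoder: max pooling rounds 9/2 down to 4, and upsampling 4 gives 8, not 9. As a result, it throws an error because it cannot concatenate [(None,8,8,128)] & [(None,9,9,128)].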
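
The two cases can be reproduced with plain integer arithmetic. This is a minimal sketch; `trace_sizes` and `skip_mismatches` are hypothetical names. It assumes each MaxPooling2D uses a pool size of 2 with the default 'valid' padding (floor division by 2) and each Conv2DTranspose uses stride 2 with 'same' padding (exact doubling), as in the model from the question.

```python
def trace_sizes(size, n_levels=3):
    """Trace one spatial dimension through the encoder and the decoder.

    Assumption: pooling halves with floor division, and the transposed
    convolution doubles exactly (stride 2, 'same' padding).
    """
    encoder = [size]
    for _ in range(n_levels):
        encoder.append(encoder[-1] // 2)  # pooling drops any odd remainder

    decoder = [encoder[-1]]
    for _ in range(n_levels):
        decoder.append(decoder[-1] * 2)  # transposed convolution doubles

    return encoder, decoder


def skip_mismatches(size, n_levels=3):
    """List (encoder_size, decoder_size) pairs that differ at a skip connection."""
    encoder, decoder = trace_sizes(size, n_levels)
    # Walk up from the bottleneck: each upsampled size is compared with the
    # encoder size taken just before the corresponding pooling step.
    pairs = zip(reversed(encoder[:-1]), decoder[1:])
    return [(e, d) for e, d in pairs if e != d]


print(trace_sizes(128))      # ([128, 64, 32, 16], [16, 32, 64, 128])
print(trace_sizes(36))       # ([36, 18, 9, 4], [4, 8, 16, 32])
print(skip_mismatches(128))  # []
print(skip_mismatches(36))   # [(9, 8), (18, 16), (36, 32)]
```

The first mismatching pair for an input of 36, (9, 8), is the one reported in the error, since the model fails at the first concatenation it cannot build.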

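In general, for a simple encoder-decoder model (with skip connections) that has 'n' downsampling (MaxPooling2D) layers, the input dimensions must be a multiple of 2^n for the encoder features to be concatenated in the decoder. In this case n=3, so the input must be a multiple of 8 to avoid these dimension mismatch errors.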
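
One common workaround for arbitrary input sizes, which the answer above does not itself propose, is to pad each spatial dimension up to the next multiple of 2^n before the model and crop the output back afterwards. The sketch below only computes the sizes and padding amounts; the function names are hypothetical, and how the padding is applied (for example with a padding layer in front of the model) is left as an assumption.

```python
def is_valid_input_size(size, n_levels=3):
    """True if `size` survives n_levels halvings without losing a pixel."""
    return size % (2 ** n_levels) == 0


def next_valid_size(size, n_levels=3):
    """Smallest multiple of 2**n_levels that is greater than or equal to `size`."""
    factor = 2 ** n_levels
    return -(-size // factor) * factor  # ceiling division, then scale back up


def padding_for(size, n_levels=3):
    """Split the required padding into (before, after) amounts."""
    total = next_valid_size(size, n_levels) - size
    before = total // 2
    return before, total - before


print(is_valid_input_size(128))  # True
print(is_valid_input_size(36))   # False
print(next_valid_size(36))       # 40
print(padding_for(36))           # (2, 2)
print(padding_for(37))           # (1, 2)
```

With n=3, an input of 36 would be padded to 40, which then traces as 40 -> 20 -> 10 -> 5 -> 10 -> 20 -> 40 and matches at every level.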

Hope this helps! :)

