This article looks at why Keras BatchNormalization only works with a constant batch dimension when axis=0; hopefully the explanation below is a useful reference for anyone hitting the same problem.

Problem description

The following code shows one approach that works and another that fails.

BatchNorm on axis=0 should not depend on the batch size, or if it does, that should be explicitly stated in the docs.

In [118]: tf.__version__
Out[118]: '2.0.0-beta1'



import numpy as np
import tensorflow as tf

class M(tf.keras.Model):

    def __init__(self, axis):
        super().__init__()
        self.layer = tf.keras.layers.BatchNormalization(axis=axis, scale=False, center=True, input_shape=(6,))

    def call(self, x):
        out = self.layer(x)
        return out

def fails():
    m = M(axis=0)
    x = np.random.randn(3, 6).astype(np.float32)
    print(m(x))
    x = np.random.randn(2, 6).astype(np.float32)
    print(m(x))

def ok():
    m = M(axis=1)
    x = np.random.randn(3, 6).astype(np.float32)
    print(m(x))
    x = np.random.randn(2, 6).astype(np.float32)
    print(m(x))

EDIT:

The axis in the args is not the axis you think it is.

Solution

As stated in this answer and the Keras doc, the axis argument indicates the feature axis. This makes sense because we want feature-wise normalization, i.e. to normalize each feature over the whole input batch (consistent with the feature-wise normalization we may do on images, e.g. subtracting the "mean pixel" from all the images of a dataset).
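The feature-wise normalization described above can be sketched in plain numpy: statistics are computed over the batch axis, giving one mean and variance per feature (the 1e-3 epsilon mirrors BatchNormalization's default):

```python
import numpy as np

# A toy batch: 3 samples, 6 features (shape (batch, features)).
x = np.random.randn(3, 6).astype(np.float32)

# Feature-wise normalization: statistics are reduced over the batch
# axis (axis 0), producing one mean/variance per feature.
mean = x.mean(axis=0)   # shape (6,)
var = x.var(axis=0)     # shape (6,)
normalized = (x - mean) / np.sqrt(var + 1e-3)

print(normalized.mean(axis=0))  # approximately 0 for every feature
```

This is only the training-time batch statistics; the actual layer additionally keeps moving averages for inference and applies the learned beta/gamma parameters.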

Now, the fails() method you have written fails on this line:

x = np.random.randn(2, 6).astype(np.float32)
print(m(x))

That's because you set the feature axis to 0, i.e. the first axis, when building the model, and therefore when the following lines get executed before the code above:

x = np.random.randn(3, 6).astype(np.float32)
print(m(x))

the layer's weights are built based on 3 features (don't forget you indicated the feature axis as 0, so an input of shape (3, 6) has 3 features). So when you give it an input tensor of shape (2, 6), it correctly raises an error, because that tensor has 2 features and the normalization cannot be done due to this mismatch.
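A rough numpy sketch of the shape bookkeeping (not the actual Keras implementation) shows why the second call must fail. The layer sizes its per-feature parameters on the first call, one entry per position along the chosen feature axis:

```python
import numpy as np

def build_beta(x, axis):
    # Mimics how the layer sizes its per-feature parameters on first
    # call: one parameter per entry along the feature axis.
    return np.zeros(x.shape[axis], dtype=np.float32)

x1 = np.random.randn(3, 6).astype(np.float32)
beta = build_beta(x1, axis=0)   # shape (3,), fixed from the first batch

x2 = np.random.randn(2, 6).astype(np.float32)
# beta has 3 entries but x2 has only 2 entries along axis 0, so the
# broadcast fails -- the same mismatch the Keras layer reports.
try:
    _ = x2 - beta[:, None]
except ValueError as e:
    print("mismatch:", e)
```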

On the other hand, the ok() method works because the feature axis is the last axis, so both input tensors have the same number of features, namely 6. Normalization can therefore be done in both cases for all the features.
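The working case can also be sketched in numpy, assuming (as the layer does at training time) that statistics are reduced over every axis except the feature axis. With the feature axis last, the parameter size is 6 regardless of batch size, so any batch works:

```python
import numpy as np

def normalize(x, axis):
    # Reduce over every axis except the feature axis, as the layer
    # does with batch statistics at training time.
    reduce_axes = tuple(i for i in range(x.ndim) if i != axis)
    mean = x.mean(axis=reduce_axes, keepdims=True)
    var = x.var(axis=reduce_axes, keepdims=True)
    return (x - mean) / np.sqrt(var + 1e-3)

# With the feature axis last, both batch sizes normalize fine:
for batch in (3, 2):
    out = normalize(np.random.randn(batch, 6).astype(np.float32), axis=1)
    print(out.shape)
```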


That concludes this look at why Keras BatchNormalization only works with a constant batch dimension when axis=0; hopefully the answer above is helpful.
