Can you explain the Keras get_weights() function in a neural network using BatchNormalization?

Problem Description

When I run a neural network (without BatchNormalization) in Keras, I understand how the get_weights() function provides the weights and bias of the NN. However, with BatchNorm it produces 4 extra parameters, which I assume are Gamma, Beta, Mean and Std.

I have tried to replicate a simple NN manually using these saved values, but I cannot get them to produce the right output. Does anyone know how these values work?

Screenshot: without Batch Norm

Screenshot: with Batch Norm

Recommended Answer

I will take an example to explain get_weights() in the case of a simple Multi Layer Perceptron (MLP) and an MLP with Batch Normalization (BN).

Example: Say we are working on the MNIST dataset and using a 2-layer MLP architecture (i.e. 2 hidden layers). The number of neurons in hidden layer 1 is 392 and the number of neurons in hidden layer 2 is 196. So the final architecture of our MLP will be 784 x 392 x 196 x 10.

Here 784 is the input image dimension (28 x 28 pixels flattened) and 10 is the output layer dimension (one per digit class).
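For context, here is a minimal sketch of how the MNIST data could be prepared for this 784 x 392 x 196 x 10 MLP. The original post does not show any preprocessing code, so treat this as an assumption:

from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Flatten each 28 x 28 image into a 784-dimensional vector and one-hot encode the 10 classes.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)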

Case 1: MLP without Batch Normalization => Let my model name be model_relu, which uses the ReLU activation function. Now, after training model_relu, I call get_weights(). This will return a list of size 6, as shown in the screenshot below.

get_weights() for the simple MLP without Batch Norm; the list values are as follows:

  • (784, 392): weights for hidden layer 1
  • (392,): bias associated with weights of hidden layer 1
  • (392, 196): weights for hidden layer 2
  • (196,): bias associated with weights of hidden layer 2
  • (196, 10): weights for the output layer
  • (10,): bias associated with weights of the output layer
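A minimal sketch of what model_relu might look like. The original answer does not show the model definition, so the layer code below is an assumption; only the weight shapes it produces are taken from the answer:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Hypothetical definition of model_relu: 784 -> 392 -> 196 -> 10 with ReLU activations.
model_relu = Sequential([
    Dense(392, activation="relu", input_shape=(784,)),  # hidden layer 1
    Dense(196, activation="relu"),                       # hidden layer 2
    Dense(10, activation="softmax"),                     # output layer
])
model_relu.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# get_weights() returns a kernel and a bias per Dense layer, i.e. 6 arrays in total.
for w in model_relu.get_weights():
    print(w.shape)
# (784, 392) (392,) (392, 196) (196,) (196, 10) (10,)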

Case 2: MLP with Batch Normalization => Let my model name be model_batch, which also uses the ReLU activation function along with Batch Normalization. Now, after training model_batch, I call get_weights(). This will return a list of size 14, as shown in the screenshot below.

get_weights() for the MLP with Batch Norm; the list values are as follows:

  • (784, 392): weights for hidden layer 1
  • (392,): bias associated with weights of hidden layer 1
  • (392,) (392,) (392,) (392,): these four parameters are the gamma, beta, moving mean and moving variance values (size 392 each) of the Batch Normalization applied to hidden layer 1
  • (392, 196): weights for hidden layer 2
  • (196,): bias associated with weights of hidden layer 2
  • (196,) (196,) (196,) (196,): these four parameters are the gamma, beta, moving mean and moving variance values (size 196 each) of the Batch Normalization applied to hidden layer 2
  • (196, 10): weights for the output layer
  • (10,): bias associated with weights of the output layer
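Similarly, here is a hypothetical definition of model_batch that yields 14 weight arrays in the order described above. The exact placement of BatchNormalization relative to the activation is not stated in the original answer, so this layout is an assumption:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization

model_batch = Sequential([
    Dense(392, activation="relu", input_shape=(784,)),  # hidden layer 1: kernel + bias
    BatchNormalization(),                                # gamma, beta, moving mean, moving variance, each (392,)
    Dense(196, activation="relu"),                       # hidden layer 2: kernel + bias
    BatchNormalization(),                                # gamma, beta, moving mean, moving variance, each (196,)
    Dense(10, activation="softmax"),                     # output layer: kernel + bias
])
model_batch.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# 2 + 4 + 2 + 4 + 2 = 14 arrays in total.
for w in model_batch.get_weights():
    print(w.shape)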

So, in Case 2, if you want to get the weights for hidden layer 1, hidden layer 2 and the output layer, the Python code can be something like this:

weights = model_batch.get_weights()
hidden_layer1_wt = weights[0].flatten().reshape(-1, 1)    # (784, 392) kernel flattened to a column vector
hidden_layer2_wt = weights[6].flatten().reshape(-1, 1)    # (392, 196) kernel flattened to a column vector
output_layer_wt = weights[12].flatten().reshape(-1, 1)    # (196, 10) kernel flattened to a column vector
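Since the question mentions trying to replicate the network by hand, here is a sketch of how the forward pass of model_batch could be recomputed with NumPy. It assumes the Dense -> ReLU -> BatchNormalization ordering used in the sketch above and that eps matches the layer's default epsilon (1e-3 in Keras); note that the stored BatchNormalization statistics are the moving mean and moving variance, not the standard deviation.

import numpy as np

weights = model_batch.get_weights()

def manual_forward(x, weights, eps=1e-3):
    # Hidden layer 1: Dense + ReLU, then batch norm using the stored statistics.
    h = np.maximum(0.0, x @ weights[0] + weights[1])
    gamma, beta, mean, var = weights[2], weights[3], weights[4], weights[5]
    h = gamma * (h - mean) / np.sqrt(var + eps) + beta
    # Hidden layer 2: Dense + ReLU, then batch norm.
    h = np.maximum(0.0, h @ weights[6] + weights[7])
    gamma, beta, mean, var = weights[8], weights[9], weights[10], weights[11]
    h = gamma * (h - mean) / np.sqrt(var + eps) + beta
    # Output layer: Dense + softmax.
    logits = h @ weights[12] + weights[13]
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

If these assumptions hold for your model, manual_forward(x_test, weights) should closely match model_batch.predict(x_test).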

Hope this helps!

Reference: keras - BatchNormalization

