Problem description
I am training the following model:
    with slim.arg_scope(inception_arg_scope(is_training=True)):
        logits_v, endpoints_v = inception_v3(all_v, num_classes=25, is_training=True, dropout_keep_prob=0.8,
                                             spatial_squeeze=True, reuse=reuse_variables, scope='vis')
        logits_p, endpoints_p = inception_v3(all_p, num_classes=25, is_training=True, dropout_keep_prob=0.8,
                                             spatial_squeeze=True, reuse=reuse_variables, scope='pol')
        pol_features = endpoints_p['pol/features']
        vis_features = endpoints_v['vis/features']

    eps = 1e-08
    loss = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(pol_features - vis_features), axis=1, keep_dims=True), eps))

    # rest of code
    saver = tf.train.Saver(tf.global_variables())
where
    def inception_arg_scope(weight_decay=0.00004,
                            batch_norm_decay=0.9997,
                            batch_norm_epsilon=0.001, is_training=True):
        normalizer_params = {
            'decay': batch_norm_decay,
            'epsilon': batch_norm_epsilon,
            'is_training': is_training
        }
        normalizer_fn = tf.contrib.layers.batch_norm

        # Set weight_decay for weights in Conv and FC layers.
        with slim.arg_scope([slim.conv2d, slim.fully_connected],
                            weights_regularizer=slim.l2_regularizer(weight_decay)):
            with slim.arg_scope([slim.batch_norm, slim.dropout], is_training=is_training):
                with slim.arg_scope(
                        [slim.conv2d],
                        weights_initializer=slim.variance_scaling_initializer(),
                        activation_fn=tf.nn.relu,
                        normalizer_fn=normalizer_fn,
                        normalizer_params=normalizer_params) as sc:
                    return sc
and inception_v3 is defined here. My model trains very well and the loss goes from 60 to less than 1. But when I want to test the model in another file:
    with slim.arg_scope(inception_arg_scope(is_training=False)):
        logits_v, endpoints_v = inception_v3(all_v, num_classes=25, is_training=False, dropout_keep_prob=0.8,
                                             spatial_squeeze=True, reuse=reuse_variables, scope='vis')
        logits_p, endpoints_p = inception_v3(all_p, num_classes=25, is_training=False, dropout_keep_prob=0.8,
                                             spatial_squeeze=True, reuse=reuse_variables, scope='pol')
it gives me nonsensical results; more precisely, the loss is 1e-8 for all the train and test samples. When I change to is_training=True, it gives more reasonable results, but the loss is still larger than in the training phase (even when I am testing on the training data). I have the same problem with VGG16: I get 100% accuracy on my test set when I use VGG without batch_norm and 0% when I use batch_norm.
What am I missing here? Thank you.
Recommended answer
I met the same problem and solved it. When you use slim.batch_norm, be sure to use slim.learning.create_train_op instead of tf.train.GradientDescentOptimizer(lr).minimize(loss) or any other optimizer's minimize call. slim.batch_norm puts the ops that update its moving mean and variance into the tf.GraphKeys.UPDATE_OPS collection; create_train_op runs that collection on every training step, while a plain minimize(loss) does not, so the moving statistics used when is_training=False are never updated. Try it and see if it works!
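To make the mechanism concrete, here is a minimal framework-free sketch in plain Python. It is not slim's implementation: the class BatchNorm1D, the helper train_then_eval, and the flag run_update_ops are hypothetical names invented for this illustration, and the decay of 0.9 is chosen only so the toy converges quickly. It mimics a batch norm layer whose moving statistics start at mean 0 and variance 1 and are refreshed only when the update step actually runs, which is the role the UPDATE_OPS collection plays in TensorFlow 1.x.

```python
import random
import statistics


class BatchNorm1D:
    """Toy single-feature batch norm without learned scale/shift (illustration only)."""

    def __init__(self, decay=0.9, epsilon=1e-3):
        self.decay = decay
        self.epsilon = epsilon
        # Moving statistics start at mean 0 and variance 1.
        self.moving_mean = 0.0
        self.moving_var = 1.0

    def forward(self, batch, is_training, run_update_ops):
        if is_training:
            # Training mode normalizes with the statistics of the current batch.
            mean = statistics.mean(batch)
            var = statistics.pvariance(batch)
            if run_update_ops:
                # Stand-in for the update ops: refresh the moving averages.
                self.moving_mean = self.decay * self.moving_mean + (1 - self.decay) * mean
                self.moving_var = self.decay * self.moving_var + (1 - self.decay) * var
        else:
            # Inference mode relies entirely on the stored moving statistics.
            mean, var = self.moving_mean, self.moving_var
        return [(x - mean) / (var + self.epsilon) ** 0.5 for x in batch]


def train_then_eval(run_update_ops, steps=200, seed=0):
    """Feed training batches, then evaluate one batch with is_training=False."""
    rng = random.Random(seed)
    bn = BatchNorm1D()
    for _ in range(steps):
        batch = [rng.gauss(50.0, 5.0) for _ in range(32)]
        bn.forward(batch, is_training=True, run_update_ops=run_update_ops)
    test_batch = [rng.gauss(50.0, 5.0) for _ in range(32)]
    out = bn.forward(test_batch, is_training=False, run_update_ops=False)
    return bn, statistics.mean(out)


bn_ok, mean_ok = train_then_eval(run_update_ops=True)
bn_stale, mean_stale = train_then_eval(run_update_ops=False)
print("with updates:    moving_mean=%.2f, eval output mean=%.2f" % (bn_ok.moving_mean, mean_ok))
print("without updates: moving_mean=%.2f, eval output mean=%.2f" % (bn_stale.moving_mean, mean_stale))
```

When the update step is skipped, training itself is unaffected because it uses batch statistics, but inference normalizes with the untouched initial values and produces activations far from what the later layers saw during training. That matches the symptom in the question: good training loss, meaningless results with is_training=False.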