ML Engine Experiment eval tf.summary.scalar not displaying in TensorBoard


Problem description

I am trying to output some summary scalars in an ML Engine experiment at both train and eval time. tf.summary.scalar('loss', loss) correctly outputs the summary scalars for both training and evaluation on the same plot in TensorBoard. However, I am also trying to output other metrics at both train and eval time, and they are only output at train time. The code immediately follows tf.summary.scalar('loss', loss) but does not appear to work. For example, the following code only produces output for TRAIN, not for EVAL. The only difference is that these metrics use custom accuracy functions, but they do work for TRAIN.

if mode in (Modes.TRAIN, Modes.EVAL):
    loss = tf.contrib.legacy_seq2seq.sequence_loss(logits, outputs, weights)
    tf.summary.scalar('loss', loss)

    # Use a distinct variable name so the sequence_accuracy function is not shadowed
    seq_accuracy = sequence_accuracy(targets, predictions, weights)
    tf.summary.scalar('sequence_accuracy', seq_accuracy)

Is there any reason why loss would plot in TensorBoard for both TRAIN and EVAL, while sequence_accuracy would only plot for TRAIN?

Could this behavior somehow be related to the warning I received "Found more than one metagraph event per run. Overwriting the metagraph with the newest event."?

Recommended answer

Because the summary node in the graph is just a node. It still needs to be evaluated (which outputs a protobuf string), and that string still needs to be written to a file. It's not evaluated in training mode because it's not upstream of the train_op in your graph, and even if it were evaluated, it wouldn't be written to a file unless you specified a tf.train.SummarySaverHook as one of your training_chief_hooks in your EstimatorSpec. Because the Estimator class doesn't assume you want any extra evaluation during training, evaluation is normally only done during the EVAL phase, and you just increase min_eval_frequency or checkpoint_frequency to get more evaluation datapoints.

If you really, really want to log a summary during training, here's how you'd do it:

def model_fn(mode, features, labels, params):
  ...
  if mode == Modes.TRAIN:
    # loss is already written out during training, don't duplicate the summary op
    loss = tf.contrib.legacy_seq2seq.sequence_loss(logits, outputs, weights)
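    # Use a distinct variable name so the sequence_accuracy function is not shadowed
    seq_accuracy = sequence_accuracy(targets, predictions, weights)
    seq_sum_op = tf.summary.scalar('sequence_accuracy', seq_accuracy)
    # Make the train op depend on the summary op so the summary is evaluated every step
    with tf.control_dependencies([seq_sum_op]):
      # Estimators expect the train op to increment the global step
      train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())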

    return tf.estimator.EstimatorSpec(
      loss=loss,
      mode=mode,
      train_op=train_op,
      training_chief_hooks=[tf.train.SummarySaverHook(
          save_steps=100,
          output_dir='./summaries',
          summary_op=seq_sum_op
      )]
    )

But it's better to just increase your eval frequency and add an accuracy entry to eval_metric_ops using tf.metrics.accuracy (or tf.contrib.metrics.streaming_accuracy).
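
To make the eval_metric_ops suggestion more concrete: each entry in eval_metric_ops is a pair of a value tensor and an update op. During evaluation the update op runs once per batch, and the value is read once at the end and written as a single point in the eval event file. The plain-Python sketch below only imitates that accumulation; it is not TensorFlow code. The StreamingMean class and the sequence_correct helper are hypothetical names, and the "every weighted token must match" rule is an assumed definition, since the asker's sequence_accuracy function is not shown.

```python
class StreamingMean:
    """Hypothetical stand-in for a (value, update_op) metric pair."""

    def __init__(self):
        self.total = 0.0
        self.count = 0.0

    def update(self, values, weights=None):
        """Plays the role of update_op: fold one batch into the running totals."""
        if weights is None:
            weights = [1.0] * len(values)
        for v, w in zip(values, weights):
            self.total += v * w
            self.count += w

    def value(self):
        """Plays the role of the value tensor: read once after the last batch."""
        return self.total / self.count if self.count else 0.0


def sequence_correct(targets, predictions, weights):
    """Assumed per-sequence accuracy: 1.0 if every weighted token matches."""
    return [
        1.0 if all(t == p for t, p, w in zip(ts, ps, ws) if w > 0) else 0.0
        for ts, ps, ws in zip(targets, predictions, weights)
    ]


metric = StreamingMean()
# Batch 1: the first sequence is fully correct, the second has one wrong token.
metric.update(sequence_correct([[1, 2], [3, 4]], [[1, 2], [3, 0]], [[1, 1], [1, 1]]))
# Batch 2: the only mismatch falls on a zero-weight (padding) position.
metric.update(sequence_correct([[5, 6]], [[5, 9]], [[1, 0]]))
# The final value aggregates over all three sequences, not just the last batch.
print(metric.value())
```

In an actual model_fn, the equivalent would be returning something like eval_metric_ops={'sequence_accuracy': tf.metrics.mean(per_sequence_correct)} from the EVAL branch of the EstimatorSpec, where per_sequence_correct is a tensor you compute yourself. This is why a per-batch tf.summary.scalar is the wrong tool for evaluation: the metric has to be aggregated across the whole eval set before it is plotted.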
