What does opt.apply_gradients() do in TensorFlow?

Problem description

The documentation is not quite clear about this. I suppose the gradients one can obtain by opt.compute_gradients(E, [v]) contain ∂E/∂x = g(x) for each element x of the tensor that v stores. Does opt.apply_gradients(grads_and_vars) essentially execute x ← x − η·g(x), where η is the learning rate? That would imply that if I want to add a positive additive change p to the variable, I would need to change g(x) ← g(x) − (1/η)p, e.g. like this:

opt = tf.train.GradientDescentOptimizer(learning_rate=l)
grads_and_vars = opt.compute_gradients(loss, var_list)

# shift each gradient by -(1/l) * p so the applied update adds p to the variable
for i, gv in enumerate(grads_and_vars):
    grads_and_vars[i] = (gv[0] - (1 / l) * p, gv[1])

train_op = opt.apply_gradients(grads_and_vars)

Is there a better way to do this?

Recommended answer

The update rule that the apply_gradients method actually applies depends on the specific optimizer. Take a look at the implementation of apply_gradients in the tf.train.Optimizer class here. It relies on the derived classes implementing the update rule in the methods _apply_dense and _apply_sparse. The update rule you are referring to is implemented by GradientDescentOptimizer.
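For intuition, here is a rough illustrative sketch (not the actual TensorFlow source; sketch_apply_dense is a made-up name) of what GradientDescentOptimizer's dense update amounts to for each (gradient, variable) pair handed to apply_gradients:

import tensorflow as tf

def sketch_apply_dense(var, grad, learning_rate):
    # Conceptually: var <- var - learning_rate * grad
    return tf.assign_sub(var, learning_rate * grad)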

Regarding your desired positive additive update: if what you are calling opt is an instantiation of GradientDescentOptimizer, then you could indeed achieve what you want to do by

grads_and_vars = opt.compute_gradients(E, [v])
eta = opt._learning_rate  # note: _learning_rate is a private attribute of GradientDescentOptimizer
my_grads_and_vars = [(g - (1 / eta) * p, v) for g, v in grads_and_vars]
opt.apply_gradients(my_grads_and_vars)
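As a quick sanity check (an illustrative, self-contained TF1-style snippet; the loss, variable, and numbers below are made up), a single apply_gradients step with the shifted gradients should change v by −η·g + p:

import tensorflow as tf

eta = 0.1
v = tf.Variable(1.0)
E = 3.0 * v                      # dE/dv = 3.0
p = tf.constant(0.5)             # desired positive additive change

opt = tf.train.GradientDescentOptimizer(learning_rate=eta)
grads_and_vars = opt.compute_gradients(E, [v])
my_grads_and_vars = [(g - (1 / eta) * p, var) for g, var in grads_and_vars]
train_op = opt.apply_gradients(my_grads_and_vars)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op)
    print(sess.run(v))           # expected: 1.0 - 0.1 * 3.0 + 0.5 = 1.2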

The more elegant way to do this is probably to write a new optimizer (inheriting from tf.train.Optimizer) that implements your desired update rule directly.
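A minimal sketch of such a subclass might look like this (the class name and constructor arguments are made up; only the dense, ref-variable case is handled, and a production optimizer would also want _resource_apply_dense and _apply_sparse):

import tensorflow as tf

class AdditiveGradientDescent(tf.train.Optimizer):
    """Sketch: gradient descent plus a constant additive change p per step."""

    def __init__(self, learning_rate, p, use_locking=False, name="AdditiveGD"):
        super(AdditiveGradientDescent, self).__init__(use_locking, name)
        self._learning_rate = learning_rate
        self._p = p

    def _apply_dense(self, grad, var):
        # var <- var - learning_rate * grad + p
        delta = -self._learning_rate * grad + self._p
        return tf.assign_add(var, delta, use_locking=self._use_locking).op

opt = AdditiveGradientDescent(learning_rate=0.1, p=0.5)
train_op = opt.minimize(loss)  # loss as in the question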
