Understanding the Exponential Moving Average (EMA)

TensorFlow implementation

# Create variables.
var0 = tf.Variable(...)
var1 = tf.Variable(...)
# ... use the variables to build a training model...
...
# Create an op that applies the optimizer.  This is what we usually
# would use as a training op.
opt_op = opt.minimize(my_loss, [var0, var1])
# Create an ExponentialMovingAverage object
ema = tf.train.ExponentialMovingAverage(decay=0.9999)
with tf.control_dependencies([opt_op]):
    # Create the shadow variables, and add ops to maintain moving averages
    # of var0 and var1. This also creates an op that will update the moving
    # averages after each training step.  This is what we will use in place
    # of the usual training op.
    training_op = ema.apply([var0, var1])
...train the model by running training_op...
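
To make the effect of ema.apply more concrete, here is a framework-free sketch of the update each shadow variable receives after a training step. It assumes the rule given in the tf.train.ExponentialMovingAverage documentation, shadow -= (1 - decay) * (shadow - variable), with the shadow starting at the variable's initial value. The helper name ema_update and the toy values are made up for illustration.

```python
def ema_update(shadow, variable, decay):
    """Return the new shadow value after one training step."""
    # Algebraically the same as decay * shadow + (1 - decay) * variable.
    return shadow - (1.0 - decay) * (shadow - variable)


# Toy "training run": the variable jumps around, the shadow follows smoothly.
variable_history = [1.0, 3.0, 2.0, 4.0, 3.0]  # hypothetical values of one weight
decay = 0.9

shadow = variable_history[0]  # the shadow starts at the variable's initial value
shadow_history = [shadow]
for variable in variable_history[1:]:
    shadow = ema_update(shadow, variable, decay)
    shadow_history.append(shadow)

for raw, smooth in zip(variable_history, shadow_history):
    print(f"variable={raw:.1f}  shadow={smooth:.4f}")
```

With decay=0.9999, each step moves the shadow only 0.01% of the way toward the current value, so the averages change far more slowly than in this toy run.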

There are two ways to use the moving averages for evaluations:

- Build a model that uses the shadow variables instead of the variables.
  For this, use the average() method, which returns the shadow variable
  for a given variable.
- Build a model normally but load the checkpoint files to evaluate by using
  the shadow variable names. For this, use the average_name() method. See
  tf.train.Saver for more information on restoring saved variables.

Example of restoring the shadow variable values:

# Create a Saver that loads variables from their saved shadow values.
shadow_var0_name = ema.average_name(var0)
shadow_var1_name = ema.average_name(var1)
saver = tf.train.Saver({shadow_var0_name: var0, shadow_var1_name: var1})
saver.restore(...checkpoint filename...)
# var0 and var1 now hold the moving average values

PyTorch implementation

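At the time of writing, PyTorch itself did not provide an EMA implementation, but writing one is not complicated. Below is an implementation shared online by a community expert: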

class EMA():
    def __init__(self, decay):
        self.decay = decay
        self.shadow = {}

    def register(self, name, val):
        self.shadow[name] = val.clone()

    def get(self, name):
        return self.shadow[name]

    def update(self, name, x):
        assert name in self.shadow
        new_average = (1.0 - self.decay) * x + self.decay * self.shadow[name]
        self.shadow[name] = new_average.clone()
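
Written out as a formula, the update method computes the recurrence below. Here \beta stands for decay, \theta_t for the parameter value x passed in at step t, and v_t for the stored shadow value, with v_0 set by register. The symbols are chosen for illustration and do not appear in the original code.

```latex
v_t = \beta \, v_{t-1} + (1 - \beta)\, \theta_t, \qquad v_0 = \theta_0
\quad\Longrightarrow\quad
v_t = \beta^{t}\, \theta_0 + (1 - \beta) \sum_{k=1}^{t} \beta^{\,t-k}\, \theta_k
```

The unrolled form shows that older parameter values enter the average with geometrically decaying weights, so a decay close to 1 averages over a long history.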

Usage has three steps: initialization, registration, and update.

# init
ema = EMA(0.999)
# register (once, before training starts)
for name, param in model.named_parameters():
    if param.requires_grad:
        ema.register(name, param.data)
# update (after each optimizer step)
for name, param in model.named_parameters():
    if param.requires_grad:
        ema.update(name, param.data)
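
The three steps above only maintain the averages. To evaluate with them, the averaged values still have to be placed into the model. One common pattern is to swap them in temporarily and restore the training weights afterwards. The sketch below illustrates that pattern with a plain dict of floats standing in for the model's parameters, so it runs without PyTorch. The names use_averages, params, and shadow are hypothetical and not part of the EMA class above.

```python
from contextlib import contextmanager


@contextmanager
def use_averages(params, shadow):
    """Temporarily replace the values in `params` with their averages."""
    backup = dict(params)      # remember the training weights
    params.update(shadow)      # evaluate with the averaged weights
    try:
        yield params
    finally:
        params.clear()
        params.update(backup)  # restore the training weights


# Stand-ins for the model parameters and the EMA's shadow dict.
params = {"w": 0.80, "b": -0.10}
shadow = {"w": 0.75, "b": -0.05}

with use_averages(params, shadow) as eval_params:
    inside = dict(eval_params)  # what an evaluation pass would see

print("during evaluation:", inside)
print("after evaluation: ", params)
```

In PyTorch, the same idea would copy each shadow tensor into the matching param.data before evaluation and copy the backup back before training resumes.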


