Category: master index of the 《深入浅出Pytorch函数》 series
Related articles:
· 深入浅出Pytorch函数------torch.nn.init.calculate_gain
· 深入浅出Pytorch函数------torch.nn.init.uniform_
· 深入浅出Pytorch函数------torch.nn.init.normal_
· 深入浅出Pytorch函数------torch.nn.init.constant_
· 深入浅出Pytorch函数------torch.nn.init.ones_
· 深入浅出Pytorch函数------torch.nn.init.zeros_
· 深入浅出Pytorch函数------torch.nn.init.eye_
· 深入浅出Pytorch函数------torch.nn.init.dirac_
· 深入浅出Pytorch函数------torch.nn.init.xavier_uniform_
· 深入浅出Pytorch函数------torch.nn.init.xavier_normal_
· 深入浅出Pytorch函数------torch.nn.init.kaiming_uniform_
· 深入浅出Pytorch函数------torch.nn.init.kaiming_normal_
· 深入浅出Pytorch函数------torch.nn.init.trunc_normal_
· 深入浅出Pytorch函数------torch.nn.init.orthogonal_
· 深入浅出Pytorch函数------torch.nn.init.sparse_
All functions in the torch.nn.init module are used to initialize neural network parameters, so they all run in torch.no_grad() mode and are not taken into account by autograd.
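As a quick illustration (a minimal sketch, not part of the original article), the no_grad() context is exactly what allows these in-place init functions to modify a leaf tensor that requires gradients without autograd recording the operation:

```python
import torch
import torch.nn as nn

w = torch.empty(3, 5, requires_grad=True)
# Because init functions run under torch.no_grad(), this in-place fill is
# allowed on a leaf tensor that requires grad and is not recorded by autograd.
nn.init.uniform_(w, a=0.0, b=1.0)
print(w.requires_grad, w.grad_fn)  # True None
```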
This function returns the recommended gain value for a given nonlinearity function. The values are as follows:
| Nonlinearity | Gain |
| --- | --- |
| Linear / Identity | $1$ |
| Conv1D / Conv2D / Conv3D | $1$ |
| Sigmoid | $1$ |
| Tanh | $\frac{5}{3}$ |
| ReLU | $\sqrt{2}$ |
| Leaky ReLU | $\sqrt{\frac{2}{1+\text{negative\_slope}^2}}$ |
| SELU | $\frac{3}{4}$ |
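The table can be spot-checked directly against the function; a short sketch (assuming torch is installed and imported as below):

```python
import math
import torch.nn as nn

# Spot-check the gains listed in the table above.
assert nn.init.calculate_gain('linear') == 1
assert nn.init.calculate_gain('tanh') == 5.0 / 3
assert nn.init.calculate_gain('relu') == math.sqrt(2.0)
assert nn.init.calculate_gain('leaky_relu', 0.2) == math.sqrt(2.0 / (1 + 0.2 ** 2))
assert nn.init.calculate_gain('selu') == 3.0 / 4
```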
To implement Self-Normalizing Neural Networks, you should use nonlinearity='linear' instead of nonlinearity='selu'. This gives the initial weights a variance of $\frac{1}{N}$, which is necessary to induce a stable fixed point in the forward pass. In contrast, the default gain for SELU sacrifices this normalization effect in exchange for more stable gradient flow in rectangular layers.
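A rough sketch of what this means in practice (the layer width N = 256 below is only an illustrative assumption): feeding the 'linear' gain into a fan-in-scaled normal initialization gives weights with variance about $\frac{1}{N}$, which is the LeCun-style initialization that SELU networks expect.

```python
import math
import torch
import torch.nn as nn

fan_in = 256                               # illustrative layer width N
w = torch.empty(256, fan_in)
gain = nn.init.calculate_gain('linear')    # 1.0, deliberately not calculate_gain('selu')
# std = gain / sqrt(fan_in)  =>  Var(w) ≈ gain**2 / fan_in = 1 / N
nn.init.normal_(w, mean=0.0, std=gain / math.sqrt(fan_in))
print(w.var().item())                      # ≈ 1 / 256 ≈ 0.0039
```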
Syntax
```python
torch.nn.init.calculate_gain(nonlinearity, param=None)
```
Parameters
· nonlinearity: the name of the non-linear function (an nn.functional name)
· param: an optional parameter of the non-linear function (e.g., the negative_slope of leaky_relu)
Example
```python
import torch.nn as nn

# leaky_relu with negative_slope=0.2
gain = nn.init.calculate_gain('leaky_relu', 0.2)
```
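In practice the returned gain is usually passed straight to an initializer; a minimal sketch (the 3×5 tensor shape is just an example) using Xavier initialization:

```python
import torch
import torch.nn as nn

# Typical usage: scale a Xavier/Glorot initialization by the recommended gain.
w = torch.empty(3, 5)
gain = nn.init.calculate_gain('leaky_relu', 0.2)
nn.init.xavier_uniform_(w, gain=gain)
```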
Function implementation
```python
import math  # required by the implementation below


def calculate_gain(nonlinearity, param=None):
    r"""Return the recommended gain value for the given nonlinearity function.

    The values are as follows:

    ================= ====================================================
    nonlinearity      gain
    ================= ====================================================
    Linear / Identity :math:`1`
    Conv{1,2,3}D      :math:`1`
    Sigmoid           :math:`1`
    Tanh              :math:`\frac{5}{3}`
    ReLU              :math:`\sqrt{2}`
    Leaky Relu        :math:`\sqrt{\frac{2}{1 + \text{negative\_slope}^2}}`
    SELU              :math:`\frac{3}{4}`
    ================= ====================================================

    .. warning::
        In order to implement `Self-Normalizing Neural Networks`_ ,
        you should use ``nonlinearity='linear'`` instead of ``nonlinearity='selu'``.
        This gives the initial weights a variance of ``1 / N``,
        which is necessary to induce a stable fixed point in the forward pass.
        In contrast, the default gain for ``SELU`` sacrifices the normalisation
        effect for more stable gradient flow in rectangular layers.

    Args:
        nonlinearity: the non-linear function (`nn.functional` name)
        param: optional parameter for the non-linear function

    Examples:
        >>> gain = nn.init.calculate_gain('leaky_relu', 0.2)  # leaky_relu with negative_slope=0.2

    .. _Self-Normalizing Neural Networks: https://papers.nips.cc/paper/2017/hash/5d44ee6f2c3f71b73125876103c8f6c4-Abstract.html
    """
    linear_fns = ['linear', 'conv1d', 'conv2d', 'conv3d', 'conv_transpose1d', 'conv_transpose2d', 'conv_transpose3d']
    if nonlinearity in linear_fns or nonlinearity == 'sigmoid':
        return 1
    elif nonlinearity == 'tanh':
        return 5.0 / 3
    elif nonlinearity == 'relu':
        return math.sqrt(2.0)
    elif nonlinearity == 'leaky_relu':
        if param is None:
            negative_slope = 0.01
        elif not isinstance(param, bool) and isinstance(param, int) or isinstance(param, float):
            # True/False are instances of int, hence check above
            negative_slope = param
        else:
            raise ValueError("negative_slope {} not a valid number".format(param))
        return math.sqrt(2.0 / (1 + negative_slope ** 2))
    elif nonlinearity == 'selu':
        return 3.0 / 4  # Value found empirically (https://github.com/pytorch/pytorch/pull/50664)
    else:
        raise ValueError("Unsupported nonlinearity {}".format(nonlinearity))
```