Category: "PyTorch Functions Explained" series index
Related articles:
· PyTorch Functions Explained: torch.nn.init.calculate_gain
· PyTorch Functions Explained: torch.nn.init.uniform_
· PyTorch Functions Explained: torch.nn.init.normal_
· PyTorch Functions Explained: torch.nn.init.constant_
· PyTorch Functions Explained: torch.nn.init.ones_
· PyTorch Functions Explained: torch.nn.init.zeros_
· PyTorch Functions Explained: torch.nn.init.eye_
· PyTorch Functions Explained: torch.nn.init.dirac_
· PyTorch Functions Explained: torch.nn.init.xavier_uniform_
· PyTorch Functions Explained: torch.nn.init.xavier_normal_
· PyTorch Functions Explained: torch.nn.init.kaiming_uniform_
· PyTorch Functions Explained: torch.nn.init.kaiming_normal_
· PyTorch Functions Explained: torch.nn.init.trunc_normal_
· PyTorch Functions Explained: torch.nn.init.orthogonal_
· PyTorch Functions Explained: torch.nn.init.sparse_
All functions in the torch.nn.init module are used to initialize neural network parameters, so they all run in torch.no_grad() mode and autograd does not take them into account.
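As a minimal sketch of what running under torch.no_grad() means in practice, the snippet below initializes a leaf tensor that requires gradients and then inspects its autograd state. The variable name w is only illustrative.

```python
import torch
import torch.nn as nn

# A leaf tensor that autograd would normally track.
w = torch.empty(3, 5, requires_grad=True)

# The in-place initialization runs under torch.no_grad() internally,
# so it is permitted on a leaf that requires grad and is not recorded.
nn.init.sparse_(w, sparsity=0.1)

print(w.requires_grad)  # True: the tensor is still trainable
print(w.is_leaf)        # True: it is still a leaf tensor
print(w.grad_fn)        # None: no autograd history was attached
```

This is why the init functions can be applied directly to the parameters of a freshly constructed layer without affecting later backward passes.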
Following the method described in "Deep learning via Hessian-free optimization" (Martens, J., 2010), this function fills a 2-dimensional input tensor as a sparse matrix, where the non-zero elements are drawn from $N(0, \text{std}^2)$.
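A rough empirical check of the distribution of the non-zero entries can be sketched as follows. The tensor shape, the sparsity value, and the seed are arbitrary choices made for illustration, not values taken from the article.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # illustrative seed, only for repeatability

# A fairly large matrix so the sample statistics are stable.
w = torch.empty(1000, 200)
nn.init.sparse_(w, sparsity=0.5, std=0.01)

# Keep only the entries that were not zeroed out.
nonzero = w[w != 0]

print(nonzero.mean().item())  # expected to be close to 0
print(nonzero.std().item())   # expected to be close to 0.01
```

With roughly 100,000 surviving samples, the sample mean and standard deviation should land very near 0 and std respectively.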
Syntax
torch.nn.init.sparse_(tensor, sparsity, std=0.01)
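Parameters
· tensor: [Tensor] an n-dimensional torch.Tensor (the implementation only accepts 2-dimensional tensors)
· sparsity: the fraction of elements in each column to be set to zero
· std: the standard deviation of the normal distribution used to generate the non-zero values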
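To make the meaning of sparsity concrete, the sketch below counts the zeros in each column. It relies on the rounding rule visible in the implementation, where the number of zeros per column is ceil(sparsity * rows). The shape and sparsity value are illustrative assumptions.

```python
import math
import torch
import torch.nn as nn

rows, cols, sparsity = 10, 4, 0.25
w = torch.empty(rows, cols)
nn.init.sparse_(w, sparsity=sparsity)

# The implementation rounds up: ceil(0.25 * 10) = ceil(2.5) = 3.
expected_zeros = math.ceil(sparsity * rows)

# Count exact zeros column by column.
zeros_per_column = (w == 0).sum(dim=0)

print(expected_zeros)             # 3
print(zeros_per_column.tolist())  # each column should hold 3 zeros
```

A normal draw landing exactly on zero is possible in principle but extremely unlikely, so in practice every column contains exactly expected_zeros zeros. Note that the rounding means the realized sparsity (here 0.3) can be slightly higher than the requested value.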
Return value
A torch.Tensor; the tensor argument is also updated in place.
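The in-place behavior can be confirmed with a short check: the returned object and the argument are the same tensor. The names w and out are illustrative.

```python
import torch
import torch.nn as nn

w = torch.empty(3, 5)
out = nn.init.sparse_(w, sparsity=0.1)

print(out is w)                        # True: the same Python object
print(out.data_ptr() == w.data_ptr())  # True: the same underlying memory
```

Because of this, the return value can be ignored when the function is called only for its side effect on an existing parameter.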
Example
import torch
import torch.nn as nn
w = torch.empty(3, 5)
nn.init.sparse_(w, sparsity=0.1)
Implementation
import math
import torch

def sparse_(tensor, sparsity, std=0.01):
    r"""Fills the 2D input `Tensor` as a sparse matrix, where the
    non-zero elements will be drawn from the normal distribution
    :math:`\mathcal{N}(0, 0.01)`, as described in `Deep learning via
    Hessian-free optimization` - Martens, J. (2010).

    Args:
        tensor: an n-dimensional `torch.Tensor`
        sparsity: The fraction of elements in each column to be set to zero
        std: the standard deviation of the normal distribution used to generate
            the non-zero values

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.sparse_(w, sparsity=0.1)
    """
    if tensor.ndimension() != 2:
        raise ValueError("Only tensors with 2 dimensions are supported")

    rows, cols = tensor.shape
    # Number of entries to zero in each column, rounded up.
    num_zeros = int(math.ceil(sparsity * rows))

    with torch.no_grad():
        # Fill the whole tensor with normal samples first.
        tensor.normal_(0, std)
        # Then zero a random subset of rows independently for each column.
        for col_idx in range(cols):
            row_indices = torch.randperm(rows)
            zero_indices = row_indices[:num_zeros]
            tensor[zero_indices, col_idx] = 0
    return tensor
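The dimension check at the top of the implementation means that tensors with any other number of dimensions are rejected. A minimal sketch of that behavior, using a hypothetical 4-dimensional tensor shaped like a convolution kernel:

```python
import torch
import torch.nn as nn

# A 4-D tensor, e.g. the shape of a convolution kernel (illustrative).
conv_weight = torch.empty(8, 3, 3, 3)

message = None
try:
    nn.init.sparse_(conv_weight, sparsity=0.1)
except ValueError as err:
    # The guard on tensor.ndimension() raises before any value is written.
    message = str(err)

print(message)
```

The check runs before tensor.normal_ is called, so the contents of the rejected tensor are left untouched.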