PyTorch Notes (21): A Complete Guide to torch.randn in PyTorch

A Complete Guide to `torch.randn` in PyTorch
- 1. Interface Definition
- 2. Parameter Details
- 3. Common Use Cases
- 4. Positional Arguments vs. Tuple Argument: A Numerical Example
- 5. Arguments That Must Be Passed by Keyword
- Summary

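A Complete Guide to `torch.randn` in PyTorch

Deep learning code frequently needs samples from the standard normal distribution $\mathcal{N}(0,1)$, and PyTorch provides a flexible interface for this: `torch.randn`. This article covers the interface definition, the parameters, common use cases, worked examples with their output, and the design rationale behind the keyword-only arguments.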

1. Interface Definition

```python
torch.randn(*sizes,
            out=None,
            dtype=None,
            layout=torch.strided,
            device=None,
            requires_grad=False,
            generator=None) -> Tensor
```

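Purpose: returns a tensor filled with samples drawn from the standard normal distribution.
Reading the signature:
- `*sizes`: a variable number of positional integers, or a single tuple of integers, specifying the shape of the output tensor.
- `out`: optional; an existing tensor into which the result is written.
- `dtype`: the data type (e.g. `torch.float32`, `torch.float64`).
- `layout`: the memory layout; defaults to `torch.strided` (dense tensor).
- `device`: the device, e.g. `"cpu"` or `"cuda:0"`.
- `requires_grad`: whether autograd should track operations on the returned tensor.
- `generator`: a custom random number generator, useful for keeping random streams isolated, for example across threads or devices.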
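To make the "standard normal" claim concrete, the following minimal sketch draws a large sample and checks that its empirical mean and standard deviation are close to 0 and 1. The sample size, the seed, and the values of `mu` and `sigma` are illustrative assumptions, not part of the original text.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

# Draw a large sample so the empirical statistics approach the theoretical ones.
samples = torch.randn(100_000)

sample_mean = samples.mean().item()
sample_std = samples.std().item()
print(f"mean = {sample_mean:.3f}, std = {sample_std:.3f}")  # close to 0 and 1

# Shifting and scaling turns N(0, 1) samples into N(mu, sigma^2) samples.
# mu and sigma are arbitrary illustrative values.
mu, sigma = 5.0, 2.0
shifted = samples * sigma + mu
shifted_mean = shifted.mean().item()
shifted_std = shifted.std().item()
print(f"shifted mean = {shifted_mean:.3f}, shifted std = {shifted_std:.3f}")
```

The same shift-and-scale pattern is what the weight-initialization example later in the article relies on.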

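2. Parameter Details

| Parameter | Meaning | Example |
| --- | --- | --- |
| `*sizes` | Shape of the output tensor, e.g. `2,3` or `(2,3)` | `torch.randn(2,3)` or `torch.randn((2,3))` |
| `out` | Tensor that receives the result; can only be passed with the `out=` keyword | `torch.randn(2,3, out=my_tensor)` |
| `dtype` | Output data type | `torch.randn(2,3, dtype=torch.float64)` |
| `layout` | Memory layout; usually left unchanged | `torch.randn(2,3, layout=torch.strided)` |
| `device` | Target device | `torch.randn(2,3, device='cuda:0')` |
| `requires_grad` | Whether gradients are recorded | `torch.randn(2,3, requires_grad=True)` |
| `generator` | A `torch.Generator()` to draw from | `g = torch.Generator().manual_seed(1)`, then `torch.randn(2,3, generator=g)` |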
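The parameters listed above can be exercised together in a short script. This is a minimal sketch that falls back to the CPU when CUDA is unavailable; variable names such as `x64`, `x_gen`, and `buf` are illustrative and do not come from the original text.

```python
import torch

# dtype: request double precision instead of the default float32
x64 = torch.randn(2, 3, dtype=torch.float64)

# device: use the first GPU if present, otherwise stay on the CPU
device = "cuda:0" if torch.cuda.is_available() else "cpu"
x_dev = torch.randn(2, 3, device=device)

# requires_grad: the result is a leaf tensor tracked by autograd
w = torch.randn(2, 3, requires_grad=True)

# generator: a private random stream, independent of the global seed
g = torch.Generator().manual_seed(1)
x_gen = torch.randn(2, 3, generator=g)

# out: write the samples into a preallocated tensor
buf = torch.empty(2, 3)
result = torch.randn(2, 3, out=buf)

print(x64.dtype, x_dev.device, w.requires_grad, x_gen.shape)
```

Note that the generator here lives on the CPU, so it is only combined with CPU tensors in this sketch.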

3. Common Use Cases

Model weight initialization

```python
# Scale and shift N(0, 1) samples to N(mean, std^2)
self.weight = torch.randn(out_channels, in_channels) * std + mean
```
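The one-liner above is a fragment taken from inside a module. As a hedged, self-contained sketch, it might sit in a layer like the one below; the class name `TinyLinear`, the default `std=0.02`, and the `forward` method are hypothetical choices made for illustration. The tensor is wrapped in `nn.Parameter` so that optimizers can see it.

```python
import torch
import torch.nn as nn


class TinyLinear(nn.Module):
    """Hypothetical layer used only to illustrate randn-based initialization."""

    def __init__(self, in_channels: int, out_channels: int,
                 mean: float = 0.0, std: float = 0.02):
        super().__init__()
        # Scale and shift standard-normal samples to N(mean, std^2),
        # then register the result as a trainable parameter.
        self.weight = nn.Parameter(
            torch.randn(out_channels, in_channels) * std + mean
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (batch, in_channels) @ (in_channels, out_channels)
        return x @ self.weight.t()


layer = TinyLinear(in_channels=4, out_channels=2)
y = layer(torch.randn(5, 4))
print(y.shape)  # torch.Size([5, 2])
```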

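Noise injection

```python
# Match x's shape, dtype, and device so the addition leaves them unchanged
noise = torch.randn(*x.shape, dtype=x.dtype, device=x.device)
x_noisy = x + noise * noise_level
```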
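An alternative sketch for the noise-injection case uses `torch.randn_like`, which takes the shape, dtype, and device from its input tensor. The helper name `add_gaussian_noise` and the sample values are hypothetical.

```python
import torch


def add_gaussian_noise(x: torch.Tensor, noise_level: float) -> torch.Tensor:
    """Return x plus zero-mean Gaussian noise with standard deviation noise_level."""
    # randn_like inherits shape, dtype, and device from x.
    noise = torch.randn_like(x)
    return x + noise * noise_level


x = torch.zeros(4, 8, dtype=torch.float64)
x_noisy = add_gaussian_noise(x, noise_level=0.1)
print(x_noisy.shape, x_noisy.dtype)
```

Because the noise inherits the input's dtype, a `float64` input stays `float64` after the addition.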

Random inputs or simulation

```python
random_input = torch.randn(batch_size, latent_dim)
```

Reproducible experiments

```python
torch.manual_seed(42)  # fix the global seed
torch.randn(3, 3)      # repeatable across runs on the same setup
```
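To see what the seed guarantees, the sketch below resets the global seed and compares two draws, then does the same with two independent `torch.Generator` objects. The variable names are illustrative.

```python
import torch

torch.manual_seed(42)
first = torch.randn(3, 3)

torch.manual_seed(42)  # reset the global generator to the same state
second = torch.randn(3, 3)
print(torch.equal(first, second))  # True

# Local generators carry their own state and leave the global one untouched.
g1 = torch.Generator().manual_seed(42)
g2 = torch.Generator().manual_seed(42)
same_local = torch.equal(
    torch.randn(3, 3, generator=g1),
    torch.randn(3, 3, generator=g2),
)
print(same_local)  # True
```

A local generator is often the more convenient choice when only one part of a program needs a repeatable stream.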

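4. Positional Arguments vs. Tuple Argument: A Numerical Example

The example below fixes the random seed and shows that the two call styles produce tensors with the same shape and the same output format.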

```python
import torch
torch.manual_seed(0)

# Style A: positional arguments
a = torch.randn(2, 3)
print("a:\n", a)

# Style B: a tuple of ints
b = torch.randn((2, 3))
print("\nb:\n", b)

print("\na.shape =", a.shape, ", b.shape =", b.shape)
```

Example output (the exact numbers depend on the PyTorch version and platform):

```text
a:
 tensor([[ 1.5410, -0.2934, -2.1788],
         [ 0.5684, -1.0845, -1.3986]])

b:
 tensor([[-0.4033,  0.8380, -0.7193],
         [ 0.0921, -0.3950, -0.0132]])

a.shape = torch.Size([2, 3]) , b.shape = torch.Size([2, 3])
```

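Conclusions:
- Both calls produce a tensor of shape (2, 3).
- The positional form and the tuple form are equivalent; they are simply two ways of passing the shape. The values of `a` and `b` differ only because the second call continues from the random state left by the first.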
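The equivalence can be checked directly by resetting the seed before each call, so that every call starts from the same generator state. This is a minimal sketch; the extra unpacked-tuple variant `c` is an addition for illustration.

```python
import torch

torch.manual_seed(0)
a = torch.randn(2, 3)      # positional sizes

torch.manual_seed(0)       # rewind the generator before the second call
b = torch.randn((2, 3))    # a single tuple

shape = (2, 3)
torch.manual_seed(0)
c = torch.randn(*shape)    # a tuple unpacked into positional sizes

print(torch.equal(a, b), torch.equal(a, c))  # True True
```

With the generator state held fixed, all three call styles return identical tensors, which confirms that the way the shape is passed has no effect on the values.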

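5. Arguments That Must Be Passed by Keyword

In Python, `*sizes` in a function signature means that all positional arguments are collected into the tuple `sizes`. In a call such as the following, the positional arguments are interpreted as sizes and everything else has to be named: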

```python
torch.randn(2, 3,                   # these two positional arguments are the sizes
            out=my_out,             # can only be given by keyword
            dtype=torch.float64,    # keyword form
            layout=torch.strided,   # keyword form
            device='cuda:0',        # keyword form
            requires_grad=True,     # keyword form
            generator=g)            # keyword form
```

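If you try to sneak `out` in positionally, for example by writing `torch.randn(2,3,my_tensor)`, then `my_tensor` is treated as the size of a third dimension (which must be an int), and the call fails with a type error.
For this reason, `out`, `dtype`, `layout`, `device`, `requires_grad`, and `generator` are all designed as keyword-only arguments: they can only be passed in `key=value` form, which avoids any clash with the shape arguments.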
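The failure mode described above can be reproduced with a short sketch. The exact wording of the error message varies between PyTorch versions, so the code only prints its first line rather than assuming a particular text; `my_tensor` and `error_message` are illustrative names.

```python
import torch

my_tensor = torch.empty(2, 3)
error_message = None

try:
    # my_tensor lands in the sizes tuple, where only ints are accepted
    torch.randn(2, 3, my_tensor)
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message.splitlines()[0])

# The keyword form is accepted and fills my_tensor with samples.
torch.randn(2, 3, out=my_tensor)
print(my_tensor.shape)  # torch.Size([2, 3])
```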

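Summary

- `torch.randn(*sizes)`: both positional integers and a tuple of ints can be used to specify the output shape.
- Example output: the two call styles produce tensors of the same shape; only the random values differ, because the generator state advances between calls.
- Keyword arguments: `out`, `dtype`, `layout`, `device`, `requires_grad`, and `generator` must be written as `name=value`, so that positional arguments always mean "shape" and nothing else.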