[Dive into Deep Learning]

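from torch import nn

def my_init(m):
    if type(m) == nn.Linear:
        # Print the name and shape of the layer's first parameter (the weight).
        print("Init", *[(name, param.shape)
                        for name, param in m.named_parameters()][0])
        nn.init.uniform_(m.weight, -10, 10)
        # Zero out every weight whose absolute value is below 5.
        m.weight.data *= m.weight.data.abs() >= 5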

The [0] in this code is a list index: it takes the first element of the list.

Start by isolating the expression:

[(name, param.shape) for name, param in m.named_parameters()][0]

Step by step:

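1. m.named_parameters() - returns an iterator over the module's parameters (weights and biases), each paired with its name
- For an nn.Linear layer this normally yields two parameters: weight and bias (only weight if the layer was created with bias=False)
2. The list comprehension: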

[(name, param.shape) for name, param in m.named_parameters()]

This produces a list such as:

[('weight', torch.Size([out_features, in_features])), 
 ('bias', torch.Size([out_features]))]
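
The list has two entries only when the layer actually has a bias. A minimal sketch of both cases, assuming two throwaway layers whose names (`with_bias`, `without_bias`) and sizes are chosen purely for illustration:

```python
from torch import nn

# Hypothetical layers; the sizes are arbitrary.
with_bias = nn.Linear(10, 5)
without_bias = nn.Linear(10, 5, bias=False)

# Shapes are converted to plain tuples to keep the printed lists short.
pairs_with_bias = [(n, tuple(p.shape)) for n, p in with_bias.named_parameters()]
pairs_without_bias = [(n, tuple(p.shape)) for n, p in without_bias.named_parameters()]

print(pairs_with_bias)     # [('weight', (5, 10)), ('bias', (5,))]
print(pairs_without_bias)  # [('weight', (5, 10))]
```

In both cases the weight comes first, which is why indexing with [0] reliably picks it.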

3. [0] - takes the first element of the list:

('weight', torch.Size([out_features, in_features]))

4. * unpacking - unpacks the tuple into separate arguments:

print("Init", *('weight', torch.Size([out_features, in_features])))
# equivalent to:
print("Init", 'weight', torch.Size([out_features, in_features]))
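
The unpacking behaviour can be checked without PyTorch. The sketch below uses a plain tuple as a stand-in for `torch.Size` (an assumption made only to keep the example dependency-free) and a hypothetical helper, `render`, that captures what `print` would write:

```python
import io

def render(*args):
    """Return the text that print(*args) would write, minus the trailing newline."""
    buf = io.StringIO()
    print(*args, file=buf)
    return buf.getvalue().rstrip("\n")

# Stand-in for ('weight', torch.Size([5, 10])); a plain tuple replaces torch.Size.
first = ("weight", (5, 10))

unpacked = render("Init", *first)  # name and shape passed as two arguments
packed = render("Init", first)     # the whole tuple passed as one argument

print(unpacked)  # Init weight (5, 10)
print(packed)    # Init ('weight', (5, 10))
```

Without the `*`, the tuple is printed as a tuple, parentheses and quotes included.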

Example output:

# Assuming an nn.Linear(10, 5) layer
Init weight torch.Size([5, 10])
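
To see the message in context, here is a minimal sketch that applies `my_init` to a small network through `net.apply`. The network `net` and its layer sizes are assumptions chosen for illustration; they are not part of the original snippet.

```python
import torch
from torch import nn

def my_init(m):
    if type(m) == nn.Linear:
        print("Init", *[(name, param.shape)
                        for name, param in m.named_parameters()][0])
        nn.init.uniform_(m.weight, -10, 10)
        m.weight.data *= m.weight.data.abs() >= 5

# Hypothetical two-layer network; the sizes are arbitrary.
net = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 1))

# apply() calls my_init on every submodule and on net itself;
# only the two nn.Linear layers pass the type check.
net.apply(my_init)
# Init weight torch.Size([5, 10])
# Init weight torch.Size([1, 5])

# After initialization every weight is either 0 or has absolute value >= 5.
w = net[0].weight.data
print(bool(((w == 0) | (w.abs() >= 5)).all()))  # True
```

One line is printed per linear layer, and each line names only the weight because of the [0].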

Why take only the first element?

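Because my_init only modifies the weight of an nn.Linear layer, printing the name and shape of the weight is enough for the log message. The bias keeps its default initialization unless it is handled separately.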
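
Since only the first pair is needed, the full list does not have to be built. One alternative, shown here as a sketch rather than as the book's approach, is to call `next()` on the iterator that `named_parameters()` returns. The layer name `layer` and its sizes are hypothetical.

```python
from torch import nn

layer = nn.Linear(10, 5)  # hypothetical layer used only for this demo

# List-and-index form, as in my_init.
name_a, shape_a = [(n, p.shape) for n, p in layer.named_parameters()][0]

# Iterator form: stops after the first (name, parameter) pair.
name_b, param_b = next(iter(layer.named_parameters()))

print("Init", name_a, shape_a)        # Init weight torch.Size([5, 10])
print("Init", name_b, param_b.shape)  # Init weight torch.Size([5, 10])
```

Both forms print the same line; the iterator form simply avoids constructing the intermediate list.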

To see every parameter, drop the [0]:

print("Init", *[(name, param.shape) for name, param in m.named_parameters()])
# Output: Init ('weight', torch.Size([5, 10])) ('bias', torch.Size([5]))
# (each (name, shape) tuple is passed to print as a single argument)
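
Unpacking the list hands each (name, shape) tuple to print as one argument, so the pairs appear in tuple form. If a flat, one-line listing is preferred, one option is to flatten the pairs first. A minimal sketch, assuming a standalone layer with the hypothetical name `layer`:

```python
from itertools import chain
from torch import nn

layer = nn.Linear(10, 5)  # hypothetical layer used only for this demo

pairs = [(name, param.shape) for name, param in layer.named_parameters()]

# Each tuple is one argument, so print shows the tuples themselves.
print("Init", *pairs)
# Init ('weight', torch.Size([5, 10])) ('bias', torch.Size([5]))

# Flattening first passes every name and shape as its own argument.
flat = list(chain.from_iterable(pairs))
print("Init", *flat)
# Init weight torch.Size([5, 10]) bias torch.Size([5])
```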

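In short, the [0] only selects the first parameter (the weight) for printing. The initialization itself acts on m.weight directly and does not depend on the [0].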