《动手学深度学习 Pytorch版》 4.3 多层感知机的简洁实现

python 复制代码
import torch
from torch import nn
from d2l import torch as d2l

模型

python 复制代码
net = nn.Sequential(nn.Flatten(),
                    nn.Linear(784, 256),
                    nn.ReLU(),  # 与 3.7 节相比多了一层
                    nn.Linear(256, 10))

def init_weights(m):
    if type(m) == nn.Linear:  # 使用正态分布中的随机值初始化权重
        nn.init.normal_(m.weight, std=0.01)

net.apply(init_weights)
复制代码
Sequential(
  (0): Flatten(start_dim=1, end_dim=-1)
  (1): Linear(in_features=784, out_features=256, bias=True)
  (2): ReLU()
  (3): Linear(in_features=256, out_features=10, bias=True)
)
python 复制代码
batch_size, lr, num_epochs = 256, 0.1, 10
loss = nn.CrossEntropyLoss(reduction='none')
trainer = torch.optim.SGD(net.parameters(), lr=lr)

train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)


练习

(1)尝试添加不同数量的隐藏层(也可以修改学习率),怎样设置效果最好?

python 复制代码
net2 = nn.Sequential(nn.Flatten(),
                    nn.Linear(784, 256),
                    nn.ReLU(),
                    nn.Linear(256, 128),
                    nn.ReLU(),
                    nn.Linear(128, 10))

def init_weights(m):
    if type(m) == nn.Linear:  # 使用正态分布中的随机值初始化权重
        nn.init.normal_(m.weight, std=0.01)

net2.apply(init_weights)

batch_size2, lr2, num_epochs2 = 256, 0.3, 10
loss2 = nn.CrossEntropyLoss(reduction='none')
trainer2 = torch.optim.SGD(net2.parameters(), lr=lr2)

train_iter2, test_iter2 = d2l.load_data_fashion_mnist(batch_size2)
d2l.train_ch3(net2, train_iter2, test_iter2, loss2, num_epochs2, trainer2)



(2)尝试不同的激活函数,哪个激活函数效果最好?

python 复制代码
net3 = nn.Sequential(nn.Flatten(),
                    nn.Linear(784, 256),
                    nn.Sigmoid(),
                    nn.Linear(256, 10))

net4 = nn.Sequential(nn.Flatten(),
                    nn.Linear(784, 256),
                    nn.Tanh(),
                    nn.Linear(256, 10))

def init_weights(m):
    if type(m) == nn.Linear:
        nn.init.normal_(m.weight, std=0.01)

net3.apply(init_weights)
net4.apply(init_weights)


train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
python 复制代码
batch_size, lr, num_epochs = 256, 0.1, 10
loss = nn.CrossEntropyLoss(reduction='none')
trainer = torch.optim.SGD(net3.parameters(), lr=lr)
d2l.train_ch3(net3, train_iter, test_iter, loss, num_epochs, trainer)
复制代码
---------------------------------------------------------------------------

AssertionError                            Traceback (most recent call last)

Cell In[5], line 4
      2 loss = nn.CrossEntropyLoss(reduction='none')
      3 trainer = torch.optim.SGD(net3.parameters(), lr=lr)
----> 4 d2l.train_ch3(net3, train_iter, test_iter, loss, num_epochs, trainer)


File c:\Software\Miniconda3\envs\d2l\lib\site-packages\d2l\torch.py:340, in train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)
    338     animator.add(epoch + 1, train_metrics + (test_acc,))
    339 train_loss, train_acc = train_metrics
--> 340 assert train_loss < 0.5, train_loss
    341 assert train_acc <= 1 and train_acc > 0.7, train_acc
    342 assert test_acc <= 1 and test_acc > 0.7, test_acc


AssertionError: 0.5017133202234904
python 复制代码
batch_size, lr, num_epochs = 256, 0.1, 10
loss = nn.CrossEntropyLoss(reduction='none')
trainer = torch.optim.SGD(net4.parameters(), lr=lr)
d2l.train_ch3(net4, train_iter, test_iter, loss, num_epochs, trainer)


还是 ReLU 比较奈斯。


(3)尝试不同的方案来初始化权重,什么方案效果最好。

累了,不想试试了。略...

相关推荐
神州问学2 分钟前
【AI洞察】别再只想着“让AI听你话”,人类也需要学习“适应AI”!
人工智能
CoovallyAIHub20 分钟前
中科大DSAI Lab团队多篇论文入选ICCV 2025,推动三维视觉与泛化感知技术突破
深度学习·算法·计算机视觉
DevUI团队22 分钟前
🚀 MateChat V1.8.0 震撼发布!对话卡片可视化升级,对话体验全面进化~
前端·vue.js·人工智能
聚客AI25 分钟前
🎉7.6倍训练加速与24倍吞吐提升:两项核心技术背后的大模型推理优化全景图
人工智能·llm·掘金·日新计划
黎燃35 分钟前
当 YOLO 遇见编剧:用自然语言生成技术把“目标检测”写成“目标剧情”
人工智能
算家计算36 分钟前
AI教母李飞飞团队发布最新空间智能模型!一张图生成无限3D世界,元宇宙越来越近了
人工智能·资讯
掘金一周39 分钟前
Flutter Riverpod 3.0 发布,大规模重构下的全新状态管理框架 | 掘金一周 9.18
前端·人工智能·后端
CoovallyAIHub1 小时前
开源的消逝与新生:从 TensorFlow 的落幕到开源生态的蜕变
pytorch·深度学习·llm
用户5191495848451 小时前
C#记录类型与集合的深度解析:从默认实现到自定义比较器
人工智能·aigc
IT_陈寒4 小时前
React 18实战:7个被低估的Hooks技巧让你的开发效率提升50%
前端·人工智能·后端