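Contents
- [1 Improving model performance](#1 Improving model performance)
- [2 Fitting the circle data with a linear model](#2 Fitting the circle data with a linear model)
- [3 Fitting a linear equation with a linear model](#3 Fitting a linear equation with a linear model)
- [4 Fitting the circle data with non-linear activation functions](#4 Fitting the circle data with non-linear activation functions)
- [5 Replicating non-linear activation functions](#5 Replicating non-linear activation functions)
  - [5.1 ReLU](#5.1 ReLU)
  - [5.2 sigmoid](#5.2 sigmoid)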
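OK, today we're going to look at how to improve a model's performance.
1 Improving model performance
Here are a few ideas we can consider:
- Add more layers
- Add more neurons (hidden units) per layer
- Train for more epochs
- Choose a better loss function
- Adjust the learning rate
- Choose a better optimizer
- Change the activation function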
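To make the list concrete, here is a rough sketch of where each of those knobs lives in PyTorch code. The `build_model` helper and its parameter names are hypothetical (they are not part of this post), and the particular values and the choice of `optim.Adam` are only for illustration.

```python
import torch
from torch import nn, optim

def build_model(in_features=2, hidden_units=10, n_hidden_layers=2, activation=None):
    """Hypothetical helper: stack Linear layers, optionally with an activation in between."""
    layers = []
    width = in_features
    for _ in range(n_hidden_layers):
        layers.append(nn.Linear(width, hidden_units))  # more hidden_units = more neurons
        if activation is not None:
            layers.append(activation())                # e.g. nn.ReLU for non-linearity
        width = hidden_units
    layers.append(nn.Linear(width, 1))                 # one output logit
    return nn.Sequential(*layers)

# Each argument below maps to one item in the list above (values are illustrative).
model = build_model(hidden_units=16, n_hidden_layers=3, activation=nn.ReLU)
loss_fn = nn.BCEWithLogitsLoss()                       # loss function
optimizer = optim.Adam(model.parameters(), lr=0.01)    # optimizer and learning rate
epochs = 200                                           # number of training epochs
```

Changing one knob at a time makes it much easier to tell which change actually helped.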
Let's build a model and see how it goes. Let's have a try!
2 Fitting the circle data with a linear model
python
# Create the data
from sklearn.datasets import make_circles

# Make 1000 samples
n_samples = 1000

# Create the circle samples
X, y = make_circles(n_samples,
                    noise=0.03,       # noise added to each point
                    random_state=42)  # fixed seed so we get the same values every time
Let's visualize the data.
python
import matplotlib.pyplot as plt
plt.scatter(x=X[:,0],
y=X[:,1],
c=y,
cmap=plt.cm.RdYlBu)
python
# Convert the data to tensors with PyTorch's default dtype (float32)
import torch
X = torch.from_numpy(X).type(torch.float)
y = torch.from_numpy(y).type(torch.float)

# Look at the first five samples
X[:5], y[:5]
(tensor([[ 0.7542, 0.2315],
[-0.7562, 0.1533],
[-0.8154, 0.1733],
[-0.3937, 0.6929],
[ 0.4422, -0.8967]]),
tensor([1., 1., 1., 1., 0.]))
python
# Split the data into training and test sets
from sklearn.model_selection import train_test_split

# test_size=0.2 holds out 20% of the data for testing. The split is random,
# so random_state=42 makes the result reproducible.
X_train, X_test, y_train, y_test = train_test_split(X,
                                                    y,
                                                    test_size=0.2,
                                                    random_state=42)
len(X_train), len(y_train), len(X_test), len(y_test)
(800, 800, 200, 200)
python
import torch
import torch.nn as nn
from torch import optim  # needed for the optim.SGD calls later on

# Device-agnostic setup
device = "cuda" if torch.cuda.is_available() else "cpu"
device
python
import torch.nn as nn

class CircleClassificationV2(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer_1 = nn.Linear(in_features=2, out_features=10)
        self.layer_2 = nn.Linear(in_features=10, out_features=10)
        self.layer_3 = nn.Linear(in_features=10, out_features=1)

    def forward(self, x):
        # Three linear layers in a row, with no activation in between
        return self.layer_3(self.layer_2(self.layer_1(x)))

model_2 = CircleClassificationV2().to(device)
model_2
CircleClassificationV2(
(layer_1): Linear(in_features=2, out_features=10, bias=True)
(layer_2): Linear(in_features=10, out_features=10, bias=True)
(layer_3): Linear(in_features=10, out_features=1, bias=True)
)
You can see there are three linear layers here, and the number of out_features (hidden units) has gone up as well.
python
# Loss function
loss_fn = nn.BCEWithLogitsLoss()
# Optimizer
optimizer = optim.SGD(params=model_2.parameters(),
                      lr=0.1)
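The training loops in this post call `accuracy_fn`, which is never defined here. A minimal sketch follows, assuming it simply returns the percentage of predictions that match the labels (that assumption fits the `%` in the printed logs, but the original helper may differ).

```python
import torch

def accuracy_fn(y_true, y_pred):
    """Assumed helper: percentage of predictions equal to the true labels."""
    correct = torch.eq(y_true, y_pred).sum().item()  # number of matching entries
    return correct / len(y_pred) * 100

# Small usage example: 3 of 4 predictions are right.
example_acc = accuracy_fn(y_true=torch.tensor([1., 0., 1., 1.]),
                          y_pred=torch.tensor([1., 0., 0., 1.]))
print(example_acc)
```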
python
# Number of training epochs
epochs = 1000

# Put all the data on the same device
X_train, y_train = X_train.to(device), y_train.to(device)
X_test, y_test = X_test.to(device), y_test.to(device)

# Training loop
for epoch in range(epochs):
    model_2.train()
    # Forward pass: logits -> probabilities -> labels
    y_logits = model_2(X_train).squeeze()
    y_pred = torch.round(torch.sigmoid(y_logits))
    # Loss (BCEWithLogitsLoss takes the raw logits)
    loss = loss_fn(y_logits,
                   y_train)
    acc = accuracy_fn(y_true=y_train,
                      y_pred=y_pred)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Testing
    model_2.eval()
    with torch.inference_mode():
        test_logits = model_2(X_test).squeeze()
        test_pred = torch.round(torch.sigmoid(test_logits))
        test_loss = loss_fn(test_logits,
                            y_test)
        test_acc = accuracy_fn(y_true=y_test,
                               y_pred=test_pred)

    # Print progress every 100 epochs
    if epoch % 100 == 0:
        print(f"Epoch:{epoch} | Train loss:{loss:.5f} | Train accuracy:{acc:.2f}% | Test loss:{test_loss:.4f} | Test accuracy:{test_acc:.2f}%")
Epoch:0 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:100 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:200 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:300 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:400 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:500 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:600 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:700 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:800 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:900 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
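A loss stuck near 0.693 is not a random number: it is roughly ln 2, which is what binary cross-entropy gives when the model predicts probability 0.5 for every sample. A small check (the labels below are made up for illustration):

```python
import math
import torch
from torch import nn

loss_fn = nn.BCEWithLogitsLoss()

labels = torch.tensor([0., 1., 1., 0.])   # made-up labels
logits = torch.zeros(4)                   # logit 0 -> sigmoid(0) = 0.5 for every sample

coin_flip_loss = loss_fn(logits, labels).item()
print(coin_flip_loss, math.log(2))        # both are about 0.6931
```

So a loss that hovers around 0.69 means the model is barely better than always answering "50/50".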
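The classification loss isn't going down and the accuracy is stuck at 50%. That means the model is classifying at random, just like flipping a coin: 50% heads, 50% tails. Let's visualize it.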
python
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.title("Training")
plot_decision_boundary(model_2, X_train, y_train)
plt.subplot(1, 2, 2)
plt.title("Testing")
plot_decision_boundary(model_2, X_test, y_test)
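From this plot we can see that the decision boundary is still a straight line, up in the top-right corner. So is our model not learning at all? Remember the linear regression we studied earlier, y = weight * X + bias? Let's try the model on that kind of data to find out whether it can learn anything.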
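The `plot_decision_boundary` helper called above is not defined in this post. Here is one possible sketch, assuming a binary classifier that takes 2 features and outputs one logit per sample. The `decision_grid` function and the `steps` parameter are names made up for this sketch, and the original helper may well be implemented differently.

```python
import torch
from torch import nn

def decision_grid(model, X, steps=101):
    """Hypothetical helper: predict a 0/1 label for every point on a grid covering X."""
    device = next(model.parameters()).device
    X = X.detach().cpu()
    x_min, x_max = X[:, 0].min().item() - 0.1, X[:, 0].max().item() + 0.1
    y_min, y_max = X[:, 1].min().item() - 0.1, X[:, 1].max().item() + 0.1
    xx, yy = torch.meshgrid(torch.linspace(x_min, x_max, steps),
                            torch.linspace(y_min, y_max, steps),
                            indexing="xy")
    # Flatten the grid into a (steps*steps, 2) batch of points
    grid = torch.stack([xx.reshape(-1), yy.reshape(-1)], dim=1).to(device)
    model.eval()
    with torch.inference_mode():
        labels = torch.round(torch.sigmoid(model(grid))).reshape(xx.shape)
    return xx, yy, labels.cpu()

def plot_decision_boundary(model, X, y):
    """Sketch: shade the predicted class regions, then overlay the data points."""
    import matplotlib.pyplot as plt  # imported here so decision_grid works without it
    xx, yy, labels = decision_grid(model, X)
    plt.contourf(xx.numpy(), yy.numpy(), labels.numpy(), cmap=plt.cm.RdYlBu, alpha=0.7)
    X, y = X.detach().cpu(), y.detach().cpu()
    plt.scatter(X[:, 0].numpy(), X[:, 1].numpy(), c=y.numpy(), s=40, cmap=plt.cm.RdYlBu)
```

Because the grid predictions are computed on whatever device the model lives on and then moved back to the CPU, the helper can be called with tensors on either device.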
3 Fitting a linear equation with a linear model
python
# Create the data
weight = 0.7
bias = 0.3
start = 0
end = 1
step = 0.01
X = torch.arange(start, end, step).unsqueeze(dim=1)
y = weight * X + bias
print(len(X), len(y))
print(X[:5], y[:5])
100 100
tensor([[0.0000],
[0.0100],
[0.0200],
[0.0300],
[0.0400]]) tensor([[0.3000],
[0.3070],
[0.3140],
[0.3210],
[0.3280]])
python
# Split the data into training and test sets (first 80% for training)
train_split = int(0.8 * len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]
len(X_train), len(y_train), len(X_test), len(y_test)
(80, 80, 20, 20)
python
plot_predictions(train_data=X_train,
train_labels=y_train,
test_data=X_test,
test_labels=y_test)
python
# Device-agnostic setup
device = "cuda" if torch.cuda.is_available() else "cpu"
device
'cuda'
python
# Set the random seed on the CPU
torch.manual_seed(42)
# Set the random seed on the GPU
torch.cuda.manual_seed(42)
# Move the data to the target device
X_train, y_train = X_train.to(device), y_train.to(device)
X_test, y_test = X_test.to(device), y_test.to(device)
python
model_2
CircleClassificationV2(
(layer_1): Linear(in_features=2, out_features=10, bias=True)
(layer_2): Linear(in_features=10, out_features=10, bias=True)
(layer_3): Linear(in_features=10, out_features=1, bias=True)
)
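This shows that the model expects 2 input features, but our linear regression data has only 1, so we need to change that.
python
# Build the model with nn.Sequential, which is simpler for a purely sequential stack; same structure as model_2 apart from in_features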
model_1 = nn.Sequential(
nn.Linear(in_features = 1, out_features = 10),
nn.Linear(in_features = 10, out_features = 10),
nn.Linear(in_features = 10, out_features = 1)
)
model_1.to(device)
model_1
Sequential(
(0): Linear(in_features=1, out_features=10, bias=True)
(1): Linear(in_features=10, out_features=10, bias=True)
(2): Linear(in_features=10, out_features=1, bias=True)
)
python
# Loss function: this is a regression problem, so we use MAE (nn.L1Loss)
loss_fn = nn.L1Loss()
# Optimizer
optimizer = optim.SGD(params=model_1.parameters(),
                      lr=0.01)
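For reference, `nn.L1Loss` with its default settings is just the mean absolute error (MAE): the average of |prediction - target|. A tiny check with made-up numbers:

```python
import torch
from torch import nn

# Made-up predictions and targets, shaped like the regression data (N, 1)
pred = torch.tensor([[0.5], [1.0], [2.0]])
target = torch.tensor([[1.0], [1.0], [1.0]])

mae_builtin = nn.L1Loss()(pred, target)     # default reduction="mean"
mae_manual = (pred - target).abs().mean()   # (0.5 + 0.0 + 1.0) / 3

print(mae_builtin.item(), mae_manual.item())
```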
python
# Number of training epochs
epochs = 1000

for epoch in range(epochs):
    # Training
    model_1.train()
    # Forward pass
    y_pred = model_1(X_train)
    # Loss
    loss = loss_fn(y_pred,
                   y_train)
    # Zero the gradients
    optimizer.zero_grad()
    # Backpropagation
    loss.backward()
    # Gradient descent step
    optimizer.step()

    # Testing
    model_1.eval()
    with torch.inference_mode():
        test_pred = model_1(X_test)
        test_loss = loss_fn(test_pred,
                            y_test)

    # Print progress every 100 epochs
    if epoch % 100 == 0:
        print(f"Epoch:{epoch} | Train loss:{loss:.4f} | Test loss:{test_loss:.4f}")
Epoch:0 | Train loss:0.7599 | Test loss:0.9110
Epoch:100 | Train loss:0.0286 | Test loss:0.0008
Epoch:200 | Train loss:0.0253 | Test loss:0.0021
Epoch:300 | Train loss:0.0214 | Test loss:0.0031
Epoch:400 | Train loss:0.0196 | Test loss:0.0034
Epoch:500 | Train loss:0.0194 | Test loss:0.0039
Epoch:600 | Train loss:0.0190 | Test loss:0.0038
Epoch:700 | Train loss:0.0188 | Test loss:0.0038
Epoch:800 | Train loss:0.0184 | Test loss:0.0033
Epoch:900 | Train loss:0.0180 | Test loss:0.0036
Let's visualize it.
python
model_1.eval()
with torch.inference_mode():
    y_pred = model_1(X_test)
python
plot_predictions(train_data=X_train.cpu(),
train_labels=y_train.cpu(),
test_data=X_test.cpu(),
test_labels=y_test.cpu(),
predictions=y_pred.cpu())
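Tweak the learning rate a little and the red and green points move closer together right away. So the model we defined really does learn from data. Then why did it do so badly before? Think about it: our data is circular, which is clearly not the same thing as linear. That brings us to a concept we can't avoid: non-linearity. If you've studied machine learning, you've probably heard of ReLU(), Sigmoid() and Tanh().
These are all non-linear functions, and it's only by adding non-linearity that the model can fit data like this well.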
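Why did three linear layers not help with the circles? Because a stack of linear layers with nothing in between is still just one linear layer: W3(W2(W1 x + b1) + b2) + b3 can be rewritten as W x + b. The sketch below checks this numerically for a stack with the same layer sizes as `model_2`; the variable names are only for illustration.

```python
import torch
from torch import nn

torch.manual_seed(42)

# Same layer sizes as model_2: 2 -> 10 -> 10 -> 1, no activations
stack = nn.Sequential(nn.Linear(2, 10), nn.Linear(10, 10), nn.Linear(10, 1))
l1, l2, l3 = stack

with torch.no_grad():
    # Collapse the three layers into a single weight matrix and bias
    W = l3.weight @ l2.weight @ l1.weight                       # shape (1, 2)
    b = l3.weight @ (l2.weight @ l1.bias + l2.bias) + l3.bias   # shape (1,)

    single = nn.Linear(2, 1)
    single.weight.copy_(W)
    single.bias.copy_(b)

    x = torch.randn(5, 2)
    same = torch.allclose(stack(x), single(x), atol=1e-5)

print(same)  # the stack and the single layer give the same outputs
```

Since a single linear layer can only draw a straight decision boundary, adding more of them (or more hidden units) cannot separate one circle from another; an activation function between the layers is what breaks this equivalence.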
4 Fitting the circle data with non-linear activation functions
python
# Create the data
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
X, y = make_circles(n_samples=1000,
noise=0.03,
random_state=42)
# Convert X and y to tensors
X = torch.from_numpy(X).type(torch.float)
y = torch.from_numpy(y).type(torch.float)
X[:5], y[:5]
(tensor([[ 0.7542, 0.2315],
[-0.7562, 0.1533],
[-0.8154, 0.1733],
[-0.3937, 0.6929],
[ 0.4422, -0.8967]]),
tensor([1., 1., 1., 1., 0.]))
python
# Plot the data to take a look
import matplotlib.pyplot as plt
plt.scatter(x = X[:,0],
y = X[:,1],
c=y,
cmap=plt.cm.RdYlBu)
python
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X,
y,
test_size=0.2,
random_state=42)
len(X_train), len(y_train), len(X_test), len(y_test)
(800, 800, 200, 200)
python
# Device-agnostic setup
device = "cuda" if torch.cuda.is_available() else "cpu"
device
'cuda'
python
# Put all the data on the same device
X_train, y_train = X_train.to(device), y_train.to(device)
X_test, y_test = X_test.to(device), y_test.to(device)
Create the model, instantiate it, and move it to the target device.
python
class CircleClassificationV3(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(in_features=2, out_features=10)
        self.layer2 = nn.Linear(in_features=10, out_features=10)
        self.layer3 = nn.Linear(in_features=10, out_features=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        # Note that ReLU is applied twice here: after layer1 and after layer2
        return self.layer3(self.relu(self.layer2(self.relu(self.layer1(x)))))

model_3 = CircleClassificationV3().to(device)
model_3
CircleClassificationV3(
(layer1): Linear(in_features=2, out_features=10, bias=True)
(layer2): Linear(in_features=10, out_features=10, bias=True)
(layer3): Linear(in_features=10, out_features=1, bias=True)
(relu): ReLU()
)
python
# Loss function
loss_fn = nn.BCEWithLogitsLoss()
# Optimizer
optimizer = optim.SGD(params=model_3.parameters(),
                      lr=0.1)
python
print((model_3(X_train).squeeze()).shape)
print(y_train.shape)
torch.Size([800])
torch.Size([800])
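The shape check above matters because the last `nn.Linear` layer returns a column of shape `[N, 1]`, while the labels have shape `[N]`. `squeeze()` drops the extra dimension so the two line up for the loss. A small sketch with made-up data:

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(2, 1)                    # stands in for the model's last layer
x = torch.randn(8, 2)                      # 8 made-up samples with 2 features
y = torch.randint(0, 2, (8,)).float()      # 8 made-up binary labels

raw = layer(x)                             # shape [8, 1]
logits = raw.squeeze()                     # shape [8], matches y
loss = nn.BCEWithLogitsLoss()(logits, y)

print(raw.shape, logits.shape, y.shape)
```

One caveat: a bare `squeeze()` removes every size-1 dimension, so a batch containing a single sample would collapse to a scalar. `squeeze(dim=1)` only removes the feature dimension and is the safer choice in that case.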
python
# Set the random seeds
torch.manual_seed(42)
torch.cuda.manual_seed(42)

# Number of training epochs
epochs = 1000

for epoch in range(epochs):
    # Training phase
    model_3.train()
    y_logits = model_3(X_train).squeeze()
    y_pred = torch.round(torch.sigmoid(y_logits))
    loss = loss_fn(y_logits,
                   y_train)
    acc = accuracy_fn(y_true=y_train,
                      y_pred=y_pred)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Testing phase
    model_3.eval()
    with torch.inference_mode():
        test_logits = model_3(X_test).squeeze()
        test_pred = torch.round(torch.sigmoid(test_logits))
        test_loss = loss_fn(test_logits,
                            y_test)
        test_acc = accuracy_fn(y_true=y_test,
                               y_pred=test_pred)

    # Print progress every 100 epochs
    if epoch % 100 == 0:
        print(f"Epoch:{epoch} | Train Loss:{loss:.4f} | Train Accuracy:{acc:.2f}% | Test Loss:{test_loss:.4f} | Test Accuracy:{test_acc:.2f}%")
Epoch:0 | Train Loss:0.6929 | Train Accuracy:50.00% | Test Loss:0.6932 | Test Accuracy:50.00%
Epoch:100 | Train Loss:0.6912 | Train Accuracy:52.88% | Test Loss:0.6910 | Test Accuracy:52.50%
Epoch:200 | Train Loss:0.6898 | Train Accuracy:53.37% | Test Loss:0.6894 | Test Accuracy:55.00%
Epoch:300 | Train Loss:0.6879 | Train Accuracy:53.00% | Test Loss:0.6872 | Test Accuracy:56.00%
Epoch:400 | Train Loss:0.6852 | Train Accuracy:52.75% | Test Loss:0.6841 | Test Accuracy:56.50%
Epoch:500 | Train Loss:0.6810 | Train Accuracy:52.75% | Test Loss:0.6794 | Test Accuracy:56.50%
Epoch:600 | Train Loss:0.6751 | Train Accuracy:54.50% | Test Loss:0.6729 | Test Accuracy:56.00%
Epoch:700 | Train Loss:0.6666 | Train Accuracy:58.38% | Test Loss:0.6632 | Test Accuracy:59.00%
Epoch:800 | Train Loss:0.6516 | Train Accuracy:64.00% | Test Loss:0.6476 | Test Accuracy:67.50%
Epoch:900 | Train Loss:0.6236 | Train Accuracy:74.00% | Test Loss:0.6215 | Test Accuracy:79.00%
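The loop above turns raw model outputs into class labels in two steps: `torch.sigmoid` maps each logit to a probability, and `torch.round` turns that probability into 0 or 1. The sketch below walks through both steps on a few made-up logits, and also checks that `nn.BCEWithLogitsLoss` on logits agrees with `nn.BCELoss` on the sigmoid probabilities.

```python
import torch
from torch import nn

logits = torch.tensor([-2.0, -0.1, 0.3, 4.0])   # made-up raw model outputs
targets = torch.tensor([0., 0., 1., 1.])        # made-up labels

probs = torch.sigmoid(logits)       # logits -> probabilities in (0, 1)
labels = torch.round(probs)         # probabilities -> 0/1 labels
shortcut = (logits > 0).float()     # same labels: sigmoid(z) > 0.5 exactly when z > 0

loss_from_logits = nn.BCEWithLogitsLoss()(logits, targets)
loss_from_probs = nn.BCELoss()(probs, targets)

print(labels, shortcut)
print(loss_from_logits.item(), loss_from_probs.item())
```

This is why the loss is computed on `y_logits` while the accuracy is computed on `y_pred`: the loss function applies the sigmoid internally.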
Let's plot it and take a quick look.
python
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.title("Train")
plot_decision_boundary(model_3, X_train, y_train)
plt.subplot(1, 2, 2)
plt.title("Test")
plot_decision_boundary(model_3, X_test, y_test)
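Wow, this plot looks really nice. The model has managed to draw a circle, hasn't it? What does that prove? It proves the activation function is doing its job. We can keep tuning the model using the ideas listed at the start. Next up, we'll pull all of these steps together into one complete workflow!
5 Replicating non-linear activation functions
Earlier we experimented with adding activation functions to our model so that it could model non-linear data.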
python
# Create a simple tensor
A = torch.arange(-10, 10, 1, dtype=torch.float)
A
tensor([-10., -9., -8., -7., -6., -5., -4., -3., -2., -1., 0., 1.,
2., 3., 4., 5., 6., 7., 8., 9.])
python
plt.plot(A)
Next, let's see how ReLU affects it.
5.1 ReLU
python
def relu(x):
    # Element-wise max(0, x)
    return torch.maximum(torch.tensor(0), x)

relu(A)
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 2., 3., 4., 5., 6., 7.,
8., 9.])
All the negative values have become 0.
python
plt.plot(relu(A))
5.2 sigmoid
python
def sigmoid(x):
    # 1 / (1 + e^(-x)), applied element-wise
    return 1 / (1 + torch.exp(-x))

sigmoid(A)
tensor([4.5398e-05, 1.2339e-04, 3.3535e-04, 9.1105e-04, 2.4726e-03, 6.6929e-03,
1.7986e-02, 4.7426e-02, 1.1920e-01, 2.6894e-01, 5.0000e-01, 7.3106e-01,
8.8080e-01, 9.5257e-01, 9.8201e-01, 9.9331e-01, 9.9753e-01, 9.9909e-01,
9.9966e-01, 9.9988e-01])
python
plt.plot(sigmoid(A))
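As a sanity check, the hand-written versions can be compared against PyTorch's built-in `torch.relu` and `torch.sigmoid`. The sketch below redefines both functions so it runs on its own, and adds a hand-written `tanh` (not covered above, shown only as an extra illustration of the same idea).

```python
import torch

A = torch.arange(-10, 10, 1, dtype=torch.float)

def relu(x):
    return torch.maximum(torch.tensor(0.0), x)

def sigmoid(x):
    return 1 / (1 + torch.exp(-x))

def tanh(x):
    # Extra illustration: (e^x - e^-x) / (e^x + e^-x)
    return (torch.exp(x) - torch.exp(-x)) / (torch.exp(x) + torch.exp(-x))

relu_ok = torch.allclose(relu(A), torch.relu(A))
sigmoid_ok = torch.allclose(sigmoid(A), torch.sigmoid(A), atol=1e-6)
tanh_ok = torch.allclose(tanh(A), torch.tanh(A), atol=1e-6)

print(relu_ok, sigmoid_ok, tanh_ok)
```

In real models it is usually better to stick with the built-in versions; the hand-written ones are just for understanding what they compute.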
OK, today's study task went nice and smoothly!