A Beginner's PyTorch Study Notes, Part 9: Improving Classification Model Performance and Non-linear Activation Functions

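Contents

  • [1 Improving model performance](#1 Improving model performance)
  • [2 Fitting the circle data with a linear model](#2 Fitting the circle data with a linear model)
  • [3 Fitting a linear equation with a linear model](#3 Fitting a linear equation with a linear model)
  • [4 Fitting the circle data with non-linear activation functions](#4 Fitting the circle data with non-linear activation functions)
  • [5 Replicating non-linear activation functions](#5 Replicating non-linear activation functions)
    • [5.1 ReLU](#5.1 ReLU)
    • [5.2 sigmoid](#5.2 sigmoid)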

OK, today we'll learn how to improve a model's performance.

1 Improving model performance

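Here are several ideas we can consider:

  • Add more layers to the model
  • Add more neurons (hidden units) per layer
  • Train for more epochs
  • Choose a better loss function
  • Adjust the learning rate
  • Choose a better optimizer
  • Change the activation function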
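Several of these knobs (number of layers, number of hidden units, activation function) become plain arguments once a model is built programmatically. The following is only a minimal sketch: `build_mlp` is a hypothetical helper introduced here for illustration, not something used elsewhere in this series.

```python
import torch
from torch import nn


def build_mlp(in_features, out_features, hidden_units=10, n_hidden_layers=2,
              activation=None):
    """Hypothetical helper: stack Linear layers, optionally with an activation.

    `activation` is a class such as nn.ReLU, or None for a purely linear stack.
    """
    layers = []
    width = in_features
    for _ in range(n_hidden_layers):
        layers.append(nn.Linear(width, hidden_units))
        if activation is not None:
            layers.append(activation())
        width = hidden_units
    layers.append(nn.Linear(width, out_features))
    return nn.Sequential(*layers)


# More layers, more hidden units and a different activation in one call.
bigger_model = build_mlp(in_features=2, out_features=1,
                         hidden_units=16, n_hidden_layers=3,
                         activation=nn.ReLU)
n_params = sum(p.numel() for p in bigger_model.parameters())
print(bigger_model)
print("trainable parameters:", n_params)
```

The remaining ideas (epochs, loss function, learning rate, optimizer) are changed in the training setup rather than in the model itself.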

Let's build a model and see what happens. Let's give it a try!

2 Fitting the circle data with a linear model

python
# Make the data
from sklearn.datasets import make_circles

# Create 1000 samples
n_samples = 1000

# Create the circle samples
X, y = make_circles(n_samples,
                    noise=0.03, # noise added to each point
                    random_state=42) # fixed seed so we get the same values every time

Let's visualize the data.

python
import matplotlib.pyplot as plt
plt.scatter(x=X[:,0],
            y=X[:,1],
            c=y,
            cmap=plt.cm.RdYlBu)
python
# Convert the data to tensors with the default dtype (float32)
import torch
X = torch.from_numpy(X).type(torch.float)
y = torch.from_numpy(y).type(torch.float)

# Look at the first five samples
X[:5],y[:5]

(tensor([[ 0.7542,  0.2315],
         [-0.7562,  0.1533],
         [-0.8154,  0.1733],
         [-0.3937,  0.6929],
         [ 0.4422, -0.8967]]),
 tensor([1., 1., 1., 1., 0.]))

python
# Split the data into training and test sets
from sklearn.model_selection import train_test_split

# test_size=0.2 puts 20% of the data in the test set; the split is random, so random_state=42 is set to make it reproducible
X_train, X_test, y_train, y_test = train_test_split(X,
                                                    y, 
                                                    test_size=0.2,
                                                    random_state=42)
len(X_train), len(y_train), len(X_test), len(y_test)

(800, 800, 200, 200)

python
import torch
import torch.nn as nn

# Device-agnostic setup
device = "cuda" if torch.cuda.is_available() else "cpu"
device

python
class CircleClassificationV2(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer_1 = nn.Linear(in_features=2, out_features=10)
        self.layer_2 = nn.Linear(in_features=10, out_features=10)
        self.layer_3 = nn.Linear(in_features=10, out_features=1)
    
    def forward(self, x):
        return self.layer_3(self.layer_2(self.layer_1(x)))
    
model_2 = CircleClassificationV2().to(device)
model_2

CircleClassificationV2(
  (layer_1): Linear(in_features=2, out_features=10, bias=True)
  (layer_2): Linear(in_features=10, out_features=10, bias=True)
  (layer_3): Linear(in_features=10, out_features=1, bias=True)
)

You can see that there are three linear layers here, and that the number of out_features (hidden units) has been increased as well.

python
# Loss function
loss_fn = nn.BCEWithLogitsLoss()

# Optimizer (torch.optim is used directly, since `optim` was never imported on its own)
optimizer = torch.optim.SGD(params=model_2.parameters(),
                            lr=0.1)
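
The training loops in this post call `accuracy_fn`, which is never defined here; it presumably comes from an earlier part of the series. To keep the code runnable on its own, here is a minimal sketch of what such a helper could look like, assuming it takes `y_true` and `y_pred` tensors of 0/1 labels and returns a percentage. The real helper may differ.

```python
import torch


def accuracy_fn(y_true, y_pred):
    """Assumed helper: percentage of predictions that match the labels."""
    correct = torch.eq(y_true, y_pred).sum().item()
    return correct / len(y_pred) * 100


# Tiny check with made-up labels: 3 of the 4 predictions are correct.
example_acc = accuracy_fn(y_true=torch.tensor([1., 0., 1., 1.]),
                          y_pred=torch.tensor([1., 0., 0., 1.]))
print(f"{example_acc:.2f}%")
```
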
python
# Number of training epochs
epochs = 1000

# Put all the data on the same device as the model
X_train, y_train = X_train.to(device), y_train.to(device)
X_test, y_test = X_test.to(device), y_test.to(device)

# Training loop
# (accuracy_fn is not defined in this post; it is assumed to be the helper from an earlier part of the series)
for epoch in range(epochs):
    model_2.train()
    y_logits = model_2(X_train).squeeze()          # raw logits
    y_pred = torch.round(torch.sigmoid(y_logits))  # logits -> probabilities -> 0/1 labels
    
    # Loss (BCEWithLogitsLoss expects raw logits) and accuracy
    loss = loss_fn(y_logits,
                   y_train)
    acc = accuracy_fn(y_true=y_train,
                      y_pred=y_pred)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
    # Testing (fixed: the original called model_0.eval(), but the model being trained is model_2)
    model_2.eval()
    with torch.inference_mode():
        test_logits = model_2(X_test).squeeze()
        test_pred = torch.round(torch.sigmoid(test_logits))
        test_loss = loss_fn(test_logits, 
                            y_test)
        test_acc = accuracy_fn(y_true=y_test,
                               y_pred=test_pred)
    # Print progress every 100 epochs
    if epoch % 100 == 0:
        print(f"Epoch:{epoch} | Train loss:{loss:.5f} | Train accuracy:{acc:.2f}% | Test loss:{test_loss:.4f} | Test accuracy:{test_acc:.2f}%")

Epoch:0 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:100 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:200 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:300 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:400 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:500 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:600 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:700 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:800 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%
Epoch:900 | Train loss:0.69381 | Train accuracy:50.00% | Test loss:0.6957 | Test accuracy:50.00%

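The classification loss never goes down and the accuracy stays at 50%. In other words, the model is classifying at random, like flipping a coin: 50% heads, 50% tails. Let's visualize it.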
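
A loss stuck near 0.693 is itself a clue: that is roughly ln 2, the binary cross-entropy of a model that predicts a probability of 0.5 for every sample. The check below is a sketch of the arithmetic in plain Python; `bce_from_logit` is a hypothetical helper for illustration and says nothing about how `nn.BCEWithLogitsLoss` is implemented internally.

```python
import math


def bce_from_logit(logit, label):
    """Binary cross-entropy for one sample, given a raw logit and a 0/1 label."""
    prob = 1.0 / (1.0 + math.exp(-logit))  # sigmoid turns the logit into a probability
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


# A logit of 0 means "probability 0.5", whatever the true label is.
loss_if_label_1 = bce_from_logit(0.0, 1)
loss_if_label_0 = bce_from_logit(0.0, 0)
print(loss_if_label_1, loss_if_label_0, math.log(2))
```

So a loss that hovers around ln 2 suggests the model has not moved away from "I can't tell the classes apart".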

python
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.title("Training")
plot_decision_boundary(model_2, X_train, y_train)
plt.subplot(1, 2, 2)
plt.title("Testing")
plot_decision_boundary(model_2, X_test, y_test)

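From the plot we can see that the decision boundary is still a straight line, up in the top-right corner. Does that mean our model isn't learning anything? Remember the linear regression we studied earlier, y = weight * X + bias? Let's try the model on that kind of data to find out whether it can learn at all.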

3 Fitting a linear equation with a linear model

python
# Create the data
weight = 0.7
bias = 0.3

start = 0
end = 1
step = 0.01

X = torch.arange(start, end, step).unsqueeze(dim=1)
y = weight * X + bias

print(len(X), len(y))
print(X[:5], y[:5])

100 100
tensor([[0.0000],
        [0.0100],
        [0.0200],
        [0.0300],
        [0.0400]]) tensor([[0.3000],
        [0.3070],
        [0.3140],
        [0.3210],
        [0.3280]])

python
# Split the data into training and test sets (first 80% for training, last 20% for testing)
train_split = int(0.8 * len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]

len(X_train), len(y_train), len(X_test), len(y_test)

(80, 80, 20, 20)

python
plot_predictions(train_data=X_train,
                 train_labels=y_train,
                 test_data=X_test,
                 test_labels=y_test)

python
# Device-agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

python
# Set the random seed on the CPU
torch.manual_seed(42)

# Set the random seed on the GPU
torch.cuda.manual_seed(42)

# Move the data to the target device
X_train, y_train = X_train.to(device), y_train.to(device)
X_test, y_test = X_test.to(device), y_test.to(device)

python
model_2

CircleClassificationV2(
  (layer_1): Linear(in_features=2, out_features=10, bias=True)
  (layer_2): Linear(in_features=10, out_features=10, bias=True)
  (layer_3): Linear(in_features=10, out_features=1, bias=True)
)

This shows that model_2 expects 2 input features, but our linear regression data has only 1, so the input size has to be changed.
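
To see why the input size matters, here is a small sketch of what happens when data with one feature is fed to a layer built for two. The names `layer_expecting_2`, `layer_expecting_1` and `x_one_feature` are made up for this illustration.

```python
import torch
from torch import nn

layer_expecting_2 = nn.Linear(in_features=2, out_features=10)
x_one_feature = torch.rand(80, 1)  # same shape as the regression X_train

try:
    layer_expecting_2(x_one_feature)
    shape_error = None
except RuntimeError as err:
    shape_error = err  # an (80, 1) input cannot be multiplied with a (2, 10) weight

print("raised:", type(shape_error).__name__)

# A layer whose in_features matches the data works fine.
layer_expecting_1 = nn.Linear(in_features=1, out_features=10)
print(layer_expecting_1(x_one_feature).shape)
```
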

python
# Build the model with nn.Sequential, which is simpler for a purely sequential stack; same structure as model_2 apart from in_features=1
model_1 = nn.Sequential(
    nn.Linear(in_features = 1, out_features = 10),
    nn.Linear(in_features = 10, out_features = 10),
    nn.Linear(in_features = 10, out_features = 1)
)

model_1.to(device)
model_1

Sequential(
  (0): Linear(in_features=1, out_features=10, bias=True)
  (1): Linear(in_features=10, out_features=10, bias=True)
  (2): Linear(in_features=10, out_features=1, bias=True)
)

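python
# Loss function: this is a regression problem, so we use MAE (nn.L1Loss)
loss_fn = nn.L1Loss()

# Optimizer (torch.optim is used directly, since `optim` was never imported on its own)
optimizer = torch.optim.SGD(params=model_1.parameters(),
                            lr=0.01)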
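
MAE is simply the average absolute difference between predictions and targets. A minimal sketch with made-up numbers (the `_demo` tensors are not part of the tutorial data), comparing a hand computation with `nn.L1Loss`:

```python
import torch
from torch import nn

y_pred_demo = torch.tensor([[0.5], [1.0], [2.0]])
y_true_demo = torch.tensor([[1.0], [1.0], [1.0]])

# By hand: average of |prediction - target| = (0.5 + 0.0 + 1.0) / 3 = 0.5
mae_manual = torch.mean(torch.abs(y_pred_demo - y_true_demo))

# nn.L1Loss uses the mean reduction by default, so it should agree.
mae_builtin = nn.L1Loss()(y_pred_demo, y_true_demo)

print(mae_manual.item(), mae_builtin.item())
```
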
python
# Number of training epochs
epochs = 1000

for epoch in range(epochs):
    # Training
    model_1.train()
    
    # Forward pass
    y_pred = model_1(X_train)
    
    # Loss
    loss = loss_fn(y_pred, 
                   y_train)
    
    # Zero the gradients
    optimizer.zero_grad()
    
    # Backpropagation
    loss.backward()
    
    # Gradient descent step
    optimizer.step()
    
    # Testing
    model_1.eval()
    with torch.inference_mode():
        test_pred = model_1(X_test)
        test_loss = loss_fn(test_pred, 
                            y_test)
    
    # Print progress every 100 epochs
    if epoch % 100 == 0:
        print(f"Epoch:{epoch} | Train loss:{loss:.4f} | Test loss:{test_loss:.4f}")

Epoch:0 | Train loss:0.7599 | Test loss:0.9110
Epoch:100 | Train loss:0.0286 | Test loss:0.0008
Epoch:200 | Train loss:0.0253 | Test loss:0.0021
Epoch:300 | Train loss:0.0214 | Test loss:0.0031
Epoch:400 | Train loss:0.0196 | Test loss:0.0034
Epoch:500 | Train loss:0.0194 | Test loss:0.0039
Epoch:600 | Train loss:0.0190 | Test loss:0.0038
Epoch:700 | Train loss:0.0188 | Test loss:0.0038
Epoch:800 | Train loss:0.0184 | Test loss:0.0033
Epoch:900 | Train loss:0.0180 | Test loss:0.0036

Let's visualize it.

python
model_1.eval()
with torch.inference_mode():
    y_pred = model_1(X_test)

python
plot_predictions(train_data=X_train.cpu(),
                 train_labels=y_train.cpu(),
                 test_data=X_test.cpu(),
                 test_labels=y_test.cpu(),
                 predictions=y_pred.cpu())

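After adjusting the learning rate, the red and green points quickly move closer together. This shows that the model we defined really can learn from data. So why did it do so badly on the circles? Think about it: our circle data is round, which is clearly not something a straight line can describe. That brings us to an important concept: non-linearity. If you have studied machine learning before, you have probably heard of non-linear functions such as ReLU(), Sigmoid() and Tanh(). Only by adding non-linearity can the model learn this kind of data well.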
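
One way to see why stacking linear layers does not help: without an activation in between, several linear layers collapse into a single linear layer. The following sketch checks this numerically for a stack with the same layer sizes as model_2 (the variable names are made up for this illustration).

```python
import torch
from torch import nn

torch.manual_seed(42)

# Three stacked linear layers with no activation in between.
stack = nn.Sequential(
    nn.Linear(2, 10),
    nn.Linear(10, 10),
    nn.Linear(10, 1),
)
l1, l2, l3 = stack

with torch.no_grad():
    # Multiply the weights out: W = W3 W2 W1, b = W3 (W2 b1 + b2) + b3
    W = l3.weight @ l2.weight @ l1.weight
    b = l3.weight @ (l2.weight @ l1.bias + l2.bias) + l3.bias

    x = torch.rand(5, 2)
    out_stack = stack(x)       # output of the three-layer stack
    out_single = x @ W.T + b   # output of one equivalent linear map

print(torch.allclose(out_stack, out_single, atol=1e-5))
```

Since the whole stack is equivalent to one `x @ W.T + b`, its decision boundary can only ever be a straight line, no matter how many layers or hidden units are added.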

4 Fitting the circle data with non-linear activation functions

python
# Create the data
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split

X, y = make_circles(n_samples=1000,
                    noise=0.03,
                    random_state=42)

# Convert X and y to tensors
X = torch.from_numpy(X).type(torch.float)
y = torch.from_numpy(y).type(torch.float)
X[:5], y[:5]

(tensor([[ 0.7542,  0.2315],
         [-0.7562,  0.1533],
         [-0.8154,  0.1733],
         [-0.3937,  0.6929],
         [ 0.4422, -0.8967]]),
 tensor([1., 1., 1., 1., 0.]))

python
# Plot the data to take a look
import matplotlib.pyplot as plt
plt.scatter(x = X[:,0],
            y = X[:,1],
            c=y,
            cmap=plt.cm.RdYlBu)

python
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, 
                                                    y, 
                                                    test_size=0.2, 
                                                    random_state=42)
len(X_train), len(y_train), len(X_test), len(y_test)

(800, 800, 200, 200)

python
# Device-agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

python
# Put the data on the same device
X_train, y_train = X_train.to(device), y_train.to(device)
X_test, y_test = X_test.to(device), y_test.to(device)

Create the model, instantiate it, and move it to the target device.

python
class CircleClassificationV3(nn.Module):
    def __init__(self):
        super().__init__()
        
        self.layer1 = nn.Linear(in_features=2, out_features=10)
        self.layer2 = nn.Linear(in_features=10, out_features=10)
        self.layer3 = nn.Linear(in_features=10, out_features=1)
        self.relu = nn.ReLU()
        
    def forward(self, x):
        # Note: ReLU is applied twice, once after each of the first two linear layers
        return self.layer3(self.relu(self.layer2(self.relu(self.layer1(x)))))
    
model_3 = CircleClassificationV3().to(device)
model_3

CircleClassificationV3(
  (layer1): Linear(in_features=2, out_features=10, bias=True)
  (layer2): Linear(in_features=10, out_features=10, bias=True)
  (layer3): Linear(in_features=10, out_features=1, bias=True)
  (relu): ReLU()
)

python
# Loss function
loss_fn = nn.BCEWithLogitsLoss()

# Optimizer (torch.optim is used directly, since `optim` was never imported on its own)
optimizer = torch.optim.SGD(params=model_3.parameters(),
                            lr=0.1)

python
print((model_3(X_train).squeeze()).shape)
print(y_train.shape)

torch.Size([800])

torch.Size([800])

python
# Set the random seeds
torch.manual_seed(42)
torch.cuda.manual_seed(42)

# Number of training epochs
epochs = 1000

for epoch in range(epochs):
    # Training phase
    model_3.train()
    y_logits = model_3(X_train).squeeze()          # raw logits
    y_pred = torch.round(torch.sigmoid(y_logits))  # logits -> probabilities -> 0/1 labels
    
    loss = loss_fn(y_logits, 
                   y_train)
    acc = accuracy_fn(y_true=y_train,
                      y_pred=y_pred)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
    # Testing phase
    model_3.eval()
    with torch.inference_mode():
        test_logits = model_3(X_test).squeeze()
        test_pred = torch.round(torch.sigmoid(test_logits))
        test_loss = loss_fn(test_logits,
                            y_test)
        test_acc = accuracy_fn(y_true=y_test,
                               y_pred=test_pred)
        
    # Print progress every 100 epochs
    if epoch % 100 == 0:
        print(f"Epoch:{epoch} | Train Loss:{loss:.4f} | Train Accuracy:{acc:.2f}% | Test Loss:{test_loss:.4f} | Test Accuracy:{test_acc:.2f}%")

Epoch:0 | Train Loss:0.6929 | Train Accuracy:50.00% | Test Loss:0.6932 | Test Accuracy:50.00%
Epoch:100 | Train Loss:0.6912 | Train Accuracy:52.88% | Test Loss:0.6910 | Test Accuracy:52.50%
Epoch:200 | Train Loss:0.6898 | Train Accuracy:53.37% | Test Loss:0.6894 | Test Accuracy:55.00%
Epoch:300 | Train Loss:0.6879 | Train Accuracy:53.00% | Test Loss:0.6872 | Test Accuracy:56.00%
Epoch:400 | Train Loss:0.6852 | Train Accuracy:52.75% | Test Loss:0.6841 | Test Accuracy:56.50%
Epoch:500 | Train Loss:0.6810 | Train Accuracy:52.75% | Test Loss:0.6794 | Test Accuracy:56.50%
Epoch:600 | Train Loss:0.6751 | Train Accuracy:54.50% | Test Loss:0.6729 | Test Accuracy:56.00%
Epoch:700 | Train Loss:0.6666 | Train Accuracy:58.38% | Test Loss:0.6632 | Test Accuracy:59.00%
Epoch:800 | Train Loss:0.6516 | Train Accuracy:64.00% | Test Loss:0.6476 | Test Accuracy:67.50%
Epoch:900 | Train Loss:0.6236 | Train Accuracy:74.00% | Test Loss:0.6215 | Test Accuracy:79.00%

Let's plot it and take a quick look.

python
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.title("Train")
plot_decision_boundary(model_3, X_train, y_train)
plt.subplot(1, 2, 2)
plt.title("Test")
plot_decision_boundary(model_3, X_test, y_test)

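Wow, this plot looks really good: the model has managed to draw a circle-shaped boundary! What does that prove? It proves that the activation function really does make a difference. We can keep using the ideas listed in section 1 to tune the model's performance, and next time we'll put this whole workflow together from start to finish!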

5 Replicating non-linear activation functions

Earlier we saw how adding activation functions to the model lets it fit non-linear data.

python
# Create a simple tensor
A = torch.arange(-10, 10, 1, dtype=torch.float)
A

tensor([-10.,  -9.,  -8.,  -7.,  -6.,  -5.,  -4.,  -3.,  -2.,  -1.,   0.,   1.,
          2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.])

python
plt.plot(A)

Next, let's see how ReLU affects it.

5.1 ReLU

python
def relu(x):
    return torch.maximum(torch.tensor(0),x)

relu(A)

tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 2., 3., 4., 5., 6., 7.,
        8., 9.])

All the negative values have become 0.
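
As a sanity check, the hand-written function can be compared with PyTorch's built-in `torch.relu`. This is only a small sketch that repeats the definitions of `relu` and `A` so it runs on its own.

```python
import torch


def relu(x):
    return torch.maximum(torch.tensor(0), x)


A = torch.arange(-10, 10, 1, dtype=torch.float)

# The hand-written version should match PyTorch's built-in element for element.
same_as_builtin = torch.equal(relu(A), torch.relu(A))
print(same_as_builtin)
```
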

python
plt.plot(relu(A))

5.2 sigmoid

python
def sigmoid(x):
    return 1 / (1 + torch.exp(-x))

sigmoid(A)

tensor([4.5398e-05, 1.2339e-04, 3.3535e-04, 9.1105e-04, 2.4726e-03, 6.6929e-03,
        1.7986e-02, 4.7426e-02, 1.1920e-01, 2.6894e-01, 5.0000e-01, 7.3106e-01,
        8.8080e-01, 9.5257e-01, 9.8201e-01, 9.9331e-01, 9.9753e-01, 9.9909e-01,
        9.9966e-01, 9.9988e-01])

python
plt.plot(sigmoid(A))
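
The same kind of check works for sigmoid, and it also ties back to the training loops, where `torch.round(torch.sigmoid(...))` turns logits into 0/1 labels. A small sketch that repeats the definitions so it runs on its own:

```python
import torch


def sigmoid(x):
    return 1 / (1 + torch.exp(-x))


A = torch.arange(-10, 10, 1, dtype=torch.float)

# The hand-written sigmoid should agree with the built-in up to float error.
matches_builtin = torch.allclose(sigmoid(A), torch.sigmoid(A))

# Rounding the probabilities gives the 0/1 labels used in the training loops.
labels = torch.round(sigmoid(A))
print(matches_builtin)
print(labels)
```

Strongly negative inputs end up as class 0 and strongly positive inputs as class 1, which is exactly how the logits were converted to predictions earlier.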

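OK, today's learning task went really smoothly!
BB, did you eat well today? The bullfrog at dinner was delicious, and the instant noodles were super nice too! I'm stuffed.

BB, if this post was helpful to you, remember to give me a like! Sending hearts.

Thanks, thank you~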
