[PyTorch] Implementing a Convolutional Neural Network: Handwritten Digit Recognition with a CNN

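Introduction

PyTorch is an open-source machine learning library developed by Facebook's AI Research lab. It provides two main capabilities: tensor computation (similar to NumPy, but with GPU acceleration) and deep learning tools built on dynamic computation graphs. PyTorch is popular for its flexibility, ease of use, and strong community support, and it is particularly well suited to research and prototyping.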

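Core features of PyTorch

Tensors (Tensor): The basic data structure in PyTorch is torch.Tensor, a multi-dimensional array that can live on either the CPU or the GPU. Tensors support automatic differentiation, which makes complex neural networks much simpler to implement.
Automatic differentiation (Autograd): The torch.autograd module computes gradients automatically, which is essential for training neural networks. A tensor created with requires_grad=True records its computation history, and gradients are obtained by backpropagating through that history on demand.
Neural network modules (nn.Module): torch.nn provides the layers and loss functions needed to build neural networks. You define your own model by subclassing nn.Module and combining built-in or custom layers and activation functions.
Optimizers: torch.optim contains many common optimization algorithms, such as SGD and Adam, which make it easy to set up the parameter update rule used during training.
Data loading: torch.utils.data provides convenient tools for loading and preprocessing data, including the Dataset and DataLoader classes, which manage batching efficiently and support loading with multiple worker processes.
Distributed training: PyTorch supports single-machine multi-GPU and multi-machine multi-GPU training through torch.distributed and torch.nn.parallel.DistributedDataParallel, which implements efficient data-parallel training.
Transfer learning: Many pretrained models are available (for example through torchvision) and can be used directly on a new task or fine-tuned as a base model.
Visualization: Combined with TensorBoard or other visualization tools, PyTorch helps developers understand a model's behavior and performance.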
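
As a small illustration of the tensor and autograd items above, the sketch below builds a tensor that tracks gradients, computes a scalar from it, and calls backward(). The input values are made up for the example.

```python
import torch

# Made-up input; requires_grad=True tells autograd to record operations on x.
x = torch.tensor([2.0, 3.0], requires_grad=True)

# y = x0^2 + x1^2, a scalar built from x.
y = (x ** 2).sum()

# Backpropagate: fills x.grad with dy/dx = 2 * x.
y.backward()

print(x.grad)  # tensor([4., 6.])
```

The same mechanism computes the gradients of the loss with respect to every model parameter during training.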

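The PyTorch ecosystem

Beyond the core library, PyTorch has a rich ecosystem that covers fields ranging from computer vision to natural language processing. For example:

torchvision: provides image datasets, model architectures, and common transforms.
torchaudio: focuses on audio processing and includes datasets and pretrained models.
torchtext: provides tools and support for processing text data.
transformers: a library from Hugging Face that contains a large number of pretrained language models and their applications.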

Implementing a convolutional neural network

Imports

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt

torch.__version__

Output:

'1.8.0+cu111'

Check whether a GPU is available

# Check whether a GPU is available
torch.cuda.is_available()

Output:

True

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

device

Output:

device(type='cuda', index=0)

Training on the GPU in PyTorch takes two steps:

Move the model to the GPU.
Move each batch of training data to the GPU.
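
The two steps can be sketched as follows. The small linear layer and random batch are hypothetical stand-ins rather than the CNN built later, and the code falls back to the CPU when no GPU is present.

```python
import torch
import torch.nn as nn

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Stand-in model and batch (hypothetical, for illustration only).
toy_model = nn.Linear(4, 2).to(device)     # step 1: move the model
toy_batch = torch.randn(8, 4).to(device)   # step 2: move each batch

out = toy_model(toy_batch)
print(out.shape, out.device.type)
```

For a tensor, .to() returns a new tensor, so the result has to be assigned. For a module, .to() moves the parameters in place and returns the module itself.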

torchvision ships with commonly used datasets and models

import torchvision
from torchvision import datasets, transforms

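transforms.ToTensor

Converts the data (a PIL image or NumPy array) to a tensor.
Scales uint8 pixel values from [0, 255] to floats in [0, 1].
Moves the channel axis to the front, so an H x W x C image becomes C x H x W.

The transforms module is used for data augmentation, preprocessing, and similar tasks.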

transformation = transforms.Compose([transforms.ToTensor(),])

Load the data

# Training dataset (downloaded into the current directory if it is missing)
train_ds = datasets.MNIST('./', train=True, transform=transformation, download=True)
# Test dataset
test_ds = datasets.MNIST('./', train=False, transform=transformation, download=True)

Wrap the datasets in DataLoaders

train_dl = torch.utils.data.DataLoader(train_ds, batch_size=64, shuffle=True)
test_dl = torch.utils.data.DataLoader(test_ds, batch_size=256)
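
To show how a DataLoader splits a dataset into batches, the sketch below uses a hypothetical 10-sample toy dataset rather than MNIST, so it runs without a download.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy dataset: 10 blank "images" shaped like MNIST, labels 0..9.
toy_ds = TensorDataset(torch.zeros(10, 1, 28, 28), torch.arange(10))

toy_dl = DataLoader(toy_ds, batch_size=4, shuffle=True)

# len(loader) is the number of batches, i.e. ceil(10 / 4).
print(len(toy_dl))

# The last batch is smaller because drop_last defaults to False.
batch_sizes = [labels.size(0) for _, labels in toy_dl]
print(batch_sizes)  # [4, 4, 2]
```

By the same arithmetic, the 60,000 MNIST training images at batch_size=64 give 938 batches per epoch, and the 10,000 test images at batch_size=256 give 40.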

images, labels = next(iter(train_dl))

In PyTorch, a batch of images has the shape [batch, channel, height, width]

images.shape

Output:

torch.Size([64, 1, 28, 28])

labels

Output:

tensor([8, 3, 1, 2, 1, 1, 6, 5, 0, 4, 9, 4, 9, 6, 0, 4, 4, 6, 1, 6, 4, 8, 3, 9,
        3, 7, 8, 8, 3, 6, 5, 7, 9, 4, 5, 1, 1, 4, 0, 1, 3, 0, 2, 6, 4, 8, 2, 8,
        3, 8, 7, 1, 4, 8, 4, 9, 3, 0, 4, 5, 6, 1, 5, 0])

img = images[0]
img.shape

Output:

torch.Size([1, 28, 28])

img = img.numpy()
img = np.squeeze(img)  # drop the channel axis of size 1
img.shape

Output:

(28, 28)

plt.imshow(img, cmap='gray')

Output:

<matplotlib.image.AxesImage at 0x2d50e4ae7f0>

Create the model

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        # Shapes below assume a batch of 64 MNIST images.
        self.conv1 = nn.Conv2d(1, 32, 3)    # in: 64, 1, 28, 28 -> out: 64, 32, 26, 26
        self.pool = nn.MaxPool2d((2, 2))    # out: 64, 32, 13, 13
        self.conv2 = nn.Conv2d(32, 64, 3)   # in: 64, 32, 13, 13 -> out: 64, 64, 11, 11
        # The same pooling layer is applied again: in: 64, 64, 11, 11 -> out: 64, 64, 5, 5
        self.linear_1 = nn.Linear(64 * 5 * 5, 256)
        self.linear_2 = nn.Linear(256, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.pool(x)
        x = F.relu(self.conv2(x))
        x = self.pool(x)
        # Flatten each sample to a vector of 64 * 5 * 5 = 1600 features.
        x = x.view(-1, 64 * 5 * 5)
        x = F.relu(self.linear_1(x))
        # Return raw class scores (logits); no softmax is applied here.
        x = self.linear_2(x)
        return x
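
The shape comments in the model follow from the standard output-size formula for convolution and pooling windows, floor((size + 2 * padding - kernel) / stride) + 1. The helper conv_out below is a hypothetical name introduced only to check that arithmetic.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28
size = conv_out(size, kernel=3)             # conv1: 28 -> 26
size = conv_out(size, kernel=2, stride=2)   # pool:  26 -> 13
size = conv_out(size, kernel=3)             # conv2: 13 -> 11
size = conv_out(size, kernel=2, stride=2)   # pool:  11 -> 5 (rounded down)

flat_features = 64 * size * size
print(size, flat_features)  # 5 1600
```

The result, 1600, is the in_features value of the first linear layer.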

model = Model()

Move the model to the GPU

# Move the model's parameters to the selected device (the GPU when available)
model.to(device)

Output:

Model(
  (conv1): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
  (linear_1): Linear(in_features=1600, out_features=256, bias=True)
  (linear_2): Linear(in_features=256, out_features=10, bias=True)
)

Loss function

loss_fn = torch.nn.CrossEntropyLoss()
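
The model returns raw scores (logits) and applies no softmax, which is what nn.CrossEntropyLoss expects: it combines log-softmax and negative log-likelihood internally. The sketch below checks that equivalence on made-up logits and labels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # made-up raw scores: 4 samples, 10 classes
targets = torch.tensor([3, 0, 9, 1])   # made-up integer class labels

loss = nn.CrossEntropyLoss()(logits, targets)

# Same value computed in two steps: log-softmax, then negative log-likelihood.
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(loss.item(), manual.item())
```

Adding a softmax layer to the model before this loss would therefore apply softmax twice.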

Optimizer

optimizer = optim.Adam(model.parameters(), lr=0.001)
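
To see what one optimizer update does, here is a minimal sketch with a single hypothetical parameter w and the loss w^2, in place of the CNN's parameters. It uses the same zero_grad / backward / step sequence as the training loop.

```python
import torch
import torch.optim as optim

# Hypothetical single parameter with loss = w^2, so the gradient is 2 * w.
w = torch.tensor([1.0], requires_grad=True)
opt = optim.Adam([w], lr=0.001)

loss = (w ** 2).sum()
opt.zero_grad()   # clear gradients left over from earlier steps
loss.backward()   # compute d(loss)/dw
opt.step()        # apply the Adam update

print(w.item())   # about 0.999: Adam's first step has size close to lr
```

Gradients accumulate across backward() calls, which is why zero_grad() is called before each one.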

Training

def fit(epoch, model, train_loader, test_loader):
    # Uses the global loss_fn, optimizer and device defined above.
    correct = 0
    total = 0
    running_loss = 0.0

    for x, y in train_loader:
        # Move the batch to the same device as the model.
        x, y = x.to(device), y.to(device)
        y_pred = model(x)
        loss = loss_fn(y_pred, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        with torch.no_grad():
            y_pred = torch.argmax(y_pred, dim=1)
            correct += (y_pred == y).sum().item()
            total += y.size(0)
            # loss is the batch mean, so weight it by the batch size.
            running_loss += loss.item() * y.size(0)

    # Mean loss per sample over the epoch.
    epoch_loss = running_loss / len(train_loader.dataset)
    epoch_acc = correct / total

    # Evaluation on the test set (no gradients needed).
    test_correct = 0
    test_total = 0
    test_running_loss = 0.0
    with torch.no_grad():
        for x, y in test_loader:
            x, y = x.to(device), y.to(device)
            y_pred = model(x)
            loss = loss_fn(y_pred, y)
            y_pred = torch.argmax(y_pred, dim=1)
            test_correct += (y_pred == y).sum().item()
            test_total += y.size(0)
            test_running_loss += loss.item() * y.size(0)
    test_epoch_loss = test_running_loss / len(test_loader.dataset)
    test_epoch_acc = test_correct / test_total

    print('epoch: ', epoch,
          'loss: ', round(epoch_loss, 3),
          'accuracy: ', round(epoch_acc, 3),
          'test_loss: ', round(test_epoch_loss, 3),
          'test_accuracy: ', round(test_epoch_acc, 3))
    return epoch_loss, epoch_acc, test_epoch_loss, test_epoch_acc
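
With its default reduction='mean', nn.CrossEntropyLoss returns the loss averaged over the batch, so loss.item() is a per-batch mean. A per-sample average for the whole epoch therefore has to weight each batch mean by its batch size. The numbers below are made up purely to show the arithmetic.

```python
# Made-up per-batch mean losses and batch sizes (the last batch is smaller).
batch_means = [0.50, 0.30, 0.10]
batch_sizes = [64, 64, 32]
n_samples = sum(batch_sizes)  # 160

# Per-sample mean: weight every batch mean by its batch size.
per_sample = sum(m * n for m, n in zip(batch_means, batch_sizes)) / n_samples

# Summing the batch means and dividing by the sample count instead
# shrinks the value by roughly a factor of the batch size.
unweighted = sum(batch_means) / n_samples

print(round(per_sample, 4), round(unweighted, 6))
```

Dividing the sum of batch means by the number of batches, len(train_loader), is a common alternative that is nearly identical when only the last batch is short.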

epochs = 20
train_loss = []
train_acc = []
test_loss = []
test_acc = []
for epoch in range(epochs):
    epoch_loss, epoch_acc, test_epoch_loss, test_epoch_acc = fit(epoch, model, train_dl, test_dl)
    train_loss.append(epoch_loss)
    train_acc.append(epoch_acc)
    
    test_loss.append(test_epoch_loss)
    test_acc.append(test_epoch_acc)

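Output (from a run that summed each batch's mean loss and divided by the dataset size, and rounded test accuracy to an integer, which is why the losses appear close to 0 and test_accuracy prints as 1):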

epoch:  0 loss:  0.003 accuracy:  0.949 test_loss:  0.0 test_accuracy:  1
epoch:  1 loss:  0.001 accuracy:  0.984 test_loss:  0.0 test_accuracy:  1
epoch:  2 loss:  0.001 accuracy:  0.989 test_loss:  0.0 test_accuracy:  1
epoch:  3 loss:  0.0 accuracy:  0.993 test_loss:  0.0 test_accuracy:  1
epoch:  4 loss:  0.0 accuracy:  0.994 test_loss:  0.0 test_accuracy:  1
epoch:  5 loss:  0.0 accuracy:  0.995 test_loss:  0.0 test_accuracy:  1
epoch:  6 loss:  0.0 accuracy:  0.996 test_loss:  0.0 test_accuracy:  1
epoch:  7 loss:  0.0 accuracy:  0.996 test_loss:  0.0 test_accuracy:  1
epoch:  8 loss:  0.0 accuracy:  0.997 test_loss:  0.0 test_accuracy:  1
epoch:  9 loss:  0.0 accuracy:  0.997 test_loss:  0.0 test_accuracy:  1
epoch:  10 loss:  0.0 accuracy:  0.998 test_loss:  0.0 test_accuracy:  1
epoch:  11 loss:  0.0 accuracy:  0.998 test_loss:  0.0 test_accuracy:  1
epoch:  12 loss:  0.0 accuracy:  0.998 test_loss:  0.0 test_accuracy:  1
epoch:  13 loss:  0.0 accuracy:  0.999 test_loss:  0.0 test_accuracy:  1
epoch:  14 loss:  0.0 accuracy:  0.999 test_loss:  0.0 test_accuracy:  1
epoch:  15 loss:  0.0 accuracy:  0.999 test_loss:  0.0 test_accuracy:  1
epoch:  16 loss:  0.0 accuracy:  0.999 test_loss:  0.0 test_accuracy:  1
epoch:  17 loss:  0.0 accuracy:  0.999 test_loss:  0.0 test_accuracy:  1
epoch:  18 loss:  0.0 accuracy:  0.999 test_loss:  0.0 test_accuracy:  1
epoch:  19 loss:  0.0 accuracy:  0.999 test_loss:  0.0 test_accuracy:  1