昇思MindSpore学习笔记6-03计算机视觉--ResNet50图像分类

摘要:

记录MindSpore AI框架使用ResNet50神经网络模型,选择Bottleneck残差网络结构对CIFAR-10数据集进行分类的过程、步骤和方法。包括环境准备、下载数据集、数据集加载和预处理、构建模型、模型训练、模型测试等。

一、

1. 图像分类

最基础的计算机视觉应用

有监督学习类别

给定一张图像(猫、狗、飞机、汽车等等)

判断图像所属的类别

使用ResNet50网络

对CIFAR-10数据集进行分类

2.ResNet网络

ResNet50网络

2015年微软实验室提出

ILSVRC2015图像分类竞赛第一名

传统卷积神经网络

一系列卷积层和池化层堆叠

堆叠到一定深度时会出现退化问题

56层网络与20层网络训练误差和测试误差图

CIFAR-10数据集

56层网络比20层网络训练误差和测试误差更大

随着网络加深,误差并没有减小

3. 残差网络结构

Residual Network

减轻退化问题

实现搭建较深的网络结构(突破1000层)

ResNet网络在CIFAR-10数据集上的训练误差与测试误差图

虚线 训练误差

实线 测试误差

网络层数越深,训练误差和测试误差越小

二、环境准备

复制代码
%%capture captured_output
# 实验环境已经预装了mindspore==2.2.14,如需更换mindspore版本,可更改下面mindspore的版本号
!pip uninstall mindspore -y
!pip install -i https://pypi.mirrors.ustc.edu.cn/simple mindspore==2.2.14
# 查看当前 mindspore 版本
!pip show mindspore

输出:

复制代码
Name: mindspore
Version: 2.2.14
Summary: MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
Home-page: https://www.mindspore.cn
Author: The MindSpore Authors
Author-email: [email protected]
License: Apache 2.0
Location: /home/nginx/miniconda/envs/jupyter/lib/python3.9/site-packages
Requires: asttokens, astunparse, numpy, packaging, pillow, protobuf, psutil, scipy
Required-by: 

三、 数据集准备与加载

1.数据集

CIFAR-10数据集

60000张32*32的彩色图像

50000张训练图片

10000张评估图片

10个类别

每类有6000张图

2.下载数据集

download接口

下载

解压

仅支持解析二进制版本的CIFAR-10文件(CIFAR-10 binary version)

复制代码
from download import download
​
url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/cifar-10-binary.tar.gz"
​
download(url, "./datasets-cifar10-bin", kind="tar.gz", replace=True)

输出:

复制代码
Downloading data from https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/cifar-10-binary.tar.gz (162.2 MB)

file_sizes: 100%|█████████████████████████████| 170M/170M [00:01<00:00, 113MB/s]
Extracting tar.gz file...
Successfully downloaded / unzipped to ./datasets-cifar10-bin
'./datasets-cifar10-bin'

数据集目录结构:

复制代码
datasets-cifar10-bin/cifar-10-batches-bin
├── batches.meta.text
├── data_batch_1.bin
├── data_batch_2.bin
├── data_batch_3.bin
├── data_batch_4.bin
├── data_batch_5.bin
├── readme.html
└── test_batch.bin

3.加载 数据

mindspore.dataset.Cifar10Dataset接口

加载数据集

图像增强操作

复制代码
import mindspore as ms
import mindspore.dataset as ds
import mindspore.dataset.vision as vision
import mindspore.dataset.transforms as transforms
from mindspore import dtype as mstype
​
data_dir = "./datasets-cifar10-bin/cifar-10-batches-bin"  # 数据集根目录
batch_size = 256  # 批量大小
image_size = 32  # 训练图像空间大小
workers = 4  # 并行线程个数
num_classes = 10  # 分类数量
​
def create_dataset_cifar10(dataset_dir, usage, resize, batch_size, workers):
​
    data_set = ds.Cifar10Dataset(dataset_dir=dataset_dir,
                                 usage=usage,
                                 num_parallel_workers=workers,
                                 shuffle=True)
​
    trans = []
    if usage == "train":
        trans += [
            vision.RandomCrop((32, 32), (4, 4, 4, 4)),
            vision.RandomHorizontalFlip(prob=0.5)
        ]
​
    trans += [
        vision.Resize(resize),
        vision.Rescale(1.0 / 255.0, 0.0),
        vision.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010]),
        vision.HWC2CHW()
    ]
​
    target_trans = transforms.TypeCast(mstype.int32)
​
    # 数据映射操作
    data_set = data_set.map(operations=trans,
                            input_columns='image',
                            num_parallel_workers=workers)
​
    data_set = data_set.map(operations=target_trans,
                            input_columns='label',
                            num_parallel_workers=workers)
​
    # 批量操作
    data_set = data_set.batch(batch_size)
​
    return data_set
​
​
# 获取处理后的训练与测试数据集
​
dataset_train = create_dataset_cifar10(dataset_dir=data_dir,
                                       usage="train",
                                       resize=image_size,
                                       batch_size=batch_size,
                                       workers=workers)
step_size_train = dataset_train.get_dataset_size()
​
dataset_val = create_dataset_cifar10(dataset_dir=data_dir,
                                     usage="test",
                                     resize=image_size,
                                     batch_size=batch_size,
                                     workers=workers)
step_size_val = dataset_val.get_dataset_size()

4.显示 CIFAR-10训练数据集

复制代码
import matplotlib.pyplot as plt
import numpy as np
​
data_iter = next(dataset_train.create_dict_iterator())
​
images = data_iter["image"].asnumpy()
labels = data_iter["label"].asnumpy()
print(f"Image shape: {images.shape}, Label shape: {labels.shape}")
​
# 训练数据集中,前六张图片所对应的标签
print(f"Labels: {labels[:6]}")
​
classes = []
​
with open(data_dir + "/batches.meta.txt", "r") as f:
    for line in f:
        line = line.rstrip()
        if line:
            classes.append(line)
​
# 训练数据集的前六张图片
plt.figure()
for i in range(6):
    plt.subplot(2, 3, i + 1)
    image_trans = np.transpose(images[i], (1, 2, 0))
    mean = np.array([0.4914, 0.4822, 0.4465])
    std = np.array([0.2023, 0.1994, 0.2010])
    image_trans = std * image_trans + mean
    image_trans = np.clip(image_trans, 0, 1)
    plt.title(f"{classes[labels[i]]}")
    plt.imshow(image_trans)
    plt.axis("off")
plt.show()

输出:

复制代码
Image shape: (256, 3, 32, 32), Label shape: (256,)
Labels: [3 3 6 4 7 4]

四、 构建网络

残差网络结构(Residual Network)

有效减轻ResNet退化问题

实现更深的网络结构设计

提高网络的训练精度

堆叠残差网络构建ResNet50网络

1. 构建残差网络结构

残差网络结构图

残差网络由两个分支构成

主分支

堆叠系列卷积操作得到

输出的特征矩阵()

shortcuts(图中弧线表示)

从输入直接到输出

主分支F(x)加上shortcuts输出的特征矩阵x得到F(x)+x

Relu激活函数

输出

残差网络结构主要由两种

Building Block

用于较浅的ResNet网络,如ResNet18和ResNet34

Bottleneck

用于层数较深的ResNet网络,如ResNet50、ResNet101和ResNet152

Building Block

Building Block结构图

主分支有两层卷积网络结构:

第一层网络

输入channel为64

3×33×3卷积层

Batch Normalization层

Relu激活函数层

输出channel为64

第二层网络

输入channel为64

3×33×3的卷积层

Batch Normalization层

输出channel为64

融合

主分支输出的特征矩阵

shortcuts输出的特征矩阵

保证shape相同

输出Relu激活函数

主分支与shortcuts输出的特征矩阵相加

如果shape不相同

如输出channel是输入channel的一倍时,

shortcuts需要使用数量与输出channel相等

大小为1×11×1的卷积核进行卷积操作

若输出的图像较输入图像缩小一倍

设置shortcuts中卷积操作中的stride为2

主分支第一层卷积操作的stride也需设置为2

定义ResidualBlockBase类实现Building Block结构代码:

复制代码
from typing import Type, Union, List, Optional
import mindspore.nn as nn
from mindspore.common.initializer import Normal
​
# 初始化卷积层与BatchNorm的参数
weight_init = Normal(mean=0, sigma=0.02)
gamma_init = Normal(mean=1, sigma=0.02)
​
class ResidualBlockBase(nn.Cell):
    expansion: int = 1  # 最后一个卷积核数量与第一个卷积核数量相等
​
    def __init__(self, in_channel: int, out_channel: int,
                 stride: int = 1, norm: Optional[nn.Cell] = None,
                 down_sample: Optional[nn.Cell] = None) -> None:
        super(ResidualBlockBase, self).__init__()
        if not norm:
            self.norm = nn.BatchNorm2d(out_channel)
        else:
            self.norm = norm
​
        self.conv1 = nn.Conv2d(in_channel, out_channel,
                               kernel_size=3, stride=stride,
                               weight_init=weight_init)
        self.conv2 = nn.Conv2d(in_channel, out_channel,
                               kernel_size=3, weight_init=weight_init)
        self.relu = nn.ReLU()
        self.down_sample = down_sample
​
    def construct(self, x):
        """ResidualBlockBase construct."""
        identity = x  # shortcuts分支
​
        out = self.conv1(x)  # 主分支第一层:3*3卷积层
        out = self.norm(out)
        out = self.relu(out)
        out = self.conv2(out)  # 主分支第二层:3*3卷积层
        out = self.norm(out)
​
        if self.down_sample is not None:
            identity = self.down_sample(x)
        out += identity  # 输出为主分支与shortcuts之和
        out = self.relu(out)
​
        return out

Bottleneck

Bottleneck结构图

Bottleneck结构的参数数量更少

更适合层数较深的网络

主分支有三层卷积结构

1×11×1的卷积层

输入channel为256

通过数量为64

降维

Batch Normalization层

Relu激活函数层

输出channel为64

3×33×3卷积层

通过数量为64

Batch Normalization层

Relu激活函数层

输出channel为64

1×11×1的卷积层

升维

通过数量为256

Batch Normalization层

输出channel为256

融合

主分支输出的特征矩阵

shortcuts输出的特征矩阵

保证特征矩阵shape相同

输出Relu激活函数

主分支与shortcuts输出的特征矩阵相加

如果shape不相同

如输出channel是输入channel的一倍时,

shortcuts上需要使用数量与输出channel相等

大小为1×11×1的卷积核进行卷积操作

如输出的图像较输入图像缩小一倍

设置shortcuts中卷积操作中的stride为2

主分支第二层卷积操作的stride也需设置为2

定义ResidualBlock类实现Bottleneck结构代码:

复制代码
class ResidualBlock(nn.Cell):
    expansion = 4  # 最后一个卷积核的数量是第一个卷积核数量的4倍
​
    def __init__(self, in_channel: int, out_channel: int,
                 stride: int = 1, down_sample: Optional[nn.Cell] = None) -> None:
        super(ResidualBlock, self).__init__()
​
        self.conv1 = nn.Conv2d(in_channel, out_channel,
                               kernel_size=1, weight_init=weight_init)
        self.norm1 = nn.BatchNorm2d(out_channel)
        self.conv2 = nn.Conv2d(out_channel, out_channel,
                               kernel_size=3, stride=stride,
                               weight_init=weight_init)
        self.norm2 = nn.BatchNorm2d(out_channel)
        self.conv3 = nn.Conv2d(out_channel, out_channel * self.expansion,
                               kernel_size=1, weight_init=weight_init)
        self.norm3 = nn.BatchNorm2d(out_channel * self.expansion)
​
        self.relu = nn.ReLU()
        self.down_sample = down_sample
​
    def construct(self, x):
​
        identity = x  # shortscuts分支
​
        out = self.conv1(x)  # 主分支第一层:1*1卷积层
        out = self.norm1(out)
        out = self.relu(out)
        out = self.conv2(out)  # 主分支第二层:3*3卷积层
        out = self.norm2(out)
        out = self.relu(out)
        out = self.conv3(out)  # 主分支第三层:1*1卷积层
        out = self.norm3(out)
​
        if self.down_sample is not None:
            identity = self.down_sample(x)
​
        out += identity  # 输出为主分支与shortcuts之和
        out = self.relu(out)
​
        return out

2. 构建ResNet50网络

ResNet网络层结构

以输入彩色图像224×224为例

通过数量64

卷积核大小为7×7

stride为2的卷积层conv1

该层输出图片大小为112×112

输出channel为64;

通过3×3最大下采样池化层

输出图片大小为56×56

输出channel为64;

再堆叠4个残差网络块

conv2_x

conv3_x

conv4_x

conv5_x

输出图片大小为7×7

输出channel为2048

最后通过

平均池化层

全连接层

Softmax

得到分类概率。

以ResNet50网络中的conv2_x为例

每个残差网络块

由3个Bottleneck结构堆叠而成

每个Bottleneck

输入channel为64

输出channel为256

示例

定义make_layer实现残差块的构建,

参数如下:

last_out_channel:上一个残差网络输出的通道数

block: 残差网络的类别

ResidualBlockBase

ResidualBlock

channel: 残差网络输入的通道数

block_nums: 残差网络块堆叠的个数

stride: 卷积移动的步幅

复制代码
def make_layer(last_out_channel, block: Type[Union[ResidualBlockBase, ResidualBlock]],
               channel: int, block_nums: int, stride: int = 1):
    down_sample = None  # shortcuts分支
​
    if stride != 1 or last_out_channel != channel * block.expansion:
​
        down_sample = nn.SequentialCell([
            nn.Conv2d(last_out_channel, channel * block.expansion,
                      kernel_size=1, stride=stride, weight_init=weight_init),
            nn.BatchNorm2d(channel * block.expansion, gamma_init=gamma_init)
        ])
​
    layers = []
    layers.append(block(last_out_channel, channel, stride=stride, down_sample=down_sample))
​
    in_channel = channel * block.expansion
    # 堆叠残差网络
    for _ in range(1, block_nums):
​
        layers.append(block(in_channel, channel))
​
    return nn.SequentialCell(layers)

ResNet50网络

5个卷积结构

1个平均池化层

1个全连接层

以CIFAR-10数据集为例:

conv1

输入图片大小为32×32

输入channel为3

卷积层

卷积核数量为64

卷积核大小为7×7

stride为2

Batch Normalization层

Reul激活函数

输出feature map大小为16×16

输出channel为64

conv2_x

输入feature map大小为16×16

输入channel为64

最大下采样池化操作

卷积核大小为3×3

stride为2

堆叠3个Bottleneck

1×1,64; 3×3,64; 1×1,256\]结构 输出feature map大小为8×8 输出channel为256 conv3_x 输入feature map大小为8×8 输入channel为256 堆叠4个Bottleneck \[1×1,128; 3×3,128; 1×1,512\]结构 输出feature map大小为4×4 输出channel为512 conv4_x 输入feature map大小为4×4 输入channel为512 堆叠6个Bottleneck \[1×1,256; 3×3,256; 1×1,1024\]结构 输出feature map大小为2×2 输出channel为1024 conv5_x 输入feature map大小为2×2 输入channel为1024 堆叠3个Bottleneck \[1×1,512; 3×3,512; 1×1,2048\]结构。 输出feature map大小为1×1 输出channel为2048 average pool \& fc 输入channel为2048 输出channel为分类的类别数

ResNet50模型构建代码

函数resnet50参数:

num_classes: 分类的类别数

默认类别数为1000。

Pretrained : 下载对应的训练模型

加载预训练模型中的参数

复制代码
from mindspore import load_checkpoint, load_param_into_net
​
class ResNet(nn.Cell):
    def __init__(self, block: Type[Union[ResidualBlockBase, ResidualBlock]],
                 layer_nums: List[int], num_classes: int, input_channel: int) -> None:
        super(ResNet, self).__init__()
​
        self.relu = nn.ReLU()
        # 第一个卷积层,输入channel为3(彩色图像),输出channel为64
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, weight_init=weight_init)
        self.norm = nn.BatchNorm2d(64)
        # 最大池化层,缩小图片的尺寸
        self.max_pool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode='same')
        # 各个残差网络结构块定义
        self.layer1 = make_layer(64, block, 64, layer_nums[0])
        self.layer2 = make_layer(64 * block.expansion, block, 128, layer_nums[1], stride=2)
        self.layer3 = make_layer(128 * block.expansion, block, 256, layer_nums[2], stride=2)
        self.layer4 = make_layer(256 * block.expansion, block, 512, layer_nums[3], stride=2)
        # 平均池化层
        self.avg_pool = nn.AvgPool2d()
        # flattern层
        self.flatten = nn.Flatten()
        # 全连接层
        self.fc = nn.Dense(in_channels=input_channel, out_channels=num_classes)
​
    def construct(self, x):
​
        x = self.conv1(x)
        x = self.norm(x)
        x = self.relu(x)
        x = self.max_pool(x)
​
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
​
        x = self.avg_pool(x)
        x = self.flatten(x)
        x = self.fc(x)
​
        return x

def _resnet(model_url: str, block: Type[Union[ResidualBlockBase, ResidualBlock]],
            layers: List[int], num_classes: int, pretrained: bool, pretrained_ckpt: str,
            input_channel: int):
    model = ResNet(block, layers, num_classes, input_channel)
​
    if pretrained:
        # 加载预训练模型
        download(url=model_url, path=pretrained_ckpt, replace=True)
        param_dict = load_checkpoint(pretrained_ckpt)
        load_param_into_net(model, param_dict)
​
    return model
​
​
def resnet50(num_classes: int = 1000, pretrained: bool = False):
    """ResNet50模型"""
    resnet50_url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/models/application/resnet50_224_new.ckpt"
    resnet50_ckpt = "./LoadPretrainedModel/resnet50_224_new.ckpt"
    return _resnet(resnet50_url, ResidualBlock, [3, 4, 6, 3], num_classes,
                   pretrained, resnet50_ckpt, 2048)

五、 模型训练与评估

ResNet50预训练模型微调:

调用resnet50构造ResNet50模型

设置pretrained参数为True

自动下载ResNet50预训练模型

加载预训练模型中的参数到网络中。

定义优化器和损失函数

逐epoch打印训练的损失值和评估精度

保存评估精度最高的ckpt文件(resnet50-best.ckpt)到当前路径的./BestCheckPoint

预训练模型

全连接层(fc)输出大小(对应参数num_classes)默认为1000

CIFAR10数据集共有10个分类

重置全连接层输出大小为10

展示5个epochs的训练过程

复制代码
# 定义ResNet50网络
network = resnet50(pretrained=True)
​
# 全连接层输入层的大小
in_channel = network.fc.in_channels
fc = nn.Dense(in_channels=in_channel, out_channels=10)
# 重置全连接层
network.fc = fc

输出:

复制代码
Downloading data from https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/models/application/resnet50_224_new.ckpt (97.7 MB)

file_sizes: 100%|█████████████████████████████| 102M/102M [00:00<00:00, 109MB/s]
Successfully downloaded file to ./LoadPretrainedModel/resnet50_224_new.ckpt

# 设置学习率
num_epochs = 5
lr = nn.cosine_decay_lr(min_lr=0.00001, max_lr=0.001, total_step=step_size_train * num_epochs,
                        step_per_epoch=step_size_train, decay_epoch=num_epochs)
# 定义优化器和损失函数
opt = nn.Momentum(params=network.trainable_params(), learning_rate=lr, momentum=0.9)
loss_fn = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
​
def forward_fn(inputs, targets):
    logits = network(inputs)
    loss = loss_fn(logits, targets)
    return loss
​
grad_fn = ms.value_and_grad(forward_fn, None, opt.parameters)
​
def train_step(inputs, targets):
    loss, grads = grad_fn(inputs, targets)
    opt(grads)
    return loss

import os
​
# 创建迭代器
data_loader_train = dataset_train.create_tuple_iterator(num_epochs=num_epochs)
data_loader_val = dataset_val.create_tuple_iterator(num_epochs=num_epochs)
​
# 最佳模型存储路径
best_acc = 0
best_ckpt_dir = "./BestCheckpoint"
best_ckpt_path = "./BestCheckpoint/resnet50-best.ckpt"
​
if not os.path.exists(best_ckpt_dir):
    os.mkdir(best_ckpt_dir)

import mindspore.ops as ops
​
def train(data_loader, epoch):
    """模型训练"""
    losses = []
    network.set_train(True)
​
    for i, (images, labels) in enumerate(data_loader):
        loss = train_step(images, labels)
        if i % 100 == 0 or i == step_size_train - 1:
            print('Epoch: [%3d/%3d], Steps: [%3d/%3d], Train Loss: [%5.3f]' %
                  (epoch + 1, num_epochs, i + 1, step_size_train, loss))
        losses.append(loss)
​
    return sum(losses) / len(losses)
​
def evaluate(data_loader):
    """模型验证"""
    network.set_train(False)
​
    correct_num = 0.0  # 预测正确个数
    total_num = 0.0  # 预测总数
​
    for images, labels in data_loader:
        logits = network(images)
        pred = logits.argmax(axis=1)  # 预测结果
        correct = ops.equal(pred, labels).reshape((-1, ))
        correct_num += correct.sum().asnumpy()
        total_num += correct.shape[0]
​
    acc = correct_num / total_num  # 准确率
​
    return acc

# 开始循环训练
print("Start Training Loop ...")
​
for epoch in range(num_epochs):
    curr_loss = train(data_loader_train, epoch)
    curr_acc = evaluate(data_loader_val)
​
    print("-" * 50)
    print("Epoch: [%3d/%3d], Average Train Loss: [%5.3f], Accuracy: [%5.3f]" % (
        epoch+1, num_epochs, curr_loss, curr_acc
    ))
    print("-" * 50)
​
    # 保存当前预测准确率最高的模型
    if curr_acc > best_acc:
        best_acc = curr_acc
        ms.save_checkpoint(network, best_ckpt_path)
​
print("=" * 80)
print(f"End of validation the best Accuracy is: {best_acc: 5.3f}, "
      f"save the best ckpt file in {best_ckpt_path}", flush=True)

输出:

复制代码
Start Training Loop ...
Epoch: [  1/  5], Steps: [  1/196], Train Loss: [2.412]
Epoch: [  1/  5], Steps: [101/196], Train Loss: [1.384]
Epoch: [  1/  5], Steps: [196/196], Train Loss: [0.991]
--------------------------------------------------
Epoch: [  1/  5], Average Train Loss: [1.590], Accuracy: [0.606]
--------------------------------------------------
Epoch: [  2/  5], Steps: [  1/196], Train Loss: [1.030]
Epoch: [  2/  5], Steps: [101/196], Train Loss: [0.892]
Epoch: [  2/  5], Steps: [196/196], Train Loss: [0.968]
--------------------------------------------------
Epoch: [  2/  5], Average Train Loss: [0.994], Accuracy: [0.692]
--------------------------------------------------
Epoch: [  3/  5], Steps: [  1/196], Train Loss: [0.774]
Epoch: [  3/  5], Steps: [101/196], Train Loss: [0.950]
Epoch: [  3/  5], Steps: [196/196], Train Loss: [0.642]
--------------------------------------------------
Epoch: [  3/  5], Average Train Loss: [0.836], Accuracy: [0.721]
--------------------------------------------------
Epoch: [  4/  5], Steps: [  1/196], Train Loss: [0.804]
Epoch: [  4/  5], Steps: [101/196], Train Loss: [0.824]
Epoch: [  4/  5], Steps: [196/196], Train Loss: [0.924]
--------------------------------------------------
Epoch: [  4/  5], Average Train Loss: [0.766], Accuracy: [0.737]
--------------------------------------------------
Epoch: [  5/  5], Steps: [  1/196], Train Loss: [0.843]
Epoch: [  5/  5], Steps: [101/196], Train Loss: [0.756]
Epoch: [  5/  5], Steps: [196/196], Train Loss: [0.965]
--------------------------------------------------
Epoch: [  5/  5], Average Train Loss: [0.737], Accuracy: [0.738]
--------------------------------------------------
================================================================================
End of validation the best Accuracy is:  0.738, save the best ckpt file in ./BestCheckpoint/resnet50-best.ckpt

六、可视化模型预测

定义visualize_model函数

使用验证精度最高的模型

预测CIFAR-10测试数据集

预测结果可视化

预测字体颜色为蓝色 表示预测正确

预测字体颜色为红色 表示预测错误

5 epochs模型在验证数据集的预测准确率在70%左右

6张图片中会有2张预测失败

要达到理想的训练效果,训练80个epochs

复制代码
import matplotlib.pyplot as plt
​
def visualize_model(best_ckpt_path, dataset_val):
    num_class = 10  # 对狼和狗图像进行二分类
    net = resnet50(num_class)
    # 加载模型参数
    param_dict = ms.load_checkpoint(best_ckpt_path)
    ms.load_param_into_net(net, param_dict)
    # 加载验证集的数据进行验证
    data = next(dataset_val.create_dict_iterator())
    images = data["image"]
    labels = data["label"]
    # 预测图像类别
    output = net(data['image'])
    pred = np.argmax(output.asnumpy(), axis=1)
​
    # 图像分类
    classes = []
​
    with open(data_dir + "/batches.meta.txt", "r") as f:
        for line in f:
            line = line.rstrip()
            if line:
                classes.append(line)
​
    # 显示图像及图像的预测值
    plt.figure()
    for i in range(6):
        plt.subplot(2, 3, i + 1)
        # 若预测正确,显示为蓝色;若预测错误,显示为红色
        color = 'blue' if pred[i] == labels.asnumpy()[i] else 'red'
        plt.title('predict:{}'.format(classes[pred[i]]), color=color)
        picture_show = np.transpose(images.asnumpy()[i], (1, 2, 0))
        mean = np.array([0.4914, 0.4822, 0.4465])
        std = np.array([0.2023, 0.1994, 0.2010])
        picture_show = std * picture_show + mean
        picture_show = np.clip(picture_show, 0, 1)
        plt.imshow(picture_show)
        plt.axis('off')
​
    plt.show()
​
​
# 使用测试数据集进行验证
visualize_model(best_ckpt_path=best_ckpt_path, dataset_val=dataset_val)

输出:

相关推荐
chushiyunen18 分钟前
dom操作笔记、xml和document等
xml·java·笔记
chushiyunen21 分钟前
tomcat使用笔记、启动失败但是未打印日志
java·笔记·tomcat
汇能感知26 分钟前
光谱相机的光谱数据采集原理
经验分享·笔记·科技
人人题1 小时前
汽车加气站操作工考试答题模板
笔记·职场和发展·微信小程序·汽车·创业创新·学习方法·业界资讯
xiangzhihong81 小时前
Amodal3R ,南洋理工推出的 3D 生成模型
人工智能·深度学习·计算机视觉
小脑斧爱吃鱼鱼1 小时前
鸿蒙项目笔记(1)
笔记·学习·harmonyos
阿linlin1 小时前
OpenCV--图像预处理学习01
opencv·学习·计算机视觉
狂奔solar1 小时前
diffusion-vas 提升遮挡区域的分割精度
人工智能·深度学习
张张张3122 小时前
4.2学习总结 Java:list系列集合
java·学习
SuperW2 小时前
linux课程学习二——缓存
学习