P6周:VGG-16算法-Pytorch实现人脸识别

我的环境

语言环境:Python 3.8.12

编译器:jupyter notebook

深度学习环境:torch 1.12.0+cu113

一、前期准备
1.设置GPU
python 复制代码
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision
from torchvision import transforms, datasets
import os,PIL,pathlib,warnings

warnings.filterwarnings("ignore")             #忽略警告信息

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device
复制代码
device(type='cuda')
2.导入数据
python 复制代码
import os,PIL,random,pathlib

data_dir = 'F:/jupyter lab/DL-100-days/datasets/Hollywood_stars_photos/'
data_dir = pathlib.Path(data_dir)

data_paths  = list(data_dir.glob('*'))
classeNames = [str(path).split("\\")[5] for path in data_paths]
classeNames
复制代码
['Angelina Jolie',
 'Brad Pitt',
 'Denzel Washington',
 'Hugh Jackman',
 'Jennifer Lawrence',
 'Johnny Depp',
 'Kate Winslet',
 'Leonardo DiCaprio',
 'Megan Fox',
 'Natalie Portman',
 'Nicole Kidman',
 'Robert Downey Jr',
 'Sandra Bullock',
 'Scarlett Johansson',
 'Tom Cruise',
 'Tom Hanks',
 'Will Smith']
python 复制代码
# 关于transforms.Compose的更多介绍可以参考:https://blog.csdn.net/qq_38251616/article/details/124878863
train_transforms = transforms.Compose([
    transforms.Resize([224, 224]),  # 将输入图片resize成统一尺寸
    # transforms.RandomHorizontalFlip(), # 随机水平翻转
    transforms.ToTensor(),          # 将PIL Image或numpy.ndarray转换为tensor,并归一化到[0,1]之间
    transforms.Normalize(           # 标准化处理-->转换为标准正太分布(高斯分布),使模型更容易收敛
        mean=[0.485, 0.456, 0.406], 
        std=[0.229, 0.224, 0.225])  # 其中 mean=[0.485,0.456,0.406]与std=[0.229,0.224,0.225] 从数据集中随机抽样计算得到的。
])

total_data = datasets.ImageFolder("F:/jupyter lab/DL-100-days/datasets/Hollywood_stars_photos/",transform=train_transforms)
total_data
复制代码
Dataset ImageFolder
    Number of datapoints: 1800
    Root location: F:/jupyter lab/DL-100-days/datasets/Hollywood_stars_photos/
    StandardTransform
Transform: Compose(
               Resize(size=[224, 224], interpolation=bilinear, max_size=None, antialias=None)
               ToTensor()
               Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
           )
python 复制代码
total_data.class_to_idx
复制代码
{'Angelina Jolie': 0,
 'Brad Pitt': 1,
 'Denzel Washington': 2,
 'Hugh Jackman': 3,
 'Jennifer Lawrence': 4,
 'Johnny Depp': 5,
 'Kate Winslet': 6,
 'Leonardo DiCaprio': 7,
 'Megan Fox': 8,
 'Natalie Portman': 9,
 'Nicole Kidman': 10,
 'Robert Downey Jr': 11,
 'Sandra Bullock': 12,
 'Scarlett Johansson': 13,
 'Tom Cruise': 14,
 'Tom Hanks': 15,
 'Will Smith': 16}
3.划分数据集
python 复制代码
train_size = int(0.8 * len(total_data))
test_size  = len(total_data) - train_size
train_dataset, test_dataset = torch.utils.data.random_split(total_data, [train_size, test_size])
train_dataset, test_dataset
复制代码
(<torch.utils.data.dataset.Subset at 0x187391e5a60>,
 <torch.utils.data.dataset.Subset at 0x187391e5b20>)
python 复制代码
batch_size = 32

train_dl = torch.utils.data.DataLoader(train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True,
                                           num_workers=1)
test_dl = torch.utils.data.DataLoader(test_dataset,
                                          batch_size=batch_size,
                                          shuffle=True,
                                          num_workers=1)
python 复制代码
for X, y in test_dl:
    print("Shape of X [N, C, H, W]: ", X.shape)
    print("Shape of y: ", y.shape, y.dtype)
    break
复制代码
Shape of X [N, C, H, W]:  torch.Size([32, 3, 224, 224])
Shape of y:  torch.Size([32]) torch.int64

二、调用官方的VGG16模型

python 复制代码
from torchvision.models import vgg16

device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using {} device".format(device))
    
# 加载预训练模型,并且对模型进行微调
model = vgg16(pretrained = True).to(device) # 加载预训练的vgg16模型

for param in model.parameters():
    param.requires_grad = False # 冻结模型的参数,这样子在训练的时候只训练最后一层的参数

# 修改classifier模块的第6层(即:(6): Linear(in_features=4096, out_features=2, bias=True))
# 注意查看我们下方打印出来的模型
model.classifier._modules['6'] = nn.Linear(4096,len(classeNames)) # 修改vgg16模型中最后一层全连接层,输出目标类别个数
model.to(device)  
model
复制代码
Using cuda device
VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (18): ReLU(inplace=True)
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace=True)
    (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (25): ReLU(inplace=True)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (27): ReLU(inplace=True)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace=True)
    (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=17, bias=True)
  )
)

三、训练循环

1.编写训练函数
python 复制代码
# 训练循环
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)  # 训练集的大小
    num_batches = len(dataloader)   # 批次数目, (size/batch_size,向上取整)

    train_loss, train_acc = 0, 0  # 初始化训练损失和正确率
    
    for X, y in dataloader:  # 获取图片及其标签
        X, y = X.to(device), y.to(device)
        
        # 计算预测误差
        pred = model(X)          # 网络输出
        loss = loss_fn(pred, y)  # 计算网络输出和真实值之间的差距,targets为真实值,计算二者差值即为损失
        
        # 反向传播
        optimizer.zero_grad()  # grad属性归零
        loss.backward()        # 反向传播
        optimizer.step()       # 每一步自动更新
        
        # 记录acc与loss
        train_acc  += (pred.argmax(1) == y).type(torch.float).sum().item()
        train_loss += loss.item()
            
    train_acc  /= size
    train_loss /= num_batches

    return train_acc, train_loss
2.编写测试函数
python 复制代码
def test (dataloader, model, loss_fn):
    size        = len(dataloader.dataset)  # 测试集的大小
    num_batches = len(dataloader)          # 批次数目, (size/batch_size,向上取整)
    test_loss, test_acc = 0, 0
    
    # 当不进行训练时,停止梯度更新,节省计算内存消耗
    with torch.no_grad():
        for imgs, target in dataloader:
            imgs, target = imgs.to(device), target.to(device)
            
            # 计算loss
            target_pred = model(imgs)
            loss        = loss_fn(target_pred, target)
            
            test_loss += loss.item()
            test_acc  += (target_pred.argmax(1) == target).type(torch.float).sum().item()

    test_acc  /= size
    test_loss /= num_batches

    return test_acc, test_loss
3.设置动态学习率
python 复制代码
# 调用官方动态学习率接口时使用
learn_rate = 1e-3 # 初始学习率
lambda1 = lambda epoch: 0.92 ** (epoch // 4)
optimizer = torch.optim.SGD(model.parameters(), lr=learn_rate)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda1) #选定调整方法
4.正式训练
python 复制代码
import copy

loss_fn    = nn.CrossEntropyLoss() # 创建损失函数
epochs     = 40

train_loss = []
train_acc  = []
test_loss  = []
test_acc   = []

best_acc = 0    # 设置一个最佳准确率,作为最佳模型的判别指标

for epoch in range(epochs):
    # 更新学习率(使用自定义学习率时使用)
    # adjust_learning_rate(optimizer, epoch, learn_rate)
    
    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, optimizer)
    scheduler.step() # 更新学习率(调用官方动态学习率接口时使用)
    
    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)
    
    # 保存最佳模型到 best_model
    if epoch_test_acc > best_acc:
        best_acc   = epoch_test_acc
        best_model = copy.deepcopy(model)
    
    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)
    
    # 获取当前的学习率
    lr = optimizer.state_dict()['param_groups'][0]['lr']
    
    template = ('Epoch:{:2d}, Train_acc:{:.1f}%, Train_loss:{:.3f}, Test_acc:{:.1f}%, Test_loss:{:.3f}, Lr:{:.2E}')
    print(template.format(epoch+1, epoch_train_acc*100, epoch_train_loss, 
                          epoch_test_acc*100, epoch_test_loss, lr))
    
# 保存最佳模型到文件中
PATH = './best_model123.pth'  # 保存的参数文件名
torch.save(model.state_dict(), PATH)

print('Done')
复制代码
Epoch: 1, Train_acc:11.9%, Train_loss:2.810, Test_acc:14.4%, Test_loss:2.639, Lr:1.00E-03
Epoch: 2, Train_acc:17.4%, Train_loss:2.590, Test_acc:15.3%, Test_loss:2.512, Lr:1.00E-03
Epoch: 3, Train_acc:17.8%, Train_loss:2.483, Test_acc:17.5%, Test_loss:2.400, Lr:1.00E-03
Epoch: 4, Train_acc:21.0%, Train_loss:2.409, Test_acc:20.8%, Test_loss:2.344, Lr:9.20E-04
Epoch: 5, Train_acc:22.1%, Train_loss:2.331, Test_acc:23.9%, Test_loss:2.289, Lr:9.20E-04
...........
Epoch:36, Train_acc:42.2%, Train_loss:1.763, Test_acc:38.9%, Test_loss:1.858, Lr:4.72E-04
Epoch:37, Train_acc:43.8%, Train_loss:1.744, Test_acc:39.2%, Test_loss:1.872, Lr:4.72E-04
Epoch:38, Train_acc:44.0%, Train_loss:1.747, Test_acc:39.4%, Test_loss:1.872, Lr:4.72E-04
Epoch:39, Train_acc:43.9%, Train_loss:1.744, Test_acc:39.7%, Test_loss:1.859, Lr:4.72E-04
Epoch:40, Train_acc:44.4%, Train_loss:1.731, Test_acc:39.7%, Test_loss:1.885, Lr:4.34E-04
Done

四、结果可视化

1.Loss与Accuracy图
python 复制代码
import matplotlib.pyplot as plt
#隐藏警告
import warnings
warnings.filterwarnings("ignore")               #忽略警告信息
plt.rcParams['font.sans-serif']    = ['SimHei'] # 用来正常显示中文标签
plt.rcParams['axes.unicode_minus'] = False      # 用来正常显示负号
plt.rcParams['figure.dpi']         = 100        #分辨率

from datetime import datetime
current_time = datetime.now() # 获取当前时间

epochs_range = range(epochs)

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1)

plt.plot(epochs_range, train_acc, label='Training Accuracy')
plt.plot(epochs_range, test_acc, label='Test Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.xlabel(current_time) # 打卡请带上时间戳,否则代码截图无效

plt.subplot(1, 2, 2)
plt.plot(epochs_range, train_loss, label='Training Loss')
plt.plot(epochs_range, test_loss, label='Test Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
2.对指定图片进行预测
python 复制代码
from PIL import Image 

classes = list(total_data.class_to_idx)

def predict_one_image(image_path, model, transform, classes):
    
    test_img = Image.open(image_path).convert('RGB')
    plt.imshow(test_img)  # 展示预测的图片

    test_img = transform(test_img)
    img = test_img.to(device).unsqueeze(0)
    
    model.eval()
    output = model(img)

    _,pred = torch.max(output,1)
    pred_class = classes[pred]
    print(f'预测结果是:{pred_class}')
python 复制代码
# 预测训练集中的某张照片
predict_one_image(image_path='F:/jupyter lab/DL-100-days/datasets/Hollywood_stars_photos/Angelina Jolie/001_fe3347c0.jpg', 
                  model=model, 
                  transform=train_transforms, 
                  classes=classes)
复制代码
预测结果是:Angelina Jolie
3.模型评估
python 复制代码
best_model.eval()
epoch_test_acc, epoch_test_loss = test(test_dl, best_model, loss_fn)
python 复制代码
epoch_test_acc, epoch_test_loss
复制代码
(0.3972222222222222, 1.8640953699747722)
python 复制代码
# 查看是否与我们记录的最高准确率一致
epoch_test_acc
复制代码
0.3972222222222222

五、学习心得

1.本次使用pytorch深度学习环境对官方的VGG-16模型进行调用,并且保存最佳模型权重。VGG-16模型的最大特点是深度。除此之外,其卷积层都采用3x3的卷积核和步长为1的卷积操作,同时卷积层后都有ReLU激活函数,从而降低过拟合风险。

2.训练过程中发现训练和测试的acc都过低(约为20%),通过调整动态学习率予以调整,初始学习率增大一个数量级之后,此问题得到一定的解决。

3.下一步将自行搭建VGG-16模型。

相关推荐
haosend15 分钟前
AI时代,传统网络运维人员的转型指南
python·数据网络·网络自动化
曲幽27 分钟前
不止于JWT:用FastAPI的Depends实现细粒度权限控制
python·fastapi·web·jwt·rbac·permission·depends·abac
风象南8 小时前
我把大脑开源给了AI
人工智能·后端
Johny_Zhao10 小时前
OpenClaw安装部署教程
linux·人工智能·ai·云计算·系统运维·openclaw
飞哥数智坊10 小时前
我帮你读《一人公司(OPC)发展研究》
人工智能
冬奇Lab14 小时前
OpenClaw 源码精读(3):Agent 执行引擎——AI 如何「思考」并与真实世界交互?
人工智能·aigc
没事勤琢磨15 小时前
如何让 OpenClaw 控制使用浏览器:让 AI 像真人一样操控你的浏览器
人工智能
用户51914958484516 小时前
CrushFTP 认证绕过漏洞利用工具 (CVE-2024-4040)
人工智能·aigc
牛马摆渡人52816 小时前
OpenClaw实战--Day1: 本地化
人工智能
前端小豆16 小时前
玩转 OpenClaw:打造你的私有 AI 助手网关
人工智能