Deep Learning and Applications: Human Keypoint Detection

Experiment 2: Deep Learning and Applications: Human Keypoint Detection

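1. Objectives

Understand the basic workflow of human keypoint detection
Become familiar with the YOLOv7-pose model architecture
Learn to train, fine-tune, and run inference with the YOLOv7-pose model
Learn to apply the YOLOv7-pose model to practical problems, and understand how to use it in specific scenarios and tasks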

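2. Environment

[Image details]

Number of virtual machines: 1 (requires a GPU with at least 4 GB of memory)

Virtual machine information:

Operating system: Ubuntu 20.04
Code location: /home/zkpk/experiment/YOLOV7-POSE
MS COCO 2017 dataset location: /home/zkpk/experiment/YOLOV7-POSE/images
(Dataset download: https://cocodataset.org/#download)
A tiny test dataset is provided at ./data/coco128
Installed software: Python 3.9, GPU driver, CUDA 11.3, cuDNN 8.4.1, torch==1.12.1+cu113, torchvision==0.13.1+cu113
Install the remaining Python dependencies according to requirements.txt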
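
Before training, it can help to confirm that the installed packages match the versions listed above. The following is a minimal sketch using only the standard library; the `EXPECTED` table simply repeats the pins from this section, and the helper names `installed_version` and `check_versions` are made up for this example.

```python
from importlib import metadata

# Pins copied from the environment list in this section.
EXPECTED = {"torch": "1.12.1+cu113", "torchvision": "0.13.1+cu113"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected, lookup=installed_version):
    """Map each package name to (expected, found, matches)."""
    report = {}
    for package, wanted in expected.items():
        found = lookup(package)
        report[package] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_versions(EXPECTED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {wanted}, found {found}, {status}")
```

A mismatch is not necessarily fatal, but it is a good first thing to check when the training script fails to start.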

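3. Tasks

Prepare the MS COCO 2017 dataset or the tiny dataset coco128 (./data/coco128)
Train the YOLOv7-pose model with the given training parameters
Test the YOLOv7-pose model
Feed a single image to the trained model and detect the human keypoints in it
Feed an offline video to the model and detect human keypoints in it in real time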

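4. Key Points

The dataset index files must be located where the dataset configuration file (coco_kpts.yaml) says they are;
The dataset must be stored at the paths listed in the dataset index files, as shown in Figure 1
Figure 1
If an OOM (out-of-memory) error occurs during training, lower the --batch-size parameter (usually to a power of 2)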
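
To make the OOM advice concrete, the sketch below lists the power-of-two batch sizes you could try in turn, from the largest one that fits under your starting value down to a minimum. The function name `batch_size_candidates` is hypothetical and not part of the YOLOv7 code.

```python
def batch_size_candidates(start, minimum=1):
    """Return power-of-two batch sizes from the largest one <= start down to minimum.

    The list is empty when no power of two lies between minimum and start.
    """
    if minimum < 1 or start < minimum:
        raise ValueError("need start >= minimum >= 1")
    size = 1
    while size * 2 <= start:  # find the largest power of two not above start
        size *= 2
    candidates = []
    while size >= minimum:
        candidates.append(size)
        size //= 2  # halve after each OOM failure
    return candidates


if __name__ == "__main__":
    print(batch_size_candidates(16))
```

Pass each value to --batch-size in order and stop at the first one that trains without running out of memory.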

5. Expected Result

Human keypoint detection

6. Procedure

6.1 Download the MS COCO 2017 dataset from https://cocodataset.org/#download and store it at /home/zkpk/experiment/YOLOV7-POSE/images,
or use the tiny dataset under data/coco128 to try out training and inference;
6.2 Open a terminal and change to the project directory
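
Since training depends on every path in the dataset index files being valid, a quick check after preparing the data can save a failed run. This is a minimal sketch that assumes an index file is a plain text file with one image path per line, resolved relative to a root directory; the function name `missing_images` is made up for this example.

```python
from pathlib import Path


def missing_images(index_file, root="."):
    """Return the entries of index_file that do not exist as files under root."""
    index_path = Path(index_file)
    if not index_path.is_file():
        raise FileNotFoundError(f"index file not found: {index_path}")
    missing = []
    for line in index_path.read_text().splitlines():
        entry = line.strip()
        if not entry:  # skip blank lines
            continue
        if not (Path(root) / entry).is_file():
            missing.append(entry)
    return missing
```

Call it with the index file named in your dataset configuration file and the project directory as `root`; an empty list means every listed image was found.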

cd /home/zkpk/experiment/YOLOV7-POSE

6.3 Train the YOLOv7-pose model

# Train on the GPU with the tiny coco128 dataset
python train.py --weights weights/yolov7-w6-person.pt --cfg cfg/yolov7-w6-pose.yaml --data data/coco_kpts_128.yaml --hyp data/hyp.pose.yaml --batch-size 1 --img-size 960 --device "0" --kpt-label

# Train on the CPU with the tiny coco128 dataset
python train.py --weights weights/yolov7-w6-person.pt --cfg cfg/yolov7-w6-pose.yaml --data data/coco_kpts_128.yaml --hyp data/hyp.pose.yaml --batch-size 1 --img-size 960 --device "cpu" --kpt-label

# Train on the CPU with the full MS COCO dataset
python train.py --weights weights/yolov7-w6-person.pt --cfg cfg/yolov7-w6-pose.yaml --data data/coco_kpts.yaml --hyp data/hyp.pose.yaml --batch-size 1 --img-size 960 --device "cpu" --kpt-label

# Train on the GPU with the full MS COCO dataset
python train.py --weights weights/yolov7-w6-person.pt --cfg cfg/yolov7-w6-pose.yaml --data data/coco_kpts.yaml --hyp data/hyp.pose.yaml --batch-size 1 --img-size 960 --device "0" --kpt-label

The training log looks like this:

tensorboard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
hyperparameters: lr0=0.01, lrf=0.1, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, kpt=0.1, cls=0.3, cls_pw=1.0, obj=0.7, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0
wandb: Install Weights & Biases for YOLOv5 logging with 'pip install wandb' (recommended)

                 from  n    params  module                                  arguments                     
  0                -1  1         0  models.common.ReOrg                     []                            
  1                -1  1      7040  models.common.Conv                      [12, 64, 3, 1]                
  2                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]               
  3                -1  1      8320  models.common.Conv                      [128, 64, 1, 1]               
  4                -2  1      8320  models.common.Conv                      [128, 64, 1, 1]               
  5                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
  6                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
  7                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
  8                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
  9  [-1, -3, -5, -6]  1         0  models.common.Concat                    [1]                           
 10                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 11                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]              
 12                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 13                -2  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 14                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 15                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 16                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 17                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 18  [-1, -3, -5, -6]  1         0  models.common.Concat                    [1]                           
 19                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 20                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]              
 21                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 22                -2  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 23                -1  1    590336  models.common.Conv                      [256, 256, 3, 1]              
 24                -1  1    590336  models.common.Conv                      [256, 256, 3, 1]              
 25                -1  1    590336  models.common.Conv                      [256, 256, 3, 1]              
 26                -1  1    590336  models.common.Conv                      [256, 256, 3, 1]              
 27  [-1, -3, -5, -6]  1         0  models.common.Concat                    [1]                           
 28                -1  1    525312  models.common.Conv                      [1024, 512, 1, 1]             
 29                -1  1   3540480  models.common.Conv                      [512, 768, 3, 2]              
 30                -1  1    295680  models.common.Conv                      [768, 384, 1, 1]              
 31                -2  1    295680  models.common.Conv                      [768, 384, 1, 1]              
 32                -1  1   1327872  models.common.Conv                      [384, 384, 3, 1]              
 33                -1  1   1327872  models.common.Conv                      [384, 384, 3, 1]              
 34                -1  1   1327872  models.common.Conv                      [384, 384, 3, 1]              
 35                -1  1   1327872  models.common.Conv                      [384, 384, 3, 1]              
 36  [-1, -3, -5, -6]  1         0  models.common.Concat                    [1]                           
 37                -1  1   1181184  models.common.Conv                      [1536, 768, 1, 1]             
 38                -1  1   7079936  models.common.Conv                      [768, 1024, 3, 2]             
 39                -1  1    525312  models.common.Conv                      [1024, 512, 1, 1]             
 40                -2  1    525312  models.common.Conv                      [1024, 512, 1, 1]             
 41                -1  1   2360320  models.common.Conv                      [512, 512, 3, 1]              
 42                -1  1   2360320  models.common.Conv                      [512, 512, 3, 1]              
 43                -1  1   2360320  models.common.Conv                      [512, 512, 3, 1]              
 44                -1  1   2360320  models.common.Conv                      [512, 512, 3, 1]              
 45  [-1, -3, -5, -6]  1         0  models.common.Concat                    [1]                           
 46                -1  1   2099200  models.common.Conv                      [2048, 1024, 1, 1]            
 47                -1  1   7609344  models.common.SPPCSPC                   [1024, 512, 1]                
 48                -1  1    197376  models.common.Conv                      [512, 384, 1, 1]              
 49                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 50                37  1    295680  models.common.Conv                      [768, 384, 1, 1]              
 51          [-1, -2]  1         0  models.common.Concat                    [1]                           
 52                -1  1    295680  models.common.Conv                      [768, 384, 1, 1]              
 53                -2  1    295680  models.common.Conv                      [768, 384, 1, 1]              
 54                -1  1    663936  models.common.Conv                      [384, 192, 3, 1]              
 55                -1  1    332160  models.common.Conv                      [192, 192, 3, 1]              
 56                -1  1    332160  models.common.Conv                      [192, 192, 3, 1]              
 57                -1  1    332160  models.common.Conv                      [192, 192, 3, 1]              
 58[-1, -2, -3, -4, -5, -6]  1         0  models.common.Concat                    [1]                           
 59                -1  1    590592  models.common.Conv                      [1536, 384, 1, 1]             
 60                -1  1     98816  models.common.Conv                      [384, 256, 1, 1]              
 61                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 62                28  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 63          [-1, -2]  1         0  models.common.Concat                    [1]                           
 64                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 65                -2  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 66                -1  1    295168  models.common.Conv                      [256, 128, 3, 1]              
 67                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 68                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 69                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 70[-1, -2, -3, -4, -5, -6]  1         0  models.common.Concat                    [1]                           
 71                -1  1    262656  models.common.Conv                      [1024, 256, 1, 1]             
 72                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 73                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 74                19  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 75          [-1, -2]  1         0  models.common.Concat                    [1]                           
 76                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 77                -2  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 78                -1  1     73856  models.common.Conv                      [128, 64, 3, 1]               
 79                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
 80                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
 81                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
 82[-1, -2, -3, -4, -5, -6]  1         0  models.common.Concat                    [1]                           
 83                -1  1     65792  models.common.Conv                      [512, 128, 1, 1]              
 84                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]              
 85          [-1, 71]  1         0  models.common.Concat                    [1]                           
 86                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 87                -2  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 88                -1  1    295168  models.common.Conv                      [256, 128, 3, 1]              
 89                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 90                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 91                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 92[-1, -2, -3, -4, -5, -6]  1         0  models.common.Concat                    [1]                           
 93                -1  1    262656  models.common.Conv                      [1024, 256, 1, 1]             
 94                -1  1    885504  models.common.Conv                      [256, 384, 3, 2]              
 95          [-1, 59]  1         0  models.common.Concat                    [1]                           
 96                -1  1    295680  models.common.Conv                      [768, 384, 1, 1]              
 97                -2  1    295680  models.common.Conv                      [768, 384, 1, 1]              
 98                -1  1    663936  models.common.Conv                      [384, 192, 3, 1]              
 99                -1  1    332160  models.common.Conv                      [192, 192, 3, 1]              
100                -1  1    332160  models.common.Conv                      [192, 192, 3, 1]              
101                -1  1    332160  models.common.Conv                      [192, 192, 3, 1]              
102[-1, -2, -3, -4, -5, -6]  1         0  models.common.Concat                    [1]                           
103                -1  1    590592  models.common.Conv                      [1536, 384, 1, 1]             
104                -1  1   1770496  models.common.Conv                      [384, 512, 3, 2]              
105          [-1, 47]  1         0  models.common.Concat                    [1]                           
106                -1  1    525312  models.common.Conv                      [1024, 512, 1, 1]             
107                -2  1    525312  models.common.Conv                      [1024, 512, 1, 1]             
108                -1  1   1180160  models.common.Conv                      [512, 256, 3, 1]              
109                -1  1    590336  models.common.Conv                      [256, 256, 3, 1]              
110                -1  1    590336  models.common.Conv                      [256, 256, 3, 1]              
111                -1  1    590336  models.common.Conv                      [256, 256, 3, 1]              
112[-1, -2, -3, -4, -5, -6]  1         0  models.common.Concat                    [1]                           
113                -1  1   1049600  models.common.Conv                      [2048, 512, 1, 1]             
114                83  1    295424  models.common.Conv                      [128, 256, 3, 1]              
115                93  1   1180672  models.common.Conv                      [256, 512, 3, 1]              
116               103  1   2655744  models.common.Conv                      [384, 768, 3, 1]              
117               113  1   4720640  models.common.Conv                      [512, 1024, 3, 1]             
118[114, 115, 116, 117]  1  10466036  models.yolo.IKeypoint                   [1, [[19, 27, 44, 40, 38, 94], [96, 68, 86, 152, 180, 137], [140, 301, 303, 264, 238, 542], [436, 615, 739, 380, 925, 792]], 17, [256, 512, 768, 1024]]
D:\anaconda3\envs\yolo\lib\site-packages\torch\functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\TensorShape.cpp:2895.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Model Summary: 641 layers, 80238452 parameters, 80238452 gradients, 102.2 GFLOPS

Transferred 634/908 items from weights/yolov7-w6-person.pt
Scaled weight_decay = 0.0005
Optimizer groups: 155 .bias, 155 conv.weight, 155 other
train: Scanning 'data\coco128\labels\train2017.cache' images and labels... 16 found, 0 missing, 0 empty, 0 corrupted: 100%|██████████| 2/2 [00:00<?, ?it/s]
val: Scanning 'data\coco128\labels\train2017.cache' images and labels... 16 found, 0 missing, 0 empty, 0 corrupted: 100%|██████████| 2/2 [00:00<?, ?it/s]
Plotting labels... 

autoanchor: Analyzing anchors... anchors/target = 6.17, Best Possible Recall (BPR) = 1.0000
Image sizes 960 train, 960 test
Using 2 dataloader workers
Logging results to runs\train\yolov7-w6-pose13
Starting training for 300 epochs...

     Epoch   gpu_mem       box       obj       cls       kpt      kptv     total    labels  img_size
     0/299     4.14G   0.08823      1.94         0    0.3494  0.008104     2.386        19       960: 100%|██████████| 8/8 [00:21<00:00,  2.64s/it]
               Class      Images      Labels           P           R      mAP@.5  mAP@.5:.95: 100%|██████████| 4/4 [00:06<00:00,  1.53s/it]
                 all          16          41        0.75       0.585       0.606       0.334

     Epoch   gpu_mem       box       obj       cls       kpt      kptv     total    labels  img_size
     1/299     4.14G   0.08576    0.5416         0    0.3474  0.008164     0.983         8       960:  88%|████████▊ | 7/8 [00:07<00:01,  1.01s/it]
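
If you want to track the losses outside TensorBoard, the per-epoch progress rows in the log above can be parsed directly. The sketch below assumes the ten-column layout shown in the log header (Epoch, gpu_mem, box, obj, cls, kpt, kptv, total, labels, img_size); the function name `parse_train_row` is made up for this example.

```python
COLUMNS = ["epoch", "gpu_mem", "box", "obj", "cls", "kpt",
           "kptv", "total", "labels", "img_size"]


def parse_train_row(line):
    """Parse one training progress row into a dict, or return None for other lines."""
    # Everything after the first colon is the progress bar, so drop it.
    fields = line.split(":", 1)[0].split()
    if len(fields) != len(COLUMNS) or "/" not in fields[0]:
        return None
    row = dict(zip(COLUMNS, fields))
    try:
        current, last = row["epoch"].split("/")
        row["epoch"] = int(current)
        row["last_epoch"] = int(last)
        for name in ("box", "obj", "cls", "kpt", "kptv", "total"):
            row[name] = float(row[name])
        row["labels"] = int(row["labels"])
        row["img_size"] = int(row["img_size"])
    except ValueError:
        return None  # the line only looked like a progress row
    return row
```

The `gpu_mem` field is kept as a string such as `4.14G` because its unit suffix is part of the log format.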

6.4 Test the trained model

Enter the test command in the terminal

python test.py --data data/coco_kpts_128.yaml --img 960 --conf 0.001 --iou 0.65 --weights yolov7-w6-pose.pt --kpt-label

To test the model trained in 6.3, replace the --weights path with the path where the trained model was saved. Models are saved under runs/train/yolov7-w6-poseXX (XX is the run number).
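
Because the run number grows with every training run, it is easy to pick up an old directory by mistake. The sketch below finds the run directory with the highest number, comparing the suffix numerically so that pose13 ranks above pose2. The function name `latest_run_dir` is hypothetical, and the default prefix is taken from the log line `Logging results to runs\train\yolov7-w6-pose13`.

```python
import re
from pathlib import Path


def latest_run_dir(train_root="runs/train", prefix="yolov7-w6-pose"):
    """Return the run directory with the highest numeric suffix, or None."""
    root = Path(train_root)
    if not root.is_dir():
        return None
    pattern = re.compile(re.escape(prefix) + r"(\d*)$")
    best_number, best_dir = -1, None
    for entry in root.iterdir():
        match = pattern.match(entry.name)
        if entry.is_dir() and match:
            number = int(match.group(1) or 0)  # a bare prefix counts as run 0
            if number > best_number:
                best_number, best_dir = number, entry
    return best_dir
```

In YOLOv5-style repositories the checkpoints usually sit in a weights subfolder of the run directory; check your own run directory to confirm the exact file name before passing it to --weights.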

6.5 Run inference on a single image and output the detected human keypoints

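import cv2
import numpy as np
import torch
from torchvision import transforms
from utils.datasets import letterbox  # helper from the YOLOv7 repository

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
weights = torch.load('yolov7-w6-pose.pt', map_location=device)
model = weights['model']
_ = model.float().eval()

if torch.cuda.is_available():
    model.half().to(device)  # use FP16 on the GPU

image = cv2.imread('./person.jpg')  # path to the test image
image = letterbox(image, 960, stride=64, auto=True)[0]  # resize and pad
image_ = image.copy()  # keep a copy for drawing the results
image = transforms.ToTensor()(image)
image = torch.tensor(np.array([image.numpy()]))  # add the batch dimension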
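
The `letterbox` call above resizes the image while keeping its aspect ratio and pads it so that both sides are multiples of the stride. The sketch below reproduces only that geometry in plain Python, as I understand the usual YOLO letterbox; it is an illustration rather than the repository's implementation, which may differ in details such as how the padding is split between the two sides. The function name `letterbox_shape` is made up for this example.

```python
def letterbox_shape(height, width, new_size=960, stride=64, auto=True):
    """Return ((out_h, out_w), (pad_h, pad_w)) for an aspect-preserving resize."""
    # Scale so that the longer side becomes new_size.
    ratio = min(new_size / height, new_size / width)
    resized_h, resized_w = round(height * ratio), round(width * ratio)
    # Padding needed to reach a full new_size x new_size square.
    pad_h, pad_w = new_size - resized_h, new_size - resized_w
    if auto:
        # Keep only the padding needed to reach a multiple of the stride.
        pad_h, pad_w = pad_h % stride, pad_w % stride
    return (resized_h + pad_h, resized_w + pad_w), (pad_h, pad_w)


if __name__ == "__main__":
    print(letterbox_shape(720, 1280))
```

For a 1280x720 frame the scale factor is 0.75, so the image becomes 960x540 and 36 rows of padding bring the height to 576, which is 9 times the stride of 64.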

Run it with:

python  keypoints.py

The result looks like this:

6.6 Run real-time inference on a video and output the detected human keypoints

python keypoint_video.py

You can change the input source in keypoint_video.py to use your own video:

import cv2
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
weights = torch.load('yolov7-w6-pose.pt', map_location=device)
model = weights['model'].float().to(device).eval()
if device.type == 'cuda':
    model = model.half()  # FP16 is only used on the GPU

cap = cv2.VideoCapture('2.mp4')  # path to the input video
if not cap.isOpened():
    print('open failed.')
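
Once the video is open, the script needs a loop that reads frames until the source is exhausted and runs the model on each one. The sketch below shows such a loop in a library-independent form and reports the achieved frames per second, which tells you whether inference keeps up with the video. It assumes a reader that returns an `(ok, frame)` pair the way `cv2.VideoCapture.read()` does; `run_stream` and the `infer` callback are hypothetical names, not part of keypoint_video.py.

```python
import time


def run_stream(read_frame, infer, max_frames=None):
    """Call infer on each frame until read_frame reports failure.

    Returns (frames_processed, frames_per_second).
    """
    count = 0
    start = time.perf_counter()
    while max_frames is None or count < max_frames:
        ok, frame = read_frame()
        if not ok:  # end of the video or a read error
            break
        infer(frame)
        count += 1
    elapsed = time.perf_counter() - start
    fps = count / elapsed if elapsed > 0 else 0.0
    return count, fps
```

With OpenCV you would pass `cap.read` as `read_frame` and your own preprocessing plus model call as `infer`, then call `cap.release()` when the loop returns.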

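The result looks like this:

7. Questions

Which parts of the model architecture could still be improved for human keypoint detection?
How could the YOLOv7-pose model be applied to hand gesture and pose recognition?
How could the model parameters and training parameters be tuned to improve the evaluation metrics?

8. Lab Report

Write the lab report following the required report format. Message me if you need the source code.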
