MMDetection Inference and Validation Output Explained (Single-Image Demo)

After training finishes, place the final checkpoint file and the model config file in one folder:

The official test/inference command is:

python tools/test.py <config file path> <checkpoint file path>

For this run:

python tools/test.py work_dirs/deformable-detr_r50_16xb2-50e_coco/20250527_094540/deformable-detr_r50_16xb2-50e_coco.py work_dirs/deformable-detr_r50_16xb2-50e_coco/20250527_094540/epoch_50.pth

The test run ends with the following output:

Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.985
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.993
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.990
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.978
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.986
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.996
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 ] = 0.996
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.996
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.986
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.997
05/27 13:45:59 - mmengine - INFO - bbox_mAP_copypaste: 0.985 0.993 0.990 -1.000 0.978 0.986
05/27 13:45:59 - mmengine - INFO - Epoch(test) [1549/1549]    coco/bbox_mAP: 0.9850  coco/bbox_mAP_50: 0.9930  coco/bbox_mAP_75: 0.9900  coco/bbox_mAP_s: -1.0000  coco/bbox_mAP_m: 0.9780  coco/bbox_mAP_l: 0.9860  data_time: 0.0020  time: 0.0326
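
Every AP/AR row above is parameterized by an IoU (intersection-over-union) threshold. As a reminder of what that threshold measures, here is a minimal, self-contained IoU computation for XYXY boxes (the example boxes are made up):

```python
def iou_xyxy(a, b):
    """IoU of two boxes given as [x_min, y_min, x_max, y_max]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Overlap 80, union 120: a true positive at IoU=0.50 but a miss at IoU=0.75
print(iou_xyxy([0, 0, 10, 10], [2, 0, 12, 10]))  # 0.6666...
```

The first AP row averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The -1.000 entries simply mean the dataset contains no ground-truth objects in that area range (here: no small objects), so the metric is undefined there.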

It also writes out a folder containing the model config and the log records:

The above is only a basic validation. To inspect the result for each individual image, you need a separate script:

We take the installation-verification snippet from the "Get Started" page of the MMDetection 3.3.0 documentation and modify it slightly, mainly to print the result object:

from mmdet.apis import init_detector, inference_detector

config_file = '/home/hary/ctc/mmdetection/work_dirs/deformable-detr_r50_16xb2-50e_coco/20250527_094540/deformable-detr_r50_16xb2-50e_coco.py'  # path to the model config file
checkpoint_file = '/home/hary/ctc/mmdetection/work_dirs/deformable-detr_r50_16xb2-50e_coco/20250527_094540/epoch_50.pth'  # path to the checkpoint file
model = init_detector(config_file, checkpoint_file, device='cpu')  # or device='cuda:0'
img_path = '/home/hary/ctc/mmdetection/Dataset_depth_COCO/val2017/1112_9-rgb.png'  # path to the test image
result = inference_detector(model, img_path)
print('---------------------------------')
print(result)

The printed result:

---------------------------------
<DetDataSample(

    META INFORMATION
    batch_input_shape: (800, 1067)
    ori_shape: (960, 1280)
    pad_shape: (800, 1067)
    img_id: 0
    img_path: '/home/hary/ctc/mmdetection/Dataset_depth_COCO/val2017/1112_9-rgb.png'
    scale_factor: (0.83359375, 0.8333333333333334)
    img_shape: (800, 1067)

    DATA FIELDS
    pred_instances: <InstanceData(
        
            META INFORMATION
        
            DATA FIELDS
            scores: tensor([9.6636e-01, 8.6282e-01, 5.0045e-02, 3.0949e-02, 2.1291e-03, 1.8354e-03,
                        1.4797e-03, 9.5198e-04, 9.4724e-04, 9.0408e-04, 8.4457e-04, 7.7178e-04,
                        7.5072e-04, 7.4246e-04, 7.2029e-04, 7.0234e-04, 6.9444e-04, 6.6890e-04,
                        6.6067e-04, 6.5956e-04, 6.4741e-04, 6.2452e-04, 6.2145e-04, 6.0680e-04,
                        6.0662e-04, 5.8978e-04, 5.7643e-04, 5.6989e-04, 5.6188e-04, 5.6018e-04,
                        5.4903e-04, 5.3914e-04, 5.3539e-04, 5.3505e-04, 5.1647e-04, 5.1595e-04,
                        5.1284e-04, 5.0697e-04, 5.0271e-04, 4.9641e-04, 4.9617e-04, 4.9153e-04,
                        4.8392e-04, 4.6943e-04, 4.6823e-04, 4.6747e-04, 4.6432e-04, 4.6207e-04,
                        4.6087e-04, 4.5963e-04, 4.5072e-04, 4.4785e-04, 4.4583e-04, 4.4582e-04,
                        4.3889e-04, 4.3656e-04, 4.3481e-04, 4.3422e-04, 4.2235e-04, 4.1829e-04,
                        4.1680e-04, 4.1591e-04, 4.1196e-04, 4.0530e-04, 4.0237e-04, 3.9507e-04,
                        3.9482e-04, 3.8735e-04, 3.8713e-04, 3.8407e-04, 3.8263e-04, 3.8178e-04,
                        3.8151e-04, 3.7959e-04, 3.7872e-04, 3.7765e-04, 3.7668e-04, 3.7320e-04,
                        3.7057e-04, 3.6245e-04, 3.6220e-04, 3.6162e-04, 3.5402e-04, 3.5096e-04,
                        3.5083e-04, 3.5054e-04, 3.4927e-04, 3.4678e-04, 3.4543e-04, 3.4240e-04,
                        3.4200e-04, 3.4044e-04, 3.3560e-04, 3.3540e-04, 3.3506e-04, 3.3181e-04,
                        3.2372e-04, 3.2135e-04, 3.2047e-04, 3.1865e-04])
            bboxes: tensor([[3.8038e+02, 3.3829e+02, 8.7101e+02, 5.1814e+02],
                        [8.0331e-01, 3.3403e+02, 2.0168e+02, 7.1068e+02],
                        [3.8038e+02, 3.3829e+02, 8.7101e+02, 5.1814e+02],
                        [8.0331e-01, 3.3403e+02, 2.0168e+02, 7.1068e+02],
                        [4.0823e+02, 3.4763e+02, 8.7735e+02, 5.2347e+02],
                        [4.0823e+02, 3.4763e+02, 8.7735e+02, 5.2347e+02],
                        [4.1606e+02, 3.4775e+02, 8.7032e+02, 5.2471e+02],
                        [4.1606e+02, 3.4775e+02, 8.7032e+02, 5.2471e+02],
                        [4.2050e+02, 3.5188e+02, 8.8307e+02, 5.3558e+02],
                        [3.9088e+02, 3.9543e+02, 8.6223e+02, 5.3254e+02],
                        [0.0000e+00, 4.1365e+02, 4.3029e+02, 6.4290e+02],
                        [3.2440e+02, 8.4257e+02, 7.7319e+02, 9.6000e+02],
                        [3.8511e+02, 4.8201e+02, 8.7394e+02, 5.7323e+02],
                        [3.6575e+02, 5.1804e+02, 8.5130e+02, 5.9951e+02],
                        [3.3422e+02, 6.9991e+02, 8.4247e+02, 7.3495e+02],
                        [3.6575e+02, 5.1804e+02, 8.5130e+02, 5.9951e+02],
                        [3.8627e+02, 4.8445e+02, 8.6102e+02, 5.7369e+02],
                        [4.1046e+02, 4.9052e+02, 8.6503e+02, 5.8720e+02],
                        [4.2050e+02, 3.5188e+02, 8.8307e+02, 5.3558e+02],
                        [3.9088e+02, 3.9543e+02, 8.6223e+02, 5.3254e+02],
                        [3.7638e+02, 7.6128e+02, 8.9853e+02, 7.8412e+02],
                        [0.0000e+00, 8.4846e+02, 4.7391e+02, 9.6000e+02],
                        [0.0000e+00, 3.9305e+02, 4.1092e+02, 7.4043e+02],
                        [3.7064e+02, 6.1048e+02, 8.7357e+02, 6.6218e+02],
                        [3.8627e+02, 4.8445e+02, 8.6102e+02, 5.7369e+02],
                        [3.5878e+02, 8.5559e+02, 8.2981e+02, 9.6000e+02],
                        [0.0000e+00, 4.0445e+02, 4.1523e+02, 6.2125e+02],
                        [3.8181e+02, 7.6912e+02, 9.2646e+02, 7.9225e+02],
                        [2.6032e+02, 8.4527e+02, 7.3658e+02, 9.6000e+02],
                        [4.0113e+02, 5.7291e+02, 8.7452e+02, 6.3975e+02],
                        [0.0000e+00, 4.2017e+02, 4.4482e+02, 6.2521e+02],
                        [3.2440e+02, 8.4257e+02, 7.7319e+02, 9.6000e+02],
                        [0.0000e+00, 4.0378e+02, 4.1072e+02, 7.2026e+02],
                        [4.2259e+02, 8.0411e+02, 9.6577e+02, 8.2172e+02],
                        [3.4414e+02, 6.9423e+02, 8.6352e+02, 7.2694e+02],
                        [3.9884e+02, 3.8749e+02, 8.6840e+02, 5.4867e+02],
                        [0.0000e+00, 4.0742e+02, 4.3495e+02, 6.3147e+02],
                        [3.5916e+02, 6.9879e+02, 8.7993e+02, 7.3169e+02],
                        [6.6263e+02, 5.6504e+02, 9.8699e+02, 6.6078e+02],
                        [5.6836e+02, 6.8688e+02, 1.0009e+03, 7.3245e+02],
                        [4.0486e+02, 3.5922e+02, 8.6717e+02, 5.2445e+02],
                        [4.0486e+02, 3.5922e+02, 8.6717e+02, 5.2445e+02],
                        [5.2048e+02, 8.6254e+02, 9.8947e+02, 9.6000e+02],
                        [0.0000e+00, 3.9575e+02, 4.0728e+02, 6.2371e+02],
                        [0.0000e+00, 4.0989e+02, 4.3611e+02, 6.5765e+02],
                        [0.0000e+00, 3.8116e+02, 3.6796e+02, 6.8372e+02],
                        [3.9884e+02, 3.8749e+02, 8.6840e+02, 5.4867e+02],
                        [4.1046e+02, 4.9052e+02, 8.6503e+02, 5.8720e+02],
                        [0.0000e+00, 4.0378e+02, 4.1072e+02, 7.2026e+02],
                        [0.0000e+00, 3.9417e+02, 3.9502e+02, 6.1512e+02],
                        [0.0000e+00, 3.8116e+02, 3.6796e+02, 6.8372e+02],
                        [0.0000e+00, 3.9305e+02, 4.1092e+02, 7.4043e+02],
                        [3.4248e+02, 6.9715e+02, 8.7662e+02, 7.2740e+02],
                        [5.6609e+02, 8.6611e+02, 1.0528e+03, 9.6000e+02],
                        [0.0000e+00, 4.1365e+02, 4.3029e+02, 6.4290e+02],
                        [0.0000e+00, 3.9575e+02, 4.0728e+02, 6.2371e+02],
                        [0.0000e+00, 4.0742e+02, 4.3495e+02, 6.3147e+02],
                        [4.4883e+02, 8.5760e+02, 9.2626e+02, 9.6000e+02],
                        [0.0000e+00, 4.0445e+02, 4.1523e+02, 6.2125e+02],
                        [3.3355e+02, 7.0234e+02, 8.5813e+02, 7.3405e+02],
                        [1.2862e+02, 8.3867e+02, 6.2651e+02, 9.6000e+02],
                        [0.0000e+00, 4.2017e+02, 4.4482e+02, 6.2521e+02],
                        [4.1458e+02, 3.6223e+02, 8.6780e+02, 5.4675e+02],
                        [0.0000e+00, 0.0000e+00, 5.6674e+02, 2.2335e+02],
                        [0.0000e+00, 4.1518e+02, 4.2211e+02, 7.2022e+02],
                        [0.0000e+00, 8.4846e+02, 4.7391e+02, 9.6000e+02],
                        [1.0618e+03, 3.3472e+02, 1.2800e+03, 6.7609e+02],
                        [0.0000e+00, 4.1638e+02, 4.3290e+02, 6.9870e+02],
                        [0.0000e+00, 0.0000e+00, 5.2049e+02, 2.4524e+02],
                        [0.0000e+00, 3.8412e+02, 3.9710e+02, 6.0965e+02],
                        [0.0000e+00, 8.6693e+02, 4.3433e+02, 9.6000e+02],
                        [0.0000e+00, 3.9417e+02, 3.9502e+02, 6.1512e+02],
                        [0.0000e+00, 2.3495e+02, 3.4734e+02, 5.3374e+02],
                        [0.0000e+00, 4.0989e+02, 4.3611e+02, 6.5765e+02],
                        [2.8229e+02, 7.3323e+02, 8.4686e+02, 7.5988e+02],
                        [3.2823e+02, 0.0000e+00, 8.0437e+02, 2.8706e+02],
                        [0.0000e+00, 4.1518e+02, 4.2211e+02, 7.2022e+02],
                        [0.0000e+00, 3.8926e+02, 3.5770e+02, 6.0124e+02],
                        [8.3593e+02, 0.0000e+00, 1.2800e+03, 2.7487e+02],
                        [0.0000e+00, 8.6693e+02, 4.3433e+02, 9.6000e+02],
                        [0.0000e+00, 0.0000e+00, 5.3297e+02, 2.4709e+02],
                        [0.0000e+00, 0.0000e+00, 4.9581e+02, 1.9596e+02],
                        [2.0668e+02, 8.5584e+02, 7.0114e+02, 9.6000e+02],
                        [5.5672e+02, 8.6939e+02, 1.0421e+03, 9.6000e+02],
                        [3.8511e+02, 4.8201e+02, 8.7394e+02, 5.7323e+02],
                        [6.2638e+02, 5.2157e+02, 9.5959e+02, 6.4872e+02],
                        [8.2546e+02, 0.0000e+00, 1.2800e+03, 2.6732e+02],
                        [4.0113e+02, 5.7291e+02, 8.7452e+02, 6.3975e+02],
                        [5.7526e+02, 4.9513e+02, 9.3053e+02, 6.2974e+02],
                        [0.0000e+00, 0.0000e+00, 5.3319e+02, 2.2862e+02],
                        [8.2112e+02, 0.0000e+00, 1.2800e+03, 2.1693e+02],
                        [5.2048e+02, 8.6254e+02, 9.8947e+02, 9.6000e+02],
                        [0.0000e+00, 0.0000e+00, 5.3928e+02, 1.9326e+02],
                        [0.0000e+00, 8.6855e+02, 4.3088e+02, 9.6000e+02],
                        [0.0000e+00, 3.8412e+02, 3.9710e+02, 6.0965e+02],
                        [3.9847e+02, 0.0000e+00, 9.7181e+02, 2.6527e+02],
                        [0.0000e+00, 0.0000e+00, 4.9859e+02, 1.9115e+02],
                        [3.9252e+02, 3.6897e+02, 8.5414e+02, 5.6715e+02],
                        [3.2823e+02, 0.0000e+00, 8.0437e+02, 2.8706e+02],
                        [0.0000e+00, 0.0000e+00, 5.4149e+02, 2.5021e+02]])
            labels: tensor([0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1,
                        0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0,
                        1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0,
                        1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
                        0, 1, 0, 0])
        ) at 0x7f58f8515d90>
    ignored_instances: <InstanceData(
        
            META INFORMATION
        
            DATA FIELDS
            labels: tensor([], dtype=torch.int64)
            bboxes: tensor([], size=(0, 4))
        ) at 0x7f58f8515ee0>
    gt_instances: <InstanceData(
        
            META INFORMATION
        
            DATA FIELDS
            labels: tensor([], dtype=torch.int64)
            bboxes: tensor([], size=(0, 4))
        ) at 0x7f58f8515fa0>
) at 0x7f58f85ee220>

As you can see, the printed result contains the following information:

1. META INFORMATION (metadata)

  • batch_input_shape: (800, 1067): the input size the model actually receives

  • ori_shape: (960, 1280): the original image size (height 960 px, width 1280 px)

  • pad_shape: (800, 1067): the size after resizing and padding during preprocessing

  • img_path: absolute path of the input image

  • scale_factor: (0.83359375, 0.8333333333333334): the (width, height) scaling factors from the original size to the input size (1067/1280 and 800/960)

  • img_shape: (800, 1067): the image size actually fed into the model

2. DATA FIELDS (the core data)

  • pred_instances: the predicted instances

    • scores: tensor of confidence scores (100 predictions, sorted in descending order of confidence)

      • highest confidence: 0.96636 (about 96.6%); lowest: 0.00031865 (about 0.032%)
    • bboxes: bounding-box coordinates in XYXY format (x_min, y_min, x_max, y_max)

    • labels: class labels (100 entries, corresponding to the dataset's category IDs)

  • ignored_instances: ignored instances (empty here, meaning no ignore regions were set)

  • gt_instances: ground-truth annotations (empty at inference time; populated only during training/validation)
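
As a quick sanity check on the metadata above, scale_factor is just the per-axis ratio between the resized input and the original image, ordered (width scale, height scale):

```python
ori_h, ori_w = 960, 1280   # ori_shape is (height, width)
new_h, new_w = 800, 1067   # img_shape after resizing

scale_factor = (new_w / ori_w, new_h / ori_h)
print(scale_factor)  # (0.83359375, 0.8333333333333334)
```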

Analysis: the model returns 100 predictions (it keeps the top 100 by default), but the useful ones are concentrated at the very front: the first two predictions have confidence above 0.8, and the third drops sharply to 0.05. Everything after that is low-quality noise (confidence below 0.01). Filtering with a confidence threshold makes the result much "cleaner":

from mmdet.apis import init_detector, inference_detector

config_file = '/home/hary/ctc/mmdetection/work_dirs/deformable-detr_r50_16xb2-50e_coco/20250527_094540/deformable-detr_r50_16xb2-50e_coco.py'
checkpoint_file = '/home/hary/ctc/mmdetection/work_dirs/deformable-detr_r50_16xb2-50e_coco/20250527_094540/epoch_50.pth'
model = init_detector(config_file, checkpoint_file, device='cpu')  # or device='cuda:0'
img_path = '/home/hary/ctc/mmdetection/Dataset_depth_COCO/val2017/1112_9-rgb.png'
result = inference_detector(model, img_path)
print('---------------------------------')
# A small change: keep only predictions with confidence greater than 0.3
valid_idx = result.pred_instances.scores > 0.3
filtered_bboxes = result.pred_instances.bboxes[valid_idx]
filtered_scores = result.pred_instances.scores[valid_idx]
filtered_labels = result.pred_instances.labels[valid_idx]
print(filtered_bboxes)
print(filtered_scores)
print(filtered_labels)

The printed result now excludes the mass of invalid low-confidence predictions:

---------------------------------
tensor([[3.8038e+02, 3.3829e+02, 8.7101e+02, 5.1814e+02],
        [8.0331e-01, 3.3403e+02, 2.0168e+02, 7.1068e+02]])
tensor([0.9664, 0.8628])
tensor([0, 1])
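
The surviving boxes are in XYXY pixel coordinates, while COCO ground truth uses [x, y, width, height]. A small conversion helper (our own, not part of MMDetection) makes them directly comparable:

```python
def xyxy_to_coco(box):
    """[x_min, y_min, x_max, y_max] -> COCO-style [x, y, width, height]."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]

# The two high-confidence predictions printed above, rounded to 2 decimals:
for pred in ([380.38, 338.29, 871.01, 518.14], [0.80, 334.03, 201.68, 710.68]):
    print(xyxy_to_coco(pred))
# Ground truth from the annotation file (rounded): roughly
#   [387.0, 335.0, 478.0, 181.0] and [1.0, 335.0, 194.0, 374.0]
```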

Next, let's print the ground-truth labels stored in the annotation JSON file:

from pycocotools.coco import COCO

# Annotation file path and image directory
ann_file = 'Dataset_depth_COCO/annotations/instances_val2017.json'  # replace with your annotation file path
image_dir = 'Dataset_depth_COCO/val2017'  # replace with your image directory

# Initialize the COCO API
coco = COCO(ann_file)

# Specify the image file name (or ID) to look up
target_image_name = '1112_9-rgb.png'  # example image file name
# target_image_id = 391895  # use this directly if the image ID is already known

# Look up the image ID by file name
target_image_id = None
for img_id in coco.imgs:
    img_info = coco.loadImgs(img_id)[0]
    if img_info['file_name'] == target_image_name:
        target_image_id = img_id
        break

if target_image_id is None:
    print(f"Image {target_image_name} not found!")
    exit()

# Fetch all annotations for this image
ann_ids = coco.getAnnIds(imgIds=target_image_id)
annotations = coco.loadAnns(ann_ids)

# Print the annotation info
print(f"Labels for image {target_image_name} (ID: {target_image_id}):")
for ann in annotations:
    category_info = coco.loadCats(ann['category_id'])[0]
    print(f"""
    - Category: {category_info['name']} (ID: {ann['category_id']})
    - Bounding box: {ann['bbox']}  [x, y, width, height]
    - Area: {ann['area']}
    - Segmentation mask: {'present' if 'segmentation' in ann else 'none'}
    """)

Output:

loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
Labels for image 1112_9-rgb.png (ID: 683):

    - Category: shallow_box (ID: 0)
    - Bounding box: [387.00096, 334.99968, 477.9993600000001, 181.00032]  [x, y, width, height]
    - Area: 86518.0371197952
    - Segmentation mask: present
    

    - Category: shallow_half_box (ID: 1)
    - Bounding box: [1.0009600000000063, 335.00015999999994, 193.99936000000002, 373.99968]  [x, y, width, height]
    - Area: 72555.69856020481
    - Segmentation mask: present
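
pycocotools builds its index from plain JSON, so the same lookup can be sketched with nothing but ordinary dictionaries. Below is a toy COCO-style annotation dict (IDs and boxes rounded from the output above) showing the structure the script relies on:

```python
# Toy COCO-style annotation structure (boxes rounded for readability):
coco_dict = {
    "images": [{"id": 683, "file_name": "1112_9-rgb.png", "width": 1280, "height": 960}],
    "annotations": [
        {"id": 1, "image_id": 683, "category_id": 0, "bbox": [387.0, 335.0, 478.0, 181.0]},
        {"id": 2, "image_id": 683, "category_id": 1, "bbox": [1.0, 335.0, 194.0, 374.0]},
    ],
    "categories": [{"id": 0, "name": "shallow_box"}, {"id": 1, "name": "shallow_half_box"}],
}

# Find the image ID by file name, then collect that image's annotations:
name_to_id = {img["file_name"]: img["id"] for img in coco_dict["images"]}
img_id = name_to_id["1112_9-rgb.png"]
anns = [a for a in coco_dict["annotations"] if a["image_id"] == img_id]
cats = {c["id"]: c["name"] for c in coco_dict["categories"]}
for a in anns:
    print(cats[a["category_id"]], a["bbox"])
```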

As you can see, the model's predictions match the ground truth quite closely.

**Note:** the (x, y, w, h) in COCO format and the (x, y, w, h) in YOLO format mean fundamentally different things. In COCO, (x, y) is the top-left corner of the box and the values are absolute pixels (not normalized); in YOLO, (x, y) is the box center and all four values are normalized by the image width and height:
Left: YOLO format; right: COCO format

Here are several annotation formats side by side; compare them for yourself:

This image's labels in COCO format:

class_id   x_coco   y_coco        w        h
       0 387.0009 334.9997 477.9994 181.0003
       1   1.0009 335.0001 193.9993 373.9996

This image's labels in XYXY format:

class_id    x_min    y_min    x_max    y_max
       0 387.0009 334.9997 865.0003 516.0000
       1   1.0009 335.0001 195.0003 708.9998

This image's labels in YOLO format (image height x width: 960 x 1280):

class_id   x_yolo   y_yolo        w        h
       0 0.489063 0.443229 0.373437 0.188542
       1 0.076563 0.543750 0.151562 0.389583
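
The three tables above are related by a pair of simple conversions. A minimal sketch (the helper names are our own) that reproduces the XYXY and YOLO rows from the COCO row:

```python
def coco_to_xyxy(box):
    """COCO [x, y, w, h] -> [x_min, y_min, x_max, y_max]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]

def coco_to_yolo(box, img_w, img_h):
    """COCO [x, y, w, h] -> normalized YOLO [x_center, y_center, w, h]."""
    x, y, w, h = box
    return [(x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]

box = [387.00096, 334.99968, 477.99936, 181.00032]  # class 0, COCO format
print([round(v, 4) for v in coco_to_xyxy(box)])
# [387.001, 334.9997, 865.0003, 516.0]
print([round(v, 6) for v in coco_to_yolo(box, 1280, 960)])
# [0.489063, 0.443229, 0.373437, 0.188542]
```

Both rounded print-outs match the XYXY and YOLO rows of the tables above.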