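- DIoU overview
DIoU (Distance-IoU) is an improved variant of IoU proposed by Zhaohui Zheng et al. in 2019. On top of IoU it adds a penalty on the normalized distance between the centers of the two bounding boxes, so it measures how similar two boxes are more informatively than IoU alone, in particular when the boxes do not overlap.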
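- Motivation for DIoU
GIoU fixes the vanishing gradient of IoU for non-overlapping boxes, but it still has limitations:
Slow convergence: for non-overlapping boxes GIoU tends to first enlarge the predicted box until it overlaps the target, and only then improves the overlap, which takes many iterations.
Inaccurate regression: when one box contains the other, the GIoU penalty vanishes and GIoU degenerates to IoU.
No explicit direction: GIoU does not directly drive the predicted box toward the center of the target.
DIoU addresses these issues by adding a center-distance penalty term.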
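- Mathematical definition of DIoU
3.1 Basic formula
DIoU = IoU - ρ²(b, b^gt) / c²
where:
IoU: the standard intersection over union of the predicted box and the ground-truth box
ρ²(b, b^gt): the squared Euclidean distance between b and b^gt, the center points of the predicted box and the ground-truth box
c²: the squared diagonal length of the smallest axis-aligned box that encloses both boxes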
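3.2 Expanded form
DIoU = IoU - d² / c²
where:
d = √[(b_x - b^gt_x)² + (b_y - b^gt_y)²]   (Euclidean distance between the two centers)
c = √[(c_x2 - c_x1)² + (c_y2 - c_y1)²]   (diagonal of the enclosing box, whose top-left corner is (c_x1, c_y1) and bottom-right corner is (c_x2, c_y2))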
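To make the expanded formula concrete, here is a small worked example. It is only a sketch: the helper name `diou_terms` and the two sample boxes are chosen for illustration and are not part of the original text. Boxes are assumed to be given as [x1, y1, x2, y2] with x1 < x2 and y1 < y2.

```python
def diou_terms(box_a, box_b):
    """Return (iou, d2, c2) for two [x1, y1, x2, y2] boxes (hypothetical helper)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection and union areas
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # d^2: squared distance between the two centers
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
        + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # c^2: squared diagonal of the smallest enclosing box
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
        + (max(ay2, by2) - min(ay1, by1)) ** 2
    return inter / union, d2, c2


# Two 2x2 boxes offset by (1, 1)
iou, d2, c2 = diou_terms([0, 0, 2, 2], [1, 1, 3, 3])
diou = iou - d2 / c2
print(iou, d2, c2, diou)
```

For these boxes the intersection is 1 and the union is 7, so IoU = 1/7; the centers are (1, 1) and (2, 2), so d² = 2; the enclosing box is 3 by 3, so c² = 18. The result is DIoU = 1/7 - 1/9 = 2/63, roughly 0.032.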
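- Properties of DIoU
4.1 Range
Best case: when the two boxes coincide exactly, IoU = 1 and d = 0, so DIoU = 1
Worst case: as the two boxes move infinitely far apart, IoU = 0 and d²/c² → 1, so DIoU → -1
Range: DIoU ∈ (-1, 1]; the lower bound is approached but never reached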
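The limiting behaviour at the lower end of the range can be checked numerically. The sketch below uses a closed form for one specific setup, chosen here for illustration: a unit square and a copy of it shifted by t ≥ 1 along the x axis. The function name is hypothetical.

```python
def diou_shifted_unit_boxes(t):
    """DIoU of the unit square [0, 0, 1, 1] and a copy shifted by t >= 1 along x.

    For t >= 1 the boxes do not overlap, so IoU = 0 and DIoU = -d^2 / c^2.
    """
    d2 = t ** 2                  # centers are (0.5, 0.5) and (t + 0.5, 0.5)
    c2 = (t + 1) ** 2 + 1 ** 2   # enclosing box is (t + 1) wide and 1 tall
    return 0.0 - d2 / c2


shifts = (1, 10, 1000)
values = [diou_shifted_unit_boxes(t) for t in shifts]
for t, v in zip(shifts, values):
    print(t, v)
```

The values decrease toward -1 as t grows (for t = 1 the result is -0.2) but stay strictly above -1, because the center distance is always shorter than the diagonal of the enclosing box.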
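4.2 Proofs of basic properties
Theorem 1: DIoU ≤ IoU
Proof: since d²/c² ≥ 0, DIoU = IoU - d²/c² ≤ IoU
Theorem 2: DIoU = IoU if and only if the two centers coincide
Proof: if d = 0, then DIoU = IoU - 0 = IoU. Conversely, if d > 0 then d²/c² > 0 (c is finite and positive), so DIoU < IoU
Theorem 3: DIoU is symmetric
Proof: IoU, d and c are each unchanged when the two boxes are swapped, so DIoU(A, B) = DIoU(B, A)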
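The three properties can also be spot-checked on random boxes. This is only a sanity check under assumptions made here (the helper names `iou_and_diou` and `random_box`, the sampling ranges, and the fixed seed are illustrative); it does not replace the proofs.

```python
import random


def iou_and_diou(a, b):
    """Return (IoU, DIoU) for two [x1, y1, x2, y2] boxes with x1 < x2, y1 < y2."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou, iou - d2 / c2


def random_box():
    """Sample a box with positive width and height (illustrative ranges)."""
    x1, y1 = random.uniform(0, 10), random.uniform(0, 10)
    return [x1, y1, x1 + random.uniform(0.1, 5), y1 + random.uniform(0.1, 5)]


random.seed(0)
checked = 0
for _ in range(1000):
    a, b = random_box(), random_box()
    iou_ab, diou_ab = iou_and_diou(a, b)
    _, diou_ba = iou_and_diou(b, a)
    assert diou_ab <= iou_ab                 # Theorem 1
    assert abs(diou_ab - diou_ba) < 1e-12    # Theorem 3
    checked += 1

# Theorem 2: two boxes sharing the center (2, 1) have DIoU equal to IoU
same_center_iou, same_center_diou = iou_and_diou([0, 0, 4, 2], [1, 0.5, 3, 1.5])
print(checked, same_center_iou, same_center_diou)
```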
- DIoU Loss
5.1 Loss definition
L_DIoU = 1 - DIoU
       = 1 - IoU + ρ²(b, b^gt) / c²
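5.2 Properties of the loss
Non-negativity: since DIoU ≤ 1, L_DIoU ≥ 0; together with DIoU > -1 this gives L_DIoU ∈ [0, 2)
Differentiability: the loss is built from min, max and clamp operations, so it is differentiable almost everywhere rather than everywhere; unlike the IoU loss, it still provides a nonzero gradient when the boxes do not overlap
Scale invariance: scaling both boxes by the same factor leaves the loss unchanged
Direct optimization of center distance: the penalty term explicitly minimizes the offset between the centers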
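Two of these properties are easy to see on concrete numbers. The sketch below (helper name and sample boxes are assumptions made for illustration) compares the plain IoU loss with the DIoU loss for two non-overlapping predictions at different distances, and then rescales one pair by a factor of 10.

```python
def iou_and_diou_loss(a, b):
    """Return (1 - IoU, 1 - DIoU) for two [x1, y1, x2, y2] boxes with x1 < x2, y1 < y2."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union
    d2 = ((a[0] + a[2] - b[0] - b[2]) / 2) ** 2 \
        + ((a[1] + a[3] - b[1] - b[3]) / 2) ** 2
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return 1 - iou, 1 - iou + d2 / c2


target = [0, 0, 2, 2]
near = [3, 0, 5, 2]   # does not overlap the target
far = [6, 0, 8, 2]    # does not overlap the target, and is further away

iou_near, diou_near = iou_and_diou_loss(near, target)
iou_far, diou_far = iou_and_diou_loss(far, target)
print(iou_near, iou_far)     # the IoU loss cannot tell the two cases apart
print(diou_near, diou_far)   # the DIoU loss grows with the center distance

# Scale invariance: multiply every coordinate by 10
scaled = iou_and_diou_loss([10 * v for v in near], [10 * v for v in target])
print(scaled[1])
```

The IoU loss is 1 for both predictions, so it carries no signal about which one is closer, while the DIoU loss is 1 + 9/29 for the near box and 1 + 36/68 for the far one. The rescaled pair gives the same DIoU loss as the original pair.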
- Code implementations
6.1 Plain Python implementation
python
def calculate_diou(box1, box2, eps=1e-7):
    """
    Compute DIoU between two boxes.
    box1, box2: [x1, y1, x2, y2] format
    """
    # Normalize the corner order so that x1 <= x2 and y1 <= y2
    box1 = [min(box1[0], box1[2]), min(box1[1], box1[3]),
            max(box1[0], box1[2]), max(box1[1], box1[3])]
    box2 = [min(box2[0], box2[2]), min(box2[1], box2[3]),
            max(box2[0], box2[2]), max(box2[1], box2[3])]

    # IoU
    x1_inter = max(box1[0], box2[0])
    y1_inter = max(box1[1], box2[1])
    x2_inter = min(box1[2], box2[2])
    y2_inter = min(box1[3], box2[3])
    width_inter = max(0, x2_inter - x1_inter)
    height_inter = max(0, y2_inter - y1_inter)
    area_inter = width_inter * height_inter
    area_box1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area_box2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    area_union = area_box1 + area_box2 - area_inter
    iou = area_inter / (area_union + eps)  # eps guards against division by zero

    # Squared distance between the centers
    center1_x = (box1[0] + box1[2]) / 2
    center1_y = (box1[1] + box1[3]) / 2
    center2_x = (box2[0] + box2[2]) / 2
    center2_y = (box2[1] + box2[3]) / 2
    d_squared = (center1_x - center2_x) ** 2 + (center1_y - center2_y) ** 2

    # Squared diagonal of the smallest enclosing box
    c_x1 = min(box1[0], box2[0])
    c_y1 = min(box1[1], box2[1])
    c_x2 = max(box1[2], box2[2])
    c_y2 = max(box1[3], box2[3])
    c_squared = (c_x2 - c_x1) ** 2 + (c_y2 - c_y1) ** 2

    # DIoU
    diou = iou - d_squared / (c_squared + eps)
    return diou


def diou_loss(box1, box2):
    """Compute the DIoU loss."""
    diou = calculate_diou(box1, box2)
    return 1 - diou
6.2 PyTorch implementation
python
import torch
import torch.nn as nn


class DIoULoss(nn.Module):
    """DIoU loss."""

    def __init__(self, reduction='mean'):
        super().__init__()
        self.reduction = reduction

    def forward(self, pred, target, eps=1e-7):
        """
        pred, target: [N, 4] tensors in [x1, y1, x2, y2] format
        """
        # Normalize the corner order so that x1 <= x2 and y1 <= y2
        pred_x1 = torch.min(pred[:, 0], pred[:, 2])
        pred_y1 = torch.min(pred[:, 1], pred[:, 3])
        pred_x2 = torch.max(pred[:, 0], pred[:, 2])
        pred_y2 = torch.max(pred[:, 1], pred[:, 3])
        target_x1 = torch.min(target[:, 0], target[:, 2])
        target_y1 = torch.min(target[:, 1], target[:, 3])
        target_x2 = torch.max(target[:, 0], target[:, 2])
        target_y2 = torch.max(target[:, 1], target[:, 3])

        # Intersection
        inter_x1 = torch.max(pred_x1, target_x1)
        inter_y1 = torch.max(pred_y1, target_y1)
        inter_x2 = torch.min(pred_x2, target_x2)
        inter_y2 = torch.min(pred_y2, target_y2)
        inter_area = torch.clamp(inter_x2 - inter_x1, min=0) * \
            torch.clamp(inter_y2 - inter_y1, min=0)

        # Area of each box
        pred_area = (pred_x2 - pred_x1) * (pred_y2 - pred_y1)
        target_area = (target_x2 - target_x1) * (target_y2 - target_y1)

        # Union and IoU
        union_area = pred_area + target_area - inter_area
        iou = inter_area / (union_area + eps)

        # Squared distance between the centers
        pred_center_x = (pred_x1 + pred_x2) / 2
        pred_center_y = (pred_y1 + pred_y2) / 2
        target_center_x = (target_x1 + target_x2) / 2
        target_center_y = (target_y1 + target_y2) / 2
        center_distance2 = (pred_center_x - target_center_x) ** 2 + \
            (pred_center_y - target_center_y) ** 2

        # Squared diagonal of the smallest enclosing box
        c_x1 = torch.min(pred_x1, target_x1)
        c_y1 = torch.min(pred_y1, target_y1)
        c_x2 = torch.max(pred_x2, target_x2)
        c_y2 = torch.max(pred_y2, target_y2)
        c_diagonal2 = (c_x2 - c_x1) ** 2 + (c_y2 - c_y1) ** 2

        # DIoU
        diou = iou - center_distance2 / (c_diagonal2 + eps)

        # Loss, reduced according to self.reduction
        loss = 1 - diou
        if self.reduction == 'mean':
            return loss.mean()
        elif self.reduction == 'sum':
            return loss.sum()
        else:
            # Any other value returns the per-box loss of shape [N]
            return loss
6.3 Vectorized batch version
python
import torch


def batch_diou_loss(pred_boxes, target_boxes, eps=1e-7):
    """
    Vectorized DIoU loss over a batch.
    pred_boxes:   [B, N, 4] or [B, 4], as [x1, y1, x2, y2] with x1 <= x2 and y1 <= y2
    target_boxes: [B, N, 4] or [B, 4], same format
    Returns the unreduced loss with shape [B, N] or [B].
    """
    # IoU
    lt = torch.max(pred_boxes[..., :2], target_boxes[..., :2])  # [B, N, 2]
    rb = torch.min(pred_boxes[..., 2:], target_boxes[..., 2:])  # [B, N, 2]
    wh = (rb - lt).clamp(min=0)                                 # [B, N, 2]
    inter = wh[..., 0] * wh[..., 1]                             # [B, N]
    area_pred = (pred_boxes[..., 2] - pred_boxes[..., 0]) * \
        (pred_boxes[..., 3] - pred_boxes[..., 1])
    area_target = (target_boxes[..., 2] - target_boxes[..., 0]) * \
        (target_boxes[..., 3] - target_boxes[..., 1])
    union = area_pred + area_target - inter
    iou = inter / (union + eps)

    # Squared distance between the centers
    center_pred = (pred_boxes[..., :2] + pred_boxes[..., 2:]) / 2
    center_target = (target_boxes[..., :2] + target_boxes[..., 2:]) / 2
    center_distance2 = torch.sum((center_pred - center_target) ** 2, dim=-1)

    # Squared diagonal of the smallest enclosing box
    enclose_lt = torch.min(pred_boxes[..., :2], target_boxes[..., :2])
    enclose_rb = torch.max(pred_boxes[..., 2:], target_boxes[..., 2:])
    enclose_wh = (enclose_rb - enclose_lt).clamp(min=0)
    c_diagonal2 = torch.sum(enclose_wh ** 2, dim=-1)

    # DIoU and loss
    diou = iou - center_distance2 / (c_diagonal2 + eps)
    return 1 - diou
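- DIoU versus GIoU
7.1 Geometric interpretation
| Metric | Penalty term | Geometric meaning | Convergence direction |
| --- | --- | --- | --- |
| GIoU | (C - U) / C | Fraction of the smallest enclosing box that is not covered by the union | Toward increasing overlap |
| DIoU | d² / c² | Squared center distance relative to the squared diagonal of the enclosing box | Directly toward the target center |
Here C is the area of the smallest enclosing box and U is the area of the union of the two boxes.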
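The two penalty terms can be computed side by side for the same pair of boxes. The following is a minimal sketch; the function name `penalties` and the sample boxes are illustrative assumptions, and boxes are taken as [x1, y1, x2, y2] with x1 < x2 and y1 < y2.

```python
def penalties(a, b):
    """Return (giou_penalty, diou_penalty) for two [x1, y1, x2, y2] boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Smallest enclosing box
    ex1, ey1 = min(a[0], b[0]), min(a[1], b[1])
    ex2, ey2 = max(a[2], b[2]), max(a[3], b[3])
    enclose_area = (ex2 - ex1) * (ey2 - ey1)
    # GIoU penalty: share of the enclosing box not covered by the union
    giou_penalty = (enclose_area - union) / enclose_area
    # DIoU penalty: squared center distance over squared enclosing diagonal
    d2 = ((a[0] + a[2] - b[0] - b[2]) / 2) ** 2 \
        + ((a[1] + a[3] - b[1] - b[3]) / 2) ** 2
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return giou_penalty, d2 / c2


side_by_side = penalties([0, 0, 2, 2], [3, 0, 5, 2])
diagonal = penalties([0, 0, 2, 2], [3, 3, 5, 5])
print(side_by_side)
print(diagonal)
```

For the horizontally separated pair the GIoU penalty is 0.2 and the DIoU penalty is 9/29; for the diagonally separated pair they are 0.68 and 0.36. The GIoU term depends on empty area inside the enclosing box, while the DIoU term depends only on where the centers are.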
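7.2 Convergence speed
In object detection training, the DIoU loss typically converges faster than the GIoU loss:
python
# Schematic convergence curves (illustrative values only, not measured results)
epochs = list(range(1, 21))
giou_loss = [0.8, 0.6, 0.45, 0.35, 0.28, 0.23, 0.19, 0.16, 0.14, 0.12,
             0.11, 0.10, 0.09, 0.085, 0.08, 0.075, 0.07, 0.068, 0.065, 0.063]
diou_loss = [0.8, 0.5, 0.3, 0.18, 0.12, 0.08, 0.06, 0.05, 0.045, 0.04,
             0.038, 0.036, 0.035, 0.034, 0.033, 0.032, 0.031, 0.030, 0.030, 0.029]
# In this sketch the DIoU curve drops faster during the first few epochs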
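7.3 Containment cases
When the predicted box contains the ground-truth box, or the ground-truth box contains the predicted box:
GIoU: the enclosing box equals the larger box, so the penalty term is 0 and GIoU degenerates to IoU
DIoU: the penalty term stays positive as long as the centers differ, so the center position keeps being optimized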
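The containment case can be reproduced with a few lines of code. In the sketch below (function name and boxes are chosen here for illustration), a 2x2 predicted box is placed at three positions inside a 10x10 target, and IoU, GIoU and DIoU are computed for each position.

```python
def iou_giou_diou(a, b):
    """Return (IoU, GIoU, DIoU) for two [x1, y1, x2, y2] boxes with x1 < x2, y1 < y2."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union
    # Smallest enclosing box
    ex1, ey1 = min(a[0], b[0]), min(a[1], b[1])
    ex2, ey2 = max(a[2], b[2]), max(a[3], b[3])
    enclose_area = (ex2 - ex1) * (ey2 - ey1)
    giou = iou - (enclose_area - union) / enclose_area
    d2 = ((a[0] + a[2] - b[0] - b[2]) / 2) ** 2 \
        + ((a[1] + a[3] - b[1] - b[3]) / 2) ** 2
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    diou = iou - d2 / c2
    return iou, giou, diou


target = [0, 0, 10, 10]
# Same 2x2 prediction, moved from the center toward a corner of the target
preds = ([4, 4, 6, 6], [1, 1, 3, 3], [0, 0, 2, 2])
results = [iou_giou_diou(p, target) for p in preds]
for p, r in zip(preds, results):
    print(p, r)
```

IoU and GIoU are 0.04 at all three positions, so neither gives any signal about where the prediction sits inside the target. DIoU is 0.04 when the centers coincide and drops to -0.05 and -0.12 as the prediction moves toward the corner.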
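- Summary
Main advantages of DIoU:
Fast convergence: the center distance is minimized directly, so training typically needs fewer iterations than with GIoU
Accurate localization: the loss stays sensitive to center offsets even in cases where IoU and GIoU are flat
Good generalization: the original paper reports gains on both PASCAL VOC and MS COCO
Simple implementation: only a center distance and an enclosing-box diagonal are added on top of IoU, so the extra cost is small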