PiscCode: Binding YOLO Person Segmentation to PPE Detection, an Engineering-Grade Approach to Safety Compliance

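In real industrial settings (construction sites, factories, mines), merely detecting a "helmet" or a "reflective vest" is far from enough.

The question that actually matters is:

👉 Is this helmet actually on this person's head?

This article presents an engineering-grade, deployable scheme for PPE binding and compliance checking. The core ideas are:

Use a YOLO segmentation model to capture each person's pixel-level region
Use a YOLO detection model to recognize PPE (helmets, reflective vests, etc.)
Apply a three-stage binding strategy to reliably attach each PPE item to the right person
Output a visual compliant / non-compliant result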

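I. Overall Design

1️⃣ Split model responsibilities (critical)

Module	Model type	Role
Person	YOLO Segmentation	Person segmentation (mask + bbox)
PPE	YOLO Detection	Equipment detection (bbox only)

👉 People must go through a segmentation model

👉 Equipment only needs a detection model

Masks give the binding step pixel-level geometry for each person, while plain detection keeps the PPE branch cheap. For industrial scenes this is a good balance between stability and performance.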

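2️⃣ Why isn't bbox IoU alone enough?

In real video you will run into situations like these:

Several people standing close together or overlapping
PPE sitting at the edge of a person box
Detection boxes that are slightly offset
Segmentation masks that do not line up exactly with the bbox

If you rely only on the IoU between the PPE bbox and the person bbox:

❌ Binding errors are very common

❌ Results are unstable in multi-person scenes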
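To make the ambiguity concrete, here is a minimal sketch with made-up coordinates. The `bbox_iou` helper and the three boxes are illustrative assumptions, not part of the article's class. Two people stand side by side with overlapping boxes, and the helmet box lies inside both of them, so box-level IoU cannot tell the two candidates apart.

```python
def bbox_iou(a, b):
    """Plain IoU between two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical scene: two person boxes that overlap horizontally.
person_a = (0, 0, 100, 200)
person_b = (60, 0, 160, 200)
# The helmet box sits in the overlap, so it is fully inside both person boxes.
helmet = (62, 5, 92, 30)

iou_a = bbox_iou(helmet, person_a)
iou_b = bbox_iou(helmet, person_b)
print(iou_a, iou_b)  # identical scores: the box-level signal is a tie
```

With a tie like this, whichever person happens to be checked first wins, which is exactly the instability described above.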

II. Core Binding Strategy (the key part)

This scheme uses a three-stage, priority-ordered binding rule:

🥇 Rule 1: BBox-Mask IoU (highest priority)

    def _bbox_mask_iou(self, box, mask):
        ...
        inter = np.sum(region > 0)
        area = (x2 - x1) * (y2 - y1)
        return inter / (area + 1e-6)

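Explanation:

Use the PPE detection box to crop the person segmentation mask
Count how many pixels inside the PPE box fall within the person mask
The result is a geometric ownership score. Strictly speaking it is the fraction of the PPE box covered by the mask (intersection over box area), not a symmetric IoU

✅ Very robust to occlusion, offset boxes, and PPE at the edge of a person

✅ The most reliable basis for binding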
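The score can be tried out on a synthetic mask. The sketch below is a standalone variant of the same idea; the function name `box_mask_coverage` and the toy mask are assumptions made for illustration, and unlike the article's method it omits the `1e-6` epsilon because the empty-box case returns early.

```python
import numpy as np


def box_mask_coverage(box, mask):
    """Fraction of the (clamped) box area that lies on the person mask."""
    x1, y1, x2, y2 = box
    h, w = mask.shape
    # Clamp the box to the mask bounds so slicing never goes out of range.
    x1, x2 = max(0, min(w, x1)), max(0, min(w, x2))
    y1, y2 = max(0, min(h, y1)), max(0, min(h, y2))
    if x2 <= x1 or y2 <= y1:
        return 0.0
    inter = int(np.count_nonzero(mask[y1:y2, x1:x2] > 0))
    return inter / float((x2 - x1) * (y2 - y1))


# Toy person mask: a filled rectangle, rows 10..89 and columns 20..49.
mask = np.zeros((100, 100), dtype=np.uint8)
mask[10:90, 20:50] = 1

half_on = box_mask_coverage((40, 10, 60, 30), mask)    # left half of the box is on the mask
clipped = box_mask_coverage((-10, 80, 30, 120), mask)  # box partly outside the image
outside = box_mask_coverage((200, 200, 220, 220), mask)
print(half_on, clipped, outside)
```

Note that the denominator is the box area after clamping, so a box that hangs off the image edge is scored only on its visible part.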

🥈 Rule 2: Multi-point sampling inside the mask (fault tolerance)

    def _sample_points_in_mask(self, box, mask):
        xs = [x1, mid_x, x2-1]
        ys = [y1, mid_y, y2-1]

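Explanation:

Sample a 3×3 grid of 9 points inside the PPE box
If even one point falls inside the person mask, the PPE is considered a possible match for that person

✅ Covers cases where the IoU is tiny but the item really is being worn

✅ Especially effective for small objects (helmets)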
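A small standalone sketch shows the kind of case this rule rescues. The function name `any_sample_in_mask` and the toy geometry are illustrative assumptions; the sampling grid mirrors the `xs` / `ys` lists above.

```python
import numpy as np


def any_sample_in_mask(box, mask):
    """True if any of the 3x3 grid points of the box lands on the mask."""
    x1, y1, x2, y2 = box
    h, w = mask.shape
    xs = (x1, (x1 + x2) // 2, x2 - 1)
    ys = (y1, (y1 + y2) // 2, y2 - 1)
    return any(
        0 <= px < w and 0 <= py < h and mask[py, px] > 0
        for px in xs
        for py in ys
    )


# Toy person mask: rows 20..79, columns 30..59.
mask = np.zeros((100, 100), dtype=np.uint8)
mask[20:80, 30:60] = 1

# This box touches the mask with a single corner pixel (1 of 840 pixels),
# so its coverage is far below 0.01, yet one grid point still hits the mask.
grazing_box = (59, 0, 99, 21)
# This box does not touch the mask at all.
far_box = (70, 0, 90, 15)

grazing_hit = any_sample_in_mask(grazing_box, mask)
far_hit = any_sample_in_mask(far_box, mask)
print(grazing_hit, far_hit)
```

Because only nine points are checked, the test is cheap, but it is also coarse: a hit says "possibly this person", not "certainly this person".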

🥉 Rule 3: Center point inside the person box (last resort)

    def _center_in_bbox(self, box, person_box):
        return px1 <= cx <= px2 and py1 <= cy <= py2

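Explanation:

Check whether the center of the PPE box lies inside the person bbox
Used only as a final fallback, never as the main criterion

⚠️ Lowest accuracy of the three rules

⚠️ But it avoids losing the binding entirely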
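The low accuracy is easy to reproduce. In the sketch below (the function name and coordinates are illustrative assumptions), two person boxes overlap and the PPE center lies in the shared area, so the test passes for both people and cannot decide between them.

```python
def center_in_bbox(box, person_box):
    """True if the integer center of `box` lies inside `person_box`."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    px1, py1, px2, py2 = person_box
    return px1 <= cx <= px2 and py1 <= cy <= py2


ppe_box = (40, 10, 60, 30)      # center at (50, 20)
person_a = (0, 0, 100, 200)
person_b = (40, 0, 140, 200)    # overlaps person_a between x=40 and x=100
person_c = (150, 0, 250, 200)   # far to the right

in_a = center_in_bbox(ppe_box, person_a)
in_b = center_in_bbox(ppe_box, person_b)
in_c = center_in_bbox(ppe_box, person_c)
print(in_a, in_b, in_c)
```

Since the rule accepts both `person_a` and `person_b` here, the first match in iteration order wins, which is why it only makes sense as a fallback after the mask-based rules have found nothing.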

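III. Full Processing Pipeline

Step 1: Person segmentation (Seg)

seg_res = self.seg_model.track(...)

Keep only detections with PERSON_CLASS_ID = 0
For each person, store:
- the segmentation mask
- the bbox
- the set of equipments already bound to that person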

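Step 2: PPE detection (Det)

det_res = self.det_model.track(...)

Iterate over all PPE detection boxes
For each PPE item:
1. Try the mask IoU
2. Try mask sampling
3. Try the bbox center point
Pick the best-matching person and bind the item to them

best_person["equipments"].add(cls_name)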
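The three-stage cascade can be exercised without any model by feeding it synthetic masks. The following is a minimal sketch under stated assumptions: the free functions, the `make_person` helper, the `min_cover=0.01` default, and the toy scene are all illustrative, though the control flow follows the order described above (coverage first, sampling when the best coverage is below the threshold, center point only when nothing matched).

```python
import numpy as np


def coverage(box, mask):
    """Rule 1: fraction of the clamped box area that lies on the mask."""
    x1, y1, x2, y2 = box
    h, w = mask.shape
    x1, x2 = max(0, min(w, x1)), max(0, min(w, x2))
    y1, y2 = max(0, min(h, y1)), max(0, min(h, y2))
    if x2 <= x1 or y2 <= y1:
        return 0.0
    return float(np.count_nonzero(mask[y1:y2, x1:x2] > 0)) / ((x2 - x1) * (y2 - y1))


def sample_hit(box, mask):
    """Rule 2: does any point of a 3x3 grid inside the box land on the mask?"""
    x1, y1, x2, y2 = box
    h, w = mask.shape
    xs = (x1, (x1 + x2) // 2, x2 - 1)
    ys = (y1, (y1 + y2) // 2, y2 - 1)
    return any(0 <= px < w and 0 <= py < h and mask[py, px] > 0
               for px in xs for py in ys)


def center_in_bbox(box, person_box):
    """Rule 3: is the box center inside the person bbox?"""
    cx, cy = (box[0] + box[2]) // 2, (box[1] + box[3]) // 2
    px1, py1, px2, py2 = person_box
    return px1 <= cx <= px2 and py1 <= cy <= py2


def bind_ppe(det_box, persons, min_cover=0.01):
    """Return the index of the person the PPE box binds to, or None."""
    best_idx, best_score = None, 0.0
    for i, p in enumerate(persons):
        score = coverage(det_box, p["mask"])
        if score > best_score:
            best_idx, best_score = i, score
    if best_score < min_cover:          # coverage too weak: try sampling
        for i, p in enumerate(persons):
            if sample_hit(det_box, p["mask"]):
                best_idx = i
                break
    if best_idx is None:                # still nothing: fall back to center
        for i, p in enumerate(persons):
            if center_in_bbox(det_box, p["box"]):
                best_idx = i
                break
    return best_idx


def make_person(mask_cols, box, shape=(200, 200)):
    """Hypothetical helper: a rectangular mask spanning the box rows."""
    mask = np.zeros(shape, dtype=np.uint8)
    mask[box[1]:box[3], mask_cols[0]:mask_cols[1]] = 1
    return {"mask": mask, "box": box, "equipments": set()}


persons = [
    make_person((30, 70), (20, 20, 80, 180)),     # mask narrower than bbox
    make_person((100, 160), (100, 20, 160, 180)),
]
helmet_on_p1 = bind_ppe((105, 15, 135, 40), persons)  # bound by rule 1
edge_item_p0 = bind_ppe((20, 100, 28, 110), persons)  # bound by rule 3 only
stray_item = bind_ppe((170, 0, 190, 10), persons)     # bound to nobody
print(helmet_on_p1, edge_item_p0, stray_item)
```

Returning an index rather than mutating the person keeps the sketch easy to test; the caller can then add the class name to `persons[idx]["equipments"]` as the article's code does.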

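Step 3: Compliance decision

REQUIRED_EQUIPMENT = {"helmet", "vest"}

✅ Compliant person

Has every required item
Green bbox

cv2.rectangle(output, (x1,y1),(x2,y2),(0,255,0),2)

❌ Non-compliant person

Missing any one of the required items
Red mask + red bbox
Missing items are labeled explicitly

Missing: helmet, vest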
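The decision itself is plain set arithmetic. As a minimal sketch, the `check_compliance` function below is a hypothetical helper (not a method of the article's class) that returns both the verdict and the text that would be drawn for a non-compliant person.

```python
REQUIRED_EQUIPMENT = frozenset({"helmet", "vest"})


def check_compliance(equipments, required=REQUIRED_EQUIPMENT):
    """Return (is_compliant, sorted list of missing items)."""
    missing = sorted(set(required) - set(equipments))
    return (not missing), missing


def missing_label(missing):
    """Text overlay for a non-compliant person."""
    return "Missing: " + ", ".join(missing)


ok, missing_none = check_compliance({"helmet", "vest", "gloves"})
bad, missing_vest = check_compliance({"helmet"})
label = missing_label(check_compliance(set())[1])
print(ok, bad, missing_vest, label)
```

Extra detected items such as `"gloves"` do not affect the verdict, and changing the rule for a different site only means passing a different `required` set.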

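IV. Visualization Design (engineering-friendly)

Compliant person

🟩 Green person box
🔵 Blue PPE boxes

Non-compliant person

🟥 Semi-transparent red person mask
🟥 Red bbox
❗ "Missing" text hint

self._draw_mask(output, p["mask"], (0,0,255))

This design stays clearly readable in long-range surveillance, on large displays, and in night scenes.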
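The semi-transparent overlay is an alpha blend restricted to the mask pixels. The sketch below reproduces it with NumPy only so the arithmetic is visible; `blend_mask` is an illustrative stand-in and not the article's `_draw_mask`, which blends a full-frame overlay with `cv2.addWeighted`. Outside the mask the overlay equals the image, so the two approaches should agree up to rounding.

```python
import numpy as np


def blend_mask(img, mask, color, alpha=0.45):
    """Blend `color` (BGR) into `img` where mask > 0.5; returns a new uint8 image."""
    out = img.astype(np.float32)
    sel = mask > 0.5
    # out = alpha * color + (1 - alpha) * original, on masked pixels only
    out[sel] = alpha * np.array(color, dtype=np.float32) + (1 - alpha) * out[sel]
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


# Toy 4x4 gray image with a single masked pixel.
img = np.full((4, 4, 3), 100, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.float32)
mask[1, 1] = 1.0

blended = blend_mask(img, mask, (0, 0, 255))  # red in BGR order
print(blended[1, 1], blended[0, 0])
```

A lower `alpha` keeps more of the underlying frame visible, which matters when operators still need to see the person under the red tint.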

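V. Why This Is a "Production-Grade" Scheme

✔ Uses segmentation rather than guessing from bboxes

✔ Multi-strategy binding, which makes it robust

✔ No ReID required, so it stays real-time

✔ Easy to extend (goggles, gloves, safety shoes)

✔ Decoupled from business rules (only REQUIRED_EQUIPMENT needs to change)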

import cv2
import numpy as np
import random
from ultralytics import YOLO


class YOLOPersonEquipmentBinder:
    """
    Person Segmentation + PPE Detection Binder (Production Grade)

    Rules:
    - Person: segmentation mask + bbox
    - PPE: detection bbox only
    - Binding priority:
        1. BBox-Mask IoU
        2. Multi-point sampling in mask
        3. Center in person bbox (fallback)

    Visualization:
    - Compliant (helmet + vest):
        - Green person bbox
        - Blue PPE boxes
    - Non-compliant:
        - Red person mask
        - Red bbox
        - Missing equipment text
    """

    PERSON_CLASS_ID = 0
    REQUIRED_EQUIPMENT = {"helmet", "vest"}

    # =========================
    # INIT
    # =========================
    def __init__(
        self,
        seg_model_path="yolo11n-seg.pt",
        det_model_path="safety-11x.pt",
        device="cuda",
        alpha=0.45,
    ):
        self.device = device
        self.alpha = alpha

        self.seg_model = YOLO(seg_model_path).to(device)
        self.det_model = YOLO(det_model_path).to(device)

        self.det_names = self.det_model.names
        self.person_colors = {}

    # =========================
    # UTILS
    # =========================
    def _rand_color(self):
        return [random.randint(100, 255) for _ in range(3)]

    def _draw_mask(self, img, mask, color):
        mask = (mask > 0.5).astype(np.uint8)
        overlay = img.copy()
        overlay[mask == 1] = color
        img[:] = cv2.addWeighted(overlay, self.alpha, img, 1 - self.alpha, 0)

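    def _bbox_mask_iou(self, box, mask):
        """Fraction of the PPE box covered by the person mask.

        Despite the name, this is intersection over box area rather than
        a symmetric IoU. Returns a value in [0, 1].
        """
        x1, y1, x2, y2 = box
        h, w = mask.shape

        # Clamp the box to the mask bounds before slicing.
        x1 = max(0, min(w - 1, x1))
        y1 = max(0, min(h - 1, y1))
        x2 = max(0, min(w, x2))
        y2 = max(0, min(h, y2))

        if x2 <= x1 or y2 <= y1:
            return 0.0

        region = mask[y1:y2, x1:x2]
        inter = np.sum(region > 0)
        area = (x2 - x1) * (y2 - y1)

        return inter / (area + 1e-6)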

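    def _sample_points_in_mask(self, box, mask):
        """Return True if any point of a 3x3 grid inside the box hits the mask."""
        x1, y1, x2, y2 = box
        h, w = mask.shape

        # Left / middle / right columns and top / middle / bottom rows.
        xs = [x1, (x1 + x2) // 2, x2 - 1]
        ys = [y1, (y1 + y2) // 2, y2 - 1]

        for px in xs:
            for py in ys:
                # Skip sample points that fall outside the image.
                if 0 <= px < w and 0 <= py < h:
                    if mask[py, px] > 0:
                        return True
        return False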

    def _center_in_bbox(self, box, person_box):
        """Fallback test: is the center of the PPE box inside the person bbox?"""
        x1, y1, x2, y2 = box
        cx = (x1 + x2) // 2
        cy = (y1 + y2) // 2

        px1, py1, px2, py2 = person_box
        return px1 <= cx <= px2 and py1 <= cy <= py2

    # =========================
    # MAIN
    # =========================
    def do(self, frame):
        if frame is None:
            return None

        output = frame.copy()
        h, w, _ = frame.shape

        # -------- Person SEG --------
        seg_res = self.seg_model.track(
            frame,
            persist=True,
            verbose=False,
            device=self.device,
        )[0]

        if seg_res.masks is None or seg_res.boxes is None:
            return output

        persons = []

        for i, cls in enumerate(seg_res.boxes.cls.cpu().numpy()):
            if int(cls) != self.PERSON_CLASS_ID:
                continue

            mask = seg_res.masks.data[i].cpu().numpy()
            mask = cv2.resize(mask, (w, h), interpolation=cv2.INTER_NEAREST)

            box = seg_res.boxes.xyxy[i].cpu().numpy().astype(int)

            persons.append({
                "mask": mask,
                "box": box,
                "equipments": set()
            })

        if not persons:
            return output

        # -------- PPE DET --------
        det_res = self.det_model.track(
            frame,
            persist=True,
            verbose=False,
            device=self.device,
        )[0]

        if det_res.boxes is not None:
            for i, det_box in enumerate(det_res.boxes.xyxy.cpu().numpy().astype(int)):
                cls_name = self.det_names[int(det_res.boxes.cls[i])].lower()

                best_person = None
                best_score = 0.0

                # 1️⃣ IoU
                for p in persons:
                    score = self._bbox_mask_iou(det_box, p["mask"])
                    if score > best_score:
                        best_score = score
                        best_person = p

                # 2️⃣ Multi-point sampling
                if best_score < 0.01:
                    for p in persons:
                        if self._sample_points_in_mask(det_box, p["mask"]):
                            best_person = p
                            break

                # 3️⃣ Center in person bbox
                if best_person is None:
                    for p in persons:
                        if self._center_in_bbox(det_box, p["box"]):
                            best_person = p
                            break

                if best_person is None:
                    continue

                best_person["equipments"].add(cls_name)

                # 🔵 draw PPE
                cv2.rectangle(
                    output,
                    (det_box[0], det_box[1]),
                    (det_box[2], det_box[3]),
                    (255, 0, 0),
                    2,
                )
                cv2.putText(
                    output,
                    cls_name,
                    (det_box[0], det_box[1] - 5),
                    cv2.FONT_HERSHEY_SIMPLEX,
                    0.6,
                    (255, 0, 0),
                    2,
                )

        # -------- Decision & Render --------
        for p in persons:
            x1, y1, x2, y2 = p["box"]
            equip = p["equipments"]

            if self.REQUIRED_EQUIPMENT.issubset(equip):
                # ✅ compliant
                cv2.rectangle(output, (x1, y1), (x2, y2), (0, 255, 0), 2)
            else:
                # ❌ non-compliant
                self._draw_mask(output, p["mask"], (0, 0, 255))
                cv2.rectangle(output, (x1, y1), (x2, y2), (0, 0, 255), 2)

                missing = sorted(list(self.REQUIRED_EQUIPMENT - equip))
                cv2.putText(
                    output,
                    "Missing: " + ", ".join(missing),
                    (x1, max(20, y1 - 10)),
                    cv2.FONT_HERSHEY_SIMPLEX,
                    0.7,
                    (0, 0, 255),
                    2,
                )

        return output

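VI. Possible Extensions

You can keep upgrading it quite easily:

🎯 PPE → body-part constraints (a helmet must sit in the upper, head region of the person)
⏱ Raise an alarm only after N consecutive non-compliant frames
🧠 Add temporal consistency (track_id)
📊 Compliance-rate statistics / person trajectory replay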
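Two of these extensions are small enough to sketch. Everything below is hypothetical and not part of the article's class: `helmet_in_head_region` with its `head_frac=0.3` default, and `ViolationDebouncer` with its `n_frames` parameter. The debouncer assumes a stable per-person track ID is available; with Ultralytics tracking this would typically be read from the result's `boxes.id`, which can be `None` when the tracker has not assigned IDs.

```python
from collections import defaultdict


def helmet_in_head_region(helmet_box, person_box, head_frac=0.3):
    """True if the helmet's vertical center lies in the top `head_frac` of the person box."""
    _, hy1, _, hy2 = helmet_box
    _, py1, _, py2 = person_box
    cy = (hy1 + hy2) / 2
    return py1 <= cy <= py1 + head_frac * (py2 - py1)


class ViolationDebouncer:
    """Raise an alarm only after n_frames consecutive non-compliant frames per track."""

    def __init__(self, n_frames=15):
        self.n_frames = n_frames
        self.streak = defaultdict(int)  # track_id -> current violation streak

    def update(self, track_id, compliant):
        """Feed one frame's verdict; returns True when the alarm should fire."""
        if compliant:
            self.streak[track_id] = 0
            return False
        self.streak[track_id] += 1
        return self.streak[track_id] >= self.n_frames


person = (20, 0, 80, 200)
on_head = helmet_in_head_region((40, 10, 60, 30), person)      # center y = 20
in_hand = helmet_in_head_region((40, 150, 60, 170), person)    # center y = 160

deb = ViolationDebouncer(n_frames=3)
alarms = [deb.update(7, compliant=False) for _ in range(3)]
after_reset = deb.update(7, compliant=True)
restart = deb.update(7, compliant=False)
print(on_head, in_hand, alarms, after_reset, restart)
```

A single compliant frame resets the streak in this sketch; a more forgiving variant could tolerate a few missed detections before resetting, which may suit flickering detectors better.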

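VII. Summary

The core value of YOLOPersonEquipmentBinder comes down to one sentence:

"It is not just that equipment was detected; we know exactly who is wearing it."

It has moved past the demo stage and is a scheme that can actually go into industrial safety systems, smart construction sites, and compliance auditing.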
