YOLOv8 Object Detection and Instance Segmentation: Deploying the Object Detection ONNX Model

  1. Model conversion

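ONNX Runtime is an open-source, high-performance inference engine for deploying and running machine learning models. It is designed to optimize the execution of models defined in the Open Neural Network Exchange (ONNX) format, an open standard for representing machine learning models. ONNX Runtime offers several key features and advantages: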

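a. Cross-platform compatibility: ONNX Runtime is designed to work across operating systems such as Windows and Linux and across hardware such as CPUs, GPUs and FPGAs, so models can be deployed and run in different environments with little effort.

b. High performance: ONNX Runtime is tuned for performance and provides efficient model computation, with optimization modes targeted at different platforms.

c. Multi-framework support: ONNX Runtime can run models created with different machine learning frameworks, including PyTorch and TensorFlow.

d. Model conversion: models from supported frameworks can be converted to the ONNX format, for example with the framework's own exporter, and then run with ONNX Runtime, which makes deployment across scenarios easier.

e. Multi-language support: ONNX Runtime is available from several programming languages, including C++, C# and Python, so it fits development in different languages.

f. Custom operators: ONNX Runtime supports custom operators, which lets developers extend it to cover specific operations or hardware acceleration.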

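ONNX Runtime is widely used for production deployment of machine learning applications, including computer vision and natural language processing. It is actively maintained as an open-source project and continues to receive updates and improvements.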
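
To make the cross-platform point concrete, here is a minimal sketch of choosing execution providers in order of preference. The helper `pick_providers` is hypothetical (it is not part of ONNX Runtime), and the `available` list is hard-coded for illustration; the provider names themselves are the ones ONNX Runtime uses.

```python
# Hypothetical helper (not an ONNX Runtime API): keep the preferred
# execution providers that are actually available, in preference order.
def pick_providers(available,
                   preferred=("CUDAExecutionProvider", "CPUExecutionProvider")):
    chosen = [p for p in preferred if p in available]
    # Fall back to whatever is available if none of the preferred ones exist.
    return chosen or list(available)


# Example value; on a real machine this list would come from the runtime.
available = ["CPUExecutionProvider", "CUDAExecutionProvider"]
print(pick_providers(available))
```

With onnxruntime installed, the list can be obtained from `onnxruntime.get_available_providers()` and the result passed as the `providers` argument of `onnxruntime.InferenceSession`.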

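  1. Differences between pt models and onnx models

Both pt and onnx are common file formats for representing machine learning models. The main differences are: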

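a. File format:

pt model: PyTorch's weight file format, usually saved with a .pt or .pth extension. It contains the model's weight parameters and the definition of the model structure.

onnx model: a model file in ONNX format, usually saved with a .onnx extension. An onnx file is a neutral representation that is independent of any particular deep learning framework and is used for model conversion and deployment across frameworks.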

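b. Framework dependency:

pt model: depends on PyTorch. Loading and running it requires the PyTorch library, which prevents such models from being used directly in other frameworks.

onnx model: independent of the deep learning framework. It can be loaded and run in any framework or runtime that supports ONNX, such as TensorFlow, Caffe2 and ONNX Runtime.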

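c. Cross-platform compatibility:

pt model: PyTorch has to be installed and configured on each target platform, which means extra work and dependency handling.

onnx model: because it is framework-independent, an ONNX model is easier to deploy on different platforms and hardware, without concern for framework dependencies.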

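  1. Converting the yolov8 pt model to onnx

To deploy a trained pt model on a different framework or platform, it first has to be converted to ONNX format with an ONNX export tool.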

from ultralytics import YOLO

# load model
model = YOLO('yolov8m.pt')

# export model
success = model.export(format="onnx")
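  1. Building the inference model

a. Environment setup

Inference with the ONNX model depends only on the onnxruntime library, and image processing depends on OpenCV, so these two libraries must be installed. The code below also uses NumPy, and the demo interface uses Gradio.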

pip3 install onnxruntime
pip3 install opencv-python
pip3 install numpy
pip3 install gradio
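
After installing, it can help to confirm that the packages are importable. The following is a minimal sketch using only the standard library; the helper `missing_packages` and the `REQUIRED` mapping are names made up for this example. Note that the pip package opencv-python is imported as `cv2`.

```python
import importlib.util

# Map pip package names to the module names they are imported as.
# opencv-python is imported as cv2; the others share their pip name.
REQUIRED = {
    "onnxruntime": "onnxruntime",
    "opencv-python": "cv2",
    "numpy": "numpy",
    "gradio": "gradio",
}


def missing_packages(required):
    """Return the pip names whose import module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("missing:", missing if missing else "none")
```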

b. Deployment code

utils.py

import numpy as np
import cv2

class_names = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
               'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
               'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
               'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
               'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
               'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
               'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard',
               'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase',
               'scissors', 'teddy bear', 'hair drier', 'toothbrush']


# Create a list of colors for each class where each color is a tuple of 3 integer values
rng = np.random.default_rng(3)
colors = rng.uniform(0, 255, size=(len(class_names), 3))


def nms(boxes, scores, iou_threshold):
    # Sort by score
    sorted_indices = np.argsort(scores)[::-1]

    keep_boxes = []
    while sorted_indices.size > 0:
        # Pick the remaining box with the highest score
        box_id = sorted_indices[0]
        keep_boxes.append(box_id)

        # Compute IoU of the picked box with the rest
        ious = compute_iou(boxes[box_id, :], boxes[sorted_indices[1:], :])

        # Remove boxes with IoU over the threshold
        keep_indices = np.where(ious < iou_threshold)[0]

        # print(keep_indices.shape, sorted_indices.shape)
        sorted_indices = sorted_indices[keep_indices + 1]

    return keep_boxes

def multiclass_nms(boxes, scores, class_ids, iou_threshold):

    unique_class_ids = np.unique(class_ids)

    keep_boxes = []
    for class_id in unique_class_ids:
        class_indices = np.where(class_ids == class_id)[0]
        class_boxes = boxes[class_indices,:]
        class_scores = scores[class_indices]

        class_keep_boxes = nms(class_boxes, class_scores, iou_threshold)
        keep_boxes.extend(class_indices[class_keep_boxes])

    return keep_boxes

def compute_iou(box, boxes):
    # Compute xmin, ymin, xmax, ymax for both boxes
    xmin = np.maximum(box[0], boxes[:, 0])
    ymin = np.maximum(box[1], boxes[:, 1])
    xmax = np.minimum(box[2], boxes[:, 2])
    ymax = np.minimum(box[3], boxes[:, 3])

    # Compute intersection area
    intersection_area = np.maximum(0, xmax - xmin) * np.maximum(0, ymax - ymin)

    # Compute union area
    box_area = (box[2] - box[0]) * (box[3] - box[1])
    boxes_area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    union_area = box_area + boxes_area - intersection_area

    # Compute IoU
    iou = intersection_area / union_area

    return iou


def xywh2xyxy(x):
    # Convert bounding box (x, y, w, h) to bounding box (x1, y1, x2, y2)
    y = np.copy(x)
    y[..., 0] = x[..., 0] - x[..., 2] / 2
    y[..., 1] = x[..., 1] - x[..., 3] / 2
    y[..., 2] = x[..., 0] + x[..., 2] / 2
    y[..., 3] = x[..., 1] + x[..., 3] / 2
    return y


def draw_detections(image, boxes, scores, class_ids, mask_alpha=0.3):
    det_img = image.copy()

    img_height, img_width = image.shape[:2]
    font_size = min([img_height, img_width]) * 0.0006
    text_thickness = int(min([img_height, img_width]) * 0.001)

    det_img = draw_masks(det_img, boxes, class_ids, mask_alpha)

    # Draw bounding boxes and labels of detections
    for class_id, box, score in zip(class_ids, boxes, scores):
        color = colors[class_id]

        draw_box(det_img, box, color)

        label = class_names[class_id]
        caption = f'{label} {int(score * 100)}%'
        draw_text(det_img, caption, box, color, font_size, text_thickness)

    return det_img

def detections_dog(image, boxes, scores, class_ids, mask_alpha=0.3):
    det_img = image.copy()

    img_height, img_width = image.shape[:2]
    font_size = min([img_height, img_width]) * 0.0006
    text_thickness = int(min([img_height, img_width]) * 0.001)

    # det_img = draw_masks(det_img, boxes, class_ids, mask_alpha)

    # Draw bounding boxes and labels of detections

    for class_id, box, score in zip(class_ids, boxes, scores):

        color = colors[class_id]

        draw_box(det_img, box, color)
        label = class_names[class_id]
        caption = f'{label} {int(score * 100)}%'
        draw_text(det_img, caption, box, color, font_size, text_thickness)

    return det_img

def draw_box( image: np.ndarray, box: np.ndarray, color: tuple[int, int, int] = (0, 0, 255),
             thickness: int = 2) -> np.ndarray:
    x1, y1, x2, y2 = box.astype(int)
    return cv2.rectangle(image, (x1, y1), (x2, y2), color, thickness)


def draw_text(image: np.ndarray, text: str, box: np.ndarray, color: tuple[int, int, int] = (0, 0, 255),
              font_size: float = 0.001, text_thickness: int = 2) -> np.ndarray:
    x1, y1, x2, y2 = box.astype(int)
    (tw, th), _ = cv2.getTextSize(text=text, fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                                  fontScale=font_size, thickness=text_thickness)
    th = int(th * 1.2)

    cv2.rectangle(image, (x1, y1),
                  (x1 + tw, y1 - th), color, -1)

    return cv2.putText(image, text, (x1, y1), cv2.FONT_HERSHEY_SIMPLEX, font_size, (255, 255, 255), text_thickness, cv2.LINE_AA)

def draw_masks(image: np.ndarray, boxes: np.ndarray, classes: np.ndarray, mask_alpha: float = 0.3) -> np.ndarray:
    mask_img = image.copy()

    # Draw bounding boxes and labels of detections
    for box, class_id in zip(boxes, classes):
        color = colors[class_id]

        x1, y1, x2, y2 = box.astype(int)

        # Draw fill rectangle in mask image
        cv2.rectangle(mask_img, (x1, y1), (x2, y2), color, -1)

    return cv2.addWeighted(mask_img, mask_alpha, image, 1 - mask_alpha, 0)
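
To see what the NMS helpers in utils.py do, here is a small worked example on three toy boxes. It restates `compute_iou`, `nms` and `multiclass_nms` in compact form so it runs on its own; the boxes, scores and class ids are invented for illustration.

```python
import numpy as np


def compute_iou(box, boxes):
    # Intersection rectangle between one box and an array of boxes (xyxy).
    xmin = np.maximum(box[0], boxes[:, 0])
    ymin = np.maximum(box[1], boxes[:, 1])
    xmax = np.minimum(box[2], boxes[:, 2])
    ymax = np.minimum(box[3], boxes[:, 3])
    inter = np.maximum(0, xmax - xmin) * np.maximum(0, ymax - ymin)
    box_area = (box[2] - box[0]) * (box[3] - box[1])
    boxes_area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (box_area + boxes_area - inter)


def nms(boxes, scores, iou_threshold):
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        ious = compute_iou(boxes[best], boxes[order[1:]])
        # Keep only boxes that overlap the chosen box less than the threshold.
        order = order[np.where(ious < iou_threshold)[0] + 1]
    return keep


def multiclass_nms(boxes, scores, class_ids, iou_threshold):
    keep = []
    for class_id in np.unique(class_ids):
        idx = np.where(class_ids == class_id)[0]
        keep.extend(idx[nms(boxes[idx], scores[idx], iou_threshold)])
    return keep


# Boxes 0 and 1 overlap heavily (IoU = 81 / 119, about 0.68); box 2 is far away.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)

class_agnostic = [int(i) for i in nms(boxes, scores, 0.5)]
class_ids = np.array([0, 1, 0])
per_class = sorted(int(i) for i in multiclass_nms(boxes, scores, class_ids, 0.5))

print(class_agnostic)  # [0, 2]
print(per_class)       # [0, 1, 2]
```

Plain NMS drops box 1 because it overlaps the higher-scoring box 0. The class-wise version keeps it, since boxes 0 and 1 belong to different classes and are never compared.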

YOLODet.py

import time
import cv2
import numpy as np
import onnxruntime

from detection.utils import xywh2xyxy, draw_detections, multiclass_nms, detections_dog

class YOLODet:

    def __init__(self, path, conf_thresh=0.7, iou_thresh=0.5):
        self.conf_threshold = conf_thresh
        self.iou_threshold = iou_thresh


        # Initialize model
        self.initialize_model(path)

    def __call__(self, image):
        return self.detect_objects(image)


    def initialize_model(self, path):
        self.session = onnxruntime.InferenceSession(path, providers=onnxruntime.get_available_providers())

        # Get model info
        self.get_input_details()
        self.get_output_details()

    def detect_objects(self, image):
        input_tensor = self.prepare_input(image)

        # perform inference on the image
        outputs = self.inference(input_tensor)

        self.boxes, self.scores, self.class_ids = self.process_output(outputs)
        return self.boxes, self.scores, self.class_ids

    def prepare_input(self, image):
        self.img_height, self.img_width = image.shape[:2]

        input_img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        
        # resize input image
        input_img = cv2.resize(input_img, (self.input_width, self.input_height))

        # scale input pixel values to 0 to 1
        input_img = input_img / 255.0
        input_img = input_img.transpose(2, 0, 1)
        input_tensor = input_img[np.newaxis, :, :, :].astype(np.float32)
        return input_tensor

    def inference(self, input_tensor):
        start = time.perf_counter()
        
        outputs = self.session.run(self.output_names, {self.input_names[0]: input_tensor})

        # print(f"inference time: {(time.perf_counter() - start)*1000:.2f} ms")
        return outputs

    def process_output(self, output):
        predictions = np.squeeze(output[0]).T

        # filter out object confidence scores below threshold
        scores = np.max(predictions[:,4:], axis=1)
        predictions = predictions[scores > self.conf_threshold, :]
        scores = scores[scores > self.conf_threshold]

        if len(scores) == 0:
            return [], [], []

        # get the class with the highest confidence
        class_ids = np.argmax(predictions[:,4:], axis=1)

        # get bounding boxes for each object
        boxes = self.extract_boxes(predictions)

        # apply class-wise non-maximum suppression to drop overlapping boxes of the same class
        indices = multiclass_nms(boxes, scores, class_ids, self.iou_threshold)

        return boxes[indices], scores[indices], class_ids[indices]

    def extract_boxes(self, predictions):
        # extract boxes from predictions
        boxes = predictions[:,:4]

        # scale boxes to original image dimensions
        boxes = self.rescale_boxes(boxes)

        # convert boxes to xyxy format
        boxes = xywh2xyxy(boxes)

        return boxes

    def rescale_boxes(self, boxes):
        # rescale boxes to original image dimensions
        input_shape = np.array([self.input_width, self.input_height, self.input_width, self.input_height])
        boxes = np.divide(boxes, input_shape, dtype=np.float32)
        boxes *= np.array([self.img_width, self.img_height, self.img_width, self.img_height])
        return boxes

    def draw_detections(self, image, draw_scores=True, mask_alpha=0.4):
        return detections_dog(image, self.boxes, self.scores, self.class_ids, mask_alpha)

    def get_input_details(self):
        model_inputs = self.session.get_inputs()
        self.input_names = [model_inputs[i].name for i in range(len(model_inputs))]

        self.input_shape = model_inputs[0].shape
        self.input_height = self.input_shape[2]
        self.input_width = self.input_shape[3]

    def get_output_details(self):
        model_outputs = self.session.get_outputs()
        self.output_names = [model_outputs[i].name for i in range(len(model_outputs))]
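
The post-processing in `process_output`, `rescale_boxes` and `xywh2xyxy` can be traced on a tiny synthetic tensor. This is a sketch under assumptions: the raw output is taken to have the layout (batch, 4 + num_classes, num_candidates), with the first four rows holding cx, cy, w, h in network-input pixels, and the sizes (3 classes, 5 candidates, a 1280x720 source image) are made up to keep the numbers readable.

```python
import numpy as np

num_classes = 3

# Synthetic raw output: (batch, 4 + num_classes, num_candidates).
raw = np.zeros((1, 4 + num_classes, 5), dtype=np.float32)
raw[0, :4, 1] = [320, 320, 64, 128]  # cx, cy, w, h in input pixels
raw[0, 4 + 2, 1] = 0.9               # confident score for class 2
raw[0, :4, 3] = [100, 200, 50, 50]
raw[0, 4 + 0, 3] = 0.2               # weak score, filtered out below

# One row per candidate: [cx, cy, w, h, class scores...].
predictions = np.squeeze(raw).T

# Keep candidates whose best class score exceeds the threshold.
scores = predictions[:, 4:].max(axis=1)
mask = scores > 0.5
predictions, scores = predictions[mask], scores[mask]
class_ids = predictions[:, 4:].argmax(axis=1)

# Rescale from the 640x640 network input to the original image size.
input_w, input_h = 640, 640
img_w, img_h = 1280, 720
boxes = predictions[:, :4] / np.array([input_w, input_h, input_w, input_h], dtype=np.float32)
boxes = boxes * np.array([img_w, img_h, img_w, img_h], dtype=np.float32)

# Convert centre/size boxes to corner format (x1, y1, x2, y2).
xyxy = np.empty_like(boxes)
xyxy[:, 0] = boxes[:, 0] - boxes[:, 2] / 2
xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2
xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2
xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2

print(class_ids)  # [2]
print(xyxy)       # [[576. 288. 704. 432.]]
```

Only the candidate with score 0.9 survives the threshold. Its centre (320, 320) maps to (640, 360) in the 1280x720 image, and its size (64, 128) maps to (128, 144), which gives the corners shown.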
  1. Testing the model

Image test

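import cv2
import gradio as gr

# assumes YOLODet.py lives in the detection package next to utils.py
from detection.YOLODet import YOLODet

model = 'yolov8m.onnx'
yolo_det = YOLODet(model, conf_thresh=0.5, iou_thresh=0.3)

def det_img(rgb_src):
    # Gradio passes images as RGB arrays; the detector expects OpenCV's BGR order
    cv_src = cv2.cvtColor(rgb_src, cv2.COLOR_RGB2BGR)
    yolo_det(cv_src)
    cv_dst = yolo_det.draw_detections(cv_src)

    # convert back to RGB for display in Gradio
    return cv2.cvtColor(cv_dst, cv2.COLOR_BGR2RGB)

if __name__ == '__main__':

    input_image = gr.Image()
    output_image = gr.Image()

    demo = gr.Interface(fn=det_img, inputs=input_image, outputs=output_image)
    demo.launch()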

Video inference

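import cv2

# assumes YOLODet.py lives in the detection package next to utils.py
from detection.YOLODet import YOLODet

def detection_video(input_path, model_path, output_path):

    cap = cv2.VideoCapture(input_path)

    # assumption: fall back to 25 fps if the file does not report a frame rate
    fps = int(cap.get(cv2.CAP_PROP_FPS)) or 25

    # per-frame display delay in milliseconds (at least 1 so waitKey does not block)
    t = max(1, int(1000 / fps))

    videoWriter = None

    det = YOLODet(model_path, conf_thresh=0.3, iou_thresh=0.5)

    while True:

        ok, img = cap.read()
        if not ok or img is None:
            break

        det(img)

        cv_dst = det.draw_detections(img)

        # create the writer lazily, once the output frame size is known
        if videoWriter is None:
            fourcc = cv2.VideoWriter_fourcc('m','p','4','v')
            videoWriter = cv2.VideoWriter(output_path, fourcc, fps, (cv_dst.shape[1], cv_dst.shape[0]))

        # save the annotated frame to the output video
        videoWriter.write(cv_dst)

        cv2.imshow("detection", cv_dst)
        cv2.waitKey(t)

        # stop when the preview window has been closed
        if cv2.getWindowProperty("detection", cv2.WND_PROP_AUTOSIZE) < 1:
            break

    cap.release()
    if videoWriter is not None:
        videoWriter.release()
    cv2.destroyAllWindows()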