yolov11-cpp-opencv-dnn推理onnx模型

文章目录


前言

随着深度学习技术的不断发展,目标检测算法在计算机视觉领域扮演着越来越重要的角色。YOLO(You Only Look Once)系列算法因其高效、实时的特点,在目标检测领域取得了显著成果。YOLOv11作为该系列的新成员,继承了前代算法的优势,并在性能上有了进一步的提升。

本文将介绍YOLOv11-cpp-opencv-dnn推理的实现方法,通过YOLOv11算法与OpenCV库相结合,我们可以在保持算法高效性的同时,提高代码的可读性和可维护性。此外,C++语言的高性能特性使得该推理方法在处理大规模数据集时具有更好的实时性。


一、前期准备:

本次系统为Ubuntu,使用的编译器为QT,安装opencv的方式为源码安装,安装方式参考博客:值得注意的是,opencv版本一定要大于4.7.0,不然在推理时候会报错,opencv安装好之后在.pro

工程中链接opencv,方法如下所示:

c 复制代码
TEMPLATE = app
CONFIG += console c++11
CONFIG -= app_bundle
CONFIG -= qt

SOURCES += \
        inference.cpp \
        main.cpp

HEADERS += \
    inference.h

INCLUDEPATH +=  /usr/local/include/opencv4 \
            /usr/local/include/opencv4/opencv2


unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_video
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_dnn
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_highgui
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_objdetect
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_calib3d
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_core
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_features2d
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_flann
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_imgproc
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_imgcodecs
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_videoio
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_stitching
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_gapi
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_photo

二、使用步骤

在工程下创建三个文件,分别是inference.h,inference.cpp,main.cpp,其内容分别如下所示:

inference.h:

c 复制代码
#ifndef INFERENCE_H
#define INFERENCE_H

// Cpp native
#include <fstream>
#include <vector>
#include <string>
#include <random>

// OpenCV / DNN / Inference
#include <opencv2/imgproc.hpp>
#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>

struct Detection
{
    int class_id{0};
    std::string className{};
    float confidence{0.0};
    cv::Scalar color{};
    cv::Rect box{};
};

class Inference
{
public:
    Inference(const std::string &onnxModelPath, const cv::Size &modelInputShape = {640, 640}, const std::string &classesTxtFile = "", const bool &runWithCuda = false);
    std::vector<Detection> runInference(const cv::Mat &input);

private:
    void loadClassesFromFile();
    void loadOnnxNetwork();
    cv::Mat formatToSquare(const cv::Mat &source);

    std::string modelPath{};
    std::string classesPath{};
    bool cudaEnabled{};

    std::vector<std::string> classes{"person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch", "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush"};

    cv::Size2f modelShape{};

    float modelConfidenceThreshold {0.25};
    float modelScoreThreshold      {0.45};
    float modelNMSThreshold        {0.50};

    bool letterBoxForSquare = true;

    cv::dnn::Net net;
};

#endif // INFERENCE_H

inference.cpp:

c 复制代码
#include "inference.h"

Inference::Inference(const std::string &onnxModelPath, const cv::Size &modelInputShape, const std::string &classesTxtFile, const bool &runWithCuda)
{
    modelPath = onnxModelPath;
    modelShape = modelInputShape;
    classesPath = classesTxtFile;
    cudaEnabled = runWithCuda;

    loadOnnxNetwork();
    // loadClassesFromFile(); The classes are hard-coded for this example
}

std::vector<Detection> Inference::runInference(const cv::Mat &input)
{
    cv::Mat modelInput = input;
    if (letterBoxForSquare && modelShape.width == modelShape.height)
        modelInput = formatToSquare(modelInput);

    cv::Mat blob;
    cv::dnn::blobFromImage(modelInput, blob, 1.0/255.0, modelShape, cv::Scalar(), true, false);
    net.setInput(blob);

    std::vector<cv::Mat> outputs;
    net.forward(outputs, net.getUnconnectedOutLayersNames());

    int rows = outputs[0].size[1];
    int dimensions = outputs[0].size[2];

    bool yolov8 = false;
    // yolov5 has an output of shape (batchSize, 25200, 85) (Num classes + box[x,y,w,h] + confidence[c])
    // yolov8 has an output of shape (batchSize, 84,  8400) (Num classes + box[x,y,w,h])
    if (dimensions > rows) // Check if the shape[2] is more than shape[1] (yolov8)
    {
        yolov8 = true;
        rows = outputs[0].size[2];
        dimensions = outputs[0].size[1];

        outputs[0] = outputs[0].reshape(1, dimensions);
        cv::transpose(outputs[0], outputs[0]);
    }
    float *data = (float *)outputs[0].data;

    float x_factor = modelInput.cols / modelShape.width;
    float y_factor = modelInput.rows / modelShape.height;

    std::vector<int> class_ids;
    std::vector<float> confidences;
    std::vector<cv::Rect> boxes;

    for (int i = 0; i < rows; ++i)
    {
        if (yolov8)
        {
            float *classes_scores = data+4;

            cv::Mat scores(1, classes.size(), CV_32FC1, classes_scores);
            cv::Point class_id;
            double maxClassScore;

            minMaxLoc(scores, 0, &maxClassScore, 0, &class_id);

            if (maxClassScore > modelScoreThreshold)
            {
                confidences.push_back(maxClassScore);
                class_ids.push_back(class_id.x);

                float x = data[0];
                float y = data[1];
                float w = data[2];
                float h = data[3];

                int left = int((x - 0.5 * w) * x_factor);
                int top = int((y - 0.5 * h) * y_factor);

                int width = int(w * x_factor);
                int height = int(h * y_factor);

                boxes.push_back(cv::Rect(left, top, width, height));
            }
        }
        else // yolov5
        {
            float confidence = data[4];

            if (confidence >= modelConfidenceThreshold)
            {
                float *classes_scores = data+5;

                cv::Mat scores(1, classes.size(), CV_32FC1, classes_scores);
                cv::Point class_id;
                double max_class_score;

                minMaxLoc(scores, 0, &max_class_score, 0, &class_id);

                if (max_class_score > modelScoreThreshold)
                {
                    confidences.push_back(confidence);
                    class_ids.push_back(class_id.x);

                    float x = data[0];
                    float y = data[1];
                    float w = data[2];
                    float h = data[3];

                    int left = int((x - 0.5 * w) * x_factor);
                    int top = int((y - 0.5 * h) * y_factor);

                    int width = int(w * x_factor);
                    int height = int(h * y_factor);

                    boxes.push_back(cv::Rect(left, top, width, height));
                }
            }
        }

        data += dimensions;
    }

    std::vector<int> nms_result;
    cv::dnn::NMSBoxes(boxes, confidences, modelScoreThreshold, modelNMSThreshold, nms_result);

    std::vector<Detection> detections{};
    for (unsigned long i = 0; i < nms_result.size(); ++i)
    {
        int idx = nms_result[i];

        Detection result;
        result.class_id = class_ids[idx];
        result.confidence = confidences[idx];

        std::random_device rd;
        std::mt19937 gen(rd());
        std::uniform_int_distribution<int> dis(100, 255);
        result.color = cv::Scalar(dis(gen),
                                  dis(gen),
                                  dis(gen));

        result.className = classes[result.class_id];
        result.box = boxes[idx];

        detections.push_back(result);
    }

    return detections;
}

void Inference::loadClassesFromFile()
{
    std::ifstream inputFile(classesPath);
    if (inputFile.is_open())
    {
        std::string classLine;
        while (std::getline(inputFile, classLine))
            classes.push_back(classLine);
        inputFile.close();
    }
}

void Inference::loadOnnxNetwork()
{
    net = cv::dnn::readNetFromONNX(modelPath);
    if (cudaEnabled)
    {
        std::cout << "\nRunning on CUDA" << std::endl;

    }
    else
    {
        std::cout << "\nRunning on CPU" << std::endl;
        net.setPreferableBackend(cv::dnn::DNN_BACKEND_OPENCV);
        net.setPreferableTarget(cv::dnn::DNN_TARGET_CPU);
    }
}

cv::Mat Inference::formatToSquare(const cv::Mat &source)
{
    int col = source.cols;
    int row = source.rows;
    int _max = MAX(col, row);
    cv::Mat result = cv::Mat::zeros(_max, _max, CV_8UC3);
    source.copyTo(result(cv::Rect(0, 0, col, row)));
    return result;
}

main.cpp:

c 复制代码
#include <iostream>
#include <vector>
#include <getopt.h>

#include <opencv2/opencv.hpp>

#include "inference.h"

using namespace std;
//using namespace cv;

int main(int argc, char **argv)
{
    std::string projectBasePath = "/home/build/下载/ultralytics-main-2"; // Set your ultralytics base path

    bool runOnGPU = false;

    //
    // Pass in either:
    //
    // "yolov8s.onnx" or "yolov5s.onnx"
    //
    // To run Inference with yolov8/yolov5 (ONNX)
    //

    // Note that in this example the classes are hard-coded and 'classes.txt' is a place holder.
    Inference inf("/home/build/下载/ultralytics-main-2/yolo11n.onnx", cv::Size(640, 640), "classes.txt", runOnGPU);

    std::vector<std::string> imageNames;
    imageNames.push_back( "/home/build/下载/ultralytics-main-2/ccc.jpg");

    for (int i = 0; i < imageNames.size(); ++i)
    {
        cv::Mat frame = cv::imread(imageNames[i]);

        // Inference starts here...
        std::vector<Detection> output = inf.runInference(frame);

        int detections = output.size();
        std::cout << "Number of detections:" << detections << std::endl;

        for (int i = 0; i < detections; ++i)
        {
            Detection detection = output[i];

            cv::Rect box = detection.box;
            cv::Scalar color = detection.color;

            // Detection box
            cv::rectangle(frame, box, color, 2);

            // Detection box text
            std::string classString = detection.className + ' ' + std::to_string(detection.confidence).substr(0, 4);
            cv::Size textSize = cv::getTextSize(classString, cv::FONT_HERSHEY_DUPLEX, 1, 2, 0);
            cv::Rect textBox(box.x, box.y - 40, textSize.width + 10, textSize.height + 20);

            cv::rectangle(frame, textBox, color, cv::FILLED);
            cv::putText(frame, classString, cv::Point(box.x + 5, box.y - 10), cv::FONT_HERSHEY_DUPLEX, 1, cv::Scalar(0, 0, 0), 2, 0);
        }
        // Inference ends here...

        // This is only for preview purposes
        float scale = 0.8;
        cv::resize(frame, frame, cv::Size(640, 640));
        cv::imshow("Inference", frame);

        cv::waitKey(-1);
    }
}

在main.cpp中修改图片和模型路径,运行:

相关推荐
Chef_Chen41 分钟前
从0开始学习计算机视觉--Day09--卷积与池化
深度学习·学习·计算机视觉
charley.layabox5 小时前
8月1日ChinaJoy酒会 | 游戏出海高端私享局 | 平台 × 发行 × 投资 × 研发精英畅饮畅聊
人工智能·游戏
DFRobot智位机器人6 小时前
AIOT开发选型:行空板 K10 与 M10 适用场景与选型深度解析
人工智能
想成为风筝8 小时前
从零开始学习深度学习—水果分类之PyQt5App
人工智能·深度学习·计算机视觉·pyqt
F_D_Z8 小时前
MMaDA:多模态大型扩散语言模型
人工智能·语言模型·自然语言处理
大知闲闲哟8 小时前
深度学习G2周:人脸图像生成(DCGAN)
人工智能·深度学习
飞哥数智坊9 小时前
Coze实战第15讲:钱都去哪儿了?Coze+飞书搭建自动记账系统
人工智能·coze
wenzhangli79 小时前
低代码引擎核心技术:OneCode常用动作事件速查手册及注解驱动开发详解
人工智能·低代码·云原生
千宇宙航9 小时前
闲庭信步使用图像验证平台加速FPGA的开发:第十课——图像gamma矫正的FPGA实现
图像处理·计算机视觉·缓存·fpga开发
潘达斯奈基~10 小时前
大模型的Temperature、Top-P、Top-K、Greedy Search、Beem Search
人工智能·aigc