yolov11-cpp-opencv-dnn推理onnx模型

文章目录


前言

随着深度学习技术的不断发展,目标检测算法在计算机视觉领域扮演着越来越重要的角色。YOLO(You Only Look Once)系列算法因其高效、实时的特点,在目标检测领域取得了显著成果。YOLOv11作为该系列的新成员,继承了前代算法的优势,并在性能上有了进一步的提升。

本文将介绍YOLOv11-cpp-opencv-dnn推理的实现方法,通过YOLOv11算法与OpenCV库相结合,我们可以在保持算法高效性的同时,提高代码的可读性和可维护性。此外,C++语言的高性能特性使得该推理方法在处理大规模数据集时具有更好的实时性。


一、前期准备:

本次系统为Ubuntu,使用的编译器为QT,安装opencv的方式为源码安装,安装方式参考博客:值得注意的是,opencv版本一定要大于4.7.0,不然在推理时候会报错,opencv安装好之后在.pro

工程中链接opencv,方法如下所示:

c 复制代码
TEMPLATE = app
CONFIG += console c++11
CONFIG -= app_bundle
CONFIG -= qt

SOURCES += \
        inference.cpp \
        main.cpp

HEADERS += \
    inference.h

INCLUDEPATH +=  /usr/local/include/opencv4 \
            /usr/local/include/opencv4/opencv2


unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_video
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_dnn
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_highgui
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_objdetect
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_calib3d
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_core
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_features2d
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_flann
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_imgproc
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_imgcodecs
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_videoio
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_stitching
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_gapi
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_photo

二、使用步骤

在工程下创建三个文件,分别是inference.h,inference.cpp,main.cpp,其内容分别如下所示:

inference.h:

c 复制代码
#ifndef INFERENCE_H
#define INFERENCE_H

// Cpp native
#include <fstream>
#include <vector>
#include <string>
#include <random>

// OpenCV / DNN / Inference
#include <opencv2/imgproc.hpp>
#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>

struct Detection
{
    int class_id{0};
    std::string className{};
    float confidence{0.0};
    cv::Scalar color{};
    cv::Rect box{};
};

class Inference
{
public:
    Inference(const std::string &onnxModelPath, const cv::Size &modelInputShape = {640, 640}, const std::string &classesTxtFile = "", const bool &runWithCuda = false);
    std::vector<Detection> runInference(const cv::Mat &input);

private:
    void loadClassesFromFile();
    void loadOnnxNetwork();
    cv::Mat formatToSquare(const cv::Mat &source);

    std::string modelPath{};
    std::string classesPath{};
    bool cudaEnabled{};

    std::vector<std::string> classes{"person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch", "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush"};

    cv::Size2f modelShape{};

    float modelConfidenceThreshold {0.25};
    float modelScoreThreshold      {0.45};
    float modelNMSThreshold        {0.50};

    bool letterBoxForSquare = true;

    cv::dnn::Net net;
};

#endif // INFERENCE_H

inference.cpp:

c 复制代码
#include "inference.h"

Inference::Inference(const std::string &onnxModelPath, const cv::Size &modelInputShape, const std::string &classesTxtFile, const bool &runWithCuda)
{
    modelPath = onnxModelPath;
    modelShape = modelInputShape;
    classesPath = classesTxtFile;
    cudaEnabled = runWithCuda;

    loadOnnxNetwork();
    // loadClassesFromFile(); The classes are hard-coded for this example
}

std::vector<Detection> Inference::runInference(const cv::Mat &input)
{
    cv::Mat modelInput = input;
    if (letterBoxForSquare && modelShape.width == modelShape.height)
        modelInput = formatToSquare(modelInput);

    cv::Mat blob;
    cv::dnn::blobFromImage(modelInput, blob, 1.0/255.0, modelShape, cv::Scalar(), true, false);
    net.setInput(blob);

    std::vector<cv::Mat> outputs;
    net.forward(outputs, net.getUnconnectedOutLayersNames());

    int rows = outputs[0].size[1];
    int dimensions = outputs[0].size[2];

    bool yolov8 = false;
    // yolov5 has an output of shape (batchSize, 25200, 85) (Num classes + box[x,y,w,h] + confidence[c])
    // yolov8 has an output of shape (batchSize, 84,  8400) (Num classes + box[x,y,w,h])
    if (dimensions > rows) // Check if the shape[2] is more than shape[1] (yolov8)
    {
        yolov8 = true;
        rows = outputs[0].size[2];
        dimensions = outputs[0].size[1];

        outputs[0] = outputs[0].reshape(1, dimensions);
        cv::transpose(outputs[0], outputs[0]);
    }
    float *data = (float *)outputs[0].data;

    float x_factor = modelInput.cols / modelShape.width;
    float y_factor = modelInput.rows / modelShape.height;

    std::vector<int> class_ids;
    std::vector<float> confidences;
    std::vector<cv::Rect> boxes;

    for (int i = 0; i < rows; ++i)
    {
        if (yolov8)
        {
            float *classes_scores = data+4;

            cv::Mat scores(1, classes.size(), CV_32FC1, classes_scores);
            cv::Point class_id;
            double maxClassScore;

            minMaxLoc(scores, 0, &maxClassScore, 0, &class_id);

            if (maxClassScore > modelScoreThreshold)
            {
                confidences.push_back(maxClassScore);
                class_ids.push_back(class_id.x);

                float x = data[0];
                float y = data[1];
                float w = data[2];
                float h = data[3];

                int left = int((x - 0.5 * w) * x_factor);
                int top = int((y - 0.5 * h) * y_factor);

                int width = int(w * x_factor);
                int height = int(h * y_factor);

                boxes.push_back(cv::Rect(left, top, width, height));
            }
        }
        else // yolov5
        {
            float confidence = data[4];

            if (confidence >= modelConfidenceThreshold)
            {
                float *classes_scores = data+5;

                cv::Mat scores(1, classes.size(), CV_32FC1, classes_scores);
                cv::Point class_id;
                double max_class_score;

                minMaxLoc(scores, 0, &max_class_score, 0, &class_id);

                if (max_class_score > modelScoreThreshold)
                {
                    confidences.push_back(confidence);
                    class_ids.push_back(class_id.x);

                    float x = data[0];
                    float y = data[1];
                    float w = data[2];
                    float h = data[3];

                    int left = int((x - 0.5 * w) * x_factor);
                    int top = int((y - 0.5 * h) * y_factor);

                    int width = int(w * x_factor);
                    int height = int(h * y_factor);

                    boxes.push_back(cv::Rect(left, top, width, height));
                }
            }
        }

        data += dimensions;
    }

    std::vector<int> nms_result;
    cv::dnn::NMSBoxes(boxes, confidences, modelScoreThreshold, modelNMSThreshold, nms_result);

    std::vector<Detection> detections{};
    for (unsigned long i = 0; i < nms_result.size(); ++i)
    {
        int idx = nms_result[i];

        Detection result;
        result.class_id = class_ids[idx];
        result.confidence = confidences[idx];

        std::random_device rd;
        std::mt19937 gen(rd());
        std::uniform_int_distribution<int> dis(100, 255);
        result.color = cv::Scalar(dis(gen),
                                  dis(gen),
                                  dis(gen));

        result.className = classes[result.class_id];
        result.box = boxes[idx];

        detections.push_back(result);
    }

    return detections;
}

void Inference::loadClassesFromFile()
{
    std::ifstream inputFile(classesPath);
    if (inputFile.is_open())
    {
        std::string classLine;
        while (std::getline(inputFile, classLine))
            classes.push_back(classLine);
        inputFile.close();
    }
}

void Inference::loadOnnxNetwork()
{
    net = cv::dnn::readNetFromONNX(modelPath);
    if (cudaEnabled)
    {
        std::cout << "\nRunning on CUDA" << std::endl;

    }
    else
    {
        std::cout << "\nRunning on CPU" << std::endl;
        net.setPreferableBackend(cv::dnn::DNN_BACKEND_OPENCV);
        net.setPreferableTarget(cv::dnn::DNN_TARGET_CPU);
    }
}

cv::Mat Inference::formatToSquare(const cv::Mat &source)
{
    int col = source.cols;
    int row = source.rows;
    int _max = MAX(col, row);
    cv::Mat result = cv::Mat::zeros(_max, _max, CV_8UC3);
    source.copyTo(result(cv::Rect(0, 0, col, row)));
    return result;
}

main.cpp:

c 复制代码
#include <iostream>
#include <vector>
#include <getopt.h>

#include <opencv2/opencv.hpp>

#include "inference.h"

using namespace std;
//using namespace cv;

int main(int argc, char **argv)
{
    std::string projectBasePath = "/home/build/下载/ultralytics-main-2"; // Set your ultralytics base path

    bool runOnGPU = false;

    //
    // Pass in either:
    //
    // "yolov8s.onnx" or "yolov5s.onnx"
    //
    // To run Inference with yolov8/yolov5 (ONNX)
    //

    // Note that in this example the classes are hard-coded and 'classes.txt' is a place holder.
    Inference inf("/home/build/下载/ultralytics-main-2/yolo11n.onnx", cv::Size(640, 640), "classes.txt", runOnGPU);

    std::vector<std::string> imageNames;
    imageNames.push_back( "/home/build/下载/ultralytics-main-2/ccc.jpg");

    for (int i = 0; i < imageNames.size(); ++i)
    {
        cv::Mat frame = cv::imread(imageNames[i]);

        // Inference starts here...
        std::vector<Detection> output = inf.runInference(frame);

        int detections = output.size();
        std::cout << "Number of detections:" << detections << std::endl;

        for (int i = 0; i < detections; ++i)
        {
            Detection detection = output[i];

            cv::Rect box = detection.box;
            cv::Scalar color = detection.color;

            // Detection box
            cv::rectangle(frame, box, color, 2);

            // Detection box text
            std::string classString = detection.className + ' ' + std::to_string(detection.confidence).substr(0, 4);
            cv::Size textSize = cv::getTextSize(classString, cv::FONT_HERSHEY_DUPLEX, 1, 2, 0);
            cv::Rect textBox(box.x, box.y - 40, textSize.width + 10, textSize.height + 20);

            cv::rectangle(frame, textBox, color, cv::FILLED);
            cv::putText(frame, classString, cv::Point(box.x + 5, box.y - 10), cv::FONT_HERSHEY_DUPLEX, 1, cv::Scalar(0, 0, 0), 2, 0);
        }
        // Inference ends here...

        // This is only for preview purposes
        float scale = 0.8;
        cv::resize(frame, frame, cv::Size(640, 640));
        cv::imshow("Inference", frame);

        cv::waitKey(-1);
    }
}

在main.cpp中修改图片和模型路径,运行:

相关推荐
艾莉丝努力练剑几秒前
【C++:红黑树】深入理解红黑树的平衡之道:从原理、变色、旋转到完整实现代码
大数据·开发语言·c++·人工智能·红黑树
九章云极AladdinEdu4 分钟前
论文分享 | BARD-GS:基于高斯泼溅的模糊感知动态场景重建
人工智能·新视角合成·动态场景重建·运动模糊处理·3d高斯泼溅·模糊感知建模·真实世界数据集
希露菲叶特格雷拉特16 分钟前
PyTorch深度学习笔记(二十)(模型验证测试)
人工智能·pytorch·笔记
NewsMash20 分钟前
PyTorch之父发离职长文,告别Meta
人工智能·pytorch·python
IT_陈寒22 分钟前
Python 3.12新特性实测:10个让你的代码提速30%的隐藏技巧 🚀
前端·人工智能·后端
Ztop25 分钟前
GPT-5.1 已确认!OpenAI下一步推理升级?对决 Gemini 3 在即
人工智能·gpt·chatgpt
qq_4369621830 分钟前
奥威BI:打破数据分析的桎梏,让决策更自由
人工智能·数据挖掘·数据分析
金融Tech趋势派31 分钟前
金融机构如何用企业微信实现客户服务优化?
大数据·人工智能·金融·企业微信·企业微信scrm
大模型真好玩43 分钟前
LangChain1.0速通指南(三)——LangChain1.0 create_agent api 高阶功能
人工智能·langchain·mcp
汉克老师44 分钟前
CCF--LMCC大语言模型能力认证官方样题(第一赛(青少年组)第二部分 程序题 (21--25))
人工智能·语言模型·自然语言处理·lmcc