yolov11-cpp-opencv-dnn推理onnx模型

文章目录


前言

随着深度学习技术的不断发展,目标检测算法在计算机视觉领域扮演着越来越重要的角色。YOLO(You Only Look Once)系列算法因其高效、实时的特点,在目标检测领域取得了显著成果。YOLOv11作为该系列的新成员,继承了前代算法的优势,并在性能上有了进一步的提升。

本文将介绍YOLOv11-cpp-opencv-dnn推理的实现方法,通过YOLOv11算法与OpenCV库相结合,我们可以在保持算法高效性的同时,提高代码的可读性和可维护性。此外,C++语言的高性能特性使得该推理方法在处理大规模数据集时具有更好的实时性。


一、前期准备:

本次系统为Ubuntu,使用的编译器为QT,安装opencv的方式为源码安装,安装方式参考博客:值得注意的是,opencv版本一定要大于4.7.0,不然在推理时候会报错,opencv安装好之后在.pro

工程中链接opencv,方法如下所示:

c 复制代码
TEMPLATE = app
CONFIG += console c++11
CONFIG -= app_bundle
CONFIG -= qt

SOURCES += \
        inference.cpp \
        main.cpp

HEADERS += \
    inference.h

INCLUDEPATH +=  /usr/local/include/opencv4 \
            /usr/local/include/opencv4/opencv2


unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_video
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_dnn
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_highgui
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_objdetect
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_calib3d
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_core
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_features2d
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_flann
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_imgproc
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_imgcodecs
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_videoio
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_stitching
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_gapi
unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_photo

二、使用步骤

在工程下创建三个文件,分别是inference.h,inference.cpp,main.cpp,其内容分别如下所示:

inference.h:

c 复制代码
#ifndef INFERENCE_H
#define INFERENCE_H

// Cpp native
#include <fstream>
#include <vector>
#include <string>
#include <random>

// OpenCV / DNN / Inference
#include <opencv2/imgproc.hpp>
#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>

struct Detection
{
    int class_id{0};
    std::string className{};
    float confidence{0.0};
    cv::Scalar color{};
    cv::Rect box{};
};

class Inference
{
public:
    Inference(const std::string &onnxModelPath, const cv::Size &modelInputShape = {640, 640}, const std::string &classesTxtFile = "", const bool &runWithCuda = false);
    std::vector<Detection> runInference(const cv::Mat &input);

private:
    void loadClassesFromFile();
    void loadOnnxNetwork();
    cv::Mat formatToSquare(const cv::Mat &source);

    std::string modelPath{};
    std::string classesPath{};
    bool cudaEnabled{};

    std::vector<std::string> classes{"person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch", "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush"};

    cv::Size2f modelShape{};

    float modelConfidenceThreshold {0.25};
    float modelScoreThreshold      {0.45};
    float modelNMSThreshold        {0.50};

    bool letterBoxForSquare = true;

    cv::dnn::Net net;
};

#endif // INFERENCE_H

inference.cpp:

c 复制代码
#include "inference.h"

Inference::Inference(const std::string &onnxModelPath, const cv::Size &modelInputShape, const std::string &classesTxtFile, const bool &runWithCuda)
{
    modelPath = onnxModelPath;
    modelShape = modelInputShape;
    classesPath = classesTxtFile;
    cudaEnabled = runWithCuda;

    loadOnnxNetwork();
    // loadClassesFromFile(); The classes are hard-coded for this example
}

std::vector<Detection> Inference::runInference(const cv::Mat &input)
{
    cv::Mat modelInput = input;
    if (letterBoxForSquare && modelShape.width == modelShape.height)
        modelInput = formatToSquare(modelInput);

    cv::Mat blob;
    cv::dnn::blobFromImage(modelInput, blob, 1.0/255.0, modelShape, cv::Scalar(), true, false);
    net.setInput(blob);

    std::vector<cv::Mat> outputs;
    net.forward(outputs, net.getUnconnectedOutLayersNames());

    int rows = outputs[0].size[1];
    int dimensions = outputs[0].size[2];

    bool yolov8 = false;
    // yolov5 has an output of shape (batchSize, 25200, 85) (Num classes + box[x,y,w,h] + confidence[c])
    // yolov8 has an output of shape (batchSize, 84,  8400) (Num classes + box[x,y,w,h])
    if (dimensions > rows) // Check if the shape[2] is more than shape[1] (yolov8)
    {
        yolov8 = true;
        rows = outputs[0].size[2];
        dimensions = outputs[0].size[1];

        outputs[0] = outputs[0].reshape(1, dimensions);
        cv::transpose(outputs[0], outputs[0]);
    }
    float *data = (float *)outputs[0].data;

    float x_factor = modelInput.cols / modelShape.width;
    float y_factor = modelInput.rows / modelShape.height;

    std::vector<int> class_ids;
    std::vector<float> confidences;
    std::vector<cv::Rect> boxes;

    for (int i = 0; i < rows; ++i)
    {
        if (yolov8)
        {
            float *classes_scores = data+4;

            cv::Mat scores(1, classes.size(), CV_32FC1, classes_scores);
            cv::Point class_id;
            double maxClassScore;

            minMaxLoc(scores, 0, &maxClassScore, 0, &class_id);

            if (maxClassScore > modelScoreThreshold)
            {
                confidences.push_back(maxClassScore);
                class_ids.push_back(class_id.x);

                float x = data[0];
                float y = data[1];
                float w = data[2];
                float h = data[3];

                int left = int((x - 0.5 * w) * x_factor);
                int top = int((y - 0.5 * h) * y_factor);

                int width = int(w * x_factor);
                int height = int(h * y_factor);

                boxes.push_back(cv::Rect(left, top, width, height));
            }
        }
        else // yolov5
        {
            float confidence = data[4];

            if (confidence >= modelConfidenceThreshold)
            {
                float *classes_scores = data+5;

                cv::Mat scores(1, classes.size(), CV_32FC1, classes_scores);
                cv::Point class_id;
                double max_class_score;

                minMaxLoc(scores, 0, &max_class_score, 0, &class_id);

                if (max_class_score > modelScoreThreshold)
                {
                    confidences.push_back(confidence);
                    class_ids.push_back(class_id.x);

                    float x = data[0];
                    float y = data[1];
                    float w = data[2];
                    float h = data[3];

                    int left = int((x - 0.5 * w) * x_factor);
                    int top = int((y - 0.5 * h) * y_factor);

                    int width = int(w * x_factor);
                    int height = int(h * y_factor);

                    boxes.push_back(cv::Rect(left, top, width, height));
                }
            }
        }

        data += dimensions;
    }

    std::vector<int> nms_result;
    cv::dnn::NMSBoxes(boxes, confidences, modelScoreThreshold, modelNMSThreshold, nms_result);

    std::vector<Detection> detections{};
    for (unsigned long i = 0; i < nms_result.size(); ++i)
    {
        int idx = nms_result[i];

        Detection result;
        result.class_id = class_ids[idx];
        result.confidence = confidences[idx];

        std::random_device rd;
        std::mt19937 gen(rd());
        std::uniform_int_distribution<int> dis(100, 255);
        result.color = cv::Scalar(dis(gen),
                                  dis(gen),
                                  dis(gen));

        result.className = classes[result.class_id];
        result.box = boxes[idx];

        detections.push_back(result);
    }

    return detections;
}

void Inference::loadClassesFromFile()
{
    std::ifstream inputFile(classesPath);
    if (inputFile.is_open())
    {
        std::string classLine;
        while (std::getline(inputFile, classLine))
            classes.push_back(classLine);
        inputFile.close();
    }
}

void Inference::loadOnnxNetwork()
{
    net = cv::dnn::readNetFromONNX(modelPath);
    if (cudaEnabled)
    {
        std::cout << "\nRunning on CUDA" << std::endl;

    }
    else
    {
        std::cout << "\nRunning on CPU" << std::endl;
        net.setPreferableBackend(cv::dnn::DNN_BACKEND_OPENCV);
        net.setPreferableTarget(cv::dnn::DNN_TARGET_CPU);
    }
}

cv::Mat Inference::formatToSquare(const cv::Mat &source)
{
    int col = source.cols;
    int row = source.rows;
    int _max = MAX(col, row);
    cv::Mat result = cv::Mat::zeros(_max, _max, CV_8UC3);
    source.copyTo(result(cv::Rect(0, 0, col, row)));
    return result;
}

main.cpp:

c 复制代码
#include <iostream>
#include <vector>
#include <getopt.h>

#include <opencv2/opencv.hpp>

#include "inference.h"

using namespace std;
//using namespace cv;

int main(int argc, char **argv)
{
    std::string projectBasePath = "/home/build/下载/ultralytics-main-2"; // Set your ultralytics base path

    bool runOnGPU = false;

    //
    // Pass in either:
    //
    // "yolov8s.onnx" or "yolov5s.onnx"
    //
    // To run Inference with yolov8/yolov5 (ONNX)
    //

    // Note that in this example the classes are hard-coded and 'classes.txt' is a place holder.
    Inference inf("/home/build/下载/ultralytics-main-2/yolo11n.onnx", cv::Size(640, 640), "classes.txt", runOnGPU);

    std::vector<std::string> imageNames;
    imageNames.push_back( "/home/build/下载/ultralytics-main-2/ccc.jpg");

    for (int i = 0; i < imageNames.size(); ++i)
    {
        cv::Mat frame = cv::imread(imageNames[i]);

        // Inference starts here...
        std::vector<Detection> output = inf.runInference(frame);

        int detections = output.size();
        std::cout << "Number of detections:" << detections << std::endl;

        for (int i = 0; i < detections; ++i)
        {
            Detection detection = output[i];

            cv::Rect box = detection.box;
            cv::Scalar color = detection.color;

            // Detection box
            cv::rectangle(frame, box, color, 2);

            // Detection box text
            std::string classString = detection.className + ' ' + std::to_string(detection.confidence).substr(0, 4);
            cv::Size textSize = cv::getTextSize(classString, cv::FONT_HERSHEY_DUPLEX, 1, 2, 0);
            cv::Rect textBox(box.x, box.y - 40, textSize.width + 10, textSize.height + 20);

            cv::rectangle(frame, textBox, color, cv::FILLED);
            cv::putText(frame, classString, cv::Point(box.x + 5, box.y - 10), cv::FONT_HERSHEY_DUPLEX, 1, cv::Scalar(0, 0, 0), 2, 0);
        }
        // Inference ends here...

        // This is only for preview purposes
        float scale = 0.8;
        cv::resize(frame, frame, cv::Size(640, 640));
        cv::imshow("Inference", frame);

        cv::waitKey(-1);
    }
}

在main.cpp中修改图片和模型路径,运行:

相关推荐
爆改模型4 分钟前
【CVPR2025】计算机视觉|PX:让模型训练“事半功倍”!
人工智能·计算机视觉
weixin_446260854 小时前
轻松实现浏览器自动化——AI浏览器自动化框架Stagehand
运维·人工智能·自动化
张子夜 iiii5 小时前
(0️⃣基础)程序控制语句(初学者)(第3天)
人工智能·python
xiaoxiaoxiaolll5 小时前
双驱智造革命:物理方程+工业数据训练,突破增材制造温度场预测瓶颈
人工智能·深度学习·学习·制造
CareyWYR6 小时前
高效智能体设计:如何在不牺牲效果的前提下降低成本?
人工智能
Sui_Network7 小时前
Walrus 与 Pipe Network 集成,提升多链带宽并降低延迟
人工智能·web3·区块链·智能合约·量子计算
攻城狮7号7 小时前
GPT-OSS重磅开源:当OpenAI重拾“开放”初心
人工智能·openai·开源大模型·gpt-oss
我不是小upper7 小时前
什么是键值缓存?让 LLM 闪电般快速
人工智能·缓存·llm
2zcode7 小时前
基于Matlab图像处理的黄豆自动计数系统设计与实现
图像处理·人工智能·matlab