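C# OnnxRuntime Gaze-LLE gaze target estimation: Gaze-LLE simplifies gaze target estimation by using features from a frozen DINOv2 encoder to predict where a person in the scene is looking.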

Contents

Description

Results

Model information

det_face.onnx

gazelle_dinov2_vitl14_inout_1x3x448x448_1xNx4.onnx

Project

Code

Download

References


Description

GitHub repository: https://github.com/fkryan/gazelle

This is the official implementation for Gaze-LLE, a transformer approach for estimating gaze targets that leverages the power of pretrained visual foundation models. Gaze-LLE provides a streamlined gaze architecture that learns only a lightweight gaze decoder on top of a frozen, pretrained visual encoder (DINOv2). Gaze-LLE learns 1-2 orders of magnitude fewer parameters than prior works and doesn't require any extra input modalities like depth and pose!

Results

Model information

det_face.onnx

Model Properties

Inputs

name: input.1    tensor: Float[1, 3, -1, -1]

Outputs

name: 448    tensor: Float[12800, 1]
name: 471    tensor: Float[3200, 1]
name: 494    tensor: Float[800, 1]
name: 451    tensor: Float[12800, 4]
name: 474    tensor: Float[3200, 4]
name: 497    tensor: Float[800, 4]
name: 454    tensor: Float[12800, 10]
name: 477    tensor: Float[3200, 10]
name: 500    tensor: Float[800, 10]
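
The three row counts above (12800, 3200, 800) are consistent with a multi-stride anchor grid: a 640x640 input, strides of 8, 16 and 32, and 2 anchors per grid cell. That reading is inferred from the shapes only; the model file does not state it, and the class and method names below are hypothetical. A minimal sketch of the arithmetic:

```csharp
using System;

// Hypothetical helper: checks how the det_face.onnx output row counts could
// arise from a multi-stride anchor grid. The 640x640 input size, the strides
// and the 2 anchors per cell are assumptions inferred from the shapes above.
public static class AnchorMath
{
    public static int RowsForStride(int inputWidth, int inputHeight, int stride, int anchorsPerCell)
    {
        int gridW = inputWidth / stride;   // feature-map width at this stride
        int gridH = inputHeight / stride;  // feature-map height at this stride
        return gridW * gridH * anchorsPerCell;
    }

    public static void Main()
    {
        foreach (int stride in new[] { 8, 16, 32 })
        {
            Console.WriteLine($"stride {stride}: {RowsForStride(640, 640, stride, 2)} rows");
        }
    }
}
```

Under the same assumption, the second dimension of each output would hold 1 score, 4 box values and 10 landmark values (5 points) per anchor.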


gazelle_dinov2_vitl14_inout_1x3x448x448_1xNx4.onnx

Model Properties

Inputs

name: image_bgr    tensor: Float[1, 3, 448, 448]
name: bboxes_x1y1x2y2    tensor: Float[1, -1, 4]

Outputs

name: heatmap    tensor: Float[-1, 64, 64]
name: inout    tensor: Float[-1]
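
Each head box yields one 64x64 heatmap, and the gaze target is read off as the heatmap's peak. The sketch below shows one way to turn that peak into image pixel coordinates without OpenCV. It assumes the heatmap has been copied into a row-major float array; `HeatmapPeak` and `Locate` are hypothetical names, not part of the demo project.

```csharp
using System;

// Hypothetical helper, not part of the demo project: finds the peak of one
// 64x64 heatmap (row-major float array) and maps it to image pixel coordinates.
public static class HeatmapPeak
{
    public const int Size = 64; // heatmap side length, from the output shape [-1, 64, 64]

    public static (int X, int Y, float Score) Locate(float[] heatmap, int imageWidth, int imageHeight)
    {
        if (heatmap.Length != Size * Size)
            throw new ArgumentException("expected a 64x64 heatmap", nameof(heatmap));

        // Linear scan for the index of the largest value.
        int best = 0;
        for (int i = 1; i < heatmap.Length; i++)
        {
            if (heatmap[i] > heatmap[best]) best = i;
        }

        int col = best % Size;
        int row = best / Size;

        // Map the centre of the winning cell to image coordinates.
        int x = (int)((col + 0.5f) * imageWidth / Size);
        int y = (int)((row + 0.5f) * imageHeight / Size);
        return (x, y, heatmap[best]);
    }

    public static void Main()
    {
        var heatmap = new float[Size * Size];
        heatmap[10 * Size + 32] = 0.9f; // synthetic peak at row 10, column 32
        var (x, y, score) = Locate(heatmap, 1280, 720);
        Console.WriteLine($"peak at ({x}, {y}), score {score}");
    }
}
```

The listing in this article takes a different route: it works on heatmaps already resized to the image and calls Cv2.MinMaxLoc on them. The model also returns one inout value per head, which the name suggests is an in-frame versus out-of-frame score; the listing does not use it.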


Project

Code

using OpenCvSharp;
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.Windows.Forms;

namespace Onnx_Demo
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        string fileFilter = "*.*|*.bmp;*.jpg;*.jpeg;*.tif;*.tiff;*.png";
        string image_path = "";
        DateTime dt1 = DateTime.Now;
        DateTime dt2 = DateTime.Now;

        Mat image;
        Mat result_image;

        FaceDet face_det;
        GazeLLE gazelle;

        private void button1_Click(object sender, EventArgs e)
        {
            OpenFileDialog ofd = new OpenFileDialog();
            ofd.Filter = fileFilter;
            if (ofd.ShowDialog() != DialogResult.OK) return;
            pictureBox1.Image = null;
            image_path = ofd.FileName;
            pictureBox1.Image = new Bitmap(image_path);
            textBox1.Text = "";
            image = new Mat(image_path);
            pictureBox2.Image = null;
        }

        private void button2_Click(object sender, EventArgs e)
        {
            if (image_path == "")
            {
                return;
            }

            button2.Enabled = false;
            Application.DoEvents();

            image = new Mat(image_path);
            result_image = image.Clone();

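            dt1 = DateTime.Now;

            // Step 1: detect faces; the boxes are drawn and then passed to Gaze-LLE as head boxes.
            List<Bbox> head_boxes = face_det.Detect(image);

            foreach (var item in head_boxes)
            {
                Rect rect = Rect.FromLTRB((int)item.xmin, (int)item.ymin, (int)item.xmax, (int)item.ymax);
                Cv2.Rectangle(result_image, rect, Scalar.Red);
            }

            // Step 2: estimate one gaze heatmap per head box.
            List<Mat> resized_heatmaps = gazelle.Predict(image, head_boxes);
            dt2 = DateTime.Now;

            // Step 3: draw an arrow from each head to its predicted gaze target.
            DrawGaze(result_image, head_boxes, resized_heatmaps);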

            pictureBox2.Image = new Bitmap(result_image.ToMemoryStream());
            textBox1.Text = "Inference time: " + (dt2 - dt1).TotalMilliseconds + "ms";

            button2.Enabled = true;
        }

        // Draws an arrow from each head-box centre to the peak of its gaze heatmap.
        // The heatmaps are expected to be resized to the frame size already, so the
        // peak location is used directly as a pixel coordinate.
        void DrawGaze(Mat frame, List<Bbox> head_boxes, List<Mat> heatmaps, float thr = 0.0f)
        {
            // Guard against a heatmap list that is shorter than the box list.
            int num_box = Math.Min(head_boxes.Count, heatmaps.Count);
            for (int i = 0; i < num_box; i++)
            {
                double minVal, max_score;
                OpenCvSharp.Point minLoc, maxLoc;
                Cv2.MinMaxLoc(heatmaps[i], out minVal, out max_score, out minLoc, out maxLoc);
                if (max_score >= thr)
                {
                    int head_cx = (int)((head_boxes[i].xmin + head_boxes[i].xmax) * 0.5);
                    int head_cy = (int)((head_boxes[i].ymin + head_boxes[i].ymax) * 0.5);

                    Cv2.ArrowedLine(frame, new OpenCvSharp.Point(head_cx, head_cy), maxLoc, new Scalar(0, 255, 0), 2, LineTypes.AntiAlias);
                }
            }
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            face_det = new FaceDet("model\\det_face.onnx");
            gazelle = new GazeLLE("model\\gazelle_dinov2_vitl14_inout_1x3x448x448_1xNx4.onnx");

            image_path = "test_img\\1.jpg";
            pictureBox1.Image = new Bitmap(image_path);
        }

        private void button3_Click(object sender, EventArgs e)
        {
            if (pictureBox2.Image == null)
            {
                return;
            }
            Bitmap output = new Bitmap(pictureBox2.Image);
            SaveFileDialog sdf = new SaveFileDialog();
            sdf.Title = "Save";
            sdf.Filter = "Images (*.jpg)|*.jpg|Images (*.png)|*.png|Images (*.bmp)|*.bmp|Images (*.emf)|*.emf|Images (*.exif)|*.exif|Images (*.gif)|*.gif|Images (*.ico)|*.ico|Images (*.tiff)|*.tiff|Images (*.wmf)|*.wmf";
            if (sdf.ShowDialog() == DialogResult.OK)
            {
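                // FilterIndex is 1-based and follows the order of the entries in sdf.Filter.
                // Note: GDI+ has no encoder for EMF or WMF; per the Image.Save documentation
                // those two cases end up written as PNG data.
                switch (sdf.FilterIndex)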
                {
                    case 1:
                        {
                            output.Save(sdf.FileName, ImageFormat.Jpeg);
                            break;
                        }
                    case 2:
                        {
                            output.Save(sdf.FileName, ImageFormat.Png);
                            break;
                        }
                    case 3:
                        {
                            output.Save(sdf.FileName, ImageFormat.Bmp);
                            break;
                        }
                    case 4:
                        {
                            output.Save(sdf.FileName, ImageFormat.Emf);
                            break;
                        }
                    case 5:
                        {
                            output.Save(sdf.FileName, ImageFormat.Exif);
                            break;
                        }
                    case 6:
                        {
                            output.Save(sdf.FileName, ImageFormat.Gif);
                            break;
                        }
                    case 7:
                        {
                            output.Save(sdf.FileName, ImageFormat.Icon);
                            break;
                        }

                    case 8:
                        {
                            output.Save(sdf.FileName, ImageFormat.Tiff);
                            break;
                        }
                    case 9:
                        {
                            output.Save(sdf.FileName, ImageFormat.Wmf);
                            break;
                        }
                }
                MessageBox.Show("Saved successfully to: " + sdf.FileName);
            }
        }
    }
}
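
The listing uses a `Bbox` type whose definition is not shown. Only the four field names (`xmin`, `ymin`, `xmax`, `ymax`) can be read from the code above; the float type, the `Center` helper and the `BboxDemo` class in this sketch are assumptions added for illustration, so the project's real class may well differ.

```csharp
using System;

// Hypothetical stand-in for the Bbox type used by Form1. Only the four
// coordinate names come from the listing; the float type and everything
// else here are assumptions.
public class Bbox
{
    public float xmin;
    public float ymin;
    public float xmax;
    public float ymax;

    // Integer centre of the box, computed the same way DrawGaze does.
    public (int X, int Y) Center()
    {
        int cx = (int)((xmin + xmax) * 0.5);
        int cy = (int)((ymin + ymax) * 0.5);
        return (cx, cy);
    }
}

public static class BboxDemo
{
    public static void Main()
    {
        var box = new Bbox { xmin = 100f, ymin = 50f, xmax = 180f, ymax = 150f };
        var (cx, cy) = box.Center();
        Console.WriteLine($"head centre: ({cx}, {cy})");
    }
}
```

The `FaceDet` and `GazeLLE` classes are likewise not shown in this article; they are part of the downloadable project.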

Download

Source code download

References

https://github.com/hpc203/Gaze-LLE-onnxrun
