Reproducing YOLO-UniOW Open-Set Object Detection Locally

This post does not produce technology; it only relays it!

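Preface

YOLO-UniOW is an open-set object detection model recently released by a team at Tsinghua University. It carries on the YOLO family's tradition of speed and has good support for the categories of open datasets such as COCO and LVIS. This post does not cover the theory or the paper; it only records the problems encountered while reproducing the model locally, along with their solutions.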

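Environment Setup

Project repository: THU-MIG/YOLO-UniOW on GitHub (YOLO-UniOW: Efficient Universal Open-World Object Detection), https://github.com/THU-MIG/YOLO-UniOW

The Tsinghua team provides an environment setup procedure. The author has made some adjustments to make it friendlier to beginners. The steps are as follows: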

conda create -n yolouniow python=3.9
conda activate yolouniow
pip install torch==2.1.2 torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu118
pip install mmcv==2.1.0 -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.1/index.html
git clone https://github.com/THU-MIG/YOLO-UniOW.git
cd YOLO-UniOW
pip install -r requirements.txt
pip install -e .
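
The torch and mmcv commands above have to agree on both the CUDA tag (cu118) and the torch major.minor version (2.1), because the mmcv wheel index is organised by those two values. As a minimal sketch, the hypothetical helper below (mmcv_index_url is not part of the project) rebuilds the index URL from the two pins, so a changed torch version is not silently paired with the wrong mmcv wheels. It assumes the index keeps the dist/<cuda tag>/torch<major.minor>/ layout seen in the command above.

```shell
# Hypothetical helper (not part of YOLO-UniOW): build the mmcv wheel index URL.
# $1: CUDA tag used for the torch wheels, e.g. cu118
# $2: full torch version, e.g. 2.1.2 (only major.minor is used in the URL)
mmcv_index_url() {
    printf 'https://download.openmmlab.com/mmcv/dist/%s/torch%s/index.html\n' \
        "$1" "$(printf '%s' "$2" | cut -d. -f1,2)"
}

# Usage (commented out so that sourcing this file installs nothing):
# pip install mmcv==2.1.0 -f "$(mmcv_index_url cu118 2.1.2)"
```

With the versions used in this post, the helper reproduces the URL from the install command above.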

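Local Inference (Images)

Before running local inference, download three things: the YOLO-UniOW weights, the CLIP weights, and the LVIS annotation file. The YOLO-UniOW weights are downloaded from the links in the GitHub repository. The CLIP weights are downloaded automatically by the code. The LVIS file, lvis_v1_minival_inserted_image_name.json, can be found by searching for its name (for example on Baidu) and downloaded from Hugging Face. For readers with poor network access, the author provides alternative downloads for all three items, all free of charge.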
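
Since a missing file usually only surfaces as an error after the model has started loading, it can help to check for the downloads first. The following is a minimal sketch with a hypothetical helper, check_files, that reports every path that is not an existing file. The paths in the usage comment are the ones used in the inference script in this post. Where the LVIS JSON has to be placed depends on the config file, so that path is left for the reader to fill in.

```shell
# Hypothetical helper (not part of YOLO-UniOW): report missing files.
# Prints one "missing: <path>" line to stderr per absent file and
# returns non-zero if at least one file is missing.
check_files() {
    status=0
    for path in "$@"; do
        if [ ! -f "$path" ]; then
            echo "missing: $path" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage (commented out; add the LVIS JSON path that your config expects):
# check_files \
#     ./demo/yolo_uniow_l_lora_bn_5e-4_100e_8gpus_obj365v1_goldg_train_lvis_minival.pth \
#     ./demo/src.jpg || exit 1
```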

Inference script

In the YOLO-UniOW project directory, create a script named infer.sh with the following content:

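# Positional arguments, in order: config file, weights path, image path, text prompt.
# --threshold is the score threshold; --output-dir is the output directory.
# Comments cannot follow a trailing backslash, so they are collected here.
python ./demo/image_demo.py \
    ./configs/pretrain/yolo_uniow_l_lora_bn_5e-4_100e_8gpus_obj365v1_goldg_train_lvis_minival.py \
    ./demo/yolo_uniow_l_lora_bn_5e-4_100e_8gpus_obj365v1_goldg_train_lvis_minival.pth \
    ./demo/src.jpg \
    'white cars' \
    --topk 100 \
    --threshold 0.05 \
    --output-dir ./demo/output/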

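The ./configs/pretrain/ directory provides three config files, one for each of the three released weights. The author uses the L model, so the L config file must be used.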
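
In the script above, the weights file and the config file share the same base name and differ only in extension and directory. Assuming the other two model sizes follow the same naming convention (this post only confirms it for the L model), the config path can be derived from the weights path so the two cannot drift apart. A minimal sketch with a hypothetical helper, config_for_checkpoint:

```shell
# Hypothetical helper (not part of YOLO-UniOW).
# Assumption: <name>.pth pairs with ./configs/pretrain/<name>.py.
# $1: path to the weights file
config_for_checkpoint() {
    printf './configs/pretrain/%s.py\n' "$(basename "$1" .pth)"
}

# Usage (commented out so that sourcing this file runs nothing):
# CKPT=./demo/yolo_uniow_l_lora_bn_5e-4_100e_8gpus_obj365v1_goldg_train_lvis_minival.pth
# python ./demo/image_demo.py "$(config_for_checkpoint "$CKPT")" "$CKPT" \
#     ./demo/src.jpg 'white cars' --output-dir ./demo/output/
```

If a downloaded weights file has been renamed, the derived config path will not exist, so it is worth confirming the file is there before launching inference.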

Running the script

conda activate yolouniow
sh infer.sh