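A Complete Workflow for Custom Image Recognition with YOLO11

This article explains how to build a custom image recognition model with YOLO, covering the full workflow from data annotation to model deployment.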

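1. Overview

What is custom image recognition

Custom image recognition (object detection) means training a model that recognizes a specific set of target classes for a particular business scenario.

Examples include detecting product defects, identifying vehicle parts, and checking whether safety helmets are worn. Unlike a general-purpose model, a custom model recognizes only the classes you define.

Workflow at a glance

Annotate in Label Studio → Export in YOLO format → Write data.yaml → Split the dataset → Train the model → Predict and deploy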

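Step	Tool/Technology	Output
Data annotation	Label Studio	Annotated images
Data export	YOLO with images	images/ + labels/
Configuration	data.yaml	Dataset configuration
Dataset split	Python script	train/val/test
Model training	ultralytics	best.pt model
Prediction and deployment	FastAPI	REST API service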

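2. Environment Setup

2.1 Hardware requirements

Item	Minimum	Recommended
GPU	NVIDIA GPU with 4 GB VRAM (without a GPU, training falls back to the CPU)	NVIDIA GPU with 8 GB+ VRAM
CPU	4 cores	8+ cores
RAM	8 GB	16 GB+
Disk	20 GB free space	SSD with 50 GB+

Note: if you run out of VRAM, lower the batch size (for example from 8 to 4 or 2).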

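2.2 Software environment

Python version: 3.10+

CUDA setup (required for GPU training; skip this if you have no GPU):

Install the NVIDIA CUDA Toolkit, preferably version 12.8 or later.

Install the core dependencies. Install the CUDA build of PyTorch first; otherwise pip may consider the torch requirement already satisfied by the default build pulled in by ultralytics and skip the CUDA wheels:

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
pip install ultralytics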
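
After installation, it can help to confirm which device PyTorch will actually use before starting a long training run. The following is a minimal sketch: `pick_device` is a hypothetical helper name (not part of ultralytics), and it deliberately falls back to "cpu" when torch cannot be imported.

```python
def pick_device() -> str:
    """Return "0" (first CUDA GPU) if PyTorch sees a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return "cpu"
    return "0" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Training device: {pick_device()}")
```

If this prints "cpu" on a machine with an NVIDIA GPU, the installed torch build is probably the CPU-only one, or the GPU driver is missing.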

Example requirements.txt:

ultralytics>=8.3.0
torch>=2.9.0
torchvision>=0.24.0
torchaudio>=2.9.0
Pillow>=10.0.0
opencv-python>=4.8.0
fastapi>=0.115.0
uvicorn>=0.27.0
python-multipart>=0.0.12
pydantic>=2.10.0
pydantic-settings>=2.6.0

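2.3 Project structure

Training project layout:

photo-check-train/
├── train.py              # Training script
├── runs/                 # Training output directory
│   └── train/            # Training results
│       └── exp/          # Experiment directory
│           ├── weights/  # Model files
│           │   ├── best.pt   # Best model
│           │   └── last.pt   # Model from the last epoch
│           └── results.csv   # Training log
├── data/                 # Dataset directory
├── models/               # Pretrained models
├── yolo11n.pt            # Base model file
└── requirements.txt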

Prediction API project layout:

photo-check-predict/
├── app/
│   ├── main.py           # FastAPI entry point
│   ├── config.py         # Configuration management
│   ├── api/
│   │   └── routes.py     # API routes
│   ├── services/
│   │   └── inference.py  # Inference service
│   └── models/
│       └── prediction.py # Data models
├── models/               # Trained models
├── Dockerfile
├── docker-compose.yml
└── .env                  # Environment variable configuration

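3. Data Annotation (Label Studio)

3.1 Create an annotation project

Step 1: Log in to Label Studio

Open your Label Studio address and log in with your account and password.

Step 2: Create a project

Click the "Create Project" button to create a new project.

Step 3: Fill in the project details

Project Name: the project name, for example "Vehicle Photo Check Annotation"
Description: the project description (optional)

Click "Save" to continue.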

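3.2 Configure the labeling template

Step 1: Choose a labeling template

In the project settings, select the "Object Detection" template.

Step 2: Define the label classes

Open the template and define the labels you want to train on, based on your business requirements. For example:

Class name	Description
defect	Defect/anomaly
normal	Normal

Click "Save" to finish the configuration.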

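3.3 Annotating

Step 1: Import the images to annotate

Click the "Import" button and upload the images that need annotation.

Supported image formats: jpg, jpeg, png, bmp

Note: limit the number of images per upload, typically to about 40; larger uploads may fail with an error.

Step 2: Annotate

Select an image to open the annotation view
Select the rectangle tool in the left toolbar
Draw a box around the target region in the image
Select the matching label class

Annotation guidelines:

Boxes should fit tightly around the target; do not leave large margins
Annotate every target; do not skip any
Uncertain targets can be skipped at first and annotated after confirmation
If an image ends up with no annotations, delete it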

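4. Data Export and Processing

4.1 Export the data

Step 1: Open the export dialog

Once annotation is complete, click the "Export" button on the project page.

Step 2: Choose the export format

Select the "YOLO with images" format, which exports both the images and the YOLO-format annotation files.

Step 3: Download and extract

Download the archive and extract it to get the following structure:

export/
├── images/           # Image files
│   ├── img001.jpg
│   ├── img002.jpg
│   └── ...
└── labels/           # Annotation files (.txt)
    ├── img001.txt
    ├── img002.txt
    └── ...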

YOLO annotation format:

Each line has the form: class_id x_center y_center width height (coordinates are normalized to the range 0-1, relative to the image width and height)

0 0.5 0.5 0.3 0.4
1 0.2 0.3 0.1 0.2
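
To make the normalized format concrete, the sketch below converts one label line into pixel corner coordinates. The function name `yolo_line_to_pixels` and the 640x480 image size are illustrative assumptions, not part of any library.

```python
def yolo_line_to_pixels(line: str, img_w: int, img_h: int):
    """Convert one YOLO label line to (class_id, x1, y1, x2, y2) in pixels."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:])
    # The label stores the box center and size; derive the two corners.
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return class_id, x1, y1, x2, y2


# First example line from above, on a hypothetical 640x480 image
print(yolo_line_to_pixels("0 0.5 0.5 0.3 0.4", 640, 480))
```

For that line the box is centered in the image and spans 30% of the width and 40% of the height.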

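4.2 Write data.yaml

Create a data.yaml configuration file in the dataset root directory:

# Dataset paths
path: ./data          # Dataset root directory (relative or absolute path)
train: images/train   # Training images (relative to path)
val: images/val       # Validation images
test: images/test     # Test images (optional)

# Classes
names:                # Class ID to class name mapping
  0: defect
  1: normal

Field reference:

Field	Description	Example
path	Dataset root directory	`./data` or `E:/datasets/mydata`
train	Training image directory	`images/train`
val	Validation image directory	`images/val`
test	Test image directory	`images/test` (optional)
names	Class name mapping	`{0: defect, 1: normal}`

Note: the labels directory structure must mirror images; for example, images/train corresponds to labels/train. If the dataset is not found at training time, try an absolute path for path.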
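
Because each image is paired with its label file purely by path, a quick consistency check can catch missing annotations before training. The sketch below is one possible check, assuming the images/<split> and labels/<split> layout described above; `label_path_for` and `find_unlabeled` are hypothetical helper names.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def label_path_for(image_path: Path) -> Path:
    """Map .../images/<split>/name.jpg to .../labels/<split>/name.txt."""
    parts = list(image_path.parts)
    # Replace the last "images" path component with "labels".
    idx = len(parts) - 1 - parts[::-1].index("images")
    parts[idx] = "labels"
    return Path(*parts).with_suffix(".txt")


def find_unlabeled(root: Path, split: str) -> list[Path]:
    """Return images in images/<split> whose label file is missing or empty."""
    unlabeled = []
    for img in sorted((root / "images" / split).iterdir()):
        if img.suffix.lower() not in IMAGE_EXTS:
            continue
        label = label_path_for(img)
        if not label.exists() or not label.read_text().strip():
            unlabeled.append(img)
    return unlabeled
```

Calling `find_unlabeled(Path("data"), "train")` would list the training images that have no usable label file, which matches the guideline of removing images that ended up without annotations.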

4.3 Split the dataset

Recommended ratios:

Split	Ratio	Purpose
train	80%	Train the model
val	10%	Validation and hyperparameter tuning
test	10%	Final testing
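
As a small worked example of these ratios, the sketch below computes the split sizes for a given image count. It assumes the train and val counts are truncated to whole images and the test set takes the remainder; `split_counts` is a hypothetical helper name.

```python
def split_counts(total: int, train_ratio: float = 0.8, val_ratio: float = 0.1):
    """Return (train, val, test) sizes; test takes whatever remains."""
    train = int(total * train_ratio)
    val = int(total * val_ratio)
    test = total - train - val
    return train, val, test


for n in (10, 105, 1000):
    print(n, split_counts(n))
```

With very small datasets the truncation matters: fewer than 10 images leaves the validation set empty, so it is worth checking the counts before training.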

Example split script:

Alternatively, you can have opencode or another AI agent analyze the dataset, write a split script, and run it to complete the split.

import os
import shutil
import random
from pathlib import Path

def split_dataset(source_images, source_labels, output_dir, train_ratio=0.8, val_ratio=0.1, test_ratio=0.1):
    """
    Split a dataset into training, validation, and test sets.

    Args:
        source_images: source image directory
        source_labels: source label directory
        output_dir: output directory
        train_ratio: fraction of images used for training
        val_ratio: fraction of images used for validation
        test_ratio: fraction of images used for testing (informational only;
            the test set receives whatever remains after train and val)
    """
    # Collect all image files (sorted so the starting order does not depend on the OS)
    image_files = sorted(f for f in os.listdir(source_images)
                         if f.lower().endswith(('.jpg', '.jpeg', '.png', '.bmp')))
    random.shuffle(image_files)

    # Compute the split points
    total = len(image_files)
    train_count = int(total * train_ratio)
    val_count = int(total * val_ratio)

    train_files = image_files[:train_count]
    val_files = image_files[train_count:train_count + val_count]
    test_files = image_files[train_count + val_count:]

    # Create the directory structure
    for split in ['train', 'val', 'test']:
        os.makedirs(os.path.join(output_dir, 'images', split), exist_ok=True)
        os.makedirs(os.path.join(output_dir, 'labels', split), exist_ok=True)

    # Copy the files
    def copy_files(file_list, split):
        for img_file in file_list:
            # Copy the image
            src_img = os.path.join(source_images, img_file)
            dst_img = os.path.join(output_dir, 'images', split, img_file)
            shutil.copy(src_img, dst_img)

            # Copy the label (the .txt file with the same stem), if it exists
            label_file = Path(img_file).stem + '.txt'
            src_label = os.path.join(source_labels, label_file)
            dst_label = os.path.join(output_dir, 'labels', split, label_file)
            if os.path.exists(src_label):
                shutil.copy(src_label, dst_label)

    copy_files(train_files, 'train')
    copy_files(val_files, 'val')
    copy_files(test_files, 'test')

    print(f"Split complete: {len(train_files)} training, {len(val_files)} validation, {len(test_files)} test images")

# Usage example
split_dataset(
    source_images='export/images',
    source_labels='export/labels',
    output_dir='./data',
    train_ratio=0.8,
    val_ratio=0.1,
    test_ratio=0.1
)

Directory structure after the split:

data/
├── images/
│   ├── train/         # Training images
│   │   ├── img001.jpg
│   │   └── ...
│   ├── val/           # Validation images
│   │   ├── img101.jpg
│   │   └── ...
│   └── test/          # Test images
│       ├── img201.jpg
│       └── ...
├── labels/
│   ├── train/         # Training labels
│   │   ├── img001.txt
│   │   └── ...
│   ├── val/           # Validation labels
│   │   ├── img101.txt
│   │   └── ...
│   └── test/          # Test labels
│       ├── img201.txt
│       └── ...
└── data.yaml          # Configuration file

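5. Model Training

5.1 The training script

train.py is a complete YOLO training script with command-line configuration and automatic device detection.

Core logic of the script:

Parse the command-line arguments
Auto-detect the GPU/CPU device
Load the pretrained model
Build the training configuration (data augmentation, regularization, and so on)
Run training and validation

Full training script: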

#!/usr/bin/env python3
"""YOLO11 training script for a custom dataset"""

import os
import sys
import argparse
from ultralytics import YOLO


def parse_args():
    parser = argparse.ArgumentParser(description="YOLO11 training script")
    parser.add_argument("--model", type=str, default="yolo11n.pt", help="path to the pretrained model")
    parser.add_argument("--data", type=str, default="data.yaml", help="path to the dataset configuration file")
    parser.add_argument("--epochs", type=int, default=200, help="number of training epochs")
    parser.add_argument("--batch", type=int, default=8, help="batch size")
    parser.add_argument("--imgsz", type=int, default=640, help="input image size")
    parser.add_argument("--device", type=str, default=None, help="training device (0 = first GPU, cpu = CPU)")
    parser.add_argument("--cache", type=str, default="false", help="dataset caching (true/disk/false)")
    return parser.parse_args()


def train_model(args):
    # 1. Auto-detect the device when none is given
    device = args.device
    if device is None:
        import torch
        device = "0" if torch.cuda.is_available() else "cpu"
        print(f"Using device: {device}")

    # 2. Load the model
    print(f"Loading pretrained model: {args.model}")
    model = YOLO(args.model)

    # 3. Build the training configuration
    train_config = {
        "data": args.data,
        "epochs": args.epochs,
        "batch": args.batch,
        "imgsz": args.imgsz,
        "device": device,
        "workers": 0,  # Windows compatibility: no data-loading worker processes
        "project": "runs/train",
        "name": "exp",
        "lr0": 0.003,
        "optimizer": "AdamW",
        "patience": 15,      # early stopping after 15 epochs without improvement
        "cos_lr": True,
        "mosaic": 1.0,
        "mixup": 0.3,
        "dropout": 0.15,
        "verbose": True,
        "plots": True,
        "save": True,
    }

    # Caching configuration (left at the library default when "false")
    if args.cache.lower() == "true":
        train_config["cache"] = True
    elif args.cache.lower() == "disk":
        train_config["cache"] = "disk"

    # 4. Start training
    results = model.train(**train_config)

    # 5. Report the results
    save_dir = model.trainer.save_dir
    print("\n✅ Training complete!")
    print(f"Output directory: {save_dir}")
    print(f"Best model: {save_dir}/weights/best.pt")
    return results


if __name__ == "__main__":
    args = parse_args()
    if not os.path.exists(args.data):
        print(f"❌ Error: dataset configuration file not found: {args.data}")
        sys.exit(1)
    train_model(args)

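Command-line arguments:

Argument	Description	Default	Example
--model	Pretrained model path (n is the smallest and fastest, x the largest and most accurate)	yolo11n.pt	`--model yolo11s.pt`
--data	Dataset configuration file path	data.yaml	`--data ./data.yaml`
--epochs	Number of training epochs	200	`--epochs 300`
--batch	Batch size; lower it when VRAM runs out	8	`--batch 4`
--imgsz	Input image size	640	`--imgsz 416`
--device	Training device	auto-detected	`--device 0`
--cache	Dataset caching (true/disk/false)	false	`--cache disk`

Tip: workers=0 is a Windows compatibility setting that disables multi-process data loading.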

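5.2 Training configuration overview

Training parameters built into the script (usually no need to change them):

Parameter	Value	Description
lr0	0.003	Initial learning rate
optimizer	AdamW	Optimizer type
patience	15	Early-stopping patience (epochs without improvement)
cos_lr	True	Cosine learning-rate schedule
mosaic	1.0	Mosaic data augmentation
mixup	0.3	Mixup data augmentation
dropout	0.15	Dropout regularization (the Ultralytics documentation describes this setting as applying to classification tasks, so it may have little or no effect on detection training)
project	runs/train	Output root directory
name	exp	Experiment name (a numeric suffix is added automatically if the directory already exists)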

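5.3 Run training

Basic training command:

python train.py --model yolo11n.pt --data data.yaml --epochs 200 --batch 8 --device 0

Full training command (recommended; shown with Windows cmd line continuations):

python train.py ^
    --model yolo11n.pt ^
    --data data.yaml ^
    --epochs 200 ^
    --batch 8 ^
    --imgsz 640 ^
    --device 0 ^
    --cache disk

Windows tip: cmd continues a line with ^, and PowerShell uses a backtick (`). On Linux/macOS shells, use a backslash (\) instead.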

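Training output files (in runs/train/exp/, as determined by project="runs/train" and name="exp" in the script):

runs/train/exp/
├── weights/
│   ├── best.pt           # Best model (use this one for deployment)
│   └── last.pt           # Model from the last epoch
├── results.csv           # Training metrics log
├── results.png           # Training curves
├── confusion_matrix.png  # Confusion matrix
├── F1_curve.png          # F1 score curve
├── PR_curve.png          # Precision-recall curve
└── val_batch0_pred.jpg   # Sample predictions on the validation set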

5.4 Evaluate the training results

View the training logs:

Log file: runs/train/exp/results.csv
Curve plots: runs/train/exp/results.png
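
The log can also be inspected programmatically. The sketch below finds the epoch with the highest mAP in results.csv. It is written under assumptions: the column names "epoch" and "metrics/mAP50-95(B)" are what recent ultralytics detection runs typically write, so check the header of your own file; `best_epoch` is a hypothetical helper name.

```python
import csv
from pathlib import Path


def best_epoch(csv_path, metric="metrics/mAP50-95(B)"):
    """Return (epoch, value) for the row with the highest `metric`."""
    with Path(csv_path).open(newline="") as f:
        reader = csv.DictReader(f)
        # Some versions pad column names and values with spaces, so strip both.
        rows = [{k.strip(): v.strip() for k, v in row.items()} for row in reader]
    if not rows:
        raise ValueError("results file has no data rows")
    best = max(rows, key=lambda r: float(r[metric]))
    return int(float(best["epoch"])), float(best[metric])


if __name__ == "__main__":
    path = Path("runs/train/exp/results.csv")
    if path.exists():
        epoch, value = best_epoch(path)
        print(f"Best mAP50-95 {value:.4f} at epoch {epoch}")
```

If the best epoch is far from the last one, the metric has likely plateaued or started to degrade, which is a useful starting point for the analysis.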

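Analyze with a large language model:

Send a screenshot of the training log or the contents of results.csv to an LLM such as Yuanbao, Doubao, or ChatGPT. Example prompt:

"This is the training log of my YOLO model. Please help me analyze it:

Has mAP converged?

Are there signs of overfitting?

What optimizations do you suggest?"

The LLM will interpret the metrics and give targeted suggestions.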

6. Model Prediction and Deployment

6.1 Local batch prediction test

Use the following script to run prediction on a batch of images and save the results:

from ultralytics import YOLO
import os

# Load the trained model
model = YOLO("runs/train/exp/weights/best.pt")

# Collect the images to predict
images = []
for folder_name, subfolders, filenames in os.walk("./test_images"):
    for filename in filenames:
        if filename.lower().endswith(('.jpg', '.jpeg', '.png', '.bmp')):
            full_path = os.path.join(folder_name, filename)
            images.append(full_path)

# Batch prediction
# save=True saves result images with the detection boxes drawn
# project sets the output root; YOLO creates a predict subdirectory under it
results = model(images, conf=0.7, save=True, project="test_result")

# Inspect the results
for r in results:
    print(f"Detections: {r.boxes}")
    print(f"Save directory: {r.save_dir}")

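Parameters:

Parameter	Description	Example
conf	Confidence threshold	0.7 (only results with confidence above 0.7 are kept)
save	Whether to save result images	True
project	Root directory for saved results	"test_result"

Note: with save=True, YOLO automatically creates a predict subdirectory under the project directory. With project="test_result", the results are saved in test_result/predict/ (later runs get a numeric suffix such as predict2).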

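6.2 Deploy the prediction API

Prerequisite: before deploying, copy the trained model into the models directory of the API project:

# Copy from the training project to the prediction project
# (assumes photo-check-predict sits next to the training project; adjust the path to your layout)
cp runs/train/exp/weights/best.pt ../photo-check-predict/models/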

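Core code of the API project:

1. Configuration management (app/config.py):

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # Values are read from .env; protected_namespaces is narrowed so that the
    # model_path field does not clash with pydantic's reserved "model_" prefix
    model_config = SettingsConfigDict(env_file=".env", protected_namespaces=("settings_",))

    app_name: str = "YOLO Prediction API"
    app_version: str = "1.0.0"
    model_path: str = "models/best.pt"
    max_upload_size: int = 10 * 1024 * 1024  # 10MB
    allowed_extensions: str = ".jpg,.jpeg,.png,.bmp"

settings = Settings()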

2. Response models (app/models/prediction.py):

from pydantic import BaseModel
from typing import List

class BoundingBox(BaseModel):
    # Corner coordinates in pixels
    x1: float
    y1: float
    x2: float
    y2: float

class DetectionResult(BaseModel):
    class_id: int
    class_name: str
    confidence: float
    bbox: BoundingBox

class PredictionResponse(BaseModel):
    success: bool
    message: str
    detections: List[DetectionResult]
    processing_time: float  # seconds

3. Inference service (app/services/inference.py):

from ultralytics import YOLO
from PIL import Image
import io
import time
from typing import List, Tuple

class ModelInferenceService:
    def __init__(self):
        self.model = None
        self.class_names = {}

    def load_model(self, model_path: str):
        """Load the model and cache its class names"""
        self.model = YOLO(model_path)
        self.class_names = self.model.names if self.model.names else {}

    def predict(self, image_bytes: bytes, conf_threshold: float = 0.7) -> Tuple[List[dict], float]:
        """Run prediction and return (detections, processing time in seconds)"""
        if self.model is None:
            raise RuntimeError("Model is not loaded; call load_model() first")
        start_time = time.time()

        # Preprocessing: decode the uploaded bytes into an RGB image
        image = Image.open(io.BytesIO(image_bytes)).convert("RGB")

        # Inference
        results = self.model.predict(source=image, conf=conf_threshold, verbose=False)

        # Extract the results (one image in, so one result out)
        detections = []
        result = results[0]
        if result.boxes is not None:
            for box in result.boxes:
                xyxy = box.xyxy[0].cpu().numpy()
                confidence = float(box.conf[0].cpu().numpy())
                class_id = int(box.cls[0].cpu().numpy())

                detections.append({
                    "class_id": class_id,
                    "class_name": self.class_names.get(class_id, f"class_{class_id}"),
                    "confidence": confidence,
                    "bbox": {
                        "x1": float(xyxy[0]),
                        "y1": float(xyxy[1]),
                        "x2": float(xyxy[2]),
                        "y2": float(xyxy[3])
                    }
                })

        processing_time = time.time() - start_time
        return detections, processing_time

# Global service instance
model_service = ModelInferenceService()

4. API routes (app/api/routes.py):

from fastapi import APIRouter, File, UploadFile, HTTPException
from app.config import settings
from app.services.inference import model_service
from app.models.prediction import PredictionResponse

router = APIRouter(prefix="/api/v1", tags=["prediction"])

# Allowed upload extensions, taken from the configuration
ALLOWED_EXTENSIONS = tuple(ext.strip().lower() for ext in settings.allowed_extensions.split(","))

@router.post("/predict", response_model=PredictionResponse)
async def predict_image(file: UploadFile = File(...), conf_threshold: float = 0.7):
    """Image prediction endpoint"""
    # Validate the file type
    if not file.filename or not file.filename.lower().endswith(ALLOWED_EXTENSIONS):
        raise HTTPException(status_code=400, detail="Unsupported file type")

    # Read the image and enforce the configured size limit
    image_bytes = await file.read()
    if len(image_bytes) > settings.max_upload_size:
        raise HTTPException(status_code=413, detail="File too large")

    # Make sure the model is loaded
    if model_service.model is None:
        model_service.load_model(settings.model_path)

    # Run prediction
    detections, processing_time = model_service.predict(image_bytes, conf_threshold)

    return PredictionResponse(
        success=True,
        message="Prediction succeeded",
        detections=detections,
        processing_time=processing_time
    )

@router.get("/health")
async def health_check():
    """Health check"""
    return {"status": "healthy", "model_loaded": model_service.model is not None}

5. Application entry point (app/main.py):

from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from app.config import settings
from app.api.routes import router
from app.services.inference import model_service


@asynccontextmanager
async def lifespan(app: FastAPI):
    """Load the model once at startup (replaces the deprecated on_event hook)"""
    model_service.load_model(settings.model_path)
    print(f"✅ Model loaded: {settings.model_path}")
    yield


app = FastAPI(title=settings.app_name, version=settings.app_version, lifespan=lifespan)

# CORS configuration (allows all origins; restrict this in production)
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)

# Register the routes
app.include_router(router)

Start the service:

# Development (auto-reload)
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

# Production (each worker process loads its own copy of the model)
uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4

Environment variables (.env):

MODEL_PATH=models/best.pt
MAX_UPLOAD_SIZE=10485760
ALLOWED_EXTENSIONS=.jpg,.jpeg,.png,.bmp

Example call (Python):

import requests

url = "http://localhost:8000/api/v1/predict"
params = {"conf_threshold": 0.7}

# Open the image in a with block so the file handle is closed after the upload
with open("test_image.jpg", "rb") as f:
    response = requests.post(url, files={"file": f}, params=params)
response.raise_for_status()
result = response.json()

print(f"Detected {len(result['detections'])} objects")
for det in result['detections']:
    print(f"  - {det['class_name']}: {det['confidence']:.2f}")

Example response:

{
  "success": true,
  "message": "Prediction succeeded",
  "detections": [
    {
      "class_id": 0,
      "class_name": "defect",
      "confidence": 0.95,
      "bbox": {"x1": 100.0, "y1": 200.0, "x2": 300.0, "y2": 400.0}
    }
  ],
  "processing_time": 0.15
}
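
A client usually needs more than the raw JSON, for example the size of each detected box. The sketch below parses a response with the shape shown above using only the standard library; `summarize` is a hypothetical helper name, and the embedded sample is illustrative data.

```python
import json


def summarize(response_text: str):
    """Return (class_name, confidence, width, height) for each detection."""
    data = json.loads(response_text)
    rows = []
    for det in data["detections"]:
        box = det["bbox"]
        width = box["x2"] - box["x1"]
        height = box["y2"] - box["y1"]
        rows.append((det["class_name"], det["confidence"], width, height))
    return rows


sample = """
{"success": true, "message": "ok", "processing_time": 0.15,
 "detections": [{"class_id": 0, "class_name": "defect", "confidence": 0.95,
                 "bbox": {"x1": 100.0, "y1": 200.0, "x2": 300.0, "y2": 400.0}}]}
"""
for name, conf, w, h in summarize(sample):
    print(f"{name}: conf={conf:.2f}, box={w:.0f}x{h:.0f} px")
```

The bbox values are pixel corner coordinates, so width and height follow directly from the differences.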

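7. Common Problems and Optimization

7.1 Training problems

Out of VRAM (CUDA out of memory):

Lower the batch size: --batch 4 or --batch 2
Reduce the image size: --imgsz 416
Turn off data caching: --cache false

Loss does not converge:

Check that the annotations are correct (missing or wrong labels)
Lower the learning rate: change lr0 in train.py
Train for more epochs: --epochs 300 (early stopping with patience=15 may still end training sooner)

Overfitting (training loss falls while validation loss rises):

Add more data or use data augmentation
Increase dropout: change dropout in train.py
Use a smaller model: yolo11n instead of yolo11l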
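
The overfitting symptom described above (training loss still falling while validation loss rises) can be turned into a simple check on the per-epoch loss values. This is only a rough heuristic sketch: `looks_overfit` and the window of 5 epochs are illustrative assumptions, and the loss lists would come from your own results.csv.

```python
def looks_overfit(train_loss, val_loss, window=5):
    """Heuristic: over the last `window` epochs, train loss fell while val loss rose."""
    if len(train_loss) != len(val_loss) or len(train_loss) < window + 1:
        return False  # not enough data to judge
    train_delta = train_loss[-1] - train_loss[-1 - window]
    val_delta = val_loss[-1] - val_loss[-1 - window]
    return train_delta < 0 and val_delta > 0


# Made-up loss curves for illustration
train = [1.00, 0.90, 0.80, 0.70, 0.60, 0.50]
val_rising = [1.00, 0.90, 0.85, 0.90, 0.95, 1.10]
val_falling = [1.00, 0.95, 0.90, 0.85, 0.80, 0.75]
print(looks_overfit(train, val_rising), looks_overfit(train, val_falling))
```

A single comparison like this is sensitive to noise, so it is best used as a prompt to look at the curves in results.png rather than as a decision rule.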

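7.2 Prediction problems

Insufficient detection accuracy:

If low-confidence false detections dominate, raise the confidence threshold: conf=0.8
Check that the test images follow the same distribution as the training data
Add more training samples for the affected class

Handling false positives and missed detections:

False positives: raise the confidence threshold
Missed detections (false negatives): lower the confidence threshold, or add hard examples and retrain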
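
The trade-off between the two error types can be seen on a toy example. The sketch below filters a made-up list of detections at several thresholds and counts false positives and missed objects; the `is_true` flag and all values are invented for illustration, and `filter_by_conf` is a hypothetical helper name.

```python
def filter_by_conf(detections, threshold):
    """Keep detections whose confidence is at least `threshold`."""
    return [d for d in detections if d["confidence"] >= threshold]


# Made-up detections: is_true marks whether the box matches a real object
sample_dets = [
    {"confidence": 0.95, "is_true": True},
    {"confidence": 0.75, "is_true": True},
    {"confidence": 0.72, "is_true": False},
    {"confidence": 0.55, "is_true": True},
    {"confidence": 0.40, "is_true": False},
]

total_true = sum(d["is_true"] for d in sample_dets)
for t in (0.5, 0.7, 0.9):
    kept = filter_by_conf(sample_dets, t)
    false_pos = sum(not d["is_true"] for d in kept)
    missed = total_true - sum(d["is_true"] for d in kept)
    print(f"conf>={t}: kept={len(kept)} false_positives={false_pos} missed={missed}")
```

Raising the threshold removes false positives at the cost of more missed objects, so the right value depends on which error is more expensive in your scenario.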

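7.3 Data problems

Class imbalance:

When a class has too few samples, use data augmentation (rotation, flipping, color adjustment)
Or duplicate minority-class samples and apply slight transformations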
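
Before rebalancing, it helps to measure how imbalanced the dataset actually is. The sketch below counts annotated boxes per class ID across a labels directory, assuming YOLO-format .txt files as exported earlier; `count_classes` is a hypothetical helper name.

```python
from collections import Counter
from pathlib import Path


def count_classes(labels_dir) -> Counter:
    """Count annotated boxes per class ID across all .txt files in a directory."""
    counts = Counter()
    for txt in Path(labels_dir).glob("*.txt"):
        for line in txt.read_text().splitlines():
            fields = line.split()
            if fields:  # skip blank lines
                counts[int(fields[0])] += 1
    return counts


if __name__ == "__main__":
    labels = Path("data/labels/train")
    if labels.is_dir():
        for class_id, n in sorted(count_classes(labels).items()):
            print(f"class {class_id}: {n} boxes")
```

Comparing the counts against the names mapping in data.yaml shows which classes need more samples.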

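Data augmentation strategies (train.py sets mosaic and mixup explicitly; the others use the ultralytics defaults):

mosaic: combines 4 images into one
mixup: blends two images
hsv_h/hsv_s/hsv_v: color jitter
fliplr/flipud: horizontal/vertical flips (flipud is disabled by default)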

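8. Summary

Key steps recap

Step	Action	Output
1. Annotate	Draw boxes around targets in Label Studio	Annotated data
2. Export	Export as YOLO with images	images/ + labels/
3. Configure	Write data.yaml	Dataset configuration
4. Split	Split into train/val/test by ratio	Training/validation/test sets
5. Train	python train.py	best.pt model
6. Deploy	FastAPI service	REST API

Best practices checklist

At least 100-200 samples per target class
Annotation boxes fit tightly around the target
Train/validation/test split of 8:1:1
When VRAM runs out, lower the batch size first
After training, analyze the logs with an LLM
Before deploying, validate the model with batch prediction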
