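YOLO Object Annotation and Detection

Task: Create a window screenshot program

The user asked for a Python program that finds the window with title "Phone-A2NMVB1916011354" and class "Qt5152QWindowIcon", takes a screenshot of it, and rotates the image 90 degrees counterclockwise if it is in portrait orientation.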

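Implementation steps:

Created the file window_screenshot.py
Used win32gui to find the window
Used win32ui to capture the window
Used PIL to rotate the image

Features:

Finds the specified window automatically
Captures the window contents
Detects the image orientation and rotates portrait images
Saves the screenshot to a file

Key code: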

import win32gui

# Find the first visible window whose title contains `title` and whose class equals `class_name`
def find_window(title, class_name):
    def enum_windows_callback(hwnd, extra):
        if win32gui.IsWindowVisible(hwnd):
            window_title = win32gui.GetWindowText(hwnd)
            window_class = win32gui.GetClassName(hwnd)
            if title in window_title and window_class == class_name:
                extra.append(hwnd)

    windows = []
    win32gui.EnumWindows(enum_windows_callback, windows)
    return windows[0] if windows else None

# Check for portrait orientation (taller than wide)
def is_portrait(image):
    width, height = image.size
    return height > width

# Rotate portrait images; PIL's rotate() turns the image counterclockwise
def rotate_if_portrait(image):
    if is_portrait(image):
        # expand=True resizes the canvas so the rotated image is not cropped
        return image.rotate(90, expand=True)
    return image
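
To make the effect of the rotation step concrete, the following pure-Python sketch models an image as a list of pixel rows and applies the same counterclockwise quarter turn that PIL's rotate(90, expand=True) performs. It is only an illustration: the helper names is_portrait_size and rotate_ccw are hypothetical and are not part of window_screenshot.py.

```python
def is_portrait_size(width, height):
    """Same rule as is_portrait(): taller than wide means portrait."""
    return height > width


def rotate_ccw(rows):
    """Rotate a grid of pixels 90 degrees counterclockwise.

    The right-hand column of the input becomes the top row of the output,
    so a grid with H rows and W columns comes back with W rows and H columns.
    """
    return [list(col) for col in zip(*rows)][::-1]


# A tiny "portrait" image: 3 rows high, 2 columns wide.
portrait = [
    ["a", "b"],
    ["c", "d"],
    ["e", "f"],
]
height, width = len(portrait), len(portrait[0])

rotated = rotate_ccw(portrait) if is_portrait_size(width, height) else portrait
print(rotated)  # [['b', 'd', 'f'], ['a', 'c', 'e']]
```

After the turn the image is 2 rows high and 3 columns wide, which is why the rotated screenshot comes out in landscape orientation.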

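Task: Change the save rule

The user asked for screenshots to be saved with a timestamp as the file name. For a portrait screenshot, the original image is saved with the prefix "竖版" (portrait), and the copy rotated 90 degrees counterclockwise is saved under the plain timestamp name. A portrait screenshot therefore produces two files.

Changes:

Added timestamp generation
Portrait: two files are saved, the original portrait image (with the "竖版" prefix) and the rotated image
Landscape: only one file is saved

Key code: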

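from datetime import datetime

# Build a timestamp string for file names
def get_timestamp():
    return datetime.now().strftime("%Y%m%d_%H%M%S")

# Save logic (`screenshot` is the PIL image produced by the capture step)
timestamp = get_timestamp()
if is_portrait(screenshot):
    # Save the original portrait image ("竖版" means "portrait")
    portrait_path = f"竖版_{timestamp}.png"
    screenshot.save(portrait_path)

    # Rotate 90 degrees counterclockwise and save
    rotated = screenshot.rotate(90, expand=True)
    rotated_path = f"{timestamp}.png"
    rotated.save(rotated_path)
else:
    # Landscape: save as captured
    output_path = f"{timestamp}.png"
    screenshot.save(output_path)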
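
The naming rule can also be written as a small pure function, which makes it easy to check without taking a real screenshot. This is a sketch under the rule described in this task only; planned_files is a hypothetical helper name, not a function in window_screenshot.py.

```python
def planned_files(width, height, timestamp):
    """Return (file_name, needs_rotation) pairs for one screenshot."""
    if height > width:  # portrait
        return [
            (f"竖版_{timestamp}.png", False),  # original portrait image
            (f"{timestamp}.png", True),        # copy rotated 90 degrees counterclockwise
        ]
    return [(f"{timestamp}.png", False)]       # landscape: saved as captured


portrait_plan = planned_files(1042, 1942, "20260211_124813")
landscape_plan = planned_files(1942, 1042, "20260211_124813")
print(portrait_plan)
print(landscape_plan)
```

A portrait capture yields two entries and a landscape capture yields one, matching the rule above.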

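Task: Fix the screenshot size

The user found that the screenshot was 1280x657, while the actual window is 1942x1042.

Causes:

The first version used GetClientRect, which returns only the client area and excludes the title bar and borders
DPI scaling made the size calculation wrong

Fix:

Use GetWindowRect to get the size of the whole window
Enable DPI awareness so that sizes are reported in physical pixels
Use the PrintWindow API to capture the full window contents

Final result:

The screenshot now has the expected size of 1942x1042

Key code: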

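import ctypes
from ctypes import wintypes

import win32gui

user32 = ctypes.windll.user32

# Enable DPI awareness, falling back to older APIs when a newer one is unavailable
try:
    # -4 is DPI_AWARENESS_CONTEXT_PER_MONITOR_AWARE_V2
    user32.SetThreadDpiAwarenessContext(wintypes.HANDLE(-4))
except (AttributeError, OSError):
    try:
        ctypes.windll.shcore.SetProcessDpiAwareness(2)  # PROCESS_PER_MONITOR_DPI_AWARE
    except (AttributeError, OSError):
        user32.SetProcessDPIAware()

# Capture the whole window
def capture_window(hwnd):
    # GetWindowRect returns the full window rectangle, including title bar and borders
    left, top, right, bottom = win32gui.GetWindowRect(hwnd)
    width = right - left
    height = bottom - top

    # (Excerpt: creation of saveDC and its width x height bitmap is not shown)

    # PrintWindow with flag 2 (PW_RENDERFULLCONTENT) renders the full window contents
    result = user32.PrintWindow(hwnd, saveDC.GetSafeHdc(), 2)

    # (Excerpt: conversion of the bitmap into the PIL image `im` is not shown)
    return im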
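
The excerpt above leaves out the device-context and bitmap handling. One common way to fill that gap with pywin32 and Pillow is sketched below. It is an assumption about how the omitted part might look, not the actual contents of window_screenshot.py; the names rect_size and capture_window_sketch are hypothetical. The Windows-only imports sit inside the function so that the pure helper can be used on any platform.

```python
def rect_size(rect):
    """Width and height of a (left, top, right, bottom) rectangle."""
    left, top, right, bottom = rect
    return right - left, bottom - top


def capture_window_sketch(hwnd):
    """Capture a whole window with PrintWindow; returns a PIL image or None."""
    # Windows-only dependencies (pywin32, Pillow), imported lazily.
    import ctypes
    import win32gui
    import win32ui
    from PIL import Image

    width, height = rect_size(win32gui.GetWindowRect(hwnd))

    # Device contexts: the window's DC, an MFC wrapper, and a memory DC to draw into.
    hwnd_dc = win32gui.GetWindowDC(hwnd)
    mfc_dc = win32ui.CreateDCFromHandle(hwnd_dc)
    save_dc = mfc_dc.CreateCompatibleDC()

    # Bitmap with the full window size, selected into the memory DC.
    bitmap = win32ui.CreateBitmap()
    bitmap.CreateCompatibleBitmap(mfc_dc, width, height)
    save_dc.SelectObject(bitmap)

    try:
        # Flag 2 is PW_RENDERFULLCONTENT.
        ok = ctypes.windll.user32.PrintWindow(hwnd, save_dc.GetSafeHdc(), 2)
        if not ok:
            return None
        info = bitmap.GetInfo()
        bits = bitmap.GetBitmapBits(True)
        # The bitmap stores pixels as BGRX; convert to an RGB PIL image.
        return Image.frombuffer(
            "RGB", (info["bmWidth"], info["bmHeight"]), bits, "raw", "BGRX", 0, 1
        )
    finally:
        # Release GDI resources whether or not the capture succeeded.
        win32gui.DeleteObject(bitmap.GetHandle())
        save_dc.DeleteDC()
        mfc_dc.DeleteDC()
        win32gui.ReleaseDC(hwnd, hwnd_dc)


print(rect_size((100, 50, 2042, 1092)))  # (1942, 1042)
```

DPI awareness still has to be enabled before the call. Otherwise GetWindowRect may report scaled, logical coordinates and the bitmap will be too small.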

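Task: Create a YOLO object detection program

The user asked to run YOLO object detection on the image 20260211_124813.png.

Implementation steps:

Created the file yolo_detect.py
Downloaded the YOLOv3 model files (yolov3.weights, yolov3.cfg, coco.names)
Used OpenCV's DNN module for object detection
Ran the detection and saved the result

Detection result:

Image size: 776x369
No objects detected (0)

Key code: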

import cv2
import numpy as np

# Load the YOLO model
def load_yolo_model():
    net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")
    # Output layer names; this avoids index-shape differences between OpenCV versions
    output_layers = net.getUnconnectedOutLayersNames()
    return net, output_layers

# Detect objects
def detect_objects(image_path, net, output_layers):
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    height, width = image.shape[:2]
    color = (0, 255, 0)  # box color in BGR order (green, chosen arbitrarily)
    blob = cv2.dnn.blobFromImage(image, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
    net.setInput(blob)
    outs = net.forward(output_layers)

    # Parse the detections
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.5:
                # An object was detected; convert normalized center/size to pixels
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                # Draw the bounding box
                cv2.rectangle(image, (x, y), (x + w, y + h), color, 2)
    return image
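
The excerpt draws every box whose confidence exceeds 0.5. When an object is found, YOLO typically reports several overlapping boxes for it, so detection scripts usually follow this step with non-maximum suppression (OpenCV offers cv2.dnn.NMSBoxes for this). The source does not show whether yolo_detect.py does so. The pure-Python sketch below only illustrates the idea; the function names and the 0.4 IoU threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.4):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (50, 50, 10, 10)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.68 and is dropped; the third box is far away and is kept.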

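Task: Build a YOLO training dataset

The user asked how to build a training set for custom objects.

Approach:

Created the labeling tool yolo_label_tool.py
Created the training script train_yolo.py
Provided a complete workflow for building the dataset:
- Label the data
- Split the dataset
- Create the configuration file
- Train the model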
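
The splitting step of this workflow can be sketched as follows. The source does not show how the split was actually done, so the function name split_dataset, the 20% validation ratio, and the fixed seed are all assumptions made for illustration.

```python
import random


def split_dataset(image_names, val_ratio=0.2, seed=0):
    """Shuffle image names reproducibly and split them into (train, val)."""
    names = sorted(image_names)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(names)
    # Keep at least one validation image whenever there are two or more images.
    n_val = max(1, int(len(names) * val_ratio)) if len(names) > 1 else 0
    return names[n_val:], names[:n_val]


images = [f"img_{i:03d}.png" for i in range(10)]
train, val = split_dataset(images)
print(len(train), len(val))  # 8 2
```

Each image and its label file would then be copied into the matching train or val directory.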

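Labeling tool features:

Draw bounding boxes with the mouse
Select the class with the number keys
Save label files automatically
Label multiple images in one session

Key code: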

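import cv2

# Core of the labeling tool
class YOLOLabelTool:
    def __init__(self, image_dir, class_names=None):
        self.image_dir = image_dir
        self.class_names = class_names or [
            "unused",      # 0 - unused
            "npc",         # 1 - NPC
            "pet",         # 2 - pet
            "button",      # 3 - button
        ]
        self.image = None         # current image; set when an image is loaded (loader not shown)
        self.current_class = 1    # class chosen with the number keys (initial value assumed)
        self.current_labels = []  # labels of the current image
        self.drawing = False
        self.start_x = self.start_y = 0

    def mouse_callback(self, event, x, y, flags, param):
        if event == cv2.EVENT_LBUTTONDOWN:
            self.drawing = True
            self.start_x, self.start_y = x, y
        elif event == cv2.EVENT_LBUTTONUP:
            self.drawing = False
            self.add_label(x, y)

    def add_label(self, x, y):
        # Image size and box corners, whatever the drag direction
        h, w = self.image.shape[:2]
        x1, x2 = sorted((self.start_x, x))
        y1, y2 = sorted((self.start_y, y))
        box_w, box_h = x2 - x1, y2 - y1
        # Compute normalized coordinates
        x_center = (x1 + x2) / 2 / w
        y_center = (y1 + y2) / 2 / h
        norm_w = box_w / w
        norm_h = box_h / h
        self.current_labels.append((self.current_class, x_center, y_center, norm_w, norm_h))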
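
The tuple stored by add_label follows the field order of the usual YOLO label format: one text line per box, holding the class index followed by center x, center y, width, and height, all normalized to the range 0 to 1. The sketch below shows the conversion from a dragged pixel rectangle to such a line. Whether yolo_label_tool.py writes exactly this layout is not shown in the source, and to_yolo_line is a hypothetical helper.

```python
def to_yolo_line(class_id, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel rectangle into one YOLO label line."""
    # Accept the two corners in any order (the drag may go right-to-left or bottom-to-top).
    left, right = sorted((x1, x2))
    top, bottom = sorted((y1, y2))
    x_center = (left + right) / 2 / img_w
    y_center = (top + bottom) / 2 / img_h
    norm_w = (right - left) / img_w
    norm_h = (bottom - top) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {norm_w:.6f} {norm_h:.6f}"


# A box dragged from (300, 150) back to (100, 50) on an 800x400 image, class 1 (npc).
line = to_yolo_line(1, 300, 150, 100, 50, 800, 400)
print(line)  # 1 0.250000 0.250000 0.250000 0.250000
```

By the usual convention, the lines for one image go into a .txt file with the same base name as the image.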

Task: Change the labeling tool's classes

The user asked to change the number-key mapping: 1 selects npc, 2 selects pet, and 3 selects button.

Changes:

Updated the class definitions in yolo_label_tool.py
Number-key mapping: 0 = unused, 1 = npc, 2 = pet, 3 = button
Initial classes before the change: "player", "enemy", "npc", "item", "monster", "boss", "skill_effect", "ui_button", "health_bar", "minimap"

Key code:

# Class definitions
self.class_names = class_names or [
    "unused",      # 0 - unused
    "npc",         # 1 - NPC
    "pet",         # 2 - pet
    "button",      # 3 - button
]

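Task: Fix the training data

Training failed with an error: no training images were found.

Causes:

The images/train/ directory was empty
The class names in dataset.yaml did not match

Fix:

Copied the test-set images into the training set
Updated the class configuration in dataset.yaml

Key code: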

# dataset.yaml configuration
path: D:\code\python\auto\menghuanmobile
train: images/train
val: images/train  # validation reuses the training images

# Classes
nc: 4
names: ['unused', 'npc', 'pet', 'button']
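
An empty images/train directory can be caught before training starts with a short pre-flight check. The sketch below assumes that label files live in a parallel labels/train directory and share the image's base name. That layout is a common YOLO convention but is not confirmed by the source, and check_split is a hypothetical helper.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def check_split(root, split="train"):
    """Return (image_paths, names_of_images_without_label) for one split."""
    image_dir = Path(root) / "images" / split
    label_dir = Path(root) / "labels" / split
    if not image_dir.is_dir():
        return [], []
    images = sorted(p for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)
    missing = [p.name for p in images if not (label_dir / f"{p.stem}.txt").is_file()]
    return images, missing


# Demonstration on a throwaway directory: two images, only one of them labeled.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "images" / "train").mkdir(parents=True)
    (root / "labels" / "train").mkdir(parents=True)
    (root / "images" / "train" / "a.png").touch()
    (root / "images" / "train" / "b.png").touch()
    (root / "labels" / "train" / "a.txt").touch()
    images, missing = check_split(root)
    image_names = [p.name for p in images]
    empty_images, _ = check_split(root, "val")  # split directory that does not exist

print(image_names, missing)  # ['a.png', 'b.png'] ['b.png']
```

An empty image list for the train split corresponds to the error described in this task.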

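Task: Change the timestamp time zone

The user asked for window_screenshot.py to use China time for the timestamps in file names.

Changes:

Imported the ZoneInfo class
Used the Asia/Shanghai time zone to get the current time

Key code: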

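from datetime import datetime
# Import the time zone module
from zoneinfo import ZoneInfo

# Get a timestamp in China time
def get_timestamp():
    """Return the current time as a timestamp string (China time zone)."""
    china_tz = ZoneInfo("Asia/Shanghai")
    return datetime.now(china_tz).strftime("%Y%m%d_%H%M%S")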
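
One caveat on Windows: zoneinfo relies on an IANA time zone database, which Windows does not ship, so ZoneInfo("Asia/Shanghai") raises ZoneInfoNotFoundError unless the tzdata package from PyPI is installed. The sketch below adds a fallback to a fixed UTC+8 offset, which is equivalent for current dates because China observes UTC+8 all year. The helper names china_tz and format_timestamp are hypothetical and not part of window_screenshot.py.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def china_tz():
    """Return Asia/Shanghai, or a fixed UTC+8 offset if no tz database is available."""
    try:
        return ZoneInfo("Asia/Shanghai")
    except ZoneInfoNotFoundError:
        return timezone(timedelta(hours=8))


def format_timestamp(moment):
    """Format an aware datetime as a China-time file-name timestamp."""
    return moment.astimezone(china_tz()).strftime("%Y%m%d_%H%M%S")


# 04:48:13 UTC corresponds to 12:48:13 in China.
utc_moment = datetime(2026, 2, 11, 4, 48, 13, tzinfo=timezone.utc)
stamp = format_timestamp(utc_moment)
print(stamp)  # 20260211_124813
```

In the screenshot program the same formatting would be applied to datetime.now(china_tz()).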

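Task: Change the save rule

The user asked to change window_screenshot.py so that no file with the "竖版" (portrait) prefix is written: a portrait screenshot now produces only one file.

Changes:

Removed the code that wrote the file with the "竖版" prefix
Portrait: only the rotated image is saved
Landscape: unchanged

Key code: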

# Save logic
if is_portrait(screenshot):
    # Rotate 90 degrees counterclockwise and save (only one file is written)
    rotated = screenshot.rotate(90, expand=True)
    output_path = f"{timestamp}.png"
    rotated.save(output_path)
else:
    # Landscape: save as captured
    output_path = f"{timestamp}.png"
    screenshot.save(output_path)

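Summary

The following was completed:

Window screenshot program (target window selection, automatic rotation, China-time timestamps)
YOLO object detection program
YOLO dataset labeling and training tools
Fixes for several technical problems (DPI scaling, training data configuration, and others)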