Artificial Intelligence, Vision Domain: Computer Vision, Chapter 15: Simple Object Recognition

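Preface: Simple Object Recognition

Learning objective: master simple object recognition based on shape analysis (contours + polygon approximation) and color recognition (HSV range segmentation), so that you can automatically identify circles, rectangles, triangles, and objects of a specific color in images or video.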

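1. Intuition: What Is "Simple Object Recognition"?

Imagine you are playing a picture-spotting game:

- The picture contains a red ball, a blue block, and a green triangle.
- You have to call out quickly: "That's a red circle!", "That's a blue rectangle!"

✅ Simple object recognition = classifying regular objects by combining shape + color

- It does not rely on deep learning.
- It suits scenarios such as industrial sorting, educational robots, and toy recognition.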

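2. Two Core Technical Approaches

| Method | Principle | Typical use |
| --- | --- | --- |
| Shape recognition | Analyze geometric features of the contour (vertex count, area-to-perimeter ratio) | Regular geometric figures (circle, square, triangle) |
| Color recognition | Select regions of a specific color in HSV color space | Colored objects (red ball, yellow banana, green traffic light) |

💡 Best practice: combine shape and color to improve accuracy.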

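3. Shape Recognition: From Contours to Geometric Classification

3.1 Core Steps

1. Binarize: obtain a clean foreground.
2. Find contours: `cv2.findContours()`
3. Approximate with a polygon: `cv2.approxPolyDP()` → yields the vertices.
4. Classify by vertex count.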

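3.2 Key Function: `cv2.approxPolyDP()`

```python
epsilon = 0.02 * cv2.arcLength(contour, True)
approx = cv2.approxPolyDP(contour, epsilon, True)
```

- `epsilon`: approximation tolerance, the maximum distance allowed between the contour and its approximation (smaller = closer fit, more vertices).
- Return value: the array of polygon vertices.

📌 Vertex count ↔ shape:

- 3 points → triangle
- 4 points → quadrilateral (possibly a rectangle or a square)
- More than 10 points → circle (or ellipse)

The vertex count of a curved shape depends on `epsilon`: with a large tolerance a circle can collapse to fewer than 10 vertices, so tune `epsilon` or add the circularity check described in section 7.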

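4. Color Recognition: Why HSV Instead of RGB?

4.1 The Problem with RGB

- A red object reads as (50, 0, 0) in dim light and (255, 0, 0) in bright light.
- A change in brightness alone makes the color values swing widely.

4.2 The Advantage of HSV

- H (Hue): the color itself (0° to 360°, e.g. red = 0°, green = 120°)
- S (Saturation): how pure the color is (pure color vs. gray)
- V (Value): brightness (bright vs. dark)

✅ Fixed H range + loose S/V bounds → robust color recognition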

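4.3 HSV Ranges in OpenCV (Watch Out!)

| Color | H range (OpenCV) | Note |
| --- | --- | --- |
| Red | [0, 10] ∪ [170, 180] | Red wraps around the 0° boundary |
| Green | [40, 80] | --- |
| Blue | [100, 140] | --- |
| Yellow | [20, 40] | --- |

⚠️ For 8-bit images OpenCV stores H as degrees / 2, so it runs from 0 to 179 (not 0 to 360). An upper bound of 180 selects the same pixels as 179.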

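5. The Complete Recognition Pipeline

1. Input: an image or a video frame.
2. Choose the recognition basis:
   - Shape only: grayscale → binarize → `findContours`
   - Color only: BGR → HSV → `inRange` → `findContours`
   - Shape + color: BGR → HSV → `inRange` → binarize → `findContours`
3. Get the vertices with `approxPolyDP`.
4. Branch on the vertex count:
   - 3 → triangle
   - 4 → quadrilateral
   - more than 10 → circle
5. Combine the shape with the color label for output.
6. Draw the result.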

6. Hands-On Code Examples

Example 1: Shape Recognition (Contours Only)

```python
import cv2
import numpy as np

def detect_shape(approx, contour):
    """Classify a shape from its polygon approximation and its original contour."""
    num_vertices = len(approx)
    if num_vertices == 3:
        return "Triangle"
    elif num_vertices == 4:
        # Distinguish squares from rectangles by bounding-box aspect ratio
        x, y, w, h = cv2.boundingRect(approx)
        aspect_ratio = float(w) / h
        if 0.95 <= aspect_ratio <= 1.05:
            return "Square"
        else:
            return "Rectangle"
    elif num_vertices == 5:
        return "Pentagon"
    elif num_vertices > 10:
        return "Circle"
    else:
        # At epsilon = 2% of the perimeter a circle can reduce to about 8
        # vertices, so fall back to circularity (1.0 for an ideal circle)
        perimeter = cv2.arcLength(contour, True)
        if perimeter == 0:
            return "Unknown"
        circularity = 4 * np.pi * cv2.contourArea(contour) / (perimeter ** 2)
        return "Circle" if circularity > 0.8 else "Unknown"

# Create a test image (or read a real one)
img = np.zeros((500, 800, 3), dtype=np.uint8)
cv2.rectangle(img, (50, 50), (150, 150), (255, 255, 255), -1)
cv2.circle(img, (300, 100), 60, (255, 255, 255), -1)
pts = np.array([[500, 50], [550, 150], [450, 150]], np.int32)
cv2.fillPoly(img, [pts], (255, 255, 255))

# Convert to grayscale, then binarize
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# Find the outer contours
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Analyze each contour
result = img.copy()
for cnt in contours:
    area = cv2.contourArea(cnt)
    if area < 500:  # skip small noise blobs
        continue

    # Polygon approximation
    epsilon = 0.02 * cv2.arcLength(cnt, True)
    approx = cv2.approxPolyDP(cnt, epsilon, True)

    # Classify the shape
    shape = detect_shape(approx, cnt)

    # Draw the contour and label it at the first vertex
    cv2.drawContours(result, [cnt], -1, (0, 255, 0), 2)
    x, y = int(approx[0][0][0]), int(approx[0][0][1])
    cv2.putText(result, shape, (x, y), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)

cv2.imshow('Shape Recognition', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
```

Example 2: Color Recognition (HSV Range Segmentation)

```python
import cv2
import numpy as np

def get_color_mask(hsv, color_name):
    """Return a binary mask for the named color."""
    if color_name == 'red':
        # Red wraps around H = 0, so combine two hue ranges
        lower1 = np.array([0, 100, 100])
        upper1 = np.array([10, 255, 255])
        lower2 = np.array([170, 100, 100])
        upper2 = np.array([180, 255, 255])
        mask1 = cv2.inRange(hsv, lower1, upper1)
        mask2 = cv2.inRange(hsv, lower2, upper2)
        return cv2.bitwise_or(mask1, mask2)
    elif color_name == 'green':
        lower = np.array([40, 100, 100])
        upper = np.array([80, 255, 255])
        return cv2.inRange(hsv, lower, upper)
    elif color_name == 'blue':
        lower = np.array([100, 100, 100])
        upper = np.array([140, 255, 255])
        return cv2.inRange(hsv, lower, upper)
    else:
        # Unknown color name: return an empty mask
        return np.zeros(hsv.shape[:2], dtype=np.uint8)

# Read the image (a colorful test image works best)
img = cv2.imread('color_balls.jpg')
if img is None:
    # File not found: create a synthetic color test image instead
    img = np.zeros((400, 600, 3), dtype=np.uint8)
    cv2.circle(img, (100, 100), 50, (0, 0, 255), -1)   # BGR: red
    cv2.circle(img, (300, 100), 50, (0, 255, 0), -1)   # green
    cv2.circle(img, (500, 100), 50, (255, 0, 0), -1)   # blue

hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Segment the red objects
red_mask = get_color_mask(hsv, 'red')

# Morphological denoising: open removes specks, close fills small holes
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
red_mask = cv2.morphologyEx(red_mask, cv2.MORPH_OPEN, kernel)
red_mask = cv2.morphologyEx(red_mask, cv2.MORPH_CLOSE, kernel)

# Find the outer contours
contours, _ = cv2.findContours(red_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

result = img.copy()
for cnt in contours:
    if cv2.contourArea(cnt) > 500:
        cv2.drawContours(result, [cnt], -1, (0, 255, 255), 2)  # yellow outline
        x, y, w, h = cv2.boundingRect(cnt)
        cv2.putText(result, "Red Object", (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 255), 2)

cv2.imshow('Color Recognition (Red)', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
```

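Example 3: Combined Shape + Color Recognition (Complete System)

```python
import cv2
import numpy as np

# HSV ranges per color; red needs two ranges because it wraps around H = 0
color_ranges = {
    'red': [([0, 100, 100], [10, 255, 255]), ([170, 100, 100], [180, 255, 255])],
    'green': [([40, 100, 100], [80, 255, 255])],
    'blue': [([100, 100, 100], [140, 255, 255])],
}

def detect_shape(approx, contour):
    n = len(approx)
    if n == 3:
        return "Triangle"
    if n == 4:
        return "Rectangle"
    if n > 10:
        return "Circle"
    # At epsilon = 2% of the perimeter a circle can reduce to about 8
    # vertices, so also accept a high circularity (1.0 for an ideal circle)
    perimeter = cv2.arcLength(contour, True)
    if n > 5 and perimeter > 0:
        circularity = 4 * np.pi * cv2.contourArea(contour) / (perimeter ** 2)
        if circularity > 0.8:
            return "Circle"
    return "Unknown"

# Read the image. Chaining imread calls with `or` raises ValueError when the
# first call returns an array, so test for None explicitly. The synthetic
# fallback image below is an illustrative stand-in for a real photo.
img = cv2.imread('shapes_colors.jpg')
if img is None:
    img = np.zeros((400, 600, 3), dtype=np.uint8)
    cv2.circle(img, (100, 200), 60, (0, 0, 255), -1)             # red circle
    cv2.rectangle(img, (230, 140), (370, 260), (0, 255, 0), -1)  # green rectangle
    pts = np.array([[500, 130], [570, 270], [430, 270]], np.int32)
    cv2.fillPoly(img, [pts], (255, 0, 0))                         # blue triangle
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

result = img.copy()
kernel = np.ones((5, 5), np.uint8)

for color_name, ranges in color_ranges.items():
    # Merge the masks of all hue ranges for this color
    mask = np.zeros(hsv.shape[:2], dtype=np.uint8)
    for lower, upper in ranges:
        mask = cv2.bitwise_or(mask, cv2.inRange(hsv, np.array(lower), np.array(upper)))

    # Denoise
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

    # Find the outer contours
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    for cnt in contours:
        if cv2.contourArea(cnt) < 800:
            continue

        # Shape recognition
        epsilon = 0.02 * cv2.arcLength(cnt, True)
        approx = cv2.approxPolyDP(cnt, epsilon, True)
        shape = detect_shape(approx, cnt)

        # Draw the bounding box and the combined label
        x, y, w, h = cv2.boundingRect(cnt)
        label = f"{color_name} {shape}"
        cv2.rectangle(result, (x, y), (x+w, y+h), (0, 255, 0), 2)
        cv2.putText(result, label, (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)

cv2.imshow('Color + Shape Recognition', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
```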

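7. Common Problems and Tuning Tips

❓ Problem 1: Color recognition is strongly affected by lighting?

✅ Fix:

- Adjust the lower bounds of S and V (e.g. S > 50, V > 50).
- Preprocess with adaptive thresholding or histogram equalization.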

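❓ Problem 2: Shape recognition is inaccurate (e.g. a circle is classified as a polygon)?

✅ Fix:

- Tune `epsilon` (try 0.01 to 0.05 of the perimeter).
- Add a circularity check:

```python
area = cv2.contourArea(cnt)
perimeter = cv2.arcLength(cnt, True)
circularity = 4 * np.pi * area / (perimeter ** 2)
if circularity > 0.8:  # 1.0 for an ideal circle
    shape = "Circle"
```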

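❓ Problem 3: How do I recognize more colors (e.g. orange, purple)?

✅ Fix: look up the HSV color wheel and set the matching H range (on OpenCV's 0 to 179 scale):

- Orange: [10, 25]
- Purple: [140, 170]

These ranges overlap their neighbors slightly (orange shares 20 to 25 with the yellow range from section 4.3), so narrow them if both colors appear in the same scene.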

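✅ Chapter Summary

| Technique | Key operations | Functions |
| --- | --- | --- |
| Shape recognition | Contour → polygon approximation → vertex count | `approxPolyDP()` |
| Color recognition | BGR → HSV → inRange | `cv2.cvtColor()`, `cv2.inRange()` |
| Combined recognition | Segment by color first, then analyze shape | Used together |

🌟 You can now:

- Have a robot sort red balls and blue blocks
- Build a "shape and color matching" educational app
- Implement simple quality inspection for an industrial production line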
