OpenCV Computer Vision: Answer Sheet Recognition Case Study

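I. Overview

This is a complete system for automatically reading and grading answer sheets. The pipeline is: image preprocessing → sheet localization → perspective transform → bubble detection → answer decision → score output.

II. Detailed Analysis

1. Preparation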

```python
ANSWER_KEY = {0: 1, 1: 4, 2: 0, 3: 3, 4: 1}  # correct answers: question index -> option index
```

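The answer key maps each question index to the index of its correct option (both 0-based).
Here question 0 has correct answer B (index 1), question 1 has correct answer E (index 4), and so on.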
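
To make the 0-based encoding concrete, the short sketch below converts the answer key into option letters. `OPTION_LETTERS` and `key_as_letters` are illustrative names that do not appear in the original program, and the sketch assumes five options labelled A to E.

```python
ANSWER_KEY = {0: 1, 1: 4, 2: 0, 3: 3, 4: 1}  # same key as above

# Assumption: every question has five options labelled A-E
OPTION_LETTERS = "ABCDE"

def key_as_letters(answer_key):
    """Map question index -> option letter for a 0-based answer key."""
    return {q: OPTION_LETTERS[opt] for q, opt in answer_key.items()}

letters = key_as_letters(ANSWER_KEY)
print(letters)
```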

2. Key Function Definitions

(1) `order_points()` - Coordinate ordering

```python
import cv2
import numpy as np


def order_points(pts):  # order the 4 corner points
    rect = np.zeros((4, 2), dtype="float32")
    s = pts.sum(axis=1)              # x + y for each point
    rect[0] = pts[np.argmin(s)]      # top-left
    rect[2] = pts[np.argmax(s)]      # bottom-right
    diff = np.diff(pts, axis=1)      # y - x for each point
    rect[1] = pts[np.argmin(diff)]   # top-right
    rect[3] = pts[np.argmax(diff)]   # bottom-left
    return rect
```

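Purpose: arrange the 4 points in the order top-left, top-right, bottom-right, bottom-left.
How it works:
- Top-left: smallest x+y
- Bottom-right: largest x+y
- Top-right: smallest y-x (`np.diff(pts, axis=1)` computes y-x, not x-y)
- Bottom-left: largest y-x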
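
The ordering rules can be checked on a small example. The sketch below repeats `order_points()` so that it runs on its own and feeds it four corners in arbitrary order. The coordinates are made up for illustration.

```python
import numpy as np

def order_points(pts):
    rect = np.zeros((4, 2), dtype="float32")
    s = pts.sum(axis=1)              # x + y
    rect[0] = pts[np.argmin(s)]      # top-left
    rect[2] = pts[np.argmax(s)]      # bottom-right
    diff = np.diff(pts, axis=1)      # y - x
    rect[1] = pts[np.argmin(diff)]   # top-right
    rect[3] = pts[np.argmax(diff)]   # bottom-left
    return rect

# Hypothetical corners of a slightly tilted sheet, in arbitrary order
pts = np.array([[310, 420], [20, 30], [15, 400], [300, 25]], dtype="float32")
ordered = order_points(pts)
print(ordered)  # rows are TL, TR, BR, BL
```

For these points the sums x+y are 730, 50, 415 and 325, and the differences y-x are 110, 10, 385 and -275, so the result is (20, 30), (300, 25), (310, 420), (15, 400).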

(2) `four_point_transform()` - Perspective transform

```python
def four_point_transform(image, pts):  # warp the region bounded by pts to a rectangle
    rect = order_points(pts)  # order the corners: TL, TR, BR, BL
    (tl, tr, br, bl) = rect

    # width of the output: the longer of the bottom and top edges
    widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
    maxWidth = max(int(widthA), int(widthB))

    # height of the output: the longer of the right and left edges
    heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    maxHeight = max(int(heightA), int(heightB))

    # destination corners, in the same TL, TR, BR, BL order
    dst = np.array([[0, 0],
                    [maxWidth - 1, 0],
                    [maxWidth - 1, maxHeight - 1],
                    [0, maxHeight - 1]],
                   dtype="float32")

    # compute the 3x3 transform matrix and apply it
    M = cv2.getPerspectiveTransform(rect, dst)
    warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))

    return warped  # the rectified image
```

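Purpose: rectify a tilted answer sheet into a front-on view.
Steps:
1. Order the four source corners and build the destination rectangle they should map to
2. Compute the transform matrix with cv2.getPerspectiveTransform()
3. Apply the perspective transform with cv2.warpPerspective() to obtain the rectified image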

(3) `sort_contours()` - Contour sorting

```python
def sort_contours(cnts, method="left-to-right"):  # sort contours by position
    reverse = False
    i = 0  # index into the bounding box: 0 = x, 1 = y
    if method == "right-to-left" or method == "bottom-to-top":
        reverse = True
    if method == "top-to-bottom" or method == "bottom-to-top":
        i = 1
    boundingBoxes = [cv2.boundingRect(c) for c in cnts]  # (x, y, w, h) per contour
    (cnts, boundingBoxes) = zip(*sorted(zip(cnts, boundingBoxes),
                                        key=lambda b: b[1][i], reverse=reverse))
    return cnts, boundingBoxes
```

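Purpose: sort contours in a given direction (left to right, top to bottom, and so on).
Implementation: sort by the x or y coordinate of each contour's bounding box.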
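
The sorting key is easier to see on plain tuples. In the sketch below, `sort_boxes` is a simplified stand-in for `sort_contours()` that works directly on made-up `(x, y, w, h)` boxes. It sorts them top to bottom, then sorts each group of three left to right, which is how the main loop later groups bubbles into question rows.

```python
# Hypothetical bounding boxes (x, y, w, h) for two rows of bubbles, in arbitrary order
boxes = [(150, 100, 30, 30), (50, 40, 30, 30), (100, 42, 30, 30),
         (50, 101, 30, 30), (150, 41, 30, 30), (100, 99, 30, 30)]

def sort_boxes(boxes, method="left-to-right"):
    """Sort boxes by x (horizontal methods) or y (vertical methods)."""
    reverse = method in ("right-to-left", "bottom-to-top")
    axis = 1 if method in ("top-to-bottom", "bottom-to-top") else 0
    return sorted(boxes, key=lambda b: b[axis], reverse=reverse)

per_row = 3  # assumed number of bubbles per row in this toy layout
top_down = sort_boxes(boxes, "top-to-bottom")
rows = [sort_boxes(top_down[i:i + per_row])
        for i in range(0, len(top_down), per_row)]

for row in rows:
    print([b[0] for b in row])  # x coordinates, left to right
```

Note that the bubbles in one row do not need identical y values. It is enough that every bubble in a row lies above every bubble in the next row.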

3. Main Pipeline

Stage 1: Image preprocessing and sheet localization

1) Read the image and convert it to grayscale

```python
image = cv2.imread(r'test_01.png')  # returns None if the file cannot be read
contours_img = image.copy()         # copy used only for drawing contours
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
```

2) Gaussian blur to reduce noise

```python
blurred = cv2.GaussianBlur(gray, (5, 5), 0)  # 5x5 kernel, sigma derived from kernel size
cv_show('blurred', blurred)
```

3) Canny edge detection

```python
edged = cv2.Canny(blurred, 75, 200)  # lower and upper hysteresis thresholds
cv_show('edged', edged)
```

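4) Find contours

```python
# [-2] selects the contour list in both OpenCV 3 and OpenCV 4
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
cv2.drawContours(contours_img, cnts, -1, (0, 0, 255), 3)
cv_show('contours_img', contours_img)
docCnt = None  # will hold the 4-point outline of the sheet
```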

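- Canny edge detection produces the edge map, and `cv2.RETR_EXTERNAL` keeps only the outermost contours
- The largest four-sided contour, which should be the outer border of the answer sheet, is selected in the next stage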

Stage 2: Perspective transform

1) Find the quadrilateral contour

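```python
# Sort contours by area, largest first, to prepare for the perspective transform
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
for c in cnts:  # visit each contour, largest first
    peri = cv2.arcLength(c, True)                    # perimeter of the closed contour
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)  # polygonal approximation
    if len(approx) == 4:  # first (largest) contour with 4 vertices is taken as the sheet
        docCnt = approx
        break
```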

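2) Apply the perspective transform

```python
# docCnt is still None here if no 4-vertex contour was found above
warped_t = four_point_transform(image, docCnt.reshape(4, 2))
warped_new = warped_t.copy()  # clean copy for drawing the final result
cv_show('warped', warped_t)
warped = cv2.cvtColor(warped_t, cv2.COLOR_BGR2GRAY)
```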

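- cv2.approxPolyDP() approximates each contour with a polygon
- The contour with 4 vertices is taken as the boundary of the answer sheet
- The perspective transform produces a front-on view of the sheet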

Stage 3: Bubble detection

1) Thresholding

```python
# Otsu chooses the threshold automatically; BINARY_INV turns dark marks white
thresh = cv2.threshold(warped, 0, 255,
                       cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
cv_show('thresh', thresh)
thresh_Contours = thresh.copy()
```

2) Find all contours

```python
# Find the outline of every bubble
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
# drawContours draws on warped_t in place and returns the same image
warped_Contours = cv2.drawContours(warped_t, cnts, -1, (0, 255, 0), 1)
cv_show('warped_Contours', warped_Contours)
```

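3) Filter the circular bubble contours

```python
questionCnts = []
for c in cnts:  # check the size and aspect ratio of each contour
    (x, y, w, h) = cv2.boundingRect(c)
    ar = w / float(h)
    # thresholds chosen for this particular sheet; adjust them for other layouts
    if w >= 20 and h >= 20 and 0.9 <= ar <= 1.1:
        questionCnts.append(c)
print(len(questionCnts))  # number of bubbles kept
```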

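Filter criteria:

- Width and height of at least 20 pixels (rejects noise)
- Aspect ratio between 0.9 and 1.1 (a near-square bounding box, as expected for a circle)
- Contours that meet both conditions are kept as option bubbles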
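
The two criteria can be expressed as a small predicate. In the sketch below, `looks_like_bubble` and the candidate sizes are hypothetical. The thresholds are the ones used above.

```python
def looks_like_bubble(w, h, min_size=20, ar_range=(0.9, 1.1)):
    """Return True if a bounding box of size w x h passes the size and aspect-ratio test."""
    ar = w / float(h)
    return w >= min_size and h >= min_size and ar_range[0] <= ar <= ar_range[1]

# Hypothetical bounding-box sizes (w, h) for different kinds of contours
candidates = {
    "bubble": (30, 31),      # large enough and nearly square
    "speck": (5, 5),         # square but too small
    "text line": (120, 18),  # too flat and too short
    "tall smear": (22, 40),  # large enough but far from square
}

kept = [name for name, (w, h) in candidates.items() if looks_like_bubble(w, h)]
print(kept)
```

A near-square bounding box does not prove a contour is circular, so on sheets with square logos or checkboxes an additional test might be needed.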

Stage 4: Answer recognition and scoring

1) Sort into rows (5 options per question)

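```python
# Sort all bubbles top to bottom so that each run of 5 contours is one question
questionCnts = sort_contours(questionCnts, method="top-to-bottom")[0]
correct = 0  # number of correctly answered questions
for (q, i) in enumerate(np.arange(0, len(questionCnts), 5)):
    cnts = sort_contours(questionCnts[i:i + 5])[0]  # sort this row left to right
    bubbled = None  # (pixel count, option index) of the darkest bubble so far
```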

2) Process each question

```python
    for (j, c) in enumerate(cnts):
        # build a mask that covers only this bubble
        mask = np.zeros(thresh.shape, dtype="uint8")
        cv2.drawContours(mask, [c], -1, 255, -1)  # thickness -1 fills the contour
        cv_show('mask', mask)
        # decide whether this option is marked by counting non-zero pixels
        # the AND with the mask keeps only the thresholded pixels inside the bubble
        thresh_mask_and = cv2.bitwise_and(thresh, thresh, mask=mask)
        cv_show('thresh_mask_and', thresh_mask_and)
        total = cv2.countNonZero(thresh_mask_and)  # number of non-zero pixels
```

3) Decide which option is marked

```python
        # keep the option with the largest non-zero pixel count seen so far
        if bubbled is None or total > bubbled[0]:
            bubbled = (total, j)
```

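4) Compare with the correct answer

```python
    color = (0, 0, 255)   # red by default (wrong answer)
    k = ANSWER_KEY[q]     # index of the correct option for question q
    if k == bubbled[1]:   # the marked option matches the key
        color = (0, 255, 0)
        correct += 1

    cv2.drawContours(warped_new, [cnts[k]], -1, color, 3)  # outline the correct option
    cv_show('warpeding', warped_new)
```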

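Key algorithm: deciding which option is marked

- Masking: build a separate mask for each option
- Region count: count the non-zero pixels inside the masked region
- Decision: the filled-in option has the most non-zero pixels, because `THRESH_BINARY_INV` turns dark pencil marks white

In code: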

```python
# Build the mask for one option
mask = np.zeros(thresh.shape, dtype="uint8")
cv2.drawContours(mask, [c], -1, 255, -1)

# Keep only the pixels inside that option
thresh_mask_and = cv2.bitwise_and(thresh, thresh, mask=mask)

# Count the non-zero pixels (the filled-in part)
total = cv2.countNonZero(thresh_mask_and)
```

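4. Visualization

The code displays several intermediate results through `cv_show()`, a helper that the excerpts call but never define; it presumably wraps `cv2.imshow()` and `cv2.waitKey()`:

- blurred: the image after Gaussian blur
- edged: the edge detection result
- contours_img: all detected contours
- warped: the sheet after the perspective transform
- thresh: the binarized image
- mask: the mask of a single option
- thresh_mask_and: the thresholded image after applying the mask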

5. Scoring and Marking

```python
score = (correct / 5.0) * 100  # 5 questions in total
# print("[INFO] score: {:.2f}%".format(score))
print("score: {:.2f}%".format(score))
# write the score in the top-left corner of the rectified sheet
cv2.putText(warped_new, "{:.2f}%".format(score), (10, 30),
            cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 0, 255), 2)
cv2.imshow("Original", image)
cv2.imshow("Exam", warped_new)
cv2.waitKey(0)
```

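- When the detected answer is right, the correct option is outlined in green
- When the detected answer is wrong, the correct option is outlined in red
- The final score is drawn on the sheet as a percentage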
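
The scoring arithmetic can be isolated from the image processing. In the sketch below, `grade` and the `detected` dictionary are hypothetical. The answer key is the one defined at the start, and the denominator is taken from the key instead of the hard-coded 5.

```python
ANSWER_KEY = {0: 1, 1: 4, 2: 0, 3: 3, 4: 1}

def grade(detected, answer_key):
    """Return (number correct, percentage) for detected option indices per question."""
    correct = sum(1 for q, k in answer_key.items() if detected.get(q) == k)
    return correct, correct / len(answer_key) * 100

# Hypothetical detection result: question 3 was answered with option C instead of D
detected = {0: 1, 1: 4, 2: 0, 3: 2, 4: 1}
correct, score = grade(detected, ANSWER_KEY)
print("score: {:.2f}%".format(score))  # prints "score: 80.00%"
```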

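III. Characteristics of the Algorithm

Strengths:

- Robust localization: the perspective transform handles sheets photographed at an angle
- Targeted detection: the mask restricts pixel counting to the inside of each bubble, so neighbouring marks do not affect the count
- Adaptable: options are found from contour features rather than fixed positions
- Easy to debug: every intermediate result is displayed

Limitations:

- Depends on preprocessing quality: needs a clear image and a suitable threshold
- Fixed layout: the number of questions and options must be known in advance
- Circular options assumed: bubbles must be circular or nearly circular
- Single choice assumed: the option with the most ink is always chosen, so blank or multiply marked questions are not detected

This implementation is a classic computer vision example that shows how several OpenCV techniques can be combined to solve a practical problem.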
