ROS1 from Beginner to Expert, Part 15: Robot Vision - Image Processing and Computer Vision
Introduction
In modern robotic systems, visual perception is a key enabler of autonomous navigation, object recognition, scene understanding, and human-robot interaction. A robot vision system captures images of the environment through cameras and applies image processing and computer vision algorithms to extract useful features that inform the robot's decisions. As the standard framework for robot development, ROS provides a complete vision toolchain covering image acquisition, transport, processing, and display.
This article systematically introduces robot vision in ROS1, covering OpenCV integration, camera drivers, image processing, feature extraction, object detection, 3D reconstruction, and deep learning. Through detailed explanations and complete code examples, it aims to help readers master ROS vision development and implement a range of visual perception capabilities.
1. ROS Vision System Architecture
1.1 Components of a Vision System
A ROS robot vision system consists of the following layers:
1.1.1 Hardware layer
- Cameras: USB cameras, industrial cameras, depth cameras, stereo cameras
- Frame grabbers: high-speed image acquisition and preprocessing
- Compute platforms: CPU, GPU, embedded processors
1.1.2 Driver layer
- Camera driver nodes: usb_cam, industrial_camera, realsense2_camera
- Image transport interfaces: USB, GigE, MIPI CSI
- Device management: camera parameter configuration, synchronization control
1.1.3 Processing layer
- Image processing libraries: OpenCV, PCL, OpenNI
- Vision algorithms: filtering, feature extraction, object detection
- Deep learning frameworks: TensorFlow, PyTorch, ONNX
1.1.4 Application layer
- Visual SLAM: ORB-SLAM, VINS-Mono
- Object tracking: single-target and multi-target tracking
- Scene understanding: semantic segmentation, instance segmentation
1.2 Image Message Types
ROS defines standard image message types for transporting vision data:
text
# sensor_msgs/Image message definition
std_msgs/Header header
  uint32 seq
  time stamp
  string frame_id
uint32 height        # image height (pixels)
uint32 width         # image width (pixels)
string encoding      # pixel encoding format
uint8 is_bigendian   # big-endian byte order?
uint32 step          # bytes per row
uint8[] data         # actual image data

# sensor_msgs/CompressedImage compressed image
std_msgs/Header header
string format        # compression format (jpeg, png)
uint8[] data         # compressed data

# sensor_msgs/CameraInfo camera parameters
std_msgs/Header header
uint32 height
uint32 width
string distortion_model  # distortion model
float64[] D          # distortion coefficients
float64[9] K         # intrinsic matrix
float64[9] R         # rectification (rotation) matrix
float64[12] P        # projection matrix
uint32 binning_x     # pixel binning
uint32 binning_y
sensor_msgs/RegionOfInterest roi
1.3 Vision Processing Flow
A typical ROS vision pipeline consists of the following stages (see the example command after this list):
- Image acquisition: the camera driver node publishes raw images
- Image rectification: undistortion and color correction
- Preprocessing: filtering, enhancement, format conversion
- Feature extraction: corners, edges, descriptors
- High-level processing: object detection, tracking, recognition
- Result publishing: processed results are published on topics
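For the rectification stage, ROS already ships a ready-made node. As a minimal sketch, assuming a calibrated camera publishing image_raw and camera_info under the /camera namespace, image_proc performs the undistortion for you:
bash
# Run the stock rectification node inside the camera's namespace
ROS_NAMESPACE=camera rosrun image_proc image_proc
# Rectified images then appear on /camera/image_rect and /camera/image_rect_color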
2. OpenCV and ROS Integration
2.1 cv_bridge Image Conversion
cv_bridge is the bridge between ROS and OpenCV: it converts between ROS image messages and OpenCV image formats in both directions.
2.1.1 Installing cv_bridge
bash
# Install the cv_bridge packages
sudo apt-get install ros-noetic-cv-bridge
sudo apt-get install ros-noetic-vision-opencv
# Install OpenCV
sudo apt-get install libopencv-dev python3-opencv
2.1.2 Basic Conversion Example
python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import rospy
import cv2
from sensor_msgs.msg import Image
from cv_bridge import CvBridge, CvBridgeError

class ImageConverter:
    """Image format conversion node"""

    def __init__(self):
        # Initialize the node
        rospy.init_node('image_converter', anonymous=True)
        # Create the CvBridge object
        self.bridge = CvBridge()
        # Subscribe to the raw image topic
        self.image_sub = rospy.Subscriber(
            "/camera/image_raw",
            Image,
            self.image_callback
        )
        # Publish the processed image
        self.image_pub = rospy.Publisher(
            "/camera/image_processed",
            Image,
            queue_size=1
        )
        rospy.loginfo("Image converter node started")

    def image_callback(self, msg):
        """Image callback"""
        try:
            # Convert the ROS image message to OpenCV format;
            # the encoding argument selects the pixel format
            cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
        except CvBridgeError as e:
            rospy.logerr(f"CvBridge Error: {e}")
            return
        # Process the image with OpenCV
        processed_image = self.process_image(cv_image)
        try:
            # Convert the OpenCV image back to a ROS message
            ros_image = self.bridge.cv2_to_imgmsg(
                processed_image,
                encoding="bgr8"
            )
            ros_image.header = msg.header  # preserve the timestamp
            # Publish the processed image
            self.image_pub.publish(ros_image)
        except CvBridgeError as e:
            rospy.logerr(f"CvBridge Error: {e}")

    def process_image(self, image):
        """Image processing function"""
        # Convert to grayscale
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Gaussian blur
        blurred = cv2.GaussianBlur(gray, (5, 5), 0)
        # Edge detection
        edges = cv2.Canny(blurred, 50, 150)
        # Convert back to BGR for display
        result = cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)
        return result

    def run(self):
        """Run the node"""
        rospy.spin()

if __name__ == '__main__':
    try:
        converter = ImageConverter()
        converter.run()
    except rospy.ROSInterruptException:
        pass
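A quick way to exercise this node, as a sketch assuming a USB webcam driven by usb_cam (topic names depend on your driver configuration, and the package name for the converter script is hypothetical):
bash
# Start a camera driver and remap its output to the topic the node expects
rosrun usb_cam usb_cam_node usb_cam/image_raw:=/camera/image_raw
# Run the converter node, then view the result
rosrun my_vision_pkg image_converter.py
rosrun rqt_image_view rqt_image_view /camera/image_processed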
2.2 Image Encoding Formats
ROS supports many image encodings, so conversions between formats must be handled correctly:
python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import rospy
import cv2
import numpy as np
from sensor_msgs.msg import Image, CompressedImage
from cv_bridge import CvBridge

class ImageFormatHandler:
    """Image format handler"""

    # Supported encoding map
    ENCODING_MAP = {
        'mono8': cv2.CV_8UC1,
        'bgr8': cv2.CV_8UC3,
        'rgb8': cv2.CV_8UC3,
        'bgra8': cv2.CV_8UC4,
        'rgba8': cv2.CV_8UC4,
        'mono16': cv2.CV_16UC1,
        '32FC1': cv2.CV_32FC1,
        '32FC3': cv2.CV_32FC3
    }

    def __init__(self):
        rospy.init_node('image_format_handler', anonymous=True)
        self.bridge = CvBridge()
        # Subscribe to images in different formats
        self.raw_sub = rospy.Subscriber(
            "/camera/image_raw",
            Image,
            self.raw_callback
        )
        self.compressed_sub = rospy.Subscriber(
            "/camera/image_raw/compressed",
            CompressedImage,
            self.compressed_callback
        )
        # Publish images in a unified format
        self.unified_pub = rospy.Publisher(
            "/camera/image_unified",
            Image,
            queue_size=1
        )

    def raw_callback(self, msg):
        """Handle raw images"""
        try:
            # Dispatch on the reported encoding
            if msg.encoding == 'rgb8':
                # RGB to BGR
                cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
            elif msg.encoding == 'mono8':
                # Grayscale
                cv_image = self.bridge.imgmsg_to_cv2(msg, "mono8")
                # Expand to 3 channels for uniform processing
                cv_image = cv2.cvtColor(cv_image, cv2.COLOR_GRAY2BGR)
            elif msg.encoding == '16UC1':
                # 16-bit depth image
                cv_image = self.bridge.imgmsg_to_cv2(msg, "16UC1")
                # Normalize to 8 bits for display
                cv_image = self.normalize_depth(cv_image)
            else:
                # Default conversion
                cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
            # Publish in the unified format
            self.publish_unified(cv_image, msg.header)
        except Exception as e:
            rospy.logerr(f"Failed to process raw image: {e}")

    def compressed_callback(self, msg):
        """Handle compressed images"""
        try:
            # Decompress the image
            np_arr = np.frombuffer(msg.data, np.uint8)
            cv_image = cv2.imdecode(np_arr, cv2.IMREAD_COLOR)
            if cv_image is None:
                rospy.logwarn("Failed to decode compressed image")
                return
            # Publish in the unified format
            self.publish_unified(cv_image, msg.header)
        except Exception as e:
            rospy.logerr(f"Failed to process compressed image: {e}")

    def normalize_depth(self, depth_image):
        """Normalize a depth image for display"""
        # Ignore invalid (zero) measurements
        valid_mask = depth_image > 0
        if np.sum(valid_mask) == 0:
            return np.zeros((depth_image.shape[0], depth_image.shape[1], 3),
                            dtype=np.uint8)
        # Compute the valid depth range
        min_depth = np.min(depth_image[valid_mask])
        max_depth = np.max(depth_image[valid_mask])
        # Normalize to 0-255
        normalized = np.zeros_like(depth_image, dtype=np.uint8)
        if max_depth > min_depth:
            normalized[valid_mask] = (
                (depth_image[valid_mask] - min_depth) * 255 /
                (max_depth - min_depth)
            ).astype(np.uint8)
        # Apply a color map
        colored = cv2.applyColorMap(normalized, cv2.COLORMAP_JET)
        colored[~valid_mask] = [0, 0, 0]  # paint invalid regions black
        return colored

    def publish_unified(self, cv_image, header):
        """Publish the image in a unified format"""
        try:
            # Make sure it is BGR8
            if len(cv_image.shape) == 2:
                cv_image = cv2.cvtColor(cv_image, cv2.COLOR_GRAY2BGR)
            # Convert to a ROS message
            ros_image = self.bridge.cv2_to_imgmsg(cv_image, "bgr8")
            ros_image.header = header
            self.unified_pub.publish(ros_image)
        except Exception as e:
            rospy.logerr(f"Failed to publish unified image: {e}")

if __name__ == '__main__':
    try:
        handler = ImageFormatHandler()
        rospy.spin()
    except rospy.ROSInterruptException:
        pass
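When only a compressed stream is available (for example over a bandwidth-limited link), image_transport can also do the decompression for you instead of decoding in your own node. A minimal sketch using the standard republish node:
bash
# Republish a compressed stream as raw images
rosrun image_transport republish compressed in:=/camera/image_raw raw out:=/camera/image_decompressed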
2.3 Image Processing Pipeline
Build a complete, configurable image processing pipeline that supports multiple operations:
python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import rospy
import cv2
import numpy as np
from sensor_msgs.msg import Image
from cv_bridge import CvBridge
from std_msgs.msg import String
import json

class ImageProcessingPipeline:
    """Configurable image processing pipeline"""

    def __init__(self):
        rospy.init_node('image_processing_pipeline', anonymous=True)
        self.bridge = CvBridge()
        # Dictionary of available operations
        self.operations = {
            'blur': self.apply_blur,
            'sharpen': self.apply_sharpen,
            'edge': self.apply_edge_detection,
            'threshold': self.apply_threshold,
            'morphology': self.apply_morphology,
            'histogram': self.apply_histogram_equalization,
            'denoise': self.apply_denoise,
            'resize': self.apply_resize,
            'rotate': self.apply_rotate,
            'flip': self.apply_flip
        }
        # Default pipeline configuration
        self.pipeline_config = [
            {'operation': 'denoise', 'params': {}},
            {'operation': 'histogram', 'params': {}},
            {'operation': 'sharpen', 'params': {'kernel_size': 3}}
        ]
        # ROS interfaces
        self.image_sub = rospy.Subscriber(
            "/camera/image_raw",
            Image,
            self.image_callback
        )
        self.config_sub = rospy.Subscriber(
            "/pipeline/config",
            String,
            self.config_callback
        )
        self.image_pub = rospy.Publisher(
            "/camera/image_processed",
            Image,
            queue_size=1
        )
        rospy.loginfo("Image processing pipeline initialized")

    def config_callback(self, msg):
        """Update the pipeline configuration"""
        try:
            self.pipeline_config = json.loads(msg.data)
            rospy.loginfo(f"Pipeline config updated: {self.pipeline_config}")
        except json.JSONDecodeError as e:
            rospy.logerr(f"Invalid config JSON: {e}")

    def image_callback(self, msg):
        """Process an image"""
        try:
            # Convert the image format
            cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
            # Run the processing pipeline
            processed = self.execute_pipeline(cv_image)
            # Publish the result
            ros_image = self.bridge.cv2_to_imgmsg(processed, "bgr8")
            ros_image.header = msg.header
            self.image_pub.publish(ros_image)
        except Exception as e:
            rospy.logerr(f"Processing error: {e}")

    def execute_pipeline(self, image):
        """Run every configured pipeline step in order"""
        result = image.copy()
        for step in self.pipeline_config:
            operation = step.get('operation')
            params = step.get('params', {})
            if operation in self.operations:
                result = self.operations[operation](result, **params)
            else:
                rospy.logwarn(f"Unknown operation: {operation}")
        return result

    def apply_blur(self, image, kernel_size=5, method='gaussian'):
        """Blur the image"""
        if method == 'gaussian':
            return cv2.GaussianBlur(image, (kernel_size, kernel_size), 0)
        elif method == 'median':
            return cv2.medianBlur(image, kernel_size)
        elif method == 'bilateral':
            return cv2.bilateralFilter(image, kernel_size, 75, 75)
        else:
            return image

    def apply_sharpen(self, image, kernel_size=3):
        """Sharpen the image"""
        if kernel_size == 3:
            kernel = np.array([[-1, -1, -1],
                               [-1,  9, -1],
                               [-1, -1, -1]])
        else:
            kernel = np.array([[ 0, -1,  0],
                               [-1,  5, -1],
                               [ 0, -1,  0]])
        return cv2.filter2D(image, -1, kernel)

    def apply_edge_detection(self, image, method='canny',
                             low_threshold=50, high_threshold=150):
        """Edge detection"""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        if method == 'canny':
            edges = cv2.Canny(gray, low_threshold, high_threshold)
        elif method == 'sobel':
            sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
            sobely = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
            edges = np.sqrt(sobelx**2 + sobely**2)
            edges = np.uint8(edges / edges.max() * 255)
        elif method == 'laplacian':
            edges = cv2.Laplacian(gray, cv2.CV_64F)
            edges = np.uint8(np.absolute(edges))
        else:
            edges = gray
        # Convert back to BGR
        return cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)

    def apply_threshold(self, image, threshold_value=127,
                        max_value=255, method='binary'):
        """Thresholding"""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        if method == 'binary':
            _, thresh = cv2.threshold(gray, threshold_value,
                                      max_value, cv2.THRESH_BINARY)
        elif method == 'binary_inv':
            _, thresh = cv2.threshold(gray, threshold_value,
                                      max_value, cv2.THRESH_BINARY_INV)
        elif method == 'otsu':
            _, thresh = cv2.threshold(gray, 0, max_value,
                                      cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        elif method == 'adaptive':
            thresh = cv2.adaptiveThreshold(gray, max_value,
                                           cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                           cv2.THRESH_BINARY, 11, 2)
        else:
            thresh = gray
        return cv2.cvtColor(thresh, cv2.COLOR_GRAY2BGR)

    def apply_morphology(self, image, operation='open', kernel_size=5):
        """Morphological operations"""
        kernel = np.ones((kernel_size, kernel_size), np.uint8)
        if operation == 'erode':
            return cv2.erode(image, kernel, iterations=1)
        elif operation == 'dilate':
            return cv2.dilate(image, kernel, iterations=1)
        elif operation == 'open':
            return cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)
        elif operation == 'close':
            return cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel)
        elif operation == 'gradient':
            return cv2.morphologyEx(image, cv2.MORPH_GRADIENT, kernel)
        else:
            return image

    def apply_histogram_equalization(self, image):
        """Histogram equalization"""
        # Convert to YCrCb
        ycrcb = cv2.cvtColor(image, cv2.COLOR_BGR2YCrCb)
        # Equalize the luminance (Y) channel only
        ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
        # Convert back to BGR
        return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)

    def apply_denoise(self, image, h=10):
        """Denoise"""
        return cv2.fastNlMeansDenoisingColored(image, None, h, h, 7, 21)

    def apply_resize(self, image, scale=0.5):
        """Resize"""
        height, width = image.shape[:2]
        new_height = int(height * scale)
        new_width = int(width * scale)
        return cv2.resize(image, (new_width, new_height))

    def apply_rotate(self, image, angle=90):
        """Rotate the image about its center"""
        height, width = image.shape[:2]
        center = (width / 2, height / 2)
        M = cv2.getRotationMatrix2D(center, angle, 1.0)
        return cv2.warpAffine(image, M, (width, height))

    def apply_flip(self, image, direction='horizontal'):
        """Flip the image"""
        if direction == 'horizontal':
            return cv2.flip(image, 1)
        elif direction == 'vertical':
            return cv2.flip(image, 0)
        elif direction == 'both':
            return cv2.flip(image, -1)
        else:
            return image

if __name__ == '__main__':
    try:
        pipeline = ImageProcessingPipeline()
        rospy.spin()
    except rospy.ROSInterruptException:
        pass
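The pipeline is reconfigured at runtime by publishing a JSON array on /pipeline/config. A minimal sketch of switching the pipeline to a single blur step from the command line:
bash
# Replace the current pipeline with one Gaussian blur operation
rostopic pub -1 /pipeline/config std_msgs/String \
  "data: '[{\"operation\": \"blur\", \"params\": {\"kernel_size\": 7}}]'"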
3. Camera Calibration and Rectification
3.1 Intrinsic Camera Calibration
Camera calibration is fundamental to computer vision; it recovers the camera's intrinsic parameters and distortion coefficients:
python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import rospy
import cv2
import numpy as np
from sensor_msgs.msg import Image, CameraInfo
from cv_bridge import CvBridge
import yaml
import os

class CameraCalibration:
    """Camera calibration node"""

    def __init__(self):
        rospy.init_node('camera_calibration', anonymous=True)
        self.bridge = CvBridge()
        # Calibration parameters
        self.board_size = (9, 6)   # number of inner chessboard corners
        self.square_size = 0.025   # chessboard square size (meters)
        # CALIB_FIX_ASPECT_RATIO / CALIB_FIX_PRINCIPAL_POINT need an initial
        # camera matrix, so with no intrinsic guess we only fix the
        # tangential distortion terms
        self.calibration_flags = cv2.CALIB_ZERO_TANGENT_DIST
        # Collected calibration data
        self.object_points = []  # 3D points
        self.image_points = []   # 2D points
        self.image_size = None
        # Calibration results
        self.camera_matrix = None
        self.dist_coeffs = None
        self.calibrated = False
        # Precompute the chessboard's 3D coordinates
        self.pattern_points = np.zeros((self.board_size[0] * self.board_size[1], 3),
                                       np.float32)
        self.pattern_points[:, :2] = (
            np.mgrid[0:self.board_size[0], 0:self.board_size[1]].T.reshape(-1, 2) *
            self.square_size
        )
        # ROS interfaces
        self.image_sub = rospy.Subscriber(
            "/camera/image_raw",
            Image,
            self.image_callback
        )
        self.calibrated_pub = rospy.Publisher(
            "/camera/image_calibrated",
            Image,
            queue_size=1
        )
        self.camera_info_pub = rospy.Publisher(
            "/camera/camera_info",
            CameraInfo,
            queue_size=1
        )
        # Capture control
        self.capture_image = False
        self.min_samples = 20  # minimum number of calibration views
        rospy.loginfo("Camera calibration node started")
        rospy.loginfo(f"Looking for {self.board_size} chessboard")

    def image_callback(self, msg):
        """Process an image"""
        try:
            cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
            if not self.calibrated:
                # Calibration mode
                self.calibration_mode(cv_image, msg.header)
            else:
                # Rectification mode
                self.correction_mode(cv_image, msg.header)
        except Exception as e:
            rospy.logerr(f"Image processing error: {e}")

    def calibration_mode(self, image, header):
        """Calibration mode: collect calibration views"""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        self.image_size = gray.shape[::-1]
        # Find the chessboard corners
        ret, corners = cv2.findChessboardCorners(
            gray,
            self.board_size,
            cv2.CALIB_CB_ADAPTIVE_THRESH +
            cv2.CALIB_CB_NORMALIZE_IMAGE
        )
        # Draw the result
        display_image = image.copy()
        if ret:
            # Refine corner locations
            criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER,
                        30, 0.001)
            corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1),
                                       criteria)
            # Draw the corners
            cv2.drawChessboardCorners(display_image, self.board_size,
                                      corners, ret)
            # Add a status line
            text = f"Found! Press 'c' to capture ({len(self.image_points)}/{self.min_samples})"
            cv2.putText(display_image, text, (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
            # Capture the view
            if self.capture_image:
                self.image_points.append(corners)
                self.object_points.append(self.pattern_points)
                rospy.loginfo(f"Captured calibration image {len(self.image_points)}")
                self.capture_image = False
                # Calibrate once enough views have been collected
                if len(self.image_points) >= self.min_samples:
                    self.perform_calibration()
        else:
            text = "Chessboard not found"
            cv2.putText(display_image, text, (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
        # Publish the annotated image
        ros_image = self.bridge.cv2_to_imgmsg(display_image, "bgr8")
        ros_image.header = header
        self.calibrated_pub.publish(ros_image)

    def perform_calibration(self):
        """Run the camera calibration"""
        rospy.loginfo("Starting camera calibration...")
        # Calibrate
        ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
            self.object_points,
            self.image_points,
            self.image_size,
            None,
            None,
            flags=self.calibration_flags
        )
        if ret:
            self.camera_matrix = camera_matrix
            self.dist_coeffs = dist_coeffs
            self.calibrated = True
            # Compute the re-projection error
            total_error = 0
            for i in range(len(self.object_points)):
                imgpoints2, _ = cv2.projectPoints(
                    self.object_points[i],
                    rvecs[i],
                    tvecs[i],
                    camera_matrix,
                    dist_coeffs
                )
                error = cv2.norm(self.image_points[i], imgpoints2,
                                 cv2.NORM_L2) / len(imgpoints2)
                total_error += error
            mean_error = total_error / len(self.object_points)
            rospy.loginfo("Calibration successful!")
            rospy.loginfo(f"Mean re-projection error: {mean_error:.4f}")
            rospy.loginfo(f"Camera matrix:\n{camera_matrix}")
            rospy.loginfo(f"Distortion coefficients:\n{dist_coeffs}")
            # Save the calibration
            self.save_calibration()
        else:
            rospy.logerr("Calibration failed!")

    def correction_mode(self, image, header):
        """Rectification mode: apply the distortion correction"""
        # Compute the rectification mapping
        h, w = image.shape[:2]
        new_camera_matrix, roi = cv2.getOptimalNewCameraMatrix(
            self.camera_matrix,
            self.dist_coeffs,
            (w, h),
            1,
            (w, h)
        )
        # Undistort the image
        dst = cv2.undistort(
            image,
            self.camera_matrix,
            self.dist_coeffs,
            None,
            new_camera_matrix
        )
        # Crop to the valid region
        x, y, w, h = roi
        dst = dst[y:y+h, x:x+w]
        # Publish the rectified image
        ros_image = self.bridge.cv2_to_imgmsg(dst, "bgr8")
        ros_image.header = header
        self.calibrated_pub.publish(ros_image)
        # Publish the camera info
        self.publish_camera_info(header, new_camera_matrix)

    def publish_camera_info(self, header, camera_matrix):
        """Publish the camera parameters"""
        camera_info = CameraInfo()
        camera_info.header = header
        camera_info.height = self.image_size[1]
        camera_info.width = self.image_size[0]
        # Fill in the camera matrices
        camera_info.K = camera_matrix.flatten().tolist()
        camera_info.D = self.dist_coeffs.flatten().tolist()
        camera_info.R = np.eye(3).flatten().tolist()
        camera_info.P = np.column_stack((camera_matrix, np.zeros(3))).flatten().tolist()
        camera_info.distortion_model = "plumb_bob"
        camera_info.binning_x = 0
        camera_info.binning_y = 0
        self.camera_info_pub.publish(camera_info)

    def save_calibration(self):
        """Save the calibration to a file"""
        calibration_data = {
            'camera_matrix': self.camera_matrix.tolist(),
            'distortion_coefficients': self.dist_coeffs.tolist(),
            'image_width': self.image_size[0],
            'image_height': self.image_size[1]
        }
        # Save to a YAML file
        calibration_file = os.path.expanduser('~/camera_calibration.yaml')
        with open(calibration_file, 'w') as f:
            yaml.dump(calibration_data, f)
        rospy.loginfo(f"Calibration saved to {calibration_file}")

    def load_calibration(self, file_path):
        """Load a saved calibration"""
        try:
            with open(file_path, 'r') as f:
                data = yaml.safe_load(f)
            self.camera_matrix = np.array(data['camera_matrix'])
            self.dist_coeffs = np.array(data['distortion_coefficients'])
            self.image_size = (data['image_width'], data['image_height'])
            self.calibrated = True
            rospy.loginfo(f"Calibration loaded from {file_path}")
            return True
        except Exception as e:
            rospy.logerr(f"Failed to load calibration: {e}")
            return False

if __name__ == '__main__':
    try:
        calibrator = CameraCalibration()
        # Try to load an existing calibration
        calibration_file = os.path.expanduser('~/camera_calibration.yaml')
        if os.path.exists(calibration_file):
            calibrator.load_calibration(calibration_file)

        # Watch for console input (input() blocks, so no rate limiting needed)
        def check_capture():
            while not rospy.is_shutdown():
                key = input("Press 'c' to capture, 'q' to quit: ")
                if key == 'c':
                    calibrator.capture_image = True
                elif key == 'q':
                    break

        import threading
        keyboard_thread = threading.Thread(target=check_capture)
        keyboard_thread.daemon = True
        keyboard_thread.start()
        rospy.spin()
    except rospy.ROSInterruptException:
        pass
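In practice, the stock camera_calibration package provides an interactive GUI that does the same job. A minimal sketch, assuming the checkerboard used above (9x6 inner corners, 25 mm squares) and a driver in the /camera namespace:
bash
rosrun camera_calibration cameracalibrator.py --size 9x6 --square 0.025 \
  image:=/camera/image_raw camera:=/camera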
3.2 Stereo Calibration
Stereo camera calibration supports stereo vision and depth estimation:
python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import rospy
import cv2
import numpy as np
from sensor_msgs.msg import Image, CameraInfo
from cv_bridge import CvBridge
from message_filters import TimeSynchronizer, Subscriber
import yaml

class StereoCameraCalibration:
    """Stereo camera calibration"""

    def __init__(self):
        rospy.init_node('stereo_calibration', anonymous=True)
        self.bridge = CvBridge()
        # Calibration board parameters
        self.board_size = (9, 6)
        self.square_size = 0.025
        # Collected calibration data
        self.left_points = []
        self.right_points = []
        self.object_points = []
        # Precompute the board's 3D coordinates
        self.pattern_points = np.zeros((self.board_size[0] * self.board_size[1], 3),
                                       np.float32)
        self.pattern_points[:, :2] = (
            np.mgrid[0:self.board_size[0], 0:self.board_size[1]].T.reshape(-1, 2) *
            self.square_size
        )
        # Calibration results
        self.stereo_calibrated = False
        self.left_camera_matrix = None
        self.left_dist_coeffs = None
        self.right_camera_matrix = None
        self.right_dist_coeffs = None
        self.R = None  # rotation matrix
        self.T = None  # translation vector
        self.E = None  # essential matrix
        self.F = None  # fundamental matrix
        # Stereo rectification parameters
        self.R1 = None
        self.R2 = None
        self.P1 = None
        self.P2 = None
        self.Q = None
        self.map_left = None
        self.map_right = None
        # Subscribe to synchronized left/right images (exact time sync;
        # use ApproximateTimeSynchronizer if the cameras are not
        # hardware-triggered)
        left_sub = Subscriber("/camera/left/image_raw", Image)
        right_sub = Subscriber("/camera/right/image_raw", Image)
        self.sync = TimeSynchronizer([left_sub, right_sub], 10)
        self.sync.registerCallback(self.stereo_callback)
        # Publish the rectified images
        self.left_rect_pub = rospy.Publisher(
            "/camera/left/image_rect",
            Image,
            queue_size=1
        )
        self.right_rect_pub = rospy.Publisher(
            "/camera/right/image_rect",
            Image,
            queue_size=1
        )
        self.disparity_pub = rospy.Publisher(
            "/camera/disparity",
            Image,
            queue_size=1
        )
        rospy.loginfo("Stereo calibration node started")

    def stereo_callback(self, left_msg, right_msg):
        """Process a synchronized left/right pair"""
        try:
            left_image = self.bridge.imgmsg_to_cv2(left_msg, "bgr8")
            right_image = self.bridge.imgmsg_to_cv2(right_msg, "bgr8")
            if not self.stereo_calibrated:
                self.calibration_mode(left_image, right_image)
            else:
                self.rectification_mode(left_image, right_image, left_msg.header)
        except Exception as e:
            rospy.logerr(f"Stereo processing error: {e}")

    def calibration_mode(self, left_image, right_image):
        """Calibration mode"""
        left_gray = cv2.cvtColor(left_image, cv2.COLOR_BGR2GRAY)
        right_gray = cv2.cvtColor(right_image, cv2.COLOR_BGR2GRAY)
        # Find the chessboard corners
        left_ret, left_corners = cv2.findChessboardCorners(
            left_gray, self.board_size, None
        )
        right_ret, right_corners = cv2.findChessboardCorners(
            right_gray, self.board_size, None
        )
        if left_ret and right_ret:
            # Refine the corners
            criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER,
                        30, 0.001)
            left_corners = cv2.cornerSubPix(
                left_gray, left_corners, (11, 11), (-1, -1), criteria
            )
            right_corners = cv2.cornerSubPix(
                right_gray, right_corners, (11, 11), (-1, -1), criteria
            )
            # Store the corners (every pair with a visible board is captured,
            # so move the board between frames for view diversity)
            self.left_points.append(left_corners)
            self.right_points.append(right_corners)
            self.object_points.append(self.pattern_points)
            rospy.loginfo(f"Captured stereo pair {len(self.left_points)}")
            # Draw the corners
            cv2.drawChessboardCorners(left_image, self.board_size,
                                      left_corners, left_ret)
            cv2.drawChessboardCorners(right_image, self.board_size,
                                      right_corners, right_ret)
            # Calibrate once enough data has been collected
            if len(self.left_points) >= 20:
                self.perform_stereo_calibration(left_gray.shape[::-1])

    def perform_stereo_calibration(self, image_size):
        """Run the stereo calibration"""
        rospy.loginfo("Starting stereo calibration...")
        # Calibrate each camera individually
        ret_left, K1, D1, _, _ = cv2.calibrateCamera(
            self.object_points,
            self.left_points,
            image_size,
            None, None
        )
        ret_right, K2, D2, _, _ = cv2.calibrateCamera(
            self.object_points,
            self.right_points,
            image_size,
            None, None
        )
        # Stereo calibration
        flags = cv2.CALIB_FIX_INTRINSIC
        criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER,
                    100, 1e-5)
        ret_stereo, K1, D1, K2, D2, R, T, E, F = cv2.stereoCalibrate(
            self.object_points,
            self.left_points,
            self.right_points,
            K1, D1,
            K2, D2,
            image_size,
            criteria=criteria,
            flags=flags
        )
        if ret_stereo:
            self.left_camera_matrix = K1
            self.left_dist_coeffs = D1
            self.right_camera_matrix = K2
            self.right_dist_coeffs = D2
            self.R = R
            self.T = T
            self.E = E
            self.F = F
            # Stereo rectification
            self.R1, self.R2, self.P1, self.P2, self.Q, _, _ = cv2.stereoRectify(
                K1, D1, K2, D2,
                image_size, R, T,
                alpha=0, newImageSize=image_size
            )
            # Compute the rectification maps
            self.map_left = cv2.initUndistortRectifyMap(
                K1, D1, self.R1, self.P1,
                image_size, cv2.CV_16SC2
            )
            self.map_right = cv2.initUndistortRectifyMap(
                K2, D2, self.R2, self.P2,
                image_size, cv2.CV_16SC2
            )
            self.stereo_calibrated = True
            rospy.loginfo("Stereo calibration successful!")
            rospy.loginfo(f"Baseline: {np.linalg.norm(T):.3f} meters")
            rospy.loginfo(f"RMS error: {ret_stereo:.4f}")
            # Save the calibration
            self.save_stereo_calibration()
        else:
            rospy.logerr("Stereo calibration failed!")

    def rectification_mode(self, left_image, right_image, header):
        """Rectification mode"""
        # Apply the rectification maps
        left_rect = cv2.remap(left_image, self.map_left[0],
                              self.map_left[1], cv2.INTER_LINEAR)
        right_rect = cv2.remap(right_image, self.map_right[0],
                               self.map_right[1], cv2.INTER_LINEAR)
        # Compute the disparity image
        disparity = self.compute_disparity(left_rect, right_rect)
        # Publish the rectified images
        left_msg = self.bridge.cv2_to_imgmsg(left_rect, "bgr8")
        left_msg.header = header
        self.left_rect_pub.publish(left_msg)
        right_msg = self.bridge.cv2_to_imgmsg(right_rect, "bgr8")
        right_msg.header = header
        self.right_rect_pub.publish(right_msg)
        # Publish the disparity image
        disp_normalized = cv2.normalize(disparity, None, 0, 255,
                                        cv2.NORM_MINMAX, cv2.CV_8U)
        disp_colored = cv2.applyColorMap(disp_normalized, cv2.COLORMAP_JET)
        disp_msg = self.bridge.cv2_to_imgmsg(disp_colored, "bgr8")
        disp_msg.header = header
        self.disparity_pub.publish(disp_msg)

    def compute_disparity(self, left, right):
        """Compute the disparity image"""
        # Convert to grayscale
        left_gray = cv2.cvtColor(left, cv2.COLOR_BGR2GRAY)
        right_gray = cv2.cvtColor(right, cv2.COLOR_BGR2GRAY)
        # SGBM parameters
        window_size = 5
        min_disp = 0
        num_disp = 128
        # Create the SGBM matcher
        stereo = cv2.StereoSGBM_create(
            minDisparity=min_disp,
            numDisparities=num_disp,
            blockSize=window_size,
            P1=8 * 3 * window_size ** 2,
            P2=32 * 3 * window_size ** 2,
            disp12MaxDiff=1,
            uniquenessRatio=15,
            speckleWindowSize=100,
            speckleRange=32,
            preFilterCap=63,
            mode=cv2.STEREO_SGBM_MODE_SGBM_3WAY
        )
        # Compute the disparity (SGBM returns fixed-point values scaled by 16)
        disparity = stereo.compute(left_gray, right_gray).astype(np.float32) / 16.0
        return disparity

    def save_stereo_calibration(self):
        """Save the stereo calibration"""
        calibration_data = {
            'left_camera_matrix': self.left_camera_matrix.tolist(),
            'left_distortion': self.left_dist_coeffs.tolist(),
            'right_camera_matrix': self.right_camera_matrix.tolist(),
            'right_distortion': self.right_dist_coeffs.tolist(),
            'rotation': self.R.tolist(),
            'translation': self.T.tolist(),
            'essential': self.E.tolist(),
            'fundamental': self.F.tolist(),
            'Q': self.Q.tolist()
        }
        with open('stereo_calibration.yaml', 'w') as f:
            yaml.dump(calibration_data, f)
        rospy.loginfo("Stereo calibration saved")

if __name__ == '__main__':
    try:
        calibrator = StereoCameraCalibration()
        rospy.spin()
    except rospy.ROSInterruptException:
        pass
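Once the reprojection matrix Q is known, the disparity image maps directly to metric 3D points. A minimal sketch, assuming disparity and Q as computed above:
python
import cv2
import numpy as np

# Q is the 4x4 reprojection matrix returned by cv2.stereoRectify
points_3d = cv2.reprojectImageTo3D(disparity, Q)  # HxWx3 array, in meters
valid = disparity > 0                             # keep pixels with valid disparity
xyz = points_3d[valid]
print(f"Reconstructed {xyz.shape[0]} 3D points")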
4. Feature Detection and Matching
4.1 Feature Point Detection
Implement several feature detection algorithms:
python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import rospy
import cv2
import numpy as np
from sensor_msgs.msg import Image
from geometry_msgs.msg import Point
from cv_bridge import CvBridge
from visualization_msgs.msg import Marker, MarkerArray

class FeatureDetector:
    """Feature detection node"""

    def __init__(self):
        rospy.init_node('feature_detector', anonymous=True)
        self.bridge = CvBridge()
        # Available feature detectors
        self.detectors = {
            'harris': self.detect_harris,
            'shi-tomasi': self.detect_shi_tomasi,
            'sift': self.detect_sift,
            'surf': self.detect_surf,
            'orb': self.detect_orb,
            'akaze': self.detect_akaze,
            'brisk': self.detect_brisk,
            'fast': self.detect_fast
        }
        # Parameters
        self.max_features = rospy.get_param('~max_features', 500)
        self.detector_type = rospy.get_param('~detector_type', 'orb')
        # Active detector, driven by the ~detector_type parameter
        # (previously hard-coded to 'orb')
        self.current_detector = self.detector_type
        # ROS interfaces
        self.image_sub = rospy.Subscriber(
            "/camera/image_raw",
            Image,
            self.image_callback
        )
        self.feature_pub = rospy.Publisher(
            "/camera/image_features",
            Image,
            queue_size=1
        )
        self.marker_pub = rospy.Publisher(
            "/features/markers",
            MarkerArray,
            queue_size=1
        )
        rospy.loginfo(f"Feature detector started with {self.detector_type}")

    def image_callback(self, msg):
        """Process an image"""
        try:
            cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
            # Detect features
            keypoints, descriptors = self.detect_features(cv_image)
            # Visualize them
            vis_image = self.visualize_features(cv_image, keypoints)
            # Publish the result
            ros_image = self.bridge.cv2_to_imgmsg(vis_image, "bgr8")
            ros_image.header = msg.header
            self.feature_pub.publish(ros_image)
            # Publish 3D markers
            self.publish_markers(keypoints, msg.header)
        except Exception as e:
            rospy.logerr(f"Feature detection error: {e}")

    def detect_features(self, image):
        """Detect feature points"""
        if self.current_detector in self.detectors:
            return self.detectors[self.current_detector](image)
        else:
            rospy.logwarn(f"Unknown detector: {self.current_detector}")
            return [], None

    def detect_harris(self, image):
        """Harris corner detection"""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Harris corner response
        dst = cv2.cornerHarris(gray, 2, 3, 0.04)
        dst = cv2.dilate(dst, None)
        # Extract corner coordinates
        threshold = 0.01 * dst.max()
        corners = np.argwhere(dst > threshold)
        # Convert to KeyPoint objects (argwhere returns row, col order)
        keypoints = [cv2.KeyPoint(float(c[1]), float(c[0]), 10)
                     for c in corners[:self.max_features]]
        return keypoints, None

    def detect_shi_tomasi(self, image):
        """Shi-Tomasi corner detection"""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Detect corners
        corners = cv2.goodFeaturesToTrack(
            gray,
            maxCorners=self.max_features,
            qualityLevel=0.01,
            minDistance=10,
            blockSize=3,
            useHarrisDetector=False,
            k=0.04
        )
        keypoints = []
        if corners is not None:
            for corner in corners:
                x, y = corner[0]
                keypoints.append(cv2.KeyPoint(float(x), float(y), 10))
        return keypoints, None

    def detect_sift(self, image):
        """SIFT feature detection"""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Create the SIFT detector
        sift = cv2.SIFT_create(nfeatures=self.max_features)
        # Detect keypoints and compute descriptors
        keypoints, descriptors = sift.detectAndCompute(gray, None)
        return keypoints, descriptors

    def detect_surf(self, image):
        """SURF feature detection"""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Create the SURF detector (requires opencv-contrib)
        try:
            surf = cv2.xfeatures2d.SURF_create(400)
            keypoints, descriptors = surf.detectAndCompute(gray, None)
        except (AttributeError, cv2.error):
            rospy.logwarn("SURF not available, using ORB instead")
            return self.detect_orb(image)
        return keypoints, descriptors

    def detect_orb(self, image):
        """ORB feature detection"""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Create the ORB detector
        orb = cv2.ORB_create(
            nfeatures=self.max_features,
            scaleFactor=1.2,
            nlevels=8,
            edgeThreshold=31,
            firstLevel=0,
            WTA_K=2,
            scoreType=cv2.ORB_HARRIS_SCORE,
            patchSize=31
        )
        # Detect keypoints and compute descriptors
        keypoints, descriptors = orb.detectAndCompute(gray, None)
        return keypoints, descriptors

    def detect_akaze(self, image):
        """AKAZE feature detection"""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Create the AKAZE detector
        akaze = cv2.AKAZE_create()
        # Detect keypoints and compute descriptors
        keypoints, descriptors = akaze.detectAndCompute(gray, None)
        # Cap the number of keypoints
        if len(keypoints) > self.max_features:
            keypoints = sorted(keypoints, key=lambda x: x.response,
                               reverse=True)[:self.max_features]
        return keypoints, descriptors

    def detect_brisk(self, image):
        """BRISK feature detection"""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Create the BRISK detector
        brisk = cv2.BRISK_create()
        # Detect keypoints and compute descriptors
        keypoints, descriptors = brisk.detectAndCompute(gray, None)
        # Cap the number of keypoints
        if len(keypoints) > self.max_features:
            keypoints = sorted(keypoints, key=lambda x: x.response,
                               reverse=True)[:self.max_features]
        return keypoints, descriptors

    def detect_fast(self, image):
        """FAST feature detection"""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Create the FAST detector
        fast = cv2.FastFeatureDetector_create(
            threshold=20,
            nonmaxSuppression=True,
            type=cv2.FAST_FEATURE_DETECTOR_TYPE_9_16
        )
        # Detect keypoints
        keypoints = fast.detect(gray, None)
        # Cap the number of keypoints
        if len(keypoints) > self.max_features:
            keypoints = sorted(keypoints, key=lambda x: x.response,
                               reverse=True)[:self.max_features]
        # FAST produces no descriptors; pair it with BRIEF (opencv-contrib)
        try:
            brief = cv2.xfeatures2d.BriefDescriptorExtractor_create()
            keypoints, descriptors = brief.compute(gray, keypoints)
        except (AttributeError, cv2.error):
            rospy.logwarn("BRIEF not available, returning keypoints only")
            descriptors = None
        return keypoints, descriptors

    def visualize_features(self, image, keypoints):
        """Visualize the keypoints"""
        # Draw the keypoints
        vis_image = cv2.drawKeypoints(
            image, keypoints, None,
            color=(0, 255, 0),
            flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS
        )
        # Add statistics
        text = f"{self.current_detector.upper()}: {len(keypoints)} features"
        cv2.putText(vis_image, text, (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        return vis_image

    def publish_markers(self, keypoints, header):
        """Publish 3D markers"""
        marker_array = MarkerArray()
        for i, kp in enumerate(keypoints[:50]):  # show the first 50 only
            marker = Marker()
            marker.header = header
            marker.ns = "features"
            marker.id = i
            marker.type = Marker.SPHERE
            marker.action = Marker.ADD
            # Position (normalized coordinates, assuming a 640x480 image)
            marker.pose.position.x = kp.pt[0] / 640.0
            marker.pose.position.y = kp.pt[1] / 480.0
            marker.pose.position.z = kp.response / 100.0
            # Size
            marker.scale.x = 0.01
            marker.scale.y = 0.01
            marker.scale.z = 0.01
            # Color (keyed to the response strength)
            marker.color.a = 1.0
            marker.color.r = min(1.0, kp.response / 50.0)
            marker.color.g = 1.0 - marker.color.r
            marker.color.b = 0.0
            marker.lifetime = rospy.Duration(0.1)
            marker_array.markers.append(marker)
        self.marker_pub.publish(marker_array)

if __name__ == '__main__':
    try:
        detector = FeatureDetector()
        rospy.spin()
    except rospy.ROSInterruptException:
        pass
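The detector and feature budget are exposed as private parameters, so they can be chosen at launch time. A sketch, with a hypothetical package name:
bash
rosrun my_vision_pkg feature_detector.py _detector_type:=sift _max_features:=300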
4.2 Feature Matching and Tracking
Implement feature matching and tracking:
python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import rospy
import cv2
import numpy as np
from sensor_msgs.msg import Image
from geometry_msgs.msg import Twist
from cv_bridge import CvBridge
import collections

class FeatureTracker:
    """Feature tracking node"""

    def __init__(self):
        # The node itself is initialized in __main__
        self.bridge = CvBridge()
        # Feature detector and matcher
        self.detector = cv2.ORB_create(nfeatures=500)
        self.matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        # Lucas-Kanade optical flow parameters
        self.lk_params = dict(
            winSize=(15, 15),
            maxLevel=2,
            criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03)
        )
        # Tracking state
        self.prev_gray = None
        self.prev_keypoints = None
        self.prev_descriptors = None
        self.tracks = collections.defaultdict(list)
        self.track_id = 0
        self.track_len = 10
        # ROS interfaces
        self.image_sub = rospy.Subscriber(
            "/camera/image_raw",
            Image,
            self.image_callback
        )
        self.tracked_pub = rospy.Publisher(
            "/camera/image_tracked",
            Image,
            queue_size=1
        )
        self.velocity_pub = rospy.Publisher(
            "/camera/optical_flow",
            Twist,
            queue_size=1
        )
        rospy.loginfo("Feature tracker started")

    def image_callback(self, msg):
        """Process an image"""
        try:
            cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
            gray = cv2.cvtColor(cv_image, cv2.COLOR_BGR2GRAY)
            # Track features
            tracked_image = self.track_features(cv_image, gray)
            # Publish the result
            ros_image = self.bridge.cv2_to_imgmsg(tracked_image, "bgr8")
            ros_image.header = msg.header
            self.tracked_pub.publish(ros_image)
            # Keep this frame for the next iteration
            self.prev_gray = gray.copy()
        except Exception as e:
            rospy.logerr(f"Tracking error: {e}")

    def track_features(self, image, gray):
        """Track feature points"""
        vis_image = image.copy()
        if self.prev_gray is None:
            # First frame: just detect features
            self.detect_features(gray)
        else:
            # Optical flow tracking
            self.optical_flow_tracking(gray, vis_image)
            # Feature matching
            self.feature_matching(gray, vis_image)
        return vis_image

    def detect_features(self, gray):
        """Detect fresh features"""
        # ORB feature detection
        keypoints, descriptors = self.detector.detectAndCompute(gray, None)
        self.prev_keypoints = keypoints
        self.prev_descriptors = descriptors
        # Start a track per keypoint
        for kp in keypoints:
            self.tracks[self.track_id] = [kp.pt]
            self.track_id += 1

    def optical_flow_tracking(self, gray, vis_image):
        """Optical flow tracking"""
        if self.prev_keypoints is None or len(self.prev_keypoints) == 0:
            return
        # Points from the previous frame
        p0 = np.array([kp.pt for kp in self.prev_keypoints],
                      dtype=np.float32).reshape(-1, 1, 2)
        # Compute the optical flow
        p1, st, err = cv2.calcOpticalFlowPyrLK(
            self.prev_gray, gray, p0, None, **self.lk_params
        )
        if p1 is None:
            return
        # Keep only successfully tracked points
        good_new = p1[st == 1]
        good_old = p0[st == 1]
        # Average flow
        if len(good_new) > 0:
            flow = good_new - good_old
            mean_flow = np.mean(flow, axis=0)
            # Publish the mean flow as a velocity
            twist = Twist()
            twist.linear.x = float(mean_flow[0])
            twist.linear.y = float(mean_flow[1])
            self.velocity_pub.publish(twist)
            # Draw the tracks
            for i, (new, old) in enumerate(zip(good_new, good_old)):
                a, b = new.ravel().astype(int)
                c, d = old.ravel().astype(int)
                # Track line
                cv2.line(vis_image, (a, b), (c, d), (0, 255, 0), 2)
                cv2.circle(vis_image, (a, b), 5, (0, 0, 255), -1)
            # Update the tracked keypoints
            self.prev_keypoints = [cv2.KeyPoint(pt[0], pt[1], 10)
                                   for pt in good_new]

    def feature_matching(self, gray, vis_image):
        """Feature matching"""
        # Detect features in the current frame
        keypoints, descriptors = self.detector.detectAndCompute(gray, None)
        if self.prev_descriptors is None or descriptors is None:
            self.prev_keypoints = keypoints
            self.prev_descriptors = descriptors
            return
        # Match against the previous frame
        matches = self.matcher.match(self.prev_descriptors, descriptors)
        # Sort by distance
        matches = sorted(matches, key=lambda x: x.distance)
        # Keep the best matches
        good_matches = matches[:50]
        # Draw them
        for match in good_matches:
            pt1 = self.prev_keypoints[match.queryIdx].pt
            pt2 = keypoints[match.trainIdx].pt
            # Match line
            cv2.line(vis_image,
                     (int(pt1[0]), int(pt1[1])),
                     (int(pt2[0]), int(pt2[1])),
                     (255, 0, 0), 1)
        # Add statistics
        text = f"Matches: {len(good_matches)}/{len(matches)}"
        cv2.putText(vis_image, text, (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        # Update the stored features
        self.prev_keypoints = keypoints
        self.prev_descriptors = descriptors

class FeatureMatcher:
    """Feature matcher"""

    def __init__(self):
        # The node itself is initialized in __main__
        self.bridge = CvBridge()
        # Different matcher back-ends
        self.matchers = {
            'bf': cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True),
            'flann': self.create_flann_matcher()
        }
        self.current_matcher = 'flann'
        # Template image features
        self.template_keypoints = None
        self.template_descriptors = None
        self.template_image = None
        # ROS interfaces
        self.image_sub = rospy.Subscriber(
            "/camera/image_raw",
            Image,
            self.image_callback
        )
        self.template_sub = rospy.Subscriber(
            "/template/image",
            Image,
            self.template_callback
        )
        self.matched_pub = rospy.Publisher(
            "/camera/image_matched",
            Image,
            queue_size=1
        )
        rospy.loginfo("Feature matcher started")

    def create_flann_matcher(self):
        """Create a FLANN matcher"""
        # FLANN LSH parameters (suitable for ORB's binary descriptors)
        FLANN_INDEX_LSH = 6
        index_params = dict(
            algorithm=FLANN_INDEX_LSH,
            table_number=12,
            key_size=20,
            multi_probe_level=2
        )
        search_params = dict(checks=50)
        return cv2.FlannBasedMatcher(index_params, search_params)

    def template_callback(self, msg):
        """Set the template image"""
        try:
            template = self.bridge.imgmsg_to_cv2(msg, "bgr8")
            gray = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
            # Detect the template's features
            detector = cv2.ORB_create()
            self.template_keypoints, self.template_descriptors = \
                detector.detectAndCompute(gray, None)
            self.template_image = template
            rospy.loginfo(f"Template set with {len(self.template_keypoints)} features")
        except Exception as e:
            rospy.logerr(f"Template processing error: {e}")

    def image_callback(self, msg):
        """Process an image"""
        if self.template_descriptors is None:
            return
        try:
            cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
            gray = cv2.cvtColor(cv_image, cv2.COLOR_BGR2GRAY)
            # Detect features
            detector = cv2.ORB_create()
            keypoints, descriptors = detector.detectAndCompute(gray, None)
            if descriptors is None:
                return
            # Match against the template
            matches = self.match_features(descriptors)
            # Locate the object
            result_image = self.find_object(cv_image, keypoints, matches)
            # Publish the result
            ros_image = self.bridge.cv2_to_imgmsg(result_image, "bgr8")
            ros_image.header = msg.header
            self.matched_pub.publish(ros_image)
        except Exception as e:
            rospy.logerr(f"Matching error: {e}")

    def match_features(self, descriptors):
        """Match features"""
        if self.current_matcher == 'bf':
            # Brute-force matching
            matches = self.matchers['bf'].match(
                self.template_descriptors, descriptors
            )
            matches = sorted(matches, key=lambda x: x.distance)
        else:
            # FLANN matching
            matches = self.matchers['flann'].knnMatch(
                self.template_descriptors, descriptors, k=2
            )
            # Lowe's ratio test
            good_matches = []
            for match_pair in matches:
                if len(match_pair) == 2:
                    m, n = match_pair
                    if m.distance < 0.7 * n.distance:
                        good_matches.append(m)
            matches = good_matches
        return matches

    def find_object(self, image, keypoints, matches):
        """Locate the target object in the image"""
        result = image.copy()
        if len(matches) < 4:
            return result
        # Matched point coordinates
        src_pts = np.float32([self.template_keypoints[m.queryIdx].pt
                              for m in matches]).reshape(-1, 1, 2)
        dst_pts = np.float32([keypoints[m.trainIdx].pt
                              for m in matches]).reshape(-1, 1, 2)
        # Estimate the homography
        M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
        if M is not None:
            # Template image boundary
            h, w = self.template_image.shape[:2]
            pts = np.float32([[0, 0], [0, h-1], [w-1, h-1], [w-1, 0]]).reshape(-1, 1, 2)
            # Project the boundary into the scene
            dst = cv2.perspectiveTransform(pts, M)
            # Draw the detection polygon
            result = cv2.polylines(result, [np.int32(dst)], True, (0, 255, 0), 3)
            # Match statistics
            text = f"Object found! Matches: {len(matches)}"
            cv2.putText(result, text, (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        return result

if __name__ == '__main__':
    try:
        # Private parameters need an initialized node, so init first
        # and then pick the class to run
        rospy.init_node('feature_tracking', anonymous=True)
        mode = rospy.get_param('~mode', 'tracker')
        if mode == 'tracker':
            node = FeatureTracker()
        else:
            node = FeatureMatcher()
        rospy.spin()
    except rospy.ROSInterruptException:
        pass
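FeatureMatcher needs a template before it can match anything. A minimal sketch of a one-shot, latched template publisher (the template file path is an assumption):
python
#!/usr/bin/env python3
import rospy
import cv2
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

rospy.init_node('template_publisher')
# Latch so the matcher receives the template even if it subscribes later
pub = rospy.Publisher('/template/image', Image, queue_size=1, latch=True)
img = cv2.imread('template.png')  # hypothetical template file
if img is None:
    rospy.logfatal("Could not read template.png")
else:
    pub.publish(CvBridge().cv2_to_imgmsg(img, 'bgr8'))
    rospy.spin()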
5. Object Detection and Recognition
5.1 Classical Object Detection
Implement object detection based on classical methods:
python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import rospy
import cv2
import numpy as np
from sensor_msgs.msg import Image
from std_msgs.msg import Header
from geometry_msgs.msg import Polygon, Point32
from cv_bridge import CvBridge
from vision_msgs.msg import (Detection2DArray, Detection2D, BoundingBox2D,
                             ObjectHypothesisWithPose)

class ClassicalObjectDetector:
    """Classical object detector"""

    def __init__(self):
        rospy.init_node('classical_object_detector', anonymous=True)
        self.bridge = CvBridge()
        # Detector registry
        self.detectors = {
            'cascade': self.cascade_detection,
            'hog': self.hog_detection,
            'template': self.template_matching,
            'color': self.color_detection,
            'contour': self.contour_detection
        }
        # Load the cascade classifiers
        self.face_cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
        )
        self.eye_cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + 'haarcascade_eye.xml'
        )
        # HOG pedestrian detector
        self.hog = cv2.HOGDescriptor()
        self.hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
        # Templates
        self.templates = {}
        # Stable integer ids per class name (vision_msgs in ROS1 uses int64 ids)
        self.class_id_map = {}
        # ROS interfaces
        self.image_sub = rospy.Subscriber(
            "/camera/image_raw",
            Image,
            self.image_callback
        )
        self.detection_pub = rospy.Publisher(
            "/detections",
            Detection2DArray,
            queue_size=1
        )
        self.vis_pub = rospy.Publisher(
            "/camera/image_detected",
            Image,
            queue_size=1
        )
        # Parameters
        self.detection_method = rospy.get_param('~method', 'cascade')
        rospy.loginfo(f"Classical detector started with {self.detection_method}")

    def image_callback(self, msg):
        """Process an image"""
        try:
            cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
            # Run the detection
            detections = self.detect_objects(cv_image)
            # Visualize
            vis_image = self.visualize_detections(cv_image, detections)
            # Publish the detections
            self.publish_detections(detections, msg.header)
            # Publish the visualization
            ros_image = self.bridge.cv2_to_imgmsg(vis_image, "bgr8")
            ros_image.header = msg.header
            self.vis_pub.publish(ros_image)
        except Exception as e:
            rospy.logerr(f"Detection error: {e}")

    def detect_objects(self, image):
        """Detect objects"""
        if self.detection_method in self.detectors:
            return self.detectors[self.detection_method](image)
        else:
            rospy.logwarn(f"Unknown detection method: {self.detection_method}")
            return []

    def cascade_detection(self, image):
        """Cascade classifier detection"""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        detections = []
        # Detect faces
        faces = self.face_cascade.detectMultiScale(
            gray,
            scaleFactor=1.1,
            minNeighbors=5,
            minSize=(30, 30)
        )
        for (x, y, w, h) in faces:
            detection = {
                'bbox': [x, y, w, h],
                'class': 'face',
                'score': 0.95
            }
            detections.append(detection)
            # Detect eyes inside the face region
            roi_gray = gray[y:y+h, x:x+w]
            eyes = self.eye_cascade.detectMultiScale(roi_gray)
            for (ex, ey, ew, eh) in eyes:
                eye_detection = {
                    'bbox': [x+ex, y+ey, ew, eh],
                    'class': 'eye',
                    'score': 0.90
                }
                detections.append(eye_detection)
        return detections

    def hog_detection(self, image):
        """HOG pedestrian detection"""
        detections = []
        # Detect pedestrians
        (rects, weights) = self.hog.detectMultiScale(
            image,
            winStride=(4, 4),
            padding=(8, 8),
            scale=1.05
        )
        for i, (x, y, w, h) in enumerate(rects):
            detection = {
                'bbox': [x, y, w, h],
                'class': 'person',
                'score': float(weights[i])
            }
            detections.append(detection)
        return detections

    def template_matching(self, image):
        """Template matching"""
        detections = []
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        for template_name, template in self.templates.items():
            # Match the template
            result = cv2.matchTemplate(gray, template, cv2.TM_CCOEFF_NORMED)
            # Threshold the responses
            threshold = 0.8
            locations = np.where(result >= threshold)
            # Collect bounding boxes
            h, w = template.shape[:2]
            for pt in zip(*locations[::-1]):
                detection = {
                    'bbox': [pt[0], pt[1], w, h],
                    'class': template_name,
                    'score': float(result[pt[1], pt[0]])
                }
                detections.append(detection)
        # Remove duplicates with NMS
        detections = self.non_max_suppression(detections)
        return detections

    def color_detection(self, image):
        """Color blob detection"""
        hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
        detections = []
        # HSV range per color
        color_ranges = {
            'red': [(0, 50, 50), (10, 255, 255)],
            'green': [(35, 50, 50), (85, 255, 255)],
            'blue': [(100, 50, 50), (130, 255, 255)],
            'yellow': [(20, 50, 50), (35, 255, 255)]
        }
        for color_name, (lower, upper) in color_ranges.items():
            # Build a mask
            mask = cv2.inRange(hsv, np.array(lower), np.array(upper))
            # Clean it up morphologically
            kernel = np.ones((5, 5), np.uint8)
            mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
            mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
            # Find contours
            contours, _ = cv2.findContours(
                mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
            )
            for contour in contours:
                area = cv2.contourArea(contour)
                if area > 500:  # minimum area threshold
                    x, y, w, h = cv2.boundingRect(contour)
                    detection = {
                        'bbox': [x, y, w, h],
                        'class': f'{color_name}_object',
                        'score': min(1.0, area / 10000.0)
                    }
                    detections.append(detection)
        return detections

    def contour_detection(self, image):
        """Contour / shape detection"""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        detections = []
        # Edge detection
        edges = cv2.Canny(gray, 50, 150)
        # Find contours
        contours, hierarchy = cv2.findContours(
            edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE
        )
        for contour in contours:
            area = cv2.contourArea(contour)
            if area < 100 or area > 10000:
                continue
            # Approximate the contour
            epsilon = 0.02 * cv2.arcLength(contour, True)
            approx = cv2.approxPolyDP(contour, epsilon, True)
            # Classify the shape by vertex count
            vertices = len(approx)
            if vertices == 3:
                shape = 'triangle'
            elif vertices == 4:
                x, y, w, h = cv2.boundingRect(approx)
                aspect_ratio = float(w) / h
                shape = 'square' if 0.95 <= aspect_ratio <= 1.05 else 'rectangle'
            elif vertices == 5:
                shape = 'pentagon'
            elif vertices > 5:
                shape = 'circle'
            else:
                continue
            x, y, w, h = cv2.boundingRect(contour)
            detection = {
                'bbox': [x, y, w, h],
                'class': shape,
                'score': 0.85,
                'contour': contour
            }
            detections.append(detection)
        return detections

    def non_max_suppression(self, detections, threshold=0.4):
        """Non-maximum suppression"""
        if len(detections) == 0:
            return []
        # Convert to numpy arrays
        boxes = np.array([d['bbox'] for d in detections])
        scores = np.array([d['score'] for d in detections])
        # Box corners and areas
        x1 = boxes[:, 0]
        y1 = boxes[:, 1]
        x2 = boxes[:, 0] + boxes[:, 2]
        y2 = boxes[:, 1] + boxes[:, 3]
        areas = boxes[:, 2] * boxes[:, 3]
        # Sort by score
        order = scores.argsort()[::-1]
        keep = []
        while order.size > 0:
            i = order[0]
            keep.append(i)
            # Compute IoU against the remaining boxes
            xx1 = np.maximum(x1[i], x1[order[1:]])
            yy1 = np.maximum(y1[i], y1[order[1:]])
            xx2 = np.minimum(x2[i], x2[order[1:]])
            yy2 = np.minimum(y2[i], y2[order[1:]])
            w = np.maximum(0.0, xx2 - xx1)
            h = np.maximum(0.0, yy2 - yy1)
            inter = w * h
            iou = inter / (areas[i] + areas[order[1:]] - inter)
            # Keep boxes whose IoU is below the threshold
            inds = np.where(iou <= threshold)[0]
            order = order[inds + 1]
        return [detections[i] for i in keep]

    def visualize_detections(self, image, detections):
        """Visualize the detections"""
        result = image.copy()
        # Color per class
        color_map = {
            'face': (255, 0, 0),
            'eye': (0, 255, 0),
            'person': (0, 0, 255),
            'circle': (255, 255, 0),
            'rectangle': (255, 0, 255),
            'square': (0, 255, 255),
            'triangle': (128, 255, 0)
        }
        for detection in detections:
            x, y, w, h = detection['bbox']
            class_name = detection['class']
            score = detection['score']
            # Pick a color
            color = color_map.get(class_name, (255, 255, 255))
            # Bounding box
            cv2.rectangle(result, (x, y), (x+w, y+h), color, 2)
            # Label
            label = f"{class_name}: {score:.2f}"
            label_size, _ = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)
            cv2.rectangle(result, (x, y-label_size[1]-4),
                          (x+label_size[0], y), color, -1)
            cv2.putText(result, label, (x, y-2),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)
            # Contour, if present
            if 'contour' in detection:
                cv2.drawContours(result, [detection['contour']], -1, color, 1)
        # Statistics
        info = f"Detections: {len(detections)}"
        cv2.putText(result, info, (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        return result

    def publish_detections(self, detections, header):
        """Publish the detections"""
        detection_array = Detection2DArray()
        detection_array.header = header
        for det in detections:
            detection = Detection2D()
            detection.header = header
            # Bounding box
            bbox = BoundingBox2D()
            bbox.center.x = det['bbox'][0] + det['bbox'][2] / 2
            bbox.center.y = det['bbox'][1] + det['bbox'][3] / 2
            bbox.size_x = det['bbox'][2]
            bbox.size_y = det['bbox'][3]
            detection.bbox = bbox
            # Class and score: a hypothesis must be appended before use,
            # and vision_msgs in ROS1 stores the class as an int64 id
            hypothesis = ObjectHypothesisWithPose()
            hypothesis.id = self.class_id_map.setdefault(
                det['class'], len(self.class_id_map))
            hypothesis.score = float(det['score'])
            detection.results.append(hypothesis)
            detection_array.detections.append(detection)
        self.detection_pub.publish(detection_array)

if __name__ == '__main__':
    try:
        detector = ClassicalObjectDetector()
        rospy.spin()
    except rospy.ROSInterruptException:
        pass
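The detection method is selected through the ~method parameter, and results can be inspected directly on the topic. A sketch, with a hypothetical package name:
bash
rosrun my_vision_pkg classical_detector.py _method:=color
rostopic echo /detections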
5.2 Deep Learning Object Detection
Integrate deep learning models for object detection:
python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import rospy
import cv2
import numpy as np
from sensor_msgs.msg import Image
from cv_bridge import CvBridge
from vision_msgs.msg import Detection2DArray, Detection2D, BoundingBox2D, ObjectHypothesisWithPose
import torch
import torchvision

class DeepLearningDetector:
    """Deep learning object detector"""

    def __init__(self):
        # The node itself is initialized in __main__
        self.bridge = CvBridge()
        # Select the device
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        rospy.loginfo(f"Using device: {self.device}")
        # Load the pretrained model
        self.model_name = rospy.get_param('~model', 'fasterrcnn')
        self.load_model()
        # COCO category names; torchvision's COCO label ids are sparse (1-90),
        # so 'N/A' placeholders keep the indices aligned
        self.class_names = [
            '__background__', 'person', 'bicycle', 'car', 'motorcycle',
            'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
            'fire hydrant', 'N/A', 'stop sign', 'parking meter', 'bench',
            'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant',
            'bear', 'zebra', 'giraffe', 'N/A', 'backpack', 'umbrella',
            'N/A', 'N/A', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis',
            'snowboard', 'sports ball', 'kite', 'baseball bat',
            'baseball glove', 'skateboard', 'surfboard', 'tennis racket',
            'bottle', 'N/A', 'wine glass', 'cup', 'fork', 'knife', 'spoon',
            'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli',
            'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair',
            'couch', 'potted plant', 'bed', 'N/A', 'dining table', 'N/A',
            'N/A', 'toilet', 'N/A', 'tv', 'laptop', 'mouse', 'remote',
            'keyboard', 'cell phone', 'microwave', 'oven', 'toaster',
            'sink', 'refrigerator', 'N/A', 'book', 'clock', 'vase',
            'scissors', 'teddy bear', 'hair drier', 'toothbrush'
        ]
        # ROS interfaces
        self.image_sub = rospy.Subscriber(
            "/camera/image_raw",
            Image,
            self.image_callback
        )
        self.detection_pub = rospy.Publisher(
            "/deep_learning/detections",
            Detection2DArray,
            queue_size=1
        )
        self.vis_pub = rospy.Publisher(
            "/deep_learning/visualization",
            Image,
            queue_size=1
        )
        # Parameters
        self.confidence_threshold = rospy.get_param('~confidence', 0.5)
        self.nms_threshold = rospy.get_param('~nms', 0.4)
        rospy.loginfo(f"Deep learning detector started with {self.model_name}")

    def load_model(self):
        """Load the deep learning model"""
        if self.model_name == 'fasterrcnn':
            # Faster R-CNN
            self.model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
                pretrained=True
            )
        elif self.model_name == 'maskrcnn':
            # Mask R-CNN
            self.model = torchvision.models.detection.maskrcnn_resnet50_fpn(
                pretrained=True
            )
        elif self.model_name == 'retinanet':
            # RetinaNet
            self.model = torchvision.models.detection.retinanet_resnet50_fpn(
                pretrained=True
            )
        else:
            rospy.logwarn(f"Unknown model: {self.model_name}, using Faster R-CNN")
            self.model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
                pretrained=True
            )
        self.model.to(self.device)
        self.model.eval()

    def image_callback(self, msg):
        """Process an image"""
        try:
            # Convert the image
            cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
            # Run the detection
            detections = self.detect_objects(cv_image)
            # Visualize
            vis_image = self.visualize_detections(cv_image, detections)
            # Publish the detections
            self.publish_detections(detections, msg.header)
            # Publish the visualization
            ros_image = self.bridge.cv2_to_imgmsg(vis_image, "bgr8")
            ros_image.header = msg.header
            self.vis_pub.publish(ros_image)
        except Exception as e:
            rospy.logerr(f"Detection error: {e}")

    def detect_objects(self, image):
        """Detect objects with the deep learning model"""
        # Preprocess the image
        image_tensor = self.preprocess_image(image)
        # Inference
        with torch.no_grad():
            predictions = self.model(image_tensor)
        # Postprocess
        detections = self.postprocess_predictions(predictions[0])
        return detections

    def preprocess_image(self, image):
        """Preprocess an image for the model"""
        # BGR to RGB
        image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        # Convert to a tensor
        image_tensor = torchvision.transforms.functional.to_tensor(image_rgb)
        # Add the batch dimension
        image_tensor = image_tensor.unsqueeze(0)
        return image_tensor.to(self.device)

    def postprocess_predictions(self, prediction):
        """Postprocess the model predictions"""
        detections = []
        # Extract the outputs
        boxes = prediction['boxes'].cpu().numpy()
        labels = prediction['labels'].cpu().numpy()
        scores = prediction['scores'].cpu().numpy()
        # Instance masks, if present (Mask R-CNN)
        masks = None
        if 'masks' in prediction:
            masks = prediction['masks'].cpu().numpy()
        # Keep high-confidence detections
        for i in range(len(scores)):
            if scores[i] > self.confidence_threshold:
                detection = {
                    'bbox': boxes[i],
                    'label': labels[i],
                    'score': scores[i],
                    'class_name': self.class_names[labels[i]]
                }
                if masks is not None:
                    detection['mask'] = masks[i, 0] > 0.5
                detections.append(detection)
        return detections

    def visualize_detections(self, image, detections):
        """Visualize the detections"""
        result = image.copy()
        for detection in detections:
            # Unpack
            bbox = detection['bbox']
            label = detection['label']
            score = detection['score']
            class_name = detection['class_name']
            # Bounding box
            x1, y1, x2, y2 = bbox.astype(int)
            color = self.get_color(label)
            cv2.rectangle(result, (x1, y1), (x2, y2), color, 2)
            # Label
            label_text = f"{class_name}: {score:.2f}"
            label_size, _ = cv2.getTextSize(
                label_text, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1
            )
            cv2.rectangle(result, (x1, y1-label_size[1]-4),
                          (x1+label_size[0], y1), color, -1)
            cv2.putText(result, label_text, (x1, y1-2),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)
            # Mask overlay, if present (cv2.addWeighted has no masking
            # argument, so blend the masked pixels with numpy instead)
            if 'mask' in detection:
                mask = detection['mask']
                result[mask] = (0.7 * result[mask] +
                                0.3 * np.array(color)).astype(np.uint8)
        # Statistics
        info = f"{self.model_name}: {len(detections)} objects"
        cv2.putText(result, info, (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        return result

    def get_color(self, label):
        """Pick a color for a class label"""
        # Generate distinct colors via the HSV color space
        hue = (label * 30) % 180
        hsv = np.array([[[hue, 255, 255]]], dtype=np.uint8)
        bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)[0, 0]
        return tuple(int(c) for c in bgr)

    def publish_detections(self, detections, header):
        """Publish the detections"""
        detection_array = Detection2DArray()
        detection_array.header = header
        for detection in detections:
            det_msg = Detection2D()
            det_msg.header = header
            # Bounding box
            bbox = BoundingBox2D()
            x1, y1, x2, y2 = detection['bbox']
            bbox.center.x = (x1 + x2) / 2
            bbox.center.y = (y1 + y2) / 2
            bbox.size_x = x2 - x1
            bbox.size_y = y2 - y1
            det_msg.bbox = bbox
            # Classification (vision_msgs in ROS1 stores an int64 id)
            hypothesis = ObjectHypothesisWithPose()
            hypothesis.id = int(detection['label'])
            hypothesis.score = float(detection['score'])
            det_msg.results.append(hypothesis)
            detection_array.detections.append(det_msg)
        self.detection_pub.publish(detection_array)

class YOLODetector:
    """YOLO object detector"""

    def __init__(self):
        # The node itself is initialized in __main__
        self.bridge = CvBridge()
        # YOLO configuration
        self.config_path = rospy.get_param('~config', 'yolov4.cfg')
        self.weights_path = rospy.get_param('~weights', 'yolov4.weights')
        self.names_path = rospy.get_param('~names', 'coco.names')
        # Load the model
        self.load_yolo()
        # ROS interfaces
        self.image_sub = rospy.Subscriber(
            "/camera/image_raw",
            Image,
            self.image_callback
        )
        self.detection_pub = rospy.Publisher(
            "/yolo/detections",
            Detection2DArray,
            queue_size=1
        )
        self.vis_pub = rospy.Publisher(
            "/yolo/visualization",
            Image,
            queue_size=1
        )
        # Parameters
        self.confidence_threshold = rospy.get_param('~confidence', 0.5)
        self.nms_threshold = rospy.get_param('~nms', 0.4)
        rospy.loginfo("YOLO detector started")

    def load_yolo(self):
        """Load the YOLO model"""
        try:
            # Load the class names
            with open(self.names_path, 'r') as f:
                self.classes = [line.strip() for line in f.readlines()]
            # Generate per-class colors
            self.colors = np.random.randint(0, 255,
                                            size=(len(self.classes), 3),
                                            dtype='uint8')
            # Load the network
            self.net = cv2.dnn.readNet(self.weights_path, self.config_path)
            # Resolve the output layers (getUnconnectedOutLayers returns
            # scalars or 1x1 arrays depending on the OpenCV version,
            # so flatten to handle both)
            layer_names = self.net.getLayerNames()
            out_ids = np.array(self.net.getUnconnectedOutLayers()).flatten()
            self.output_layers = [layer_names[i - 1] for i in out_ids]
            rospy.loginfo(f"YOLO loaded with {len(self.classes)} classes")
        except Exception as e:
            rospy.logerr(f"Failed to load YOLO: {e}")

    def image_callback(self, msg):
        """Process an image"""
        try:
            cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
            # YOLO detection
            detections = self.detect_yolo(cv_image)
            # Visualize
            vis_image = self.visualize_yolo(cv_image, detections)
            # Publish the results
            self.publish_detections(detections, msg.header)
            ros_image = self.bridge.cv2_to_imgmsg(vis_image, "bgr8")
            ros_image.header = msg.header
            self.vis_pub.publish(ros_image)
        except Exception as e:
            rospy.logerr(f"YOLO detection error: {e}")

    def detect_yolo(self, image):
        """Run YOLO detection"""
        height, width = image.shape[:2]
        # Build the input blob
        blob = cv2.dnn.blobFromImage(
            image, 1/255.0, (416, 416),
            swapRB=True, crop=False
        )
        # Set the input
        self.net.setInput(blob)
        # Forward pass
        outputs = self.net.forward(self.output_layers)
        # Parse the outputs
        detections = []
        boxes = []
        confidences = []
        class_ids = []
        for output in outputs:
            for detection in output:
                scores = detection[5:]
                class_id = np.argmax(scores)
                confidence = scores[class_id]
                if confidence > self.confidence_threshold:
                    # Compute the bounding box
                    center_x = int(detection[0] * width)
                    center_y = int(detection[1] * height)
                    w = int(detection[2] * width)
                    h = int(detection[3] * height)
                    x = int(center_x - w / 2)
                    y = int(center_y - h / 2)
                    boxes.append([x, y, w, h])
                    confidences.append(float(confidence))
                    class_ids.append(class_id)
        # NMS
        indices = cv2.dnn.NMSBoxes(
            boxes, confidences,
            self.confidence_threshold,
            self.nms_threshold
        )
        if len(indices) > 0:
            for i in np.array(indices).flatten():
                detections.append({
                    'bbox': boxes[i],
                    'confidence': confidences[i],
                    'class_id': class_ids[i],
                    'class_name': self.classes[class_ids[i]]
                })
        return detections

    def visualize_yolo(self, image, detections):
        """Visualize the YOLO detections"""
        result = image.copy()
        for detection in detections:
            x, y, w, h = detection['bbox']
            class_id = detection['class_id']
            confidence = detection['confidence']
            class_name = detection['class_name']
            # Color
            color = tuple(int(c) for c in self.colors[class_id])
            # Bounding box
            cv2.rectangle(result, (x, y), (x + w, y + h), color, 2)
            # Label
            label = f"{class_name}: {confidence:.2f}"
            cv2.putText(result, label, (x, y - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
        return result

    def publish_detections(self, detections, header):
        """Publish the detections"""
        detection_array = Detection2DArray()
        detection_array.header = header
        for detection in detections:
            det_msg = Detection2D()
            det_msg.header = header
            # Bounding box
            bbox = BoundingBox2D()
            x, y, w, h = detection['bbox']
            bbox.center.x = x + w / 2
            bbox.center.y = y + h / 2
            bbox.size_x = w
            bbox.size_y = h
            det_msg.bbox = bbox
            # Classification (vision_msgs in ROS1 stores an int64 id)
            hypothesis = ObjectHypothesisWithPose()
            hypothesis.id = int(detection['class_id'])
            hypothesis.score = float(detection['confidence'])
            det_msg.results.append(hypothesis)
            detection_array.detections.append(det_msg)
        self.detection_pub.publish(detection_array)

if __name__ == '__main__':
    try:
        # Private parameters need an initialized node, so init first
        # and then pick the detector backend
        rospy.init_node('object_detector', anonymous=True)
        detector_type = rospy.get_param('~detector_type', 'pytorch')
        if detector_type == 'pytorch':
            detector = DeepLearningDetector()
        else:
            detector = YOLODetector()
        rospy.spin()
    except rospy.ROSInterruptException:
        pass
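Both detectors read their configuration from private parameters, so model choice and thresholds can be set at launch time. A sketch (the package name and file paths are assumptions):
bash
# PyTorch-based detector with Mask R-CNN
rosrun my_vision_pkg dl_detector.py _detector_type:=pytorch _model:=maskrcnn _confidence:=0.6
# OpenCV-DNN YOLO detector with explicit model files
rosrun my_vision_pkg dl_detector.py _detector_type:=yolo \
  _config:=/path/to/yolov4.cfg _weights:=/path/to/yolov4.weights _names:=/path/to/coco.names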