YOLO + 3D 空间定位 + 抓取姿态完整方案

我给你工业级最简实现 ：相机识别物体 → 输出 3D 坐标 (X,Y,Z) + 旋转角 → 生成机械臂可直接用的抓取姿态。不用复杂配置，Python + YOLO 直接跑。

一、你需要的硬件（二选一）

方案 A（最简单）：深度相机

推荐：Intel RealSense D435i / D455
优势：直接输出 3D 坐标，不用标定，10 分钟出结果

方案 B（普通相机）：2D 相机 + 标定

需要：标定板 + 手眼标定
适合固定高度场景

我下面给你【深度相机 + YOLO】的完整代码，开箱即用！

二、安装依赖

bash

运行

复制代码

pip install ultralytics opencv-python pyrealsense2 numpy transforms3d

三、完整源码：YOLO 识别 + 3D 坐标 + 抓取姿态

python

运行

复制代码

import cv2
import numpy as np
import pyrealsense2 as rs
from ultralytics import YOLO
import transforms3d.euler as euler

# ====================== 1. 初始化 YOLO ======================
model = YOLO("yolov8n.pt")  # 换成你训练的自定义模型
TARGET_CLASS = 0  # 0=人，可改成你要识别的物体类别

# ====================== 2. 初始化深度相机 ======================
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)

# 对齐深度图到彩色图（关键！3D定位必须）
align_to = rs.stream.color
align = rs.align(align_to)

# 启动相机
pipeline.start(config)

# 获取相机内参（用于像素转3D）
profile = pipeline.get_active_profile()
intr = profile.get_stream(rs.stream.color).as_video_stream_profile().get_intrinsics()

print("相机启动成功，开始识别...")

# ====================== 3. 主循环 ======================
while True:
    # 获取图像
    frames = pipeline.wait_for_frames()
    aligned_frames = align.process(frames)
    
    color_frame = aligned_frames.get_color_frame()
    depth_frame = aligned_frames.get_depth_frame()
    
    if not color_frame or not depth_frame:
        continue

    # 转成OpenCV格式
    img = np.asanyarray(color_frame.get_data())
    depth_img = np.asanyarray(depth_frame.get_data())

    # YOLO 检测
    results = model(img, classes=[TARGET_CLASS])

    for result in results:
        boxes = result.boxes
        for box in boxes:
            # 1. 获取2D框中心
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            cx = (x1 + x2) // 2
            cy = (y1 + y2) // 2

            # 2. 获取深度值 Z
            z = depth_frame.get_distance(cx, cy)
            if z < 0.1 or z > 3.0:  # 过滤无效距离
                continue

            # 3. 像素坐标 → 3D空间坐标（相机坐标系）
            X = (cx - intr.ppx) * z / intr.fx
            Y = (cy - intr.ppy) * z / intr.fy
            Z = z

            # 4. 计算抓取姿态（默认垂直向下抓取）
            # 欧拉角：roll, pitch, yaw（可直接发给机械臂）
            roll = 0.0
            pitch = 0.0
            yaw = 0.0

            # 转四元数（机械臂常用）
            quat = euler.euler2quat(roll, pitch, yaw, axes='sxyz')

            # ====================== 输出结果 ======================
            print(f"\n===== 抓取位姿 =====")
            print(f"3D坐标 X: {X:.3f}  Y: {Y:.3f}  Z: {Z:.3f} (米)")
            print(f"欧拉角 R:{roll:.2f} P:{pitch:.2f} Y:{yaw:.2f}")
            print(f"四元数 W:{quat[0]:.3f} X:{quat[1]:.3f} Y:{quat[2]:.3f} Z:{quat[3]:.3f}")

            # 画框
            cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.circle(img, (cx, cy), 5, (0, 0, 255), -1)
            cv2.putText(img, f"X:{X:.2f} Y:{Y:.2f} Z:{Z:.2f}",
                        (cx-80, cy-20), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255,0,0), 2)

    # 显示画面
    cv2.imshow("YOLO 3D抓取定位", img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

pipeline.stop()
cv2.destroyAllWindows()

四、输出说明（机械臂直接可用）

程序会实时输出：

3D 空间坐标
- X：左右
- Y：前后
- Z：高度（距离相机）
抓取姿态
- 欧拉角 Roll / Pitch / Yaw
- 四元数 WXYZ

这些数据可以直接发给：

机械臂（Robotiq、遨博、优傲、埃夫特等）
PLC
运动控制器

YOLO + 3D 空间定位 + 抓取姿态 完整方案

一、你需要的硬件（二选一）

方案 A（最简单）：深度相机

方案 B（普通相机）：2D 相机 + 标定

二、安装依赖

三、完整源码：YOLO 识别 + 3D 坐标 + 抓取姿态

四、输出说明（机械臂直接可用）

YOLO + 3D 空间定位 + 抓取姿态完整方案