mono3d summary - tech stack

LiDAR coordinate systems

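LiDAR coordinate systems fall roughly into two kinds: the standard LiDAR coordinate system and the nuScenes LiDAR coordinate system (see the reference link). This coordinate system is consistent with the vehicle's ego coordinate system.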

Standard LiDAR coordinate system
opendet3d, mmdetection3d, and KITTI all use this coordinate system.


                 up z
                    ^   x front
                    |  /
                    | /
     left y <------ 0

The sensor setup of the KITTI recording platform is shown below; the red circle marks the LiDAR coordinate system.

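The global yaw discussed below is the angle between the object heading and the '-y' axis: it is 0 when the heading coincides with '-y' and 90 degrees when it coincides with x. Note that the lidar_box3d.py docstring quoted later uses a different zero direction (yaw = 0 along +x).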

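nuScenes LiDAR coordinate system
The nuScenes sensor coordinate systems are shown below. The nuScenes LiDAR coordinate system is rotated by 90 degrees relative to the standard LiDAR coordinate system.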
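
The 90-degree relation can be sketched as a simple axis swap. This is a minimal sketch that assumes the nuScenes LiDAR frame has x pointing right, y pointing forward, and z pointing up; that axis layout is an assumption not stated in the text above and should be checked against the sensor diagram. The function name is hypothetical.

```python
def nuscenes_lidar_to_standard(point):
    """Map a point from the assumed nuScenes LiDAR frame
    (x right, y front, z up) to the standard LiDAR frame
    (x front, y left, z up). Pure rotation about z, no translation."""
    x_right, y_front, z_up = point
    # front stays front, "left" is the negative of "right", up is unchanged
    return (y_front, -x_right, z_up)


# A point 10 m ahead and 2 m to the right of the sensor, 1 m up.
p_std = nuscenes_lidar_to_standard((2.0, 10.0, 1.0))
print(p_std)  # x front = 10, y left = -2, z up = 1
```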

local yaw & global yaw

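Because of perspective projection, an object's appearance on the image plane is affected by both its own rotation and its displacement relative to the camera. This is why local yaw and global yaw are distinguished.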

The network learns the local yaw ($\alpha_z$ below, where $\alpha_z = \alpha_x + \pi/2$). At inference time, the global yaw is computed from the object position plus the local yaw.
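
The inference step can be sketched as follows. This is a minimal sketch that assumes camera coordinates (x right, z front) and the same sign convention as the fcos3d snippet at the end of this post, where local = global - atan2(x, z). The function name is hypothetical, and the result is not wrapped back into [-pi, pi].

```python
import math


def local_to_global_yaw(local_yaw, x, z):
    """Recover global yaw from the predicted local yaw and the object
    center (x, z) in camera coordinates."""
    # Angle of the viewing ray to the object, measured from the z axis.
    ray_angle = math.atan2(x, z)
    return local_yaw + ray_angle


# An object on the optical axis: local and global yaw coincide.
print(local_to_global_yaw(0.3, 0.0, 20.0))
# The same local yaw seen 45 degrees off-axis gives a different global yaw.
print(local_to_global_yaw(0.3, 10.0, 10.0))
```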

In the KITTI dataset, $\alpha_x$ is defined as follows:

$\alpha \in [-\pi, \pi]$, i.e. from $-180^\circ$ to $180^\circ$.

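$\alpha = 0$: the object's heading is perpendicular to the viewing ray from the camera; for an object on the optical axis, it points along the camera x axis (to the right).

$\alpha > 0$: the heading is rotated clockwise from that direction when viewed from above, i.e. it turns toward the camera.

$\alpha < 0$: the heading is rotated counterclockwise from that direction when viewed from above, i.e. it turns away from the camera.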

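Some companies annotate local yaw on 2D objects with a different convention: 90 degrees when the object heading coincides with the camera z axis, and 0 degrees when it coincides with the right-pointing camera x axis.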

Global yaw lies in [-pi, pi]. Typically straight ahead is 0, left is 90 degrees, and right is -90 degrees. See the definition in lidar_box3d.py:


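import numpy as np
import torch
from torch import Tensor

# BaseInstance3DBoxes and rotation_3d_in_axis are defined elsewhere in mmdetection3d.
class LiDARInstance3DBoxes(BaseInstance3DBoxes):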
   """3D boxes of instances in LIDAR coordinates.

   Coordinates in LiDAR:

   .. code-block:: none

                                up z    x front (yaw=0)
                                   ^   ^
                                   |  /
                                   | /
       (yaw=0.5*pi) left y <------ 0

   The relative coordinate of bottom center in a LiDAR box is (0.5, 0.5, 0),
   and the yaw is around the z axis, thus the rotation axis=2. The yaw is 0 at
   the positive direction of x axis, and increases from the positive direction
   of x to the positive direction of y.

   Attributes:
       tensor (Tensor): Float matrix with shape (N, box_dim).
       box_dim (int): Integer indicating the dimension of a box. Each row is
           (x, y, z, x_size, y_size, z_size, yaw, ...).
       with_yaw (bool): If False, the value of yaw will be set to 0 as minmax
           boxes.
   """
   YAW_AXIS = 2

   @property
   def corners(self) -> Tensor:
       """Convert boxes to corners in clockwise order, in the form of (x0y0z0,
       x0y0z1, x0y1z1, x0y1z0, x1y0z0, x1y0z1, x1y1z1, x1y1z0).

       .. code-block:: none

                                          up z
                           front x           ^
                                /            |
                               /             |
                 (x1, y0, z1) + -----------  + (x1, y1, z1)
                             /|            / |
                            / |           /  |
              (x0, y0, z1) + ----------- +   + (x1, y1, z0)
                           |  /      .   |  /
                           | / origin    | /
           left y <------- + ----------- + (x0, y1, z0)
               (x0, y0, z0)

       Returns:
           Tensor: A tensor with 8 corners of each box in shape (N, 8, 3).
       """
       if self.tensor.numel() == 0:
           return torch.empty([0, 8, 3], device=self.tensor.device)

       dims = self.dims
       corners_norm = torch.from_numpy(
           np.stack(np.unravel_index(np.arange(8), [2] * 3), axis=1)).to(
               device=dims.device, dtype=dims.dtype)

       corners_norm = corners_norm[[0, 1, 3, 2, 4, 5, 7, 6]]
       # use relative origin (0.5, 0.5, 0)
       corners_norm = corners_norm - dims.new_tensor([0.5, 0.5, 0])
       corners = dims.view([-1, 1, 3]) * corners_norm.reshape([1, 8, 3])

       # rotate around z axis
       corners = rotation_3d_in_axis(
           corners, self.tensor[:, 6], axis=self.YAW_AXIS)
       corners += self.tensor[:, :3].view(-1, 1, 3)
       return corners
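
To make the corner layout concrete, here is a dependency-free sketch of the same computation for a single box. It is not the mmdetection3d implementation: the function name is hypothetical, and the rotation is written out under the assumption stated in the docstring above (yaw is 0 along +x and increases toward +y, with the box origin at the bottom center).

```python
import math


def lidar_box_corners(x, y, z, x_size, y_size, z_size, yaw):
    """Return the 8 corners of one LiDAR box, in the order
    x0y0z0, x0y0z1, x0y1z1, x0y1z0, x1y0z0, x1y0z1, x1y1z1, x1y1z0."""
    norm = [(0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0),
            (1, 0, 0), (1, 0, 1), (1, 1, 1), (1, 1, 0)]
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for nx, ny, nz in norm:
        # Offsets relative to the bottom center, i.e. origin (0.5, 0.5, 0).
        dx = (nx - 0.5) * x_size
        dy = (ny - 0.5) * y_size
        dz = nz * z_size
        # Counterclockwise rotation about z (from +x toward +y).
        rx = c * dx - s * dy
        ry = s * dx + c * dy
        corners.append((x + rx, y + ry, z + dz))
    return corners


# A 4 x 2 x 1.5 box at the origin, first unrotated, then turned to face left (+y).
print(lidar_box_corners(0, 0, 0, 4.0, 2.0, 1.5, 0.0)[0])
print(lidar_box_corners(0, 0, 0, 4.0, 2.0, 1.5, math.pi / 2)[0])
```

With yaw = pi/2 the long side of the box lies along the y axis, which matches the "left y (yaw=0.5*pi)" label in the docstring diagram.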

The coordinate systems defined in mmdetection3d's box_3d_mode.py:


class Box3DMode(IntEnum):
   """Enum of different ways to represent a box.

   Coordinates in LiDAR:

   .. code-block:: none

                   up z
                      ^   x front
                      |  /
                      | /
       left y <------ 0

   The relative coordinate of bottom center in a LiDAR box is (0.5, 0.5, 0),
   and the yaw is around the z axis, thus the rotation axis=2.

   Coordinates in Camera:

   .. code-block:: none

               z front
              /
             /
            0 ------> x right
            |
            |
            v
       down y

   The relative coordinate of bottom center in a CAM box is (0.5, 1.0, 0.5),
   and the yaw is around the y axis, thus the rotation axis=1.

   Coordinates in Depth:

   .. code-block:: none

       up z
          ^   y front
          |  /
          | /
          0 ------> x right

   The relative coordinate of bottom center in a DEPTH box is (0.5, 0.5, 0),
   and the yaw is around the z axis, thus the rotation axis=2.
   """
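
The three axis layouts in this docstring imply simple axis permutations between the frames. The sketch below reads them directly off the diagrams and ignores the translation and calibration between real sensors, so it only illustrates axis directions. The function names are hypothetical. The yaw conversion additionally assumes that camera yaw is a right-handed rotation about the downward y axis with yaw = 0 along camera +x, which is not stated in the docstring.

```python
import math


def lidar_to_cam_axes(point):
    """LiDAR (x front, y left, z up) -> camera (x right, y down, z front)."""
    x, y, z = point
    return (-y, -z, x)


def lidar_to_depth_axes(point):
    """LiDAR (x front, y left, z up) -> depth (x right, y front, z up)."""
    x, y, z = point
    return (-y, x, z)


def lidar_yaw_to_cam_yaw(yaw_lidar):
    """Convert a LiDAR yaw (about z, 0 along +x) to a camera yaw
    (about y, 0 along +x), under the assumption named above.
    The result is not wrapped back into [-pi, pi]."""
    return -yaw_lidar - math.pi / 2


# A point 10 m ahead, 2 m to the left, and 1 m up in the LiDAR frame.
print(lidar_to_cam_axes((10.0, 2.0, 1.0)))
print(lidar_to_depth_axes((10.0, 2.0, 1.0)))
# An object heading straight ahead in the LiDAR frame.
print(lidar_yaw_to_cam_yaw(0.0))
```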

The paper "SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation" illustrates local yaw with a diagram:

global2local conversion
See the fcos3d code:


def _get_target_single(..):
    # ...
    # change orientation to local yaw:
    # local = global - atan2(x, z), using box center x = [..., 0] and z = [..., 2]
    gt_bboxes_3d[..., 6] = -torch.atan2(
        gt_bboxes_3d[..., 0], gt_bboxes_3d[..., 2]) + gt_bboxes_3d[..., 6]
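
The subtraction above can leave [-pi, pi], so a standalone version usually wraps the result. The following is a minimal scalar sketch of that idea, not the fcos3d code: both function names are hypothetical, and the wrapping step is an addition that the snippet above does not perform.

```python
import math


def wrap_to_pi(angle):
    """Wrap an angle in radians into [-pi, pi)."""
    period = 2 * math.pi
    return angle - math.floor(angle / period + 0.5) * period


def global_to_local_yaw(global_yaw, x, z):
    """Local yaw = global yaw minus the viewing-ray angle atan2(x, z),
    wrapped into [-pi, pi). x and z are the box center in camera coordinates."""
    return wrap_to_pi(global_yaw - math.atan2(x, z))


# The same global yaw gives different local yaws at different positions.
print(global_to_local_yaw(0.0, 0.0, 10.0))
print(global_to_local_yaw(0.0, 10.0, 10.0))
# A case where the raw difference falls below -pi and is wrapped.
print(global_to_local_yaw(-3.0, 10.0, 10.0))
```

This is why the network regresses local yaw: two objects with identical global yaw look different in the image when they sit at different viewing angles, while objects with identical local yaw look alike.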