【opencv】示例-essential_mat_reconstr.cpp 从两幅图像中恢复3D场景的几何信息

导入OpenCV的calib3d, highgui, imgproc模块以及C++的vector, iostream, fstream库。
定义了getError2EpipLines函数，这个函数用来计算两组点相对于F矩阵（基础矩阵）的投影误差。
定义了sgn函数，用于返回一个双精度浮点数的符号。
定义了getPlanes函数，用于通过RANSAC算法和3D点集从一系列点中找到若干平面，同时这些点会被分类到对应的平面。
main函数则完成了整个立体视觉的重建流程，具体步骤包括：

读取数据文件和图像；
从图像中检测关键点并提取SIFT特征描述符；
使用FLANN库进行高效特征匹配，并使用Lowe's比率测试筛选优秀的匹配；
通过匹配的特征点计算本质矩阵（Essential Matrix）；
计算特征点到极线（epipolar lines）的平均误差；
本质矩阵分解得到旋转矩阵和平移矩阵；
通过三角化方法恢复出3D空间中的点；
对恢复出的3D点进行平面拟合，以识别出可能的物理平面；
显示和保存最终处理后的图像。

总的来说，这段代码展示了如何使用OpenCV库在计算机视觉中从两幅图像中恢复3D场景的几何信息。

leuvenA

leuvenB

essential_mat_data.txt

终端输出：

properties 复制代码

RANSAC essential matrix time 88023mcs.
Number of inliers 212
Mean error to epipolar lines 0.219943
Number of object points 212

cpp 复制代码

#include "opencv2/calib3d.hpp"  // 包含OpenCV相机校准和3D重建的库
#include "opencv2/highgui.hpp"  // 包含OpenCV图像显示的库
#include "opencv2/imgproc.hpp"  // 包含OpenCV图像处理的库


#include <vector>  // 包含向量容器类的头文件
#include <iostream>  // 包含标准输入输出流的头文件
#include <fstream>  // 包含文件输入输出流的头文件


using namespace cv;  // 使用OpenCV命名空间
static double getError2EpipLines(const Mat &F, const Mat &pts1, const Mat &pts2, const Mat &mask) {
    Mat points1, points2;
    // 将点坐标扩展为齐次坐标形式（在原坐标基础上增加一维，设为1）
    vconcat(pts1, Mat::ones(1, pts1.cols, pts1.type()), points1);
    vconcat(pts2, Mat::ones(1, pts2.cols, pts2.type()), points2);


    double mean_error = 0;
    // 计算并累积所有点到极线的距离作为误差
    for (int pt = 0; pt < (int)mask.total(); pt++)
        if (mask.at<uchar>(pt)) { // 如果该点是内点，则计算
            const Mat l2 = F * points1.col(pt); // 使用基础矩阵F计算极线
            const Mat l1 = F.t() * points2.col(pt); // 使用F的转置计算极线
            // 计算点到极线的距离并取平均
            mean_error += (fabs(points1.col(pt).dot(l1)) / sqrt(pow(l1.at<double>(0), 2) + pow(l1.at<double>(1), 2)) +
                           fabs(points2.col(pt).dot(l2)) / sqrt(pow(l2.at<double>(0), 2) + pow(l2.at<double>(1), 2))) / 2;
        }
    return mean_error / mask.total(); // 返回平均误差
}
static int sgn(double val) { return (0 < val) - (val < 0); } // 符号函数


/*
 * @points3d - vector of Point3 or Mat of size Nx3
 * @planes - vector of found planes
 * @labels - vector of size point3d. Every point which has non-zero label is classified to this plane.
 */
// getPlanes函数，从3D点云中拟合多个平面
// @points3d - 3D点的集合，可以是Point3的向量或者是Nx3的Mat类型
// @planes - 存储找到的平面
// @labels - 和points3d大小相同的向量，每个非0标签的点都被归类到对应的平面
static void getPlanes (InputArray points3d_, std::vector<int> &labels, std::vector<Vec4d> &planes, 
                       int desired_num_planes, double thr_, double conf_, int max_iters_) {
    // 获取输入点并转换为双精度浮点数类型
    Mat points3d = points3d_.getMat();
    points3d.convertTo(points3d, CV_64F);  // 将点转换为双精度


    // 如果输入是一个向量，将其转换为Mat
    if (points3d_.isVector())
        points3d = Mat((int)points3d.total(), 3, CV_64F, points3d.data);
    else {
        // 如果输入的是单通道的Mat，转换为有1个通道
        if (points3d.type() != CV_64F)
            points3d = points3d.reshape(1, (int)points3d.total());
        // 如果点的行数小于列数，转置矩阵
        if (points3d.rows < points3d.cols)
            transpose(points3d, points3d);
        // 校验点的维度是否3
        CV_CheckEQ(points3d.cols, 3, "Invalid dimension of point");
    }


    // 接下来是RANSAC拟合平面的代码
    /*
     * 3D plane fitting with RANSAC
     * @best_model contains coefficients [a b c d] s.t. ax + by + cz = d
     *
     */
    // 使用RANSAC算法拟合3D平面
    // @best_model 包含最佳模型的系数[a b c d]，使得ax + by + cz = d
    auto plane_ransac = [] (const Mat &pts, double thr, double conf, int max_iters, Vec4d &best_model, std::vector<bool> &inliers) {
        const int pts_size = pts.rows;  // 点集的大小
        const int max_lo_inliers = 15;  // 局部优化时考虑的最大内点数
        const int max_lo_iters = 10;  // 局部优化的最大迭代次数
        int best_inls = 0;  // 最佳内点数
        // 如果点的个数小于3，则无法拟合平面
        if (pts_size < 3) return false;
        RNG rng;  // 随机数生成器
        const auto * const points = (double *) pts.data;  // 点集的数据指针
        std::vector<int> min_sample(3);  // 最小样本点集
        inliers = std::vector<bool>(pts_size);  // 内点向量
        const double log_conf = log(1-conf);  // 对数置信度
        Vec4d model, lo_model;  // 当前模型和局部优化模型
        std::vector<int> random_pool(pts_size);  // 随机数池
        for (int p = 0; p < pts_size; p++)
            random_pool[p] = p;
    
        // 使用协方差矩阵估计平面系数
        auto estimate = [&] (const std::vector<int> &sample, Vec4d &model_) {
            // 省略详细的数学计算部分，最终目标是根据选取的样本点估计出平面的法向量和中心点坐标，以求得平面的参数
    
            // 根据法向量和质心找到平面模型
            model_ = Vec4d(weighted_normal(0), weighted_normal(1), weighted_normal(2),
                           weighted_normal.dot(Vec3d(c_x, c_y, c_z)));
            return true;
        };
    
        // 计算内点数量
        auto getInliers = [&] (const Vec4d &model_) {
            // 模型参数
            const double a = model_(0), b = model_(1), c = model_(2), d = model_(3);
            int num_inliers = 0;  // 内点数量
            std::fill(inliers.begin(), inliers.end(), false);
            for (int p = 0; p < pts_size; p++) {
                // 判断每个点是否为内点
                inliers[p] = fabs(a * points[3*p] + b * points[3*p+1] + c * points[3*p+2] - d) < thr;
                if (inliers[p]) num_inliers++;
                // 提前中断循环
                if (num_inliers + pts_size - p < best_inls) break;
            }
            return num_inliers;
        };
        // RANSAC算法的主循环
        for (int iters = 0; iters < max_iters; iters++) {
            // 寻找最小样本点集，即随机选取的3个点
            min_sample[0] = rng.uniform(0, pts_size);
            min_sample[1] = rng.uniform(0, pts_size);
            min_sample[2] = rng.uniform(0, pts_size);
            // 如果无法根据当前样本点估计出平面模型则跳过
            if (! estimate(min_sample, model))
                continue;
            // 获取内点数量
            int num_inliers = getInliers(model);
            // 如果当前模型的内点数量是目前最好的，则进行进一步的局部优化
            if (num_inliers > best_inls) {
                // 保存最好的内点集
                std::vector<bool> best_inliers = inliers;
                // 局部优化
                for (int lo_iter = 0; lo_iter < max_lo_iters; lo_iter++) {
                    std::vector<int> inliers_idx; inliers_idx.reserve(max_lo_inliers);
                    randShuffle(random_pool);
                    for (int p : random_pool) {
                        if (best_inliers[p]) {
                            inliers_idx.emplace_back(p);
                            if ((int)inliers_idx.size() >= max_lo_inliers)
                                break;
                        }
                    }
                    if (! estimate(inliers_idx, lo_model))
                        continue;
                    int lo_inls = getInliers(lo_model);
                    if (best_inls < lo_inls) {
                        best_model = lo_model;
                        best_inls = lo_inls;
                        best_inliers = inliers;
                    }
                }
                // 更新最佳模型
                if (best_inls < num_inliers) {
                    best_model = model;
                    best_inls = num_inliers;
                }
                // 更新最大迭代次数
                const double max_hyp = 3 * log_conf / log(1 - pow(double(best_inls) / pts_size, 3));
                if (! std::isinf(max_hyp) && max_hyp < max_iters)
                    max_iters = static_cast<int>(max_hyp);
            }
        }
        // 全部迭代结束后，更新内点集
        getInliers(best_model);
        // 如果存在内点，则返回true
        return best_inls != 0;
    };
    // 初始化标签为0
    labels = std::vector<int>(points3d.rows, 0);
    Mat pts3d_plane_fit = points3d.clone();
    // 保持点的索引数组，对应原始的points3d
    std::vector<int> to_orig_pts_arr(pts3d_plane_fit.rows);
    for (int i = 0; i < (int) to_orig_pts_arr.size(); i++)
        to_orig_pts_arr[i] = i;
    // 进行期望数量的平面拟合
    for (int num_planes = 1; num_planes <= desired_num_planes; num_planes++) {
        Vec4d model;
        std::vector<bool> inl;
        // 调用RANSAC进行平面拟合；如果未找到平面，则退出循环
        if (!plane_ransac(pts3d_plane_fit, thr_, conf_, max_iters_, model, inl))
            break;
        // 将找到的平面模型添加到planes中
        planes.emplace_back(model);


        const int pts3d_size = pts3d_plane_fit.rows;
        // 清空用于下一次拟合的pts3d_plane_fit
        pts3d_plane_fit = Mat();
        pts3d_plane_fit.reserve(points3d.rows);


        int cnt = 0;
        for (int p = 0; p < pts3d_size; p++) {
            // 若点未被当前平面拟合，则保留以进行下一次迭代
            if (!inl[p]) {
                to_orig_pts_arr[cnt] = to_orig_pts_arr[p];
                pts3d_plane_fit.push_back(points3d.row(to_orig_pts_arr[cnt]));
                cnt++;
            } else {
                // 否则将点标记为当前平面的成员
                labels[to_orig_pts_arr[p]] = num_planes;
            }
        }
    }
}
// 主函数入口
int main(int args, char** argv) {
    // 定义存储数据文件和图像目录的字符串变量
    std::string data_file, image_dir;
    // 检查命令行参数是否足够
    if (args < 3) {
        data_file = "essential_mat_data.txt";
        image_dir = "./";
       /*CV_Error(Error::StsBadArg,
                "Path to data file and directory to image files are missing!\nData file must have"
                " format:\n--------------\n image_name_1\nimage_name_2\nk11 k12 k13\n0   k22 k23\n"
                "0   0   1\n--------------\nIf image_name_{1,2} are not in the same directory as "
                "the data file then add argument with directory to image files.\nFor example: "
                "./essential_mat_reconstr essential_mat_data.txt ./");*/
       // 如果参数不足，提示错误信息并退出
    } else {
       // 获取数据文件和图像目录的路径
       data_file = argv[1];
       image_dir = argv[2];
    }
    // 打开并检查数据文件
    std::ifstream file(data_file, std::ios_base::in);
    CV_CheckEQ((int)file.is_open(), 1, "Data file is not found!");
    // 读取图像文件名
    std::string filename1, filename2;
    std::getline(file, filename1);
    std::getline(file, filename2);
    // 加载图像
    Mat image1 = imread(image_dir+filename1);
    Mat image2 = imread(image_dir+filename2);
    // 检查图像是否加载成功
    CV_CheckEQ((int)image1.empty(), 0, "Image 1 is not found!");
    CV_CheckEQ((int)image2.empty(), 0, "Image 2 is not found!");


    // 读取相机内参矩阵
    Matx33d K;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            file >> K(i,j);
    file.close();


    // 定义描述子和关键点向量
    Mat descriptors1, descriptors2;
    std::vector<KeyPoint> keypoints1, keypoints2;


    // 使用SIFT检测关键点
    Ptr<SIFT> detector = SIFT::create();
    detector->detect(image1, keypoints1);
    detector->detect(image2, keypoints2);
    // 计算关键点的描述子
    detector->compute(image1, keypoints1, descriptors1);
    detector->compute(image2, keypoints2, descriptors2);


    // 创建Flann基于KD树的匹配器
    FlannBasedMatcher matcher(makePtr<flann::KDTreeIndexParams>(5), makePtr<flann::SearchParams>(32));


    // 获取最佳匹配点，对其应用Lowe's比率测试
    std::vector<std::vector<DMatch>> matches_vector;
    matcher.knnMatch(descriptors1, descriptors2, matches_vector, 2);


    // 使用Lowe的比率测试过滤关键点
    std::vector<Point2d> pts1, pts2;
    pts1.reserve(matches_vector.size()); pts2.reserve(matches_vector.size());
    for (const auto &m : matches_vector) {
        if (m[0].distance / m[1].distance < 0.75) {
            // 内点用于后续处理
            pts1.emplace_back(keypoints1[m[0].queryIdx].pt);
            pts2.emplace_back(keypoints2[m[0].trainIdx].pt);
        }
    }


    // 通过RANSAC算法寻找基础矩阵
    Mat inliers;
    const int pts_size = (int) pts1.size();
    const auto begin_time = std::chrono::steady_clock::now();
    const Mat E = findEssentialMat(pts1, pts2, Mat(K), RANSAC, 0.99, 1.0, inliers);
    // 打印RANSAC寻找基础矩阵的时间和内点数量
    std::cout << "RANSAC essential matrix time " << std::chrono::duration_cast<std::chrono::microseconds>
            (std::chrono::steady_clock::now() - begin_time).count() <<
            "mcs.\nNumber of inliers " << countNonZero(inliers) << "\n";


    // 将图像坐标转换为矩阵
    Mat points1 = Mat((int)pts1.size(), 2, CV_64F, pts1.data());
    Mat points2 = Mat((int)pts2.size(), 2, CV_64F, pts2.data());
    points1 = points1.t(); points2 = points2.t();
    
    // 计算并打印平均误差至极线
    std::cout << "Mean error to epipolar lines " <<
        getError2EpipLines(K.inv().t() * E * K.inv(), points1, points2, inliers) << "\n";


    // 分解本质矩阵得到旋转和平移
    Mat R1, R2, t;
    decomposeEssentialMat(E, R1, R2, t);//平移向量通常是单位向量，它表示方向而非具体距离，因为本质矩阵本身并不包含关于真实世界尺度的信息。


    // 创建两个相对姿势
    // P1 = K [  I    |   0  ]
    // P2 = K [R{1,2} | {+-}t]
    // 定义第一个相机的投影矩阵P1为内参矩阵K乘以[单位矩阵|零向量]的形式
    Mat P1;
    hconcat(K, Vec3d::zeros(), P1);
    // 定义一个容器，用来存储第二个相机的所有可能的投影矩阵P2s
    std::vector<Mat> P2s(4);
    // 分别为P2s的每个元素计算可能的投影矩阵，考虑R1、R2和t以及-t的四种组合
    hconcat(K * R1,  K * t, P2s[0]);
    hconcat(K * R1, -K * t, P2s[1]);
    hconcat(K * R2,  K * t, P2s[2]);
    hconcat(K * R2, -K * t, P2s[3]);
    
    // 创建两个向量，用于存储每个相机视角下的物体点和对应的图像点索引
    std::vector<std::vector<Vec3d>> obj_pts_per_cam(4);  // 存储物体点
    std::vector<std::vector<int>> img_idxs_per_cam(4);  // 存储对应图像点的索引
    // 初始化相机索引、最佳相机索引和最大物体点数的变量
    int cam_idx = 0, best_cam_idx = 0, max_obj_pts = 0;
    // 遍历四个可能的第二个相机的投影矩阵P2
    for (const auto &P2 : P2s) {
        obj_pts_per_cam[cam_idx].reserve(pts_size);  // 预留空间，优化性能
        img_idxs_per_cam[cam_idx].reserve(pts_size);
        // 遍历每一个点对
        for (int i = 0; i < pts_size; i++) {
            // 只处理内点
            if (! inliers.at<uchar>(i))
                continue;
    
            Vec4d obj_pt;  // 创建一个4D物体点
            // 使用三角化计算物体点的位置
            triangulatePoints(P1, P2, points1.col(i), points2.col(i), obj_pt);
            obj_pt /= obj_pt(3); // 归一化物体点
            // 检查如果投影点拥有正深度
            if (obj_pt(2) > 0) { 
                // 添加到当前相机的物体点向量中
                obj_pts_per_cam[cam_idx].emplace_back(Vec3d(obj_pt(0), obj_pt(1), obj_pt(2)));
                img_idxs_per_cam[cam_idx].emplace_back(i);
            }
        }
        // 更新包含最多物体点的相机投影矩阵的索引
        if (max_obj_pts < (int) obj_pts_per_cam[cam_idx].size()) {
            max_obj_pts = (int) obj_pts_per_cam[cam_idx].size();
            best_cam_idx = cam_idx;
        }
        cam_idx++;  // 增加相机索引以便在下一个循环中使用下一个投影矩阵
    }


    // 打印物体点数量
    std::cout << "Number of object points " << max_obj_pts << "\n";


    // 绘制图像中的内点
    const int circle_sz = 7;
    // 在两个图像中绘制内点
    std::vector<int> labels;  // 存储点云分割后每个点的标签
    std::vector<Vec4d> planes;  // 存储检测到的平面
    // 调用之前解释过的 getPlanes 函数，从3D点中找到最多4个平面
    getPlanes (obj_pts_per_cam[best_cam_idx], labels, planes, 4, 0.002, 0.99, 10000);
    // 获取找到的平面数量
    const int num_found_planes = (int) planes.size();
    RNG rng;  // 随机数生成器，用于颜色生成
    // 为每个平面生成一个随机颜色
    std::vector<Scalar> plane_colors(num_found_planes);
    for (int pl = 0; pl < num_found_planes; pl++)
        plane_colors[pl] = Scalar(rng.uniform(0,256), rng.uniform(0,256), rng.uniform(0,256));
    // 遍历所有的物体点
    for (int obj_pt = 0; obj_pt < max_obj_pts; obj_pt++) {
        // 获取图像点的索引
        const int pt = img_idxs_per_cam[best_cam_idx][obj_pt];
        // 如果这个点属于某个平面，用对应颜色画圆表示
        if (labels[obj_pt] > 0) {
            circle(image1, pts1[pt], circle_sz, plane_colors[labels[obj_pt]-1], -1);
            circle(image2, pts2[pt], circle_sz, plane_colors[labels[obj_pt]-1], -1);
        } else {
            // 否则用黑色画圆表示内点
            circle(image1, pts1[pt], circle_sz, Scalar(0,0,0), -1);
            circle(image2, pts2[pt], circle_sz, Scalar(0,0,0), -1);
        }
    }


    // 连接两个图像
    hconcat(image1, image2, image1);
    // 将图像尺寸调整为新的大小，保持同样的宽高比
    // 假设新的图像大小为1200x800
    const int new_img_size = 1200 * 800;
    resize(image1, image1, Size((int)sqrt((double)image1.cols * new_img_size / image1.rows),
                                (int)sqrt((double)image1.rows * new_img_size / image1.cols)));
    // 显示连接后的图像，并将其保存至文件
    imshow("image 1-2", image1);
    imwrite("planes.png", image1);
    // 等待按键事件以进行交互
    waitKey(0);
}

代码实现了一个复杂的计算机视觉流程，主要步骤如下：

检查并确保命令行参数中提供了有效的数据文件和图像目录路径。
读取数据文件中提供的两张图像的文件名并加载这些图像。
读取相机的内参矩阵。
使用SIFT算法检测关键点，并计算这些关键点的描述符。
使用FLANN库基于KD树的方法对描述符进行匹配，并使用Lowe的比率测试过滤匹配。
计算两组匹配关键点的本质矩阵，并基于本质矩阵的RANSAC算法输出内点。
通过内点和本质矩阵计算重投影误差。
分解本质矩阵以获得旋转矩阵和平移向量，并使用P1和P2s（四种可能的相机位姿）进行三角测量以重建场景的3D点。
在图像上绘制这些重建的3D点，并将它们分配到不同的平面上。
使用getPlanes函数分析从三角化计算获得的3D点云，并试图将点云中的点分配到最多四个平面上。
绘制出与这些平面相关联的点的投影，并将它们用不同颜色标记在两张图像上；未与任何平面相关联的内点将以黑色绘制。
将两张标记好的图像拼接成一张图像，并根据设定的新尺寸调整其大小，保持宽高比不变。
使用OpenCV的GUI功能展示拼接后的图像，并将其保存为文件。
等待用户的按键事件，以结束程序或进行进一步的交云用户交互。

代码的核心功能是从两张有重叠视野的图像中，恢复出场景的三维结构，并尝试将3D点分类到不同的平面上 。最终，代码通过可视化库将结果显示出来，并保存 绘制了3D点和平面分割的图像文件 。这些步骤完成了对场景立体重建和平面分割的处理流程 ，并通过图像展示了结果。利用这种方法，开发者可以在两个视图之间识别公共结构，并估计它们在空间中的相对位置和姿态。

cs 复制代码

CV_CheckEQ((int)file.is_open(), 1, "Data file is not found!");

javascript 复制代码

const Mat E = findEssentialMat(pts1, pts2, Mat(K), RANSAC, 0.99, 1.0, inliers);

css 复制代码

getError2EpipLines(K.inv().t() * E * K.inv(), points1, points2, inliers)

getError2EpipLines函数

css 复制代码

triangulatePoints(P1, P2, points1.col(i), points2.col(i), obj_pt);

css 复制代码

getPlanes(obj_pts_per_cam[best_cam_idx], labels, planes, 4, 0.002, 0.99, 10000);

这段代码的功能主要包括以下几个部分：

读取和预处理图像数据：

通过命令行参数接收图像数据文件和图像目录。
读取两张图像的文件路径，并加载对应的图像数据。

读取相机校准参数：

从文件中读取相机的内部校准矩阵K。

特征点检测和描述：

使用SIFT算法检测两张图片中的关键点，并计算对应的描述子。

特征点匹配：

使用带有FLANN的快速近似最近邻搜索算法进行关键点之间的匹配。
使用Lowe的比率测试过滤匹配，去除错误的匹配点对。

计算本质矩阵：

使用RANSAC算法估计两视图间的本质矩阵E。
计算并输出RANSAC对本质矩阵计算耗时和内点数量。

计算双视图间的误差：

计算并输出关键点对应的集合到极线的平均距离误差。

分解本质矩阵获得位姿：

从本质矩阵中分解出两组旋转和一个平移向量。

重建3D点：

根据计算出的位姿信息，对内点进行三角测量以获得3D坐标。
遍历四种可能的相机第二视角的投影矩阵，寻找最多3D点的相机模型。

平面检测：

对3D点进行多平面拟合，得到最多四个平面的参数。
将3D点分类到各个检测到的平面上。

图形可视化：

在图片上绘制内点和所属平面的关键点。
将两张图像首尾拼接，并对图像大小进行调整。
显示最终结果，输出保存有平面信息的图像。

整个程序是一个图像处理和3D点重建的流程，涉及从图像中提取和匹配特征点、利用相机校准信息计算本质矩阵、分解获得相对位姿、三角测量获取3D坐标、平面拟合与分类，以及结果的可视化。

The End