1 引言

在上两篇系列文章中，我们分别探讨了增量式 SfM 的核心骨架实现与多源信息融合策略。在第 18 篇《最小二乘问题详解18：增量式SFM核心流程实现》中，我们构建了算法的逻辑基座；而在第 19 篇《带先验约束的增量式 SFM 优化与实现》中，我们引入了 INS/GNSS 等外部位姿先验，构建了工业级稳健的重建系统。

回顾第 19 篇的带先验约束流程，我们不难发现，虽然该方案对数据采集设备提出了更高要求，但它实际上简化了 SfM 的核心难题。借助高精度的位姿先验，我们绕过了繁琐且脆弱的初始化步骤，能够直接基于已知的相机位置进行特征匹配与三角化，从而迅速展开全局光束法平差（BA）。在这一过程中，先验信息充当了"上帝之手"，直接为优化问题提供了高质量的初值，规避了纯视觉系统中常见的尺度漂移与拓扑错误。

然而，当我们剥离掉这些外部传感器提供的"辅助信息"，回归到计算机视觉最本源的2D-2D 特征对应关系 时，增量式 SfM 才真正展现出其"从无到有"的魅力与挑战。这就是本文将要深入探讨的主题------无先验约束下的增量式 SfM 自由网平差。

在这种纯视觉模式下，我们失去了绝对的坐标参考系，面对的是一片数据的"黑暗森林"。我们不仅不知道场景的绝对位置，甚至连相机之间的绝对距离（尺度）都无从得知。因此，我们无法像带先验流程那样直接从三角化开始，而是必须从 对极几何（Epipolar Geometry） 出发，通过计算本质矩阵（Essential Matrix, $E$）来推导两视图之间的相对运动。

这一 初始化（Initialization） 过程至关重要，它是整个增量式 SfM 的关键点。本质矩阵 $E$ 的求解质量，直接决定了初始两张像片的相对定向精度，进而作为后续所有相机位姿与 3D 结构生长的基准。一旦初始化失败或误差过大，这种误差将像病毒一样在增量扩展的过程中不断累积和放大，最终导致重建崩溃。因此，理解并掌握无先验约束下的自由网平差，不仅是对 SfM 算法本质的深刻洞察，也是解决最通用、最广泛视觉重建问题的核心能力。

2 子问题

首先还是先关注增量式 SfM 的几个核心子问题。

2.1 本质矩阵问题

第一点当然就是前文中提到了通过对极几何求本质矩阵 $E$。这个问题我们在《最小二乘问题详解13：对极几何中本质矩阵求解》和《最小二乘问题详解14：鲁棒估计与5点算法求解本质矩阵》这两篇文章中详细讨论过，将其具体实现封装一下：

cpp 复制代码

#pragma once

#include <Eigen/Core>

class EssentialProblem {
 public:
  EssentialProblem();

  Eigen::Matrix3d Solve(const Eigen::Matrix3d& K,
                        const std::vector<Eigen::Vector3d>& x1s_noisy,
                        const std::vector<Eigen::Vector3d>& x2s_noisy,
                        double pixelThreshold, int ransacIterations);

 private:
  std::pair<Eigen::Matrix3d, std::vector<bool>> RansacEstimate(
      const std::vector<Eigen::Vector3d>& x1s_noisy,
      const std::vector<Eigen::Vector3d>& x2s_noisy, double pixel_threshold,
      double focal_length, int max_iterations);

  Eigen::Matrix3d EightPointAlgorithm(const std::vector<Eigen::Vector3d>& x1s,
                                      const std::vector<Eigen::Vector3d>& x2s);

  Eigen::Matrix3d OptimizeSampsonError(Eigen::Matrix3d E_init,
                                       const std::vector<Eigen::Vector3d>& x1s,
                                       const std::vector<Eigen::Vector3d>& x2s);

  Eigen::Matrix3d ComputeNormalization(
      const std::vector<Eigen::Vector3d>& points);

  double SampsonDistance(const Eigen::Matrix3d& E, const Eigen::Vector3d& x1,
                         const Eigen::Vector3d& x2);

  double EpipolarResidual(const Eigen::Matrix3d& E, const Eigen::Vector3d& x1,
                          const Eigen::Vector3d& x2);

  Eigen::Matrix3d NormalizeEssentialMatrix(const Eigen::Matrix3d& E);
};

cpp 复制代码

#include "SolveEssential.h"

#include <ceres/ceres.h>

#include <Eigen/Geometry>
#include <iomanip>
#include <iostream>
#include <opengv/relative_pose/CentralRelativeAdapter.hpp>
#include <opengv/sac/Ransac.hpp>
#include <opengv/sac_problems/relative_pose/CentralRelativePoseSacProblem.hpp>
#include <random>
#include <vector>

#include "Math.h"

using namespace std;
using namespace Eigen;

namespace {
struct SampsonError {
  SampsonError(const Eigen::Vector3d& x1, const Eigen::Vector3d& x2)
      : x1_(x1), x2_(x2) {}

  template <typename T>
  bool operator()(const T* const e_ptr, T* residual) const {
    Eigen::Map<const Eigen::Matrix<T, 3, 3, Eigen::ColMajor>> E(e_ptr);

    Eigen::Matrix<T, 3, 1> x1_h(T(x1_(0)), T(x1_(1)), T(x1_(2)));
    Eigen::Matrix<T, 3, 1> x2_h(T(x2_(0)), T(x2_(1)), T(x2_(2)));

    T c = x2_h.transpose() * E * x1_h;
    Eigen::Matrix<T, 3, 1> Ex1 = E * x1_h;
    Eigen::Matrix<T, 3, 1> ETx2 = E.transpose() * x2_h;
    T denom = Ex1.squaredNorm() + ETx2.squaredNorm();

    if (denom < T(1e-12)) {
      residual[0] = T(1e5);
    } else {
      residual[0] = ceres::sqrt(c * c / denom);
    }
    return true;
  }

  static ceres::CostFunction* Create(const Eigen::Vector3d& x1,
                                     const Eigen::Vector3d& x2) {
    return new ceres::AutoDiffCostFunction<SampsonError, 1, 9>(
        new SampsonError(x1, x2));
  }

 private:
  Eigen::Vector3d x1_, x2_;
};
}  // namespace

EssentialProblem::EssentialProblem() {}

Eigen::Matrix3d EssentialProblem::Solve(
    const Eigen::Matrix3d& K, const std::vector<Eigen::Vector3d>& x1s_noisy,
    const std::vector<Eigen::Vector3d>& x2s_noisy, double pixelThreshold,
    int ransacIterations) {
  // RANSAC + 5点算法
  double focalLength = (K(0, 0) + K(1, 1)) * 0.5;
  auto [E_ransac, inlier_mask] = RansacEstimate(
      x1s_noisy, x2s_noisy, pixelThreshold, focalLength, ransacIterations);

  // 提取内点
  std::vector<Eigen::Vector3d> x1_inliers, x2_inliers;
  for (size_t i = 0; i < x1s_noisy.size(); ++i) {
    if (inlier_mask[i]) {
      x1_inliers.push_back(x1s_noisy[i]);
      x2_inliers.push_back(x2s_noisy[i]);
    }
  }

  // === 4. 初值：8点算法 ===
  Eigen::Matrix3d E_8pt = EightPointAlgorithm(x1_inliers, x2_inliers);
  std::cout << "=== 8-Point Estimate ===\n";
  std::cout << E_8pt << "\n\n";

  // === 5. 非线性优化：Sampson 误差 ===
  Eigen::Matrix3d E_final = OptimizeSampsonError(E_8pt, x1_inliers, x2_inliers);
  std::cout << "=== Final Optimized E (Projected) ===\n";
  std::cout << E_final << "\n\n";

  // === 6. 评估三项指标 ===
  double sampson_8pt = 0.0, sampson_final = 0.0;
  double epipolar_mse_8pt = 0.0, epipolar_mse_final = 0.0;
  int N = (int)x1_inliers.size();

  for (int i = 0; i < N; ++i) {
    sampson_8pt += SampsonDistance(E_8pt, x1_inliers[i], x2_inliers[i]);
    sampson_final += SampsonDistance(E_final, x1_inliers[i], x2_inliers[i]);

    double r1 = EpipolarResidual(E_8pt, x1_inliers[i], x2_inliers[i]);
    double r2 = EpipolarResidual(E_final, x1_inliers[i], x2_inliers[i]);
    epipolar_mse_8pt += r1 * r1;
    epipolar_mse_final += r2 * r2;
  }

  sampson_8pt /= N;
  sampson_final /= N;
  epipolar_mse_8pt /= N;
  epipolar_mse_final /= N;

  // 归一化 Frobenius 距离（代数相似性）
  Eigen::Matrix3d E_8pt_norm = NormalizeEssentialMatrix(E_8pt);
  Eigen::Matrix3d E_final_norm = NormalizeEssentialMatrix(E_final);

  // === 6. 输出评估结果 ===
  std::cout << "=== Evaluation Summary ===\n";
  std::cout << "Metric                     | 8-Point       | Optimized\n";
  std::cout << "---------------------------|---------------|------------\n";
  std::cout << "Avg Sampson Error          | " << setw(13) << sampson_8pt
            << " | " << sampson_final << "\n";
  std::cout << "Avg |x2^T E x1|^2 (MSE)   | " << setw(13) << epipolar_mse_8pt
            << " | " << epipolar_mse_final << "\n";

  return E_final;
}

// RANSAC + 5-point (OpenGV)
// 返回：Essential Matrix + inlier mask
std::pair<Eigen::Matrix3d, std::vector<bool>> EssentialProblem::RansacEstimate(
    const std::vector<Eigen::Vector3d>& x1s_noisy,
    const std::vector<Eigen::Vector3d>& x2s_noisy, double pixel_threshold,
    double focal_length, int max_iterations) {
  size_t N = x1s_noisy.size();

  if (x1s_noisy.size() != x2s_noisy.size() || x1s_noisy.size() < 5) {
    std::cerr << "Not enough correspondences.\n";
    return {Eigen::Matrix3d::Identity(), std::vector<bool>(N, false)};
  }

  // OpenGV expects unit bearing vectors
  opengv::bearingVectors_t b1(x1s_noisy.begin(), x1s_noisy.end());
  opengv::bearingVectors_t b2(x2s_noisy.begin(), x2s_noisy.end());

  opengv::relative_pose::CentralRelativeAdapter adapter(b1, b2);

  // 中心相机模型下的相对位姿 SAC 问题
  using SacProblem =
      opengv::sac_problems::relative_pose::CentralRelativePoseSacProblem;

  auto problem = std::make_shared<SacProblem>(adapter, SacProblem::NISTER);

  opengv::sac::Ransac<SacProblem> ransac;
  ransac.sac_model_ = problem;

  // 将像素级内点阈值（如1.0像素）转换为OpenGV所需的弦距离阈值
  // 步骤：pixel → 近似视角θ ≈ pixel/f → 弦距离 = 2*sin(θ/2)
  double theta_approx =
      pixel_threshold / focal_length;  // 小角度近似: θ ≈ tanθ = pixel/f
  ransac.threshold_ =
      2.0 * sin(theta_approx / 2.0);  // 转换为单位球面上的弦距离

  ransac.max_iterations_ = max_iterations;

  if (!ransac.computeModel()) {
    std::cerr << "RANSAC failed.\n";
    return {Eigen::Matrix3d::Identity(), std::vector<bool>(N, false)};
  }

  // OpenGV 返回的是 3x4 pose matrix [R | t]
  Eigen::Matrix<double, 3, 4> Rt = ransac.model_coefficients_;

  Eigen::Matrix3d R = Rt.leftCols<3>();
  Eigen::Vector3d t = Rt.rightCols<1>();

  // 构造 skew-symmetric matrix
  Eigen::Matrix3d t_hat;
  t_hat << 0, -t.z(), t.y(), t.z(), 0, -t.x(), -t.y(), t.x(), 0;

  Eigen::Matrix3d E = t_hat * R;

  // inlier mask
  std::vector<bool> inliers(N, false);

  for (int idx : ransac.inliers_) {
    if (idx >= 0 && idx < static_cast<int>(N)) inliers[idx] = true;
  }

  std::cout << "RANSAC inliers: " << ransac.inliers_.size() << " / " << N
            << std::endl;

  return {E, inliers};
}

using Vector9d = Eigen::Matrix<double, 9, 1>;

// ========================
// 工具函数
// ========================

Eigen::Matrix3d Vec2Mat(const Vector9d& e) {
  return Eigen::Map<const Eigen::Matrix3d>(e.data());
}

Vector9d Mat2Vec(const Eigen::Matrix3d& E) {
  return Eigen::Map<const Vector9d>(E.data());
}

// Sampson 距离计算
double EssentialProblem::SampsonDistance(const Eigen::Matrix3d& E,
                                         const Eigen::Vector3d& x1,
                                         const Eigen::Vector3d& x2) {
  double c = x2.transpose() * E * x1;
  Eigen::Vector3d Ex1 = E * x1;
  Eigen::Vector3d ETx2 = E.transpose() * x2;
  double denom = Ex1.squaredNorm() + ETx2.squaredNorm();
  if (denom < 1e-12) return 1e10;
  return c * c / denom;
}

// 对极约束残差：x2^T E x1
double EssentialProblem::EpipolarResidual(const Eigen::Matrix3d& E,
                                          const Eigen::Vector3d& x1,
                                          const Eigen::Vector3d& x2) {
  return x2.transpose() * E * x1;
}

// ========================
// 辅助函数：计算归一化变换矩阵 T
// ========================
Eigen::Matrix3d EssentialProblem::ComputeNormalization(
    const std::vector<Eigen::Vector3d>& points) {
  Eigen::Vector2d centroid(0, 0);
  for (const auto& p : points) {
    centroid += p.head<2>();
  }
  centroid /= points.size();

  double avg_dist = 0.0;
  for (const auto& p : points) {
    avg_dist += (p.head<2>() - centroid).norm();
  }
  avg_dist /= points.size();

  double scale = sqrt(2.0) / avg_dist;
  Eigen::Matrix3d T = Eigen::Matrix3d::Identity();
  T(0, 0) = T(1, 1) = scale;
  T(0, 2) = -scale * centroid(0);
  T(1, 2) = -scale * centroid(1);
  return T;
}

// ========================
// 8点算法实现
// ========================
Eigen::Matrix3d EssentialProblem::EightPointAlgorithm(
    const std::vector<Eigen::Vector3d>& x1s,
    const std::vector<Eigen::Vector3d>& x2s) {
  Eigen::Matrix3d T1 = ComputeNormalization(x1s);
  Eigen::Matrix3d T2 = ComputeNormalization(x2s);

  std::vector<Eigen::Vector3d> nx1s, nx2s;
  for (size_t i = 0; i < x1s.size(); ++i) {
    nx1s.push_back(T1 * x1s[i]);
    nx2s.push_back(T2 * x2s[i]);
  }

  size_t N = nx1s.size();
  Eigen::MatrixXd A(N, 9);
  for (size_t i = 0; i < N; ++i) {
    double u1 = nx1s[i](0), v1 = nx1s[i](1);
    double u2 = nx2s[i](0), v2 = nx2s[i](1);
    A.row(i) << u1 * u2, v1 * u2, u2, u1 * v2, v1 * v2, v2, u1, v1, 1.0;
  }

  Eigen::JacobiSVD<Eigen::MatrixXd> svd(A, Eigen::ComputeFullV);
  Vector9d e = svd.matrixV().col(8);
  Eigen::Matrix3d E_tilde = Vec2Mat(e);

  Eigen::JacobiSVD<Eigen::Matrix3d> svd_E(
      E_tilde, Eigen::ComputeFullU | Eigen::ComputeFullV);
  Eigen::Vector3d s = svd_E.singularValues();
  double avg = (s(0) + s(1)) / 2.0;
  Eigen::Matrix3d E_normalized =
      svd_E.matrixU() * Eigen::DiagonalMatrix<double, 3, 3>(avg, avg, 0.0) *
      svd_E.matrixV().transpose();

  return T2.transpose() * E_normalized * T1;
}

// 归一化 E 使得 ||E||_F = 1
Eigen::Matrix3d EssentialProblem::NormalizeEssentialMatrix(
    const Eigen::Matrix3d& E) {
  return E / E.norm();
}

// 基于 Sampson 误差的非线性优化
Eigen::Matrix3d EssentialProblem::OptimizeSampsonError(
    Eigen::Matrix3d E_init, const std::vector<Eigen::Vector3d>& x1s,
    const std::vector<Eigen::Vector3d>& x2s) {
  Vector9d e_opt = Mat2Vec(E_init);
  ceres::Problem problem;
  for (int i = 0; i < x1s.size(); ++i) {
    problem.AddResidualBlock(SampsonError::Create(x1s[i], x2s[i]), nullptr,
                             e_opt.data());
  }

  ceres::Solver::Options options;
  options.linear_solver_type = ceres::DENSE_QR;
  options.minimizer_progress_to_stdout = true;
  options.max_num_iterations = 50;

  ceres::Solver::Summary summary;
  ceres::Solve(options, &problem, &summary);
  std::cout << "\n" << summary.BriefReport() << "\n";

  Eigen::Matrix3d E_opt = Vec2Mat(e_opt);

  // 投影回本质矩阵流形（强制 σ1=σ2, σ3=0）
  Eigen::JacobiSVD<Eigen::Matrix3d> svd(
      E_opt, Eigen::ComputeFullU | Eigen::ComputeFullV);
  Eigen::Vector3d s = svd.singularValues();
  double avg = (s(0) + s(1)) / 2.0;
  Eigen::Matrix3d E_final = svd.matrixU() *
                            Eigen::DiagonalMatrix<double, 3, 3>(avg, avg, 0.0) *
                            svd.matrixV().transpose();

  return E_final;
}

这段代码实现还是遵循 5 点算法 + RANSAC 剔除外点 + 8 点算法计算初值 + 非线性最小二乘优化 Sampson 误差的流程。这种稳健的实现对于这里使用的相对简单的仿真数据也是适用的。

2.2 PnP问题

PnP问题是纯视觉的增量式 SFM 自由网平差的另一个关键。新注册的图像如果不能恢复姿态，后续的优化流程就不能正常继续。在带位姿先验的 PnP 问题中，我们可以把位姿先验作为初值来进行非线性优化，估值过程可以很快收敛。在仅有 2D-2D 数据的纯视觉环境下，由于缺乏可靠的位姿先验作为初值参考，优化过程只能以随机初始化或基于简单启发式规则生成的初值为起点，极易陷入局部最优解，导致算法难以收敛。

为了解决这个问题，可以引入高效的线性求解器来提供高质量的初始估计：也就是 EPnP 算法。具体来说，采用 EPnP + 非线性优化 的两阶段策略：首先利用 EPnP 算法快速计算出相机位姿的解析解（Closed-form Solution），以此作为初值；随后将其输入到 Ceres 等非线性优化库中进行精细迭代。

EPnP 的核心思想非常巧妙，它通过引入 虚拟控制点 的概念，将复杂的非线性问题转化为线性问题求解。并且求解过程也不是简单的线性回归，而是一个夹杂了代数分解步骤的线性最小二乘解：

虚拟控制点与线性化
算法首先引入了 4 个"虚拟控制点"的概念。它的核心假设是：场景中任意的 3D 点，都可以表示为这 4 个控制点的加权和。
既然 3D 点可以用控制点表示，那么它们在相机坐标系下的坐标，自然也可以用相机坐标系下的这 4 个控制点来表示，且权重保持不变。
利用这个性质，结合相机的投影方程，我们可以构建一个关于"相机坐标系下 4 个控制点坐标"的线性方程组 （形式为 $Mx = 0$）。这一步将原本非线性的 PnP 问题转化为了一个线性最小二乘问题。
SVD 分解与代数求解
构建好方程组后，我们通过 SVD 分解 来求解这个线性系统，得到控制点在相机坐标系下的基础解系。
但此时解仍然存在尺度模糊（即不知道具体的缩放比例）。为了解决这个问题，EPnP 利用了"控制点之间的欧氏距离在坐标系变换下保持不变"这一几何约束。通过建立距离约束方程，算法可以解算出正确的缩放系数，从而唯一确定控制点在相机坐标系下的精确位置。
位姿恢复
一旦我们知道了 4 个控制点在世界坐标系和相机坐标系下的坐标，问题就退化为了一个简单的绝对定向问题 。此时，只需计算两组点集之间的最佳刚体变换，即可直接解出相机的旋转矩阵 $R$ 和平移向量 $t$。

虽然 EPnP 的核心数学原理基于线性最小二乘和 SVD 分解，但其工程实现涉及复杂的控制点选取策略、零空间维度分类讨论以及多项式求根过程。这个实现过程相当繁复，可以使用计算机视觉领域标准的 OpenGV 库来求解 EPnP。EPnP + 非线性优化的具体实现如下所示：

cpp 复制代码

#pragma once

#include <Eigen/Core>
#include <Eigen/Geometry>
#include <vector>

class PnPProblem {
 public:
  PnPProblem(const Eigen::Matrix3d& K,
             const std::vector<Eigen::Vector3d>& points_world,
             const std::vector<Eigen::Vector2d>& observations);

  bool Solve(Eigen::Matrix3d& R_opt, Eigen::Vector3d& t_opt);

 private:
  bool SolveInitialPose(const std::vector<Eigen::Vector3d>& points_world,
                        const std::vector<Eigen::Vector2d>& observations,
                        double fx, double fy, double cx, double cy,
                        double* q_out,  // [w, x, y, z]
                        double* t_out   // [x, y, z]
  );

  Eigen::Quaterniond CeresQuatToEigen(const double* q_ceres);
  void EigenQuatToCeres(const Eigen::Quaterniond& q_eigen, double* q_ceres);

  double fx;
  double fy;
  double cx;
  double cy;

  const std::vector<Eigen::Vector3d>& points_world;
  const std::vector<Eigen::Vector2d>& observations;
};

cpp 复制代码

#include "SolvePnP.h"

#include <ceres/ceres.h>
#include <ceres/rotation.h>

#include <iomanip>
#include <iostream>
#include <opengv/absolute_pose/CentralAbsoluteAdapter.hpp>
#include <opengv/absolute_pose/methods.hpp>
#include <opengv/sac/Ransac.hpp>
#include <opengv/sac_problems/absolute_pose/AbsolutePoseSacProblem.hpp>
#include <vector>

#include "Math.h"

namespace {
struct ReprojectionError {
  ReprojectionError(const Eigen::Vector3d& X_, const Eigen::Vector2d& obs_,
                    double fx_, double fy_, double cx_, double cy_)
      : X(X_), obs(obs_), fx(fx_), fy(fy_), cx(cx_), cy(cy_) {}

  template <typename T>
  bool operator()(const T* const q, const T* const t, T* residuals) const {
    T p_camera[3];
    T X_world[3] = {T(X.x()), T(X.y()), T(X.z())};

    // Ceres 期望 q 顺序为 [w, x, y, z]
    ceres::QuaternionRotatePoint(q, X_world, p_camera);

    p_camera[0] += t[0];
    p_camera[1] += t[1];
    p_camera[2] += t[2];

    T xp = p_camera[0] / p_camera[2];
    T yp = p_camera[1] / p_camera[2];

    T u = T(fx) * xp + T(cx);
    T v = T(fy) * yp + T(cy);

    residuals[0] = u - T(obs.x());
    residuals[1] = v - T(obs.y());

    return true;
  }

  static ceres::CostFunction* Create(const Eigen::Vector3d& X,
                                     const Eigen::Vector2d& obs, double fx,
                                     double fy, double cx, double cy) {
    return new ceres::AutoDiffCostFunction<ReprojectionError, 2, 4, 3>(
        new ReprojectionError(X, obs, fx, fy, cx, cy));
  }

  Eigen::Vector3d X;
  Eigen::Vector2d obs;
  double fx, fy, cx, cy;
};
}  // namespace

PnPProblem::PnPProblem(const Eigen::Matrix3d& K,
                       const std::vector<Eigen::Vector3d>& points_world,
                       const std::vector<Eigen::Vector2d>& observations)
    : fx(K(0, 0)),
      fy(K(1, 1)),
      cx(K(0, 2)),
      cy(K(1, 2)),
      points_world(points_world),
      observations(observations) {}

bool PnPProblem::Solve(Eigen::Matrix3d& R_opt, Eigen::Vector3d& t_opt) {
  // --- 使用 OpenGV EPnP 计算初值 ---
  double q_est[4] = {1, 0, 0, 0};
  double t_est[3] = {0, 0, 0};

  std::cout << "\n=== Computing Initial Value with OpenGV EPnP ==="
            << std::endl;
  if (!SolveInitialPose(points_world, observations, fx, fy, cx, cy, q_est,
                        t_est)) {
    std::cerr << "Failed to compute initial pose. Falling back to identity."
              << std::endl;
    // Fallback
    Eigen::Quaterniond q_init = Eigen::Quaterniond::Identity();
    EigenQuatToCeres(q_init, q_est);
    t_est[0] = t_est[1] = t_est[2] = 0.0;
  }

  // --- 构建 Ceres 问题 ---
  ceres::Problem problem;
  ceres::Manifold* quaternion_manifold = new ceres::QuaternionManifold();

  problem.AddParameterBlock(q_est, 4, quaternion_manifold);
  problem.AddParameterBlock(t_est, 3);

  for (size_t i = 0; i < points_world.size(); ++i) {
    problem.AddResidualBlock(
        ReprojectionError::Create(points_world[i], observations[i], fx, fy, cx,
                                  cy),
        nullptr, q_est, t_est);
  }

  ceres::Solver::Options options;
  options.linear_solver_type = ceres::DENSE_QR;
  options.minimizer_progress_to_stdout = true;
  options.max_num_iterations = 50;

  ceres::Solver::Summary summary;
  std::cout << "\n=== Starting Non-linear Optimization (Ceres) ==="
            << std::endl;
  ceres::Solve(options, &problem, &summary);
  std::cout << summary.BriefReport() << "\n";

  // --- 结果评估 ---
  Eigen::Quaterniond q_opt = CeresQuatToEigen(q_est);
  q_opt.normalize();
  R_opt = q_opt.toRotationMatrix();
  t_opt << t_est[0], t_est[1], t_est[2];

  // 计算误差
  double total_error = 0.0;
  for (size_t i = 0; i < points_world.size(); ++i) {
    Eigen::Vector3d pc = q_opt * points_world[i] + t_opt;
    double u = fx * pc.x() / pc.z() + cx;
    double v = fy * pc.y() / pc.z() + cy;
    double err_x = u - observations[i].x();
    double err_y = v - observations[i].y();
    total_error += err_x * err_x + err_y * err_y;
  }
  double rmse = std::sqrt(total_error / points_world.size());

  std::cout << "\n=== Accuracy Check ===" << std::endl;
  std::cout << "Final RMSE:          " << rmse << " pixels" << std::endl;

  // === 3. 核心判断：仅根据终止类型决定成败 ===
  if (summary.termination_type != ceres::CONVERGENCE) {
    return false;
  }
  if (rmse > 6.0) {  // 阈值根据情况调整，合成数据可以给小一点
    std::cout << "PnP Failed: Converged but RMSE too high." << std::endl;
    return false;
  }

  return true;
}

// 使用 OpenGV EPnP 计算初值
bool PnPProblem::SolveInitialPose(
    const std::vector<Eigen::Vector3d>& points_world,
    const std::vector<Eigen::Vector2d>& observations, double fx, double fy,
    double cx, double cy, double* q_out, double* t_out) {
  if (points_world.size() < 4) {
    std::cerr << "[OpenGV EPnP] Not enough points (need >= 4)" << std::endl;
    return false;
  }

  // 1. 准备 OpenGV 的数据结构
  // OpenGV 通常使用 bearing vectors (归一化射线) 或者直接在 adapter 中处理内参
  // 这里我们使用 CentralAbsoluteAdapter，它需要归一化坐标 (x, y, 1)

  opengv::points_t pts_world;      // 3D 点 (世界系)
  opengv::bearingVectors_t bears;  // 归一化观测射线 (相机系)

  pts_world.reserve(points_world.size());
  bears.reserve(observations.size());

  for (size_t i = 0; i < points_world.size(); ++i) {
    // 填入世界点
    pts_world.push_back(points_world[i]);

    // 将像素坐标转为归一化平面坐标 (x, y, 1)
    double u = observations[i].x();
    double v = observations[i].y();

    double nx = (u - cx) / fx;
    double ny = (v - cy) / fy;

    Eigen::Vector3d bearing_vector(nx, ny, 1.0);
    bearing_vector.normalize();  // 确保是单位向量
    bears.push_back(bearing_vector);
  }

  // 2. 创建 Adapter
  // CentralAbsoluteAdapter 适用于单相机绝对位姿估计
  opengv::absolute_pose::CentralAbsoluteAdapter adapter(bears, pts_world);

  // 3. 调用 EPnP
  // 返回值是 4x4 的变换矩阵 T_cam_world (即 [R|t])
  // 注意：OpenGV 的坐标系定义通常是 X 右，Y 下，Z 前 (符合计算机视觉习惯)
  opengv::transformation_t T_cw = opengv::absolute_pose::epnp(adapter);

  // 4. 提取 R 和 t
  Eigen::Matrix3d R_cw = T_cw.block<3, 3>(0, 0);
  Eigen::Vector3d t_cw = T_cw.block<3, 1>(0, 3);

  // Cheirality check：初值的 cheirality 比误差更重要
  int front_count = 0;
  for (size_t i = 0; i < pts_world.size(); ++i) {
    Eigen::Vector3d pc = R_cw * pts_world[i] + t_cw;
    if (pc.z() > 0) front_count++;
  }

  if (front_count < pts_world.size() / 2) {
    R_cw = -R_cw;
    t_cw = -t_cw;
  }

  // 5. 转换为四元数
  Eigen::Quaterniond q_cw(R_cw);
  q_cw.normalize();

  // 6. 输出到 Ceres 格式数组
  // Ceres 期望四元数顺序: [w, x, y, z]
  q_out[0] = q_cw.w();
  q_out[1] = q_cw.x();
  q_out[2] = q_cw.y();
  q_out[3] = q_cw.z();

  t_out[0] = t_cw.x();
  t_out[1] = t_cw.y();
  t_out[2] = t_cw.z();

  std::cout << "[OpenGV EPnP] Initial pose computed successfully." << std::endl;
  std::cout << "  Init Rot (w,x,y,z): [" << q_out[0] << ", " << q_out[1] << ", "
            << q_out[2] << ", " << q_out[3] << "]" << std::endl;
  std::cout << "  Init Trans (x,y,z): [" << t_out[0] << ", " << t_out[1] << ", "
            << t_out[2] << "]" << std::endl;

  return true;
}

Eigen::Quaterniond PnPProblem::CeresQuatToEigen(const double* q_ceres) {
  return Eigen::Quaterniond(q_ceres[0], q_ceres[1], q_ceres[2], q_ceres[3]);
}

void PnPProblem::EigenQuatToCeres(const Eigen::Quaterniond& q_eigen,
                                  double* q_ceres) {
  q_ceres[0] = q_eigen.w();
  q_ceres[1] = q_eigen.x();
  q_ceres[2] = q_eigen.y();
  q_ceres[3] = q_eigen.z();
}

3 主流程

在完善了本质矩阵问题和PnP问题的实现之后，就可以进行修改增量式 SFM 框架的主流程了，具体代码如下：

cpp 复制代码

#include <Eigen/Dense>
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <iostream>
#include <map>
#include <optional>

#include "Bundle.h"
#include "CameraIntrinsics.hpp"
#include "CameraPose.hpp"
#include "Math.h"
#include "Output.h"
#include "ReadData.h"
#include "SFMData.h"
#include "SolveBA.h"
#include "SolveEssential.h"
#include "SolvePnP.h"
#include "SolveTriangulate.h"

using namespace std;

bool GlobalBundleAdjustment(const Eigen::Matrix3d& K, const sfm::Bundle& bundle,
                            vector<cl::CameraExtrinsics>& cameras,
                            vector<cl::ObjectPoint>& objectPoints) {
  map<int, int> cameraIdMap;
  vector<BAProblem::CameraExtrinsics> est_cams;
  for (size_t ci = 0; ci < cameras.size(); ++ci) {
    if (cameras[ci].registered) {
      BAProblem::CameraExtrinsics cameraExtrinsics;
      Eigen::Quaterniond quaternion(cameras[ci].R);
      cameraExtrinsics.q[0] = quaternion.w();
      cameraExtrinsics.q[1] = quaternion.x();
      cameraExtrinsics.q[2] = quaternion.y();
      cameraExtrinsics.q[3] = quaternion.z();
      cameraExtrinsics.t[0] = cameras[ci].t.x();
      cameraExtrinsics.t[1] = cameras[ci].t.y();
      cameraExtrinsics.t[2] = cameras[ci].t.z();
      cameraIdMap[ci] = est_cams.size();
      est_cams.push_back(std::move(cameraExtrinsics));
    }
  }

  vector<BAProblem::ObjectPoint> est_pts;
  vector<BAProblem::Observation> obs;
  map<int, int> pointIdMap;

  for (const auto& track : bundle.tracks) {
    if (objectPoints[track.pointId].triangulated) {
      BAProblem::ObjectPoint objectPoint;
      objectPoint.xyz[0] = objectPoints[track.pointId].pos.x();
      objectPoint.xyz[1] = objectPoints[track.pointId].pos.y();
      objectPoint.xyz[2] = objectPoints[track.pointId].pos.z();

      for (auto ob : track.obs) {
        auto iter = cameraIdMap.find(ob.imgId);
        if (iter != cameraIdMap.end()) {
          BAProblem::Observation observation;
          observation.cam_id = iter->second;
          observation.pt_id = est_pts.size();
          observation.x = bundle.views[ob.imgId][ob.kpId].x();
          observation.y = bundle.views[ob.imgId][ob.kpId].y();
          obs.push_back(std::move(observation));
        }
      }
      pointIdMap.emplace(track.pointId, est_pts.size());
      est_pts.push_back(objectPoint);
    }
  }

  BAProblem problem(K, obs, est_cams, est_pts);
  if (!problem.Solve()) {
    return false;
  }

  for (const auto& [oldId, id] : cameraIdMap) {
    Eigen::Quaternion q(est_cams[id].q[0], est_cams[id].q[1], est_cams[id].q[2],
                        est_cams[id].q[3]);
    cameras[oldId].R = q.toRotationMatrix();
    cameras[oldId].t.x() = est_cams[id].t[0];
    cameras[oldId].t.y() = est_cams[id].t[1];
    cameras[oldId].t.z() = est_cams[id].t[2];
  }

  for (auto track : bundle.tracks) {
    if (objectPoints[track.pointId].triangulated) {
      auto iter = pointIdMap.find(track.pointId);
      if (iter != pointIdMap.end()) {
        objectPoints[track.pointId].pos.x() = est_pts[iter->second].xyz[0];
        objectPoints[track.pointId].pos.y() = est_pts[iter->second].xyz[1];
        objectPoints[track.pointId].pos.z() = est_pts[iter->second].xyz[2];
      }
    }
  }

  return true;
}

bool RegisterNextImage(const Eigen::Matrix3d& K, const sfm::Bundle& bundle,
                       vector<cl::CameraExtrinsics>& cameraExtrinsics,
                       vector<cl::ObjectPoint>& objectPoints) {
  map<int, vector<pair<int, int>>> mapImgTo3dAnd2d;
  for (const auto& track : bundle.tracks) {
    for (const auto& ob : track.obs) {
      if (cameraExtrinsics[ob.imgId].registered) {
        continue;
      }
      if (objectPoints[track.pointId].triangulated) {
        mapImgTo3dAnd2d[ob.imgId].push_back({track.pointId, ob.kpId});
      }
    }
  }

  // 暂时选择点数最多的影像
  int bestImgId = -1;
  int counter = 0;
  for (const auto& [imgId, ids] : mapImgTo3dAnd2d) {
    if (ids.size() > counter) {
      bestImgId = imgId;
      counter = ids.size();
    }
  }

  if (bestImgId == -1) {
    return false;
  }

  const auto& idsMap = mapImgTo3dAnd2d[bestImgId];
  vector<Eigen::Vector3d> points;
  vector<Eigen::Vector2d> observations;
  for (const auto& [id3d, id2d] : idsMap) {
    points.emplace_back(objectPoints[id3d].pos);
    observations.emplace_back(bundle.views[bestImgId][id2d]);
  }

  Eigen::Matrix3d R_est = cameraExtrinsics[bestImgId].R;
  Eigen::Vector3d t_est = cameraExtrinsics[bestImgId].t;
  PnPProblem problem(K, points, observations);
  if (problem.Solve(R_est, t_est)) {
    cameraExtrinsics[bestImgId].R = R_est;
    cameraExtrinsics[bestImgId].t = t_est;
    cameraExtrinsics[bestImgId].registered = true;
    return true;
  }

  return false;
}

void TriangulateNewPoints(const Eigen::Matrix3d& K, const sfm::Bundle& bundle,
                          vector<cl::CameraExtrinsics>& cameraExtrinsics,
                          vector<cl::ObjectPoint>& objectPoints) {
  for (auto track : bundle.tracks) {
    if (objectPoints[track.pointId].triangulated) {
      continue;
    }

    vector<sfm::Observation> registerObservation;
    for (const auto& ob : track.obs) {
      if (cameraExtrinsics[ob.imgId].registered) {
        registerObservation.push_back(ob);
      }
    }

    if (registerObservation.size() < 2) {
      continue;
    }

    std::vector<Eigen::Matrix3d> Rs;
    std::vector<Eigen::Vector3d> ts;
    std::vector<Eigen::Vector2d> observations;
    for (const auto& ob : registerObservation) {
      Rs.push_back(cameraExtrinsics[ob.imgId].R);
      ts.push_back(cameraExtrinsics[ob.imgId].t);
      observations.push_back(bundle.views[ob.imgId][ob.kpId]);
    }

    // 尝试三角化
    Eigen::Vector3d X = Eigen::Vector3d::Zero();
    TriangulateProblem triangulateProblem(K, Rs, ts, observations);
    if (!triangulateProblem.Solve(X)) {
      continue;
    }

    objectPoints[track.pointId].pos = X;
    objectPoints[track.pointId].triangulated = true;
  }
}

void DecomposeEssentialMatrix(const Eigen::Matrix3d& E,
                              std::vector<Eigen::Matrix3d>& Rs,
                              std::vector<Eigen::Vector3d>& ts) {
  Eigen::JacobiSVD<Eigen::Matrix3d> svd(
      E, Eigen::ComputeFullU | Eigen::ComputeFullV);

  Eigen::Matrix3d U = svd.matrixU();
  Eigen::Matrix3d Vt = svd.matrixV().transpose();

  // 保证 det(UV^T) = +1
  if ((U * Vt).determinant() < 0) {
    U.col(2) *= -1;
  }

  Eigen::Matrix3d W;
  W << 0, -1, 0, 1, 0, 0, 0, 0, 1;

  Eigen::Matrix3d R1 = U * W * Vt;
  Eigen::Matrix3d R2 = U * W.transpose() * Vt;

  Eigen::Vector3d t = U.col(2);

  Rs = {R1, R1, R2, R2};
  ts = {t, -t, t, -t};
}

int SelectBestRt(const std::vector<Eigen::Matrix3d>& Rs,
                 const std::vector<Eigen::Vector3d>& ts,
                 const std::vector<Eigen::Vector2d>& x1s,
                 const std::vector<Eigen::Vector2d>& x2s,
                 const Eigen::Matrix3d& K) {
  int best_idx = -1;
  int max_positive = -1;

  for (size_t i = 0; i < Rs.size(); ++i) {
    int positive_count = 0;

    Eigen::Matrix3d R = Rs[i];
    Eigen::Vector3d t = ts[i];

    for (int j = 0; j < x1s.size(); ++j) {
      Eigen::Vector3d X = Eigen::Vector3d::Zero();
      TriangulateProblem triangulateProblem(K, {Eigen::Matrix3d::Identity(), R},
                                            {Eigen::Vector3d::Zero(), t},
                                            {x1s[j], x2s[j]});
      if (!triangulateProblem.Solve(X)) {
        continue;
      }

      double z1 = X.z();
      double z2 = (R * X + t).z();

      if (z1 > 0 && z2 > 0) {
        positive_count++;
      }
    }

    if (positive_count > max_positive) {
      max_positive = positive_count;
      best_idx = i;
    }
  }

  return best_idx;
}

void Init(const Eigen::Matrix3d& K, const sfm::Bundle& bundle,
          vector<cl::CameraExtrinsics>& cameraExtrinsics,
          vector<cl::ObjectPoint>& objectPoints) {
  // 从匹配图中找到重叠度最高、基线最合适的两张图

  const auto& tracks = bundle.tracks;

  map<pair<int, int>, int> imagePairCounter;
  for (const auto& track : tracks) {
    if (track.obs.size() < 2) {
      continue;
    }

    for (size_t i = 0; i < track.obs.size(); ++i) {
      for (size_t j = i + 1; j < track.obs.size(); ++j) {
        pair<int, int> imagePair{track.obs[i].imgId, track.obs[j].imgId};
        auto iter = imagePairCounter.find(imagePair);
        if (imagePairCounter.end() == iter) {
          imagePairCounter.emplace(imagePair, 1);
        } else {
          iter->second += 1;
        }
      }
    }
  }

  // 暂时只选取匹配点最多的
  int counter = 0;
  int image1 = -1;
  int image2 = -1;
  for (const auto& [imagePair, count] : imagePairCounter) {
    if (count > counter) {
      counter = count;
      image1 = imagePair.first;
      image2 = imagePair.second;
    }
  }
  if (image1 == -1 || image2 == -1) {
    return;
  }

  // 找到同名点
  std::vector<Eigen::Vector2d> pixel1coords;
  std::vector<Eigen::Vector2d> pixel2coords;
  std::vector<Eigen::Vector3d> x1s_noisy;
  std::vector<Eigen::Vector3d> x2s_noisy;
  for (const auto& track : tracks) {
    if (track.obs.size() < 2) {
      continue;
    }

    for (size_t i = 0; i < track.obs.size(); ++i) {
      for (size_t j = i + 1; j < track.obs.size(); ++j) {
        if (track.obs[i].imgId == image1 && track.obs[j].imgId == image2) {
          const auto& view1 = bundle.views[image1][track.obs[i].kpId];
          const auto& view2 = bundle.views[image2][track.obs[j].kpId];

          // 像素坐标
          pixel1coords.push_back(view1);
          pixel2coords.push_back(view2);

          // 转为归一化坐标
          Eigen::Vector3d pixel1_noisy(view1.x(), view1.y(), 1);
          Eigen::Vector3d pixel2_noisy(view2.x(), view2.y(), 1);

          Eigen::Vector3d x1_noisy = (K.inverse() * pixel1_noisy).normalized();
          Eigen::Vector3d x2_noisy = (K.inverse() * pixel2_noisy).normalized();

          x1s_noisy.push_back(x1_noisy);
          x2s_noisy.push_back(x2_noisy);
        }
      }
    }
  }

  // 求解本质矩阵
  EssentialProblem essentialProblem;
  Eigen::Matrix3d E = essentialProblem.Solve(K, x1s_noisy, x2s_noisy, 1.0, 100);

  std::vector<Eigen::Matrix3d> Rs;
  std::vector<Eigen::Vector3d> ts;
  DecomposeEssentialMatrix(E, Rs, ts);

  int bestIdx = SelectBestRt(Rs, ts, pixel1coords, pixel2coords, K);

  // 相机 1 为原点，单位矩阵旋转
  cameraExtrinsics[image1].registered = true;
  cameraExtrinsics[image1].R = Eigen::Matrix3d::Identity();
  cameraExtrinsics[image1].t = Eigen::Vector3d::Zero();

  // 相机 2 的位姿 = 相对位姿
  cameraExtrinsics[image2].registered = true;
  cameraExtrinsics[image2].R = Rs[bestIdx];
  cameraExtrinsics[image2].t = ts[bestIdx];

  TriangulateNewPoints(K, bundle, cameraExtrinsics, objectPoints);

  GlobalBundleAdjustment(K, bundle, cameraExtrinsics, objectPoints);
}

void Iterate(const Eigen::Matrix3d& K, const sfm::Bundle& bundle,
             vector<cl::CameraExtrinsics>& cameraExtrinsics,
             vector<cl::ObjectPoint>& objectPoints) {
  while (true) {
    cout << "--- 新一轮迭代 ---" << endl;

    // 1. 尝试注册一张新图片
    // 返回 true 表示成功注册了一张，false 表示没有更多图片可注册
    bool success = RegisterNextImage(K, bundle, cameraExtrinsics, objectPoints);

    if (!success) {
      cout << "没有更多图片可以注册，结束。" << endl;
      break;
    }

    // 2. 三角化新点
    // 利用新注册的相机，与已注册相机之间的匹配，生成新的 3D 点
    TriangulateNewPoints(K, bundle, cameraExtrinsics, objectPoints);

    // 4. 全局光束法平差 (Global BA)
    // 优化所有已注册相机的位姿和所有 3D 点坐标
    cout << "----GlobalBundleAdjustment----" << endl;
    if (!GlobalBundleAdjustment(K, bundle, cameraExtrinsics, objectPoints)) {
      break;
    }

    // break;
  }

  cout << "=== SfM 完成 ===" << endl;
  {
    int counter = 0;
    for (const auto& cam : cameraExtrinsics) {
      if (cam.registered) {
        counter++;
      }
    }
    cout << "最终相机数: " << counter << endl;
  }
  {
    int counter = 0;
    for (const auto& point : objectPoints) {
      if (point.triangulated) {
        counter++;
      }
    }
    cout << "最终 3D 点数: " << counter << endl;
  }
}

void IncrementalSFM(const Eigen::Matrix3d& K,
                    vector<cl::CameraExtrinsics>& cameraExtrinsics,
                    const sfm::Bundle& bundle,
                    vector<cl::ObjectPoint>& objectPoints) {
  // 初始化启动
  Init(K, bundle, cameraExtrinsics, objectPoints);
  // 迭代优化
  Iterate(K, bundle, cameraExtrinsics, objectPoints);
}

int main() {
  std::string cameraIntrinsicsPath =
      "D:/Work/SFMWork/SFM/CameraIntrinsics.json";
  std::string bundlePath = "D:/Work/SFMWork/SFM/Bundle.json";
  std::string outputDir = "D:/Work/SFMWork/SFM";

  auto [K, cameraExtrinsics, bundle, objectPoints] =
      ReadData(cameraIntrinsicsPath, bundlePath);

  IncrementalSFM(K, cameraExtrinsics, bundle, objectPoints);

  Output(K, cameraExtrinsics, bundle, objectPoints, outputDir);
}

在这段代码实现中，有两个关键点体现了本文的无先验实现与先前的依赖先验的不同：

3.1 本质矩阵启动

在带先验的流程中，我们拥有绝对的坐标参考，可以直接进行三角化。但在纯视觉自由网平差中，初始状态下所有数据都是"漂浮"的，我们无法直接确定尺度，必须先通过 对极几何 求解本质矩阵 $E$，从而恢复两视图之间的相对运动（旋转 $R$ 和平移方向 $t$）。

这就引出了增量式 SfM 中最关键的初始化 步骤。代码中的 Init 函数首先遍历所有匹配轨迹，寻找重叠度最高的一对图像作为"种子"。这里的实现逻辑较为简单，仅选取了同名点对数量最多 的一对图像。在实际工程中，仅看点数是不够的。理想的初始化对还需要满足基线条件------基线太短会导致三角化深度误差极大，基线太长则匹配困难。因此，更稳健的策略通常会综合考量"匹配点数"与"视差大小"，选取几何分布最优的一对图像来构建初始骨架。

3.2 歧义性消除

本质矩阵 $E$ 的分解在数学上存在四义性 ，即会分解出 4 组可能的 $(R, t)$ 组合。这意味着，仅凭数学推导，我们无法确定相机是向前平移还是向后平移，也无法确定物体是在相机前方还是后方。

为了从这 4 组解中选出唯一正确的物理真实解，我们需要进行手性检查：

原理：正确的相机位姿应当使得三角化后的 3D 点位于两个相机的正前方 （即深度 $Z > 0$）。
实现：代码中的 SelectBestRt 函数正是执行这一逻辑。它尝试将匹配点三角化，并统计在每种 $(R, t)$ 组合下，有多少个点的深度值为正。最终，我们选取那个能让最多点落在相机前方的解作为初始位姿。

虽然 OpenGV 的 RANSAC + 5点算法在内部已经处理了这一过程，但在本系统中，为了追求极致的严密性，我们采用了 "RANSAC 滤除外点 + 8点算法计算初值 + 非线性优化 Sampson 误差" 的流程。这使得求解出的 $E$ 矩阵更加精准，但同时也要求我们必须手动严谨地处理分解后的歧义性，确保增量式重建的"第一步"迈得稳健而准确。

3.3 运行结果

在完成无先验约束的增量式 SfM 自由网平差后，我们得到了如下的重建结果：

text 复制代码

=== SfM 重建精度报告 ===

=== 总体重投影误差评估 ===
总有效观测点数量: 213825
均方根重投影误差 (RMSE): 0.925533 像素
平均重投影距离: 1.308901 像素
最大重投影误差: 4.736852 像素

评估结果: 优秀 (RMSE < 1.0 px)

=== 各相机重投影误差统计 ===
相机ID	观测点数	平均误差(px)	最大误差(px)
------	--------	----------	----------
0	669		1.138774		3.507408
1	800		1.143470		3.335997
2	963		1.119960		3.558341
3	1236		1.134194		3.828714
4	1384		1.160828		4.002508
5	1273		1.186861		3.400788
6	1419		1.183699		3.492871
7	1307		1.175605		3.275668
8	1283		1.154476		4.030041
9	1125		1.173777		3.302954
10	1300		1.176096		3.561395
11	1171		1.176223		3.541258
12	1203		1.160050		3.402125
13	1249		1.133387		3.659208
14	1225		1.138528		3.633788
15	982		1.135840		3.719180
16	862		1.112004		3.657755
17	814		1.127513		3.426723
18	1163		1.142030		3.467215
19	1553		1.123140		4.273178
20	1930		1.157983		4.526633
21	2123		1.166807		3.760000
22	2463		1.177138		3.750228
23	2486		1.189237		4.736852
24	2481		1.143466		3.686592
25	2536		1.167948		3.438798
26	2507		1.181981		3.609180
27	2472		1.188758		4.423699
28	2441		1.182264		3.603234
29	2483		1.157665		3.327187
30	2484		1.164950		3.781155
31	2492		1.185153		3.670784
32	2397		1.149647		3.426047
33	2044		1.171107		4.462696
34	1734		1.159800		3.820606
35	1407		1.082872		3.349179
36	1043		1.117492		3.077409
37	1350		1.132475		3.575782
38	1859		1.166695		3.566768
39	2214		1.166246		4.206265
40	2403		1.171880		3.918882
41	2410		1.196405		3.748022
42	2455		1.174788		3.738733
43	2530		1.184634		4.019759
44	2513		1.170182		3.729875
45	2486		1.147136		3.860483
46	2473		1.172374		3.837221
47	2486		1.161873		4.006828
48	2418		1.178098		3.607638
49	2485		1.168481		4.618271
50	2481		1.160946		3.773381
51	2049		1.176430		4.057424
52	1771		1.152675		3.946432
53	1659		1.142041		3.543069
54	1354		1.150095		3.438394
55	1600		1.138327		3.702788
56	1752		1.167379		3.827091
57	2157		1.172125		3.782260
58	2391		1.159461		3.774902
59	2429		1.169183		3.842724
60	2378		1.201136		4.486049
61	2437		1.160235		3.625918
62	2497		1.180614		3.512648
63	2457		1.175333		3.901792
64	2586		1.193847		4.473947
65	2575		1.175705		3.773060
66	2599		1.161320		3.512931
67	2607		1.161115		3.698081
68	2451		1.143361		3.820784
69	2056		1.140676		3.687266
70	1671		1.128706		3.619209
71	1263		1.124349		4.584541
72	1122		1.079015		3.740648
73	1567		1.127897		3.590001
74	1955		1.170258		3.543260
75	2223		1.170362		3.765049
76	2526		1.195866		3.869577
77	2529		1.144766		3.896475
78	2509		1.151114		3.560529
79	2513		1.171264		3.767642
80	2552		1.160158		3.700139
81	2525		1.175293		3.726552
82	2513		1.161876		3.560540
83	2503		1.158764		3.792600
84	2502		1.173565		4.141700
85	2492		1.165210		4.019042
86	2346		1.142869		3.652382
87	1984		1.180603		3.698165
88	1868		1.166455		4.111056
89	1289		1.129071		3.762945
90	1075		1.072147		3.739352
91	1441		1.116879		3.608247
92	1756		1.142750		3.263206
93	2055		1.156480		3.373123
94	2242		1.153276		3.855136
95	2224		1.129141		3.476008
96	2380		1.144311		3.556758
97	2379		1.131601		3.560533
98	2219		1.177845		4.151206
99	2356		1.163374		3.976558
100	2341		1.142496		3.640941
101	2263		1.150690		3.291192
102	2241		1.123987		3.727702
103	2374		1.164205		3.684737
104	2180		1.136920		3.452273
105	1911		1.118371		3.378943
106	1699		1.156649		4.433503
107	1365		1.084850		3.220002

从输出的重投影误差报告来看，系统的性能表现优异：

重投影误差：全局光束法平差后的均方根误差（RMSE）仅为 0.925 像素。这意味着计算出的相机位姿和 3D 点坐标在几何上高度自洽，能够精确地解释图像中的特征点位置。
对比分析：这一精度指标与我们在第 19 篇《最小二乘问题详解19：带先验约束的增量式SFM优化与实现》中带先验约束的结果相比差距不大。这证明了"RANSAC + 8点算法 + 非线性优化"这一纯视觉流程在恢复图像间相对几何关系上的有效性。即使没有外部传感器辅助，算法依然能构建出内部一致性极高的几何模型。

另一方面，最终优化后的前三个相机的位姿参数是：

json 复制代码

[
  {
    "pitch": -0.05766287195554473,
    "px": 11.408531513538671,
    "py": 9.711222665234468,
    "pz": 0.48165271573651947,
    "roll": 3.1206414891405863,
    "yaw": 3.138861801848
  },
  {
    "pitch": -0.08348148460337527,
    "px": 11.17640395893969,
    "py": 8.763010024814704,
    "pz": 0.06284357759207775,
    "roll": 3.138016071650729,
    "yaw": 3.0552854848438606
  },
  {
    "pitch": -0.023266005538045045,
    "px": 10.333931416822084,
    "py": 8.59484703080245,
    "pz": 0.32661725863531216,
    "roll": -3.1094300937908668,
    "yaw": 3.0341763341055286
  }
]

而作为对比的真实的相机位姿是：

json 复制代码

[
  {
    "pitch": 0.009722665505805727,
    "px": 200.78643172534493,
    "py": 214.25507616417832,
    "pz": 258.2814900599498,
    "roll": 0.024034282956425585,
    "yaw": 0.06247329611914964
  },
  {
    "pitch": 0.03636411690376355,
    "px": 200.5850499916218,
    "py": 201.2811957198281,
    "pz": 250.38160856483407,
    "roll": 0.008626373359168276,
    "yaw": -0.02017919864408936
  },
  {
    "pitch": -0.02120212654719572,
    "px": 166.34672474432844,
    "py": 208.88006011663123,
    "pz": 252.61967050777355,
    "roll": -0.030947272405334555,
    "yaw": -0.039678961904428135
  }
]

这说明虽然优化结果的内部精度很高，但这个结果是相对的，而非绝对的 。由于纯视觉 SfM 仅依赖 2D-2D 的像素对应关系，系统中不存在任何物理尺度或绝对方向的参考。这导致重建结果具有七参数的不确定性（7 Degrees of Freedom）：

3 个平移自由度 ：我们无法确定场景在空间中的具体位置（$X, Y, Z$ 偏移）。例如，运行结果中第一张图的 $t= $11.4, 9.7, 0.48$ $ 与真实数据 $ $200.7, 214.2, 258.2$ $ 存在巨大的坐标原点差异。
3 个旋转自由度：我们无法确定场景的绝对朝向（Yaw, Pitch, Roll 偏航、俯仰、滚转）。输出的欧拉角与真实值在数值上完全不同，因为坐标系的轴向是任意选取的。
1 个尺度自由度 ：我们无法确定场景的绝对物理大小。在初始化阶段，本质矩阵分解出的平移向量 $\mathbf{t}$ 是尺度归一化 的（即长度被强制为 1）。这意味着，无论真实场景的基线是 1 米还是 100 米，算法初始化时都将其视为"1 个单位"。因此，重建结果与真实数据之间缺失了一个物理尺度因子（Scale Factor） 。输出的平移向量（如 $ $11.4, 9.7, 0.48$ $）仅代表相对方向，其模长仅代表重建系统内部的"任意单位"，与真实物理距离（如 $ $200.7, 214.2, 258.2$ $）无法直接对应。

总而言之，纯视觉 SfM 重建出的是一个欧氏空间下的"相似变换"模型。它完美地保留了场景的几何形状（角度、比例），但丢失了绝对的位置、朝向和尺度。如果需要将这个"自由网"转换到真实世界的地理坐标系中，必须依赖外部先验（如控制点）或物理约束（如双目视觉提供的绝对尺度）。

最小二乘问题详解20：无先验约束下的增量式SFM自由网平差

1 引言

2 子问题

2.1 本质矩阵问题

2.2 PnP问题

3 主流程

3.1 本质矩阵启动

3.2 歧义性消除

3.3 运行结果