In this post, we discuss the post-processing of LiDAR SLAM via bundle adjustment (BA), focusing mainly on the problem formulation. The content follows papers [1] and [2].
Problem Formulation

Given LiDAR poses, each denoted by $\mathbf{T}_j = (\mathbf{R}_j,\mathbf{t}_j)$ $(j=1,\ldots,M_p)$, bundle adjustment refers to simultaneously determining all the LiDAR poses (denoted by $\mathbf{T} = (\mathbf{T}_1,\cdots,\mathbf{T}_{M_p})$) and feature parameters (denoted by $\boldsymbol{\pi} = (\pi_1,\cdots,\pi_{M_f})$) such that the reconstructed map agrees with the LiDAR measurements to the best extent. Denote by $c(\pi_i,\mathbf{T})$ the map consistency due to the $i$-th feature; a straightforward BA formulation is
$$ \begin{equation*} \min_{\mathbf{T},\,\boldsymbol{\pi}} \left( \sum_{i=1}^{M_f} c(\pi_i,\mathbf{T}) \right). \end{equation*} $$In our BA formulation, we make use of plane and edge features, which are often abundant in LiDAR point clouds, and minimize the natural Euclidean distance between each raw LiDAR point and its corresponding feature. Each feature is parameterized as $\pi_i = (\mathbf{n}_i, \mathbf{q}_i)$, where $\mathbf{n}_i$ is the plane normal (for a plane feature) or the edge direction (for an edge feature) and $\mathbf{q}_i$ is a point on the feature. Specifically, assume a total of $N_{ij}$ LiDAR points are measured on the $i$-th feature at the $j$-th LiDAR pose, each denoted by $\mathbf{p}_{f_{ijk}}$ $(k=1,\ldots,N_{ij})$ in the local LiDAR frame. The predicted location of such a point in the global frame is
$$ \mathbf{p}_{ijk} = \mathbf{R}_j\,\mathbf{p}_{f_{ijk}} + \mathbf{t}_j. $$Aggregating the squared point-to-plane distances over all points observed at all poses, normalized by their number, leads to the map consistency of a plane feature:
$$ \begin{equation} c(\pi_i,\mathbf{T}) = \frac{1}{N_i}\sum_{j=1}^{M_p}\sum_{k=1}^{N_{ij}} \left\|\, \mathbf{n}_i^{\top}\big(\mathbf{p}_{ijk}-\mathbf{q}_i\big) \right\|_2^{2}. \end{equation} $$Here $N_i=\sum_{j=1}^{M_p} N_{ij}$ is the total number of LiDAR points observed on the plane feature by all poses.
For the edge feature:
$$ \begin{equation} c(\pi_i,\mathbf{T}) = \frac{1}{N_i}\sum_{j=1}^{M_p}\sum_{k=1}^{N_{ij}} \left\| \big(\mathbf{I}-\mathbf{n}_i\mathbf{n}_i^{\top}\big)\big(\mathbf{p}_{ijk}-\mathbf{q}_i\big) \right\|_2^{2}. \end{equation} $$Here $N_i=\sum_{j=1}^{M_p} N_{ij}$ is the total number of LiDAR points observed on the edge feature by all poses.
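To make the two costs concrete, here is a minimal numpy sketch that evaluates Eqs. (1) and (2) for a single feature; the function names, the `(R, t)` pose tuples, and the per-pose point lists are illustrative assumptions rather than anything from [1] or [2].

```python
import numpy as np

def plane_cost(points_local, poses, n, q):
    """Map consistency (1) of one plane feature: mean squared point-to-plane
    distance over all points observed at all poses.

    points_local : list of (N_ij, 3) arrays, points on the feature in each LiDAR frame
    poses        : list of (R_j, t_j) tuples with R_j of shape (3, 3) and t_j of shape (3,)
    n, q         : plane normal and a point on the plane (both shape (3,))
    """
    sq_dists, n_total = 0.0, 0
    for (R, t), pts in zip(poses, points_local):
        p_global = pts @ R.T + t              # p_ijk = R_j p_f_ijk + t_j
        d = (p_global - q) @ n                # signed point-to-plane distances
        sq_dists += np.sum(d ** 2)
        n_total += len(pts)
    return sq_dists / n_total

def edge_cost(points_local, poses, n, q):
    """Map consistency (2) of one edge feature: mean squared point-to-line
    distance, with n the edge direction and q a point on the edge."""
    P = np.eye(3) - np.outer(n, n)            # projector orthogonal to the edge direction
    sq_dists, n_total = 0.0, 0
    for (R, t), pts in zip(poses, points_local):
        p_global = pts @ R.T + t
        r = (p_global - q) @ P                # residual component orthogonal to the edge
        sq_dists += np.sum(r ** 2)
        n_total += len(pts)
    return sq_dists / n_total
```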
Elimination of feature parameters
In the BA optimization above, each cost term $c(\pi_i,\mathbf{T})$ depends on only one feature parameter $\pi_i$; hence, for fixed poses, the feature parameters can be optimized independently. Concretely,
$$ \begin{aligned} \min_{\mathbf{T},\,\boldsymbol{\pi}} \left(\sum_{i=1}^{M_f} c(\pi_i,\mathbf{T})\right) &= \min_{\mathbf{T}}\left(\min_{\boldsymbol{\pi}} \sum_{i=1}^{M_f} c(\pi_i,\mathbf{T})\right)\\ &= \min_{\mathbf{T}}\left(\sum_{i=1}^{M_f} \min_{\pi_i} c(\pi_i,\mathbf{T})\right). \end{aligned} $$The inner minimization $\min_{\pi_i} c(\pi_i,\mathbf{T})$ admits a closed-form solution: the optimal $\mathbf{q}_i$ is the centroid of the points and the optimal $\mathbf{n}_i$ is an eigenvector, so the minimum value is the smallest eigenvalue (for a plane feature) or the sum of the two smallest eigenvalues (for an edge feature) of the matrix
$$ \begin{aligned} \mathbf{A}_i &\triangleq \frac{1}{N_i}\sum_{j=1}^{M_p}\sum_{k=1}^{N_{ij}} \big(\mathbf{p}_{ijk}-\bar{\mathbf{p}}_i\big)\big(\mathbf{p}_{ijk}-\bar{\mathbf{p}}_i\big)^{\top},\\[2pt] \bar{\mathbf{p}}_i &\triangleq \frac{1}{N_i}\sum_{j=1}^{M_p}\sum_{k=1}^{N_{ij}} \mathbf{p}_{ijk}. \end{aligned} $$Further, let us expand the matrix $\mathbf{A}_i$ as a function of the data points and the poses
$$ \begin{aligned} \mathbf{A}_i &= \frac{1}{N_i}\sum_{j=1}^{M_p}\sum_{k=1}^{N_{ij}} \big(\mathbf{p}_{ijk}-\bar{\mathbf{p}}_i\big)\big(\mathbf{p}_{ijk}-\bar{\mathbf{p}}_i\big)^{\top} \\ &= \frac{1}{N_i}\sum_{j=1}^{M_p}\sum_{k=1}^{N_{ij}} \big(\mathbf{p}_{ijk}\mathbf{p}_{ijk}^{\top}\big) - \bar{\mathbf{p}}_i\bar{\mathbf{p}}_i^{\top} \\ &= \frac{1}{N_i}\,\mathbf{P}_i \;-\; \frac{1}{N_i^{2}}\,\mathbf{v}_i\mathbf{v}_i^{\top} \; \triangleq\; \mathbf{A}(\mathbf{C}_i), \end{aligned} $$where
$$ \begin{aligned} \mathbf{C}_i \triangleq \begin{bmatrix} \mathbf{P}_i & \mathbf{v}_i \\ \mathbf{v}_i^{\top} & N_i \end{bmatrix}, \quad \mathbf{P}_i = \sum_{j=1}^{M_p}\sum_{k=1}^{N_{ij}} \mathbf{p}_{ijk}\mathbf{p}_{ijk}^{\top}, \quad \mathbf{v}_i = \sum_{j=1}^{M_p}\sum_{k=1}^{N_{ij}} \mathbf{p}_{ijk}. \end{aligned} $$
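As an illustration, here is a small numpy sketch (the helper names `cluster_matrix` and `A_of_C` are my own) that assembles $\mathbf{C}_i$ from the globally registered points of one feature and recovers $\mathbf{A}(\mathbf{C}_i)$ from it.

```python
import numpy as np

def cluster_matrix(p_global):
    """Point cluster matrix C_i (4x4) from the globally registered points of one feature."""
    C = np.zeros((4, 4))
    C[:3, :3] = p_global.T @ p_global      # P_i = sum_jk p_ijk p_ijk^T
    C[:3, 3] = p_global.sum(axis=0)        # v_i = sum_jk p_ijk
    C[3, :3] = C[:3, 3]
    C[3, 3] = len(p_global)                # N_i
    return C

def A_of_C(C):
    """A(C_i) = P_i / N_i - v_i v_i^T / N_i^2."""
    P, v, N = C[:3, :3], C[:3, 3], C[3, 3]
    return P / N - np.outer(v, v) / N ** 2
```

For any point set, `A_of_C(cluster_matrix(pts))` matches the scatter matrix $\mathbf{A}_i$ computed directly from the centered points.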
Consequently, the original BA optimization reduces to
$$ \begin{equation} \min_{\mathbf{T}} \left( \sum_{i=1}^{M_f} \lambda_{\ell}\!\left(\mathbf{A}_i\right) \right), \end{equation} $$where $\lambda_{\ell}(\cdot)$ denotes the $\ell$-th largest eigenvalue, $\ell \in \{2,3\}$: the cost of a plane feature is $\lambda_{3}(\mathbf{A}_i)$ and that of an edge feature is $\lambda_{2}(\mathbf{A}_i)+\lambda_{3}(\mathbf{A}_i)$; we write a single $\lambda_{\ell}$ for brevity.
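As a quick sanity check of the feature elimination, the following sketch uses synthetic data (all values arbitrary): minimizing the plane cost directly, with $\mathbf{q}_i$ the centroid and $\mathbf{n}_i$ the eigenvector of the smallest eigenvalue, reproduces $\lambda_3(\mathbf{A}_i)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic plane feature: noisy points near the plane z = 0, already in the global frame.
pts = rng.normal(size=(500, 3)) * np.array([2.0, 2.0, 0.01])

p_bar = pts.mean(axis=0)
A_i = (pts - p_bar).T @ (pts - p_bar) / len(pts)     # the scatter matrix A_i
eigvals, eigvecs = np.linalg.eigh(A_i)               # eigenvalues in ascending order

# Optimal plane parameters: normal = eigenvector of the smallest eigenvalue, q = centroid.
n_opt, q_opt = eigvecs[:, 0], p_bar
direct_min = np.mean(((pts - q_opt) @ n_opt) ** 2)   # Eq. (1) evaluated at the optimum

print(direct_min, eigvals[0])                        # the two values coincide
# For an edge feature, the eliminated cost is eigvals[0] + eigvals[1] instead.
```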
Since $\mathbf{p}_{ijk} = \mathbf{R}_j \mathbf{p}_{f_{ijk}} + \mathbf{t}_j$, we have
$$ \mathbf{C}_i = \sum_{j=1}^{M_p} \mathbf{T}_j \mathbf{C}_{f_{ij}} \mathbf{T}_j^{\top}, $$where $\mathbf{T}_j$ is understood as the $4\times 4$ homogeneous transformation matrix $\begin{bmatrix} \mathbf{R}_j & \mathbf{t}_j \\ \mathbf{0}^{\top} & 1 \end{bmatrix}$ and $\mathbf{C}_{f_{ij}}$ is the point cluster of the $i$-th feature built, in the same way as $\mathbf{C}_i$, from the local-frame points $\mathbf{p}_{f_{ijk}}$ measured at the $j$-th pose.
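This identity is straightforward to verify numerically; below is a short sketch (helper names are illustrative) that builds the local clusters $\mathbf{C}_{f_{ij}}$ and accumulates them through the homogeneous poses.

```python
import numpy as np

def cluster_local(pts):
    """C_f_ij built from the local-frame points of one feature at one pose."""
    C = np.zeros((4, 4))
    C[:3, :3] = pts.T @ pts
    C[:3, 3] = pts.sum(axis=0)
    C[3, :3] = C[:3, 3]
    C[3, 3] = len(pts)
    return C

def aggregate_cluster(points_local, poses):
    """C_i = sum_j T_j C_f_ij T_j^T, with T_j the 4x4 homogeneous transform of pose j."""
    C = np.zeros((4, 4))
    for (R, t), pts in zip(poses, points_local):
        T = np.eye(4)
        T[:3, :3], T[:3, 3] = R, t
        C += T @ cluster_local(pts) @ T.T
    return C
```

Applying $\mathbf{A}(\cdot)$ to the result (e.g. with the `A_of_C` helper sketched earlier) gives the same $\mathbf{A}_i$ as registering every point individually and computing the scatter matrix directly. Thus our optimization problem reduces to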
$$ \begin{equation} \min_{\mathbf T_j \in SE(3),\, \forall j} \left( \underbrace{\sum_{i=1}^{M_f} \lambda_{\ell}\!\left( \mathbf A\!\left( \sum_{j=1}^{M_p} \mathbf T_j \mathbf C_{f_{ij}} \mathbf T_j^{\top} \right)\right)}_{c(\mathbf T)} \right). \end{equation} $$
Optimization
We can compute the gradient and Hessian of the cost function, so any first-order or second-order optimization method can be used to solve the problem. The core step is to compute the first- and second-order derivatives of an eigenvalue of a symmetric matrix with respect to the variables on which the matrix depends.
For a scalar $x \in \mathbb{R}$ and a symmetric matrix $\mathbf{A}(x) \in \mathbb{S}^{3 \times 3}$ that depends on $x$, we have the following two conclusions.
$$ \frac{\partial \lambda_k(x)}{\partial x} = \mathbf{u}_k(x)^T \frac{\partial \mathbf{A}(x)}{\partial x} \mathbf{u}_k(x) $$$$ \frac{\partial \mathbf{u}_l(x)}{\partial x} = \sum_{k=1, k \ne l}^{3} \frac{1}{\lambda_l - \lambda_k}\, \mathbf{u}_k(x)\, \mathbf{u}_k(x)^T \frac{\partial \mathbf{A}(x)}{\partial x} \mathbf{u}_l(x) $$where $\lambda_l$ ($l=1,2,3$) denotes the $l$-th largest eigenvalue of $\mathbf{A}(x)$ and $\mathbf{u}_l$ is the corresponding eigenvector. The first equation gives the first-order derivative of an eigenvalue with respect to $x$; the second gives the derivative of an eigenvector, which is the building block for the second-order derivative of an eigenvalue.
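Both conclusions can be checked by finite differences; here is a minimal numpy sketch using a hypothetical one-parameter family $\mathbf{A}(x) = \mathbf{A}_0 + x\,\mathbf{B}$ of symmetric matrices (all values arbitrary).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical family of symmetric matrices: A(x) = A0 + x * B, so dA/dx = B.
S = rng.normal(size=(3, 3)); A0 = S + S.T
S = rng.normal(size=(3, 3)); B = S + S.T
A = lambda x: A0 + x * B

def eig_desc(M):
    """Eigen-decomposition with eigenvalues sorted in descending order (largest first)."""
    w, U = np.linalg.eigh(M)
    return w[::-1], U[:, ::-1]

x0, eps = 0.3, 1e-6
lam, U = eig_desc(A(x0))
lam_p, U_p = eig_desc(A(x0 + eps))
lam_m, U_m = eig_desc(A(x0 - eps))

# First conclusion: d(lambda_k)/dx = u_k^T (dA/dx) u_k, checked for the smallest eigenvalue.
k = 2
print(U[:, k] @ B @ U[:, k], (lam_p[k] - lam_m[k]) / (2 * eps))

# Second conclusion: d(u_l)/dx = sum_{k != l} u_k u_k^T (dA/dx) u_l / (lambda_l - lambda_k).
l = 2
du_analytic = sum(U[:, m] * (U[:, m] @ B @ U[:, l]) / (lam[l] - lam[m])
                  for m in range(3) if m != l)
# Fix the sign ambiguity of the numerically computed eigenvectors before differencing.
U_p[:, l] *= np.sign(U_p[:, l] @ U[:, l])
U_m[:, l] *= np.sign(U_m[:, l] @ U[:, l])
print(du_analytic, (U_p[:, l] - U_m[:, l]) / (2 * eps))
```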
To prove these conclusions, write the eigendecomposition
$$ \mathbf{A}(x) = \mathbf{U}(x) \mathbf{\Lambda}(x) \mathbf{U}(x)^T $$where $\mathbf{\Lambda}(x) = \text{diag}(\lambda_1(x), \lambda_2(x), \lambda_3(x))$ collects the eigenvalues and $\mathbf{U}(x) = [\mathbf{u}_1(x) \ \mathbf{u}_2(x) \ \mathbf{u}_3(x)]$ is an orthogonal matrix whose columns are the eigenvectors. Therefore,
$$ \mathbf{\Lambda}(x) = \mathbf{U}(x)^T \mathbf{A}(x) \mathbf{U}(x) $$Taking the derivative of both sides with respect to $x$,
$$ \frac{\partial \mathbf{\Lambda}(x)}{\partial x} = \left(\frac{\partial \mathbf{U}(x)}{\partial x}\right)^T \mathbf{A}(x) \mathbf{U}(x) + \mathbf{U}(x)^T \frac{\partial \mathbf{A}(x)}{\partial x} \mathbf{U}(x) + \mathbf{U}(x)^T \mathbf{A}(x) \frac{\partial \mathbf{U}(x)}{\partial x} $$Since $\mathbf{U}(x)^T \mathbf{A}(x) = \mathbf{\Lambda}(x) \mathbf{U}(x)^T$ and $\mathbf{A}(x) \mathbf{U}(x) = \mathbf{U}(x) \mathbf{\Lambda}(x)$, the equation becomes
$$ \frac{\partial \mathbf{\Lambda}(x)}{\partial x} = \underbrace{\left(\frac{\partial \mathbf{U}(x)}{\partial x}\right)^T \mathbf{U}(x)}_{\mathbf{D}(x)^T} \mathbf{\Lambda}(x) + \mathbf{U}(x)^T \frac{\partial \mathbf{A}(x)}{\partial x} \mathbf{U}(x) + \mathbf{\Lambda}(x) \underbrace{\mathbf{U}(x)^T \frac{\partial \mathbf{U}(x)}{\partial x}}_{\mathbf{D}(x)} $$Denote $\mathbf{D}(x) \triangleq \mathbf{U}(x)^T \frac{\partial \mathbf{U}(x)}{\partial x}$. Since $\mathbf{U}(x)^T\mathbf{U}(x) = \mathbf{I}$, differentiating both sides with respect to $x$ leads to
$$ \left(\frac{\partial \mathbf{U}(x)}{\partial x}\right)^T \mathbf{U}(x) + \mathbf{U}(x)^T \frac{\partial \mathbf{U}(x)}{\partial x} = \mathbf{0} $$$$ \Rightarrow \mathbf{D}(x)^T + \mathbf{D}(x) = \mathbf{0} $$It is seen that $\mathbf{D}(x)$ is a skew-symmetric matrix whose diagonal elements are zero. Moreover, since $\mathbf{D}(x)$ has a zero diagonal and $\mathbf{\Lambda}(x)$ is diagonal, the first and third terms on the right-hand side, $\mathbf{D}(x)^T\mathbf{\Lambda}(x)$ and $\mathbf{\Lambda}(x)\mathbf{D}(x)$, vanish on the diagonal. Considering only the diagonal elements leads to
$$ \frac{\partial \lambda_k(x)}{\partial x} = \mathbf{u}_k(x)^T \frac{\partial \mathbf{A}(x)}{\partial x} \mathbf{u}_k(x), \quad k \in \{1, 2, 3\} $$which yields the first conclusion. Now we aim to prove the second one. Since $\frac{\partial \mathbf{\Lambda}(x)}{\partial x}$ is a diagonal matrix, for the off-diagonal element in the $k$-th row, $l$-th column ($k \ne l$), we have
$$ 0 = \mathbf{u}_k(x)^T \frac{\partial \mathbf{A}(x)}{\partial x} \mathbf{u}_l(x) + \lambda_k D_{kl} - D_{kl} \lambda_l $$where $D_{kl}$ is the element in the $k$-th row and $l$-th column of the skew-symmetric $\mathbf{D}(x)$ and satisfies $D_{kl} = -D_{lk}$. From the above, we can solve for $D_{kl}$:
$$ D_{kl} = \begin{cases} \frac{1}{\lambda_l - \lambda_k} \mathbf{u}_k(x)^T \frac{\partial \mathbf{A}(x)}{\partial x} \mathbf{u}_l(x), & k \ne l \\ 0, & k=l \end{cases} $$Since $\mathbf{D}(x) \triangleq \mathbf{U}(x)^T \frac{\partial \mathbf{U}(x)}{\partial x}$ and $\mathbf{U}(x)$ is orthogonal, we have $\frac{\partial \mathbf{U}(x)}{\partial x} = \mathbf{U}(x) \mathbf{D}(x)$. Taking the $l$-th column on both sides leads to
$$ \frac{\partial \mathbf{u}_l(x)}{\partial x} = \mathbf{U}(x) \mathbf{D}_{:,l} $$where $\mathbf{D}_{:,l} \in \mathbb{R}^3$ denotes the $l$-th column of $\mathbf{D}(x)$. Finally, expanding $\mathbf{U}(x)\mathbf{D}_{:,l} = \sum_{k=1}^{3} \mathbf{u}_k(x)\, D_{kl}$ and substituting the expression for $D_{kl}$, we obtain
$$ \frac{\partial \mathbf{u}_l(x)}{\partial x} = \sum_{k=1, k \ne l}^{3} \frac{1}{\lambda_l - \lambda_k}\, \mathbf{u}_k(x)\, \mathbf{u}_k(x)^T \frac{\partial \mathbf{A}(x)}{\partial x} \mathbf{u}_l(x), $$which yields the second conclusion.
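Putting the pieces together, here is a minimal sketch of the reduced BA problem restricted to plane features, using scipy with finite-difference gradients instead of the analytic derivatives above; the data layout, pose parameterization, and function names are assumptions made only for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

def reduced_cost(x, features):
    """c(T) = sum_i lambda_3(A_i) for plane features.

    x        : stacked pose parameters, 6 per pose ([rotation vector, translation]).
    features : features[i][j] is the (N_ij, 3) array of local-frame points of
               feature i observed at pose j (an empty array if not observed).
    """
    n_poses = len(x) // 6
    poses = [(Rotation.from_rotvec(x[6 * j:6 * j + 3]).as_matrix(), x[6 * j + 3:6 * j + 6])
             for j in range(n_poses)]
    cost = 0.0
    for per_pose_pts in features:
        pts_global = np.vstack([pts @ R.T + t
                                for (R, t), pts in zip(poses, per_pose_pts) if len(pts)])
        A_i = np.cov(pts_global.T, bias=True)        # scatter matrix A_i of the feature
        cost += np.linalg.eigvalsh(A_i)[0]           # smallest eigenvalue -> plane cost
    return cost

# Usage sketch (hypothetical data): x0 stacks initial [rotvec, t] guesses for all poses;
# the first pose would normally be held fixed to remove the gauge freedom.
# res = minimize(reduced_cost, x0, args=(features,), method="BFGS")
```

For efficiency, the analytic first- and second-order eigenvalue derivatives derived above can replace the finite-difference gradients, which is essentially the route taken in [1] and [2].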
References
[1] Z. Liu and F. Zhang, “BALM: Bundle Adjustment for Lidar Mapping,” IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 3184–3191, April 2021.
[2] Z. Liu, X. Liu, and F. Zhang, “Efficient and Consistent Bundle Adjustment on Lidar Point Clouds,” IEEE Transactions on Robotics, vol. 39, no. 6, pp. 4366–4386, Dec. 2023.