The traditional image stitching technique [5], based on the detection and matching of landmarks (e.g. with a combination of SIFT and RANSAC), is prone to failure on in vivo sequences encountered in clinical conditions. The lack of contrast in the acquired images and the cluttered, varying aspect of the observed scene are the main challenges responsible for these difficulties. In this section, we present a registration method that addresses the pairwise registration of in vivo frames. Given the aforementioned challenges, we propose not to rely on landmarks. Instead, we perform a dense pixelwise alignment of the gradient orientations and propose a variant of the maximisation of the correlation of image gradients introduced by Tzimiropoulos et al. [22], with two main differences detailed below. Aligning gradient orientations has several advantages: it is, for example, invariant to local changes of contrast and is suitable for accurately registering linear structures such as vessels, which matches the main clinical objective of mosaicking for TTTS surgery, i.e. the creation of an overview of the topology of the placental vascular network. Moreover, by focusing solely on gradient orientations and not on gradient norms, each pixel is given the same weight, which naturally improves the robustness of the registration to visual artifacts and partial occlusions.
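For concreteness, the following minimal sketch (in Python, with variable names of our choosing, not the paper's) illustrates the contrast-invariance property: unit-normalised gradient fields are unchanged by a global affine rescaling of the intensities.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((64, 64))
rescaled = 0.3 * image + 0.5           # same content, different contrast

def unit_gradients(img):
    gy, gx = np.gradient(img)          # pixelwise image gradients
    norm = np.hypot(gx, gy) + 1e-12    # guard against division by zero
    return gx / norm, gy / norm        # unit-norm gradient field

# Normalised gradients ignore contrast: both images yield the same field.
assert np.allclose(unit_gradients(image), unit_gradients(rescaled))
```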
The registration task consists in estimating the true warping \(w_{j,i}\) such that \(I_i(\mathbf x )\) and \(I_j(w_{j,i}(\mathbf x ))\) correspond to the same location for every \(\mathbf x \in \varOmega _i\). With this formulation, \(I_i\) is called the fixed image and \(I_j\) the moving image. We parametrise the homographic warpings with a vector \(\mathbf p \in \mathbb {R}^8\) corresponding to the 8 coefficients of the canonical homographic representation, i.e. such that
$$\begin{aligned} w((x,y),\mathbf p ) = \left( \frac{p_1 x + p_2 y + p_3}{p_7 x + p_8 y + 1},\frac{p_4 x + p_5 y + p_6}{p_7 x + p_8 y + 1}\right) . \end{aligned}$$
(4)
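For illustration, a direct transcription of (4) as a small sketch (function and variable names are ours):

```python
# Direct transcription of Eq. (4): p holds the 8 coefficients of the
# canonical homographic representation.
def warp(x, y, p):
    p1, p2, p3, p4, p5, p6, p7, p8 = p
    denom = p7 * x + p8 * y + 1.0
    return ((p1 * x + p2 * y + p3) / denom,
            (p4 * x + p5 * y + p6) / denom)

# The identity warp corresponds to p = (1, 0, 0, 0, 1, 0, 0, 0).
assert warp(3.0, 4.0, (1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0)) == (3.0, 4.0)
```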
As discussed above, we propose to look for the registration warping \(\hat{w}_{j,i}\) which best aligns the gradients of \(I_i\) and \(I_j\). Since we explicitly do not want to take into account the strength of the gradients, we first normalise the gradients of the fixed and moving images, ensuring a unit gradient norm at every pixel. For a point \(\mathbf x \in \varOmega _i\) of the domain of the fixed image, we denote by \(\varDelta \theta (\mathbf x , \mathbf p )\) the angle between the gradient of the fixed image and the gradient of the warped moving image at \(\mathbf x \). We define the final warping \(\hat{w}_{j,i}\), i.e. the output of our registration method, as \(\hat{w}_{j,i} = w(.,\hat{\mathbf{p }})\) where
$$\begin{aligned} \hat{\mathbf{p }} = {\mathop {{{\mathrm{argmin}}}}_\mathbf{p }} \sum _\mathbf{x \in \varOmega _i} \sin ^2\varDelta \theta (\mathbf x , \mathbf p ). \end{aligned}$$
(5)
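As a sketch, the cost in (5) reduces to a few lines once the gradient angles are available; theta_f and theta_m are assumed to hold the gradient angles of the fixed image and of the warped moving image, sampled over the fixed-image domain (our names, not the paper's notation):

```python
import numpy as np

# Sketch of the cost in Eq. (5): sum of squared sines of the pixelwise
# angle differences between the two gradient fields.
def registration_cost(theta_f, theta_m):
    delta = theta_m - theta_f            # pixelwise angle differences
    return np.sum(np.sin(delta) ** 2)    # minimal for delta = 0 or pi
```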
Since \(\sin ^2t = \frac{1}{2} (1 - \cos 2t)\), the proposed approach can be seen as a variant of the maximisation of the correlation of image gradients [22] defined by
$$\begin{aligned} \hat{\mathbf{p }} = {\mathop {{{\mathrm{argmax}}}}_\mathbf{p }} \sum _\mathbf{x \in \varOmega _i} \cos \varDelta \theta (\mathbf x , \mathbf p ). \end{aligned}$$
(6)
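Indeed, since the number of pixels \(\vert \varOmega _i \vert \) does not depend on \(\mathbf p \), the identity above gives
$$\begin{aligned} \sum _\mathbf{x \in \varOmega _i} \sin ^2 \varDelta \theta (\mathbf x , \mathbf p ) = \frac{\vert \varOmega _i \vert }{2} - \frac{1}{2} \sum _\mathbf{x \in \varOmega _i} \cos 2 \varDelta \theta (\mathbf x , \mathbf p ), \end{aligned}$$
so that minimising (5) amounts to maximising the correlation criterion (6) evaluated at the doubled angles \(2 \varDelta \theta \).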
We can identify two main differences between the two formulations. First, our pixelwise costs based on the sine function are minimal for \(\varDelta \theta = 0 \) or \(\varDelta \theta = \pi \), whereas the terms in (6) are minimal for \(\varDelta \theta = 0\) only. Thus, only the orientation (modulo \(\pi \)) of the gradients is taken into account in our cost function, and not their direction. This turns out to be a useful property in practice: as we try to match vessels with an iterative method (see below), optimisation steps must be able to cross areas where gradients are oriented in opposite directions before reaching the optimal vessel alignment. Second, having written our minimisation problem (5) as a sum of squares, we are able to use known results on nonlinear least squares, and more precisely the forward additive version of the Lucas–Kanade algorithm [2]. This formulation not only leads to simpler mathematical derivations, but also offers the possibility to use off-the-shelf optimised solvers for nonlinear least squares problems, such as the Ceres solver [17], which includes classical optimisation techniques (e.g. the Gauss–Newton, Levenberg–Marquardt, and Powell's dog-leg algorithms).
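To illustrate this point, the residual vector \(\left( \sin \varDelta \theta (\mathbf x _i, \mathbf p ) \right) _i\) can be handed directly to a generic nonlinear least squares solver. The sketch below uses SciPy's least_squares as a stand-in for Ceres; angle_difference is a hypothetical helper returning the N pixelwise angle differences for a given parameter vector.

```python
import numpy as np
from scipy.optimize import least_squares

# Sketch only: `angle_difference` is a hypothetical helper returning
# the N pixelwise angle differences for the parameter vector p.
def residuals(p, fixed, moving):
    return np.sin(angle_difference(fixed, moving, p))

p0 = np.array([1, 0, 0, 0, 1, 0, 0, 0], dtype=float)  # identity warp
# result = least_squares(residuals, p0, args=(I_i, I_j), method="lm")
```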
Solving (5) with the Gauss–Newton algorithm

To solve the minimisation problem (5) numerically, we use the fact that it is a nonlinear least squares problem and apply the Gauss–Newton algorithm, which, in the context of image registration, can also be seen as the forward additive version of the Lucas–Kanade algorithm [2]. To keep the following derivations as general as possible, we denote \(N = \vert \varOmega _i \vert \) and arbitrarily order the elements of \(\varOmega _i\) so that \(\varOmega _i = \lbrace \mathbf x _1, \ldots , \mathbf x _N \rbrace \), and we denote by \(M\) the number of parameters encoding the transformation, i.e. the size of the parameter vector \(\mathbf p \). Applying the Gauss–Newton algorithm, we iteratively approximate the desired minimum with a series of parameters \(\mathbf p ^{(1)}, \mathbf p ^{(2)}, \ldots \) such that
$$\begin{aligned} \mathbf p ^{(k+1)} = \mathbf p ^{(k)} - (\mathbf J ^T \mathbf J )^{-1} \mathbf J ^T \mathbf s (\mathbf p ^{(k)}), \end{aligned}$$
(7)
where \(\mathbf s (\mathbf p ^{(k)})\) is the \(N \times 1\) column vector \(\left( \sin \varDelta \theta (\mathbf x _i, \mathbf p ^{(k)}) \right) _{1 \le i \le N}\) and \(\mathbf J \) is the \(N \times M\) Jacobian matrix whose coefficients \(J_{ij}\) are defined as
$$\begin{aligned} J_{ij} = \frac{\partial s_i(\mathbf p ^{(k)})}{\partial p_j} = \frac{\partial \varDelta \theta (\mathbf x _i, \mathbf p ^{(k)})}{\partial p_j} \cos \varDelta \theta (\mathbf x _i, \mathbf p ^{(k)}). \end{aligned}$$
(8)
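In code, the update (7) is short once \(\mathbf s \) and \(\mathbf J \) are available; a minimal sketch follows, solving the normal equations rather than forming the inverse explicitly, which is cheaper and numerically safer:

```python
import numpy as np

# One Gauss-Newton step, Eq. (7): s is the length-N residual vector
# (sines of the angle differences) and J the N x M Jacobian of Eq. (8),
# both evaluated at the current parameters p.
def gauss_newton_step(p, s, J):
    dp = np.linalg.solve(J.T @ J, J.T @ s)   # normal equations
    return p - dp
```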
We denote by \(\mathbf g _m(\mathbf x _i, \mathbf p ^{(k)}) = \left( g_{m,x}(\mathbf x _i, \mathbf p ^{(k)}), g_{m,y}(\mathbf x _i, \mathbf p ^{(k)})\right) \) the gradient of the warped moving image at the location \(\mathbf x _i\), and by \(\theta _m(\mathbf x _i, \mathbf p ^{(k)})\) (respectively, \(\theta _f(\mathbf x _i)\)) the angle of the gradient of the warped moving image (respectively, the fixed image) at the location \(\mathbf x _i\). By definition, we have \(\varDelta \theta (\mathbf x _i, \mathbf p ^{(k)}) = \theta _m(\mathbf x _i, \mathbf p ^{(k)}) - \theta _f(\mathbf x _i)\) and
$$\begin{aligned} \theta _m(\mathbf x _i, \mathbf p ^{(k)}) = \arctan \frac{g_{m,y}(\mathbf x _i, \mathbf p ^{(k)})}{g_{m,x}(\mathbf x _i, \mathbf p ^{(k)})}, \end{aligned}$$
(9)
so that the coefficients of the Jacobian given in (8) can be written more explicitly as
$$\begin{aligned} J_{ij} = \frac{1}{\Vert \mathbf g _m \Vert ^2} \left( g_{m,x} \frac{\partial g_{m,y}}{\partial p_j} - g_{m,y} \frac{\partial g_{m,x}}{\partial p_j} \right) \cos \varDelta \theta , \end{aligned}$$
(10)
where the dependencies in \(\mathbf x _i\) and \(\mathbf p ^{(k)}\) were omitted for readability.
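The passage from (8) and (9) to (10) is the standard derivative of the arctangent: since \(\theta _f\) does not depend on \(\mathbf p \),
$$\begin{aligned} \frac{\partial \varDelta \theta }{\partial p_j} = \frac{\partial \theta _m}{\partial p_j} = \frac{1}{1 + \left( g_{m,y} / g_{m,x} \right) ^2} \, \frac{g_{m,x} \frac{\partial g_{m,y}}{\partial p_j} - g_{m,y} \frac{\partial g_{m,x}}{\partial p_j}}{g_{m,x}^2} = \frac{1}{\Vert \mathbf g _m \Vert ^2} \left( g_{m,x} \frac{\partial g_{m,y}}{\partial p_j} - g_{m,y} \frac{\partial g_{m,x}}{\partial p_j} \right) , \end{aligned}$$
which, multiplied by the factor \(\cos \varDelta \theta \) of (8), yields (10).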
We finally mention that, although the gradients \(\mathbf g _m(\mathbf x _i, \mathbf p ^{(k)})\) could be computed at each iteration by warping the moving image and differentiating the resulting warped image numerically, it is more efficient to compute the gradient of the (unwarped) moving image once and for all, and to obtain \(\mathbf g _m\) by warping this gradient and multiplying it by the Jacobian of the warping, by application of the chain rule [22]. The partial derivatives of \(g_{m,x}\) and \(g_{m,y}\) are obtained similarly.
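A minimal sketch of this shortcut, reusing the warp function from the earlier sketch; sample (a bilinear interpolator) and warp_spatial_jacobian (the \(2 \times 2\) Jacobian of \(w(\cdot , \mathbf p )\) with respect to the pixel coordinates) are hypothetical helpers, not part of the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
moving = rng.random((64, 64))        # stand-in for the moving image I_j

# Precompute the gradient of the unwarped moving image once and for all.
gy, gx = np.gradient(moving)

def warped_gradient(x, y, p, sample, warp_spatial_jacobian):
    # Chain rule: grad of I_j(w(x)) = J_w(x)^T times grad I_j at w(x).
    u, v = warp(x, y, p)                                # warp of Eq. (4)
    g = np.array([sample(gx, u, v), sample(gy, u, v)])  # sampled gradient
    return warp_spatial_jacobian(x, y, p).T @ g
```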