1 Introduction and Related Work

Fusion of multimodal medical data is one of the most important applications of image registration. In radiotherapy, a computed tomography (CT) scan for dose planning and an MRI scan for tumour and organs-at-risk delineation have to be aligned in a nonlinear fashion. In ultrasound-guided neurosurgery, an intra-operative 3D ultrasound (3DUS) has to be registered to a pre-treatment MRI to guide the surgeon during tumour resection. The challenge on correction of brainshift in intra-operative ultrasound (CuRIOUS) addresses the latter and provides a large training dataset of 22 clinical multimodal cases, each comprising 3T MRI T1w and T2 FLAIR scans as well as 3DUS acquired after craniotomy but before dura opening, with expert-labelled homologous anatomical landmarks, as described in detail in [1]. The main challenges discussed in previous work are, first, the definition of a suitable modality-invariant similarity metric and, second, a fast and robust algorithm for finding the transform that optimises this metric. Popular choices of metrics for MRI-US registration include gradient-based correlation metrics [2], self-similarity descriptors [3] and advanced mutual information variants [4]. The optimisation framework, in turn, usually has to be adapted to enable optimal performance on the given dataset. In this submission, we argue that excellent results can be achieved by employing off-the-shelf, publicly available algorithms that have been optimised for general medical image registration tasks, such as atlas-based segmentation propagation of abdominal CT, without any further domain-specific adaptation.
In the next section, we describe the employed method, which is based on self-similarity context (SSC) descriptors [3] and the discrete optimisation framework deeds [5]. With the exact same default parameters and a subsequent label fusion step, deeds performed best in two MICCAI segmentation challenges: Beyond the Cranial Vault (BCV) in 2015 [6] and Multimodal Whole-Heart Segmentation (MM-WHS) in 2017 [7].

2 Method

Quantised self-similarity context descriptors (SSC) [3] are used to define a similarity metric. Instead of relying on direct intensity comparisons across scans, SSC aims to extract modality-invariant neighbourhood representations separately within each scan based on local self-similarities (normalised patch-distances). It naturally deals well with multi-modal alignment problems, enables contrast invariance, which is particularly beneficial for MRI scans, and focuses the alignment on image edges of the ultrasound.
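The principle of describing each scan by its own local self-similarities can be illustrated with a minimal 2D sketch. This is not the SSC implementation of [3]: the function names, the four-neighbour layout with four neighbour pairs, the wrap-around boundary handling via `np.roll` and the omission of the quantisation step are all simplifying assumptions made for illustration.

```python
import numpy as np

def box_filter(img, radius):
    """Mean over a (2*radius+1)^2 patch, using wrap-around shifts (assumption)."""
    out = np.zeros_like(img, dtype=float)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            out += np.roll(img, (dy, dx), axis=(0, 1))
    return out / (2 * radius + 1) ** 2

def ssc_like_2d(img, radius=1, eps=1e-6):
    """Simplified self-similarity descriptor with 4 channels per pixel.

    Each channel holds a normalised patch distance between two neighbouring
    patches (N-E, E-S, S-W, W-N); the centre patch itself is never used.
    """
    img = np.asarray(img, dtype=float)
    offsets = [(-1, 0), (0, 1), (1, 0), (0, -1)]   # N, E, S, W
    pairs = [(0, 1), (1, 2), (2, 3), (3, 0)]       # neighbour-to-neighbour pairs
    shifted = [np.roll(img, o, axis=(0, 1)) for o in offsets]
    # Patch-wise mean squared difference for every neighbour pair
    dists = np.stack([box_filter((shifted[a] - shifted[b]) ** 2, radius)
                      for a, b in pairs])
    dists -= dists.min(axis=0)                     # best match gets distance 0
    noise = dists.mean(axis=0) + eps               # local normalisation
    return np.exp(-dists / noise)

# Toy example: a random image and its intensity-inverted counterpart
rng = np.random.default_rng(0)
image = rng.random((32, 32))
desc = ssc_like_2d(image)
desc_inverted = ssc_like_2d(1.0 - image)
print(desc.shape, np.allclose(desc, desc_inverted, atol=1e-6))
```

Because only squared intensity differences within one scan enter the descriptor, inverting the contrast leaves it unchanged in this sketch, which is the kind of behaviour that makes such descriptors attractive for multimodal alignment.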

Dense displacement sampling registration, deeds for short [5], is a discrete optimisation algorithm that aims to avoid local minima in the cost function. It is therefore particularly suitable for the challenging image appearance often seen in intra-operative ultrasound. The dense displacement sampling covers a large range of potential displacements (capture range), and the combinatorial optimisation based on dynamic programming ensures plausible first-order B-spline transformations without unrealistic deformations on a specified control-point grid. A diffusion regularisation is applied along the edges that connect neighbouring displacement nodes, and this graph is reduced to a minimum spanning tree, which contains no loops, so that the optimisation can be solved efficiently. A symmetry constraint on the nonlinear transform further increases the smoothness of the deformations.
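The role of dynamic programming on a loop-free graph can be sketched on the simplest possible tree, a 1D chain of control points with scalar displacement labels. This is an illustrative toy and not the deeds code: `chain_dp`, `reg_weight` and the example costs are hypothetical, and deeds itself operates on 3D displacements over a minimum spanning tree.

```python
def chain_dp(data_cost, labels, reg_weight):
    """Exact minimiser of sum_i data_cost[i][k_i]
    + reg_weight * sum_i (labels[k_i] - labels[k_{i-1}])**2 on a chain.

    data_cost[i][k] is the dissimilarity of node i under displacement labels[k].
    Returns the optimal displacement per node and the minimal total energy.
    """
    n_labels = len(labels)
    cost = [list(data_cost[0])]   # accumulated cost per label, node by node
    back = []                     # best predecessor label index per node
    for node_cost in data_cost[1:]:
        row, arg = [], []
        for k in range(n_labels):
            # Cost of reaching label k from every label j of the previous node
            cands = [cost[-1][j] + reg_weight * (labels[k] - labels[j]) ** 2
                     for j in range(n_labels)]
            j_best = min(range(n_labels), key=cands.__getitem__)
            row.append(node_cost[k] + cands[j_best])
            arg.append(j_best)
        cost.append(row)
        back.append(arg)
    # Backtrack from the best label of the last node
    k = min(range(n_labels), key=cost[-1].__getitem__)
    total = cost[-1][k]
    path = [k]
    for arg in reversed(back):
        k = arg[k]
        path.append(k)
    path.reverse()
    return [labels[k] for k in path], total

# Toy example: the third node prefers a very different displacement
labels = [-2, -1, 0, 1, 2]
data_cost = [
    [4, 1, 0, 1, 4],
    [9, 4, 1, 0, 1],
    [0, 1, 4, 9, 16],
    [4, 1, 0, 1, 4],
]
displacements, energy = chain_dp(data_cost, labels, reg_weight=0.5)
print(displacements, energy)
```

Since every label of every node is evaluated, the result is the global optimum for the chain, irrespective of how far the best displacement lies from the starting position. The same message-passing idea extends from a chain to any tree.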

3 Implementation

We used c3d, a general-purpose medical image processing command-line tool available at http://www.itksnap.org/pmwiki/pmwiki.php?n=Downloads.C3D, to resample the provided NIfTI files of 3DUS and T2 FLAIR into a common reference frame with isotropic voxels of \(0.5 \times 0.5 \times 0.5\) mm\(^3\). The commands used for the MRI FLAIR and 3DUS data, respectively, were:

figure a

We then used both the linear and deformable parts of the deeds framework, as downloaded from https://github.com/mattiaspaul/deedsBCV/, with default settings. These include a linear pre-registration that performs block-matching on four scale levels and estimates a rigid transformation using:

figure b

To apply the estimated linear and nonlinear transformations to the manual landmark positions, the algorithm requires segmentation masks. In our case, these represent the landmarks as 3D spheres. After generating two text files with a custom Python implementation (Footnote 1), the landmark segmentations can easily be generated using c3d as follows:

figure c

The transformation matrix is fed into the deformable part of deeds using the following (default) parameters: number of displacement steps \(l_{\max }=[8, 7, 6, 5, 4]\), quantisation/stride \(q=[5, 4, 3, 2, 1]\), and B-spline grid spacings of [8, 7, 6, 5, 4] voxels. A default weighting of \(\alpha =1.6\) between the SSC-similarity and the diffusion regularisation in deeds was used. An example command to run a registration is as follows:

figure d
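To give a feeling for the size of the search space implied by these defaults, the following sketch computes the capture range and the number of displacement candidates per level. It assumes that the candidates along each axis are \(\{0, \pm q, \ldots, \pm l_{\max }\cdot q\}\) voxels and that the levels operate on the 0.5 mm resampled grid; the function `search_space` and its output keys are hypothetical names used only here.

```python
# Default deeds parameters as quoted in the text
l_max = [8, 7, 6, 5, 4]   # number of displacement steps per level
q = [5, 4, 3, 2, 1]       # quantisation/stride per level (voxels)
voxel_mm = 0.5            # isotropic voxel size after resampling (assumption)

def search_space(l_max, q, voxel_mm):
    """Capture range and number of 3D displacement labels for each level."""
    levels = []
    for steps, stride in zip(l_max, q):
        radius_vox = steps * stride              # largest displacement per axis
        levels.append({
            "radius_vox": radius_vox,
            "radius_mm": radius_vox * voxel_mm,
            "n_labels": (2 * steps + 1) ** 3,    # candidates in a 3D cube
        })
    return levels

for level, info in enumerate(search_space(l_max, q, voxel_mm), start=1):
    print(level, info)
```

Under these assumptions the coarsest level covers displacements of up to 40 voxels (20 mm) per axis with 4913 candidates per control point, while the finest level refines within 4 voxels (2 mm).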

The computation times are approx. 5 s for linear alignment and 20 s for deformable registration on a mobile dual-core CPU, using the efficient OpenMP implementation.

After applying the combined transformations to the 3D landmark spheres, their spatial (voxel) coordinates are extracted by calculating the centre of mass of each sphere in Python (Footnote 2). The following command stores the coordinates in a text file and can directly calculate the mTRE when provided with the target landmarks:

figure e
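The two steps described above, recovering a landmark position as the centre of mass of its sphere and averaging the landmark distances into an mTRE, can be sketched in plain Python. This is not the script referred to in the footnote: the function names, the sphere radius and the example coordinates are hypothetical, and a voxel size of 0.5 mm is assumed for the conversion to millimetres.

```python
import math

def sphere_voxels(centre, radius):
    """Voxel coordinates of a discrete sphere around an integer centre."""
    cx, cy, cz = centre
    r = int(math.ceil(radius))
    return [(x, y, z)
            for x in range(cx - r, cx + r + 1)
            for y in range(cy - r, cy + r + 1)
            for z in range(cz - r, cz + r + 1)
            if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= radius ** 2]

def centre_of_mass(voxels):
    """Mean voxel coordinate of one landmark label."""
    n = len(voxels)
    return tuple(sum(v[axis] for v in voxels) / n for axis in range(3))

def mtre(points_a, points_b, voxel_size_mm=0.5):
    """Mean target registration error in mm between paired landmarks."""
    dists = [voxel_size_mm * math.dist(a, b)
             for a, b in zip(points_a, points_b)]
    return sum(dists) / len(dists)

# Toy example with two landmarks given in voxel coordinates
warped = [centre_of_mass(sphere_voxels((10, 12, 14), 3)),
          centre_of_mass(sphere_voxels((40, 35, 20), 3))]
target = [(13, 16, 14), (40, 35, 22)]
print(warped)
print(mtre(warped, target))
```

Representing landmarks as small spheres rather than single voxels keeps them from vanishing when the label image is resampled by the transformation, and the centre of mass yields sub-voxel coordinates.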

In the next sections the results are presented both visually and numerically and their implications are discussed.

Table 1. Numerical results of accuracy of multimodal registration evaluated with manual landmarks in mm using the deeds algorithm in three different settings. First, only the linear part is considered, which yields an mTRE of 1.88 ± 0.53 mm. Second, a nonlinear transform is estimated in addition, yielding a slightly higher error of 1.92 ± 0.60 mm. But when finally fitting another rigid transform to the nonlinear result, the best mTRE of 1.67 ± 0.54 mm is reached.
Fig. 1. Visual example of US-MRI registration for case #3 of the training dataset. The top row shows a colour overlay (US in jet) on top of the original MRI. The bottom row demonstrates a clearly improved alignment when applying the automatically estimated linear transform to the MRI scan.

Fig. 2. Cumulative distribution of landmark errors sorted in ascending order. All variants of the automated multimodal discrete registration decrease the landmark error and improve image alignment.

4 Results and Discussion

All experiments were run with the same settings on the 22 training scans of the challenge and evaluated using all manual landmarks provided by the organisers. The algorithms are fully automatic and require no manual initialisation. We confirmed that the original error (before registration) was approx. 5.4 mm, as mentioned in [1]. The numerical results are presented in Table 1 and as distribution plots in Fig. 2. Using a linear transform only, the mTRE is clearly reduced from the initial 5.42 mm to 1.88 mm. We were also interested in exploring whether the nonlinear part of the registration may provide a better alignment, despite the fact that the ultrasound images were acquired before opening the dura. This is not directly the case, as the results slightly deteriorate to 1.92 mm on average. However, when a rigid transform is fitted again to the nonlinearly displaced landmark correspondences, an mTRE of 1.67 mm is obtained. The fitting was carried out using the technique described in [8]. This indicates that the more flexible deformable registration can improve the match of certain landmarks, but is also less robust in areas of limited contrast. The subsequent restriction to a rigid transform, which reduces the influence of outliers, therefore improves the overall outcome.
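Fitting a rigid transform to landmark correspondences can be sketched with a standard least-squares solution based on the singular value decomposition (often called the Kabsch or Procrustes solution). Whether this coincides exactly with the technique of [8] is an assumption; the function names and the synthetic test points are hypothetical.

```python
import numpy as np

def fit_rigid(src, dst):
    """Least-squares rotation R and translation t with dst ~ R @ src + t.

    src, dst: (N, 3) arrays of corresponding points, e.g. original landmarks
    and their nonlinearly displaced positions.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centred point sets
    h = (src - src_mean).T @ (dst - dst_mean)
    u, _, vt = np.linalg.svd(h)
    # Guard against a reflection so that det(R) = +1
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    trans = dst_mean - rot @ src_mean
    return rot, trans

def apply_rigid(points, rot, trans):
    """Apply the fitted transform to an (N, 3) array of points."""
    return np.asarray(points, dtype=float) @ rot.T + trans

# Synthetic example: rotate and shift random points, then add small noise
rng = np.random.default_rng(1)
angle = 0.2
true_rot = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                     [np.sin(angle), np.cos(angle), 0.0],
                     [0.0, 0.0, 1.0]])
landmarks = rng.random((15, 3)) * 100.0
displaced = landmarks @ true_rot.T + np.array([2.0, -1.0, 0.5])
displaced += rng.normal(scale=0.3, size=displaced.shape)
rot, trans = fit_rigid(landmarks, displaced)
residual = np.linalg.norm(apply_rigid(landmarks, rot, trans) - displaced, axis=1)
print(residual.mean())
```

Because all correspondences contribute to only six degrees of freedom, individual poorly matched landmarks have limited influence on the fitted transform, which is consistent with the behaviour described above.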

We further noted that aligning MRI to ultrasound is slightly more accurate than the reverse order. Since the nonlinear part of deeds is already symmetric, this discrepancy could be alleviated by using the approach of [9]. Furthermore, the regularisation parameter could be further optimised from \(\alpha =1.6\) to \(\alpha =0.4\), yielding a modest improvement to 1.62 mm mTRE. A visual example of the registration and multi-modal fusion outcome is shown in Fig. 1.

5 Conclusion and Outlook

In summary, we have demonstrated that the general-purpose, publicly available discrete registration toolbox deeds achieves an excellent accuracy of 1.62 mm mTRE for a challenging ultrasound-to-MRI brain registration task. The method relies on no training data, but the widely applicable self-similarity descriptors could potentially be replaced by a learning-based approach that relies on known correspondences in training, cf. [10] or [11]. The impact of such a replacement will, however, probably be more pronounced when considering scans with more brain shift.

A further interesting research direction would be to learn only the spatial layout used for the self-similarity distance computations by means of deformable convolutions. These have been successfully applied to registration and segmentation tasks with few labelled datasets [12]. Moreover, the computation time of the algorithm could be drastically reduced (to subsecond runtimes) by performing the similarity and regularisation calculations on a GPU. We have already demonstrated this for parts of the algorithm in [13] and intend to complete it for the whole algorithm in the near future.