1 Introduction

CT image reconstruction algorithms rely on the assumption that the acquired projection data correspond to line integrals of the spatial distribution of the attenuation coefficient. Scattered X-rays that contribute to the measured signal violate this assumption and thus introduce CT artifacts [6, 9, 29]. These artifacts degrade image quality and impair dimensional measurements [12, 13]. Especially in the case of high scatter-to-primary ratios, appropriate scatter correction is crucial to avoid a loss of accuracy in the metrological assessment.

Several approaches have been proposed to address this issue. In general, they can be divided into two classes: scatter suppression approaches and scatter estimation approaches, the latter being the focus of this manuscript. While scatter suppression approaches try to reduce the amount of scattered X-rays reaching the detector using anti-scatter grids or collimators [26], scatter estimation approaches aim at deriving an estimate of the scatter distribution that is used to correct the acquired projection data [27]. The scatter estimate can either be derived using dedicated hardware such as beam blockers or primary modulation grids [3, 7, 8, 20, 24, 28, 38, 39] or using software-based approaches that rely on physical or empirical models to predict X-ray scattering [1, 2, 11, 17, 18, 21, 22, 30, 32, 33, 34, 36, 37]. Among these methods, the gold standard is to use a Monte Carlo (MC) photon transport code [27]. As MC is able to model all the physics of the CT acquisition process, the resulting scatter estimates are very accurate. However, the drawback of MC methods is their high computational complexity. Even highly optimized code does not run in real-time on conventional hardware.

Thus, if computation time is an issue, so-called kernel-based models are often used in practice. These models approximate the scatter distribution by an integral transform of a scatter source term multiplied with a scatter propagation kernel [27]. The scatter source term is usually modeled as a function of the primary intensity and reflects the probability of X-ray scattering along each ray from the X-ray source to a detector pixel. The scatter propagation kernel accounts for the spatial distribution of the scattered X-rays and depends on several parameters such as the acquisition geometry, the spectral distribution of the X-rays and the object itself. Different approaches to set the scatter source term and the scatter propagation kernel have been proposed [2, 11, 14, 15, 18, 21, 23, 31, 32, 33, 34]. Basically, they can be divided into model-based and MC-based approaches. The former use a simplified theoretical model with a set of open parameters to predict X-ray scattering (for instance, only forward scattering is assumed [21]). Subsequently, the open parameters are calibrated to fit MC simulations or reference measurements. MC-based approaches, in contrast, rely on needle-beam MC simulations of slabs or ellipsoids with varying dimensions, which are calculated prior to the measurement. To estimate scatter within a measured projection, one of the precalculated needle-beam kernels is assigned to every detector pixel according to an appropriate similarity metric. Finally, all kernels are summed up, including correction terms that account for differences between the slabs or ellipsoids and the actual object shape.

However, while being fast, conventional kernel-based models have two major drawbacks: they are far less accurate than MC simulations, and it is challenging to find parameter sets or correction terms, respectively, that apply to different components in the same way. To overcome these drawbacks, we propose the deep scatter estimation (DSE). It uses a deep convolutional neural network which is trained to reproduce the output of MC simulations using only the acquired projection data as input. Thus, the accuracy of the scatter prediction should be comparable to that of MC simulations, while the prediction can be generated in real-time once the network is trained. It has to be noted that DSE is not restricted to reproducing MC simulations but can be trained with any other scatter estimate. In this study, we demonstrate the potential of DSE using simulations as well as measurements. The corresponding scatter estimates are compared against MC simulations as well as the scatter estimates derived by a kernel-based approach and a hybrid scatter estimation approach. The focus of the manuscript is on dimensional CT applications. However, the proposed approach is not restricted to dimensional CT but can be applied to any X-ray imaging modality, such as medical CT or fluoroscopy.

Fig. 1 Architecture of the proposed deep convolutional neural network

2 Material and Methods

2.1 Kernel-Based Scatter Estimation

Kernel-based methods approximate the scatter distribution \(I_\text {s, est}\) by an integral transform of a scatter source term \(T(\psi )\) multiplied with a scatter propagation kernel G:

$$\begin{aligned} I_\text {s, est}({{\varvec{u}}}) = \int T(\psi )({{\varvec{u}}}') \, G({{\varvec{u}}}, {{\varvec{u}}}', {{\varvec{c}}}) \, d{{\varvec{u}}}', \end{aligned}$$
(1)

where \(\psi \) is the normalized primary intensity. The operator T is usually derived from a physical model such that \(T(\psi )\) represents the probability of X-ray scattering along a ray from the X-ray source to the detector element located at \({{\varvec{u}}}\). The scatter propagation kernel G with its open parameters \({{\varvec{c}}}= (c_0, c_1, \dots )\) accounts for the spreading of scattered X-rays. For a ray heading from the X-ray source to the detector element at \({{\varvec{u}}}'\), \(G({{\varvec{u}}}, {{\varvec{u}}}', {{\varvec{c}}})\) corresponds to the fraction of X-rays reaching the detector element at \({{\varvec{u}}}\). Several approaches to set T and G have been proposed.

In this manuscript, we use a slightly modified version of the kernel-based model of Ohnesorge et al. as a reference [21]. Here, the scatter source term is given by the forward scatter intensity, which corresponds to the probability that an X-ray hitting the detector was scattered in forward direction:

$$\begin{aligned} T(\psi ) = -K \cdot \psi \cdot \ln (\psi ), \end{aligned}$$
(2)

where K refers to the differential cross section of forward scattering. The scatter propagation kernel is modeled as a sum of exponential functions:

$$\begin{aligned} G({{\varvec{u}}}, {{\varvec{u}}}', {{\varvec{c}}}) = \sum _{\pm } e^{-c_1(({{\varvec{u}}}-{{\varvec{u}}}') \cdot \hat{{{\varvec{e}}}}_1 \pm c_2)^2} \cdot \sum _{\pm } e^{-c_3(({{\varvec{u}}}-{{\varvec{u}}}') \cdot \hat{{{\varvec{e}}}}_2 \pm c_4)^2}. \end{aligned}$$
(3)
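
As an illustration, Eqs. (1) to (3) can be sketched numerically as follows. This is a minimal sketch, not the implementation used in this study. It assumes that the unit vectors in Eq. (3) point along the two detector axes, so that G depends only on the offset between the two detector positions and Eq. (1) becomes a convolution. The function names, the `pixel_size` parameter and the use of a circular FFT convolution without zero-padding are assumptions made for brevity.

```python
import numpy as np

def forward_scatter(psi, K=1.0):
    """Scatter source term T(psi) = -K * psi * ln(psi), Eq. (2)."""
    psi = np.clip(psi, 1e-12, None)  # avoid log(0)
    return -K * psi * np.log(psi)

def propagation_kernel(shape, c, pixel_size=1.0):
    """Kernel G of Eq. (3), sampled on the detector offsets u - u'."""
    c1, c2, c3, c4 = c
    ny, nx = shape
    # Offsets along the two detector axes, with zero offset at index n // 2.
    d1 = (np.arange(nx) - nx // 2) * pixel_size
    d2 = (np.arange(ny) - ny // 2) * pixel_size
    g1 = np.exp(-c1 * (d1 + c2) ** 2) + np.exp(-c1 * (d1 - c2) ** 2)
    g2 = np.exp(-c3 * (d2 + c4) ** 2) + np.exp(-c3 * (d2 - c4) ** 2)
    return np.outer(g2, g1)  # separable product, shape (ny, nx)

def scatter_estimate(psi, K, c, pixel_size=1.0):
    """Discretized Eq. (1): circular convolution of T(psi) with G via FFT."""
    T = forward_scatter(psi, K)
    G = propagation_kernel(psi.shape, c, pixel_size)
    G = np.fft.ifftshift(G)  # move the zero offset to index (0, 0)
    area = pixel_size ** 2   # area element du' of the integral
    conv = np.fft.ifft2(np.fft.fft2(T) * np.fft.fft2(G))
    return np.real(conv) * area
```

For a real projection, the arrays would typically be zero-padded before the FFT so that scatter does not wrap around the detector edges.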

The constant K as well as the open parameters \({{\varvec{c}}}\) of the scatter propagation kernel are determined such that the scatter estimate best fits calibration measurements or the output of an MC simulation. Here, this is done by minimizing the following cost function using a simplex algorithm [19]:

$$\begin{aligned} \{K, {{\varvec{c}}}\} = \arg \min \sum _n \sum _{{\varvec{u}}}\Vert I_\text {s, est}(n, {{\varvec{u}}}, K, {{\varvec{c}}}) - I_\text {s}(n, {{\varvec{u}}})\Vert ^2_2,\nonumber \\ \end{aligned}$$
(4)

where n is the sample number, \(I_\text {s, est}\) is the scatter estimate according to Eq. (1) and \(I_\text {s}\) is a reference MC simulation.
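
The calibration of Eq. (4) can be illustrated with a small, self-contained toy problem. The sketch below is an assumption-laden stand-in for the actual calibration: it uses a 1D analogue of the kernel model with only three parameters (K, c1, c2), a synthetic reference generated from known parameters instead of an MC simulation, and the Nelder-Mead simplex implementation of `scipy.optimize.minimize`. All numerical values are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def scatter_model(psi, params):
    """1D analogue of Eqs. (1)-(3): T(psi) convolved with a two-lobe Gaussian."""
    K, c1, c2 = params
    T = -K * psi * np.log(psi)
    n = psi.shape[-1]
    d = np.arange(n) - n // 2
    G = np.exp(-c1 * (d + c2) ** 2) + np.exp(-c1 * (d - c2) ** 2)
    G = np.fft.ifftshift(G)  # zero offset at index 0
    conv = np.fft.ifft(np.fft.fft(T, axis=-1) * np.fft.fft(G), axis=-1)
    return np.real(conv)

def cost(params, psi_all, scatter_ref):
    """Eq. (4): squared error summed over all samples n and detector pixels u."""
    return np.sum((scatter_model(psi_all, params) - scatter_ref) ** 2)

# Synthetic stand-in for the reference scatter (NOT an MC simulation).
rng = np.random.default_rng(0)
psi_all = rng.uniform(0.05, 0.95, size=(8, 64))  # 8 samples, 64 pixels each
true_params = np.array([0.8, 0.02, 4.0])         # hypothetical values
scatter_ref = scatter_model(psi_all, true_params)

x0 = np.array([1.5, 0.03, 3.0])                  # initial guess
result = minimize(cost, x0, args=(psi_all, scatter_ref),
                  method="Nelder-Mead",
                  options={"xatol": 1e-8, "fatol": 1e-12, "maxiter": 5000})
K_fit, c1_fit, c2_fit = result.x
```

The same cost function restricted to a single projection n would give the per-projection fit used by approaches that recalibrate the parameters for every view.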

2.2 Hybrid Scatter Estimation

Kernel-based approaches usually calibrate the open parameters in advance. Therefore, the parameters might not perfectly fit the actual measurement. To increase the accuracy, Baer et al. proposed to recalibrate the parameters for every measured projection using a coarse MC simulation [1]. Thus, the kernel may be regarded as a physics-based regularizer of the MC estimate. This so-called hybrid scatter estimation was implemented here as a second reference approach. For every projection view n, a distinct parameter set was calculated by performing the following minimization using a simplex algorithm:

$$\begin{aligned} {\{K, {{\varvec{c}}}\}}_{n} = \arg \min \sum _{{\varvec{u}}}\Vert I_\text {s, est}({n}, {{\varvec{u}}}, K, {{\varvec{c}}}) - I_\text {s}({n}, {{\varvec{u}}})\Vert ^2_2. \end{aligned}$$
(5)

2.3 Deep Scatter Estimation

Conventional kernel-based models rely on simplifying assumptions that do not perfectly fit arbitrary cases. Thus, their accuracy is limited and far below the accuracy of MC simulations. Furthermore, it is challenging to adapt a given model such that it generalizes to different cases. Neural networks have the potential to overcome these drawbacks. Therefore, we propose the deep scatter estimation (DSE), a deep convolutional neural network for real-time scatter estimation. The architecture of our DSE network is shown in Fig. 1. Basically, the network is a modification of the U-net which was proposed by Ronneberger et al. for biomedical image segmentation [25]. Similar to the original model, the network consists of a downward path that extracts a hierarchy of features from the input image and an upward path that restores the resolution of the image while transforming the features.

In order to estimate scatter, we use the forward scatter intensity as given in Eq. (2) with \(K=1\) as input to the network. Subsequently, the weights of the convolutional layers are trained to reproduce the output of a MC simulation. Thus, the network internally performs similar operations as kernel-based methods. However, in contrast to these methods, the DSE network is much more flexible since it is able to use non-linear mappings and varying scatter kernels depending on local features of the input image. Thus, DSE should model X-ray scattering more precisely and should better generalize to varying inputs.

For all results presented in this manuscript, the DSE network was trained on a GeForce GTX 1080 for 80 epochs using an Adam optimizer, a batch size of 16, and the mean squared error between the output of the network and the MC scatter \(I_s\) as loss function. To reduce the computational cost of the training, we did not use the full-size projection data as input but downsampled them to a size of \(256 \times 256\). Since X-ray scatter is known to be a low-frequency signal, this downsampling has only a minor influence on the accuracy of the scatter estimation. Once the network is trained, it can be applied in real-time (\(\approx 20 \, \hbox {ms / projection}\)) to the downsampled testing data. Finally, the scatter estimates are upsampled to the full size again.
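
The data flow around the network (input transform, downsampling, upsampling) can be sketched as follows. This is only a sketch under several assumptions that are not specified in the text: block averaging is used for downsampling, nearest-neighbour repetition for upsampling, the input transform of Eq. (2) is applied before downsampling, and the detector dimensions are integer multiples of 256. The `network` argument is a placeholder for any trained model that maps a \(256 \times 256\) array to a \(256 \times 256\) array.

```python
import numpy as np

def dse_input(psi):
    """Network input: forward scatter intensity of Eq. (2) with K = 1."""
    psi = np.clip(psi, 1e-12, None)  # avoid log(0)
    return -psi * np.log(psi)

def downsample(img, size=256):
    """Block-average a 2D image to (size, size).

    Assumes both image dimensions are integer multiples of `size`.
    """
    ny, nx = img.shape
    fy, fx = ny // size, nx // size
    return img.reshape(size, fy, size, fx).mean(axis=(1, 3))

def upsample(img, shape):
    """Nearest-neighbour upsampling back to the full detector shape."""
    fy = shape[0] // img.shape[0]
    fx = shape[1] // img.shape[1]
    return np.repeat(np.repeat(img, fy, axis=0), fx, axis=1)

def estimate_scatter(psi_full, network):
    """Apply a trained network (any callable) to one full-size projection."""
    x = downsample(dse_input(psi_full))
    return upsample(network(x), psi_full.shape)
```

A smoother interpolation (for instance bilinear) could replace the nearest-neighbour upsampling, which is chosen here only to keep the example dependency-free.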

Fig. 2 Models used for the simulation study. The materials were chosen to be aluminum (cylinder head, casting, profile), steel (bicycle cassette) and a titanium alloy (compressor wheel). Note that the aluminum profile was used for testing only

2.4 Simulation Study

A scatter estimation approach is most useful if it does not need to be optimized for every component to be measured but applies to a broad range of components and acquisition parameters. In practice, one would want to optimize its parameters only for a few typical components and acquisition parameters. Subsequently, these parameters should also yield appropriate scatter estimates for other components. To investigate how well the proposed scatter estimation and the reference approaches meet this requirement, a simulation study was performed. To this end, projection data of different components (see Fig. 2) were simulated as follows. For each component, primary intensities \(\psi \) were generated using an analytic model:

$$\begin{aligned} \psi ({{\varvec{u}}}) = \frac{\int dE \, w(E) \, e^{-\mu (E) \cdot p({{\varvec{u}}})}}{\int dE \, w(E)}, \end{aligned}$$
(6)

where w(E) is the detected X-ray spectrum that was generated according to the model of Tucker et al. [35], \(\mu (E)\) is the attenuation coefficient of the component according to the evaluated photon data library [4] and \(p({{\varvec{u}}})\) is the intersection length at detector position \({{\varvec{u}}}\) that is derived by a forward projection of the component’s CAD model. Subsequently, X-ray scatter \(I_s\) was simulated using our in-house MC simulation [1]. Finally, Poisson noise \({{\mathcal {P}}}\) was added to generate the intensity data \({\tilde{\psi }}\):

$$\begin{aligned} {\tilde{\psi }}({{\varvec{u}}}) = \psi ({{\varvec{u}}}) + I_s({{\varvec{u}}}) + {{\mathcal {P}}}(\psi ({{\varvec{u}}}) + I_s({{\varvec{u}}})). \end{aligned}$$
(7)
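
The generation of the intensity data according to Eqs. (6) and (7) can be sketched as follows. The sketch makes several assumptions that go beyond the text: the spectrum w and the attenuation coefficient mu are sampled on the same uniform energy grid (so that the energy step cancels in the ratio of Eq. (6)), and the noise level of Eq. (7) is set by a hypothetical open-beam photon count per pixel. The spectrum and attenuation values in the example are placeholders, not real material or tube data.

```python
import numpy as np

def primary_intensity(p, w, mu):
    """Eq. (6): spectrum-weighted attenuation, normalized to the open beam.

    p  : intersection lengths per detector pixel (any shape)
    w  : detected spectrum, sampled on a uniform energy grid
    mu : attenuation coefficient, sampled on the same energy grid
    """
    atten = np.exp(-mu[:, None] * p.reshape(1, -1))  # (n_energies, n_pixels)
    psi = (w[:, None] * atten).sum(axis=0) / w.sum()
    return psi.reshape(p.shape)

def add_poisson_noise(intensity, photons_per_pixel, rng):
    """Eq. (7): intensity plus Poisson fluctuations for an assumed photon count."""
    counts = rng.poisson(photons_per_pixel * intensity)
    return counts / photons_per_pixel

# Placeholder inputs (hypothetical values for illustration only).
rng = np.random.default_rng(0)
w = np.ones(10)                      # flat placeholder spectrum
mu = np.linspace(1.0, 0.2, 10)       # placeholder attenuation values in 1/cm
p = np.array([[0.0, 1.0, 3.0]])      # intersection lengths in cm
scatter = np.full(p.shape, 0.01)     # placeholder scatter signal I_s

psi = primary_intensity(p, w, mu)
psi_noisy = add_poisson_noise(psi + scatter, 1e5, rng)
```
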
Table 1 Parameters for the training and testing data of the simulation study

Based on Eq. (7), two data sets were generated: a training data set and a testing data set. The training data set is used to optimize the open parameters of the kernel-based approach and the weights of the DSE network (see Sects. 2.1 and 2.3), while the testing data set is used to evaluate the performance of the scatter estimation approaches. For the training data set, 16,416 projections were generated using the CAD models of a compressor wheel, a cylinder head, a casting and a bicycle cassette as prior (Fig. 2). The corresponding simulation parameters are given in Table 1. The testing data set consists of one tomography (720 projections/360\(^\circ \)) each of a compressor wheel, a cylinder head, a casting, a bicycle cassette and an aluminum profile. To make sure that the training data do not resemble the testing data, the two sets were simulated using different parameters (see Table 1). Different scaling factors were applied to the prior models such that they differ in size. The data were simulated with a different orientation of the models (tilt angle) and different magnifications. Furthermore, different tube voltage and prefilter settings were used.

Fig. 3 Aluminum profile measured at our in-house table-top system

2.5 Measurement Data

To evaluate the performance of the DSE for real data, measurements of an aluminum profile (see Fig. 3) were conducted at our in-house table-top CT that is equipped with a 110 kV micro-focus X-ray tube and a Varian 4030 flat detector. Similar to the simulation study, a training data set and a testing data set were generated. Again, the training data set is used for parameter optimization and the testing data set is used for performance evaluation. In principle, there are two ways to obtain the training data set. Probably the most accurate way is to use data of different components measured at the same CT system together with the corresponding MC scatter simulations. However, since we did not have enough measurement data at hand to prevent overfitting of the DSE network, the training data set is based on simulations instead. The simulated data were generated according to Eq. (7) with

$$\begin{aligned} \psi ({{\varvec{u}}}) = G_\text {off}({{\varvec{u}}}) *\frac{\int dE \, w(E) \, e^{-\mu (E) \cdot p({{\varvec{u}}})}}{\int dE \, w(E)}. \end{aligned}$$
(8)

Here, the simulation was tuned to resemble measurements of our table-top CT as closely as possible. To this end, the detected X-ray spectrum w(E) of our system was estimated as described in Ref. [10]. Furthermore, off-focal radiation, modeled as a convolution with an off-focal kernel \(G_\text {off}\) as described in Ref. [16], was included in the simulation. As prior for the generation of the training data, the CAD models described in Sect. 2.4 were used. However, in contrast to the simulation study, the material of all components was set to aluminum, as a 110 kV X-ray source cannot sufficiently penetrate steel or titanium parts. All parameters are summarized in Table 2.

Table 2 Parameters for the simulated training data set and the measurement

3 Results

3.1 Simulation Study

Scatter estimates were evaluated for simulated tomographic measurements of five different components. The data were generated as described in Sect. 2.4. Subsequently, scatter was estimated using the kernel-based approach, the hybrid scatter estimation and DSE. The corresponding results for an exemplary projection view are shown in Fig. 4. A quantitative evaluation based on the mean absolute percentage error (MAPE) between the scatter estimate and the ground truth over all projection views is given in Table 3. The MAPE of the kernel-based method is between 8.8 and 19.8% with a maximum error between 30.3 and 87.7%. Since the hybrid scatter estimation calculates a distinct parameter set for every projection, it performs better, with a MAPE between 2.7 and 11.7% and a maximum error between 16.4 and 63.4%. DSE clearly outperforms the reference approaches, leading to scatter estimates with a MAPE between 0.6 and 1.5% and a maximum error between 5.0 and 13.2%. Similar trends can be observed in the CT images. To obtain them, the scatter estimates were subtracted from the scatter-corrupted projection data to derive a scatter-corrected data set. Subsequently, the corrected projections were reconstructed analytically using the FDK algorithm [5]. Exemplary images are shown in Fig. 5. While all scatter correction approaches lead to a significant improvement of CT value accuracy, the kernel-based and the hybrid approach tend to overcorrect scatter. As a result, streak artifacts are introduced into the CT reconstructions. In contrast, DSE leads to CT images that are almost free of artifacts.
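
For clarity, the error measures used above can be written down as a short function. This is a sketch of one plausible definition, assuming that the absolute percentage error is computed per detector pixel relative to the reference and then pooled over all evaluated pixels; the exact pooling used for Table 3 is not specified in the text. The optional `mask` argument is an assumption that allows the evaluation to be restricted to a region of interest.

```python
import numpy as np

def percentage_errors(estimate, reference, mask=None):
    """Return (mean, max) absolute percentage error of estimate vs. reference.

    estimate, reference : arrays of equal shape (e.g. projections x rows x cols)
    mask                : optional boolean array selecting the evaluated pixels
    """
    if mask is None:
        mask = np.ones(reference.shape, dtype=bool)
    est, ref = estimate[mask], reference[mask]
    ape = 100.0 * np.abs(est - ref) / np.abs(ref)
    return ape.mean(), ape.max()

# Small worked example with hypothetical numbers.
reference = np.array([[10.0, 20.0], [40.0, 50.0]])
estimate = np.array([[11.0, 18.0], [40.0, 55.0]])
mape, max_error = percentage_errors(estimate, reference)
# Per-pixel errors are 10%, 10%, 0% and 10%, so mape = 7.5 and max_error = 10.0
```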

Fig. 4 Absolute percentage error between the scatter estimates of the simulation study and the ground truth for an exemplary projection

3.2 Measurement Data

Scatter estimates were evaluated for a tomographic measurement of an aluminum profile at our in-house table-top CT system. The training data and the measurement data were generated as described in Sect. 2.5. Subsequently, scatter was estimated using the kernel-based approach, the hybrid scatter estimation and DSE. Since there is no ground truth for the measurement data, the scatter estimates were compared against an MC scatter prediction. Scatter estimates for an exemplary projection view are shown in Fig. 6. A quantitative evaluation yields a MAPE of 12.6% for the kernel-based method with a maximum error of 41.4%. As expected, the hybrid scatter estimation yields more accurate scatter estimates with a MAPE of 5.4% and a maximum error of 31.4%. Similar to the simulation study, DSE shows the best performance, with a MAPE of 2.7% and a maximum error of 10.0%. It has to be noted that the evaluation was restricted to the area of the component since the error in air does not significantly affect the CT value distribution of the component.

Table 3 Mean and maximum absolute percentage error between the scatter estimate and the ground truth evaluated for all 720 projection views of each component

The impact of the scatter correction on CT images is shown in Fig. 7. Both the reconstruction that uses the kernel-based scatter correction and the reconstruction that uses the hybrid scatter correction show strong streak artifacts. These artifacts result from an overestimation of X-ray scattering that can also be observed in Fig. 6. In regions of high attenuation, the scatter distribution is slightly lower. However, the kernel-based approach and the hybrid scatter estimation do not reproduce this dip in the scatter distribution. As a result, the attenuation is overestimated in the corrected projections, which introduces streak artifacts into the reconstructed images. In comparison, the proposed DSE approach yields images of similar quality as the MC scatter correction.
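
The link between an overestimated scatter signal and an overestimated attenuation can be made explicit with a single-pixel example. The sketch below assumes the usual correction step, in which the scatter estimate is subtracted from the measured intensity and the result is converted to a line integral via the negative logarithm. All intensity values are hypothetical and chosen only to represent a strongly attenuating region.

```python
import numpy as np

def corrected_line_integral(measured, scatter_est):
    """Subtract the scatter estimate and convert to a line integral p = -ln(psi)."""
    primary = np.clip(measured - scatter_est, 1e-12, None)  # keep the log finite
    return -np.log(primary)

psi_true = 0.02        # hypothetical primary intensity behind a thick section
scatter_true = 0.010   # hypothetical true scatter at that pixel
measured = psi_true + scatter_true

p_true = -np.log(psi_true)                                   # about 3.91
p_uncorrected = corrected_line_integral(measured, 0.0)       # about 3.51
p_exact = corrected_line_integral(measured, scatter_true)    # equals p_true
p_over = corrected_line_integral(measured, 1.3 * scatter_true)  # about 4.07
```

In this example, a 30% overestimate of the scatter signal turns an underestimated line integral into an overestimated one. Because the primary signal is small in regions of high attenuation, the same absolute scatter error has a much larger effect there than in weakly attenuating regions.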

Fig. 5 CT reconstructions of projections without scatter (first column), with scatter (second column) as well as the difference between the scatter corrected reconstructions and the ground truth (third to sixth column)

Fig. 6 Absolute percentage error between the scatter estimates of the measurement and the MC scatter prediction for an exemplary projection

Fig. 7 CT reconstructions of projections with MC scatter correction (first column), without correction (second column) as well as the difference between the three investigated scatter correction approaches and the MC scatter corrected reconstruction (third to sixth column)

4 Discussion and Conclusion

This manuscript describes the application of a deep convolutional neural network to estimate X-ray scatter in real-time. To this end, the proposed DSE network is trained to reproduce the output of MC simulations using the acquired projection data as input. In contrast to conventional kernel-based scatter estimation approaches, DSE has the advantage of being able to use non-linear mappings and varying scatter kernels depending on local features of the input image. Thus, X-ray scattering can be modeled more precisely, leading to an increased accuracy of the scatter estimates. The potential of DSE was demonstrated for simulated and measured data. The simulation study shows that DSE generalizes well to measurements of different components with different materials and varying acquisition parameters. The performance of DSE was evaluated for cases that differed from the training data in terms of size, shape and acquisition parameters. For all tested components, the MAPE between the DSE scatter prediction and the ground truth was at most 1.5%. This suggests that for a practical application of DSE it is sufficient to train the network using a few typical cases and typical acquisition parameters. Subsequently, DSE can be applied to other cases without a major loss of accuracy. In contrast to DSE, the reference approaches showed a significantly inferior performance. The kernel-based approach led to scatter estimates with a MAPE between 8.8 and 19.8%. More sophisticated approaches such as the hybrid scatter estimation were also less accurate (MAPE between 2.7 and 11.7%) than DSE. Especially in regions of high attenuation, the reference methods often overestimated the actual scatter distribution, leading to streak artifacts within the reconstructed CT images. Similar trends can be observed for real data measured at our in-house table-top CT system. Here, too, DSE clearly outperforms the two reference approaches. While CT reconstructions that were corrected using the kernel-based method and the hybrid scatter estimation show streak artifacts, DSE yields almost the same results as the MC-based correction.

However, compared to the simulation study, DSE is less accurate for measured data. This may be explained by the fact that the network was not trained on measurements but on simulations. Although the simulations were tuned to reproduce measurements at our CT system, they do not perfectly resemble real data. Therefore, we assume that the accuracy of the scatter estimates can be further increased if the training is performed on measured data.

It has to be noted that DSE, as it is applied here, relies heavily on the accuracy of the MC simulation. If the MC code does not predict the actual scatter distribution correctly, DSE does not either. However, DSE is not restricted to reproducing MC simulations but can be trained with any other scatter estimate, e.g. a scatter estimate derived using beam blockers or primary modulation approaches.