Introduction
Breast MRI, with its superior soft tissue contrast and ability to visualize angiogenesis through contrast-enhanced techniques, offers distinct advantages over other modalities like ultrasound (US) and mammography for breast cancer detection [1]. Recently, the European Society for Breast Imaging (EUSOBI) published a consensus statement advocating routine MRI screening for patients with dense breasts [2]. However, the wide-scale application of breast MRI is impeded by the high costs associated with MRI scanners and the required acquisition times. Consequently, there is a growing demand for accelerated image acquisition methods that can reduce examination time and thus increase the availability of breast MRI. A prevalent approach to accelerating MRI acquisition is to undersample the k-space data and reconstruct the image from the undersampled data using sensitivity encoding (SENSE) [3] or compressed sensing (CS) [4]. The advent of generative machine learning models [5] has extended these possibilities. Generative models can learn the distinguishing characteristics of typical images and can subsequently be used to synthesize new images that retain the characteristic properties of real images [6]. This technology has shown promising applications in radiology, for example by reducing the need for contrast agents [7], and presents potential avenues for the acceleration of MRI acquisition [8, 9].
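The undersampling idea described above can be illustrated with a minimal sketch, assuming a single-coil, two-dimensional Cartesian acquisition in which whole phase-encoding columns are skipped. The function names, the random column pattern, and the `center_fraction` parameter are illustrative assumptions and are not taken from this study; the zero-filled inverse transform is only a naive baseline, not SENSE, CS, or a score-based reconstruction.

```python
import numpy as np

def undersample_kspace(image, acceleration=4, center_fraction=0.08, seed=0):
    """Retrospectively undersample a 2D image in k-space (hypothetical helper).

    Keeps roughly n_cols / acceleration columns: a fully sampled
    low-frequency band plus randomly chosen outer columns.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = image.shape

    # Fully sampled k-space via a centered 2D FFT.
    kspace = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(image)))

    # Always keep a band of low-frequency columns around the center.
    n_center = max(1, int(round(n_cols * center_fraction)))
    mask_1d = np.zeros(n_cols, dtype=bool)
    start = (n_cols - n_center) // 2
    mask_1d[start:start + n_center] = True

    # Add random outer columns until the target number of columns is reached.
    n_keep = max(n_center, n_cols // acceleration)
    remaining = np.flatnonzero(~mask_1d)
    extra = rng.choice(remaining, size=n_keep - n_center, replace=False)
    mask_1d[extra] = True

    # The same column pattern is applied to every row.
    mask = np.broadcast_to(mask_1d, (n_rows, n_cols))
    return kspace * mask, mask

def zero_filled_reconstruction(kspace):
    """Naive baseline: inverse FFT with the missing samples left at zero."""
    return np.abs(np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(kspace))))
```

With `acceleration=4` and a 64-column image, this sketch keeps 16 columns per row. The zero-filled image then shows the aliasing artifacts that a reconstruction method has to resolve.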
Among these generative models, score-based models offer a compelling approach [10–12]. These models operate within the framework of unsupervised learning. By integrating the generative model’s knowledge of what a typical MRI image would look like, ambiguities arising from missing k-space data can be resolved. This enables the reconstruction process to be completed even when a fraction of the k-space data is missing [13].
Traditional MRI reconstruction techniques can be time-consuming and computationally expensive, which can limit their practical use in clinical applications. Recent advances in machine learning, however, have made generative models an effective means of improving the speed and accuracy of MRI reconstruction. Existing studies have investigated the use of generative adversarial networks (GANs) [14–16]. However, GANs are difficult to train, and diffusion models have been shown to deliver better performance for medical imaging, exceeding GANs in terms of diversity and image fidelity [17]. Therefore, we concentrated on score-based models, which have the same underlying structure as diffusion-based models and allow for the mathematical integration of the MRI reconstruction process [10, 13].
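One common way to express this integration is through Bayes' rule. The following is a sketch under simplifying assumptions (a single coil, a binary sampling mask M, the Fourier transform F, and Gaussian measurement noise with likelihood p(y | x) ∝ exp(−‖y − Ax‖² / (2σ²))); the symbols are introduced here for illustration, and the formulation is not necessarily the exact one used in this study.

```latex
\begin{align}
  y &= A x + \varepsilon, \qquad A = M F, \qquad
       \varepsilon \sim \mathcal{N}(0, \sigma^{2} I) \\
  \nabla_{x} \log p(x \mid y)
    &= \nabla_{x} \log p(x) + \nabla_{x} \log p(y \mid x) \\
    &\approx s_{\theta}(x) + \frac{1}{\sigma^{2}} A^{H} \left( y - A x \right)
\end{align}
```

Here s_θ(x) denotes the learned score, which approximates ∇ log p(x) and carries the prior knowledge of typical images, while the second term pulls the sample toward agreement with the acquired k-space data y. Because the acquisition operator A enters only through this second term, the same trained prior can in principle be paired with different sampling masks.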
In this study, we leverage a score-based generative model to accelerate breast MRI acquisition. Our approach involves training a deep neural network on a large dataset of breast MRI images to learn the underlying probability distribution of the MRI images, i.e., to learn the general appearance of these images. By combining this prior knowledge of the learned probability distribution with the acquired k-space measurements, we can reconstruct MRI images quickly and accurately. We then compare the image quality of reconstructions at various levels of k-space undersampling.
Our hypotheses in this study were: (1) score-based models can serve as effective generative models for synthesizing breast MRI images, and (2) these models can accelerate breast MRI acquisition without compromising image quality.
Discussion
Our study highlighted the potential of the score-based diffusion model for MRI reconstruction: it accelerated MRI acquisition while maintaining image clarity and reliability. As the acceleration factor increased, both quantitative measures (PSNR and SSIM) and qualitative evaluations by radiologists showed a decrease in image quality. Notably, the model reconstructed images from k-space data undersampled at acceleration factors as high as R = 20, although image quality ratings dropped considerably at this highest acceleration level.
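For readers less familiar with the quantitative measures mentioned above, PSNR can be sketched in a few lines. The snippet below is a minimal illustration and not the evaluation code of this study; in particular, taking the data range from the reference image is an assumption, and other conventions (for example, a fixed maximum intensity) are also in use.

```python
import numpy as np

def psnr(reference, reconstruction, data_range=None):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    if reference.shape != reconstruction.shape:
        raise ValueError("images must have the same shape")

    # Assumption: the intensity range is taken from the reference image.
    if data_range is None:
        data_range = reference.max() - reference.min()

    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))
```

A higher PSNR indicates a smaller pixel-wise error relative to the fully sampled reference. SSIM additionally compares local luminance, contrast, and structure; implementations of both metrics are available in scikit-image as `skimage.metrics.peak_signal_noise_ratio` and `skimage.metrics.structural_similarity`.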
While PSNR and SSIM provide valuable insight into the technical quality of image reconstruction, they may not fully capture the complexities of visual assessment by radiologists. The presence of outliers in our data suggests that radiologists’ evaluations are influenced by factors that extend beyond simple metrics of image fidelity, such as the clinical relevance of the images, the radiologists’ years of experience, and the perceptibility of important diagnostic features. Quantitative metrics, while helpful, therefore do not fully reflect clinical applicability as perceived by medical experts [24]. Our findings generally showed good agreement between the quantitative evaluations of our reconstruction technique and the qualitative assessments by radiologists; nevertheless, the outliers highlight the need for cautious interpretation of these metrics in clinical settings. Future research should consider these dynamics to better understand how such tools can be integrated into clinical practice without over-reliance on quantitative metrics alone.
Presently, acceleration techniques such as sensitivity encoding (SENSE) [3] and generalized autocalibrating partially parallel acquisitions (GRAPPA) [25] dominate clinical practice, while emerging research focuses on convolutional neural networks for MRI reconstruction [26, 27]. Our study points to two advantages of the score-based model over these approaches. First, it sidesteps the need for multiple coils, which is a prerequisite for SENSE and GRAPPA; in our experiments, an acceleration factor of 20 was achieved with just four coils. Second, the score-based model can adapt to various acquisition schemes after training, in contrast to specialized neural networks, which must be retrained for each acquisition technique [26, 27].
Furthermore, we incorporated both T1- and T2-weighted MRI sequences as class labels into the score-based diffusion model. This approach allowed us to train a single model that could handle multiple MRI sequences rather than separate models for each sequence. In preliminary experiments, we also tested whether a model trained only on T2-weighted images could reconstruct T1-weighted images, but we found that T1-weighted images had to be included in training for this to work. This integration is beneficial because it avoids the computational cost of training separate models and improves the accuracy and efficiency of MRI reconstruction.
To minimize variability in image interpretation, it is crucial for radiologists to adhere to established guidelines and protocols. Consulting with colleagues or referring to other imaging modalities may also be necessary. A significant limitation of our study is that the diagnostic accuracy of the reconstructed images was not evaluated. For instance, it is conceivable that reconstruction methods based on generative AI models might not accurately represent potentially malignant lesions. While we found no evidence of this issue, the lack of evidence does not confirm the absence of a problem. Thus, extensive investigations are needed before these methods can be reliably used in clinical settings. Should future studies show that the presented methods can be implemented in clinical practice, this would help alleviate the limited MRI capacity available to patients. This is particularly pressing considering the recent recommendation of the European Society for Breast Imaging to screen women with dense breasts by means of MRI [28].
Another application of generative models in breast MRI is reducing the need for contrast agents. As demonstrated in a recent publication by Müller-Franzes et al. [7], generative models can enhance MRI subtraction images with reduced contrast doses. In the present study, we did not train the model on different anatomical regions or apply it to data from external institutions. As a proof-of-concept study, we intended to show that the generative model learns the underlying distribution of breast images at the institution where it was trained. There is evidence, however, that a model trained on one specific anatomy or dataset can also be used to reconstruct unrelated anatomies: Jalal et al. trained a score-based diffusion model to reconstruct MRI images of the brain and applied it to abdominal and knee MRI images [13]. Future research should investigate this transferability and perform clinical evaluations of such reconstructed images.
Despite the promising outcomes, our study has limitations. First, the score-based model depends on a large set of fully sampled images for training, which could limit its adaptability to other datasets or higher-resolution images. This raises questions about its generalization capacity that require further exploration. Second, our study was limited to two-dimensional imaging; its efficacy for three-dimensional imaging remains a topic for future investigation. Third, our focus was predominantly on T2-weighted images; extrapolating our findings to other MRI contrasts requires additional research. Last, the tendency of generative models to “hallucinate” information is a crucial consideration when assessing their viability in clinical settings, and the reliability of such models in clinical scenarios remains an essential avenue for future studies.
In conclusion, our work demonstrates the potential of score-based models for the acceleration of MRI reconstruction. In contrast to existing approaches, the score-based model does not require multiple coils and can be used with arbitrary acquisition schemes. Further research is needed, but we believe that score-based models are a promising approach for accelerating MRI reconstruction in clinical practice.