In this study we have analysed the voxel-wise relationship between PSMA PET SUV and mpMRI parameters and developed radiomics-based machine learning models to predict tumour location and grade. This builds upon our earlier proof-of-concept study [
10] which had a smaller cohort of nine patients imaged with a non-uniform set of PET tracers with only five having the Ga68-PSMA-11 tracer. For this study, we combined imaging from these five patients with an additional 14 patients who all had PET imaging with the Ga68-PSMA-11 tracer, along with mpMRI according to ESUR guidelines [
12], to ensure data consistency. Furthermore, the earlier study did not investigate radiomics-based machine learning models for predicting tumour location and grade. We developed these models here with the future goal of implementing BiRT, which requires a voxel-level dose distribution.
Strengths of this study include the use of a highly controlled dataset, the accurate co-registration of PET/CT and mpMRI with ground truth histology data using an established framework [
15], and the inclusion of DCE MRI parameters, whereas most other studies have incorporated only ADC and T2w imaging from mpMRI. Furthermore, the voxel-wise approach differs from the region of interest (ROI)-based approach used in most other studies, and the step-wise development of two classifiers to predict tumour location and then tumour grade is, to our knowledge, unique, as all other studies have focussed on only one of these classification tasks.
Correlation analysis
Correlation analysis was conducted using Spearman rank correlation between imaging parameters rather than Pearson correlation, which had been used in our prior study [
10]. The Spearman rank correlation method was considered more appropriate because it does not assume that the underlying data are normally distributed, and the Kolmogorov–Smirnov test results confirmed that the data were not normally distributed. Despite the different correlation method, overall correlation trends validated our earlier findings [
10], confirming that perfusion-related mpMRI parameters from DCE MRI are most strongly correlated with PSMA PET SUV, whereas ADC and T2w MRI are not strongly correlated. Quantitative parameters Ktrans and iAUGC60, the most common DCE MRI biomarker used in oncology trials [
22], and semi-quantitative parameters ME and IRE each showed consistently strong positive correlations with PSMA PET SUV, while the semi-quantitative parameter TTP showed a negative correlation with PSMA PET SUV. These findings are consistent with a study by Zhao et al. [
20] who compared PSMA PET with DCE MRI parameters in 39 patients and reported malignant lesions had significantly shorter TTP than benign lesions. Overall results indicate that tumour voxels with increased PSMA PET tracer uptake correspond with higher levels of tissue perfusion, a key characteristic of tumours.
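The choice of Spearman over Pearson correlation can be illustrated with a minimal sketch. It uses only the Python standard library, and the voxel values are hypothetical and chosen purely for illustration; they are not data from this study. Spearman correlation is computed as the Pearson correlation of the ranks, so a monotonic but non-linear relationship still gives a coefficient close to ±1.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical voxel values (illustrative only, not study data).
ktrans = [0.05, 0.10, 0.15, 0.20, 0.30, 0.45]
ttp = [90, 80, 75, 60, 50, 35]
suv = [1.1, 1.3, 1.6, 2.4, 5.0, 14.0]

print("Spearman Ktrans vs SUV:", round(spearman(ktrans, suv), 3))
print("Spearman TTP vs SUV:   ", round(spearman(ttp, suv), 3))
print("Pearson  Ktrans vs SUV:", round(pearson(ktrans, suv), 3))
```

In this toy example the rank-based coefficient captures the perfectly monotonic relationship, while the Pearson coefficient is lower because the relationship is not linear.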
Tumour detection
Many artificial intelligence models that use radiomic features have been developed to predict tumour location from prostate mpMRI, with some gaining FDA and CE approval [24, 25]. In contrast, relatively few studies have predicted tumour location or tumour grade using radiomic features from PET and mpMRI in combination [
26]. This is partially due to a lack of standardisation for PET imaging and for computing PET radiomic features, as well as limited datasets available with ground truth histopathology for model development and validation. The larger voxel sizes used in PET and inherently lower signal-to-noise ratio when compared with mpMRI also provide challenges [
27]. Standard PET SUV parameters, including SUVmax, are often used in clinical practice; however, PET radiomic features offer significant potential for prostate cancer applications [27] and may outperform standard metrics, and studies using PET radiomics are on the rise [28‐32].
The mpMRI and PET radiomics-based models developed in this study contribute towards this growing research field. The tumour prediction and tumour grading models were developed as two separate classification tasks, with the predicted tumour voxels output by the tumour detection model being further classified into high-grade or low-grade voxels by the grading classifier. A single-classifier approach would have resulted in a multi-class classification problem with three classes (benign, high-grade tumour and low-grade tumour). This would have given a highly imbalanced class split in the dataset, requiring substantial data augmentation to reduce the imbalance, as well as more feature inputs. Hence the two-classifier approach was preferred, as it lowered the computational cost and simplified the performance assessment for each task.
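The two-stage arrangement described above can be sketched as follows. This is only an outline of the control flow, not the models developed in this study: the two threshold functions are hypothetical stand-ins for the trained detection and grading classifiers, and the feature names and cut-off values are assumptions made for illustration.

```python
BENIGN, LOW_GRADE, HIGH_GRADE = "benign", "low-grade", "high-grade"

def detect_tumour(voxel):
    """Stage 1 stand-in: flag a voxel as tumour (threshold is illustrative)."""
    return voxel["suv"] > 3.0

def is_high_grade(voxel):
    """Stage 2 stand-in: grade a detected voxel (threshold is illustrative)."""
    return voxel["adc"] < 0.9

def cascade_predict(voxels, detector=detect_tumour, grader=is_high_grade):
    """Run detection first; only voxels flagged as tumour reach the grader."""
    labels = []
    for voxel in voxels:
        if not detector(voxel):
            labels.append(BENIGN)
        elif grader(voxel):
            labels.append(HIGH_GRADE)
        else:
            labels.append(LOW_GRADE)
    return labels

# Hypothetical voxel features (illustrative only, not study data).
voxels = [
    {"suv": 1.2, "adc": 1.5},
    {"suv": 4.8, "adc": 1.1},
    {"suv": 9.5, "adc": 0.7},
    {"suv": 2.5, "adc": 0.6},  # not flagged by stage 1, so never graded
]
labels = cascade_predict(voxels)
print(labels)
```

Each stage is a binary decision, so each can be assessed separately. The last voxel shows the cost of this design: a voxel that is not flagged by the first stage is labelled benign regardless of what the grader would have predicted.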
Results showed that an RFC model for detecting prostate cancer using combined PET and mpMRI radiomic features performed better than RFC models developed using radiomic features from either modality alone. This was not unexpected, due to the aforementioned complementary nature of the imaging modalities as demonstrated in many non-radiomics studies and clinical trials [9, 33‐35]. This result was also consistent with studies by Zamboglou et al. [36] and Spohn et al. [37] who did not utilise radiomics, but similarly validated imaging with ground truth pathology data and showed that PSMA PET is a valuable addition alongside mpMRI for defining the gross tumour volumes (GTV) for focal therapy applications, where mpMRI is more likely to underestimate tumour volume than PSMA PET [36, 37].
The top 10 features for each model in our study demonstrated the superior performance of PET radiomic features for tumour detection, compared to mpMRI radiomic features, and their better predictive performance than the commonly used metric SUVmax. When assessing the most predictive features from mpMRI, the ADC map ranked the highest in the combined and the mpMRI alone models, with the NGTDM coarseness texture feature being the top-ranking in both followed by the 10th percentile ADC value. The only other mpMRI parameter with radiomic features in the top 10 list for the mpMRI alone model was the semi-quantitative DCE MRI parameter TTP, demonstrating the importance of perfusion imaging for detecting tumours especially when PSMA PET is unavailable.
Tumour grading
The tumour grading model in our study showed promising results, with high overall accuracy values across patients. However, the performance of the grading model was limited by the tumour detection model, because any tumour voxel that went undetected was automatically considered benign and was never passed to the grading model for classification as high grade or low grade. Therefore, further development of these models with larger datasets would be required to improve individual patient performance.
Several studies aiming to assess tumour grade using PET and mpMRI data can be compared with these findings. Domachevsky et al. [
38] previously analysed data from 22 patients to characterise prostate cancer and cell density using Ga68-PSMA-PET/MRI data. While they did not extract radiomic features, they showed that PET SUVmax, ADCmin and ADCmean were distinct biomarkers for differentiating between tumours with Gleason Score ≥ 7 and benign tissue. In another study by Papp et al. [
39], data from 52 patients were used to investigate the diagnostic performance of RFC classifiers with radiomic features from PSMA PET, ADC and T2w MRI to predict low-risk versus high-risk lesions. Their radiomics-based RFC model was better for predicting lesion risk than SUVmax (AUC was 0.86 versus 0.80). Their feature ranking analysis similarly showed that PSMA radiomic features were the most important, compared to ADC and T2w MRI features. A study by Solari et al. [
29] reported the complementary value of PSMA-PET and ADC radiomics. With a retrospective cohort of 101 patients, they extracted radiomic features from the entire prostate gland and developed a series of SVM models using single modality and combined modalities to predict Gleason Score. Models which combined PET and ADC radiomic features outperformed single modality radiomics-based models and other combined modality radiomics-based models (PET + T1w and PET + T2w), giving a balanced accuracy of 82% ± 5%. In a recent study by Feliciani et al. [
31], preliminary results were shown using radiomic features from PSMA PET and ADC maps to predict ISUP grade obtained from ground truth histology, which showed the complementary nature of PET and ADC radiomic features. In contrast to our study, none of these studies incorporated perfusion parameters from DCE MRI and they all had the tumour location delineated manually prior to development of their tumour grade model.
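Balanced accuracy, the metric quoted above for the combined PET and ADC models, is the mean of the per-class recall values. The following minimal sketch, using hypothetical labels rather than any data from the cited studies, shows why it is more informative than plain accuracy when classes are imbalanced.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of the recall obtained on each class present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

# Hypothetical imbalanced labels: 8 benign (0) and 2 tumour (1) samples.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0] * 10  # a trivial model that always predicts benign

print("accuracy:", accuracy(y_true, y_pred))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

A model that always predicts the majority class scores well on plain accuracy but only at chance level on balanced accuracy, because its recall on the minority class is zero.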
There are limitations to our study, including the small dataset of 19 patients, with imaging performed on two different MRI scanners and five different PET/CT scanners (Additional file
1: Table S1). Fourteen of the 19 PET scans in the dataset were from two GE scanner types, which used the same VPFXS reconstruction method and PET voxel size of 2.86 mm × 2.86 mm × 3.27 mm; however, the time per bed position varied between patients from 2 to 4 min. The remaining five patients in the dataset were scanned with three different Siemens scanner types and used either a point spread function (PSF) or an ordered-subset expectation maximisation (OSEM) reconstruction method, with Gaussian filter kernels of varying size and bed position times ranging from 2 to 3.5 min; all utilised a larger in-plane resolution than the GE scanner PET images. Each of these acquisition parameters can impact the partial volume effects in the PET images; however, in this study, it was not possible to account for all these variations and we assumed the images and SUV values could be directly compared. All PSMA PET and mpMRI parameter maps were resampled into 0.8-mm isotropic voxels to enable accurate co-registration with histology, which inherently assumed that resampling did not result in information loss or negatively impact radiomic feature extraction. Further studies would be required to determine how best to account for the partial volume effects caused by differing PET acquisition parameters for optimal utilisation in radiomics-based machine learning models.
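The scale of the resampling step can be made concrete with a short calculation. The sketch below takes the GE PET voxel size and the 0.8-mm target from the text; the image matrix size is a hypothetical value chosen for illustration, and the function is not the resampling code used in this study.

```python
def resample_plan(shape, spacing_mm, target_mm=0.8):
    """Return per-axis zoom factors and the resampled grid shape."""
    zoom = tuple(s / target_mm for s in spacing_mm)
    new_shape = tuple(round(n * z) for n, z in zip(shape, zoom))
    return zoom, new_shape

pet_spacing_mm = (2.86, 2.86, 3.27)  # GE PET voxel size stated in the text
pet_shape = (128, 128, 47)           # hypothetical matrix size (assumption)

zoom, new_shape = resample_plan(pet_shape, pet_spacing_mm)
voxels_per_native = zoom[0] * zoom[1] * zoom[2]

print("zoom factors:", tuple(round(z, 3) for z in zoom))
print("resampled shape:", new_shape)
print("resampled voxels per native PET voxel:", round(voxels_per_native, 1))
```

Each native PET voxel is therefore represented by roughly 52 resampled voxels. Interpolation adds no new information at this finer grid, which is why the effect of resampling on radiomic feature values is treated here as an assumption rather than a given.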
Additional limitations include that histology data were not obtained from prostate tissue at the apex or the base, because standard histology processing requires these sections to be cut in a parasagittal manner, which meant they could not be co-registered with imaging data. This means the spatial coverage of the histology used for ground truth validation varied between patients, with a median of 5 histology slices covering 2 cm of tissue, ranging from a minimum of 3 slices (covering 1 cm of tissue) to a maximum of 8 slices (covering 3.5 cm). Hence the predictive models may not be as accurate at the apex and base as they are at the mid-gland. Only a selection of perfusion parameters from DCE MRI was used to develop the classification models; incorporating other parameters, such as Ve, may have improved accuracy. In addition, the prostate was not separated into peripheral and transitional zones, which would have allowed the development of zone-specific models as the tissue differs between zones.
There are no standardised rules for choosing kernel size, so a large kernel of 9 voxels in each direction was chosen to match the average tumour size on PET imaging; however, a different kernel size may have given better results and improved the detection of small tumours. Studies by Yi et al. [
40] and Zamboglou et al. [
32] may be valuable to consider here, as they have both recently demonstrated that PSMA PET radiomic features can detect invisible tumour lesions with high accuracy. Both groups extracted radiomic features within the whole or half gland, but could not indicate where these invisible lesions were located, an important requirement for BiRT treatment planning.
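The effect of kernel size on small lesions can be illustrated with a one-dimensional sketch. It uses a simple local mean as a stand-in for a kernel-based radiomic feature, and the intensity profile is hypothetical; neither is taken from the feature extraction pipeline used in this study.

```python
def local_mean(signal, kernel):
    """Sliding-window mean; windows are truncated at the signal edges."""
    half = kernel // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

# Hypothetical intensity profile: a 3-voxel lesion on a uniform background.
profile = [1.0] * 10 + [10.0] * 3 + [1.0] * 10

peak_kernel_9 = max(local_mean(profile, 9))
peak_kernel_3 = max(local_mean(profile, 3))

print("peak local mean, 9-voxel kernel:", peak_kernel_9)
print("peak local mean, 3-voxel kernel:", peak_kernel_3)
```

With a 9-voxel kernel, every window that covers the lesion also includes six background voxels, so the lesion contrast is diluted. A kernel matched to the lesion width preserves the peak value, at the cost of fewer voxels per window for computing texture statistics.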