Introduction
Prostate cancer (PCa) is a commonly diagnosed malignancy that is associated with significant patient mortality [1]. If detected early, localised disease can typically be treated with radiotherapy or radical prostatectomy (RP) interventions with high success rates. However, biochemical recurrence, defined by rising serum prostate specific antigen (PSA) levels, can occur, with the possibility of the patient developing metastatic disease with a substantially poorer prognosis [2].
PCa imaging has rapidly advanced with the advent of radiotracers targeting the prostate specific membrane antigen (PSMA) transmembrane protein that is overexpressed on the majority of malignant PCa cells [3]. These PSMA-targeting radioligands can facilitate positron emission tomography/computed tomography (PET/CT) imaging with superior diagnostic performance to conventional imaging techniques, particularly for biochemically recurrent (BCR) PCa patients [4–6].
Evaluating patient response to therapeutic interventions remains critical to PCa patient care, and the quantitative analysis of medical images affords the opportunity to perform response assessments non-invasively. There exist several generalised imaging response assessment frameworks that are applied across a range of cancer types, such as the Response Evaluation Criteria in Solid Tumours (RECIST 1.1) and the PET Response Criteria in Solid Tumours (PERCIST) [7, 8]. The updated Prostate Cancer Working Group 3 (PCWG3) criteria detail prostate cancer-specific imaging response criteria; however, they make no recommendations on PSMA imaging modalities, referring only to conventional imaging modalities such as CT and bone scintigraphy [9]. Recently, response assessment frameworks designed specifically for PSMA PET/CT images have been proposed, including the PSMA PET progression criteria (PPP) and the Response Evaluation Criteria in PSMA PET/CT (RECIP 1.0) [10, 11]. The prognostic utility of the PPP and RECIP 1.0 frameworks has been demonstrated in high disease burden metastatic castration resistant PCa (mCRPC) populations undergoing 177Lu-PSMA radioligand therapy, with a recent comparative study finding RECIP 1.0 to have the highest inter-reader reliability and prognostic utility in classification of progressive disease vs. non-progressive disease [12, 13]. It remains unclear, however, whether the RECIP 1.0 criteria retain their prognostic value in alternative patient populations with less advanced disease.
RECIP 1.0 requires the measurement of the change in whole-body tumour burden between baseline and follow-up imaging. Typically, this biomarker is quantified from PSMA PET scans using semi-automated techniques that require manual modifications [14, 15]. Artificial intelligence (AI) affords a unique opportunity to quantify tumour burden fully automatically, with recent work demonstrating the feasibility of using convolutional neural network (CNN) architectures to automatically segment patient disease in PSMA PET/CT scans [16–18]. AI-based disease burden quantification has the potential to facilitate fast and reproducible response assessment if integrated into frameworks such as RECIP 1.0.
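As an illustrative sketch, the change in whole-body tumour burden can be expressed as a percentage change in total tumour volume (TTV) between the two scans. The subscripts BL (baseline) and FU (follow-up) are notation assumed here for illustration and are not taken from the RECIP 1.0 publication; the expression is undefined when the baseline volume is zero.

```latex
\[
\Delta\mathrm{TTV}\,(\%) \;=\; \frac{\mathrm{TTV}_{\mathrm{FU}} - \mathrm{TTV}_{\mathrm{BL}}}{\mathrm{TTV}_{\mathrm{BL}}} \times 100
\]
```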
The primary aim of this study was to validate the prognostic value of the radiographic RECIP 1.0 response assessment framework with respect to overall survival (OS) in a cohort of biochemically recurrent (BCR) PCa patients undergoing standard-of-care treatment. The secondary aim was to analyse whether AI-based tumour burden quantification techniques could be integrated into the RECIP 1.0 framework.
Discussion
Evaluating disease progression in molecular imaging is a critical component of patient care. Response assessment frameworks that are intended for clinical use should demonstrate prognostic utility in the cohort in which they are used. The RECIP 1.0 criteria have demonstrated their prognostic power in high disease burden mCRPC populations undergoing 177Lu-PSMA radioligand therapy [12, 13], but their prognostic utility in less advanced disease populations remained to be validated. In this study, we demonstrated that in a less advanced disease BCR PCa population undergoing standard-of-care treatment with a long follow-up time, the RECIP criteria retain their prognostic significance. Furthermore, we showed the feasibility of incorporating automated AI-based lesion delineations into the RECIP framework without loss of prognostic value. Given the potential for AI tumour burden quantification to facilitate both fast and completely reproducible response assessment, this finding has significant clinical implications.
AI-based lesion segmentation in PSMA images is rapidly advancing, with numerous studies demonstrating the potential for fully automatic PCa lesion delineation [16–18]. To our knowledge, this work is the first to report the prognostic value of a fully automatic AI-based methodology for tumour burden quantification in a response assessment setting in prostate cancer, with previous work in this space employing semi-automated segmentation techniques. Kind et al. [12] retrospectively analysed the prognostic value of the RECIP framework in mCRPC patients undergoing 177Lu-PSMA radioligand therapy, with tumour burden quantified semi-automatically using the methodology developed by Seifert et al. [15]. Their results demonstrated a significantly increased risk of death for RECIP-PD patients (HR = 2.69 (1.42–5.11), p = 0.002), a finding that was replicated in our less advanced disease population for both semi-automated (HR = 3.78 (1.96–7.28), p < 0.005) and AI-based (HR = 3.75 (1.23–11.47), p = 0.02) segmentation methods. Gafita et al. [13], in their recent comparative study, utilised the semi-automated qPSMA software [14] for tumour volume quantification, also finding a significantly increased risk of death for RECIP-PD patients undergoing 177Lu-PSMA radioligand therapy (HR = 4.33 (2.80–6.70), p < 0.001), which is again similar to our results. Relative to these semi-automated techniques, our novel AI-based method has the advantages of complete reproducibility and of requiring no manual modification of the segmentation mask.
In the original RECIP 1.0 study, it was hypothesised that in patients who demonstrated a ΔTTV increase of > 20%, those who also had new lesions develop between scans would have a significantly worse survival probability relative to those who did not have new lesions [11]. Our study confirmed this hypothesis (HR = 3.22 (1.05–9.89), p = 0.04), suggesting that the decision to incorporate the presence of new lesions into the RECIP framework for defining RECIP-PD was valid and also translates to lower disease burden PCa populations. This analysis was performed only for the semi-automated segmentation method because the sample of patients who had AI lesion segmentation and a ΔTTV_AI of > 20% was small (n = 20).
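The progressive-disease rule discussed above (a ΔTTV increase of > 20% together with new lesions) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function names, the millilitre units, and the strict "greater than" comparison (chosen to follow the "> 20%" wording in the text) are all hypothetical, and the other RECIP 1.0 response categories are not modelled.

```python
def delta_ttv_percent(ttv_baseline_ml: float, ttv_followup_ml: float) -> float:
    """Percentage change in total tumour volume (TTV) between two scans."""
    if ttv_baseline_ml <= 0:
        # A relative change is undefined without measurable baseline disease.
        raise ValueError("baseline TTV must be positive to compute a relative change")
    return 100.0 * (ttv_followup_ml - ttv_baseline_ml) / ttv_baseline_ml


def is_recip_pd(
    ttv_baseline_ml: float,
    ttv_followup_ml: float,
    has_new_lesions: bool,
    threshold_percent: float = 20.0,  # assumption: strict "> 20%" as worded in the text
) -> bool:
    """Return True when both conditions for progressive disease (PD) hold:
    a TTV increase above the threshold AND at least one new lesion."""
    change = delta_ttv_percent(ttv_baseline_ml, ttv_followup_ml)
    return bool(has_new_lesions) and change > threshold_percent


# Hypothetical volumes: 10 mL at baseline, 15 mL at follow-up (a 50% increase).
pd_with_new_lesion = is_recip_pd(10.0, 15.0, has_new_lesions=True)
pd_without_new_lesion = is_recip_pd(10.0, 15.0, has_new_lesions=False)
```

With these hypothetical volumes, the same 50% volume increase is classified as PD only when a new lesion is also present, which is the distinction the survival comparison above examines.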
It is noteworthy that there was higher concordance between AI and manual scan interpretation for ΔTTV > 20% (moderate agreement, k = 0.60) than for RECIP-PD classification (fair agreement, k = 0.31). The example presented in Fig. 5 demonstrates why this might be the case. This patient presented with a new nodal lesion between scans that was not detected by the AI model. This resulted in a discordant RECIP-PD classification. However, despite this false negative, both segmentation methods were in agreement about whether there was a ΔTTV > 20% (ΔTTV_man = 325%, ΔTTV_AI = 54%), because the AI model predicted a large increase in the volume of another nodal lesion in the left iliac between scans. Therefore, the incorporation of the criteria for new lesions into the RECIP framework may make it more difficult for agreement to be reached between segmentation methods, since a single false negative or positive can impact the classification. Despite this lower agreement, however, both segmentation methods demonstrated significant prognostic value in RECIP-PD classifications.
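To make the agreement figures concrete, the sketch below computes a chance-corrected agreement statistic between two sets of binary classifications. It assumes the k values quoted above are Cohen's kappa, which this section does not state explicitly; the function name and the example labels are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters' labels over the same cases."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of cases where the two raters match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each rater's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical PD calls (1 = PD, 0 = non-PD) from manual and AI segmentation.
manual_pd = [1, 1, 0, 0]
ai_pd = [1, 0, 0, 0]  # one missed new lesion flips a single classification
kappa = cohen_kappa(manual_pd, ai_pd)
```

In this toy example a single missed lesion changes one of four classifications, which illustrates how a lone false negative can lower agreement on a binary endpoint even when the volume-change criterion is met by both methods.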
Summary assessments of disease progression at the patient level may obscure lesion-level response heterogeneity. Individual metastatic disease sites may present with underlying molecular heterogeneity, which can lead to a ‘mixed response’ scenario, whereby some lesions respond well to treatment and reduce in volume or uptake, while others increase in size or uptake, or new disease sites appear within the patient [23, 24]. Published test-retest repeatability limits for metastatic PCa lesions in [68Ga]Ga-PSMA-11 PET images can be used to inform a lesion-level response analysis which puts the patient-level RECIP classification into further context [25]. This lesion-level response assessment, which was outside the scope of the present study, should be investigated in future work.
This study has some limitations that should be noted. Patients were treated according to standard-of-care at the discretion of the treating physician and the patient. This means that heterogeneous treatments were administered to patients between scans, which has the benefit of being highly reflective of the treatment scenarios likely to occur in everyday clinical practice for this patient population. However, this makes it difficult to draw robust conclusions about individual treatment methods on their own, and future prospective studies are necessary to elucidate the prognostic value of RECIP for specific treatment interventions in BCR PCa populations. Additionally, the segmentations generated by the AI model were used without modification or expert quality assurance. While this provides a good estimate of how well the model is performing, it is highly unlikely to be how the model is used in actual clinical practice, where AI-generated delineations will likely serve either as an initial best approximation with subsequent human modifications, or as a quality assurance check on human-generated segmentations. With such checks and balances in place, false negatives (and false positives) such as the one described above can potentially be mitigated. Further prospective clinical studies are required in order to quantify AI model prognostic significance when incorporated into RECIP 1.0 in a real-world clinical context [26].
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.