Published in: EJNMMI Research 1/2020

Open Access | 1 December 2020 | Original research

Visual and quantitative evaluation of [18F]FES and [18F]FDHT PET in patients with metastatic breast cancer: an interobserver variability study

Authors: Lemonitsa H. Mammatas, Clasina M. Venema, Carolina P. Schröder, Henrica C. W. de Vet, Michel van Kruchten, Andor W. J. M. Glaudemans, Maqsood M. Yaqub, Henk M. W. Verheul, Epie Boven, Bert van der Vegt, Erik F. J. de Vries, Elisabeth G. E. de Vries, Otto S. Hoekstra, Geke A. P. Hospers, C. Willemien Menke-van der Houven van Oordt

Abstract

Purpose

Correct identification of tumour receptor status is important for treatment decisions in breast cancer. [18F]FES PET and [18F]FDHT PET allow non-invasive assessment of the oestrogen (ER) and androgen receptor (AR) status of individual lesions within a patient. Despite standardised analysis techniques, interobserver variability can significantly affect the interpretation of PET results and thus clinical applicability. The purpose of this study was to determine visual and quantitative interobserver variability of [18F]FES PET and [18F]FDHT PET interpretation in patients with metastatic breast cancer.

Methods

In this prospective, two-centre study, patients with ER-positive metastatic breast cancer underwent both [18F]FES and [18F]FDHT PET/CT. In total, 120 lesions were identified in 10 patients with either conventional imaging (bone scan or lesions > 1 cm on high-resolution CT, n = 69) or only with [18F]FES and [18F]FDHT PET (n = 51). All lesions were scored visually and quantitatively by two independent observers. A visually PET-positive lesion was defined as uptake above background. For quantification, we used standardised uptake values (SUV): SUVmax, SUVpeak and SUVmean.

Results

Visual analysis showed an absolute positive and negative interobserver agreement for [18F]FES PET of 84% and 83%, respectively (kappa = 0.67, 95% CI 0.48–0.87), and 49% and 74% for [18F]FDHT PET, respectively (kappa = 0.23, 95% CI − 0.04–0.49). Intraclass correlation coefficients (ICC) for quantification of SUVmax, SUVpeak and SUVmean were 0.98 (95% CI 0.96–0.98), 0.97 (95% CI 0.96–0.98) and 0.89 (95% CI 0.83–0.92) for [18F]FES, and 0.78 (95% CI 0.66–0.85), 0.76 (95% CI 0.63–0.84) and 0.75 (95% CI 0.62–0.84) for [18F]FDHT, respectively.

Conclusion

Visual and quantitative evaluation of [18F]FES PET showed high interobserver agreement. These results support the use of [18F]FES PET in clinical practice. In contrast, visual agreement for [18F]FDHT PET was relatively low due to low tumour-background ratios, but quantitative agreement was good. This underscores the relevance of quantitative analysis of [18F]FDHT PET in breast cancer.
Notes
Lemonitsa H. Mammatas and Clasina M. Venema are co-leading authors.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Introduction

Breast cancer is the most common malignancy in women in the Western world. The majority of breast tumours express the oestrogen receptor (ER), which is the main indicator of potential response to anti-oestrogen therapies [1, 2]. Therefore, it is mandatory to determine ER expression in breast cancer. Recently, the androgen receptor (AR) emerged as a possible target for breast cancer therapy. The AR is present in 70–80% of patients with breast cancer, and AR antagonists are under investigation in clinical trials [3–6].
A tumour biopsy is the gold standard to determine receptor expression. However, this is an invasive procedure, is not always feasible in case of inaccessible tumour sites, and is subject to sampling errors [7]. The 16α-[18F]fluoro-17β-oestradiol ([18F]FES) and 16β-[18F]fluoro-5α-dihydrotestosterone ([18F]FDHT) PET/CT have been developed to non-invasively visualise, respectively, the ER and AR status in the tumour lesions within a patient. Previously, it has been shown that [18F]FES and [18F]FDHT uptake correlate well with ER and AR expression levels in representative breast cancer biopsies [8–10]. As a diagnostic tool, [18F]FES PET leads to better diagnostic understanding in 88% and to a change of therapy in 48% of the patients presenting with a clinical dilemma [11]. To predict treatment effects, [18F]FES PET can be used to assess residual ER availability during treatment with, e.g. fulvestrant, a selective ER downregulator. Inadequate reduction of the [18F]FES PET signal (< 75%) by fulvestrant treatment was associated with early progression [12]. Similarly, in patients with prostate cancer, [18F]FDHT PET was used to determine the optimal dose of the AR blocker enzalutamide in a phase 1 trial [13]. Lastly, patients with ER-positive breast cancer and high [18F]FDG uptake showed a worse progression-free survival if [18F]FES uptake was low in comparison to high [18F]FES uptake (3 versus 8 months, respectively) [14].
For all these potential applications, reliable, observer-independent identification and quantification of [18F]FES and [18F]FDHT uptake in tumour lesions is essential for translation to daily clinical practice. Up till now, there are no data on the interobserver variability of [18F]FES and [18F]FDHT PET in breast cancer. Therefore, the primary objective of this study was to examine interobserver variability in visual and quantitative assessment of [18F]FES and [18F]FDHT PET. Secondary objectives included the effect of tumour to background ratio (TBR), tracer accumulation, tumour size and the use of different SUV parameters (SUVmax, SUVpeak or SUVmean) on interobserver agreement. Also, the added value of quantitative assessment in comparison to visual assessment was examined, and the number of lesions detected on [18F]FES and [18F]FDHT was compared with those detected on conventional imaging methods (contrast enhanced CT scan and bone scan).

Materials and methods

Patient population

This prospective two-centre interobserver variability study was part of a study investigating the correlation between [18F]FES and [18F]FDHT uptake and ER and AR expression in simultaneously biopsied metastases, of which the results have been published elsewhere [8]. Patients were recruited from September 2014 to August 2015 at the CCA-VUmc University Medical Center Amsterdam and the University Medical Center Groningen in the Netherlands.
Eligibility criteria included metastatic breast cancer and an ER-positive primary tumour, ≥ 1 extrahepatic tumour lesion, ECOG performance status of ≤ 2 and a postmenopausal status or use of LHRH-agonists. Patients were excluded if they had used ER or AR binding drugs during the 6 weeks before study entry, because these ligands compete with tracer binding.
All patients had to give written informed consent before study participation. The study was conducted in compliance with the ethical principles originating in or derived from the Declaration of Helsinki and in compliance with all International Conference on Harmonization Good Clinical Practice guidelines. The local medical ethics committee approved the study (NCT01988324).

Imaging protocols

[18F]FES and [18F]FDHT were produced as described previously [15, 16]. On separate days, ≤ 14 days apart, 200 MBq (± 10%) of each tracer was injected. After 60 min (± 5 min), a low-dose CT was performed during tidal breathing for attenuation correction, followed by a whole-body PET scan (skull vertex to mid-thigh, 2 min per bed position). PET/CT scans were made using a Philips Gemini TF-64 PET/CT (Amsterdam) or Siemens 64 slice mCT PET/CT (Groningen). Acquisition and reconstruction protocols used on both scanners were according to the recommendations of the European Association of Nuclear Medicine (EARL) [17].
In addition, a high-resolution, contrast-enhanced CT chest-abdomen and bone scan was performed within 6 weeks of the PET scans for comparison.

Image analyses

Contrast enhanced CT scans were examined by experienced radiologists and bone scans by experienced nuclear medicine physicians, respectively, masked for the [18F]FES and [18F]FDHT PET results. Two independent observers from each centre (LM and CV), trained and supervised by two experienced nuclear medicine physicians, performed visual and quantitative analyses. The observers had knowledge of conventional imaging results (contrast enhanced CT and bone scans).
A visually PET-positive lesion was defined as focal uptake above local background incompatible with physiological uptake. Liver metastases were excluded from all analyses in this study because of high physiological [18F]FES and [18F]FDHT uptake in healthy liver tissue, making reliable identification of metastases difficult. In addition, if visual interpretation of uptake in a (potential) lesion was impossible, e.g. due to overlap with adjacent organs with high physiological tracer, the readers independently reported it as ‘not evaluable’ in the visual ratings, and these were excluded from further analyses. For each patient, the observers made a list that consisted of all lesions already detected on conventional imaging, followed by additional lesions discovered on [18F]FES or [18F]FDHT PET. An anatomical description of all the lesions was reported in order to match the results. In case a lesion was not reported by one of the two observers, it was scored as not visible for that observer. All visually PET-positive lesions were quantified, as well as PET-negative lesions that were identified on conventional imaging (i.e. lesions on bone scintigraphy and/or high resolution CT > 1 cm).
Each observer manually drew volumes of interest (VOI) on the tumour contours, using PET images for PET-positive lesions and low-dose CT images for PET-negative lesions (lesions only seen on bone scan or high-resolution CT were visually matched on the low-dose CT). Lesions were separately analysed based on visibility on either PET or conventional imaging alone to investigate the influence of visibility on imaging techniques on interobserver agreement.
For every VOI, the standardised uptake values (SUV), i.e. the tracer uptake within a VOI normalised to the injected dose and body weight, were calculated using the software programs accurate (built in-house using IDL, observer 1) and syngo.via version VB10B, Siemens (observer 2). Both programs yielded identical results on test images. Three types of SUV were compared in this study: SUVmax (voxel with highest SUV within the VOI), SUVpeak (average SUV of a 1 cm³ sphere containing the hottest voxels of the VOI) and SUVmean with isocontour 50% of SUVmax (average SUV of all voxels with uptake ≥ 50% of SUVmax).
Based on previous studies, an SUVmax [18F]FES cut-off ≥ 1.5 was used to define ER-positivity (corresponding with an IHC cut-off of ≥ 1%) and an SUVmax [18F]FDHT cut-off ≥ 1.9 for AR positivity (corresponding with an IHC cut-off of ≥ 10%) [8, 9].
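The SUV definitions above can be illustrated with a minimal Python sketch. It is not the software used in the study: the function names, the units (kBq/mL, MBq, kg), the assumed tissue density of 1 g/mL and the example voxel values are assumptions for illustration. Decay correction is omitted, and SUVpeak is left out because it needs the 3D voxel geometry.

```python
from statistics import mean


def suv_bw(conc_kbq_per_ml, injected_dose_mbq, body_weight_kg):
    """Body-weight SUV: tissue concentration / (injected dose / body weight).

    Assumes 1 mL of tissue weighs 1 g and ignores decay correction.
    """
    dose_kbq = injected_dose_mbq * 1000.0
    weight_g = body_weight_kg * 1000.0
    return conc_kbq_per_ml / (dose_kbq / weight_g)


def suv_max(voxel_suvs):
    """SUVmax: the single hottest voxel in the VOI."""
    return max(voxel_suvs)


def suv_mean_isocontour(voxel_suvs, fraction=0.5):
    """SUVmean of all voxels with uptake >= fraction * SUVmax."""
    threshold = fraction * max(voxel_suvs)
    return mean(v for v in voxel_suvs if v >= threshold)


# Hypothetical VOI: four voxel SUVs (illustrative values only).
voxels = [1.0, 2.0, 3.0, 4.0]
print(suv_max(voxels), suv_mean_isocontour(voxels))
```

Because the isocontour threshold depends on SUVmax, the SUVmean inherits any disagreement in the hottest voxel as well as in the VOI outline.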
For [18F]FES and [18F]FDHT, the SUVmax tumour-background ratio (TBR) was defined as the ratio of the SUVmax of a tumour lesion and the SUVmean of healthy background tissue. To determine the SUVmean of healthy background tissue, a VOI was drawn on reference tissue in the unaffected contralateral site whenever available or in the unaffected surrounding tissue of the same origin [18].
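As a small sketch of this ratio, with a hypothetical function name and hypothetical input values:

```python
def tumour_background_ratio(lesion_suv_max, background_suv_mean):
    """TBR = SUVmax of the lesion / SUVmean of healthy reference tissue."""
    if background_suv_mean <= 0:
        raise ValueError("background SUVmean must be positive")
    return lesion_suv_max / background_suv_mean


# Hypothetical lesion and background values (not taken from the study).
print(tumour_background_ratio(3.0, 1.5))
```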

Statistical analyses

For visual assessments, agreement was calculated with absolute and relative measures of interobserver agreement. Absolute agreement is the probability that if one observer would score a lesion as visible (positive agreement) or not visible (negative agreement) on the PET scan, the other observer would do the same. It is calculated by the following formulas: positive agreement = 2 × lesions visible to both observers/(2 × lesions visible to both observers + lesions only visible to observer 1 + lesions only visible to observer 2) and negative agreement = 2 × lesions not visible to both observers/(2 × lesions not visible to both observers + lesions only not visible to observer 1 + lesions only not visible to observer 2) [19]. In order to compare results with previous studies, also reliability (relative agreement) was calculated according to Cohen’s kappa, and the results were interpreted as follows: kappa 0.01–0.20 as slight, 0.21–0.40 as fair, 0.41–0.60 as moderate, 0.61–0.80 as substantial and 0.81–1.00 as almost perfect interobserver agreement [20]. To account for potential within-person correlation in visual assessments, a chi-square test was performed to examine whether the percentage visual agreement differed per patient.
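The formulas above can be checked with a short Python sketch. The function names are illustrative; the counts are those reported in Table 2A for [18F]FES PET lesions visible on conventional imaging, with 'not evaluable' lesions left out (24 visible to both observers, 6 and 3 visible to only one observer, 22 visible to neither).

```python
def specific_agreement(both_visible, only_obs1, only_obs2, both_not_visible):
    """Return (positive agreement, negative agreement) for a 2 x 2 table."""
    discordant = only_obs1 + only_obs2
    positive = 2 * both_visible / (2 * both_visible + discordant)
    negative = 2 * both_not_visible / (2 * both_not_visible + discordant)
    return positive, negative


def cohens_kappa(both_visible, only_obs1, only_obs2, both_not_visible):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = both_visible + only_obs1 + only_obs2 + both_not_visible
    observed = (both_visible + both_not_visible) / n
    obs1_visible = both_visible + only_obs1
    obs2_visible = both_visible + only_obs2
    expected = (
        obs1_visible * obs2_visible
        + (n - obs1_visible) * (n - obs2_visible)
    ) / n ** 2
    return (observed - expected) / (1 - expected)


# Counts from Table 2A ([18F]FES PET, lesions visible on conventional imaging).
pa, na = specific_agreement(24, 6, 3, 22)
kappa = cohens_kappa(24, 6, 3, 22)
print(f"positive {pa:.0%}, negative {na:.0%}, kappa {kappa:.2f}")
```

With these counts the sketch gives the reported 84% positive agreement, 83% negative agreement and kappa of 0.67. Confidence intervals are not computed here.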
For quantitative assessments, parameters are presented as mean ± SD, and reliability was calculated with intraclass correlation coefficients (ICC) using a two-way random effect model with absolute agreement. For the interpretation of the ICCs, the following guideline was used: ≥ 0.90 as excellent, ≥ 0.75 as good, ≥ 0.50 as moderate and < 0.50 as poor [21].
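A two-way random effects ICC with absolute agreement can be sketched in Python as follows. The source does not state whether single or average measures were reported, so the single-measure form, often written ICC(2,1), is an assumption here, as are the function names and the toy data.

```python
def icc_two_way_random_absolute(ratings):
    """Single-measure ICC, two-way random effects, absolute agreement.

    ratings: one row per lesion, one column per observer.
    Undefined (division by zero) if the data show no variance at all.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)              # between-lesion mean square
    ms_cols = ss_cols / (k - 1)              # between-observer mean square
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc):
    """Interpretation guideline used in the text."""
    if icc >= 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


# Toy data: observer 2 always reads 1.0 higher than observer 1.
toy = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
print(icc_two_way_random_absolute(toy))
```

In the toy data the two observers rank the lesions identically, yet the ICC is 2/3 rather than 1, because absolute agreement penalises the systematic offset between observers.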
Absolute agreement on quantitative assessments was analysed with Bland-Altman plots (differences between observers showed a normal distribution). For each lesion, the plot shows the average SUV of observers 1 and 2 on the x-axis and the difference between observers, expressed as a percentage of the average SUV, on the y-axis. Percentage differences were used instead of absolute differences to make the magnitude of the differences independent of the magnitude of the SUV, and to facilitate comparisons between the SUV parameters SUVmax, SUVmean and SUVpeak, which may show large differences in absolute values.
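The percentage differences and 95% limits of agreement behind such a plot can be sketched as below. The sign convention (observer 1 minus observer 2), the function names and the SUV values are assumptions for illustration; the limits use the usual mean ± 1.96 SD.

```python
from statistics import mean, stdev


def percentage_differences(obs1, obs2):
    """Per-lesion difference as a percentage of the two observers' average."""
    return [100.0 * (a - b) / ((a + b) / 2.0) for a, b in zip(obs1, obs2)]


def limits_of_agreement(differences):
    """Return (mean difference, lower 95% limit, upper 95% limit)."""
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread


# Hypothetical SUVmax readings of four lesions by two observers.
obs1 = [2.1, 3.4, 1.6, 5.0]
obs2 = [2.0, 3.6, 1.5, 5.2]
bias, lower, upper = limits_of_agreement(percentage_differences(obs1, obs2))
print(f"mean difference {bias:.1f}%, LOA95% {lower:.1f} to {upper:.1f}%")
```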
To investigate the effect of TBRs on interobserver variability, differences between TBRs of [18F]FES and [18F]FDHT PETs were tested with Wilcoxon matched pairs signed rank tests. In addition, correlations between tracer uptake or tumour size and percentage interobserver differences were determined using the Spearman correlation coefficient (r). Finally, linear regression was performed to find the linear function between SUVmax, SUVpeak and SUVmean for [18F]FES and [18F]FDHT PET, and Cochran’s Q and McNemar tests were used to analyse differences between visibility and quantitative uptake above or below cut-off for SUVmax, SUVpeak and SUVmean. P value < 0.05 was considered significant. Statistical analyses were generated using the SPSS software (version 22; IBM, SPSS statistics).
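The study ran these tests in SPSS. As an illustration of one of them, the Spearman correlation can be sketched in plain Python as a Pearson correlation on average ranks. The function names and the example values are hypothetical, and no P value is computed.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: lesion SUVmax versus absolute percentage difference.
suv = [1.8, 2.5, 3.1, 4.0, 6.2]
abs_pct_diff = [12.0, 5.0, 9.0, 7.0, 8.0]
print(round(spearman_r(suv, abs_pct_diff), 2))
```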

Results

Patient characteristics

A total of 120 lesions were identified in 10 patients using the different imaging modalities (Table 1). Most lesions were skeletal (66%), followed by lymph node (25%) and visceral metastases (9%). The median number of lesions per patient was 9 (range 2–32).
Table 1
Patient characteristics

| Characteristic | Number (n = 10) | % |
|---|---|---|
| Age in years, mean (range) | 67 (48–79) | |
| Biopsy of primary tumour: ER+/AR+ | 10 | 100 |
| Biopsy of metastases: ER+/AR+ | 8 | 80 |
| Biopsy of metastases: ER+/AR− | 1 | 10 |
| Biopsy of metastases: ER−/AR− | 1 | 10 |
| Previous treatment lines: 0–1 | 3 | 30 |
| Previous treatment lines: 2–4 | 7 | 70 |
| Visible lesions: total, median per patient (range) | 120, 9 (2–32) | |
| Visible lesions: conventional imaging (CT, bone scan) | 69 (54, 40) | 58 (45, 33) |
| Visible lesions: visible on PET alone ([18F]FES or [18F]FDHT PET) | 51 (33, 20) | 42 (28, 16) |
| Visible lesions: total visible on [18F]FES PET (observer 1, 2) | 64, 69 | 53, 58 |
| Visible lesions: total visible on [18F]FDHT PET (observer 1, 2) | 36, 37 | 30, 31 |
| Location: bone (conventional imaging, [18F]FES, [18F]FDHT PET) | 79 (55, 45, 37) | 66 (80, 54, 64) |
| Location: lymph node (conventional imaging, [18F]FES, [18F]FDHT PET) | 30 (8, 29, 16) | 25 (12, 35, 28) |
| Location: visceral (a) (conventional imaging, [18F]FES, [18F]FDHT PET) | 11 (6, 9, 5) | 9 (9, 11, 9) |

(a) Excluding liver

Comparison of lesion detection on different imaging modalities

Of the 120 lesions in total (Table 1), most were identified on [18F]FES PET (n = 64 [53%] and n = 69 [58%] by observer 1 and 2, respectively), followed by high-resolution CT (n = 54 [45%]), bone scintigraphy (n = 40 [33%]) and [18F]FDHT PET (n = 36 [30%] and n = 37 [31%]). Of the lesions identified on [18F]FES PET by observer 1 and 2, 50% and 42%, respectively, were also detected on high-resolution CT or bone scintigraphy (Fig. 1). For [18F]FDHT PET, 55% and 49% of the identified lesions were seen with conventional imaging. Conversely, 46% and 42% of the lesions identified on conventional imaging were visible on [18F]FES PET by observer 1 and 2, respectively, and 29% and 26% were seen on [18F]FDHT PET. In particular, more lymph node lesions were detected on [18F]FES PET and [18F]FDHT PET compared to conventional imaging: 97% and 53% versus 27% of all detected lymph node lesions, respectively.

Visual analysis of [18F]FES and [18F]FDHT PET images

Out of 120 lesions, a total of 87 and 74 on [18F]FES and [18F]FDHT PET, respectively, were analysed for visual interobserver agreement. The other lesions were excluded because one or both observers reported these as ‘not evaluable’ due to overlap with adjacent organs with high physiological tracer uptake.
For lesions visible on conventional imaging, [18F]FES PET readings (Table 2) had substantial positive and negative agreement of 84% (95% CI 72–92%) and 83% (95% CI 70–91%), respectively (kappa = 0.67, 95% CI 0.48–0.87). By including lesions that were only visible on [18F]FES PET, the positive agreement improved to 88% (95% CI 80–93%) for all lesions scored on [18F]FES PET (negative agreement remained the same). [18F]FDHT PET showed lower positive agreement of 49% (95% CI 32–65%) for lesions visible on conventional imaging, while negative agreement was 74% (95% CI 62–83%) (kappa = 0.23, 95% CI − 0.04–0.49). Positive agreement for all lesions scored on [18F]FDHT PET was 58% (95% CI 43–71%). By looking at lesions only visible on PET and not on conventional imaging, the positive agreement rate was the highest: 91% (95% CI 81–96%) for [18F]FES PET and 80% (95% CI 55–93%) for [18F]FDHT PET. Visual interobserver agreement was not significantly different between the 10 different patients in this study: P = 0.159 for [18F]FES PET and P = 0.387 for [18F]FDHT PET.
Table 2
Visual interobserver agreement for lesions visible (A, C) and not visible on conventional imaging (B, D) on [18F]FES and [18F]FDHT PET, respectively

A Visual interobserver agreement on [18F]FES PET for lesions visible on conventional imaging

| Observer 2 (rows) / Observer 1 (columns) | Visible | Not visible | Not evaluable (a) | Total |
|---|---|---|---|---|
| Visible | 24 | 3 | 2 | 29 |
| Not visible | 6 | 22 | 4 | 32 |
| Not evaluable (a) | 2 | 6 | 0 | 8 |
| Total | 32 | 31 | 6 | 69 |

B Visual interobserver agreement on [18F]FES PET for lesions not visible on conventional imaging

| Observer 2 (rows) / Observer 1 (columns) | Visible | Not visible | Not evaluable (a) | Total |
|---|---|---|---|---|
| Visible | 26 | 3 | 11 | 40 |
| Not visible | 2 | 1 (b) | 3 | 6 |
| Not evaluable (a) | 4 | 1 | 0 | 5 |
| Total | 32 | 5 | 14 | 51 |

C Visual interobserver agreement on [18F]FDHT PET for lesions visible on conventional imaging

| Observer 2 (rows) / Observer 1 (columns) | Visible | Not visible | Not evaluable (a) | Total |
|---|---|---|---|---|
| Visible | 9 | 8 | 1 | 18 |
| Not visible | 11 | 27 | 1 | 39 |
| Not evaluable (a) | 0 | 11 | 1 | 12 |
| Total | 20 | 46 | 3 | 69 |

D Visual interobserver agreement on [18F]FDHT PET for lesions not visible on conventional imaging

| Observer 2 (rows) / Observer 1 (columns) | Visible | Not visible | Not evaluable (a) | Total |
|---|---|---|---|---|
| Visible | 6 | 2 | 11 | 19 |
| Not visible | 1 | 10 (c) | 6 | 17 |
| Not evaluable (a) | 9 | 4 | 2 | 15 |
| Total | 16 | 16 | 19 | 51 |

(a) Not evaluable lesions due to overlap with adjacent organs with high physiological tracer uptake
(b) Lesions identified on [18F]FDHT PET
(c) Lesions identified on [18F]FES PET

An important aspect in the identification of tumour lesions is how well tracer uptake can be distinguished from background uptake in normal reference tissue. The TBR of [18F]FDHT was significantly lower than that of [18F]FES (Fig. 2). In bone lesions, the mean TBR of [18F]FDHT was 2.0 (± SD 0.6) versus 3.3 (± SD 2.2) for [18F]FES (P = 0.003). In addition, in lymph node lesions, the mean [18F]FDHT TBR was 4.6 (± SD 1.9) compared to 10.7 (± SD 8.4) for [18F]FES (P < 0.0001).

Quantitative analyses of [18F]FES and [18F]FDHT PET images

Out of 120 lesions, a total of 94 and 95 were quantified by both observers on [18F]FES and [18F]FDHT PET, respectively. The other lesions were not quantified by one or both of the observers as a result of overlap with adjacent organs with high physiological tracer uptake, unless there was a clear anatomical substrate on other imaging modalities allowing for reliable VOI definition.
In general, interobserver agreement was excellent for PET quantification (Fig. 3) of all lesions combined (i.e. visible on PET or seen on conventional imaging). The ICCs for quantification of SUVmax, SUVpeak and SUVmean on [18F]FES PET were 0.98 (95% CI 0.96–0.98), 0.97 (95% CI 0.96–0.98) and 0.89 (95% CI 0.83–0.92). For [18F]FDHT PET, the ICCs were lower with 0.78 (95% CI 0.66–0.85), 0.76 (95% CI 0.63–0.84) and 0.75 (95% CI 0.62–0.84), respectively.
In addition, [18F]FES (Fig. 4) and [18F]FDHT PET (Fig. 5) quantification was analysed separately with Bland-Altman plots for all lesions visible on PET or lesions only visible on conventional imaging (hence, PET-negative lesions). For [18F]FES PET, PET-positive lesions showed excellent quantitative interobserver agreement with mean differences < 2% and 95% limits of agreement (LOA95%) being narrower for SUVmax (LOA95% − 31.3 to 34.3%) and SUVpeak (LOA95% − 31.1 to 28.4%), compared to SUVmean (LOA95% − 46.5 to 44.3%). Larger differences were seen for PET-negative lesions, with mean interobserver differences < 14% and wider LOA95% (within ± 75%), but note that absolute differences between observers were generally small because the SUV was low. Similarly, for [18F]FDHT PET, interobserver agreement was better for PET-positive (mean interobserver differences < 7%, LOA95% within ± 45%) compared to PET-negative lesions (mean interobserver differences < 12%, LOA95% within ± 76%). SUVmax and SUVpeak showed a better interobserver agreement in comparison to SUVmean for the quantification of lesions visible on [18F]FES PET, while on [18F]FDHT PET the different SUV parameters were comparable.
Higher levels of tracer accumulation in PET positive lesions were not associated with improved interobserver agreement (for [18F]FES PET: Spearman r = 0.04, 0.26 and 0.14 for SUVmax, SUVpeak and SUVmean, respectively and for [18F]FDHT PET: Spearman r = 0.00, r = 0.03 and r = − 0.17, respectively). In addition, there was no correlation between tumour size and interobserver agreement (for [18F]FES PET: Spearman r = 0.10, r = 0.08 and r = 0.06, for SUVmax, SUVpeak and SUVmean, respectively and for [18F]FDHT PET: Spearman r = − 0.07, r = − 0.16 and r = − 0.42, respectively).

The added value of quantitative assessment in comparison to visual assessment

Based on previous studies, [18F]FES and [18F]FDHT SUVmax cut-off levels of 1.5 and 1.9, respectively, have been identified. There are however limited data on quantitative thresholds and corresponding cut-off values for SUVpeak and SUVmean. Based on linear regression of all lesions quantified in this study, an SUVmax cut-off of 1.5 on [18F]FES PET corresponded with an SUVpeak of 1.2 and an SUVmean of 1.1 (Supplementary figure S1), and for [18F]FDHT PET, an SUVmax cut-off of 1.9 corresponded with an SUVpeak of 1.6 and an SUVmean of 1.3.
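How a fitted line translates one cut-off into another can be sketched with an ordinary least-squares fit. The paired values below are hypothetical and deliberately lie on an exact line; the study's lesion data and regression coefficients are not reproduced here.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def map_cutoff(cutoff, slope, intercept):
    """Value on the fitted line corresponding to a cut-off on the x-axis."""
    return slope * cutoff + intercept


# Hypothetical paired SUVmax / SUVpeak values (illustrative only).
suv_max_values = [1.0, 2.0, 3.0, 4.0]
suv_peak_values = [0.8, 1.6, 2.4, 3.2]
slope, intercept = fit_line(suv_max_values, suv_peak_values)
print(round(map_cutoff(1.5, slope, intercept), 2))
```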
For diagnostic purposes, it is important to identify all receptor positive tumour lesions. Therefore, we compared visual and quantitative tracer uptake above/below cut-off levels (Table 3). In 3% and 1% of the lesions scored visually positive on [18F]FES PET by observer 1 and 2, respectively, SUVmax was below the threshold of 1.5. For [18F]FDHT PET, 14% of the visually positive lesions scored by observer 1 as well as observer 2 had an SUVmax below the threshold of 1.9. There were no structural differences between observer 1 and 2. The discrepancies were mostly seen in lesions located in tissue with low background uptake such as skin and lung metastases (Supplementary table S1). Conversely, in 44% and 39% of the lesions scored visually negative on [18F]FES PET by observer 1 and 2, respectively, SUVmax was ≥ 1.5. Similarly, 31% and 52% of the visually negative lesions had an SUVmax ≥ 1.9 on [18F]FDHT PET, respectively. However, in most cases (60%), we observed overlap with organs having high physiological tracer accumulation such as the liver and bowel, followed by lesions that were determined to be visually positive at second glance (32%). After correction for these effects, ≤ 4% of the visually negative lesions had an SUVmax above the cut-off for both tracers.
Table 3
Discrepancies between visual and quantitative assessments (above/below cut-off values for receptor positivity) for [18F]FES (A) and [18F]FDHT PET (B)

A [18F]FES

| | Observer 1: visible (n = 64) | Observer 1: not visible (n = 36) | Observer 2: visible (n = 69) | Observer 2: not visible (n = 38) |
|---|---|---|---|---|
| SUVmax ≥ 1.5 | 62 (97%) | 16 (44%) | 68 (99%) | 15 (39%) |
| SUVmax < 1.5 | 2 (3%) | 20 (56%) | 1 (1%) | 23 (61%) |
| SUVpeak ≥ 1.2 | 54 (84%) | 19 (53%) | 67 (97%) | 16 (42%) |
| SUVpeak < 1.2 | 10 (16%) | 17 (47%) | 2 (3%) | 22 (58%) |
| SUVmean ≥ 1.1 | 57 (89%) | 8 (22%) | 67 (97%) | 11 (29%) |
| SUVmean < 1.1 | 7 (11%) | 28 (78%) | 2 (3%) | 27 (71%) |

B [18F]FDHT

| | Observer 1: visible (n = 36) | Observer 1: not visible (n = 62) | Observer 2: visible (n = 37) | Observer 2: not visible (n = 56) |
|---|---|---|---|---|
| SUVmax ≥ 1.9 | 31 (86%) | 19 (31%) | 32 (86%) | 29 (52%) |
| SUVmax < 1.9 | 5 (14%) | 43 (69%) | 5 (14%) | 27 (48%) |
| SUVpeak ≥ 1.6 | 30 (83%) | 25 (40%) | 33 (89%) | 30 (54%) |
| SUVpeak < 1.6 | 6 (17%) | 37 (60%) | 4 (11%) | 26 (46%) |
| SUVmean ≥ 1.3 | 31 (86%) | 20 (32%) | 33 (89%) | 30 (54%) |
| SUVmean < 1.3 | 5 (14%) | 42 (68%) | 4 (11%) | 26 (46%) |

Not evaluable lesions were excluded as reported in Table 2

Comparing the impact of the different SUV parameters on discrepancies between visual and quantitative assessments showed no significant differences, with the only exception that SUVmean showed fewer visually negative lesions above cut-off on [18F]FES PET than SUVmax or SUVpeak for observer 1 (P = 0.008 and P = 0.001, respectively), but not for observer 2 (P = 0.125 and P = 0.063, respectively).
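A paired comparison of this kind can be illustrated with an exact (binomial) McNemar test. The sketch assumes an exact two-sided test and that all discordant lesions fall in one direction, so that the discordant count equals the difference between the Table 3 counts (for observer 1 on [18F]FES PET: 16 − 8 = 8 for SUVmax versus SUVmean, 19 − 8 = 11 for SUVpeak versus SUVmean). The source reports neither the discordant counts nor the test variant, so both points are assumptions, although the resulting values are consistent with the reported P values.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value from the two discordant counts."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Assumed discordant counts derived from Table 3 (see the caveats above).
for label, discordant in [
    ("observer 1, SUVmax vs SUVmean", 8),
    ("observer 1, SUVpeak vs SUVmean", 11),
    ("observer 2, SUVmax vs SUVmean", 4),
    ("observer 2, SUVpeak vs SUVmean", 5),
]:
    print(label, f"P = {mcnemar_exact(0, discordant):.4f}")
```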

Discussion

Assessing interobserver variability is an important step in the clinical application of diagnostic tools. Here, we showed that both visual and quantitative evaluation were highly reproducible between independent observers evaluating [18F]FES PET at separate centres using different scanners and software. Visual positive and negative absolute agreement was > 80%, with a kappa of 0.67. Also, the interobserver reliability of quantitative metrics was excellent for SUVmax and SUVpeak (ICC of 0.98 and 0.97, respectively) and good for SUVmean (ICC of 0.89). In comparison, staging patients with breast cancer showed similar results for bone scintigraphy (kappa 0.62–0.78) and [18F]FDG PET (kappa 0.65 and an ICC of 0.93 for the quantification of [18F]FDG uptake) [22–26].
[18F]FDHT PET also showed good interobserver reliability for quantitative assessments with ICCs ≥ 0.75. These values are slightly lower than those of [18F]FES PET, and this was probably due to the lower lesional [18F]FDHT uptake, because quantitative agreement according to Bland-Altman analyses was comparable for both tracers. The TBR of [18F]FDHT was considerably lower compared to [18F]FES. This probably explains the higher variability in visual interpretation (kappa = 0.23), mainly caused by a low visual positive agreement (49%) in lesions already identified by conventional imaging modalities, while positive agreement in lesions not identified by conventional imaging was much higher (80%), as well as negative visual agreement between observers (74%). An important impeding factor was the significantly lower TBR of [18F]FDHT in bone and lymph node lesions compared to [18F]FES PET. The TBR of [18F]FDHT in the current study (2.0 for bone and 4.6 for lymph nodes) was also lower than in prostate cancer metastases (3.3 for bone and 5.7 for soft tissue metastases) with an SUVmax three times higher in prostate cancer (7.1–9.1 versus 2.0 in the present breast cancer study) [27, 28]. This suggests that higher AR expression likely results in better interobserver reliability.
Our study had some limitations. Only a limited number of patients were included. However, receptor expression between lesions within a single patient can be heterogeneous [29], which was confirmed in the present study resulting in the coverage of a large range of data in 120 lesions [8]. In addition, we showed there was no within-patient correlation in visual assessments. A second limitation is the substantial number of ‘not evaluable’ lesions, due to overlap with adjacent organs with high physiological background. The decision for evaluability was left to each observer individually, which may have contributed to the low agreement (≤ 6%) on these ‘not evaluable’ lesions. For future studies, we recommend that all lesions with physiological background overlap from the liver, gallbladder, intestine, bladder and for [18F]FDHT also from the blood pool are regarded as not evaluable. A third limitation is the lack of robust [18F]FES and [18F]FDHT thresholds for test positivity. We used an SUVmax cut-off of 1.5 for [18F]FES and 1.9 for [18F]FDHT PET based on previous data corresponding with ER and AR positivity in biopsies and so far showing the best predictive value for response to endocrine therapy [8, 9, 30, 31]. Some studies suggested an SUVmax cut-off of 2.0 for [18F]FES PET, taking into account the background [18F]FES uptake in normal tissues which can exceed the cut-off of 1.5 [29–31]. Tissue specific cut-off values may indeed be more appropriate as there are responders to endocrine therapy with a tumour SUVmax < 2.0. In the current study, up to 20% of the visually positive lesions had an SUVmax < 2.0, while < 3% had an SUVmax < 1.5 (Supplementary table S2).
For diagnostic purposes, simple visual assessment of [18F]FES uptake may suffice to determine the receptor status of a tumour lesion (agreement was high between visual assessment and the applied SUVmax cut-off value of 1.5 for ER-positivity). True discrepancies between visibility and corresponding uptake above or below cut-off were low (< 4%), making quantification of visually negative lesions not only cumbersome, but also unnecessary. Also, quantification of lesions without visual [18F]FES uptake leads to higher interobserver variability due to differences in VOI definition. However, quantification remains a helpful tool for nuclear medicine physicians in ‘equivocal [18F]FES lesions’. In addition, quantification is useful to measure receptor availability over time for the evaluation of treatment effects. In contrast, quantification of [18F]FDHT uptake is still required in future breast cancer studies, as we have shown relatively low visual agreement.
The role of [18F]FES and [18F]FDHT PET in addition to conventional imaging modalities needs to be defined further. It has to be taken into account that besides partial volume effects and constraints due to background tracer uptake limiting their detection, receptor expression can be heterogeneous and variable during the course of the disease [11, 32]. In addition, treatment may induce changes in receptor expression, but also eradicated tumour cells can leave a visible lesion on conventional imaging (e.g. sclerotic bone lesions), in absence of viable tumour cells. In the current study with heavily pretreated patients, 42–46% and 26–29% of the lesions identified by conventional imaging were detected on [18F]FES and [18F]FDHT PET, respectively. Vice versa, only approximately 50% of the lesions observed on [18F]FES PET and [18F]FDHT PET were identified by conventional imaging.
Therefore, a potential role for [18F]FES PET may be in staging of early ER-positive breast cancer as an addition to existing imaging techniques. Standard staging with [18F]FDG PET can miss low-intermediate grade ER-positive lesions due to their low metabolic activity [33]. We are currently investigating [18F]FES PET in staging patients with low grade, ER-positive locally advanced or recurrent breast cancer versus [18F]FDG PET (NCT03726931), and in metastatic breast cancer versus addition to conventional diagnostics (NCT01957332). The non-invasive visualisation of receptor status in metastatic lesions with PET offers a number of potential clinical advantages. For example, it may help when conventional diagnostics cannot establish a final diagnosis of suspected metastatic breast cancer lesions (e.g. as a result of inaccessible biopsy sites or repeated biopsy sampling errors). Also, PET imaging may help to determine the hormone receptor status of different tumour sites within a patient and guide treatment decisions, for instance, to decide on the origin of a metastatic lesion in case of multiple primary tumours or to determine whether receptor conversion occurred in metastases from a single primary tumour [11]. If validated, this may help with multimodality treatment strategies for heterogeneous tumour sites of breast cancer, such as endocrine therapy for [18F]FES positive lesions combined with a local modality such as radiotherapy for concurrent [18F]FES negative lesions [34].

Conclusion

In conclusion, our findings demonstrate that visual and quantitative evaluation of [18F]FES PET has a high interobserver concordance and support its use in clinical practice. Although [18F]FDHT PET showed relatively low visual agreement, presumably as a result of the low AR expression and consequently low TBR in patients with breast cancer, quantitative agreement between observers was good and acceptable for further [18F]FDHT PET imaging studies in breast cancer.

Acknowledgements

We thank the patients who participated in this study and their families. In addition, we acknowledge the efforts of the clinical and imaging teams at the university medical centres participating in this study.

Ethics approval and consent to participate

All procedures performed in studies involving human participants were in accordance with the ethical standards of the Institutional Review Board of the University Medical Center Groningen and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The procedures were also approved by the Institutional Review Board of the Amsterdam UMC, location VUmc University Medical Center, Amsterdam, the Netherlands (File no. 2014.501-NL41954.042.13). Informed consent was obtained from all individual participants included in the study.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

2. Yamashita H, Yando Y, Nishio M, Zhang Z, Hamaguchi M, Mita K, et al. Immunohistochemical evaluation of hormone receptor status for predicting response to endocrine therapy in metastatic breast cancer. Breast Cancer. 2006;13:74–83.
4. Krop I, Colleoni M, Traina T, Holmes F, Estevez L, et al. Results from a randomized placebo-controlled phase 2 trial evaluating exemestane ± enzalutamide in patients with hormone receptor–positive breast cancer. Abstract GS4-07. San Antonio Breast Cancer Symposium. San Antonio, Texas; 2017.
6. Traina TA, Yardley DA, Schwartzberg LS, O'Shaughnessy J, Cortes J, Awada A, et al. Overall survival (OS) in patients (Pts) with diagnostic positive (Dx+) breast cancer: subgroup analysis from a phase 2 study of enzalutamide (ENZA), an androgen receptor (AR) inhibitor, in AR+ triple-negative breast cancer (TNBC) treated with 0-1 prior lines of therapy. J Clin Oncol. 2017;35:1089. https://doi.org/10.1200/JCO.2017.35.15_suppl.1089.
10. Chae SY, Ahn SH, Kim SB, Han S, Lee SH, Oh SJ, et al. Diagnostic accuracy and safety of 16alpha-[(18)F]fluoro-17beta-oestradiol PET-CT for the assessment of oestrogen receptor status in recurrent or metastatic lesions in patients with breast cancer: a prospective cohort study. Lancet Oncol. 2019;20:546–55. https://doi.org/10.1016/S1470-2045(18)30936-7.
15. Liu A, Dence CS, Welch MJ, Katzenellenbogen JA. Fluorine-18-labeled androgens: radiochemical synthesis and tissue distribution studies on six fluorine-substituted androgens, potential imaging agents for prostatic cancer. J Nucl Med. 1992;33:724–34.
16. Römer J, Steinbach J, Kasch H. Studies on the synthesis of 16α-[18F] fluoroestradiol. Appl Rad Isotop. 1996;47:395–9.
18. Jansen BHE, Kramer GM, Cysouw MCF, Yaqub MM, de Keizer B, Lavalaye J, et al. Healthy tissue uptake of (68)Ga-prostate specific membrane antigen (PSMA), (18)F-DCFPyL, (18)F-fluoromethylcholine (FCH) and (18)F-dihydrotestosterone (FDHT). J Nucl Med. 2019. https://doi.org/10.2967/jnumed.118.222505.
20. Landis JR, Koch GG. An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers. Biometrics. 1977:363–74.
21. Portney LG, Watkins MP. Foundations of clinical research: applications to practice. Upper Saddle River, NJ: Pearson/Prentice Hall; 2009.
22. Sawicki LM, Grueneisen J, Schaarschmidt BM, Buchbender C, Nagarajah J, Umutlu L, et al. Evaluation of 18F-FDG PET/MRI, 18F-FDG PET/CT, MRI, and CT in whole-body staging of recurrent breast cancer. Eur J Radiol. 2016;85:459–65.
23. van der Hoeven JJ, Hoekstra OS, Comans EF, Pijpers R, Boom RP, van Geldere D, et al. Determinants of diagnostic performance of [F-18] fluorodeoxyglucose positron emission tomography for axillary staging in breast cancer. Ann Surg. 2002;236:619.
28.
30.
34.
Metadata
Title
Visual and quantitative evaluation of [18F]FES and [18F]FDHT PET in patients with metastatic breast cancer: an interobserver variability study
Authors
Lemonitsa H. Mammatas
Clasina M. Venema
Carolina P. Schröder
Henrica C. W. de Vet
Michel van Kruchten
Andor W. J. M. Glaudemans
Maqsood M. Yaqub
Henk M. W. Verheul
Epie Boven
Bert van der Vegt
Erik F. J. de Vries
Elisabeth G. E. de Vries
Otto S. Hoekstra
Geke A. P. Hospers
C. Willemien Menke-van der Houven van Oordt
Publication date
1 December 2020
Publisher
Springer Berlin Heidelberg
Published in
EJNMMI Research / Issue 1/2020
Electronic ISSN: 2191-219X
DOI
https://doi.org/10.1186/s13550-020-00627-z
