Background
Positron emission tomography (PET) with fluorine-18 fluorodeoxyglucose ([18F]FDG) in combination with computed tomography (CT) is increasingly utilized for radiation treatment planning in patients with head and neck squamous cell carcinoma (HNSCC). The use of PET-CT scanners with different hardware specifications or methods of image acquisition and reconstruction can result in undesired variation of quantitative [18F]FDG uptake metrics [1]. To ensure the reproducibility of quantitative data between PET scanners, the European Association of Nuclear Medicine (EANM) has initiated the EANM Research Ltd. (EARL) harmonization program. This program provides guidelines on how to perform PET imaging, aiming to harmonize patient preparation, scan acquisition, and image reconstruction [2].
The first version of the EANM guidelines (EARL1) was introduced in 2010 [3, 4]. Over the years, multiple technological advances in PET-CT imaging, in both hardware and software, have improved contrast recovery, spatial resolution, and lesion detectability [5, 6]. Among these developments are time-of-flight, point spread function modeling, smaller voxel sizes, and digital silicon photomultiplier detectors. An updated version of the EANM guidelines (EARL2) was introduced in 2019 to take these developments into account [6, 7]. Compared to EARL1, application of the EARL2 image reconstruction methods can result in significant changes in quantitative [18F]FDG uptake metrics, such as the maximum standardized uptake value (SUVmax), SUVmean, metabolic target volume (MTV), and total lesion glycolysis (TLG) [8]. Changes in quantitative PET readings can have important clinical implications for tumor staging and treatment. Ly et al. reported that the use of EARL2 versus EARL1 reconstructions for lymphoma lesions led to an upgrade in Deauville score in 33% of the patients, resulting in a treatment intensification in 9% of the patients [9].
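To make the four uptake metrics concrete, the following minimal sketch computes them from the SUVs of the voxels inside a lesion contour. It assumes the common convention that TLG is the product of SUVmean and MTV; the function name, the input list, and the voxel volume are hypothetical and are not taken from the study.

```python
def uptake_metrics(suv_values, voxel_volume_ml):
    """Summarise a segmented lesion from the SUVs of its voxels.

    suv_values: SUVs of the voxels inside the lesion contour (hypothetical input).
    voxel_volume_ml: volume of a single voxel in millilitres (1 ml = 1 cm^3).
    """
    if not suv_values:
        raise ValueError("the lesion contains no voxels")
    suv_max = max(suv_values)                      # hottest voxel
    suv_mean = sum(suv_values) / len(suv_values)   # average uptake in the contour
    mtv = len(suv_values) * voxel_volume_ml        # metabolic target volume
    tlg = suv_mean * mtv                           # assumed convention: SUVmean x MTV
    return {"SUVmax": suv_max, "SUVmean": suv_mean, "MTV": mtv, "TLG": tlg}


# Illustrative values only: four voxels of 0.5 ml each.
metrics = uptake_metrics([4.0, 6.0, 8.0, 6.0], voxel_volume_ml=0.5)
```

Because SUVmax depends on a single voxel while MTV and TLG depend on the contour, a change in reconstruction can move these metrics by different amounts.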
Changes in quantitative [18F]FDG uptake metrics resulting from EARL2 image reconstruction methods may also affect treatment of patients with HNSCC. For radiation treatment, [18F]FDG PET-CT imaging can be used for primary tumor segmentation and to guide dose escalation to a metabolic subvolume within the tumor [10–12]. In addition, enhanced contrast ratios with EARL2 can improve the detection of nodal metastases and consequently alter nodal staging and radiation treatment.
Therefore, the aim of this study is to compare quantitative [18F]FDG uptake metrics of the primary tumor and lymph nodes in patients with HNSCC using EARL2 versus EARL1 reconstructed images and to describe clinical implications for nodal staging and treatment.
Discussion
This study demonstrates a significant increase in SUVmax (16.5%) and SURmax (9.6%) of primary tumor and lymph nodes on EARL2 reconstructed imaging compared to EARL1 in patients with HNSCC. Absolute differences in volume and spatial overlap of MTVs were small between EARL1 and EARL2 reconstructed images, irrespective of the segmentation method used. Relative differences in MTVs were small using the adaptive threshold method and larger when using static SUV or SUR thresholds. Moreover, as a result of a higher SUVmax on EARL2 reconstructed images, more lymph nodes were likely to be scored as (probably) malignant with visual interpretation. This would have changed the N-classification in 18% (6/33) and affected radiation treatment in 24% (8/33) of the patients. These observations in a cohort of head and neck cancer patients are in line with the results of several previous phantom and clinical studies in other tumor sites, such as lymphoma and non-small cell lung cancer [6, 8, 9].
The SUVmax was on average 16.5% higher on EARL2 compared to EARL1 reconstructed images, with a strong correlation for both SUVmax and SURmax between EARL1 and EARL2. In patients with lymphoma and non-small cell lung cancer, Kaalep et al. found that SUVmax on EARL2 was on average 34% higher compared to EARL1 [8]. In line with our study, the largest differences in maximum [18F]FDG uptake between EARL1 and EARL2 were observed in smaller lesions. This can be explained by the better resolution of EARL2 reconstructed images, which reduces the partial volume effect [5, 6]. The current study demonstrates a smaller but still significant increase in the maximum [18F]FDG uptake on EARL2 when using a target-to-background ratio (SURmax) compared to SUVmax. A few other studies have demonstrated that the use of tumor-to-liver ratios also does not completely mitigate the effect of different EARL reconstructions [8, 19].
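The size dependence described above can be illustrated with a one-dimensional toy model of the partial volume effect. This is a sketch under stated assumptions, not the reconstruction used in the study: the lesion is a rectangular uptake profile on a uniform background, image resolution is modeled as a Gaussian blur, and the two full width at half maximum (FWHM) values of 7 mm and 4 mm are hypothetical stand-ins for a lower and a higher resolution image, not EARL specifications.

```python
import math


def gaussian_kernel(fwhm_mm, spacing_mm):
    """Normalised 1D Gaussian kernel, truncated at about 4 sigma."""
    sigma = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    radius = int(math.ceil(4.0 * sigma / spacing_mm))
    weights = [math.exp(-0.5 * (i * spacing_mm / sigma) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(profile, kernel):
    """Convolve a profile with a kernel, replicating the edge values."""
    radius = len(kernel) // 2
    last = len(profile) - 1
    blurred = []
    for i in range(len(profile)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += weight * profile[j]
        blurred.append(acc)
    return blurred


def lesion_profile(diameter_mm, spacing_mm=1.0, length_mm=100.0,
                   uptake=10.0, background=1.0):
    """Rectangular lesion of the given diameter centred in a uniform background."""
    n = int(length_mm / spacing_mm)
    width = round(diameter_mm / spacing_mm)
    start = (n - width) // 2
    profile = [background] * n
    for i in range(start, start + width):
        profile[i] = uptake
    return profile


def peak_after_blur(diameter_mm, fwhm_mm, spacing_mm=1.0):
    """Maximum of the blurred profile, a stand-in for the measured SUVmax."""
    profile = lesion_profile(diameter_mm, spacing_mm)
    return max(blur(profile, gaussian_kernel(fwhm_mm, spacing_mm)))


# Hypothetical lesion sizes (6 mm and 30 mm) and resolutions (7 mm and 4 mm FWHM).
small_low_res = peak_after_blur(6, fwhm_mm=7.0)
small_high_res = peak_after_blur(6, fwhm_mm=4.0)
large_low_res = peak_after_blur(30, fwhm_mm=7.0)
large_high_res = peak_after_blur(30, fwhm_mm=4.0)

small_gain = small_high_res / small_low_res
large_gain = large_high_res / large_low_res
```

In this toy model the peak of the small lesion rises noticeably when the blur is reduced, whereas the peak of the large lesion already matches the true uptake at both resolutions, which mirrors the qualitative pattern reported for smaller lesions.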
Absolute differences of primary tumor MTVs between EARL1 and EARL2 were small (< 1.0 cm³), independent of the segmentation method used. This is clinically important for radiation dose escalation to MTVs within the primary tumor volume based on [18F]FDG PET imaging. Although absolute differences in MTVs between both EARL reconstructions were small, the differences were still statistically significant for most segmentation methods (7/9). In contrast to static SUV or SUR thresholds, we observed that MTVs were smaller on EARL2 using relative threshold methods (i.e., MAX40% and MAX50%). This is in line with the results reported by Kaalep et al. [8]. However, they reported a median difference in MTV on EARL2 of -27% compared to EARL1 with the MAX41% segmentation method, compared to only -7.5% in our study [8]. Recently, Ferrandez et al. calculated changes in MTV between EARL1 and EARL2 in 56 lymphoma lesions [20]. For the MAX41% and SUV2.5 methods, MTVs decreased by 27% and 4%, respectively. The smaller differences in MTVs observed between EARL1 and EARL2 in the current study may result from the use of time-of-flight and point spread function in both EARL reconstructions, which was not the case in the other studies. Therefore, differences in MTVs in our study were most likely the result of different pixel and filter sizes only. Finally, patients with lymphoma and non-small cell lung cancer generally have larger tumor volumes than patients with HNSCC. Although absolute tumor volumes were not reported by Kaalep et al., and thus cannot be compared to the current data, this could potentially have contributed to the different findings in our study.
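The difference between relative and static thresholds can be sketched as follows. The example assumes that a voxel belongs to the MTV when its SUV is at or above the cutoff; the function names and the two SUV lists, which mimic a smoother and a sharper reconstruction of the same lesion, are hypothetical and carry no measured data.

```python
def mtv_relative_threshold(suv_values, voxel_volume_ml, fraction):
    """MTV of voxels at or above fraction x SUVmax (fraction=0.40 for MAX40%)."""
    cutoff = fraction * max(suv_values)
    return sum(1 for v in suv_values if v >= cutoff) * voxel_volume_ml


def mtv_static_threshold(suv_values, voxel_volume_ml, suv_cutoff):
    """MTV of voxels at or above a fixed SUV cutoff (suv_cutoff=2.5 for SUV2.5)."""
    return sum(1 for v in suv_values if v >= suv_cutoff) * voxel_volume_ml


def relative_difference_percent(new, reference):
    """Relative difference of `new` with respect to `reference`, in percent."""
    return 100.0 * (new - reference) / reference


# Hypothetical SUVs along one line through a lesion, voxel volume 0.5 ml.
smoother = [1.0, 2.0, 3.5, 5.0, 6.0, 5.0, 3.5, 2.0, 1.0]
sharper = [1.0, 1.5, 3.0, 6.0, 8.0, 6.0, 3.0, 1.5, 1.0]

max40_change = relative_difference_percent(
    mtv_relative_threshold(sharper, 0.5, 0.40),
    mtv_relative_threshold(smoother, 0.5, 0.40),
)
suv25_change = relative_difference_percent(
    mtv_static_threshold(sharper, 0.5, 2.5),
    mtv_static_threshold(smoother, 0.5, 2.5),
)
```

A relative threshold moves with SUVmax, so a higher peak raises the cutoff and can shrink the contour, whereas a static threshold is unaffected by the peak and responds only to voxels crossing the fixed cutoff.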
In the literature, several post-acquisition harmonization methods have been described to minimize variability in MTVs when using EARL2 vs. EARL1 reconstructed images. Kaalep et al. performed post-filtering of EARL2 reconstructed images with a 6–7 mm Gaussian filter in order to generate EARL1 compliant quantitative data from EARL2 images [8]. This would obviate the need to perform an EARL1 compliant reconstruction, while both EARL2 and EARL1 images are still available to allow comparison of quantitative data with historic cohorts. Recently, Ferrandez et al. investigated the ComBat harmonization method, aiming to align MTVs from EARL1 and EARL2 reconstructed images [20]. ComBat harmonization resulted in an improved agreement of MTVs from different reconstructions for most segmentation methods. The advantage of ComBat is that it applies directly to quantitative metrics already extracted from the images, based on assumptions and estimations of batch effects, without the need for access to the images themselves [21]. A limitation is that the transformation is specific to each type of tissue, tumor, scanner and segmentation method. In a prospective setting, such as in our study, we strongly believe in the importance of upfront harmonization strategies (like EARL) and advise that both EARL1 and EARL2 reconstructed images are acquired for each patient. This allows for a direct comparison of quantitative [18F]FDG uptake metrics on both images, next to morphological features of the lesion. However, in a retrospective setting, post-acquisition harmonization methods such as ComBat and post-filtering can be useful when comparing quantitative metrics based on the latest EARL protocol (e.g. EARL2 or in the future EARL3) with historic cohorts.
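As a rough sketch of the idea behind ComBat, the method is usually described as a location-scale model of batch effects. The notation below is an assumption for illustration and is not taken from the cited studies; covariate terms and the empirical Bayes shrinkage of the batch estimates are omitted for brevity.

```latex
y_{ij} = \alpha + \gamma_i + \delta_i \, \varepsilon_{ij},
\qquad
y_{ij}^{\mathrm{harm}} = \frac{y_{ij} - \hat{\alpha} - \hat{\gamma}_i}{\hat{\delta}_i} + \hat{\alpha}
```

Here $y_{ij}$ denotes a quantitative metric (for example the MTV) of lesion $j$ in batch $i$, where a batch would correspond to one reconstruction protocol, $\alpha$ is the overall mean, $\gamma_i$ and $\delta_i$ are the additive and multiplicative batch effects, and $\varepsilon_{ij}$ is the residual. Because $\hat{\gamma}_i$ and $\hat{\delta}_i$ are estimated from the extracted values themselves, the resulting transformation is tied to the data set it was fitted on.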
For the majority of segmentation methods (8/9), CE values ranged between 0.10 and 0.20, indicating good spatial overlap of MTVs on both EARL images. This is especially important in radiation treatment planning, as false-negative and false-positive volumes may impact tumor control probability or treatment-induced toxicity.
For TLG, differences between EARL1 and EARL2 were also dependent on the segmentation method used. Relative differences were smallest using the MAX40% method and larger using static SUV thresholds. Kaalep et al. reported a median relative difference in TLG on EARL2 of 23% compared to EARL1 with a static threshold of SUV4 [8]. For the MAX41% method, the TLG on EARL2 decreased by only 2%. These results are comparable to the findings in our study when using SUV3.5/4.5 and MAX40% thresholds. As TLG reflects the total [18F]FDG accumulation in the lesion, which should be equal for both EARL reconstructions, it should be less sensitive to different reconstruction methods and lesion size compared with SUVmax [8, 22, 23]. Based on our results, the MAX40% method may be a good candidate for estimating the TLG because the differences between EARL1 and EARL2 were small. This is relevant because there is increasing interest in TLG in the literature, as several studies reported that changes in TLG during treatment are predictive of loco-regional control and overall survival in patients with HNSCC [24, 25].
Our analysis demonstrated that quantitative visual evaluation of cervical lymph nodes on EARL2 compared to EARL1 would have changed the N-classification in 18% and affected radiation treatment in 24% of the patients. Similarly, Ly et al. showed that in 52 lymphoma patients EARL2 versus EARL1 reconstructions led to an upgrade in Deauville score in 18 patients (33%), resulting in a treatment intensification in 5 patients (9%) [9]. As such, caution is warranted when applying quantitative [18F]FDG uptake thresholds that are based on EARL1 imaging directly to EARL2 diagnostic imaging, as this comes with a risk of upstaging and overtreatment. Therefore, EARL1 based quantitative thresholds should be re-evaluated before being implemented on EARL2 imaging.