Introduction
The loco-regional control of mucosal primary head and neck squamous cell carcinoma (MPHNSCC) has improved significantly with primary radiation therapy (RT) and concurrent systemic therapy [1]. However, these treatments are associated with significant long-term toxicity, which can have a lasting impact on quality of life [2]. Recurrences frequently occur within the initial gross tumour volume [3], and strategies that better identify tumour radioresistance and guide the selection of appropriate treatment intensification may improve outcomes.
2-[18F] fluoro-2-deoxy-D-glucose positron emission tomography-computed tomography (FDG PET-CT) has an established role in the assessment of treatment outcome after completion of organ-preserving chemo-RT for locally advanced MPHNSCC. It is usually performed 3 to 4 months after completion of treatment and has a high negative predictive value and a low to moderate positive predictive value for tumour recurrence [4]. Obtaining FDG PET-CT at an earlier time point can result in poorer specificity due to treatment-related inflammation. Research evaluating the role of FDG PET-CT in assessing early treatment response during RT is limited to small studies with mixed results [5-8].
The maximum standardised uptake value (SUVmax) has been the most commonly used PET metabolic parameter for staging and monitoring of treatment response. Newer parameters under assessment include the metabolic tumour volume (MTV) and total lesional glycolysis (TLG), which provide volumetric information on tumour glucose metabolism. A recent meta-analysis reported that MTV and TLG from staging FDG PET-CT are additional and independent prognostic indicators for patients with MPHNSCC [9]. There are no published data evaluating the role of MTV or TLG during RT for MPHNSCC. Furthermore, no study to date has reported the prognostic value of nodal disease during RT in MPHNSCC.
This study aims to investigate the utility of FDG PET-CT performed in the third week of primary RT (iPET) as a prognostic indicator for MPHNSCC. Specifically, we aimed to assess whether the residual metabolic tumour burden measured by SUVmax, MTV and/or TLG correlated with patient outcomes. The secondary aim was to assess whether the metabolic response, as assessed by percentage reduction of these three metabolic parameters, can also predict treatment outcome.
Materials and methods
Study population
Patients with biopsy-proven, newly diagnosed, locally advanced MPHNSCC treated by primary RT with curative intent were retrospectively reviewed as part of a trial approved by the local research ethics committee (Sydney South West Area Health Service Human Research Ethics Committee). Only patients with both a staging FDG PET-CT and a mid-treatment FDG PET-CT performed during the third week of RT were included in the analysis. Patients with early-stage disease or nasopharyngeal cancer (NPC), and those treated with RT alone, were excluded.
Imaging technique
The studies were acquired in RT treatment position on either a Philips Gemini-GXL-PET-CT (n = 48) or a GE Discovery-710 PET-CT (n = 24). Patients received between 4.1 (for GE) and 5.18 (for Philips) MBq/kg of FDG after at least 4 hours of fasting. The average blood sugar level was 5.7 ± 1.2 mmol/L (range: 3.3-9.6 mmol/L). The staging and all sequential post-treatment scans were performed on the same scanner with the same acquisition and reconstruction protocols.
The PET studies were acquired in three-dimensional (3D) mode for a total acquisition time of 1.5 – 2.5 min per bed position adjusted according to the patient weight, from vertex to proximal femora at about 1-hour post injection. Transmission scans and attenuation corrections were obtained by using CT: a Philips Brilliance 6-slice CT or a 64-slice GE CT, using helical mode without the use of a contrast medium. CT images were acquired at 3.75 to 5 mm slice thickness and reconstructed to a transaxial matrix size of 512 × 512. The current (30–40 mAs) and voltage (120–140 kV) were varied according to the patient weight. The PET images were reconstructed using a Philips Line of Response-Row Action Maximum Likelihood Algorithm (LOR-RAMLA) or GE VUE Point FX (Time of Flight) algorithm into a 144 × 144 (for Philips) or 256 × 256 (for GE) matrix size with a slice thickness of 3.75 to 4.0 mm.
FDG PET image interpretation and metabolic parameter measurement
All FDG-PET images were analysed by consensus reading by two nuclear medicine physicians and a radiation oncologist, blinded to clinical data except the primary tumour site. The semi-quantitative analysis was performed on an Advantage Workstation (GE Healthcare) using the PET-VCAR (Volume Computer-Assisted Reading) software (version 1.0). The maximum SUV (SUVmax) was derived by selecting the most intensely avid area of uptake at the primary tumour on the axial slice. The SUV was calculated as \( \mathrm{SUV} = \frac{C}{A/m} \), where C is the decay-corrected activity concentration in tissue (kBq per millilitre of tissue volume), A is the injected FDG activity (kBq) and m is the body weight (g). The MTV was derived by applying a fixed SUV threshold of 2.5 as the lowest limit of the segmentation criteria. The computer-assisted, automatically derived contouring margins and regions of interest (ROIs) for both measurements were checked on three sectional images (axial, coronal and sagittal) to ensure accurate inclusion of the primary tumour and nodal sites, and exclusion of adjacent normal structures. The single-component modality was deselected to prevent the segmentation from extending into normal structures outside the ROI. The TLG was calculated according to the formula: TLG = SUVmean × MTV. The fixed SUV threshold of 2.5 was chosen because it was recommended as an appropriate criterion in a recent meta-analysis, is commonly used, and is associated with prognostic outcome [9]. SUVmax, MTV and TLG were derived for both the primary tumour (PT) and the index node (IN), defined as the lymph node or confluent nodal group with the highest TLG, reflecting the highest metabolic burden.
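The definitions above can be illustrated with a minimal Python sketch. It is not the PET-VCAR implementation: it assumes the lesion is available as a flat list of per-voxel SUVs with a known voxel volume, and the function names (`suv`, `metabolic_parameters`, `percent_reduction`) and example numbers are hypothetical. The percentage reduction helper reflects the secondary aim of comparing staging and iPET values.

```python
SUV_THRESHOLD = 2.5  # fixed lower segmentation limit stated in the text


def suv(conc_kbq_per_ml, injected_kbq, weight_g):
    """SUV = C / (A / m): tissue concentration over injected activity per gram."""
    return conc_kbq_per_ml / (injected_kbq / weight_g)


def metabolic_parameters(suv_voxels, voxel_volume_ml, threshold=SUV_THRESHOLD):
    """Return SUVmax, MTV (mL), SUVmean and TLG for one region of interest.

    suv_voxels: per-voxel SUVs inside the checked ROI (assumed input format).
    Voxels at or above the threshold form the metabolic tumour volume.
    """
    if not suv_voxels:
        raise ValueError("ROI contains no voxels")
    segmented = [v for v in suv_voxels if v >= threshold]
    suv_max = max(suv_voxels)
    if not segmented:
        # Nothing reaches the threshold: no measurable metabolic volume.
        return {"suv_max": suv_max, "mtv_ml": 0.0, "suv_mean": 0.0, "tlg": 0.0}
    mtv_ml = len(segmented) * voxel_volume_ml
    suv_mean = sum(segmented) / len(segmented)
    return {
        "suv_max": suv_max,
        "mtv_ml": mtv_ml,
        "suv_mean": suv_mean,
        "tlg": suv_mean * mtv_ml,  # TLG = SUVmean x MTV
    }


def percent_reduction(staging_value, interim_value):
    """Percentage fall of a parameter from the staging scan to iPET."""
    if staging_value <= 0:
        raise ValueError("staging value must be positive")
    return 100.0 * (staging_value - interim_value) / staging_value


if __name__ == "__main__":
    # Hypothetical numbers: 350 MBq injected into a 70 kg patient.
    print(suv(25.0, 350_000.0, 70_000.0))
    print(metabolic_parameters([1.0, 2.5, 3.5, 6.0], voxel_volume_ml=0.5))
    print(percent_reduction(8.0, 2.0))
```

Applying `metabolic_parameters` separately to the PT and to each nodal ROI, then taking the node with the highest `tlg`, would mirror the index node definition used here.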
Treatment
All patients were treated with IMRT or helical TomoTherapy®. The total dose to the gross tumour volume (GTV) was 60–70 Gy (2–2.2 Gy/fraction); high-risk cervical lymph node regions received 60–66 Gy (1.8–2 Gy/fraction), and low-risk regions received 56 Gy (1.6–1.7 Gy/fraction). All patients also received systemic therapy (chemotherapy or Cetuximab). The management of all cases was reviewed and consensus reached at our Head and Neck multidisciplinary team (HNMDT) meetings prior to commencing treatment.
Statistical analysis
The predictive accuracy of all three metabolic parameters (absolute values and percentage reductions) of primary tumour (PT) and index nodes (IN) for treatment outcomes was evaluated using receiver operating characteristic (ROC) analysis with the area under the curve (AUC) as an index of accuracy. Optimal cutoffs (OC) for analysis were derived from the ROC curves aiming for best sensitivity and specificity. Time to local, regional or distant failures and survival times were calculated from the date of staging FDG PET-CT. Disease-free survival (DFS), loco-regional failure-free survival (LRFS), metastatic failure-free survival (MFFS) and overall survival (OS) curves were estimated using Kaplan-Meier (KM) analysis and compared using the log-rank (Mantel-Cox) test.
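As an illustration of the ROC step, the sketch below computes the AUC and a cutoff from a metabolic parameter and a binary outcome. The analysis in this study was run in SPSS; this is only a plain-Python approximation. In particular, choosing the cutoff by maximising Youden's J (sensitivity + specificity − 1) is an assumption about what "best sensitivity and specificity" means, and the function names and data are hypothetical.

```python
def roc_points(values, events):
    """Return (cutoff, sensitivity, specificity) for every observed value.

    A case is called positive when its value is >= the cutoff, i.e. a higher
    residual metabolic burden is taken to predict failure (assumption).
    """
    n_pos = sum(1 for e in events if e)
    n_neg = len(events) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")
    points = []
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, e in zip(values, events) if e and v >= cutoff)
        tn = sum(1 for v, e in zip(values, events) if not e and v < cutoff)
        points.append((cutoff, tp / n_pos, tn / n_neg))
    return points


def auc(values, events):
    """AUC as the probability that an event case outranks a non-event case.

    Equivalent to the Mann-Whitney statistic; ties count as one half.
    """
    pos = [v for v, e in zip(values, events) if e]
    neg = [v for v, e in zip(values, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def optimal_cutoff(values, events):
    """Cutoff maximising Youden's J = sensitivity + specificity - 1."""
    best = max(roc_points(values, events), key=lambda pt: pt[1] + pt[2] - 1.0)
    return best[0]


if __name__ == "__main__":
    # Hypothetical iPET TLG values and failure indicators (1 = failure).
    tlg = [10, 20, 30, 40, 50, 60]
    failed = [0, 0, 0, 1, 0, 1]
    print(auc(tlg, failed), optimal_cutoff(tlg, failed))
```

Patients could then be dichotomised at the returned cutoff before comparing survival curves with the log-rank test.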
Cox proportional hazards models with 95 % confidence intervals were fitted, and multivariate analysis was performed adjusting for clinical confounders (smoking, alcohol consumption, T stage, N stage and overall AJCC stage). The Pearson correlation test (two-tailed) was used to evaluate the correlation between SUVmax, MTV and TLG. A p value ≤ 0.05 was considered statistically significant, and all tests were two-sided. Statistical analysis was performed using IBM SPSS Statistics, version 22.0.
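For readers unfamiliar with the correlation step, the Pearson coefficient between two metabolic parameters can be sketched as follows. This is a generic textbook computation rather than the SPSS procedure used in the study; it returns only the coefficient (no p value), and the function name and sample values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-patient MTV (mL) and TLG values.
    mtv = [1.0, 2.0, 3.0, 4.0]
    tlg = [2.0, 4.0, 6.0, 8.0]
    print(pearson_r(mtv, tlg))
```

Because TLG is the product of SUVmean and MTV, a strong positive correlation between MTV and TLG is expected by construction.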
Discussion
To the best of our knowledge, this is the largest series for iPET and the first study to evaluate the role of TLG or MTV during RT in MPHNSCC. It is also the first study to assess the prognostic value of mid-treatment nodal disease during RT. This study demonstrates that the residual metabolic burden of the primary tumour, as measured by the FDG-PET metabolic parameters during the third week of primary RT for locally advanced MPHNSCC, can predict loco-regional control and disease-free survival. Our findings are consistent with the hypothesis that the residual metabolic burden mid-treatment may correlate with tumour radiosensitivity. These iPET metabolic parameters could aid in stratifying patients into those with poor or good treatment outcomes and allow selection for adaptive therapy. In other words, iPET is likely to identify radioresistant disease, which may not be detected by pre-treatment PET (pre-PET), at a time point early enough to allow individually adapted radiotherapy.
Data evaluating the role of FDG PET-CT in assessing early RT response at the primary tumour site for MPHNSCC are limited, with mixed and inconclusive results, and most studies have used either visual analysis or reduction in SUV as criteria. Hentschel et al. performed three serial PET scans on 37 patients during RT for MPHNSCC (patients were divided between scans after 10–20, 30–40 or 50–60 Gy), and found that an SUVmax decrease of ≥ 50 % was prognostic of loco-regional control and survival, but did not report the predictive accuracy of the test parameters [8]. Castaldi et al. evaluated the SUV changes based on modified EORTC criteria of 30 patients after 2 weeks of RT and failed to demonstrate any significant correlation with clinical outcomes [6]. Ceulemans et al. performed visual analysis of 40 patients after 47 Gy and found that complete metabolic response had relatively low sensitivity and low positive predictive value for loco-regional control [5]. Chen et al. reviewed SUVmax and the reduction ratio of SUVmax (SRR) after a cumulative dose of 40–50 Gy during RT and found a significant correlation of SRR with DFS and OS, but not between SUVmax and oncological outcomes [10]. In contrast, Farrag et al. reported that the SUVmax level after 4 weeks or 47 Gy was significantly associated with OS [11].
In our study, the third week of treatment was chosen pragmatically, as this was thought to be the most clinically relevant time frame in which a meaningful response may be assessed, but before significant inflammation occurs from RT and also allowing enough time for adaptation of treatment. In the third week, our results show TLG is a better and statistically more significant predictor of treatment outcome than SUVmax or MTV, consistent with our hypothesis that TLG can best reflect tumour metabolic burden, rather than relying on the highest intensity in a single voxel measured by SUVmax. The prognostic value of TLG was more pronounced after the subgroup analysis (radiotherapy and chemotherapy only), since it was the only metabolic parameter showing significant associations with oncological outcomes (DFS and LRFS). Although larger studies with longer follow-up are required to validate our findings, it appears that TLG would be the most reliable prognostic indicator to assess the interim therapeutic response in MPHNSCC, especially when treated with radiation therapy and chemotherapy. Due to a small sample size (n = 15), further subgroup analyses were not performed for patients treated with Cetuximab concurrently with RT.
Consistent with some published results, our study has shown that SUVmax reduction may not be ideal for early response monitoring in MPHNSCC. To match the methodology of published data [8], we assessed an SUVmax reduction of >50 % as a prognostic indicator, and found no significant difference in outcome. This finding may be explained by tumour heterogeneity: the reduction in metabolic burden probably represents killing of the more radiosensitive component of the tumour, whereas the amount of residual metabolic burden is the more useful predictor of treatment outcome, as it may give a measure of the volume of residual and/or radioresistant tumour. Another possible contributing factor is "rebound" FDG uptake due to treatment-induced inflammation, since inflammation related to RT can affect FDG-PET measurements. A study of the retention index using weekly dynamic PET on ten xenografts in mice with MPHNSCC undergoing 5 weeks of RT reported that day 7 (after 15 Gy) appeared to be the best time point for monitoring early response, taking into consideration the treatment-induced rebound FDG uptake [12]. This study also showed that the retention index is superior to SUVmax in predicting treatment outcome.
We are aware of only one other clinical series assessing the prognostic value of residual metabolic burden. In that study, the metabolic rate (MR) derived from the plasma FDG level at iPET after 24 Gy of RT was found to be superior to SUVmax in predicting local control in MPHNSCC [7]. Routine calculation of MR in clinical practice is, however, not practical for most centres.
As shown in Table 4, all but two studies (our study and Hentschel et al.) included nasopharyngeal cancer, while our study was the only one that evaluated all metabolic parameters. The majority of studies used IMRT and/or TomoTherapy®, except two (Hentschel et al. and Brun et al.). We excluded nasopharyngeal cancers (NPC) since the majority of patients attending our centres had endemic-type NPC, which follows a different natural history with a higher rate of distant recurrence compared to other MPHNSCC [13]. Unlike other published studies (Table 4), and in order to avoid heterogeneity that could affect the oncological outcome, we also excluded early-stage patients (stage I and II) and patients treated with RT only.
Table 4
Studies evaluating the predictive role of metabolic parameters of 2-[18F] fluoro-2-deoxy-D-glucose positron emission tomography-computed tomography performed during radiation therapy in head and neck cancer
| Study | Patients (n) | Primary sites | Stage | Treatment | Follow-up (months) | Timing of PET during RT | Metabolic parameters | Endpoints | Correlation with outcome |
|---|---|---|---|---|---|---|---|---|---|
| Our study | 72 | OCC, OPC, LRC, HPC | III×18, IV×54 | CRT×57, Cet-RT×15 | 25 (6–70) | 3rd week | SUVmax, MTV, TLG | DFS, LRFS, MFFS, OS | Yes (SUVmax, MTV, TLG with DFS, LRFS) |
| Chen et al. [10] | 51 | OPC, HPC, NPC | III×16, IV×35 | RT×7, CRT×41, Cet-RT×3 | 23 (7–53) | Cumulative dose of 40–50 Gy | SUVmax, SRR | DFS, PRFS, NRFS, OS | Yes (SRR-P with DFS and OS) |
| Castaldi et al., 2012 [6] | 26 | OPC, HPC, NPC, LRC | II×1, III×7, IV×18 | CRT | 29.2 (2.8–56) | After 2 weeks | SUVmax | RFS and DFS | No |
| Hentschel et al., 2011 [8] | 37 | OCC, OPC, HPC, LRC | Not clear (only T and N stage reported) | CRT | 26 (8–50) | 10–20 Gy (week 1 or 2); 14 to 21 days (range: 1st to 6th week) | SUVmax, GTV PET | LRFS, DFS, OS | Yes (SUVmax >50 % reduction after 10–20 Gy or week 1–2 with OS) |
| Ceulemans et al., 2011 [5] | 40 | OCC, OPC, NPC, HPC, LRC | I×2, II×9, III×10, IV×19 | RT×34, CRT×16 | 26 (7–50) | Week 4 / 47 Gy | CR/NCR (visual assessment) | OS | No (CR with OS) |
| Farrag et al. [11] | 43 | NPC, LRC, HPC, OPC, OCC | Not clear (only T and N stage reported) | RT×27, CRT×16 | Median 12.7 (3–34.5) | After 4 weeks or 47 Gy | SUVmax | DFS, OS | Yes (SUVmax with OS) |
| Brun et al. [7] | 47 | OCC, OPC, HPC, LRC, others | II–III×17, IV×30 | RT×37, IC+CRT×10, RT+Sg×1, RT+Sg+BR×1 | 39.6 (14.4–82.8) | 1–3 weeks | MR, SUV | CR (complete remission), LC, OS | Yes (MR with CR, LC, OS) |
There is no published literature on the prognostic value of mid-treatment nodal response in patients with MPHNSCC undergoing primary RT. In head and neck cancer, the largest node with the highest metabolic burden (measured by TLG and defined as the "index node" in our series) is likely to be the most predictive of treatment outcome, and the most reproducible across different centres. Therefore, we decided to assess the metabolic parameters of the IN and their correlation with treatment outcomes. In our study, the response rate of the IN, rather than its residual metabolic burden, appears to predict outcome in nodal disease: the percentage reduction of MTV was the only prognostic indicator of DFS. This suggests that the therapeutic response of nodal disease may not be the same as that of the primary tumour. On the other hand, the predictive value of the residual metabolic burden measured by all three metabolic parameters improved significantly when PT and IN were combined. To our knowledge, no study has reported a functional imaging technique that is predictive of distant failure in non-nasopharyngeal MPHNSCC. In our study, in patients with nodal disease, the combined TLG of PT and IN (TLGPT+IN) was predictive of distant failure rates and overall survival. This information is likely to be useful, especially in adaptive systemic therapy trials, and should therefore be validated in larger prospective studies.
One limitation of this study is that tumour grading and HPV status are not available for the majority of patients, but we believe that correlation with HPV status may better identify patients suitable for dose de-intensification. We are currently evaluating the feasibility of deriving this information retrospectively, and correlating this with the prognostic significance of pre-PET and iPET metabolic parameters. Further biological profiling in iPET such as tumour hypoxia or proliferation indices may better explain the radiobiology for possible tumour radioresistance in patients with high residual metabolic burden, and identify the best strategy for adaptive therapy. In addition, assessment of nodal metabolic status in iPET may also provide additional prognostic information and improve correlation with treatment outcomes, particularly OS.
Although we used week 3 for iPET as a clinical time point that allows enough cell kill and enough time for adapting therapy, the optimal time to perform iPET remains to be determined. Despite this, we have demonstrated the utility of week 3 iPET in identifying patients with poor and good treatment outcomes for selection for possible adaptive therapy. Future studies with serial dynamic PET assessing all metabolic parameters may be of value to further improve the predictive sensitivity and specificity of iPET. Localisation of high-risk areas should also be evaluated, including determining the role of functional magnetic resonance imaging in targeting radioresistant subvolumes within the gross tumour.