Radiologic methods that accurately assess clinical response are essential for the evaluation of current and experimental regimens used to treat hematologic malignancies. Recent advances that combine chemotherapy with the anti-CD20-targeted agent rituximab (Rituxan) have improved the clinical outcome of patients diagnosed with diffuse large B-cell lymphoma (DLBCL), but only 60% of all DLBCL patients are potentially cured and achieve sustained progression-free survival (PFS). PFS after salvage therapies, including autologous stem cell transplantation, drops to 30%, reflecting frequent disease relapse and a poor prognosis.1 A response-adaptive imaging strategy that accurately determines the initial response to therapy and then individualizes subsequent treatment could improve PFS, reduce relapse rates and improve clinical outcomes.

Positron emission tomography (PET) integrated with computerized tomography (CT) combines anatomical delineation with the metabolic activity of tumor tissue and is the main tool for determining the therapeutic response of DLBCL patients. 18F-labeled fluorodeoxyglucose (18F-FDG) PET can differentiate viable tumor from posttreatment necrotic tissue or fibrosis, making it the imaging modality of choice upon completion of chemotherapy.2 Although there has been an increasing trend to perform interim PET/CT (interim PET) after 2–4 cycles of induction chemotherapy to monitor response and tailor consolidation therapy, the optimal interpretation method for interim PET analyses remains uncertain.3 Importantly, there is an unmet need for a quantitative, standardized and reproducible method for this purpose.4

Although semiquantitative methods, such as determination of the maximum standard uptake value (SUVmax), partially meet these criteria, studies have not defined a uniformly applicable SUVmax reduction cutoff that accurately predicts PFS or clinical outcome.5, 6 SUVmax is a single-pixel value that reflects the maximum intensity of 18F-FDG activity in the tumor; it ignores the extent of the metabolic abnormality and changes in the distribution of the tracer within a lesion. SUVmax reflects increased anaerobic metabolism and higher glucose consumption. The region of highest uptake is located in the hypoxic tumor core, where irregular angiogenesis results in leakier and less effective vasculature that may impair drug delivery.7 Thus, using SUVmax reduction to assess chemotherapy effectiveness may miss the more dynamic areas of the tumor and those with better drug delivery. Although complete disappearance of SUVmax may indicate complete response, SUVmax may not be the best index of early tumor response to a given treatment. Therefore, alternative metabolic parameters that integrate both tumor volume and intensity of uptake may provide more precise clinical information. We hypothesized that a method that maximizes the detection of all metabolically active regions within the tumor mass, defined as the metabolic tumor volume (MTV), could serve as a better predictor of clinical outcome than semiquantitative methods such as SUVmax measurement. Here, we compared the ability of MTV measurement by gradient- or threshold-based methods with that of semiquantitative SUVmax measurement on interim PET to predict the PFS of DLBCL patients after initial therapy.

A total of 197 patients with a pathology-confirmed diagnosis of DLBCL were treated from December 2006 to December 2014. Of these, 140 underwent interim PET analysis. Patient characteristics are shown in Table 1. The primary end point of the study was PFS, defined as the time from the beginning of treatment to first progression, relapse, death from any cause or last follow-up visit. Patients still alive were censored at the date of last contact. Interim PET was performed after 2–4 cycles of chemotherapy, with images acquired from the orbits to the proximal third of the thighs. All patients fasted for >6 h before intravenous injection of 18F-FDG and had glucose levels >90 and <160 mg/dl at the time of injection. Scans were performed within 90 min after injection, and granulocyte colony-stimulating factor was stopped >48 h before imaging. Interim results were interpreted as either positive or negative by visual dichotomous response criteria according to the five-point Deauville scoring system.
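PFS with censoring, as defined above, is conventionally summarized with a Kaplan–Meier curve. The following is a minimal sketch of that estimator, not the analysis code used in the study; the function name `kaplan_meier` and the toy follow-up times are hypothetical and chosen only for illustration.

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier estimate for right-censored follow-up data.

    times:  follow-up in months for each patient.
    events: 1 if progression, relapse or death was observed, 0 if censored.
    Returns the distinct event times and the survival estimate after each.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    survival = 1.0
    event_times, estimates = [], []
    for t in np.unique(times[events]):
        at_risk = (times >= t).sum()              # patients still under observation at t
        n_events = (events & (times == t)).sum()  # events occurring exactly at t
        survival *= 1.0 - n_events / at_risk
        event_times.append(float(t))
        estimates.append(survival)
    return event_times, estimates

# Toy data (illustrative only, not patient data): six patients, three events.
km_times, km_surv = kaplan_meier([6, 12, 12, 20, 30, 37], [1, 1, 0, 1, 0, 0])
```

Censored patients contribute to the at-risk count until their last contact but never trigger a drop in the curve, which is why the date of last contact matters for the end point.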

Table 1 (A) Patient characteristics and interim PET interpretation based on visual assessment among 140 evaluable patients with DLBCL. (B) PET parameters on the first (pretreatment) and second (interim) PET scans

To evaluate the contribution of metabolic activity within the tumor periphery in assessing clinical outcomes, two different methods (fixed threshold-based and gradient-based) were used to measure MTV. The fixed threshold-based method measures tumor volume using software that includes all detectable areas with 18F-FDG uptake greater than a fixed percentage of SUVmax (usually defined as 37%).8 Gradient-based methods are designed to allow a better estimation of intensity by reconstructing images that are denoised and deblurred with an edge-preserving filter and an iterative deconvolution algorithm.9 The tumor periphery, where a sharp drop in FDG uptake is seen, is taken to be the edge of the metabolically active tumor volume. Gradient-based methods appear to be more accurate than source-to-background ratio methods for segmenting FDG-PET images.10 SUVmax and MTV were determined from the initial and interim PET images using PET Edge software (MIMSoftware Inc., Cleveland, OH, USA).
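The fixed threshold-based measurement can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the commercial software's implementation: the function `threshold_mtv`, its inputs and the toy array are hypothetical, and the sketch assumes the array already covers a single lesion, so no connected-component or background handling is included.

```python
import numpy as np

def threshold_mtv(suv, voxel_volume_ml, fraction=0.37):
    """Return (SUVmax, MTV in ml) for one lesion's volume of interest.

    suv:             array of SUV values covering a single lesion (assumed input).
    voxel_volume_ml: volume of one voxel in millilitres.
    fraction:        fixed fraction of SUVmax used as the cutoff (37% in the text).
    """
    suv = np.asarray(suv, dtype=float)
    suv_max = float(suv.max())
    # Count every voxel whose uptake exceeds the fixed fraction of SUVmax.
    mask = suv > fraction * suv_max
    mtv_ml = float(mask.sum()) * voxel_volume_ml
    return suv_max, mtv_ml

# Toy 3x3x1 volume of interest; values are illustrative, not patient data.
voi = np.array([[[1.0], [2.0], [1.5]],
                [[3.0], [10.0], [4.0]],
                [[0.5], [5.0], [2.5]]])
suv_max, mtv = threshold_mtv(voi, voxel_volume_ml=0.064)
```

In this toy case the cutoff is 3.7, so three voxels are counted. Because the cutoff scales with SUVmax, a single hot voxel raises the threshold for the whole lesion, which is one reason a threshold-based volume can be smaller than a gradient-based one.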

The median follow-up period for patients in the study was 37 months. R-CHOP (Rituximab, Cyclophosphamide, doxorubicin/Hydroxydaunomycin, vincristine/Oncovin and Prednisone) and R-DA-EPOCH (Rituximab-Dose-Adjusted Etoposide, Prednisone, Oncovin, Cyclophosphamide, Hydroxydaunorubicin) were the first line of therapy in 74% and 26% of patients, respectively. On interim PET/CT, 69% of patients achieved complete response, with the remaining patients showing partial response based on visual assessment. Dichotomous visual interpretation of interim PET did not correlate with PFS (log-rank P=0.37). Compared with the threshold-based method, the gradient-based method yielded a statistically significantly greater MTV on pretreatment as well as interim PET images. However, no significant difference was noted between the reduction in MTV determined by the threshold-based (ΔMTVT) and gradient-based (ΔMTVG) methods (median 34% vs 36%, P=0.29). In contrast, the reductions measured by ΔMTVT and ΔMTVG were smaller than the reduction in SUVmax (ΔSUVmax) (median ΔSUVmax, ΔMTVT and ΔMTVG of 65%, 34% and 36%, respectively; P=0.043).
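The Δ values reported here are relative changes between the pretreatment and interim scans. The text does not spell out the formula; a minimal statement, assuming the conventional definition of percent reduction from baseline (the superscripts "pre" and "int" are labels introduced here for illustration), would be:

```latex
\Delta\mathrm{SUV}_{\max}
  = \frac{\mathrm{SUV}_{\max}^{\mathrm{pre}} - \mathrm{SUV}_{\max}^{\mathrm{int}}}
         {\mathrm{SUV}_{\max}^{\mathrm{pre}}} \times 100\%,
\qquad
\Delta\mathrm{MTV}
  = \frac{\mathrm{MTV}^{\mathrm{pre}} - \mathrm{MTV}^{\mathrm{int}}}
         {\mathrm{MTV}^{\mathrm{pre}}} \times 100\%
```

Under this assumed definition, a lesion whose SUVmax falls from 20 to 7 has a ΔSUVmax of 65%.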

As no difference was found between the two methods in determining ΔMTV and as the threshold-based method was more versatile, this method was used to correlate interim PET values with PFS. To identify an optimal cutoff that could predict PFS more accurately, receiver operating characteristic (ROC) curve analysis was used. The area under the ROC curve (AUC) provides a measure of the accuracy of a diagnostic test and ranges from 0.5 (random guessing) to 1.0 (perfect test).11 The cutoffs for ΔSUVmax and ΔMTV by this method were 72% and 52%, respectively. ΔMTV predicted PFS better than ΔSUVmax as the AUC for ΔMTV was significantly larger compared with that for ΔSUVmax (AUCΔMTV: 0.713 and AUCΔSUVmax: 0.873; P=0.0324) (Figure 1a). All patients who achieved an SUVmax reduction greater than the cutoff value determined by the ROC analysis (ΔSUVmax >72%) were then stratified into two groups based on a ΔMTV above or below 52%. Of the 115 patients who achieved a ΔSUVmax >72% on interim PET/CT imaging, 77 (67%) had a ΔMTV >52%. Importantly, patients who achieved a ΔMTV >52% had a statistically significantly greater PFS compared with patients who achieved a ΔMTV <52% (hazard ratio: 1.37; confidence interval: 1.03–1.71, P=0.02; Figure 1b).
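The ROC step can be illustrated with a short sketch. Two assumptions are made that the text does not confirm: the outcome is treated as a binary label (remained progression free or not), which ignores censoring, and the cutoff is chosen by the Youden index. The function names and toy values are hypothetical.

```python
import numpy as np

def roc_auc(scores, labels):
    """AUC as the probability that a positive case scores above a negative
    one, counting ties as 0.5 (equivalent to the Mann-Whitney statistic)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))

def youden_cutoff(scores, labels):
    """Cutoff maximizing sensitivity + specificity - 1 when
    'score > cutoff' is taken as predicting a positive outcome."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_cut, best_j = None, -1.0
    for cut in np.unique(scores):
        pred = scores > cut
        sens = (pred & labels).sum() / labels.sum()
        spec = (~pred & ~labels).sum() / (~labels).sum()
        j = float(sens + spec - 1)
        if j > best_j:  # strict comparison keeps the lowest tied cutoff
            best_cut, best_j = float(cut), j
    return best_cut, best_j

# Toy percent reductions and outcomes (1 = remained progression free).
delta_mtv = [80, 70, 60, 55, 40, 30, 20, 65]
progression_free = [1, 1, 0, 1, 0, 0, 0, 1]
auc = roc_auc(delta_mtv, progression_free)
cutoff, j_stat = youden_cutoff(delta_mtv, progression_free)
```

A time-dependent ROC method would handle censored follow-up more faithfully; the binary simplification is used here only to keep the example short.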

Figure 1

(a) ROC curves for ΔMTV and ΔSUVmax for predicting PFS. MTV was measured by two different methods: threshold-based, using 37% of SUVmax as the threshold, and gradient-based, using the PET Edge software. The software calculates spatial derivatives along the tumor radii and then defines the tumor edge on the basis of derivative levels and the continuity of the tumor edge. All measurements were performed by a single operator. The cutoffs for ΔSUVmax and ΔMTV by ROC curve analysis were 72% and 52%, respectively. ΔMTV predicted PFS better than ΔSUVmax as the AUC for ΔMTV was significantly larger compared with the AUC for ΔSUVmax (AUCΔMTV: 0.713 and AUCΔSUVmax: 0.873; P=0.0324). (b) Kaplan–Meier curves for patients who achieved an adequate SUVmax reduction (ΔSUVmax >72%), stratified into two groups based on ΔMTV. ΔMTV can predict PFS in a subset of patients who had a significant SUVmax reduction on interim PET.

In this retrospective study, we correlated the reduction in MTV and SUVmax on interim PET with PFS. MTV measurement using a gradient-based method yielded a greater tumor volume than the threshold-based method. The two methods revealed a similar percent reduction in MTV and appear equivalent with respect to interim PET results. However, MTV reduction by either method after initial treatment was a better predictor of PFS than SUVmax reduction. Further analysis also revealed the importance of MTV reduction on interim PET in predicting PFS for patients who had also achieved a significant reduction in SUVmax (Figure 1b). Although SUVmax assessment represents a significant improvement over subjective visual assessment of interim PET scans, alone it does not adequately predict PFS.12 In contrast, MTV assessment (by either gradient- or threshold-based methods) more accurately predicted PFS as it incorporates the metabolic contribution of the tumor periphery. The tumor periphery is metabolically active but is commonly not adequately assessed.

Although prior reports highlight the prognostic value of interim PET based on visual assessment, other studies, including ours, have not demonstrated a statistically significant difference in outcome between patients with positive and negative scans.13 Such results may be due to the high degree of interobserver variability inherent in visual assessment methods. The ΔSUVmax cutoff values estimated by ROC analysis and used here to distinguish good and poor responders were similar to values previously reported in independent cohorts after either two or four cycles of induction treatment.4, 14 Thus, these thresholds appear to be robust and reproducible regardless of patient age and International Prognostic Index in DLBCL patients. Our study highlights the importance of MTV assessment combined with semiquantitative measurements on interim PET to better predict the clinical outcome of DLBCL patients. Metabolic activity of the tumor periphery should be incorporated into response-adaptive strategies and prospective trials that evaluate the response to current and novel therapeutic regimens for DLBCL patients.