Clinical Investigation
Reproducibility of “Intelligent” Contouring of Gross Tumor Volume in Non–Small-Cell Lung Cancer on PET/CT Images Using a Standardized Visual Method

https://doi.org/10.1016/j.ijrobp.2009.06.032

Purpose

Positron emission tomography/computed tomography (PET/CT) is increasingly used for delineating gross tumor volume (GTV) in non–small-cell lung cancer (NSCLC). The methodology for contouring tumor margins remains controversial. We developed a rigorous visual protocol for contouring GTV that uses all available clinical information and studied its reproducibility in patients from a prospective PET/CT planning trial.

Methods and Materials

Planning PET/CT scans from 6 consecutive patients were selected. Six “observers” (two radiation oncologists, two nuclear medicine physicians, and two radiologists) contoured GTVs for each patient using a predefined protocol and subsequently recontoured 2 patients. For the estimated GTVs and axial distances, least-squares means for each observer and for each case were calculated and compared, using the F test and pairwise t-tests. In five cases, tumor margins were also autocontoured using standardized uptake value (SUV) cutoffs of 2.5 and 3.5 and 40% SUVmax.
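The SUV-based autocontouring described above can be illustrated with a minimal sketch. It assumes the PET data are available as a flat list of per-voxel SUVs and that every voxel at or above the threshold belongs to the contour. The function name, the argument names, and the synthetic voxel values are hypothetical; the abstract does not describe the software actually used, and a clinical tool would also restrict the threshold to a region around the tumor.

```python
def autocontour_volume(suv_voxels, voxel_volume_ml,
                       absolute_cutoff=None, fraction_of_max=None):
    """Threshold a list of voxel SUVs; return (threshold, volume in mL).

    Exactly one of absolute_cutoff (e.g. 2.5 or 3.5) or
    fraction_of_max (e.g. 0.40 for 40% SUVmax) must be given.
    Hypothetical helper for illustration only.
    """
    if (absolute_cutoff is None) == (fraction_of_max is None):
        raise ValueError("give exactly one of absolute_cutoff or fraction_of_max")
    values = list(suv_voxels)
    if absolute_cutoff is not None:
        threshold = absolute_cutoff
    else:
        threshold = fraction_of_max * max(values)
    # Count voxels at or above the threshold and convert to a volume.
    n_inside = sum(1 for v in values if v >= threshold)
    return threshold, n_inside * voxel_volume_ml


# Synthetic example (not patient data): one hot voxel, a warm rim,
# one borderline voxel, and background. Voxels are 4 mm cubes (0.064 mL).
voxels = [10.0] + [3.0] * 7 + [2.6] + [0.5] * 55
for label, kwargs in [("SUV 2.5", {"absolute_cutoff": 2.5}),
                      ("SUV 3.5", {"absolute_cutoff": 3.5}),
                      ("40% SUVmax", {"fraction_of_max": 0.40})]:
    threshold, volume = autocontour_volume(voxels, 0.064, **kwargs)
    print(f"{label}: threshold={threshold:.2f}, volume={volume:.3f} mL")
```

Even in this toy case the three rules produce different volumes, because a fixed cutoff and a fraction of SUVmax respond differently to the peak uptake.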

Results

Variation between observers was small relative to both the mean (coefficient of variation [CV] = 3%) and the total variation (intraclass correlation coefficient [ICC] = 3%). For estimation of the superior/inferior (SI), left/right (LR), and anterior/posterior (AP) borders of the GTV, differences between observers were also small (AP, CV = 2%, ICC = 0.4%; LR, CV = 6%, ICC = 2%; SI, CV = 4%, ICC = 2%). Autocontoured GTVs generated using SUV 2.5, SUV 3.5, and 40% SUVmax differed widely in each case. An SUV contour of 2.5 was most closely correlated with the mean GTV defined by the human observers.
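One way to read the CV and ICC figures is as ratios built from an observer variance component. The sketch below assumes a complete observers-by-cases grid with one contour per cell, estimates variance components from two-way ANOVA mean squares, and reports the observer standard deviation relative to the grand mean and the observer share of total variance. This is an assumption about how such quantities can be computed, not the authors' stated analysis (the abstract mentions least-squares means, the F test, and pairwise t-tests); the function name and the toy data are hypothetical.

```python
from math import sqrt


def observer_variation(gtv):
    """gtv[i][j] = volume drawn by observer i on case j (complete grid).

    Returns the observer CV and the observer share of total variance,
    using method-of-moments variance components (illustrative only).
    """
    n_obs, n_case = len(gtv), len(gtv[0])
    grand = sum(sum(row) for row in gtv) / (n_obs * n_case)
    obs_mean = [sum(row) / n_case for row in gtv]
    case_mean = [sum(gtv[i][j] for i in range(n_obs)) / n_obs
                 for j in range(n_case)]

    # Mean squares for observers, cases, and the residual.
    ms_obs = n_case * sum((m - grand) ** 2 for m in obs_mean) / (n_obs - 1)
    ms_case = n_obs * sum((m - grand) ** 2 for m in case_mean) / (n_case - 1)
    ss_err = sum((gtv[i][j] - obs_mean[i] - case_mean[j] + grand) ** 2
                 for i in range(n_obs) for j in range(n_case))
    ms_err = ss_err / ((n_obs - 1) * (n_case - 1))

    # Variance components, truncated at zero.
    var_obs = max(0.0, (ms_obs - ms_err) / n_case)
    var_case = max(0.0, (ms_case - ms_err) / n_obs)
    total = var_obs + var_case + ms_err
    share = var_obs / total if total > 0 else 0.0
    return {"cv": sqrt(var_obs) / grand, "observer_share": share}


# Toy data (not from the study): 3 observers x 3 cases, purely additive.
example = [[9, 19, 29],
           [10, 20, 30],
           [11, 21, 31]]
result = observer_variation(example)
print(result)
```

In the toy grid the cases differ far more than the observers do, so the observer share of total variance is small even though each observer is consistently offset from the others.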

Conclusions

Observer variation contributed little to total variation in the GTV and axial distances. A visual contouring protocol gave reproducible results for contouring GTV in NSCLC.

Introduction

Positron emission tomography (PET) using the radiopharmaceutical 18F-fluorodeoxyglucose (FDG) is significantly more accurate than computed tomography (CT) alone in the diagnosis (1), staging (2, 3), and restaging (4) of lung cancer. PET/CT provides the most complete and reliable indication of the extent of gross tumor in non–small-cell lung cancer (NSCLC) available from any imaging investigation (5), and the use of FDG-PET/CT for radiation therapy (RT) planning is a dynamic area of research (6). The use of CT alone for tumor contouring is associated with extremely poor reproducibility (7), but when PET information is added, there is much greater congruence between observers (8, 9). However, because tumor margins can appear indistinct on PET, it can be difficult to define the limits of the gross tumor volume (GTV). Additionally, moving tumors, such as lung cancers, often appear to occupy more space on PET than on CT and can have particularly indistinct edges (10).

The two main approaches to contouring the GTV using PET employ either automated methods or visual methods that require human judgement. The range of automated or semiautomated methods available for use in lung cancer includes various levels of standardized uptake values (SUV) (11) or SUV thresholds (6) or lesion-to-background ratios. More complex methods may give better results (12, 13, 14) and for a given dataset are highly reproducible. However, different algorithms can give widely different results when used in the same patients (15). No automated method provides a comprehensive and reliable solution to the problem of defining GTV in moving tumors (16). All purely “automated” algorithms exclude other qualitative and quantitative information available from PET/CT images and disregard clinical information such as bronchoscopy findings. Additionally, measured SUV varies with the PET scanner used and the conditions under which it is measured (17). Tumors may have heterogeneous FDG uptake (18), and for small lesions SUV is incorrectly measured because of partial volume effects (19). Automated tumor contouring is attractive because it restricts human variability. However, little attention has been paid to the reproducibility of tumor contouring by human observers, a method that is actually widely used.

In a previous study of CT-based GTV contouring in NSCLC, we reported that interobserver variation was dramatically reduced by strict contouring guidelines (20). We have developed a rigorous visual PET/CT contouring protocol, based on that experience, that defines standardized settings for window level at the treatment planning workstation, provides all available clinical and imaging information to the person doing the contouring, and uses both PET and CT information in the contouring process. Here we describe our visual contouring protocol and report the results of a study in which observers used the protocol to contour the GTV in a group of patients with NSCLC, to see how reproducibly they could define both the tumor volume and the tumor location in three-dimensional space. For comparison with the visual contouring method, we used an automated algorithm to delineate tumor margins using a range of SUV-based parameters.


Patients

This study used PET/CT images from consecutive patients entered in a large prospective clinical trial of imaging in RT planning and was approved by the institutional ethics committee. The first patient was selected at random. Eligible patients had Stage I–III NSCLC and remained suitable for radical chemoradiation after a combined staging/treatment planning PET/CT scan. Two of the patients, C and E, had significant degrees of atelectasis. In Patient E, a tumor had caused collapse of the right

GTV, log(GTV) and IGTV

Images of tumor contours drawn by the six observers on PET/CT images of all 6 patients are shown in Fig. 1. Least-squares GTV means for the six observers and the six cases are presented in Table 1a. Residual vs. fitted value plots indicated that the variation in GTV increased with mean GTV, but this relationship seemed to be uncoupled by repeating the analyses on the natural logarithms of the GTV values. In the following discussion, all comments and conclusions are based on the analyses of

Discussion

This study was designed to investigate the reproducibility of rigorous visual PET/CT contouring in RT planning for NSCLC. The reasons for avoiding purely automated contouring were described in a recent article (22). Our contouring methodology has gradually been refined after the earlier experience with coregistered PET and CT images (23, 24) and experience with contouring using CT alone (20). When our CT contouring method was adapted for use with PET/CT, it became the responsibility of a

Conflict of interest: none.
