Introduction

Computed tomography colonography (CTC) is an established diagnostic technique for both symptomatic and screening patients [1]. The diagnostic performance of CTC has generally necessitated comparison between CTC and subsequent optical colonoscopy (OC) [2]. Such comparisons are often performed by an experienced radiologist who uses prespecified criteria to match polyps identified by CTC with those found at subsequent OC [2, 3].

A variety of matching criteria have been described, usually based on the location, size, and morphology of polyps [4–8]. For example, a correct match may be assumed when the polyp identified by CTC is found in the same or adjacent colonic segment as the polyp detected by OC. For convenience the colon is usually divided into six segments: caecum, ascending, transverse and descending colon, sigmoid and rectum [2, 3]. For size matching, a frequently described criterion stipulates that the CTC polyp must be within 50% of the diameter measured at colonoscopy. Matching based on general morphology may also be performed.
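As a rough illustration of how the location and size criteria could be applied mechanically, the following Python sketch encodes the two rules described above. The function names, the dictionary-free layout, and the reading of "within 50%" as plus or minus 50% of the colonoscopic diameter are assumptions made for illustration only; they are not taken from the cited studies. Morphology is omitted because it depends on subjective judgment.

```python
# Illustrative sketch only. Names and the exact reading of "within 50%"
# are assumptions, not a published definition.

# The six colonic segments, ordered from caecum to rectum.
SEGMENTS = ["caecum", "ascending", "transverse", "descending", "sigmoid", "rectum"]


def segment_distance(ctc_segment, oc_segment):
    """Number of segments separating two locations (0 means same segment)."""
    return abs(SEGMENTS.index(ctc_segment) - SEGMENTS.index(oc_segment))


def location_matches(ctc_segment, oc_segment, max_distance=1):
    """True when the polyps lie in the same or an adjacent segment."""
    return segment_distance(ctc_segment, oc_segment) <= max_distance


def size_matches(ctc_mm, oc_mm, tolerance=0.5):
    """True when the CTC diameter is within +/- tolerance of the OC diameter."""
    return abs(ctc_mm - oc_mm) <= tolerance * oc_mm


def meets_criteria(ctc_segment, ctc_mm, oc_segment, oc_mm):
    """Combine the location and size rules; morphology is not modelled."""
    return location_matches(ctc_segment, oc_segment) and size_matches(ctc_mm, oc_mm)


# Hypothetical example values: a 7.1 mm caecal polyp at CTC versus a
# 3 mm polyp in the ascending colon at OC.
location_ok = location_matches("caecum", "ascending")
size_ok = size_matches(7.1, 3.0)
```

In this example the location rule is satisfied (adjacent segments) while the size rule is not, so a purely mechanical application of the criteria would reject the pair. Such a sketch cannot capture the interpretation that human readers bring to CTC and colonoscopy data.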

However, observers using identical criteria may nevertheless match different polyps because the matching procedure is subjective and requires interpretation of both CTC and colonoscopy data. Furthermore, variation in the matching criteria stipulated by different researchers hinders comparisons between different studies.

The purpose of our study was to investigate to what extent experienced readers differ when matching polyps between CTC and OC and to explore the reasons underpinning any differences. Ultimately we aimed to develop criteria to minimize matching disagreement with a view to improving study methodology and facilitating interstudy comparisons.

Methods

Eight highly experienced CTC researchers from six centers in Europe and the USA participated. We administered a questionnaire to document their current criteria for polyp matching. To investigate matching in daily practice, the eight readers were asked to match preselected cases. Reader experience varied from 250 to over 3,000 CTC interpretations and between 100 and 3,000 CTC cases with corresponding OC.

Questionnaire

The questionnaire presented multiple choice questions relating to matching criteria used by readers for their research studies (Table 1). Options were formulated based on descriptions of matching criteria from the literature [4–8].

Table 1 Questionnaire matching

Patients

A radiology researcher (M. L.) selected 28 cases from two research databases of 170 surveillance patients and 240 fecal occult blood test positive patients who had undergone both CTC and subsequent colonoscopy. These studies had been approved by the local Medical Ethics Committee and informed consent was obtained from all participants.

All CTC examinations had been read prospectively by one of four experienced observers, each with at least 100 CTC interpretations with colonoscopic verification. Observers had marked any polyp and indicated the morphology, size, location, and their confidence.

Colonoscopy with segmental unblinding was performed subsequently by a gastroenterologist, gastroenterology resident, or gastroenterology nurse under supervision. Maximal polyp diameter was estimated using opened biopsy forceps and, in some cases, additionally with a linear measure probe (Olympus America). All colonoscopies were videotaped starting from the caecum.

Case selection

As the present study aimed at evaluating concordance when matching polyps between CTC and OC, selection was biased towards cases likely to prove challenging. A research fellow experienced in matching (>250 matched CTC and OC studies) selected cases with polyps that failed to meet the typical matching criteria specified by the literature [4–8]: for example, a polyp at CTC whose location apparently differed by two segments or more from the location suggested by colonoscopy. Only technically adequate CTC examinations were selected so that matching was not confounded by, for example, insufficient distension.

Twenty-one cases were selected where a single polyp at CTC and OC had to be matched by the experienced reader. To evaluate a broad spectrum of potential matching scenarios, in 13 cases the CTC and colonoscopy data were purposely perturbed so that two different patients were combined. In this way, different morphologies and/or locations could be presented to the reader. Figure 1 illustrates examples of three cases. Seven other cases were selected that had multiple polyps at CTC and/or OC. Again, difficult cases were purposely selected whose polyps could not necessarily be matched using established criteria.

Fig. 1

a Case 2; CTC polyp: caecum, 7.1 mm, sessile. OC polyp: ascending colon, 3 mm, sessile. From left to right: 2D image, 3D image, colonoscopy image. In this case all eight readers indicated a match. b Case 15; CTC polyp: sigmoid, 5.2 mm, sessile. OC polyp: ascending colon, 6 mm, sessile. In this case only one reader indicated a match. c Case 19; CTC polyp: descending colon, 17.9 mm, pedunculated. OC polyp: pedunculated, 6 mm, pedunculated. Four of the eight readers indicated a match

Reviewing matching cases

All readers performed the observations at their own department and were free from clinical commitments during the matching procedure. Readers were free to use their own visualization software to read cases, but a laptop with View Forum software (Version 6.2, Philips, Best, Netherlands) was also available. Polyps initially found by the CTC observer (in the original research study) were presented to the readers by a researcher with information on morphology, size, location, and certainty of diagnosis scored by the observer. The experienced readers were able to remeasure the polyps if they wished. Colonoscopic information was also available to the readers: colonoscopy videos, diameter information, and location and morphology.

Polyp matching

Readers completed a data form for each case. For the 21 single-polyp cases the readers indicated whether they considered the CTC and OC polyp a correct match. If readers believed the two polyps were not the same, the researcher queried their reasoning and classified each mismatch as due to disagreement relating to: (1) diameter; (2) morphology; (3) location.

In the seven multiple-polyp cases, readers were invited to indicate which of the polyps presented to them matched and which they believed did not. Again, reasons for mismatching were explored.

Statistical analysis

Because cases were preselected, only descriptive statistics were performed. A per case analysis was performed for the 21 single-polyp cases; for each case the number of the eight observers reporting a match was determined. For size and location matching we determined the number of instances in which a reader did not adhere to their own matching criteria, prespecified by them in the questionnaire.

For the seven multiple-polyp cases, the number of matched polyps per size category (≥10 mm, 6 to <10 mm, or <6 mm) was counted per case and summarized per observer. The colonoscopic diameter served as the reference diameter, except for nonmatched CTC polyps.
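The size categories used for this tabulation can be expressed as a small helper. The following Python sketch is illustrative only: the function names are hypothetical and the example diameters are invented values, not study data.

```python
from collections import Counter


def size_category(diameter_mm):
    """Assign a reference diameter (in mm) to one of the three size categories."""
    if diameter_mm >= 10:
        return ">=10 mm"
    if diameter_mm >= 6:
        return "6 to <10 mm"
    return "<6 mm"


def count_by_category(reference_diameters_mm):
    """Count matched polyps per size category for one reader."""
    return Counter(size_category(d) for d in reference_diameters_mm)


# Hypothetical reference diameters of polyps matched by one reader.
example_counts = count_by_category([12.0, 6.0, 9.9, 3.0])
```

Note that the boundaries are half-open: a 6.0 mm polyp falls in the middle category and a 10.0 mm polyp in the largest category.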

Results

Questionnaire

All readers stated that they normally use colonoscopy reports or a case record form completed during colonoscopy for polyp matching. Seven readers used video stills of colonoscopic polyps; four readers also used colonoscopy videos; two readers also employed pathology reports for polyp diameter information.

All readers normally required endoscopic information relating to segmental location, size, and morphology. Six readers defined a flat lesion as one whose width must be at least twice its height; two also stated that the polyp must protrude less than 3 mm from the mucosa. The remaining two readers exclusively used this latter definition for flat lesions. Three readers wished to have information about the distance of the polyp from the anus and two wished to have histology information to facilitate matching. Table 2 details the different criteria described by readers for matching.

Table 2 Different matching criteria indicated by eight readers in the questionnaire

Single-polyp cases

Agreement amongst readers may concern either the presence or the absence of a match in a case; both are important aspects of matching. Disagreement amongst readers in a case means that about half of the readers considered a match to be present and the other half did not. In the 21 cases with a single polyp, readers considered a match between CTC and OC to be present in between 13 (62%) and 19 (90%) of cases, depending on the reader, i.e., there was some disagreement as to whether a match between CTC and OC was possible or not due to a perceived unacceptable discrepancy for one or more of the typical matching criteria (size, location, and morphology).

To evaluate the magnitude of this disagreement we analyzed the per case agreement or disagreement. We found that the readers agreed completely or almost completely in 15 of 21 cases with respect to the presence or lack of a match of CTC and colonoscopy findings. Complete agreement was present in five cases, in which all eight readers agreed on a match. Almost complete agreement on matching (i.e., seven of eight readers indicated a match) was present in seven cases, whereas almost complete agreement on the lack of matching was present in three cases. In six of the 21 cases, however, considerable disagreement in matching was found. In five cases only four readers indicated a match, and in one case five readers indicated a match. Figure 2 shows the number of cases in which readers agreed and disagreed when matching the CTC and colonoscopy polyp.
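The per case tally can be reproduced with a short Python sketch. The function name and category labels are illustrative assumptions. Two further assumptions are made: "almost complete agreement on the lack of matching" is taken to mean that one of eight readers indicated a match, and any split not reported here (e.g., six of eight) is lumped with "considerable disagreement".

```python
from collections import Counter

N_READERS = 8


def agreement_level(n_match, n_readers=N_READERS):
    """Classify a case by the number of readers who indicated a match."""
    # Size of the larger camp, whether it voted "match" or "no match".
    majority = max(n_match, n_readers - n_match)
    if majority == n_readers:
        return "complete agreement"
    if majority == n_readers - 1:
        return "almost complete agreement"
    # Assumption: every other split counts as considerable disagreement.
    return "considerable disagreement"


# Readers indicating a match in each of the 21 single-polyp cases,
# as reported in the text: 5 x 8/8, 7 x 7/8, 3 x 1/8 (assumed), 5 x 4/8, 1 x 5/8.
reported_votes = [8] * 5 + [7] * 7 + [1] * 3 + [4] * 5 + [5] * 1

summary = Counter(agreement_level(v) for v in reported_votes)
n_agree = summary["complete agreement"] + summary["almost complete agreement"]
```

Under these assumptions the tally gives 15 cases with complete or almost complete agreement and six cases with considerable disagreement, consistent with the figures reported above.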

Fig. 2

Agreement and disagreement amongst readers in matching 21 single-polyp cases. Agreement or disagreement in matching is presented on the x-axis. The number of cases is given on the y-axis. In the ideal situation, all readers agree on the presence (8/8) or the lack (0/8) of a match; this means complete matching agreement (gray bar). When only half the readers agree on a match and the other half do not agree (4/8) this is complete matching disagreement (gray bar). When only seven of eight readers indicate a match or no match there is almost complete matching agreement (black bar)

To explore the rationale underpinning this disagreement we evaluated data separately for cases with location, size, and morphology discrepancies. In the five cases that were selected for segmental location difference between the CTC and colonoscopy polyp, there was a high matching agreement across readers. Nearly all readers refused to match polyps where the CTC and colonoscopy location differed by three or more adjacent segments.

Regarding the ten cases where diameter disagreements ostensibly prevented matching between CTC and colonoscopy, in five cases nearly all readers in practice ignored diameter discrepancies of more than 100% between the CTC and colonoscopy polyp. In the other five cases, while diameter discrepancies of more than 100% again existed between polyps, only four of eight readers (in four cases) and five of eight readers (in one case) found matching possible, indicating poor agreement.

In cases with different morphology, agreement for matching was high except for a single case: in case 21, which showed fecal residue at CTC and a polyp at colonoscopy, only four of eight readers matched the fecal residue with the polyp.

Overall, 55 polyps (mean 6.9 polyps per reader) were matched by readers despite individual polyps not fulfilling the criteria for matching on the basis of diameter prespecified by each reader. Overall, 12 polyps (mean 1.5 polyps per reader) were matched despite not fulfilling criteria prespecified by readers for segmental location. Two polyps were matched despite not fulfilling both diameter and segmental criteria.

Multiple-polyp cases

The seven cases with multiple polyps had 11 CTC polyps and 12 colonoscopy polyps of 10 mm or larger, 18 CTC and 12 colonoscopy polyps of 6–9 mm, and nine CTC and 20 colonoscopy polyps smaller than 6 mm. The total number of polyps matched per reader varied from 27 to 35 (Table 3). In case 3, for example, one reader matched five polyps on CTC with OC, while another reader matched nine. For CTC polyps of 10 mm or larger, the number of matches per reader showed less variation, between 9 and 11 (Table 4). For polyps smaller than 6 mm the inter-reader variability was larger, with the number of matches varying from 7 to 14. The number of false negative CTC polyps of 10 mm or larger per reader ranged from one to three and the number of false positive CTC polyps of 10 mm or larger ranged from zero to two. Reasons for not matching CTC and colonoscopy polyps were mismatching due to location, size, and morphology.

Table 3 Number of matched polyps per reader in the multiple-polyp cases
Table 4 Numbers of true positive, false positive, and false negative CTC polyps per size category per reader in the multiple-polyp cases

Observations

Six readers remeasured polyps on CTC when a large diameter difference was apparent between it and the OC polyp. To resolve this, four readers disagreed with the diameter recorded by the colonoscopist and reinterpreted the size of the OC polyp shown on video. To determine polyp location precisely, two readers thoroughly examined the colonoscopic video to clarify the colonic segment or location of the polyp compared to a fold. Four readers occasionally scrutinized the morphology of the polyp at video, especially in pedunculated cases (e.g., to determine stalk length) and cases with flat polyps.

Discussion

We have investigated whether disagreement exists between experienced readers when attempting to match polyps identified by CTC and subsequent colonoscopy. Eight experienced CTC researchers apparently used similar matching criteria, based on polyp location, size, and morphology. Readers largely agreed when matching cases with single polyps. We found, however, substantial disagreement in a non-negligible minority of cases. Disagreement was also present in cases with multiple polyps but predominantly for the least relevant polyps, i.e., those smaller than 6 mm.

The CTC literature describes various matching criteria, based on expert opinion rather than an evidence-based approach. Evidence-based matching criteria are difficult to formulate because a robust reference standard for matching corresponding CTC and colonoscopy polyps poses very substantial methodological difficulties. We did not aim to validate matching criteria. Rather, we investigated how readers matched in practice and the level of disagreement between them.

While we found substantial agreement there was also non-negligible disagreement, predominantly due to a large perceived diameter difference between CTC and OC. This was not a constant observation, however, because some cases with similar discrepancies were matched by most readers. The reasons underpinning this observation were unclear, despite us asking readers for their rationale.

Discrepancy was also noted in those cases with multiple polyps. For the most clinically important polyps (≥10 mm), we observed minimal disagreement. In populations with a low prevalence of polyps, matching is less problematic because few polyps need be matched [6]. However, when the number of polyps per patient increases, matching will likely become less straightforward and we have demonstrated inter-reader disagreement.

After case matching, readers were asked for their matching criteria via a questionnaire. All readers reported practically identical matching criteria (described in the Introduction). While these criteria are apparently straightforward, in practice there are several problems. Difficulties when matching location exist because anatomical borders are ill-defined and colonoscopists frequently cannot locate the endoscope tip with precision [9]. In our study almost all readers took this into account and were prepared to match polyps that were not within the same or adjacent colonic segments.

Another problem exists when matching based on polyp diameter. Colonoscopic estimation of diameter is imprecise [10, 11]. CTC measurements are also variable but are probably more accurate [12–14]. We found that most readers remeasured CTC polyps and often redefined the colonoscopist’s assessment of diameter from the video provided. Hence apparently large diameter differences between polyps did not always preclude a match.

Matching of polyps based on morphology also differed between readers because judgment of morphology is subjective. While definitions of lesion morphology are clearly described [15], we found that readers often used different definitions for flat lesions at CTC.

Because we found that experienced readers did not always adhere to established matching criteria, we propose that disagreement is best resolved by consensus. At the very least, two readers would then have to consider whether a match between a polyp imaged by both CTC and OC was possible, which is likely to reduce error and uncertainty. Such an approach is inevitably time consuming and an alternative is to perform consensus matching only when polyps do not satisfy generally accepted criteria for matching. However, as we have stated, these criteria are not evidence-based and our study was not designed to provide such a base. Nevertheless, we propose a matching procedure (Fig. 3) suggesting consensus matching by at least two experienced readers where cases do not satisfy conventional matching criteria. Optimally, at least one observer should be a radiologist and the other a colonoscopist since both have different attributes.

Fig. 3

Matching procedure of CTC and colonoscopy polyps. a Six colonic segments are considered: caecum, ascending, transverse and descending colon, sigmoid, and rectum. b Consensus matching must be performed with at least two experienced persons, preferably one radiologist and one gastroenterologist. c CTC and colonoscopy polyp have a similar appearance/shape (judged by the observer who is performing the matching)
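The general logic of the proposed procedure, in which cases that satisfy conventional criteria are matched directly and all others are referred for consensus, might be sketched in Python as follows. This is a simplified illustration under stated assumptions: the class and function names are hypothetical, the three boolean checks stand in for the location, size, and morphology criteria, and the sketch does not reproduce the details of Fig. 3.

```python
from dataclasses import dataclass


@dataclass
class CriteriaCheck:
    """Outcome of the conventional criteria for one CTC/OC polyp pair."""
    location_ok: bool    # same or adjacent colonic segment
    size_ok: bool        # diameters sufficiently similar
    morphology_ok: bool  # similar appearance/shape, as judged by the observer


def matching_route(check):
    """Decide whether a pair can be matched directly or needs consensus."""
    if check.location_ok and check.size_ok and check.morphology_ok:
        return "match on conventional criteria"
    # Any unmet criterion sends the pair to consensus matching by at least
    # two experienced readers, preferably a radiologist and a gastroenterologist.
    return "refer to consensus matching"


# Hypothetical pair with an acceptable location and morphology but a large
# diameter discrepancy.
route = matching_route(CriteriaCheck(location_ok=True, size_ok=False, morphology_ok=True))
```

The point of this design is that the criteria act as a filter rather than a verdict: failing a criterion triggers review by two readers instead of an automatic mismatch.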

A potential limitation of our study is that we did not present pathology reports or the histological diameter of excised polyps. It is however questionable whether these data would provide useful additional information since polyps often shrink after polypectomy due to electrosurgical tissue effects and vascular collapse [16]. Another limitation is that our cases were purposely biased towards difficult cases. This was done to magnify any inter-reader variation in a pragmatic manner. Although this approach was efficient, as a consequence it was impossible to calculate meaningful metrics applicable to real-world scenarios.

In summary, we found that experienced CTC readers agree to a considerable extent when matching polyps detected by CTC to those found at subsequent OC in difficult cases, but non-negligible disagreement exists. Such disagreement may explain some of the variation in data between studies on the diagnostic accuracy of CTC. We suggest using consensus to minimize disagreement when matching those cases that do not satisfy established matching criteria.