Background
- sensitivity (probability that a person with the disease has a positive test result);
- specificity (probability that a healthy person has a negative test result);
- cancer detection rate (number of cancer cases detected per 1,000 women examined);
- recall rate (proportion of women who are recalled for further investigation); and
- cost-effectiveness?
Methods
CAD (Computer-aided detection)
Literature search and selection of articles
- population-based screening
- ≥5,000 women included
- study setting corresponding to Swedish conditions
- follow-up time ≥ 12 months
- mammography readings with one breast radiologist + CAD compared with readings by two breast radiologists.
Assessment of diagnostic accuracy
Rating quality of individual studies
High: small risk of bias
Prospective study design. Particular emphasis on the following:
● adequately described patients constituting a representative and clinically relevant sample (QUADAS items 1, 2).
● the index test should not form part of the reference standard (item 7).
● evaluators should be masked to results of index test and reference test (items 10, 11).
● the tests should be described in sufficient detail to permit replication (items 8, 9).
● sample size ≥ 5000.
● diagnostic accuracy presented as sensitivity and specificity.
Moderate: moderate risk of bias
Prospective study design.
Since no prospective studies based on digital mammography could be identified, scanned analogue images were accepted. Otherwise the same criteria as for high quality were required.
Low: high risk of selection and/or verification bias
Retrospective study design. Selected or enriched samples.
Rating evidence across studies
- High (⊕⊕⊕⊕). Based on high or moderate quality studies containing no factors that weaken the overall judgement.
- Moderate (⊕⊕⊕O). Based on high or moderate quality studies containing isolated factors that weaken the overall judgement.
- Limited (⊕⊕OO). Based on high or moderate quality studies containing factors that weaken the overall judgement.
- Insufficient (⊕OOO). The evidence base is insufficient when scientific evidence is lacking, the quality of available studies is low or studies of similar quality are contradictory.
- Sensitivity = probability that a person with a disease has a positive test result.
- Specificity = probability that a healthy person has a negative test result.
- Relative sensitivity = number of detected cancer cases per reader divided by the total number of detected cancer cases.
- Population-based mammography screening = all women in certain age groups receive a personal mailed invitation to get a mammogram at regular intervals (1.5–3 years).
- Cancer detection rate = the number of cancer cases detected per 1000 women examined.
- Recall rate = the number of women per 1000 women examined who are recalled for further investigation.
- Interval cancer = cancer cases detected between two screening occasions.
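The definitions above can be sketched as a few small functions. This is only an illustration: the class and function names are hypothetical, the counts are invented and do not come from any of the reviewed studies, and the sketch assumes that every positive reading (true or false) leads to a recall.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Outcome counts for one screening round (hypothetical structure)."""
    true_positives: int   # cancers detected at screening
    false_negatives: int  # cancers missed (e.g. found as interval cancers)
    true_negatives: int   # healthy women with a negative reading
    false_positives: int  # healthy women recalled unnecessarily

    @property
    def examined(self) -> int:
        return (self.true_positives + self.false_negatives
                + self.true_negatives + self.false_positives)


def sensitivity(c: ScreeningCounts) -> float:
    # Probability that a woman with cancer has a positive reading.
    return c.true_positives / (c.true_positives + c.false_negatives)


def specificity(c: ScreeningCounts) -> float:
    # Probability that a healthy woman has a negative reading.
    return c.true_negatives / (c.true_negatives + c.false_positives)


def cancer_detection_rate_per_1000(c: ScreeningCounts) -> float:
    # Cancer cases detected per 1000 women examined.
    return 1000 * c.true_positives / c.examined


def recall_rate_per_1000(c: ScreeningCounts) -> float:
    # Assumption: all positive readings (true and false) are recalled.
    return 1000 * (c.true_positives + c.false_positives) / c.examined


def relative_sensitivity(detected_by_reader: int, total_detected: int) -> float:
    # Cancers detected by one reader divided by all detected cancers.
    return detected_by_reader / total_detected


# Invented counts, for illustration only.
example = ScreeningCounts(true_positives=70, false_negatives=10,
                          true_negatives=9600, false_positives=320)
print(f"Se = {sensitivity(example):.3f}")
print(f"Sp = {specificity(example):.3f}")
print(f"Cancer detection rate = {cancer_detection_rate_per_1000(example):.1f}/1000")
print(f"Recall rate = {recall_rate_per_1000(example):.1f}/1000")
```

Note that sensitivity and specificity depend on the false negatives being known, which is why incomplete follow-up affects these estimates more than it affects the cancer detection rate or the recall rate.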
Results
Author, Year (ref) | Study design, Study period, Population, Readers | Index test (I) | Reference test | Results (CI = confidence interval, Se = sensitivity, Sp = specificity) | Study quality, Comments |
---|---|---|---|---|---|
Gilbert et al., 2008 [71] | Prospective, multicentre, 2006-2007. Population: initially invited 68,060 women; investigated 28,204; aged 50-70 years (1 % > 70 years). Readers: radiologists (n=17), specially trained staff (n=10). All readers had at least 6 years' experience and >5000 readings/year. | I.1: single reading + CAD, n=28,204. I.2: double reading, n=28,204. | Biopsy of suspected cases or follow-up (not all, though; number not reported). | Cancer detection rate: single reading + CAD 7.02/1000; double reading 7.06/1000; difference not statistically significant (NS). Recall rate: single reading + CAD 3.9 %; double reading 3.4 %; difference 0.5 % (95 % CI: 0.3;0.8). Accuracy: single reading + CAD Se = 87.2 %, Sp = 96.9 %; double reading Se = 87.7 %, Sp = 97.4 %. Difference in sensitivity: 0.5 % (95 % CI: -7.4;6.6), (NS). Difference in specificity: 0.5 % (CI not specified but reported NS). | Moderate. Restricted generalisability since results were based on single reading + CAD by experienced radiologists. Incomplete follow-up, particularly affecting the estimates of sensitivity. Scanned analogue mammograms. |
Gromet et al., 2008 [69] | Retrospective, 2001-05. Population: 231,221 women. Readers: single reading + CAD by specialists in mammography; double reading by specialists in mammography + radiology. | I.1: single reading + CAD, n=118,808. I.2: double reading, n=112,413. | Biopsy and follow-up. | Cancer detection rate: single reading + CAD 4.2/1000; double reading 4.46/1000 (NS). Recall rate: single reading + CAD 10.6 %; double reading 11.9 %; difference statistically significant (p=0.001). Accuracy: single reading + CAD Se = 90.4 %; double reading Se = 88.0 %; difference statistically significant. Percent of recalled with cancer: single reading + CAD 3.9 %; double reading 3.7 % (NS). | Low. Retrospective study (controlled for age and time since last screening). Follow-up time unclear. Screening situation not applicable to European conditions (i.e. recall rate higher than accepted in Europe). Invitation procedure and blinded readings unclear. Scanned analogue mammograms. |
Georgian-Smith et al., 2007 [68] | Prospective. Study period: 2001-03. Population: 6381 consecutive screening examinations. Readers: experienced breast radiologists; single reading + CAD; double reading: not independent reading. | I.1: single reading + CAD, n=6381. I.2: double reading, n=6381. | Biopsy and at least 12 months' follow-up to detect false negatives. | Cancer detection rate: single reading + CAD 2.0/1000; double reading 2.4/1000 (NS). Recall rate: single reading + CAD 7.87 %; double reading 7.93 % (NS). Accuracy: sensitivity and specificity not reported. | Low. Screening situation not applicable to European conditions. Invitation procedure not described. Population, selection criteria, withdrawals unclear. Not independent double reading but blinded to CAD. Number of recalls based on all readings. Scanned analogue radiographs. |
Khoo et al., 2005 [70] | Prospective. Study period: not reported. Population: 6,111 women (45-94 years), screening every 3rd year. Readers: radiologists (n=7) and specially trained staff (n=5). Double reading not always performed by two radiologists. | I.1: single reading + CAD, n=6111. I.2: double reading, n=6111. | Biopsy. No follow-up. | Cancer detection rate: not reported individually for the groups; total for double reading + single reading + symptomatic patients: 10/1000. Recall rate: single reading + CAD 6.1 %; double reading 5.0 %; difference statistically significant. Accuracy (relative sensitivity)*: single reading + CAD Se = 91.5 %; double reading Se = 98.4 % (NS). | Low. A so-called relative sensitivity used since 3-year follow-up not yet achieved. Relatively high screening age and long screening intervals. Unclear whether the readings were blinded. Incomplete follow-up. Scanned analogue radiographs. |
Outcome | Sample size (no. of studies) | True positive: Single reading + CAD (95% CI) | True positive: Double reading (95% CI) | Absolute difference (95%CI) | Quality of evidence | Rating based on study design/quality, indirectness, consistency, precision and publication bias** |
---|---|---|---|---|---|---|
Cancer detection rate | 28,204 (1) | 0.702% (0.6–0.8) | 0.706% (0.6–0.8) | 0.004% (NS*) | (⊕OOO) Insufficient | Study quality –1, Indirectness –1 |
Recall rate | 28,204 (1) | 3.9% (3.7–4.1) | 3.4% (3.2–3.6) | 0.5% (0.3–0.8) | (⊕OOO) Insufficient | Study quality –1, Indirectness –1, One study –1 |
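The cancer detection rates in this table are percentages, whereas the rates reported for Gilbert et al. are given per 1000 women examined (7.02/1000 and 7.06/1000). As a minimal sketch of how the two relate, the snippet below converts the per-1000 figures to percent and takes their difference in percentage points. The function name is hypothetical and the snippet only re-expresses numbers already stated in the report.

```python
def per_1000_to_percent(rate_per_1000: float) -> float:
    """Convert a rate per 1000 women examined to a percentage."""
    return rate_per_1000 / 10


# Cancer detection rates reported for Gilbert et al., 2008.
cad_percent = per_1000_to_percent(7.02)     # single reading + CAD
double_percent = per_1000_to_percent(7.06)  # double reading

# Absolute difference in percentage points, rounded to three decimals.
difference = round(double_percent - cad_percent, 3)

print(f"Single reading + CAD: {cad_percent:.3f}%")
print(f"Double reading:       {double_percent:.3f}%")
print(f"Absolute difference:  {difference:.3f} percentage points")
```

The same arithmetic applies to the recall rates, where 3.9% minus 3.4% gives the 0.5 percentage point difference shown in the table.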
Economic aspects
Discussion
Conclusions
- The scientific evidence is insufficient to determine whether CAD + single reading by one breast radiologist would yield results that are at least equivalent to those obtained in standard practice, i.e. double reading where two breast radiologists independently read the mammographic images.
- Since the medical consequences are uncertain, it is not possible to determine the cost-effectiveness or the socioeconomic consequences of replacing one of the readings with CAD in the context of mammography screening.
- Since this literature review, CAD technology has advanced further, thanks to improvements in computer software and digitalization.
- Additional prospective and preferably randomized population-based studies are essential to understand the method’s specific benefits, consequences, and costs.