Selection of reviews
Review articles were included in the analysis if they met all four of the following inclusion criteria: (a) the reviewers specified their search criteria and the databases in which the search was conducted; (b) the review focused on mammography for breast cancer screening; (c) at least two primary studies addressing the harms or benefits of mammography were cited; and (d) the reviewers drew conclusions about the harms or benefits of mammography for breast cancer screening in relation to the evidence. Outcomes related to benefits included breast cancer survival (mortality reduction) and the cost-effectiveness of screening with respect to quality of life. Outcomes related to harms included over-diagnosis, false positives, unnecessary treatments, radiation-induced cancers, anxiety or worry, and pain or discomfort. Reviews that examined only diagnostic endpoints without considering survival or harms were excluded, as were reviews that considered only high-risk populations or women who had previously been diagnosed with breast cancer. Articles were also excluded if they were guidelines, were no longer archived or accessible online, were not peer reviewed, or were written in a language other than English.
Screening and data extraction
Two investigators independently screened article titles and abstracts against the inclusion criteria, and then examined the full text of articles against the inclusion and exclusion criteria. Discrepancies were resolved by discussion at both stages.
The review design characteristics extracted included the patient age ranges covered in the evidence, the types of primary studies analysed, the set of outcome measures examined, the presence or absence of a meta-analysis, and the year the systematic review was published. Patient age ranges were assigned to one of four categories: 49 years or younger, 50 to 69 years, 70 years or older, and one other group for reviews that considered all ages or did not specify an age range. These age groups were selected to correspond with the common age ranges used in the most recent guidelines. Women aged 50 to 69 are the group for which there has been substantial disagreement about how often women should undergo mammography, and this group was a focus of our study. Where age ranges differed from the four groups, we identified the group with the largest overlap (see Additional file 1). The types of studies included in the review were classified into controlled trials, observational studies, both forms of primary studies, or cost-effectiveness analyses.
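The largest-overlap rule for age ranges can be illustrated with a minimal sketch. The numeric bounds for the youngest and oldest groups (0 and 120), the function names, and the handling of ties (the first group listed wins) are assumptions made for illustration and are not taken from the study protocol.

```python
# Hypothetical encoding of the three bounded age groups as inclusive ranges.
# The lower bound of 0 and upper bound of 120 are assumptions.
AGE_GROUPS = {
    "49 or under": (0, 49),
    "50 to 69": (50, 69),
    "70 or older": (70, 120),
}


def overlap_years(a, b):
    """Number of whole years shared by two inclusive age ranges."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return max(0, high - low + 1)


def assign_age_group(age_range):
    """Return the group with the largest overlap with age_range.

    Pass None for reviews that covered all ages or gave no age range.
    Ties go to the first group listed (an assumption, not the study's rule).
    """
    if age_range is None:
        return "all ages or unspecified"
    return max(AGE_GROUPS, key=lambda name: overlap_years(AGE_GROUPS[name], age_range))
```

Under these assumptions, a review covering women aged 40 to 74 shares 10 years with the youngest group, 20 with the middle group, and 5 with the oldest, so it would be assigned to the 50 to 69 group.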
We recorded the professional role or specialty of all individual authors and categorised them as clinicians or non-clinicians. Professional role was determined by the affiliation listed on the article, employment history, qualifications, and listed research interests. These elements were identified and interpreted from the systematic review and biographies on institutional webpages where available. Qualifications for clinicians included MD or equivalent degrees, and non-clinical qualifications included PhD and MPH. Where corresponding authors had both clinical and non-clinical qualifications, we assigned them to the clinician group if we identified a clinical affiliation on institutional webpages or recent publications, and to the non-clinician group if we could find no such evidence. Where authors were all clinicians or all non-clinicians, we labelled the review as such, and where the authors had a mix of the two types of professional roles, we labelled the review according to the professional role of the corresponding author, under the assumption that the corresponding author takes primary responsibility for the conclusions drawn in the review (Additional file 2).
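The review-level labelling rule can be written out as a short sketch. It assumes that each author has already been judged to be a clinician or a non-clinician; the function name, the role strings, and the use of a list index to mark the corresponding author are hypothetical.

```python
ROLES = {"clinician", "non-clinician"}


def label_review(author_roles, corresponding_index):
    """Label a review from its authors' already-assigned professional roles.

    author_roles: one role string per author, in author order.
    corresponding_index: position of the corresponding author in that list.
    """
    if not author_roles:
        raise ValueError("a review needs at least one author")
    unknown = set(author_roles) - ROLES
    if unknown:
        raise ValueError(f"unrecognised roles: {sorted(unknown)}")
    # All authors share one role: the review takes that role.
    if len(set(author_roles)) == 1:
        return author_roles[0]
    # Mixed authorship: defer to the corresponding author's role.
    return author_roles[corresponding_index]
```

For a mixed team such as `["clinician", "non-clinician", "non-clinician"]` with the first author as corresponding author, the sketch returns `"clinician"`.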
A financial competing interest disclosure may have described research funding, ownership, or fees from a developer of mammography systems or software. Financial competing interests were determined from disclosure statements in the article. If a competing interest was identified and was financial in nature, we assumed it to be relevant and labelled all conclusions in the systematic review as associated with a financial competing interest. We also noted the presence or absence of a disclosure statement in the systematic reviews and labelled systematic reviews without a disclosure separately.
Two investigators read each included systematic review in its entirety to evaluate the mammography recommendations contained in the conclusions. Each conclusion was judged as favourable or non-favourable by assigning it to one of eight types. Four types were considered to be favourable (evidence of benefits, benefits outweigh harms, the practice is cost-effective, no evidence of harms), and four were labelled as non-favourable (evidence of harms, harms outweigh benefits, the practice is not cost-effective, no evidence of benefits). A third investigator read any reviews for which there was a disagreement to produce a final grading. For each conclusion, we also extracted supporting conclusion statements from the systematic reviews, as well as any statements that made recommendations about who should undergo screening by mammography and how often it should be done.
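As an illustration, the eight conclusion types and the disagreement rule could be encoded as follows. The type strings paraphrase the categories listed above; the function names, and the choice to compare the two investigators at the level of conclusion type, are assumptions.

```python
FAVOURABLE_TYPES = {
    "evidence of benefits",
    "benefits outweigh harms",
    "cost-effective",
    "no evidence of harms",
}
NON_FAVOURABLE_TYPES = {
    "evidence of harms",
    "harms outweigh benefits",
    "not cost-effective",
    "no evidence of benefits",
}


def grade(conclusion_type):
    """Map one of the eight conclusion types to a favourable or non-favourable grade."""
    if conclusion_type in FAVOURABLE_TYPES:
        return "favourable"
    if conclusion_type in NON_FAVOURABLE_TYPES:
        return "non-favourable"
    raise ValueError(f"unknown conclusion type: {conclusion_type!r}")


def final_type(first, second, third=None):
    """Resolve two independent readings; a third reading settles any disagreement."""
    if first == second:
        return first
    if third is None:
        raise ValueError("readers disagree: a third reading is required")
    return third
```

For example, `grade(final_type("evidence of harms", "no evidence of benefits", "evidence of harms"))` yields `"non-favourable"`.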
Some systematic reviews examined the evidence for different age ranges or screening frequencies separately, producing a conclusion for each. These conclusions were considered separately. As a result, a systematic review may be represented in the analysis by more than one conclusion, and those conclusions may differ.
Analysis
A linear regression was used to check whether favourable conclusions became more or less common over the period of study, testing whether changes in the primary evidence influenced the likelihood of a favourable conclusion in systematic reviews. Where appropriate, we performed chi-square tests to test the association between favourable conclusions and each of the review design (evidence selection, age groups, outcomes, or the use of meta-analysis) and author (professional roles and financial competing interests) characteristics extracted from the systematic reviews.
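A chi-square test of association on a contingency table of conclusions can be sketched as below. This is a minimal illustration using only the Python standard library, not the software used in the study. It computes the Pearson statistic without a continuity correction (an assumption, since the text does not state whether one was applied), and the p-value helper is valid only for one degree of freedom, as in a 2 x 2 table. The counts are made up for illustration and are not results from the analysis.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts.

    Assumes every row and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail p-value for a chi-square statistic with exactly one degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# Made-up counts for illustration only; these are NOT results from the study.
# Rows: characteristic present / absent. Columns: favourable / non-favourable.
example_table = [[10, 20], [20, 10]]
statistic, dof = chi_square_statistic(example_table)
p_value = p_value_one_dof(statistic)
```

Characteristics with more than two categories, such as the four age groups, give more than one degree of freedom; for those, the statistic and degrees of freedom from the sketch would need to be referred to a chi-square distribution in a statistics package.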