Review
Systematic reviews to evaluate diagnostic tests

https://doi.org/10.1016/S0301-2115(00)00463-2

Abstract

Diagnostic testing and screening are a critical part of the clinical process because inappropriate diagnostic strategies put patients at risk and entail a serious waste of resources. It is increasingly recognised that the absence of clear summaries of individual research studies on the repeatability, accuracy and impact of tests, which are often scattered across many different journals, is a major impediment to optimising patient management. Just as means to systematically review research on the effectiveness of treatments have been developed over the last decade, so attention has more recently focused on how research on diagnostic tests might also be systematically reviewed. These reviews present a huge methodological challenge. This paper describes a systematic approach to the collation, appraisal and synthesis of information contained in the primary literature about the accuracy of diagnostic strategies.

Introduction

Clinicians are expected to make a diagnosis, to give a prognosis and to provide treatment, typically in that sequence. Diagnosis is established by a combination of the patient’s characteristics, history, examination and tests, and is the cornerstone of good clinical care. Without an accurate diagnosis, doctors cannot prognosticate accurately or choose the best treatment. Indeed, a wrong diagnosis can harm patients by exposing them to the wrong therapy, while correct prediction of disease allows timely use of effective treatment. This is why clinicians need guidance about efficient diagnostic strategies and why evidence-based medicine clearly includes evidence-based diagnosis.

The lack of clear summaries of individual research studies on the repeatability, accuracy and impact of tests, which may be scattered across many different journals, is a major impediment to optimising patient management. Just as means to systematically review research on the effectiveness of treatments have been developed over the last decade, so attention needs to be focused on how research on diagnostic tests might also be reviewed systematically [1], [2], [3].

Although diagnosis often raises more questions than therapy, primary research evaluating tests is generally of poor quality [4], [5], [6], [7]. The methodological deficiencies include biases arising from study design, the use of inappropriate patient groups, the unreliability of tests, lack of blinding and partial verification of outcome. Even when individual studies of good quality exist, they often lack power, so quantitative synthesis is required to produce reliable estimates of diagnostic accuracy. Authors of primary research studies tend to claim more positive conclusions than the rigour of the study design, conduct and analysis can support [8], [9], [10], [11]. This tendency to exaggerate the findings of a study may be even greater in diagnostic research.

Although there are many reviews, commentaries and recommendations on diagnostic tests, they offer only limited guidance for practice because few apply scientific strategies to limit bias in the assembly, appraisal and synthesis of primary studies [12], [13], [14]. There is a need to identify biases in primary diagnostic test research and to synthesise the results of primary studies using quantitative methods in systematic reviews. At present there is a dearth of such reviews. In this commentary, we demonstrate how the systematic approach can be used in this area, so that readers can critically appraise published review articles about diagnostic strategies. We do not intend to provide guidance on conducting reviews.

Systematic reviews should be conducted with the same rigour as primary research. Off-the-cuff commentaries by ‘content experts’ at the invitation of journal editors are no substitute. Conducting a research overview is conceptually analogous to conducting an epidemiological survey, and involves the steps shown in Table 1. A brief description of the basic steps is provided below.

Formulation of objectives

The history, physical examination and tests (laboratory, electrophysiology, radiology, etc.) all qualify as diagnostic technologies [15], [16], [17], [18]. The ultimate aim is to assess the effectiveness of the test in improving patients’ outcomes, which is addressed in the setting of a randomised trial [16] (Table 2). Demonstrating the effectiveness of diagnostic strategies in trials involves several complexities. For example, a combination of accurate testing and effective treatment is required to

Study identification

Literature searches should identify all potentially relevant primary diagnostic test studies. Computerized bibliographic databases such as MEDLINE (National Library of Medicine), EMBASE (Excerpta Medica), and SCISEARCH (Science Citation Index) are a good starting point. Complex search strategies may be required to generate comprehensive citation lists [21]. Constructing a sensible combination of search terms requires a structured approach, which involves breaking down the review question into

Study selection

The selection criteria for inclusion in the review should be explicit (Table 3). Studies are usually selected for inclusion in a review using a two-stage process. First, the citation lists from the literature search are scrutinised and full manuscripts of those citations that appear to meet the selection criteria are obtained. Then two reviewers independently assess the manuscripts against the criteria. Their agreement should be measured [23], [24], [25], and when they disagree they should meet to resolve the disagreement.
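
The text above asks for reviewer agreement to be measured but does not name a statistic in this excerpt. One widely used chance-corrected measure for two raters is Cohen's kappa, and the sketch below assumes that choice. The function name `cohen_kappa` and the include/exclude decisions are hypothetical and serve only as an illustration.

```python
def cohen_kappa(reviewer_a, reviewer_b):
    """Cohen's kappa for two reviewers' include (True) / exclude (False) decisions.

    Assumption: kappa is used as the agreement statistic; the source text
    only says that agreement should be measured.
    """
    if len(reviewer_a) != len(reviewer_b) or not reviewer_a:
        raise ValueError("need two non-empty decision lists of equal length")
    n = len(reviewer_a)
    # Proportion of manuscripts on which the two reviewers made the same decision.
    observed = sum(a == b for a, b in zip(reviewer_a, reviewer_b)) / n
    # Agreement expected by chance, from each reviewer's inclusion rate.
    p_a = sum(reviewer_a) / n
    p_b = sum(reviewer_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical decisions on ten manuscripts (not taken from any review).
decisions_a = [True, True, True, True, False, False, False, False, False, False]
decisions_b = [True, True, True, False, False, False, False, False, False, True]
kappa = cohen_kappa(decisions_a, decisions_b)
```

Here the reviewers agree on 8 of 10 manuscripts, but because some of that agreement would be expected by chance, kappa is lower than the raw 80% agreement.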

Study quality assessment

Methodological quality is defined as the confidence that the study design, conduct and analysis have minimised biases in estimating the usefulness of the test in question. Variations in study quality may explain different results between studies. The extent to which primary research meets methodological standards will influence the strength of any recommendations from the review and help make recommendations to improve future studies. There are several tools available to assess the quality of

Data collection

A pre-designed and piloted form for extracting results from each article enables the generation of 2×2 tables to calculate diagnostic accuracy. Ideally, these should be completed in duplicate to minimise errors. Reviewers must obtain missing information from primary investigators.
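
As an illustration of how a 2×2 table yields accuracy estimates, the following minimal sketch computes sensitivity, specificity and likelihood ratios from the four cell counts. The class and function names are hypothetical, the counts are invented, and the choice of these four measures is an assumption, since the excerpt does not list which measures the authors calculate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Test result cross-classified against the reference standard."""
    tp: int  # test positive, condition present
    fp: int  # test positive, condition absent
    fn: int  # test negative, condition present
    tn: int  # test negative, condition absent


def accuracy_measures(table: TwoByTwo) -> dict:
    """Return common accuracy measures for one study.

    Raises ZeroDivisionError if a denominator is zero (for example, no
    false positives); handling of such tables is not shown here.
    """
    sensitivity = table.tp / (table.tp + table.fn)
    specificity = table.tn / (table.tn + table.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Hypothetical counts, not taken from any study.
example = accuracy_measures(TwoByTwo(tp=40, fp=10, fn=10, tn=90))
```

With these invented counts, sensitivity is 40/50 and specificity is 90/100, so a positive result is about eight times as likely in people with the condition as in those without it.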

Analysis

This usually consists of three stages: description of the identified studies, synthesis of their results (meta-analysis), and exploration of variation in results from study to study (heterogeneity) and its causes.
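
The synthesis and heterogeneity stages can be sketched numerically. The excerpt does not state which pooling method the authors recommend, so the example below assumes one simple option: fixed-effect, inverse-variance pooling of log diagnostic odds ratios, with Cochran's Q as the heterogeneity statistic. The function names, the 0.5 zero-cell correction and the study counts are all assumptions made for illustration.

```python
import math


def log_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio and its approximate variance for one study."""
    # Assumed convention: add 0.5 to every cell when any cell is zero.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    estimate = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return estimate, variance


def pool_fixed_effect(studies):
    """Inverse-variance pooled log DOR, Cochran's Q and its degrees of freedom."""
    effects = [log_dor(*study) for study in studies]
    weights = [1 / variance for _, variance in effects]
    pooled = sum(w * e for w, (e, _) in zip(weights, effects)) / sum(weights)
    # Q sums the weighted squared deviations of each study from the pooled value.
    q = sum(w * (e - pooled) ** 2 for w, (e, _) in zip(weights, effects))
    return pooled, q, len(studies) - 1


# Hypothetical (tp, fp, fn, tn) counts for three studies.
studies = [(40, 10, 10, 90), (30, 20, 20, 80), (45, 5, 15, 85)]
pooled_log_dor, q_statistic, df = pool_fixed_effect(studies)
pooled_dor = math.exp(pooled_log_dor)
```

Under the assumption of homogeneity, Q is conventionally compared with a chi-squared distribution on `df` degrees of freedom; a large Q suggests that differences between studies should be explored rather than averaged away.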

It is unfortunate that most of the attention paid to the methodological aspects of systematic reviews has concentrated on the statistical aspects of the methods used to pool data. In reality the quantitative phase of the analysis is an extension of the narrative data description. The

Conclusion

In summary, systematic reviews of diagnostic literature on a given clinical condition allow us to assess the quality of the available evidence and to identify specific tests (including items of history and physical examination) that have diagnostic value. These reviews should lead to formulation of recommendations for current practice and future research. Just as an evidence-based culture in delivery of health care has been supported by systematic reviews of literature on therapeutic

References (49)

  • M.C. Reid et al. Use of methodological standards in diagnostic test research. Getting better but still not good. JAMA (1995)
  • K.S. Khan et al. Evaluating the measurement variability of clinical investigations. Br. J. Obstet. Gynaecol. (1997)
  • J.G. Lijmer et al. Empirical evidence of design related bias in studies of diagnostic tests. JAMA (1999)
  • J.S. Bailar et al. Classification of biomedical research reports. N. Engl. J. Med. (1984)
  • J.S. Bailar. Science, statistics and deception. Ann. Intern. Med. (1986)
  • C.D. Mulrow. The medical review article: state of the science. Ann. Intern. Med. (1987)
  • A.D. Oxman et al. Guidelines for reading literature reviews. Can. Med. Assoc. J. (1988)
  • A.D. Oxman et al., for the Evidence-Based Medicine Working Group. Users’ Guides to the Medical Literature. VI. How to use an overview. JAMA (1994)
  • F.A. McAllister et al., for CARE-COPD1 Group. Why we need large simple studies of the clinical examination: the problem and a proposed solution. Lancet (1999)
  • D.G. Fryback et al. The efficacy of diagnostic imaging. Med. Decision Making (1991)
  • E.C. Vamvakas. Meta-analyses of studies of the diagnostic accuracy of laboratory tests. A review of the concepts and methods. Arch. Pathol. Lab. Med. (1998)
  • P.F.W. Chien et al. How useful is uterine artery Doppler flow velocimetry in predicting pre-eclampsia, intrauterine growth retardation and perinatal death? An overview. Br. J. Obstet. Gynaecol. (2000)
  • P.F.W. Chien et al. The diagnostic accuracy of cervico-vaginal foetal fibronectin in predicting preterm delivery: an overview. Br. J. Obstet. Gynaecol. (1997)
  • R.J. McManus et al. Review of the usefulness of contacting other experts when conducting a literature search for systematic reviews. BMJ (1998)