What is new?
- The capture–mark–recapture (CMR) modeling technique is an empirically derived stopping rule to decide when to stop searching in systematic reviews, and it can be used to estimate the world literature (the Horizon) on a given topic.
- The CMR technique estimated the closeness to capturing the total literature in the context of a systematic review of osteoporosis disease management tools.
- The Horizon Estimation can be valuable for researchers conducting systematic reviews to determine how many articles were “missing” in their initial searching methods, and if it is necessary to extend searching to other databases or alternate searching resources.
The main goal of systematic reviews is to capture and analyze the total literature that meets content and methodologic standards on a particular topic to answer a research question. However, collecting this body of literature can be a challenge for several reasons. Exhaustive searching in large bibliographic databases such as Medline can be time consuming, and searches designed to be comprehensive often result in low yields of relevant or useful information [1]. Searching for systematic reviews encompasses two components. The first is for content, such as drugs or diseases. The second is often for methods, such as randomized controlled trials (RCTs) or observational studies. Search filters have been developed and tested to aid clinicians and researchers by retrieving articles that have high potential to meet methodologic standards when conducting systematic reviews [2]. Even with methods filters, the search process still requires more time and human resources than are often available. Timely publication is essential so that evidence remains up-to-date. One of the most important challenges in conducting systematic reviews is to determine how extensive the search should be, particularly because it is unknown how much literature exists on any given topic [1]. Although valid guidelines exist for the quality of reporting of meta-analyses (e.g., the QUOROM statement) [3] and for performing methodologically rigorous systematic reviews [4], little guidance is available on when to confidently stop searching for material when conducting a systematic review.
Stopping rules have been used to decide when clinical trials should stop once benefit or harm has been identified [1], [5]. Stopping trials based on accumulation of “enough” data is important and contentious and all decisions need to be strongly based on evidence [5]. Stopping rules for searching would be ideal in the production of systematic reviews, but no empirically derived stopping rules exist for this purpose. Several practical methods have been proposed, including those by Chilcott et al., who have suggested that searching can stop when an additional searching method or resource provides less than 1% of the total accumulated relevant articles [6]. However, the authors outline that this approach may not be appropriate for identifying studies of effectiveness where the goal of the systematic review is “comprehensiveness” rather than achieving saturation [6]. Bradford's law of scatter has also been suggested to predict the size of the literature [7]. However, neither method has been found to be satisfactory for predicting the size of a body of literature.
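The 1% rule attributed to Chilcott et al. above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the article identifiers, and the choice of denominator (the accumulated total after the new resource is added) are hypothetical and are not taken from the cited work.

```python
def marginal_yield(accumulated, new_results):
    """Share of the accumulated relevant articles contributed by a new resource.

    Assumption: the denominator is the accumulated total *after* adding
    the new resource; the source does not specify this detail.
    """
    accumulated = set(accumulated)
    new_unique = set(new_results) - accumulated
    total = len(accumulated | new_unique)
    return len(new_unique) / total if total else 0.0


def can_stop(accumulated, new_results, cutoff=0.01):
    """True when the new resource adds less than `cutoff` of the total."""
    return marginal_yield(accumulated, new_results) < cutoff


# Hypothetical data: 200 relevant articles found so far.
found_so_far = range(200)
low_yield_resource = [5, 17, 200]      # only article 200 is new
high_yield_resource = range(195, 210)  # articles 200..209 are new

print(can_stop(found_so_far, low_yield_resource))   # 1/201 of the total is new
print(can_stop(found_so_far, high_yield_resource))  # 10/210 of the total is new
```

As the text notes, a rule of this kind measures diminishing returns rather than how much of the literature remains unfound.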
The statistical modeling technique capture–mark–recapture (CMR) has commonly been applied to problems where multiple samples of some occurrence are conducted to estimate the size of the whole population (termed the Horizon Estimate). Its application in producing systematic reviews offers a potential solution to the challenge of identifying when to stop searching for articles. In general terms, the CMR process involves capturing an initial sample from a population of interest, marking the elements in the sample with some type of tag, and then releasing the sample back into the population so that the marked elements are available to be recaptured in subsequent sampling exercises. The items recaptured in subsequent samples provide the basis for estimating the total population. An article found in the first database search is captured and marked by recording it in a list of relevant articles. Subsequent searches may identify some of these marked articles and hence recapture them. These subsequent searches will also identify articles not captured in the first search. Indeed, any set of k searches may or may not identify an article, so recording all possibilities would yield a set of 2^k cells with frequencies that would be unknown for one cell (articles missed by every search) and known for the rest. This one cell is what is estimated by the proposed CMR model fitting. The Horizon is then the sum of the known cells and the estimated missing cell. CMR methods have commonly been used in ecology. For example, two samples of fish caught on separate occasions can estimate the number of fish in a lake [8], [9]. Although it was pioneered in ecology, the CMR concept is now being tested in many situations. For example, in epidemiology, CMR has been used to check the completeness of case ascertainment in population-based studies [10] and to estimate the number of cases of disease such as myocardial infarction and inflammatory bowel disease in a given region [11], [12].
Other recent applications include the use of CMR to calculate the total number of journals that contain nephrology content (i.e., ascertain the Horizon of all nephrology journals) [13], and as a method for assessing publication bias in systematic reviews [14].
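To make the cell bookkeeping concrete, the following minimal sketch tabulates capture histories for k searches and estimates the missing cell for the simplest case of two searches. It assumes the classic two-sample estimator (missing = only-first × only-second / both), which presumes the searches are independent; it is not the model fitting used in this study, and the function names and article identifiers are hypothetical.

```python
from collections import Counter


def capture_histories(searches):
    """Count articles per capture history.

    For k searches there are 2**k possible histories; the all-zero history
    (missed by every search) is never observed and is absent from the result.
    """
    searches = [set(s) for s in searches]
    all_ids = set().union(*searches)
    return Counter(tuple(int(a in s) for s in searches) for a in all_ids)


def two_search_horizon(first, second):
    """Return (observed, estimated_missing, horizon) for two searches.

    Assumption: the two searches capture articles independently.
    """
    cells = capture_histories([first, second])
    both = cells[(1, 1)]
    only_first = cells[(1, 0)]
    only_second = cells[(0, 1)]
    if both == 0:
        raise ValueError("no overlap between searches; estimate is undefined")
    missing = only_first * only_second / both
    observed = both + only_first + only_second
    return observed, missing, observed + missing


# Hypothetical article identifiers.
search_a = {1, 2, 3, 4, 5, 6, 7, 8}  # 8 relevant articles
search_b = {5, 6, 7, 8, 9, 10}       # 6 relevant articles, 4 shared with search_a

print(capture_histories([search_a, search_b]))
print(two_search_horizon(search_a, search_b))  # (10, 2.0, 12.0)
```

With more than two searches, the same table of observed cells would be passed to a fitted model rather than to this closed-form estimate.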
To our knowledge, only one study has evaluated the completeness of systematic literature searches using CMR modeling techniques [15]. The study used the simplest model possible: the comparison of two databases (Medline and hand searching of one journal). The model showed that two articles were missed from the estimated total population size of 160 articles (95% confidence interval [CI] 158–164) [15]. However, the CMR process of estimating the completeness of literature searching can be extended to more than two databases. After the second database has been searched, the Horizon Estimate of the total literature can be calculated. If the systematic reviewers have set a certain threshold of articles (i.e., the proportion of all possible studies that the review should ideally contain in relation to all that likely exist in the literature), reporting the Horizon Estimate and its 95% CI after each subsequent database search, together with the number of retrieved articles, can be used to decide whether searching should continue.
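The threshold comparison described above can be sketched as follows. The figures reuse the cited two-database example (158 articles retrieved is implied by two missed out of an estimated 160; 95% CI upper bound 164). The function names, the 95% and 97% targets, and the option of judging coverage against the upper CI bound are illustrative assumptions, not part of the cited study.

```python
def coverage(retrieved, horizon):
    """Proportion of the estimated literature that has been retrieved."""
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    return retrieved / horizon


def continue_searching(retrieved, horizon_estimate, horizon_upper,
                       target=0.95, conservative=True):
    """True if coverage falls short of the reviewers' target.

    Assumption: with conservative=True, coverage is judged against the
    upper bound of the 95% CI rather than the point estimate.
    """
    denominator = horizon_upper if conservative else horizon_estimate
    return coverage(retrieved, denominator) < target


retrieved, estimate, upper = 158, 160, 164

print(round(coverage(retrieved, estimate), 4))  # 0.9875
print(round(coverage(retrieved, upper), 4))     # 0.9634
print(continue_searching(retrieved, estimate, upper, target=0.95))  # False
print(continue_searching(retrieved, estimate, upper, target=0.97))  # True
```

Judging against the upper bound is the more cautious choice, because it asks whether the target would still be met if the true literature were as large as the interval allows.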
We tested a stopping rule for searching in systematic reviews by applying the principles of CMR to estimate the total number of articles in the literature identified in a systematic review of RCTs evaluating clinical decision support tools for osteoporosis disease management. We selected four major databases for the systematic review search strategy, which were searched sequentially: Medline, EMBASE, CINAHL, and Ovid EBM reviews (all were searched using the Ovid Technologies system). These data were used to build the Horizon Estimation model. We sequentially determined the estimated total population of the literature in this area and how effective the databases were at retrieving these articles at two levels of screening in the production of the systematic review.