The capture–mark–recapture technique can be used as a stopping rule when searching in systematic reviews

doi:10.1016/j.jclinepi.2008.06.001

Journal of Clinical Epidemiology

Volume 62, Issue 2, February 2009, Pages 149-157

https://doi.org/10.1016/j.jclinepi.2008.06.001

Abstract

Objective

Researchers have no empirically based search stopping rule when looking for potentially relevant articles for inclusion in systematic reviews. We tested a stopping strategy based on capture–mark–recapture (CMR; i.e., the Horizon Estimate) statistical modeling to estimate the total number of articles in the domain of clinical decision support tools for osteoporosis disease management using four large bibliographic databases (Medline, EMBASE, CINAHL, and EBM reviews).

Study Design and Setting

Retrospective evaluation of the Horizon Estimate using a systematic review of randomized controlled trials (RCTs) at two levels of article screening: title and abstract (1,246 potentially relevant articles) and full text (42 potentially relevant articles).

Results

The CMR model suggests that the total number of potential articles was 1,838 for the first level of screening, and 49 for the full-text level. The four databases provided 68% of known articles for the first level of screening and 81% for full-text screening.

Conclusions

The CMR technique can be used in systematic reviews to estimate the closeness to capturing the total body of literature on a given topic. More studies are needed to objectively determine the usefulness of Horizon Estimates as a stopping rule strategy for systematic review searching.

Introduction

What is new?

• The capture–mark–recapture (CMR) modeling technique is an empirically derived stopping rule to decide when to stop searching in systematic reviews, and it can be used to estimate the world literature (the Horizon) on a given topic.
• The CMR technique estimated the closeness to capturing the total literature in the context of a systematic review of osteoporosis disease management tools.
• The Horizon Estimation can be valuable for researchers conducting systematic reviews to determine how many articles were “missing” in their initial searching methods, and if it is necessary to extend searching to other databases or alternate searching resources.

The main goal of systematic reviews is to capture and analyze the total literature that meets content and methodologic standards on a particular topic to answer a research question. However, collecting this body of literature can be a challenge for several reasons. Exhaustive searching in large bibliographic databases such as Medline can be time-consuming, and searches designed to be comprehensive often result in low yields of relevant or useful information [1]. Searching for systematic reviews encompasses two components. The first is for content, such as drugs or diseases. The second is often for methods, such as randomized controlled trials (RCTs) or observational studies. Search filters have been developed and tested to aid clinicians and researchers by retrieving articles that have high potential to meet methodologic standards when conducting systematic reviews [2]. Even with methods filters, the search process still requires more time and human resources than are often available. Timely publication is essential so that evidence remains up-to-date. One of the most important challenges in conducting systematic reviews is to determine how extensive the search should be, particularly because it is unknown how much literature exists on any given topic [1]. Although valid guidelines for the quality of reporting of meta-analyses (e.g., the QUOROM statement) [3] and for performing methodologically rigorous systematic reviews [4] exist, little guidance is available on when to confidently stop searching for material when conducting a systematic review.

Stopping rules have been used to decide when clinical trials should stop once benefit or harm has been identified [1], [5]. Stopping trials based on accumulation of “enough” data is important and contentious and all decisions need to be strongly based on evidence [5]. Stopping rules for searching would be ideal in the production of systematic reviews, but no empirically derived stopping rules exist for this purpose. Several practical methods have been proposed, including those by Chilcott et al., who have suggested that searching can stop when an additional searching method or resource provides less than 1% of the total accumulated relevant articles [6]. However, the authors outline that this approach may not be appropriate for identifying studies of effectiveness where the goal of the systematic review is “comprehensiveness” rather than achieving saturation [6]. Bradford's law of scatter has also been suggested to predict the size of the literature [7]. However, neither method has been found to be satisfactory for predicting the size of a body of literature.

The statistical modeling technique, capture–mark–recapture (CMR), has commonly been applied to problems where multiple samples of some occurrence are conducted to estimate the size of the whole population (termed the Horizon Estimate). Its application in producing systematic reviews offers a potential solution to the challenge of identifying when to stop searching for articles. In general terms, the CMR process involves capturing an initial sample from a population of interest, marking the elements in the sample with some type of tag, and then releasing the sample back into the population so that the marked elements are available to be recaptured in subsequent sampling exercises. The items recaptured in subsequent samples provide the basis for estimating the total population. An article found in the first database search is captured and marked by recording it in a list of relevant articles. Subsequent searches may identify some of these marked articles and hence recapture them. These subsequent searches will also identify articles not captured in the first search. Indeed, any set of k searches may or may not identify an article, so recording all possibilities would yield a set of 2^k cells with frequencies that would be unknown for one cell and known for the rest. This one cell is what is estimated by the proposed CMR model fitting. The Horizon is then the sum of the known cells and the estimated missing cell. CMR methods have commonly been used in ecology. For example, two samples of fish caught on separate occasions can estimate the number of fish in a lake [8], [9]. Although it was pioneered in ecology, the CMR concept is now being tested in many situations. For example, in epidemiology, CMR has been used to check the completeness of case ascertainment in population-based studies [10] and to estimate the number of cases of diseases such as myocardial infarction and inflammatory bowel disease in a given region [11], [12]. Other recent applications include the use of CMR to calculate the total number of journals that contain nephrology content (i.e., ascertain the Horizon of all nephrology journals) [13], and as a method for assessing publication bias in systematic reviews [14].
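The bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration, not the model fitted in this study: the function names and the toy data are assumptions made for the example, and the two estimators shown (the classical Lincoln–Petersen formula and Chapman's adjusted form) cover only the simplest case of two searches.

```python
from itertools import product


def capture_histories(k):
    """Return all 2**k found/not-found patterns for k searches."""
    return list(product((0, 1), repeat=k))


def tabulate(found_in, k):
    """Count articles per capture history.

    found_in maps an article id to the set of search indices (0..k-1)
    that retrieved it. The all-zero history (missed by every search)
    cannot be observed, so it is stored as None.
    """
    counts = {history: 0 for history in capture_histories(k)}
    for sources in found_in.values():
        history = tuple(1 if i in sources else 0 for i in range(k))
        counts[history] += 1
    counts[(0,) * k] = None  # the one cell a CMR model has to estimate
    return counts


def lincoln_petersen(n1, n2, m):
    """Two-search estimate of the total population: n1 * n2 / m."""
    if m == 0:
        raise ValueError("no recaptured articles; estimate is undefined")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's adjusted two-search estimate, defined even when m == 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Made-up toy data for illustration only (not data from the review).
toy = {"a": {0}, "b": {0, 1}, "c": {1}, "d": {0, 1}}
cells = tabulate(toy, k=2)
n1 = cells[(1, 0)] + cells[(1, 1)]  # articles found by search 0
n2 = cells[(0, 1)] + cells[(1, 1)]  # articles found by search 1
m = cells[(1, 1)]                   # articles found by both (recaptured)
horizon = lincoln_petersen(n1, n2, m)
```

With more than two searches the missing cell no longer has a simple closed form of this kind, which is why a fitted model is needed in that setting.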

To our knowledge, only one study has evaluated the completeness of systematic literature searches using CMR modeling techniques [15]. The study used the simplest model possible: a comparison of two sources (Medline and hand searching of one journal). The model showed that two articles were missed from the estimated total population size of 160 articles (95% confidence interval [CI] 158–164) [15]. However, the CMR process of estimating the completeness of literature searching can be extended to more than two databases. After the second database has been searched, the Horizon Estimate of the total literature can be calculated. If the systematic reviewers have set a threshold (i.e., the proportion of all studies likely to exist in the literature that the review should ideally contain), then reporting the Horizon Estimate and its 95% CI after each subsequent database search, together with the number of retrieved articles, can be used to decide whether searching should continue.
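A threshold-based decision of this kind can be sketched as follows. This is a hedged illustration rather than a procedure specified in the text: the function names, the 0.95 threshold, and the option of judging coverage against the upper confidence limit are all assumptions made for the example. The counts come from the figures quoted above (158 of an estimated 160 articles, upper limit 164) and from the abstract (1,246 retrieved against an estimate of 1,838).

```python
def coverage(retrieved, horizon):
    """Proportion of the estimated total literature already retrieved."""
    if horizon <= 0:
        raise ValueError("horizon estimate must be positive")
    return retrieved / horizon


def should_stop(retrieved, horizon, threshold, horizon_upper=None):
    """Return True when coverage meets the reviewers' threshold.

    If horizon_upper (the upper confidence limit of the Horizon Estimate)
    is given, coverage is judged against it, which is the more cautious
    choice. Both the threshold and this use of the upper limit are
    assumptions of this sketch.
    """
    reference = horizon if horizon_upper is None else horizon_upper
    return coverage(retrieved, reference) >= threshold


# Two-source example quoted above: 2 of an estimated 160 articles missed.
two_source = should_stop(158, 160, threshold=0.95, horizon_upper=164)

# Title and abstract level of this review, using the abstract's figures.
level_one = should_stop(1246, 1838, threshold=0.95)
```

Under the assumed 0.95 threshold, the two-source example would stop and the title and abstract level of this review would continue; a different threshold could of course reverse either decision.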

We tested a stopping rule for searching in systematic reviews by applying the principles of CMR to estimate the total number of articles in the literature identified in a systematic review of RCTs evaluating clinical decision support tools for osteoporosis disease management. We selected four major databases for the systematic review search strategy, which were searched sequentially: Medline, EMBASE, CINAHL, and Ovid EBM reviews (all were searched using the Ovid Technologies system). These data were used to build the Horizon Estimation model. We sequentially determined the estimated total population of the literature in this area and how effective the databases were at retrieving these articles at two levels of screening in the production of the systematic review.

Methods of the systematic review

Studies were identified by searching in the following order: Medline (1966 to July 2006), EMBASE (1980–2006), CINAHL (1982 to July 2006), and Ovid EBM Reviews (Cochrane Database of Systematic Reviews, ACP Journal Club, Database of Abstracts of Reviews of Effects, and the Cochrane Clinical Trials Registry). We also searched the grey literature: the websites of Canadian Institutes of Health Research (CIHR), US Agency for Healthcare Research and Quality (US AHRQ), US Computer Retrieval of

Results

The first Horizon Estimate was based on the 1,246 articles that were identified during Level 1 screening for potentially relevant articles in the systematic review. Level 2 screening provided 39 articles that were analyzed to produce a second Horizon Estimate.

The results of the Horizon Estimate for the systematic review are shown in Table 1. For the first level of study selection (i.e., 1,246 articles), the Horizon Estimate (i.e., the total population of articles) was estimated to be 1,729

Discussion

The completeness of our retrieval of the total population of studies (i.e., the Horizon Estimate) varied greatly between the two different levels of study selection. At the level of abstract and title review (i.e., 1,246 articles), the estimate showed that searching the four large databases captured 68% of known articles, whereas at the level of full-text review, the literature search captured 81% of known articles. The findings of the PubMed verification strategy showed that we were not able

Conclusions

The CMR technique can estimate the total number of articles that exist on a topic (i.e., the Horizon Estimate). This information can be valuable for those conducting a systematic review or meta-analysis or other tasks that require identification of the total body of literature such as health technology assessments or economics studies. Using the CMR method, we can determine how many articles are potentially available, and calculate how many articles are “missing” in our searching. These data

References (32)

D. Moher et al. Improving the quality of reports of meta-analyses of randomized controlled trials: the QUOROM statement. Lancet (1999)
D.A. Bennett et al. Capture–recapture is a potentially useful method for assessing publication bias. J Clin Epidemiol (2004)
A.M. Baker et al. A web-based diabetes care management support system. J Comm Qual Improv (2001)
P. Cram et al. A randomized trial to assess the impact of direct reporting of DXA scan results to patients on quality of osteoporosis care. J Clin Densitom (2006)
InterTASC Information Specialists' Sub-Group. Search filter resource. Available at...
Cochrane handbook. Available at http://www.cochrane.org/admin/manual.htm Accessed November...
M. Petticrew et al. Systematic reviews in the social sciences: a practical guide (2005)
J. Pater et al. The ethics of early stopping rules. J Clin Oncol (2005)
J. Chilcott et al. The role of modeling in prioritising and planning clinical trials. Health Technol Assess (2003)
P. Royle et al. Systematic reviews of epidemiology in diabetes: finding the evidence. BMC Med Res Methodol (2005)
D.G. Chapman. The estimation of biological populations. Ann Math Stat (1954)
K.H. Pollock. Modeling capture, recapture, and removal statistics for estimation of demographic parameters for fish and wildlife populations: past, present, and future. J Am Stat Assoc (1991)
E.B. Hook et al. Capture–recapture methods in epidemiology: methods and limitations. Epidemiol Rev (1995)
R.E. Laporte et al. Monitoring the incidence of myocardial infarctions: applications of capture–mark–recapture technology. Int J Epidemiol (1992)
D. Palli et al. Population-based studies of IBD incidence in Italy and capture–recapture methods. Int J Epidemiol (1997)
Goldsmith CH, Haynes RB, Garg AX, McKibbon KA, Wilczynski NL, Kastner M, et al. Horizon estimation—what is the horizon...

