Background
Meta-analysis is an increasingly popular statistical method for comparing and summarizing the results of multiple independent studies. First introduced to clinical research in the 1980s, meta-analysis is now a cornerstone of evidence-based medicine [1]. It has also become an important step in establishing the credibility of research findings, such as those from hypothesis-free discovery research studies [2]. The number of published meta-analyses indexed in PubMed is increasing by about 20 % per year (PubMed).
An ideal meta-analysis provides a complete representation of all relevant data, both published and unpublished. Finding eligible studies is often the most challenging and time-consuming phase in conducting a meta-analysis, especially when the terminology for key concepts, variables and outcomes differs among studies. The Cochrane Collaboration—internationally regarded for its rigorous approach to meta-analyses of clinical interventions—recommends searching multiple publication databases by using Boolean combinations of all possible keywords, including synonyms and related words that authors may have used to describe their studies, and complementing keyword-based searches with hand screening of references listed in the retrieved articles [3]. Casting a wide net often retrieves thousands of publications that must be screened to find a handful of eligible studies. Despite its inefficiency, this approach remains the gold standard.
Finding eligible studies by screening the references and subsequent citations of articles that are already known could be seen as a way to crowd-source expert knowledge of the published scientific literature. The network properties of scientific citations have been studied extensively since the 1950s, when they were used to create the Science Citation Index [4, 5]; they have been further exploited in the development of online research tools such as Web of Science, Scopus and Google Scholar. Some current research explores the use of computational algorithms to automate citation retrieval for systematic reviews [6].
Although it is intuitively appealing, backward and forward citation checking falls short as a way to identify eligible articles for meta-analysis. Searching these ‘direct’ citations could be an efficient strategy only if eligible studies consistently cited all relevant earlier work, thus creating a single citation network, but this is often not the case. For example, a review of 259 meta-analyses found that the included articles were connected in a single citation network in fewer than half (46 %); in the remainder, the included articles fell into either two (39 %) or three or more (15 %) disconnected citation networks [7]. Citation searching has thus gained only equivocal support, even as a complement to keyword searching [8, 9].
Searching based on direct citations is insensitive and inefficient because researchers tend to cite only some related earlier articles, not all. Although eligible studies may be only sparsely connected by direct citations, taking indirect connections into account can help identify additional studies. For example, two eligible studies that are not connected by direct citations might both be co-cited by the same newer article [10], or they may be coupled because they both cite the same earlier article [11]. These citing and cited articles may be commentaries, reviews or original research articles on related topics.
The principles of co-citation and bibliographic coupling are used extensively in bibliometrics and scientometrics to document and visualize similarity between articles, topics, authors and disciplines [12–15]; however, they have not been used specifically to find eligible studies for meta-analyses or systematic reviews. We propose a search method that ranks articles on their degree of co-citation with one or more known articles and demonstrate that other studies eligible for inclusion in the meta-analysis rank high on this list.
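The ranking step described above can be sketched in a few lines of code. This is an illustrative simplification, not the authors' implementation: the article identifiers, the `citing_refs` citation data and the seed set are all hypothetical, and a real search would draw reference lists from a citation database such as Web of Science.

```python
from collections import Counter

def rank_by_cocitation(citing_refs, known):
    """Rank candidate articles by how often they are co-cited
    with any of the 'known' (seed) articles.

    citing_refs: dict mapping each citing article ID to the set
                 of article IDs in its reference list.
    known: set of seed article IDs already known to be eligible.
    """
    counts = Counter()
    for refs in citing_refs.values():
        if known & refs:  # this article cites at least one seed article
            for candidate in refs - known:
                counts[candidate] += 1  # co-cited alongside a seed
    # Most frequently co-cited candidates first
    return [article for article, _ in counts.most_common()]

# Toy citation data (hypothetical article IDs):
citing_refs = {
    "review1": {"A", "B", "X"},
    "review2": {"A", "X", "Y"},
    "comment": {"B", "X"},
    "unrelated": {"Y", "Z"},
}
print(rank_by_cocitation(citing_refs, known={"A", "B"}))  # "X" ranks first
```

In this toy example, article "X" is co-cited with a seed article by three different citing articles and therefore tops the ranked list; screening would then start from the top of that list.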
Discussion
Before discussing the implications of our method, several methodological issues need to be addressed. First, we evaluated the performance of our method conservatively by assuming that the original meta-analyses were comprehensive and complete. Thus, when we failed to retrieve a study, we considered it a shortcoming of our method, not of the published meta-analysis. Yet, in the meta-analysis of second surgery in Crohn’s disease, for example, we missed the only two pediatric studies [16], and we missed five articles that were published before 1975 (Table 2); these studies may be less comparable to others included in the meta-analysis. Furthermore, for all meta-analyses, we found original articles on the same topic that were more frequently co-cited than the articles that were included (see examples in Additional file 1: Table S2); however, we did not attempt to investigate whether they had been excluded after screening or perhaps should have been included in the meta-analyses.
Second, our method demonstrated lower efficiency and accuracy in the second study, which could be attributed to several factors. The second study included more highly cited topics, which tend to generate a higher number of co-citations, thus reducing efficiency. This study also included more meta-analyses for which the authors screened a relatively low number of articles. In the first study, none of the meta-analyses had screened fewer than 500 articles and only three (30 %) had screened fewer than 1,000 (Table 1); in contrast, of the 42 meta-analyses in the second study, 10 (24 %) had screened fewer than 500 articles and 20 (48 %) had screened fewer than 1,000 (Table 4).
The second study also included more meta-analyses on heterogeneous topics, which tended to reduce accuracy. For example, we retrieved only 10 % of the studies included in a meta-analysis on normalization of vitamin D levels in children of various ages and with various diseases [18]; 18 % of the studies on the use of simulation-based assessments for patient-related outcomes for a variety of tasks and skills in physicians, medical students, dentists and nurses [19]; and 38 % of the studies on the safety of intravenous iron preparations in patients with various disorders [20]. Clearly, the method does not work when the topic of the meta-analysis is heterogeneous and the studies of interest are unlikely to have cited each other. The second study also included several meta-analyses with very small sample sizes, including one in which half of the studies were case reports that had few or no references [21], as well as a meta-analysis for which the ‘known’ studies were cited only four times in total [22]. The percentage of retrieved studies rose to 89 % when these five meta-analyses were excluded.
Third, we compared our method with the literature searches of the published meta-analyses, which often combined separate searches in multiple databases, supplemented with the screening of reference lists, conference abstracts and grey literature, and the consultation of experts. These additional strategies may have yielded studies that were not indexed in databases like Web of Science or Medline, and thus contributed to underestimation of the accuracy. For example, we were unable to retrieve the two master’s theses that were included in a meta-analysis for which the authors searched the Dissertation Abstracts International database [23], and we missed many South American and Asian studies from a meta-analysis for which the authors additionally searched the LILACS and KOREAMED databases [20]. Additional strategies like these can be used to complement our search method, either to find more eligible studies or to increase confidence in the results of the search method when no other studies are found.
Using a citation-based search to identify articles for meta-analysis has several advantages. Perhaps most importantly, the quality of the search does not depend on keywords, which is particularly relevant for topics where there is no consistent terminology. In contrast to machine-learning algorithms, citation-based searching does not depend on the quality and selection of a training set. Co-citation searching was more efficient than keyword-based searching, retrieving a median of 76 % of eligible studies from a short list of around 100 of the most frequently co-cited articles (Table 1). Co-citation searching also retrieved articles published in journals that were not indexed in Web of Science, suggesting that the need to search other databases could be reduced. An interesting example is the meta-analysis of immunoglobulin treatment for severe acute respiratory infections such as SARS, avian influenza (H1N1), and the Spanish influenza of 1918 [24]. This meta-analysis included 16 studies published in 1919–1920, of which we were able to retrieve 13. These included publications in the Norsk Magazin för Laegevidenskapen, Boston Medical and Surgical Journal, La Presse Médicale, New York Medical Journal and Hygiea, which are all journals that no longer exist. These studies could be retrieved because they had been cited by studies of more recent outbreaks that were published in journals that were indexed in Web of Science.
The accuracy and efficiency of co-citation searching depend on characteristics of the underlying citation network. By design, our method misses the studies that the collective community of researchers apparently did not find worth citing. In our analysis, these included abstracts, articles in non-English languages, very old articles, and publications in semi-scientific journals, reports, websites, and theses. In addition, some newer and some very old articles were not cited often enough to rank high in our search. Some modifications of our method could help identify these articles; for example, as shown in Table 2, half of the missed articles were connected with retrieved articles through direct citations. Aggregating and ranking the direct citations among all articles that are retrieved by our search might be an efficient way to find them. Other modifications might be necessary when the method is applied to topics with very dense citation networks of highly cited articles; in these situations the number of articles to be screened could be limited further, for example, by setting a higher citation threshold.
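The proposed modification of aggregating and ranking direct citations among retrieved articles could be sketched as follows. This is a minimal illustration under assumed data structures (hypothetical article IDs and reference lists), and for brevity it follows only one direction of direct citation, from retrieved articles to the earlier articles they cite.

```python
from collections import Counter

def rank_direct_citations(retrieved, refs_of):
    """Aggregate direct citations among a set of retrieved articles
    and rank the cited articles that are not yet in that set.

    retrieved: set of article IDs returned by the co-citation search.
    refs_of: dict mapping each article ID to the set of IDs it cites.
    """
    counts = Counter()
    for article in retrieved:
        # Count only citations pointing outside the retrieved set
        for cited in refs_of.get(article, set()) - retrieved:
            counts[cited] += 1
    # Candidates cited by the most retrieved articles first
    return counts.most_common()

# Hypothetical example: "old1" is cited by three retrieved articles
retrieved = {"r1", "r2", "r3"}
refs_of = {
    "r1": {"old1", "old2"},
    "r2": {"old1"},
    "r3": {"old1", "r1"},  # citations within the retrieved set are ignored
}
print(rank_direct_citations(retrieved, refs_of))
```

An article such as "old1" here, cited by several retrieved articles yet too old or too rarely cited to score well on co-citation, would surface at the top of this supplementary list for screening.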
Competing interests
ACJWJ has filed a patent application for the method described in this article. MG declares that she has no competing interests.
Authors’ contributions
ACJWJ developed the method, designed the study and carried out the analyses. MG critically reviewed the method, the study design and results. Both authors contributed to the writing of the manuscript and approved the final version.