Interpretation of results
In this study, we aimed to evaluate the extent to which the SRs and MAs published in some of the most influential ENT journals use methods to reduce the risk of PB and techniques to assess whether it is present. Our findings revealed that this issue is not addressed optimally in a considerable proportion of the SRs.
First, the search strategies used in these SRs were not comprehensive enough to mitigate the risk of PB. Most SRs restricted their search to papers published in English and were therefore at considerable risk of language bias. Although previous studies have shown that language bias has a negligible impact on the results of an SR in most circumstances [35-37], exceptions have also been observed [38-40]. As a result, Cochrane recommends that language restrictions should not be used except in rapid reviews, and even in that setting, their use should be justified by the reviewers [4]. Also, most of the SRs did not search other sources of data or grey literature. This issue is of great importance, as such data have been found to seriously affect the results of an SR [41, 42]. Specifically, a grey literature search should be seriously considered when conducting an SR because an association between “statistically significant” results and publication has been documented in previous studies [4].
Another finding of interest was that almost half of the SRs did not assess the risk of PB. This finding is all the more notable because our analyses revealed that the risk of PB was considerably higher in the SRs that did not assess the risk of its presence (63.6% vs 28.9%). The reason behind this phenomenon is unknown, but potential explanations include: (a) reviewers trying not to downgrade the confidence in their results; (b) a lack of the methodological expertise and knowledge needed to assess the risk of PB, which also resulted in poorly designed search strategies; and (c) chance alone. Nevertheless, journal editors and reviewers should ask authors to assess the risk of PB in their SRs whenever feasible.
More importantly, in many of the cases where reviewers found a high risk of PB in their SRs, they did not try to estimate the intervention effect corrected for the impact of PB, take the risk of PB into account when drawing conclusions, or expand their search to reduce the risk of PB. Journal editors should pay specific attention to this issue and ask authors to include other sources of data when the risk of PB is assessed to be high, in an attempt to avoid publishing inflated results as much as possible.
Another finding of interest was the inappropriate use of methods to assess the risk of PB. Although this problem was not frequent across the SRs, some used inappropriate tests to assess funnel plot asymmetry, such as using Egger’s regression test instead of Deeks’ regression test in DTA SRs, or using statistical tests alone with no prior visual inspection of the funnel plot. Both journal editors and reviewers must note that the results of statistical tests for funnel plot asymmetry should be interpreted in light of the visual inspection of the plot, as all these tests are known to have low statistical power [27]. Other factors should also be considered when using such tests: they are not recommended when fewer than 10 studies are included in the MA, and they should not be used when the studies are of similar size [27]. Using contour-enhanced funnel plots is also highly desirable, as they help to differentiate the reasons for funnel plot asymmetry [43].
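As a rough illustration of the kind of funnel plot asymmetry test discussed above, the following is a minimal sketch of the classic Egger regression, in which the standardized effect (effect divided by its standard error) is regressed on precision (the inverse of the standard error) and the intercept is examined for departure from zero. It is not the code used in this study or in any of the reviewed SRs. The function name `egger_test`, its parameters, and the `min_studies` guard are hypothetical and introduced only for illustration; the guard simply encodes the recommendation that such tests not be applied to fewer than 10 studies or to studies of identical size. The sketch is not suitable for DTA reviews, where Deeks’ test is the recommended alternative.

```python
import math


def egger_test(effects, std_errors, min_studies=10):
    """Sketch of Egger's regression test for funnel plot asymmetry.

    effects     -- per-study effect estimates (e.g. log odds ratios)
    std_errors  -- per-study standard errors (all > 0)
    min_studies -- hypothetical guard: refuse to run on fewer studies
    """
    if len(effects) != len(std_errors):
        raise ValueError("effects and std_errors must have the same length")
    k = len(effects)
    if k < min_studies:
        raise ValueError("asymmetry tests are not recommended for < 10 studies")
    if any(s <= 0 for s in std_errors):
        raise ValueError("standard errors must be positive")

    # Standardized effect (response) and precision (predictor).
    y = [e / s for e, s in zip(effects, std_errors)]
    x = [1.0 / s for s in std_errors]
    if max(x) == min(x):
        raise ValueError("studies of identical size: test is not informative")

    # Ordinary least squares fit of y = intercept + slope * x.
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance and standard error of the intercept.
    df = k - 2
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt((rss / df) * (1.0 / k + mean_x ** 2 / sxx))

    # t statistic for H0: intercept = 0 (nan if the fit is exact).
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "slope": slope,
            "se_intercept": se_intercept, "t_statistic": t_stat, "df": df}
```

The returned t statistic would be compared against a t distribution with k - 2 degrees of freedom to obtain a p-value. Given the low power of such tests, the result is best read alongside a visual inspection of the funnel plot rather than on its own.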
Finally, we assessed some factors that might have contributed to the risk of PB. Surprisingly, none of those factors (language restriction, searching sources other than bibliographic databases, and the number of databases searched) had a statistically significant correlation with the presence of PB. There are at least two possible explanations. First, the sample of SRs included in the test may have been too small. Second, some risk of PB may be inevitable even when the search has no language restriction, other sources of data are sought, and a large number of databases are searched. Nevertheless, the results of these tests do not rule out that implementing these measures most probably reduces the risk of PB.
Implications
Our findings indicate a lack of methodological rigor in the conduct of SRs published in the most influential journals of the field, which in turn might have led to the dissemination of inflated results. We strongly encourage future reviewers and journal editors to take the issue of PB seriously and to require authors to take measures to reduce its risk and to use appropriate methods to assess its possible presence. As PB is an issue at the outcome level, we also encourage researchers who want to conduct a study similar to ours in their own fields to assess whether the SRs that evaluated the risk of PB for their primary outcome did the same for their secondary outcomes. Finally, if feasible, we encourage researchers who want to conduct a similar study to use more robust selection criteria for including SRs, as our criteria, which were necessitated by the limited review resources in our team, might have introduced some degree of selection bias in the results. Overall, PB is a serious issue that can result in the dissemination of inflated results, and thus the whole scientific community is encouraged to give this phenomenon more careful consideration, especially when conducting an SR.