Discussion
In this paper, we presented MetaBayesDTA, an extensively expanded web-based R Shiny [17] application based on MetaDTA [9]. The application enables users to conduct Bayesian meta-analysis of diagnostic test accuracy studies, either assuming a perfect reference test or modelling an imperfect reference test, without having to install any software or have any knowledge of R [16] or Stan [14] programming.
The application uses the bivariate model [1] to conduct analyses assuming a perfect reference test, and users can also conduct univariate meta-regression and subgroup analysis. It uses LCMs [4, 5] to conduct analyses without assuming a perfect gold standard, allowing the user to run models assuming conditional independence or dependence, to model the reference and index test sensitivities and specificities as either fixed or random effects, and to model multiple reference tests using a meta-regression covariate for the type of reference test. The application allows users to input their own prior distributions, which is particularly useful for the LCMs, since information about the accuracy of the reference test(s) is often available. As in MetaDTA [9], the tables and figures can be downloaded, and the graphs are highly customizable. Furthermore, risk of bias and quality assessment results from the QUADAS-2 tool [22] can be incorporated into the sROC plot; integrating risk of bias into the main analysis counters the tendency to treat it as an afterthought. Sensitivity analysis allowing users to remove selected studies can also be carried out easily for all models.
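To make the bivariate model concrete: it jointly models each study's logit-transformed sensitivity and specificity, which are computed from the study's 2×2 table. A minimal stdlib Python sketch of these per-study quantities (the counts are hypothetical, and the app itself fits the full model in Stan; this only illustrates the inputs):

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def study_summary(tp, fp, fn, tn):
    """Per-study quantities the bivariate model works with:
    logit-transformed sensitivity and specificity, plus their
    approximate within-study (delta-method) variances."""
    se = tp / (tp + fn)  # sensitivity
    sp = tn / (tn + fp)  # specificity
    return {
        "logit_se": logit(se),
        "logit_sp": logit(sp),
        "var_logit_se": 1 / tp + 1 / fn,  # within-study variance on logit scale
        "var_logit_sp": 1 / tn + 1 / fp,
    }

# Hypothetical study: 45 TP, 10 FP, 5 FN, 90 TN
s = study_summary(tp=45, fp=10, fn=5, tn=90)
```

Across studies, the bivariate model then assumes the (logit-sensitivity, logit-specificity) pairs follow a bivariate normal distribution, which captures the correlation between sensitivity and specificity.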
As we discussed in the “why is this application needed?” section (see Table 1), our app offers improvements over both BayesDTA [12] and MetaDTA [9‐11]. Namely, for the bivariate model, unlike both BayesDTA and MetaDTA, our app allows subgroup analysis and univariate meta-regression (with either a categorical or a continuous covariate) to be carried out, which also allows users to easily conduct comparative test accuracy meta-analysis to compare two or more tests to one another. Furthermore, unlike BayesDTA, for the LCM, our app can assess model fit using the correlation residual plot [30], and it can model multiple reference tests using a categorical covariate for the type of reference test. This is important since studies included in a meta-analysis of test accuracy often use different reference tests, and accuracy can vary greatly between them. Although MetaBayesDTA is more complex than MetaDTA, since it can run five different models rather than one and its graphs have more customization options, it has a cleaner layout, and many of the menus are hidden unless the user clicks to display more options, thanks to the shinydashboard [19] and shinyWidgets [20] R packages. In general, there are several benefits of using Bayesian rather than frequentist methods for meta-analysis of test accuracy. For instance, the ability to include informative prior information is particularly useful for the imperfect gold standard model, where parameter identifiability is often an issue. Furthermore, Bayesian methods generally outperform frequentist methods when there are few studies in a meta-analysis (which is often the case), as frequentist methods are more likely to underestimate the between-study heterogeneity [36].
Our web application has some limitations, which point the way to future developments. For example, for meta-analysis of test accuracy without assuming a perfect gold standard (the LCM), whilst users can model the data without assuming conditional independence between tests, the application does not offer functionality to impose restrictions on the correlation structure. A potential improvement would therefore be to allow users to impose such restrictions, for example assuming the same correlation in the diseased and non-diseased groups, and/or forcing the correlations between the tests to be positive. Another limitation of the LCM is that it can only model different reference tests using categorical meta-regression, and it therefore assumes that all of the reference tests have the same between-study variances. Although this is often an advantage compared to conducting a subgroup analysis for each reference test, sometimes it might make sense to run a more complex model which assumes separate between-study variances for some reference tests and fixed effects for reference tests observed in only a few (e.g., 5) studies; adding this functionality is a potential future update.
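The conditional dependence structure discussed above can be illustrated with a simple two-test latent class sketch, using the common covariance parameterisation of conditional dependence (a stdlib Python illustration; the function name and parameterisation are ours, not the app's exact implementation):

```python
def cell_probs(prev, se1, se2, sp1, sp2, cov_d=0.0, cov_nd=0.0):
    """Probabilities of the four (test1, test2) result patterns under a
    two-test latent class model. cov_d / cov_nd are the conditional
    covariances between tests within the diseased / non-diseased groups;
    setting both to zero gives conditional independence."""
    probs = {}
    for t1 in (1, 0):
        for t2 in (1, 0):
            # probability of each result within the diseased group
            s1 = se1 if t1 else 1 - se1
            s2 = se2 if t2 else 1 - se2
            # probability of each result within the non-diseased group
            c1 = (1 - sp1) if t1 else sp1
            c2 = (1 - sp2) if t2 else sp2
            sign = 1 if t1 == t2 else -1  # covariance adds when tests agree
            p_d = s1 * s2 + sign * cov_d
            p_nd = c1 * c2 + sign * cov_nd
            probs[(t1, t2)] = prev * p_d + (1 - prev) * p_nd
    return probs
```

In this parameterisation, the restrictions mentioned above correspond to simple constraints: `cov_d == cov_nd` imposes the same dependence in both groups, and `cov_d, cov_nd >= 0` forces the correlations to be positive.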
For the bivariate model, a potential update for both subgroup analysis and categorical meta-regression would be to allow users to specify different priors for each of the groups. Furthermore, for meta-regression, although our application allows users to see the pairwise differences and ratios between the different categories of a categorical covariate (making it possible to use for comparative test accuracy of multiple tests), it only shows these for the meta-regression model which assumes the variances are the same for all tests. However, in some instances it might make sense for the variances for some of the tests (or all of them, which would be equivalent to conducting a subgroup analysis) to be different, so a future update would be to also display the pairwise differences and ratios for the subgroup analysis, and to allow users to assume independent variances for some tests but shared variances across others.
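Once posterior draws are available, pairwise differences and ratios between covariate categories fall out draw-by-draw. A hedged stdlib sketch of how such summaries can be computed from MCMC output (variable names and the toy draws are illustrative only):

```python
import statistics

def pairwise_summary(draws_a, draws_b):
    """Posterior median and approximate central 95% credible interval
    for the difference and ratio of an accuracy parameter (e.g.
    sensitivity) between two tests, computed per MCMC draw."""
    diffs = sorted(a - b for a, b in zip(draws_a, draws_b))
    ratios = sorted(a / b for a, b in zip(draws_a, draws_b))

    def cri(xs):
        n = len(xs)
        return (statistics.median(xs), xs[int(0.025 * n)], xs[int(0.975 * n) - 1])

    return {"difference": cri(diffs), "ratio": cri(ratios)}

# Toy draws of sensitivity for two tests (a real run would use thousands)
res = pairwise_summary([0.9, 0.8, 0.85], [0.7, 0.6, 0.75])
```

Working on the draws directly is what makes these comparisons fully Bayesian: each summary automatically propagates the joint posterior uncertainty of both tests.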
Another limitation is that our application only allows subgroup analysis and meta-regression (other than for modelling different reference tests) to be conducted using the bivariate model, which assumes a perfect gold standard. A potential improvement would be to allow users to run subgroup analyses and meta-regression for the LCM. Furthermore, the application requires users to have some knowledge of Bayesian model diagnostics in order to check that the models have fitted adequately, although the application does contain information (in the “model diagnostics” tabs) explaining how to interpret some of these diagnostics, and it directs users to online resources which explain them, so users do not have to find this information themselves.
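One of the core diagnostics such users need to understand is the potential scale reduction factor (R-hat), which compares between-chain and within-chain variance; values near 1 suggest the chains have mixed. A simplified stdlib sketch of the classic Gelman-Rubin form (modern Stan additionally uses chain-splitting and rank-normalisation, which this illustration omits):

```python
import statistics

def rhat(chains):
    """Gelman-Rubin potential scale reduction factor for one parameter,
    given a list of equal-length MCMC chains."""
    n = len(chains[0])                               # draws per chain
    means = [statistics.fmean(c) for c in chains]
    b = n * statistics.variance(means)               # between-chain variance
    w = statistics.fmean([statistics.variance(c) for c in chains])  # within-chain
    var_hat = (n - 1) / n * w + b / n                # pooled posterior variance estimate
    return (var_hat / w) ** 0.5
```

With well-mixed chains the ratio sits near (or just below) 1, while chains stuck in different regions push it well above 1, signalling that the fit should not be trusted.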
It is important to note that this app is a beta version, so some bugs are to be expected. We therefore welcome any user feedback, which can be given by completing the user feedback questionnaire (a link is provided in a pop-up box which appears when accessing MetaBayesDTA) or by emailing the first author of this paper. Responses to this questionnaire will inform future updates of the application, ensuring that MetaBayesDTA becomes more user-friendly over time and, like MetaDTA [9], becomes a widely used diagnostic test accuracy meta-analysis web application. A number of features included in MetaBayesDTA were added as a result of user and stakeholder feedback, including the imperfect gold standard models, the meta-regression and subgroup analysis, the “hidden” menus and options which make the interface look cleaner and less intimidating, and the Bayesian capabilities of the application.
In general, one could argue that easy-to-use apps could lead to the over-application of complex methods even when they are not appropriate. Web applications such as the one presented in this paper allow less experienced researchers to conduct complex analyses which would otherwise be inaccessible to them, lowering the amount of knowledge needed to perform the analysis and therefore increasing the chance of invalid results being published. We therefore recommend that the review team include a statistician with knowledge of how to check Bayesian model diagnostics. Furthermore, we have implemented a number of features in our application to minimise the risk of misleading research outputs being produced. These include: informative pop-up boxes which give information about setting up appropriate prior distributions and remind users to check the sampler diagnostics every time they run a new model; guidance in the “sampler diagnostics” tab to help users interpret the sampler diagnostics; and appropriate restrictions (e.g., whenever random effects are used, the 95% prediction regions are always displayed on the sROC plots; we do not allow only the 95% credible regions to be displayed, as these alone do not portray the between-study heterogeneity and can be misleading).
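The rationale for always pairing prediction regions with credible regions is easiest to see in one dimension: the prediction interval for a new study folds in the between-study standard deviation, so it is always at least as wide as the credible interval for the summary estimate. A stdlib sketch on the logit scale (normal approximation with hypothetical parameter values; the app's sROC regions are the two-dimensional analogue):

```python
import math

def intervals(mu, se_mu, tau):
    """Approximate 95% intervals for sensitivity, back-transformed from
    the logit scale: the credible interval reflects uncertainty in the
    summary estimate mu only, while the prediction interval for a new
    study also includes the between-study standard deviation tau."""
    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    z = 1.96
    ci = tuple(inv_logit(mu + s * z * se_mu) for s in (-1, 1))
    pred_sd = math.sqrt(se_mu ** 2 + tau ** 2)   # heterogeneity widens the interval
    pi = tuple(inv_logit(mu + s * z * pred_sd) for s in (-1, 1))
    return {"credible": ci, "prediction": pi}

# Hypothetical posterior summaries: mean logit-sensitivity 2.0,
# posterior SD 0.2, between-study SD 0.8
res = intervals(2.0, 0.2, 0.8)
```

Showing only the narrow credible region would hide how much a new study's accuracy could plausibly deviate from the summary estimate, which is exactly the misleading impression the display restriction guards against.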
One could also argue that the widespread usability of apps could stimulate the uptake of more appropriate methods, meaning that better methods become standard practice more quickly. This could have important impacts on clinical practice; for instance, the fact that our app allows one to easily conduct a meta-analysis of test accuracy without assuming a gold standard, and without assuming that the same reference test is used across all studies, opens up many new datasets to synthesis, since many studies are conducted using different imperfect reference tests.
Conclusions
In this paper, we presented MetaBayesDTA [13], a user-friendly, interactive web application which allows users to conduct Bayesian meta-analysis of test accuracy, with or without a gold standard. The application uses methods which were previously only available by using statistical programming languages, such as R [16].
This application could have a wide-ranging impact across academia, guideline writers, policy makers, and industry. For example, when no perfect reference test is available, estimates of test accuracy can change quite notably when the perfect reference test assumption is relaxed, leading to potentially different conclusions about the accuracy of a test, which could ultimately change which tests are used in clinical practice. Furthermore, the ability of the app to easily conduct comparative test accuracy meta-analysis means that clinicians will be able to tell more easily which tests perform better.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.