01.12.2018 | Research article | Issue 1/2018 | Open Access

BMC Medicine 1/2018

Performance of InSilicoVA for assigning causes of death to verbal autopsies: multisite validation study using clinical diagnostic gold standards

Abraham D. Flaxman, Jonathan C. Joseph, Christopher J. L. Murray, Ian Douglas Riley, Alan D. Lopez
Important notes

Electronic supplementary material

The online version of this article (https://doi.org/10.1186/s12916-018-1039-1) contains supplementary material, which is available to authorized users.
A comment to this article is available online at https://doi.org/10.1186/s12916-020-01517-w.



Background

Recently, a new algorithm for automatic computer certification of verbal autopsy data named InSilicoVA was published. The authors presented their algorithm as a statistical method and assessed its performance using a single set of model predictors and one age group.


Methods

We perform a standard procedure for analyzing the predictive accuracy of verbal autopsy classification methods using the same data and the publicly available implementation of the algorithm released by the authors. We extend the original analysis to include children and neonates, instead of only adults, and test accuracy using different sets of predictors, including the set used in the original paper and a set that matches the released software.


Results

The population-level performance (i.e., predictive accuracy) of the algorithm varied from 2.1 to 37.6% when trained on data preprocessed in a manner similar to the original study. When trained on data that matched the software default format, the performance ranged from −11.5 to 17.5%. When using the default training data provided, the performance ranged from −59.4 to −38.5%. Overall, the InSilicoVA predictive accuracy was found to be 11.6–8.2 percentage points lower than that of an alternative algorithm. Additionally, the sensitivity for InSilicoVA was consistently lower than that for an alternative diagnostic algorithm (Tariff 2.0), although the specificity was comparable.
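The negative values in the ranges above are possible only for a chance-corrected metric. The abstract does not spell out the formula, so the following is a minimal sketch under an assumption: that the metric is chance-corrected cause-specific mortality fraction (CSMF) accuracy, in which raw CSMF accuracy is rescaled so that 0 corresponds to random allocation (taken here as 1 − e⁻¹ ≈ 0.632) and 1 to perfect prediction. The function names and example fractions are hypothetical and are not taken from the study.

```python
import math


def csmf_accuracy(true_csmf, pred_csmf):
    """Raw CSMF accuracy: 1 - sum|true - pred| / (2 * (1 - min(true))).

    Both arguments map cause name -> fraction of deaths; a cause missing
    from either mapping is treated as having fraction 0.
    """
    causes = set(true_csmf) | set(pred_csmf)
    abs_error = sum(
        abs(true_csmf.get(c, 0.0) - pred_csmf.get(c, 0.0)) for c in causes
    )
    # Largest possible total absolute error given the true distribution.
    worst_case = 2.0 * (1.0 - min(true_csmf.get(c, 0.0) for c in causes))
    return 1.0 - abs_error / worst_case


def chance_corrected(accuracy):
    """Rescale so that random allocation maps to 0 and perfect prediction to 1.

    Assumption: the chance baseline is 1 - exp(-1), roughly 0.632.
    """
    baseline = 1.0 - math.exp(-1.0)
    return (accuracy - baseline) / (1.0 - baseline)


# Hypothetical example: predictions that swap the shares of causes A and C.
true_fractions = {"A": 0.5, "B": 0.3, "C": 0.2}
pred_fractions = {"A": 0.2, "B": 0.3, "C": 0.5}

raw = csmf_accuracy(true_fractions, pred_fractions)
corrected = chance_corrected(raw)
```

In this made-up example the raw accuracy falls just below the assumed chance baseline, so the corrected value is slightly negative. Under the stated assumption, that is how a method can score below zero: its predicted cause distribution is further from the truth than random allocation would be expected to be.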


Conclusions

The default format and training data provided by the software lead to results that are at best suboptimal, with poor cause-of-death predictive performance. This method is likely to generate erroneous cause of death predictions and, even if properly configured, is not as accurate as alternative automated diagnostic methods.