
01.12.2013 | Research article | Issue 1/2013 | Open Access

BMC Medical Research Methodology 1/2013

Diagnosing problems with imputation models using the Kolmogorov-Smirnov test: a simulation study

Journal:
BMC Medical Research Methodology > Issue 1/2013
Authors:
Cattram D Nguyen, John B Carlin, Katherine J Lee
Important notes

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2288-13-144) contains supplementary material, which is available to authorized users.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

All authors participated in the design of the simulation study and the interpretation of the results. CN performed the case study analysis and conducted the simulations. CN wrote the first draft of the manuscript and prepared all tables and graphs. All authors read and contributed to the final manuscript.

Abstract

Background

Multiple imputation (MI) is becoming increasingly popular as a strategy for handling missing data, but there is a scarcity of tools for checking the adequacy of imputation models. The Kolmogorov-Smirnov (KS) test has been identified as a potential diagnostic method for assessing whether the distribution of imputed data deviates substantially from that of the observed data. The aim of this study was to evaluate the performance of the KS test as an imputation diagnostic.

Methods

Using simulation, we examined whether the KS test could reliably identify departures from assumptions made in the imputation model. To do this, we examined how the p-values from the KS test behaved when skewed and heavy-tailed data were imputed using a normal imputation model. We varied the amount of missing data, the missing data models and the amount of skewness, and evaluated the performance of the KS test in diagnosing issues with the imputation models under these different scenarios.
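To make the diagnostic concrete, the following is a minimal Python sketch of the idea, not the authors' simulation code. It assumes a log-normal variable as the skewed example, 30% of values missing completely at random, and a deliberately simplified normal imputation (draws from a normal distribution with the observed mean and standard deviation, with no covariates and no parameter uncertainty). The helper `ks_two_sample`, the seed and the sample size are illustrative choices, and the p-value uses the plain asymptotic Kolmogorov series rather than any exact small-sample method.

```python
import math
import random
import statistics


def ks_two_sample(a, b):
    """Two-sample KS statistic with an asymptotic p-value (illustrative helper)."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest gap between the two ECDFs.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:  # the Kolmogorov tail probability is effectively 1 here
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


rng = random.Random(2013)  # arbitrary seed, for reproducibility only
n = 2000

# Skewed "true" data (assumed log-normal) with ~30% missing completely at random.
y = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
missing = [rng.random() < 0.3 for _ in range(n)]
observed = [v for v, mis in zip(y, missing) if not mis]

# Simplified normal imputation: ignores skewness, covariates and parameter uncertainty.
mu = statistics.mean(observed)
sd = statistics.stdev(observed)
imputed = [rng.gauss(mu, sd) for _ in range(sum(missing))]

stat, p_value = ks_two_sample(observed, imputed)
print(f"KS statistic = {stat:.3f}, p-value = {p_value:.3g}")
```

In this setting the normal draws include negative values that a log-normal variable can never take, so the KS test would be expected to flag a clear difference between observed and imputed values. Whether such a difference matters for the regression parameter of interest is a separate question, which is the point the study examines.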

Results

The KS test was able to flag differences between the observations and imputed values; however, these differences did not always correspond to problems with MI inference for the regression parameter of interest. When there was a strong missing at random dependency, the KS p-values were very small regardless of whether or not the MI estimates were biased, so the KS test was not able to discriminate between imputed variables that required further investigation and those that did not. The p-values were also sensitive to sample size and the proportion of missing data, adding to the challenge of interpreting the results from the KS test.
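The missing at random (MAR) finding can be illustrated with a small Python sketch under assumed settings that are not taken from the paper: a normal outcome `y` that depends linearly on a fully observed covariate `x`, missingness in `y` driven by a steep logistic function of `x`, and imputation from the correctly specified regression of `y` on `x` fitted to the complete cases (again without parameter uncertainty, for brevity). The helper `ks_two_sample` and all numeric settings are hypothetical.

```python
import math
import random
import statistics


def ks_two_sample(a, b):
    """Two-sample KS statistic with an asymptotic p-value (illustrative helper)."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


rng = random.Random(144)  # arbitrary seed
n = 2000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [xi + rng.gauss(0.0, 1.0) for xi in x]  # true model: y = x + normal noise

# Strong MAR mechanism: y is more likely to be missing when x is large.
missing = [rng.random() < 1.0 / (1.0 + math.exp(-2.0 * xi)) for xi in x]
x_obs = [xi for xi, mis in zip(x, missing) if not mis]
y_obs = [yi for yi, mis in zip(y, missing) if not mis]

# Fit the (correctly specified) regression of y on x using complete cases.
mx, my = statistics.mean(x_obs), statistics.mean(y_obs)
sxx = sum((xi - mx) ** 2 for xi in x_obs)
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x_obs, y_obs)) / sxx
intercept = my - slope * mx
rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x_obs, y_obs))
resid_sd = math.sqrt(rss / (len(x_obs) - 2))

# Impute each missing y from the fitted regression plus residual noise.
y_imp = [intercept + slope * xi + rng.gauss(0.0, resid_sd)
         for xi, mis in zip(x, missing) if mis]

stat, p_value = ks_two_sample(y_obs, y_imp)
print(f"slope = {slope:.2f}, KS statistic = {stat:.3f}, p-value = {p_value:.3g}")
```

Here the imputation model matches the data-generating model, yet the imputed values belong to cases with larger `x` and are therefore shifted relative to the observed values. A small KS p-value in this sketch reflects the missingness mechanism rather than a misspecified imputation model, which is why the p-value alone cannot separate the two situations.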

Conclusions

Given our study results, it is difficult to establish guidelines or recommendations for using the KS test as a diagnostic tool for MI. The investigation of other imputation diagnostics and their incorporation into statistical software are important areas for future research.