Abstract
Purpose
Missing items are common in quality of life (QoL) questionnaires and present a challenge for research in this field. It remains unclear which of the various methods proposed to deal with missing data performs best in this context. We compared personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques using various realistic simulation scenarios of item missingness in QoL questionnaires constructed within the framework of classical test theory.
Methods
Samples of 300 and 1,000 subjects were randomly drawn from the 2003 INSEE Decennial Health Survey (23,018 subjects representative of the French population who had completed the SF-36). Various patterns of missing data were then generated according to three item non-response rates (3, 6, and 9%) and three types of missing data (Little and Rubin’s “missing completely at random,” “missing at random,” and “missing not at random”). The missing data methods were evaluated in terms of accuracy and precision for the analysis of one descriptive and one association parameter for three different scales of the SF-36.
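The three types of missing data named above can be illustrated with a minimal Python sketch. It is not the simulation design used in the study: the helper name `make_missing`, the linear weighting schemes, and the use of a single observed covariate are all assumptions chosen for illustration. Under MCAR every item has the same chance of being missing; under MAR the chance depends on an observed covariate; under MNAR it depends on the value of the item that goes missing.

```python
import numpy as np

def make_missing(items, rate, mechanism, rng, covariate=None):
    """Return a float copy of `items` (subjects x items) with NaN for missing items.

    Hypothetical helper: the weights below are illustrative assumptions,
    not the scheme used in the article.
    """
    items = np.asarray(items, dtype=float)
    n, k = items.shape
    if mechanism == "MCAR":
        # Missingness is unrelated to any data: equal weight for every cell.
        weight = np.ones((n, k))
    elif mechanism == "MAR":
        # Missingness depends only on an observed covariate (e.g. age).
        cov = np.asarray(covariate, dtype=float)
        span = (cov.max() - cov.min()) or 1.0
        weight = np.repeat((1.0 + (cov - cov.min()) / span)[:, None], k, axis=1)
    elif mechanism == "MNAR":
        # Missingness depends on the item value itself (lower answers go
        # missing more often), i.e. on data that become unobserved.
        span = (items.max() - items.min()) or 1.0
        weight = 1.0 + (items.max() - items) / span
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    # Rescale so that the average probability of missingness equals `rate`.
    prob = np.clip(rate * weight / weight.mean(), 0.0, 1.0)
    out = items.copy()
    out[rng.random((n, k)) < prob] = np.nan
    return out
```

For example, calling `make_missing(items, 0.06, "MAR", rng, covariate=age)` with `rng = np.random.default_rng(0)` would remove about 6% of the item responses, with higher values of the (assumed) covariate `age` more likely to have missing items.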
Results
For all item non-response rates and types of missing data, multiple imputation and full information maximum likelihood appeared superior to the personal mean score, and especially to hot deck, in terms of accuracy and precision. However, the use of personal mean score was associated with negligible bias (relative bias <2%) in all situations studied.
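As a rough illustration of the two quantities mentioned here, the following Python sketch computes a personal mean score and a relative bias. The function names are hypothetical, and the requirement that at least half of the items in a scale be answered is an assumption based on common SF-36 scoring practice, not a detail stated in this abstract.

```python
import numpy as np

def personal_mean_score(items, min_answered=0.5):
    """Per-subject scale score: mean of the items the subject answered.

    `items` is a subjects x items array with NaN for missing items.
    Subjects who answered fewer than `min_answered` (a proportion) of the
    items receive NaN. The 0.5 default is an assumed half-scale rule.
    """
    items = np.asarray(items, dtype=float)
    answered = ~np.isnan(items)
    n_answered = answered.sum(axis=1)
    total = np.where(answered, items, 0.0).sum(axis=1)
    enough = n_answered >= min_answered * items.shape[1]
    score = np.full(items.shape[0], np.nan)
    score[enough] = total[enough] / n_answered[enough]
    return score

def relative_bias(estimate, true_value):
    """Relative bias in percent of an estimate against the complete-data value."""
    return 100.0 * (estimate - true_value) / true_value
```

Taking the mean of the answered items is equivalent to replacing each missing item with the subject's own mean over the answered items of the same scale, which is why the approach needs no information from other subjects.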
Conclusions
Whereas multiple imputation and full information maximum likelihood are confirmed as reference methods, the personal mean score appears nonetheless appropriate for dealing with items missing from completed SF-36 questionnaires in most situations of routine use. These results can reasonably be extended to other questionnaires constructed according to classical test theory.
Abbreviations
- MCAR: Missing completely at random
- MAR: Missing at random
- MNAR: Missing not at random
- PMS: Personal mean score
- MI: Multiple imputation
- HD: Hot deck
- QoL: Quality of life
- SF-36: Medical outcome study 36-item short-form health survey
- FIML: Full information maximum likelihood
References
Enders, C. (2006). A primer on the use of modern missing-data methods in psychosomatic medicine research. Psychosomatic Medicine, 68, 427–436.
Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60, 549–576.
Baraldi, A. N., & Enders, C. K. (2010). An introduction to modern missing data analyses. Journal of School Psychology, 48(1), 5–47.
Zwinderman, A. H. (1992). Statistical analysis of longitudinal quality of life data with missing measurements. Quality of Life Research, 1(3), 219–224.
Fairclough, D. L., Peterson, H. F., & Chang, V. (1998). Why are missing quality of life data a problem in clinical trials of cancer therapy? Statistics in Medicine, 17(5–7), 667–677.
Troxel, A. B., Fairclough, D. L., Curran, D., & Hahn, E. A. (1998). Statistical analysis of quality of life with missing data in cancer clinical trials. Statistics in Medicine, 17(5–7), 653–666.
Little, R., & Rubin, D. (1987). Statistical analysis with missing data. New York: John Wiley and Sons.
Donaldson, G. W., & Moinpour, C. M. (2005). Learning to live with missing quality-of-life data in advanced-stage disease trials. Journal of Clinical Oncology, 23, 447–453.
Fielding, S., Fayers, P. M., & Ramsay, C. R. (2009). Investigating the missing data mechanism in quality of life outcomes: A comparison of approaches. Health and Quality of Life Outcomes, 8(16), 1477–7525.
Peyre, H., Coste, J., & Leplège, A. (2010). Identifying type and determinants of missing items in quality of life questionnaires: Application to the SF-36 French version of the 2003 decennial health survey. Health and Quality of Life Outcomes, 8, 16.
Morris, J., & Coyle, D. (1994). Quality of life questionnaires in cancer clinical trials: Imputing missing values. Psycho-Oncology, 3(3), 215–222.
Fairclough, D. L., & Cella, D. F. (1996). Functional assessment of cancer therapy (FACT-G): Non-response to individual questions. Quality of Life Research, 5(3), 321–329.
Fayers, P. M., Curran, D., & Machin, D. (1998). Incomplete quality of life data in randomized trials: Missing items. Statistics in Medicine, 17(5–7), 679–696.
Hawthorne, G., & Elliott, P. (2005). Imputing cross-sectional missing data: Comparison of common techniques. The Australian and New Zealand Journal of Psychiatry, 39(7), 583–590.
Croy, C. D., & Novins, D. K. (2005). Methods for addressing missing data in psychiatric and developmental research. Journal of the American Academy of Child and Adolescent Psychiatry, 44(12), 1230–1240.
Schafer, J., & Graham, J. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7, 147–177.
Stuart, E. A., Azur, M., Frangakis, C., & Leaf, P. (2009). Multiple imputation with large data sets: A case study of the Children’s Mental Health Initiative. American Journal of Epidemiology, 169, 1133–1139.
Perneger, T. V., & Burnand, B. (2005). A simple imputation algorithm reduced missing data in SF-12 health surveys. Journal of Clinical Epidemiology, 58(2), 142–149.
Lin, T. H. (2006). Missing data imputation in quality-of-life assessment: Imputation for WHOQOL-BREF. Pharmacoeconomics, 24(9), 917–925.
Sande, I. (1983). Hot deck imputation procedures, incomplete data in sample surveys. New York: Academic Press. (Book).
Rubin, D. (1987). Multiple imputation for nonresponse in surveys. New York: John Wiley and Sons. (Book).
Donders, A. R., van der Heijden, G. J., Stijnen, T., & Moons, K. G. (2006). Review: A gentle introduction to imputation of missing values. Journal of Clinical Epidemiology, 59(10), 1087–1091.
Shrive, F. M., Stuart, H., Quan, H., & Ghali, W. A. (2006). Dealing with missing data in a multi-question depression scale: A comparison of imputation methods. BMC Medical Research Methodology, 6, 57.
Lanoë, J., & Makdessi-Raynaud, Y. (2005). L’état de santé en France en 2003. Santé perçue, morbidité déclarée et recours aux soins à travers l’enquête décennale santé. Etudes et résultats (DREES), 436, 1–12.
Ware, J. E., Jr, & Sherbourne, C. D. (1992). The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Medical Care, 30(6), 473–483.
Stewart, A. L., & Ware, J. E. (1992). Measuring functioning and well-being. Durham and London: D.U. Press. (Book).
Leplege, A., Ecosse, E., Verdier, A., & Perneger, T. V. (1998). The French SF-36 Health Survey: Translation, cultural adaptation and preliminary psychometric evaluation. Journal of Clinical Epidemiology, 51(11), 1013–1023.
Leplege, A., Ecosse, E., Pouchot, J., Coste, J., & Perneger, T. V. (2001). Le questionnaire MOS SF-36, manuel de l’utilisation et guide d’interprétation des scores. Paris: ESTEM. (Book).
Ware, J. E., & Gandek, B. (1998). Overview of the SF-36 health survey and the international quality of life assessment (IQOLA). Journal of Clinical Epidemiology, 51(11), 903–911.
Fayers, P. M., Hand, D. J., Bjordal, K., & Groenvold, M. (1997). Causal indicators in quality of life research. Quality of Life Research, 6, 393–406.
Fayers, P. M., & Hand, D. J. (1997). Factor analysis, causal indicators and quality of life. Quality of Life Research, 6, 139–150.
Gandek, B., Ware, J. E., Aaronson, N. K., Alonso, J., Apolone, G., Bjorner, J., et al. (1998). Tests of data quality, scaling assumptions, and reliability of the SF-36 in eleven countries: Results from the IQOLA Project. Journal of Clinical Epidemiology, 51(11), 1149–1158.
Enders, C. K. (2001). A primer on maximum likelihood algorithms available for use with missing data. Structural Equation Modeling, 8(1), 128–141.
Sentas, P., & Angelis, L. (2006). Categorical missing data imputation for software cost estimation by multinomial logistic regression. Journal of Systems and Software, 79(3), 404–414.
Vermunt, J. K., Van Ginkel, J. R., Van der Ark, L. A., & Sijtsma, K. (2008). Multiple imputation of incomplete categorical data using latent class analysis. Sociological Methodology, 38, 369–397.
Reiter, J., Raghunathan, T. E., & Kinney, S. K. (2006). The importance of modeling the sampling design in multiple imputation for missing data. Survey Methodology, 32(2), 143–149.
Binder, D. A., & Sun, W. (1996). Frequency valid multiple imputation for surveys with a complex design. Proceedings of the survey research methods section, ASA, 281–286.
Sterne, J. A. C., White, I. R., Carlin, J. B., Spratt, M., Royston, P., Kenward, M. G., et al. (2009). Multiple imputation for missing data in epidemiological and clinical research: Potential and pitfalls. BMJ, 338, b2393.
Acknowledgments
We thank Jean Louis Lanoë for allowing us to exploit the data of the 2003 Decennial Health Survey. We also thank David Jegou and Vivian Viallon for assistance with simulations.
Cite this article
Peyre, H., Leplège, A. & Coste, J. Missing data methods for dealing with missing items in quality of life questionnaires. A comparison by simulation of personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques applied to the SF-36 in the French 2003 decennial health survey. Qual Life Res 20, 287–300 (2011). https://doi.org/10.1007/s11136-010-9740-3
DOI: https://doi.org/10.1007/s11136-010-9740-3