Missing data methods for dealing with missing items in quality of life questionnaires. A comparison by simulation of personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques applied to the SF-36 in the French 2003 decennial health survey

Peyre, Hugo; Leplège, Alain; Coste, Joël

doi:10.1007/s11136-010-9740-3

Published: 01 October 2010

Volume 20, pages 287–300 (2011)

Quality of Life Research

Hugo Peyre¹,
Alain Leplège² &
Joël Coste¹

Abstract

Purpose

Missing items are common in quality of life (QoL) questionnaires and present a challenge for research in this field. It remains unclear which of the various methods proposed to deal with missing data performs best in this context. We compared personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques using various realistic simulation scenarios of item missingness in QoL questionnaires constructed within the framework of classical test theory.

Methods

Samples of 300 and 1,000 subjects were randomly drawn from the 2003 INSEE Decennial Health Survey (23,018 subjects representative of the French population who had completed the SF-36). Various patterns of missing data were then generated according to three item non-response rates (3, 6, and 9%) and three types of missing data (Little and Rubin’s “missing completely at random,” “missing at random,” and “missing not at random”). The missing data methods were evaluated in terms of accuracy and precision for the analysis of one descriptive and one association parameter for three different scales of the SF-36.
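
As an illustration of the three missingness mechanisms named above, the following Python sketch deletes item responses from a synthetic data set. It is not the authors' simulation code: the function names (`scaled_probs`, `make_missing`, `missing_rate`), the synthetic 1-to-5 item responses, the use of age as the observed covariate, and the choice to make the deletion probability proportional to age (MAR) or to the item's own value (MNAR) are all assumptions made for this example.

```python
import random

def scaled_probs(weights, rate):
    """Rescale positive weights so that their mean equals `rate` (capped at 1)."""
    mean_w = sum(weights) / len(weights)
    return [min(1.0, rate * w / mean_w) for w in weights]

def make_missing(rows, rate, mechanism, covariate=None, seed=0):
    """Return a copy of `rows` with some item responses replaced by None.

    MCAR: every response has the same deletion probability `rate`.
    MAR:  the probability depends on an always-observed covariate (assumed: age).
    MNAR: the probability depends on the response's own value (assumed: linear).
    """
    rng = random.Random(seed)
    n_items = len(rows[0])
    if mechanism == "MCAR":
        probs = [[rate] * n_items for _ in rows]
    elif mechanism == "MAR":
        probs = [[p] * n_items for p in scaled_probs(covariate, rate)]
    elif mechanism == "MNAR":
        flat = scaled_probs([v for row in rows for v in row], rate)
        probs = [flat[i * n_items:(i + 1) * n_items] for i in range(len(rows))]
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    return [[None if rng.random() < p else v for v, p in zip(row, row_p)]
            for row, row_p in zip(rows, probs)]

def missing_rate(rows):
    """Fraction of item responses that are missing."""
    cells = [v for row in rows for v in row]
    return sum(v is None for v in cells) / len(cells)

# Synthetic stand-in data: 1,000 respondents, 10 items scored 1 to 5, plus an age.
data_rng = random.Random(1)
rows = [[data_rng.randint(1, 5) for _ in range(10)] for _ in range(1000)]
ages = [data_rng.randint(18, 90) for _ in rows]

# Realised item non-response rate under each mechanism, targeting 6%.
rates = {m: missing_rate(make_missing(rows, 0.06, m, covariate=ages))
         for m in ("MCAR", "MAR", "MNAR")}
```

All three mechanisms target the same overall non-response rate; they differ only in what the deletion probability depends on, which is the distinction that determines whether a given missing data method stays unbiased.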

Results

For all item non-response rates and types of missing data, multiple imputation and full information maximum likelihood appeared superior to the personal mean score, and especially to hot deck, in terms of accuracy and precision; however, the use of personal mean score was associated with negligible bias (relative bias <2%) in all studied situations.
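
To make the accuracy and precision criteria concrete, here is a small Python sketch of how relative bias and two common precision summaries could be computed across simulation replicates. The abstract does not give the authors' exact formulas, so the definitions below (relative bias as a percentage of the true value, empirical standard deviation, root mean squared error) and the toy numbers are assumptions for illustration only.

```python
import statistics

def relative_bias_pct(estimates, true_value):
    """Mean estimate minus true value, expressed as a percentage of the true value."""
    return 100.0 * (statistics.fmean(estimates) - true_value) / true_value

def empirical_sd(estimates):
    """Spread of the estimates across replicates (a precision summary)."""
    return statistics.pstdev(estimates)

def rmse(estimates, true_value):
    """Root mean squared error, which combines bias and spread."""
    return statistics.fmean((e - true_value) ** 2 for e in estimates) ** 0.5

# Toy example: a scale mean of 50 in the complete data and four replicate
# estimates obtained after applying some missing data method (made-up values).
true_mean = 50.0
estimates = [50.5, 51.5, 50.0, 52.0]

bias_pct = relative_bias_pct(estimates, true_mean)  # 2.0 (percent)
spread = empirical_sd(estimates)
error = rmse(estimates, true_mean)
```

In this toy case the relative bias is exactly 2%, which sits at the threshold quoted above for the personal mean score.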

Conclusions

Whereas multiple imputation and full information maximum likelihood are confirmed as reference methods, the personal mean score appears nonetheless appropriate for dealing with items missing from completed SF-36 questionnaires in most situations of routine use. These results can reasonably be extended to other questionnaires constructed according to classical test theory.
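
The personal mean score is simple enough to sketch in a few lines of Python. The version below replaces each missing item with the mean of the items the same respondent answered on the same scale. The function name `personal_mean_score` and the `min_fraction` parameter are hypothetical, and the requirement that at least half of the scale's items be answered is an assumption borrowed from common SF-36 scoring practice, not something stated in this abstract.

```python
def personal_mean_score(responses, min_fraction=0.5):
    """Fill missing items (None) with the respondent's own mean on this scale.

    Returns None when fewer than `min_fraction` of the items were answered,
    in which case the scale score is left missing (assumed half-scale rule).
    """
    answered = [r for r in responses if r is not None]
    if not answered or len(answered) < min_fraction * len(responses):
        return None
    person_mean = sum(answered) / len(answered)
    return [person_mean if r is None else r for r in responses]

# Three of five items answered: the two gaps are filled with the mean (4.0).
filled = personal_mean_score([3, None, 4, 5, None])

# Only one of four items answered: too little information, so no imputation.
skipped = personal_mean_score([None, None, None, 2])
```

Because the imputed value uses only the respondent's own answers, the method needs no model fitting and no donor pool, which is what makes it practical for routine use.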

Abbreviations

MCAR: Missing completely at random
MAR: Missing at random
MNAR: Missing not at random
PMS: Personal mean score
MI: Multiple imputation
HD: Hot deck
QoL: Quality of life
SF-36: Medical Outcomes Study 36-Item Short-Form Health Survey
FIML: Full information maximum likelihood

References

Enders, C. (2006). A primer on the use of modern missing-data methods in psychosomatic medicine research. Psychosomatic Medicine, 68, 427–436.
Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60, 549–576.
Baraldi, A. N., & Enders, C. K. (2010). An introduction to modern missing data analyses. Journal of School Psychology, 48(1), 5–47.
Zwinderman, A. H. (1992). Statistical analysis of longitudinal quality of life data with missing measurements. Quality of Life Research, 1(3), 219–224.
Fairclough, D. L., Peterson, H. F., & Chang, V. (1998). Why are missing quality of life data a problem in clinical trials of cancer therapy? Statistics in Medicine, 17(5–7), 667–677.
Troxel, A. B., Fairclough, D. L., Curran, D., & Hahn, E. A. (1998). Statistical analysis of quality of life with missing data in cancer clinical trials. Statistics in Medicine, 17(5–7), 653–666.
Little, R., & Rubin, D. (1987). Statistical analysis with missing data. New York: John Wiley and Sons.
Donaldson, G. W., & Moinpour, C. M. (2005). Learning to live with missing quality-of-life data in advanced-stage disease trials. Journal of Clinical Oncology, 23, 447–453.
Fielding, S., Fayers, P. M., & Ramsay, C. R. (2009). Investigating the missing data mechanism in quality of life outcomes: A comparison of approaches. Health and Quality of Life Outcomes, 8(16), 1477–7525.
Peyre, H., Coste, J., & Leplège, A. (2010). Identifying type and determinants of missing items in quality of life questionnaires: Application to the SF-36 French version of the 2003 decennial health survey. Health and Quality of Life Outcomes, 8, 16.
Morris, J., & Coyle, D. (1994). Quality of life questionnaires in cancer clinical trials: Imputing missing values. Psycho-Oncology, 3(3), 215–222.
Fairclough, D. L., & Cella, D. F. (1996). Functional assessment of cancer therapy (FACT-G): Non-response to individual questions. Quality of Life Research, 5(3), 321–329.
Fayers, P. M., Curran, D., & Machin, D. (1998). Incomplete quality of life data in randomized trials: Missing items. Statistics in Medicine, 17(5–7), 679–696.
Hawthorne, G., & Elliott, P. (2005). Imputing cross-sectional missing data: Comparison of common techniques. The Australian and New Zealand Journal of Psychiatry, 39(7), 583–590.
Croy, C. D., & Novins, D. K. (2005). Methods for addressing missing data in psychiatric and developmental research. Journal of the American Academy of Child and Adolescent Psychiatry, 44(12), 1230–1240.
Schafer, J., & Graham, J. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7, 147–177.
Stuart, E. A., Azur, M., Frangakis, C., & Leaf, P. (2009). Multiple imputation with large data sets: A case study of the Children’s Mental Health Initiative. American Journal of Epidemiology, 169, 1133–1139.
Perneger, T. V., & Burnand, B. (2005). A simple imputation algorithm reduced missing data in SF-12 health surveys. Journal of Clinical Epidemiology, 58(2), 142–149.
Lin, T. H. (2006). Missing data imputation in quality-of-life assessment: Imputation for WHOQOL-BREF. Pharmacoeconomics, 24(9), 917–925.
Sande, I. (1983). Hot deck imputation procedures, incomplete data in sample surveys. New York: Academic Press.
Rubin, D. (1987). Multiple imputation for nonresponse in surveys. New York: John Wiley and Sons.
Donders, A. R., van der Heijden, G. J., Stijnen, T., & Moons, K. G. (2006). Review: A gentle introduction to imputation of missing values. Journal of Clinical Epidemiology, 59(10), 1087–1091.
Shrive, F. M., Stuart, H., Quan, H., & Ghali, W. A. (2006). Dealing with missing data in a multi-question depression scale: A comparison of imputation methods. BMC Medical Research Methodology, 6, 57.
Lanoë, J., & Makdessi-Raynaud, Y. (2005). L’état de santé en France en 2003. Santé perçue, morbidité déclarée et recours aux soins à travers l’enquête décennale santé. Etudes et résultats (DREES), 436, 1–12.
Ware, J. E., Jr, & Sherbourne, C. D. (1992). The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Medical Care, 30(6), 473–483.
Stewart, A. L., & Ware, J. E. (1992). Measuring functioning and well-being. Durham and London: D.U. Press.
Leplege, A., Ecosse, E., Verdier, A., & Perneger, T. V. (1998). The French SF-36 Health Survey: Translation, cultural adaptation and preliminary psychometric evaluation. Journal of Clinical Epidemiology, 51(11), 1013–1023.
Leplege, A., Ecosse, E., Pouchot, J., Coste, J., & Perneger, T. V. (2001). Le questionnaire MOS SF-36, manuel de l’utilisation et guide d’interprétation des scores. Paris: ESTEM.
Ware, J. E., & Gandek, B. (1998). Overview of the SF-36 health survey and the international quality of life assessment (IQOLA). Journal of Clinical Epidemiology, 51(11), 903–911.
Fayers, P. M., Hand, D. J., Bjordal, K., & Groenvold, M. (1997). Causal indicators in quality of life research. Quality of Life Research, 6, 393–406.
Fayers, P. M., & Hand, D. J. (1997). Factor analysis, causal indicators and quality of life. Quality of Life Research, 6, 139–150.
Gandek, B., Ware, J. E., Aaronson, N. K., Alonso, J., Apolone, G., Bjorner, J., et al. (1998). Tests of data quality, scaling assumptions, and reliability of the SF-36 in eleven countries: Results from the IQOLA Project. Journal of Clinical Epidemiology, 51(11), 1149–1158.
Enders, C. K. (2001). A primer on maximum likelihood algorithms available for use with missing data. Structural Equation Modeling, 8(1), 128–141.
Sentas, P., & Angelis, L. (2006). Categorical missing data imputation for software cost estimation by multinomial logistic regression. Journal of Systems and Software, 79(3), 404–414.
Vermunt, J. K., Van Ginkel, J. R., Van der Ark, L. A., & Sijtsma, K. (2008). Multiple imputation of incomplete categorical data using latent class analysis. Sociological Methodology, 38, 369–397.
Reiter, J., Raghunathan, T. E., & Kinney, S. K. (2006). The importance of modeling the sampling design in multiple imputation for missing data. Survey Methodology, 32(2), 143–149.
Binder, D. A., & Sun, W. (1996). Frequency valid multiple imputation for surveys with a complex design. Proceedings of the Survey Research Methods Section, ASA, 281–286.
Sterne, J. A. C., White, I. R., Carlin, J. B., Spratt, M., Royston, P., Kenward, M. G., et al. (2009). Multiple imputation for missing data in epidemiological and clinical research: Potential and pitfalls. BMJ, 338, b2393.

Acknowledgments

We thank Jean Louis Lanoë for allowing us to exploit the data of the 2003 Decennial Health Survey. We also thank David Jegou and Vivian Viallon for assistance with simulations.

Author information

Authors and Affiliations

Biostatistics and Epidemiology Unit, Assistance Publique-Hôpitaux de Paris, Hôpital Cochin, Nancy-Université, Université Paris-Descartes, Université Metz Paul Verlaine, Research unit APEMAC, EA 4360, 27 rue du Faubourg Saint-Jacques, 75674, Paris Cedex 14, France
Hugo Peyre & Joël Coste
Department of History and Philosophy of Sciences, University of Paris Diderot, Paris 7, France
Alain Leplège

Corresponding author

Correspondence to Joël Coste.

Cite this article

Peyre, H., Leplège, A. & Coste, J. Missing data methods for dealing with missing items in quality of life questionnaires. A comparison by simulation of personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques applied to the SF-36 in the French 2003 decennial health survey. Qual Life Res 20, 287–300 (2011). https://doi.org/10.1007/s11136-010-9740-3

Accepted: 02 September 2010
Published: 01 October 2010
Issue Date: March 2011
DOI: https://doi.org/10.1007/s11136-010-9740-3
