Published in: Health Services and Outcomes Research Methodology, Issue 3-4/2017

29 March 2017

Two parts are better than one: modeling marginal means of semicontinuous data

Authors: Valerie A. Smith, Brian Neelon, Matthew L. Maciejewski, John S. Preisser

Abstract

In health services research, it is common to encounter semicontinuous data characterized by a point mass at zero followed by a continuous distribution with positive support. These are often analyzed using two-part mixtures that separately model the probability of use to account for the portion of the sample with zero values. Commonly, but not always, the second component models the continuous values conditional on them being positive. Prior work examining whether such two-part models are needed to appropriately draw inference from semicontinuous data compared to standard one-part regression models has found mixed results. However, prior studies have generally used only measures of model fit on a single dataset, leaving a definitive conclusion uncertain. This paper provides a detailed evaluation using simulations of the appropriateness of standard one-part generalized linear models (GLMs) compared to a recently developed marginalized two-part (MTP) model. The MTP model, unlike the one-part GLMs, explicitly accounts for the point mass at zero, yet takes the same form for the marginal mean as the commonly used GLM with log link, making the covariate effects directly comparable. We simulate data scenarios with varying sample sizes and percentages of zeros. One-part GLMs resulted in increased bias, lower than nominal coverage of confidence intervals, and inflated type I error rates, rendering them inappropriate for use with semicontinuous data. Even when distributional assumptions were violated, estimates of covariate effects and type I error rates under the MTP model remained robust.
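
The idea that a two-part model can still have a log-link marginal mean can be illustrated with a small simulation. The sketch below is not the authors' code or simulation design. It assumes a single binary covariate, a logistic model for the probability of a positive value, and log-normal positive values. The function name and all parameter values are hypothetical. The log-scale location of the positive part is chosen so that the overall mean, zeros included, equals exp(beta0 + beta1 * x).

```python
import numpy as np


def simulate_semicontinuous(n, alpha, beta, sigma, rng):
    """Draw semicontinuous outcomes whose marginal mean is exp(b0 + b1 * x).

    Illustrative assumptions (not taken from the abstract): one binary
    covariate, a logistic model for P(Y > 0), log-normal positive values.
    """
    x = rng.integers(0, 2, size=n)  # binary covariate, 0 or 1
    # Part 1: probability of a positive value.
    pi = 1.0 / (1.0 + np.exp(-(alpha[0] + alpha[1] * x)))
    # Target marginal mean on the log scale: log E[Y | x].
    log_marginal_mean = beta[0] + beta[1] * x
    # Part 2: pick the log-normal location so that
    # pi * E[Y | Y > 0, x] = exp(log_marginal_mean).
    mu = log_marginal_mean - np.log(pi) - sigma**2 / 2.0
    positive = rng.random(n) < pi
    y = np.where(positive, rng.lognormal(mean=mu, sigma=sigma), 0.0)
    return x, y


rng = np.random.default_rng(0)
x, y = simulate_semicontinuous(
    200_000, alpha=(0.0, 1.0), beta=(1.0, 0.5), sigma=0.5, rng=rng
)

# Share of exact zeros (the point mass at zero).
zero_share = float((y == 0).mean())

# Log ratio of the overall means, zeros included. Up to simulation noise
# this should be close to beta[1] = 0.5, the marginal covariate effect.
log_ratio = float(np.log(y[x == 1].mean() / y[x == 0].mean()))

print(f"share of zeros: {zero_share:.3f}")
print(f"log ratio of marginal means: {log_ratio:.3f}")
```

In this setup the covariate changes both the probability of a zero and the size of the positive values, yet the coefficient beta[1] keeps its interpretation as a log ratio of overall means. That is the sense in which the covariate effects are comparable to those from a one-part GLM with log link.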
Metadata
Publisher: Springer US
Print ISSN: 1387-3741
Electronic ISSN: 1572-9400
DOI: https://doi.org/10.1007/s10742-017-0169-9
