
The performance of automated case-mix adjustment regression model building methods in a health outcome prediction setting

Health Care Management Science

Abstract

We have previously described a system for monitoring a number of healthcare outcomes using case-mix adjustment models. Automating the model-fitting process is desirable when monitoring covers a large number of outcome measures or subgroup analyses. Our aim was to compare the performance of three variable selection strategies: "manual"; "automated" backward elimination with re-categorisation; and entering all variables at once, irrespective of their apparent importance, with automated re-categorisation. Logistic regression models for predicting in-hospital mortality and emergency readmission within 28 days were fitted to an administrative database covering 78 diagnosis groups and 126 procedures from 1996 to 2006 for National Health Service hospital trusts in England. Model performance was assessed with the Receiver Operating Characteristic (ROC) c statistic, which measures discrimination, and the Brier score, which measures overall predictive accuracy. Overall, discrimination was similar for diagnoses and procedures and consistently better for mortality than for emergency readmission. Brier scores were generally low (indicating good accuracy) and were lower for procedures than for diagnoses, with a few exceptions for emergency readmission within 28 days. Among the three variable selection strategies, the automated procedure performed similarly to the manual method in almost all cases except low-risk groups with few outcome events. For the rapid generation of multiple case-mix models we suggest applying automated modelling to reduce the time required, particularly when examining different outcomes across large numbers of procedures and diseases in routinely collected administrative health data.

Fig. 1



Funding

MHJ, AB, PA, and the Dr Foster Unit at Imperial are principally funded via a research grant from Dr Foster Intelligence, an independent healthcare information company. The Unit is affiliated with the Imperial Centre for Patient Safety and Service Quality at Imperial College Healthcare NHS Trust, which is funded by the National Institute for Health Research, and with the Centre for Infection Prevention and Management, funded by the UK Clinical Research Collaboration. The Department of Primary Care & Public Health at Imperial College is grateful for support from the NIHR Biomedical Research Centre scheme and the NIHR Collaboration for Leadership in Applied Health Research & Care (CLAHRC) scheme.

Ethics

We have permission from the NIGB under Section 251 of the NHS Act 2006 (formerly Section 60 approval from the Patient Information Advisory Group) to hold confidential data and analyse them for research purposes. Consent was given on behalf of patients, since individual consent is considered unfeasible for national data. We have ethical approval from the South East Research Ethics Committee.

Author information

Corresponding author

Correspondence to Min-Hua Jen.

Appendix


Table 5 List of diagnosis groups

About this article

Cite this article

Jen, MH., Bottle, A., Kirkwood, G. et al. The performance of automated case-mix adjustment regression model building methods in a health outcome prediction setting. Health Care Manag Sci 14, 267–278 (2011). https://doi.org/10.1007/s10729-011-9159-6

