Published in: European Journal of Epidemiology 10/2012

01.10.2012 | METHODS

Assessing the discriminative ability of risk models for more than two outcome categories

Authors: Ben Van Calster, Yvonne Vergouwe, Caspar W. N. Looman, Vanya Van Belle, Dirk Timmerman, Ewout W. Steyerberg

Abstract

The discriminative ability of risk models for dichotomous outcomes is often evaluated with the concordance index (c-index). However, many medical prediction problems are polytomous, meaning that more than two outcome categories need to be predicted. Unfortunately, such problems are often dichotomized in prediction research. We present a perspective on the evaluation of the discriminative ability of polytomous risk models, which may encourage researchers to consider polytomous prediction models more often. First, we suggest a “discrimination plot” as a tool to visualize the model’s discriminative ability. Second, we discuss the use of one overall polytomous c-index versus a set of dichotomous measures to summarize the performance of the model. Third, we address several aspects to consider when constructing a polytomous c-index. These involve the assessment of concordance in pairs versus sets of patients, weighting by outcome prevalence, the value related to models with random performance, the reduction to the dichotomous c-index for dichotomous problems, and interpretation. We illustrate these issues with case studies dealing with ovarian cancer (four outcome categories) and testicular cancer (three categories). We recommend the use of a discrimination plot together with an overall c-index such as the Polytomous Discrimination Index. If the overall c-index suggests that the model has relevant discriminative ability, pairwise c-indexes for each pair of outcome categories are informative. For pairwise c-indexes we recommend the ‘conditional-risk’ method, which is consistent with the analytical approach of the multinomial logistic regression used to develop polytomous risk models.
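The two summary measures recommended in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on stated assumptions: the conditional risk for a pair of categories i and j is taken to be p_i / (p_i + p_j) among patients whose outcome is i or j, and the Polytomous Discrimination Index (PDI) is taken to be the average, over categories, of the proportion of patient sets (one patient per category) in which the patient with outcome i has the highest predicted risk for i. The tie handling, the function names and the toy data are all hypothetical.

```python
from itertools import combinations, product


def c_index(pos_scores, neg_scores):
    """Dichotomous c-index: share of (pos, neg) pairs ranked correctly; ties count 1/2."""
    total = 0.0
    for sp in pos_scores:
        for sn in neg_scores:
            if sp > sn:
                total += 1.0
            elif sp == sn:
                total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


def pairwise_conditional_c(probs, outcomes, i, j):
    """Pairwise c-index for category i versus j (assumed 'conditional-risk' form)."""
    def cond(p):
        # Risk of i given that the outcome is either i or j (assumption).
        return p[i] / (p[i] + p[j])

    pos = [cond(p) for p, y in zip(probs, outcomes) if y == i]
    neg = [cond(p) for p, y in zip(probs, outcomes) if y == j]
    return c_index(pos, neg)


def pdi(probs, outcomes, n_categories):
    """Set-based overall index (assumed PDI form); returns (overall, per-category)."""
    by_cat = [[p for p, y in zip(probs, outcomes) if y == k]
              for k in range(n_categories)]
    per_category = []
    for i in range(n_categories):
        credit, n_sets = 0.0, 0
        # Each set holds one patient from every category, in category order.
        for patient_set in product(*by_cat):
            risks_i = [p[i] for p in patient_set]
            top = max(risks_i)
            if risks_i[i] == top:
                # Hypothetical tie rule: split the credit among tied patients.
                credit += 1.0 / risks_i.count(top)
            n_sets += 1
        per_category.append(credit / n_sets)
    return sum(per_category) / n_categories, per_category


# Toy predicted risks for three outcome categories (invented for illustration).
probs = [
    [0.7, 0.2, 0.1], [0.5, 0.3, 0.2],   # patients with outcome 0
    [0.3, 0.6, 0.1], [0.5, 0.3, 0.2],   # patients with outcome 1
    [0.1, 0.2, 0.7], [0.3, 0.3, 0.4],   # patients with outcome 2
]
outcomes = [0, 0, 1, 1, 2, 2]

overall, per_cat = pdi(probs, outcomes, 3)
pairwise = {(i, j): pairwise_conditional_c(probs, outcomes, i, j)
            for i, j in combinations(range(3), 2)}
print("PDI:", overall, "per category:", per_cat)
print("pairwise c-indexes:", pairwise)
```

Under these assumptions a model with random performance would score about 1/K on the set-based index for K categories, and for K = 2 the index coincides with the dichotomous c-index, which matches two of the aspects listed in the abstract.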
Metadata
Title
Assessing the discriminative ability of risk models for more than two outcome categories
Authors
Ben Van Calster
Yvonne Vergouwe
Caspar W. N. Looman
Vanya Van Belle
Dirk Timmerman
Ewout W. Steyerberg
Publication date
01.10.2012
Publisher
Springer Netherlands
Published in
European Journal of Epidemiology / Issue 10/2012
Print ISSN: 0393-2990
Electronic ISSN: 1573-7284
DOI
https://doi.org/10.1007/s10654-012-9733-3
