Published in: Journal of Medical Systems 2/2011

1 April 2011 | Original Paper

Using Data Mining Techniques in Monitoring Diabetes Care. The Simpler the Better?

Authors: Dario Gregori, Michele Petrinco, Simona Bo, Rosalba Rosato, Eva Pagano, Paola Berchialla, Franco Merletti


Abstract

We evaluate how data-mining statistical techniques can be applied to medical records and administrative data on diabetes, and how they differ in their ability to predict outcomes (e.g. death). The data cover 3,892 outpatients with a diagnosis of type 2 diabetes from the San Giovanni Battista Hospital in Torino. Six statistical classifiers were applied: Logistic Regression (LR), Generalized Additive Model (GAM), Projection Pursuit Regression (PPR), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), and Artificial Neural Networks (ANN). All models selected the same subset of covariates. ANN performed worst, whereas simpler models such as LR, GAM and LDA seem to perform better. GAM is associated with a very small misclassification rate. The agreement among models in predicting individual outcomes is 0.23 (Kappa, SE 0.06). Monitoring on the basis of patients' characteristics is highly dependent on the statistical properties of the chosen statistical model.
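The two summary measures named in the abstract, misclassification rate and kappa agreement among models, can be illustrated with a minimal sketch. The abstract does not state which kappa variant was used; the code below assumes Fleiss' kappa for several raters, treating each classifier as a rater. The function names and the toy predictions are hypothetical and are not taken from the study.

```python
from collections import Counter


def misclassification_rate(y_true, y_pred):
    """Fraction of cases whose predicted label differs from the true label."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def fleiss_kappa(ratings):
    """Fleiss' kappa (assumed variant, see lead-in).

    ratings: one row per patient; each row holds one predicted label
    per classifier. Every row must have the same number of labels.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for row in ratings for label in row})
    counts = [Counter(row) for row in ratings]

    # Observed agreement: share of agreeing rater pairs, averaged over patients.
    p_i = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall share of each category.
    p_j = [sum(c[cat] for c in counts) / (n_subjects * n_raters) for cat in categories]
    p_e = sum(p * p for p in p_j)

    if p_e == 1.0:
        return float("nan")  # only one category ever used: kappa is undefined
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = death predicted, 0 = survival predicted.
    y_true = [1, 0, 1, 1]
    lr_pred = [1, 1, 1, 0]
    print(misclassification_rate(y_true, lr_pred))

    # Rows are patients, columns are three hypothetical classifiers.
    predictions = [[1, 1, 0], [0, 0, 1], [1, 1, 1], [0, 0, 0]]
    print(fleiss_kappa(predictions))
```

A kappa near 1 means the classifiers almost always assign the same outcome to the same patient, and a value near 0 means they agree no more often than chance would suggest.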
Metadata
Title
Using Data Mining Techniques in Monitoring Diabetes Care. The Simpler the Better?
Authors
Dario Gregori
Michele Petrinco
Simona Bo
Rosalba Rosato
Eva Pagano
Paola Berchialla
Franco Merletti
Publication date
1 April 2011
Publisher
Springer US
Published in
Journal of Medical Systems / Issue 2/2011
Print ISSN: 0148-5598
Electronic ISSN: 1573-689X
DOI
https://doi.org/10.1007/s10916-009-9363-9