Published in: International Journal of Diabetes in Developing Countries 2/2016

18.04.2015 | Original Article

Comparison of various classification algorithms in the diagnosis of type 2 diabetes in Iran

Authors: Mahmoud Heydari, Mehdi Teimouri, Zainabolhoda Heshmati, Seyed Mohammad Alavinia

Abstract

In modern medicine, data on the symptoms of patients with various diseases are so extensive that no single person (doctor) can analyze and weigh all the factors. Hence the evident need for an intelligent system that considers the various factors and identifies a suitable model relating the different parameters. Data mining, as the foundation of such systems, has played a vital role in the advancement of the medical sciences, especially in the diagnosis of various diseases. Type 2 diabetes is one such disease: its prevalence has increased in recent years, and late diagnosis can lead to serious complications. In this paper, several data mining methods and algorithms are applied to a set of screening data for type 2 diabetes from Tabriz, Iran. The performance of support vector machine, artificial neural network, decision tree, nearest neighbors, and Bayesian network classifiers is compared in an effort to find the best algorithm for diagnosing this disease. The artificial neural network, with an accuracy of 97.44 %, performs best on the chosen dataset. The accuracies of the support vector machine, decision tree, 5-nearest neighbor, and Bayesian network are 81.19, 95.03, 90.85, and 91.60 %, respectively. The results of the simulations show that the effectiveness of a classification technique depends on the application as well as on the nature and complexity of the dataset used. Moreover, no single classification technique can be said to always perform best. Therefore, when data mining is used for the diagnosis or prediction of diseases, consultation with specialists is essential for selecting the number and type of dataset parameters so as to obtain the best possible results.
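To make the reported metric concrete, the following is a minimal sketch of a 5-nearest-neighbor classifier scored by accuracy on a held-out set. It is not the authors' implementation: the abstract does not state the software, the features, or the validation protocol used. The two features (fasting glucose and BMI), every data point, and the names `knn_predict` and `accuracy` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=5):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Sort (point, label) pairs by Euclidean distance to x and keep the k closest.
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Percentage of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


# Hypothetical toy data, NOT from the paper: (fasting glucose in mg/dL, BMI).
# Label 0 = non-diabetic, label 1 = diabetic.
train_X = [(85, 22), (90, 24), (95, 23), (88, 27), (100, 25), (92, 21),
           (140, 31), (155, 33), (130, 29), (160, 35), (145, 30), (150, 28)]
train_y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

test_X = [(93, 23), (148, 32), (98, 26), (135, 30)]
test_y = [0, 1, 0, 1]

predictions = [knn_predict(train_X, train_y, x, k=5) for x in test_X]
print(f"5-NN accuracy: {accuracy(test_y, predictions):.2f} %")
```

The features are left unscaled here for brevity. On real screening data, distance-based methods such as nearest neighbors are usually applied after normalizing the features, and accuracy is typically estimated with cross-validation rather than a single split.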
Metadata
Title
Comparison of various classification algorithms in the diagnosis of type 2 diabetes in Iran
Authors
Mahmoud Heydari
Mehdi Teimouri
Zainabolhoda Heshmati
Seyed Mohammad Alavinia
Publication date
18.04.2015
Publisher
Springer India
Published in
International Journal of Diabetes in Developing Countries / Issue 2/2016
Print ISSN: 0973-3930
Electronic ISSN: 1998-3832
DOI
https://doi.org/10.1007/s13410-015-0374-4