Published in: Journal of Medical Systems 7/2016

1 July 2016 | Transactional Processing Systems

A Hybrid Data Mining Model to Predict Coronary Artery Disease Cases Using Non-Invasive Clinical Data

Authors: Luxmi Verma, Sangeet Srivastava, P. C. Negi

Abstract

Coronary artery disease (CAD) is caused by atherosclerosis in the coronary arteries and can result in cardiac arrest and heart attack. CAD is usually diagnosed by angiography, a costly, time-consuming, and highly technical invasive method. Researchers have therefore sought alternative methods, such as machine learning algorithms, that use noninvasive clinical data to diagnose the disease and assess its severity. In this study, we present a novel hybrid method for CAD diagnosis that includes risk factor identification using correlation-based feature subset (CFS) selection with a particle swarm optimization (PSO) search method and K-means clustering. Supervised learning algorithms, namely multi-layer perceptron (MLP), multinomial logistic regression (MLR), fuzzy unordered rule induction algorithm (FURIA), and C4.5, are then used to model CAD cases. We tested this approach on clinical data consisting of 26 features and 335 instances collected at the Department of Cardiology, Indira Gandhi Medical College, Shimla, India. MLR achieves the highest prediction accuracy, 88.4 %. We also tested the approach on the benchmark Cleveland heart disease data, where MLR again outperforms the other techniques. The proposed hybridized model improves the accuracy of the classification algorithms by 8.3 % to 11.4 % on the Cleveland data. The proposed method is therefore a promising tool for identifying CAD patients with improved prediction accuracy.
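The feature-selection step named in the abstract can be illustrated with a minimal sketch of the CFS merit score, which rewards subsets whose features correlate with the class but not with each other. This is not the authors' implementation: it assumes absolute Pearson correlation as the correlation measure, swaps the PSO search for a simple greedy forward search, and runs on synthetic data. The names `cfs_merit` and `greedy_cfs` are hypothetical.

```python
import numpy as np

def cfs_merit(X, y, subset):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff).

    r_cf is the mean |correlation| between each feature and the class,
    r_ff the mean |correlation| between distinct feature pairs.
    Assumption: Pearson correlation, and no constant columns in X.
    """
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        r_ff = 0.0
    else:
        pairs = [abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                 for i, a in enumerate(subset) for b in subset[i + 1:]]
        r_ff = float(np.mean(pairs))
    return float(k * r_cf / np.sqrt(k + k * (k - 1) * r_ff))

def greedy_cfs(X, y):
    """Greedy forward search (a stand-in for PSO, for illustration only):
    repeatedly add the feature that raises the merit most, and stop when
    no remaining feature improves it."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        score, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if score <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best

# Synthetic data (not the clinical data): feature 0 is informative,
# feature 1 is a near-duplicate of feature 0, feature 2 is pure noise.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, n).astype(float)
f0 = y + 0.3 * rng.standard_normal(n)
f1 = f0 + 0.05 * rng.standard_normal(n)
f2 = rng.standard_normal(n)
X = np.column_stack([f0, f1, f2])

selected, merit = greedy_cfs(X, y)
print("selected features:", selected, "merit:", round(merit, 3))
```

Because redundancy sits in the denominator, adding an exact copy of an already selected feature leaves the merit unchanged. Any search strategy, PSO included, can optimize the same score over feature subsets.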
Metadata
Title
A Hybrid Data Mining Model to Predict Coronary Artery Disease Cases Using Non-Invasive Clinical Data
Authors
Luxmi Verma
Sangeet Srivastava
P. C. Negi
Publication date
1 July 2016
Publisher
Springer US
Published in
Journal of Medical Systems / Issue 7/2016
Print ISSN: 0148-5598
Electronic ISSN: 1573-689X
DOI
https://doi.org/10.1007/s10916-016-0536-z
