Published in: Journal of Medical Systems 9/2014

01.09.2014 | Systems-Level Quality Improvement

An Intelligent System for Lung Cancer Diagnosis Using a New Genetic Algorithm Based Feature Selection Method

Authors: Chunhong Lu, Zhaomin Zhu, Xiaofeng Gu

Abstract

In this paper, we develop a novel feature selection algorithm based on the genetic algorithm (GA) and a specifically devised trace-based separability criterion. Using scores of class separability and variable separability, this criterion measures the significance of a feature subset independently of any specific classifier. In addition, a mutual information matrix between variables is used as the features for classification, and no prior knowledge of the cardinality of the feature subset is required. Experiments are performed on a standard lung cancer dataset. The obtained solutions are verified with three different classifiers, namely the support vector machine (SVM), the back-propagation neural network (BPNN), and the K-nearest neighbor (KNN) classifier, and are compared with those obtained using the whole feature set, the F-score method, and the correlation-based feature selection method. The comparison shows that the proposed intelligent system achieves good diagnostic performance and is a promising tool for lung cancer diagnosis.
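The abstract does not state the criterion's formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the classical scatter-matrix score trace(Sw^-1 Sb) as the class separability term and swaps in a simple per-feature penalty where the paper uses a variable separability score. All names and parameter values (`trace_separability`, `ga_select`, `penalty`, `ridge`, the GA settings) are hypothetical. Candidate subsets are encoded as binary masks, so the subset size does not have to be fixed in advance.

```python
import numpy as np


def trace_separability(X, y, mask, ridge=1e-6):
    """Classical criterion trace(Sw^-1 Sb) on the selected columns.

    Assumption: this stands in for the paper's trace-based criterion,
    whose exact form is not given in the abstract.
    """
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return 0.0
    Xs = X[:, idx]
    mean_all = Xs.mean(axis=0)
    d = idx.size
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = Xs[y == c]
        mc = Xc.mean(axis=0)
        centered = Xc - mc
        Sw += centered.T @ centered
        diff = (mc - mean_all)[:, None]
        Sb += Xc.shape[0] * (diff @ diff.T)
    Sw += ridge * np.eye(d)  # small ridge keeps Sw invertible
    return float(np.trace(np.linalg.solve(Sw, Sb)))


def ga_select(X, y, pop_size=30, generations=40, p_mut=0.05,
              penalty=0.05, seed=0):
    """Search binary feature masks with a simple generational GA."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    pop = rng.random((pop_size, n_feat)) < 0.5  # random initial masks

    def fitness(mask):
        # Hypothetical size penalty in place of variable separability.
        return trace_separability(X, y, mask) - penalty * mask.sum()

    best_mask, best_fit = None, -np.inf
    for _ in range(generations):
        fits = np.array([fitness(m) for m in pop])
        i = int(fits.argmax())
        if fits[i] > best_fit:
            best_fit, best_mask = fits[i], pop[i].copy()
        # Binary tournament selection.
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = pop[np.where(fits[a] >= fits[b], a, b)]
        # Uniform crossover between each parent and its neighbour.
        partners = np.roll(parents, 1, axis=0)
        take = rng.random((pop_size, n_feat)) < 0.5
        children = np.where(take, parents, partners)
        # Bit-flip mutation, then elitism: keep the best mask found so far.
        pop = children ^ (rng.random((pop_size, n_feat)) < p_mut)
        pop[0] = best_mask
    return best_mask, best_fit
```

Calling `ga_select(X, y)` with a samples-by-features array `X` and a label vector `y` returns a Boolean mask and its score. Because the score uses only the data and labels, the selected columns can then be passed to any classifier, such as an SVM, a BPNN, or KNN.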
Metadata
Title: An Intelligent System for Lung Cancer Diagnosis Using a New Genetic Algorithm Based Feature Selection Method
Authors: Chunhong Lu, Zhaomin Zhu, Xiaofeng Gu
Publication date: 01.09.2014
Publisher: Springer US
Published in: Journal of Medical Systems, Issue 9/2014
Print ISSN: 0148-5598
Electronic ISSN: 1573-689X
DOI: https://doi.org/10.1007/s10916-014-0097-y
