RF-SEA-Based Feature Selection for Data Classification in Medical Domain

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 243)

Abstract

Dimensionality reduction is an essential problem in data analysis that has received significant attention from several disciplines. It comprises two families of methods: feature extraction and feature selection. In this paper, we introduce a simple method for supervised feature selection in data classification tasks. The proposed hybrid feature selection (HFS) mechanism, RF-SEA (ReliefF-Shapley ensemble analysis), combines filter and wrapper models for dimension reduction. In the first stage, a filter model ranks the features by their ReliefF (RF) scores between classes and then selects the features most relevant to the classes using a threshold. In the second stage, a Shapley ensemble algorithm evaluates the contribution of each feature in the ranked feature subset to the classification task. Principal component analysis (PCA) is carried out as a preprocessing step before both stages. Experiments on several medical datasets show that the proposed approach can detect completely irrelevant features and remove redundant features without significantly hurting the performance of the classification algorithm. The results also show that RF-SEA obtains better classification performance than either the Shapley-value-based method or the ReliefF (RF)-based method used alone.
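The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the Shapley ensemble algorithm, so the sketch assumes basic two-class Relief (a simplified stand-in for ReliefF), a threshold of 0.0, and exact Shapley values computed over a leave-one-out 1-nearest-neighbour accuracy. The PCA preprocessing step is omitted, and the toy data and all function names are hypothetical.

```python
from itertools import permutations

def relief_scores(X, y):
    """Basic Relief weights (assumption: simplified stand-in for ReliefF)."""
    n, d = len(X), len(X[0])
    # Per-feature ranges, used to normalise differences to [0, 1].
    spans = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]
    def dist(a, b):
        return sum(abs(a[j] - b[j]) / spans[j] for j in range(d))
    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        h = min(hits, key=lambda k: dist(X[i], X[k]))    # nearest same-class sample
        m = min(misses, key=lambda k: dist(X[i], X[k]))  # nearest other-class sample
        for j in range(d):
            # Reward features that separate classes, penalise within-class spread.
            w[j] += (abs(X[i][j] - X[m][j]) - abs(X[i][j] - X[h][j])) / spans[j] / n
    return w

def loo_accuracy(X, y, feats):
    """Leave-one-out 1-NN accuracy on the given feature indices."""
    if not feats:  # empty coalition: majority-class baseline
        return max(y.count(c) for c in set(y)) / len(y)
    correct = 0
    for i in range(len(X)):
        nn = min((k for k in range(len(X)) if k != i),
                 key=lambda k: sum((X[i][j] - X[k][j]) ** 2 for j in feats))
        correct += y[nn] == y[i]
    return correct / len(X)

def shapley_values(X, y, feats):
    """Exact Shapley value of each feature; feasible only for small subsets."""
    phi = {j: 0.0 for j in feats}
    orders = list(permutations(feats))
    for order in orders:
        chosen, prev = [], loo_accuracy(X, y, [])
        for j in order:
            chosen.append(j)
            cur = loo_accuracy(X, y, chosen)
            phi[j] += (cur - prev) / len(orders)  # average marginal contribution
            prev = cur
    return phi

# Hypothetical toy data: features 0 and 1 track the class, feature 2 is noise.
X = [[0.0, 0.1, 0.5], [0.1, 0.0, 0.9], [0.2, 0.2, 0.1],
     [0.9, 1.0, 0.6], [1.0, 0.8, 0.2], [0.8, 0.9, 0.8]]
y = [0, 0, 0, 1, 1, 1]

scores = relief_scores(X, y)                          # stage 1: filter ranking
kept = [j for j, s in enumerate(scores) if s > 0.0]   # assumed threshold of 0.0
phi = shapley_values(X, y, kept)                      # stage 2: contribution analysis
print(kept, phi)
```

Exact Shapley values require evaluating every ordering of the retained features, which is one reason a filter stage that shrinks the candidate set first is attractive; for larger subsets a sampled approximation would be needed.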

Corresponding author

Correspondence to S. Sasikala.

Copyright information

© 2014 Springer India

About this paper

Cite this paper

Sasikala, S., Appavu alias Balamurugan, S., Geetha, S. (2014). RF-SEA-Based Feature Selection for Data Classification in Medical Domain. In: Mohapatra, D.P., Patnaik, S. (eds) Intelligent Computing, Networking, and Informatics. Advances in Intelligent Systems and Computing, vol 243. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1665-0_59

  • DOI: https://doi.org/10.1007/978-81-322-1665-0_59

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-1664-3

  • Online ISBN: 978-81-322-1665-0

  • eBook Packages: Engineering, Engineering (R0)
