Published in: International Journal of Computer Assisted Radiology and Surgery 11/2016

1 November 2016 | Original Article

Analysis of k-means clustering approach on the breast cancer Wisconsin dataset

Authors: Ashutosh Kumar Dubey, Umesh Gupta, Sonal Jain

Abstract

Purpose

Breast cancer is one of the most common cancers worldwide and is most frequently found in women. Early detection improves the chance of a cure; therefore, many studies are under way to identify methods that can detect breast cancer in its early stages. This study aimed to determine the effects of the k-means clustering algorithm under different computational measures (centroid, distance, split method, epoch, attribute, and iteration) and to identify the combination of measures with the potential for high clustering accuracy.

Methods

The k-means algorithm was used to evaluate the impact of centroid initialization, distance measures, and split methods on clustering. The experiments were performed on the breast cancer Wisconsin (BCW) diagnostic dataset. Foggy and random centroids were used for centroid initialization. With the foggy centroid, the first centroid was calculated from random values. With the random centroid, the initial centroid was taken to be (0, 0).
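The role of the initial centroids can be illustrated with a minimal k-means sketch. This is not the authors' implementation: the abstract does not specify the foggy or random procedures in detail, so the function below simply accepts caller-supplied starting centroids, uses Euclidean distance, and runs on hypothetical toy points rather than BCW data. The names `kmeans`, `euclidean`, and `max_iter` are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, centroids, max_iter=10):
    """Plain k-means starting from caller-supplied initial centroids.

    Illustrative sketch only; the initialization scheme is left to the
    caller because the abstract does not define it precisely.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # centroids unchanged, so the assignment is stable
        centroids = new_centroids
    return centroids, labels


# Hypothetical toy data: two well-separated groups in two dimensions.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
# One starting centroid at (0, 0), echoing the value mentioned in the abstract;
# the second starting centroid is an arbitrary choice for this example.
centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

Passing the starting centroids as an argument keeps the sketch neutral about how they are produced, so different initialization schemes can be compared by changing only the call.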

Results

The results were obtained with the k-means algorithm and are discussed for different cases with variable parameters. The calculations were based on the centroid (foggy/random), distance (Euclidean/Manhattan/Pearson), split (simple/variance), threshold (constant epoch/same centroid), attribute (2–9), and iteration (4–10). An average positive prediction accuracy of approximately 92% was obtained with this approach. Better results were found for the same centroid and the highest variance. The results achieved with the Euclidean and Manhattan distances were better than those achieved with the Pearson correlation.
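The three distance measures and the two stopping thresholds named above can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the Pearson measure is turned into a dissimilarity as 1 − r, which is one common convention and may differ from what the authors used, and `should_stop` with its `max_epochs` parameter is a hypothetical helper reflecting one reading of "constant epoch" versus "same centroid".

```python
import math


def euclidean(a, b):
    """Square root of the summed squared attribute differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of the absolute attribute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pearson_distance(a, b):
    """1 - Pearson correlation between two attribute vectors (assumed form)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Correlation is undefined for a constant vector; returning 1.0
        # (treating the pair as uncorrelated) is an assumption of this sketch.
        return 1.0
    return 1.0 - cov / math.sqrt(var_a * var_b)


def should_stop(old_centroids, new_centroids, epoch, max_epochs=None):
    """Hypothetical stopping rule covering both thresholds.

    With max_epochs set, stop after a fixed number of epochs ("constant
    epoch"); otherwise stop once the centroids no longer change ("same
    centroid").
    """
    if max_epochs is not None:
        return epoch >= max_epochs
    return old_centroids == new_centroids


# Hypothetical attribute vectors: b is an exact multiple of a.
a = (1.0, 2.0, 3.0)
b = (2.0, 4.0, 6.0)
d_euclidean = euclidean(a, b)
d_manhattan = manhattan(a, b)
d_pearson = pearson_distance(a, b)
```

The example pair shows why the measures can disagree: `a` and `b` are clearly apart under the Euclidean and Manhattan distances, yet perfectly correlated, so the correlation-based dissimilarity treats them as identical.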

Conclusions

The findings of this work provide an extensive understanding of the computational parameters that can be used with k-means. The results indicate that k-means has the potential to classify the BCW dataset.
Metadata
Title
Analysis of k-means clustering approach on the breast cancer Wisconsin dataset
Authors
Ashutosh Kumar Dubey
Umesh Gupta
Sonal Jain
Publication date
1 November 2016
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Computer Assisted Radiology and Surgery / Issue 11/2016
Print ISSN: 1861-6410
Electronic ISSN: 1861-6429
DOI
https://doi.org/10.1007/s11548-016-1437-9
