Comparative Study of Classification Techniques for Breast Cancer Diagnosis
DOI:
https://doi.org/10.26438/ijcse/v7i1.234240
Keywords:
Classification Techniques, Feature Selection, k-Nearest Neighbour (KNN), Support Vector Machine (SVM), Naïve Bayes (NB), Decision Tree (DT), Bayesian Network (BN), WEKA
Abstract
Classification techniques in machine learning are applied to datasets. This work uses two breast cancer datasets collected from the UCI Machine Learning Repository, which differ in the number of features they contain. The paper presents the implementation and a comparative study of five popular classification techniques (Decision Tree, k-Nearest Neighbour, Support Vector Machine, Bayesian Network and Naïve Bayes) in the WEKA environment, comparing their accuracy on the basis of performance metrics. The results indicate that the Bayesian Network gives the best accuracy on the dataset with fewer features, while the Support Vector Machine gives the best accuracy on the dataset with more features.
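The accuracy-based comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' WEKA workflow: the class labels ("malignant", "benign"), the classifier names and the toy predictions are all hypothetical, and the function names are chosen here for illustration only. It shows how accuracy, sensitivity and specificity follow from confusion-matrix counts and how the most accurate classifier would be picked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # positive cases predicted positive
    fn: int  # positive cases predicted negative
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative

    def accuracy(self) -> float:
        total = self.tp + self.fn + self.fp + self.tn
        return (self.tp + self.tn) / total if total else 0.0

    def sensitivity(self) -> float:
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    def specificity(self) -> float:
        denom = self.tn + self.fp
        return self.tn / denom if denom else 0.0


def count_outcomes(y_true, y_pred, positive="malignant"):
    """Tally confusion-matrix counts; `positive` is an assumed label name."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive:
                fp += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fn, fp, tn)


def best_by_accuracy(y_true, predictions):
    """Return (name of the most accurate classifier, accuracy per classifier)."""
    scores = {
        name: count_outcomes(y_true, y_pred).accuracy()
        for name, y_pred in predictions.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Hypothetical labels and predictions, not results from the paper.
    truth = ["malignant", "benign", "benign", "malignant", "benign", "benign"]
    preds = {
        "classifier_a": ["malignant", "benign", "malignant", "benign", "benign", "benign"],
        "classifier_b": ["malignant", "benign", "benign", "malignant", "benign", "malignant"],
    }
    print(best_by_accuracy(truth, preds))
```

In a real comparison the predictions would come from cross-validated runs of each classifier on each dataset, and accuracy alone may be misleading when the classes are imbalanced, which is why sensitivity and specificity are reported alongside it.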
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
