Dimensionality Reduction and Comparison of Classification Models for Breast Cancer Prognosis
DOI: https://doi.org/10.26438/ijcse/v6i1.308312
Keywords: Data Mining, Breast Cancer, Bayesian, SVM, Decision Tree, Regression Model
Abstract
Cancer is among the most prevalent health problems in society today, and breast cancer in particular is a major problem for women: one in three cancer cases is breast cancer. Many factors affect the disease. These factors, along with the patient's symptoms, can be recorded using hardware and software. With advances in technology, patient data is now recorded and processed using analytical methods. Data mining provides various methods to process this data effectively and efficiently, and the processed data can be very useful for earlier detection of disease, which in turn can help save a patient's life. In this research, the original breast cancer data set from Wisconsin, which has 10 attributes and 699 instances, is used. A comparative model is developed that compares the performance of various data mining techniques on this data set. The study finds that BayesNet is the best classifier for correctly predicting cancer survivability in patients, and that KStar is the fastest algorithm, taking the lowest computation time for classification. In the next step, dimensionality reduction using gain ratio is performed to identify the most dominant factors causing breast cancer.
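The gain ratio ranking mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation or tooling, so what follows is only the standard textbook definition (information gain divided by split information) applied to discrete attributes. The function names, attribute names, and toy records are hypothetical and are not taken from the Wisconsin data set.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of one discrete attribute with respect to the class labels."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: class entropy minus the weighted entropy after the split.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    # An attribute with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_attributes(rows, labels, names):
    """Return (name, gain ratio) pairs sorted from most to least informative."""
    scores = [
        (name, gain_ratio([row[i] for row in rows], labels))
        for i, name in enumerate(names)
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Hypothetical toy records, assumed for illustration only.
names = ["attr_a", "attr_b"]
rows = [(1, 1), (1, 2), (2, 1), (2, 2)]
labels = ["benign", "benign", "malignant", "malignant"]
ranking = rank_attributes(rows, labels, names)
```

In this toy example `attr_a` separates the two classes perfectly while `attr_b` carries no class information, so `attr_a` ranks first. For dimensionality reduction, one would keep only the top-ranked attributes before training the classifiers; how many to keep is a design choice that the abstract does not specify.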
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
