Dimensionality Reduction and Comparison of Classification Models for Breast Cancer Prognosis

Authors

  • R Garg Computer Science, Guru Nanak College, Moga, Panjab University Chandigarh, India
  • V Mongia Computer Science, Guru Nanak College, Moga, Panjab University Chandigarh, India

DOI:

https://doi.org/10.26438/ijcse/v6i1.308312

Keywords:

Data Mining, Breast Cancer, Bayesian, SVM, Decision Tree, Regression Model

Abstract

Cancer is one of the most prevalent health problems in society today, and breast cancer in particular is a major problem among women: one in three cancer cases is breast cancer. Many factors affect the disease. These factors, along with the patient's symptoms, can be recorded using hardware and software. Thanks to advances in technology, patient data is now routinely recorded and processed using analytical methods. Data mining provides various methods to process this data effectively and efficiently. The processed data can be very useful for the earlier detection of disease, and earlier detection of symptoms can help save a patient's life. In our research, the original breast cancer data set from Wisconsin is used. This data set has 10 attributes and 699 instances. In this study, a comparative model has been developed that compares the performance of various data mining techniques on the data set. The study finds that BayesNet is the most accurate classifier for predicting cancer survivability in the patient, while KStar is the fastest algorithm, requiring the least computation time for classification. In the next step, dimensionality reduction using gain ratio is performed to identify the most dominant factors associated with breast cancer.
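
As an illustration of the gain ratio criterion mentioned above, the following is a minimal Python sketch and not the authors' implementation. It assumes discrete attribute values and uses the standard definition (information gain divided by split information). The function names, attribute names, and toy records are hypothetical and are not taken from the Wisconsin data set.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one discrete attribute with respect to the class labels."""
    total = len(labels)
    # Group the class labels by the attribute value they co-occur with.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected class entropy after splitting on the attribute.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information is the entropy of the attribute's own distribution.
    split_info = entropy(values)
    # A constant attribute has zero split information; score it as 0.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_attributes(columns, labels):
    """Return (name, gain ratio) pairs sorted from most to least informative."""
    scores = {name: gain_ratio(col, labels) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: four records, two made-up attributes.
toy_labels = ["benign", "benign", "malignant", "malignant"]
toy_columns = {
    "attr_a": [1, 1, 2, 2],  # separates the two classes perfectly
    "attr_b": [1, 2, 1, 2],  # carries no information about the class
}

if __name__ == "__main__":
    for name, score in rank_attributes(toy_columns, toy_labels):
        print(f"{name}: {score:.3f}")
```

Attributes with the lowest scores would be the candidates for removal in a dimensionality reduction step. Any cut-off threshold is a design choice that the abstract does not specify.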

Published

2025-11-12

How to Cite

[1]
R. Garg and V. Mongia, “Dimensionality Reduction and Comparison of Classification Models for Breast Cancer Prognosis”, Int. J. Comp. Sci. Eng., vol. 6, no. 1, pp. 308–313, Nov. 2025.

Section

Research Article