Multi-Class Cancer Classification Using Dimensionally-Reduced Breast Cancer Data

Authors

  • Jency Gracy Bai A, Dept. of Computer Science and Engineering, Coimbatore Institute of Technology, Coimbatore, Tamil Nadu, India
  • Lathikaa Sri M, Dept. of Computer Science and Engineering, Coimbatore Institute of Technology, Coimbatore, Tamil Nadu, India
  • Jayalakshmi M, Dept. of Computer Science and Engineering, Coimbatore Institute of Technology, Coimbatore, Tamil Nadu, India
  • Harinii M, Dept. of Computer Science and Engineering, Coimbatore Institute of Technology, Coimbatore, Tamil Nadu, India
  • K Amshakala, Dept. of Computer Science and Engineering, Coimbatore Institute of Technology, Coimbatore, Tamil Nadu, India

DOI:

https://doi.org/10.26438/ijcse/v8i5.6169

Abstract

Breast cancer is an uncontrolled growth of breast cells. It is the most common invasive cancer in women and the second leading cause of cancer death in women after lung cancer. The cancer starts in the breast and can spread to other parts of the body. People are unable to identify the disease before it becomes dangerous, yet it can be cured if it is identified at an earlier stage. Awareness of breast cancer, public attentiveness, and advances in breast imaging have had a positive impact on the identification and screening of breast cancer. Interpretations of tumor images taken from patients are stored in datasets. This study uses Principal Component Analysis (PCA), a feature extraction method, to pre-process the data and extract the most relevant features. Several classifiers, namely K-Nearest Neighbour (KNN), Naïve Bayes (NB), Linear Support Vector Machine (L-SVM), Gaussian Kernel Support Vector Machine (K-SVM), and Logistic Regression (LR), are used to build machine learning models; among these, L-SVM gives better accuracy than the others. The proposed system therefore uses L-SVM to perform staging. The objective of the project is to carry out dimensionality reduction on cancer datasets and to build a predictive model for multi-class cancer stage classification using a linear kernel SVM classifier.
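
The workflow described in the abstract (reduce dimensionality with PCA, then compare KNN, NB, L-SVM, K-SVM, and LR) can be sketched with scikit-learn roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the dataset, the number of principal components, or any hyperparameters, so the synthetic three-class data, `N_COMPONENTS = 10`, the feature scaling step, and the 5-fold cross-validation are all placeholders chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in data. Three classes play the role of
# cancer stages; this is NOT the dataset used in the paper.
X, y = make_classification(
    n_samples=600,
    n_features=30,
    n_informative=10,
    n_redundant=10,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

# ASSUMPTION: the number of retained components is not given in the abstract.
N_COMPONENTS = 10

# The five classifier families named in the abstract, with default-like
# hyperparameters chosen only for illustration.
classifiers = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "NB": GaussianNB(),
    "L-SVM": SVC(kernel="linear"),
    "K-SVM": SVC(kernel="rbf"),  # Gaussian (RBF) kernel
    "LR": LogisticRegression(max_iter=1000),
}


def evaluate(clf):
    """Return mean 5-fold cross-validated accuracy of scale -> PCA -> clf."""
    pipeline = make_pipeline(
        StandardScaler(),  # PCA is sensitive to feature scale
        PCA(n_components=N_COMPONENTS),
        clf,
    )
    return cross_val_score(pipeline, X, y, cv=5).mean()


scores = {name: evaluate(clf) for name, clf in classifiers.items()}

# Print classifiers from highest to lowest mean accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:6s} mean CV accuracy: {acc:.3f}")
```

Wrapping the scaler and PCA in a pipeline means both are refit on the training folds only, so the held-out fold does not leak into the projection. The ranking printed for synthetic data says nothing about the ranking reported in the paper.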

Published

2020-05-31

How to Cite

[1]
J. G. A. Bai, L. S. M, J. M, H. M, and K. Amshakala, “Multi-Class Cancer Classification Using Dimensionally-Reduced Breast Cancer Data”, Int. J. Comp. Sci. Eng., vol. 8, no. 5, pp. 61–69, May 2020.

Section

Research Article