A Comparative Study of Spam Detection in Social Networks Using Bayesian Classifier and Correlation Based Feature Subset Selection
Keywords:
Bayesian Classifier, Feature Subset Selection, Naïve Bayesian Classifier, Correlation Based FSS, Info Gain, K-cross validation, Spam, Non-Spam
Abstract
This article gives an overview of several popular machine learning methods (the Naïve Bayesian classifier, Naïve Bayesian with k-fold cross-validation, Naïve Bayesian with information gain, Bayesian classification, and the Bayesian network with correlation-based feature subset selection) and of their applicability to the problem of spam filtering. Brief descriptions of the algorithms are presented, written to be understandable by a reader not previously familiar with them. Classification and clustering techniques in data mining are useful for a wide variety of real-time applications that deal with large amounts of data. Application areas of data mining include text classification, medical diagnosis, and intrusion detection systems. The Naïve Bayesian classifier is based on Bayes' theorem and is particularly suited to problems where the dimensionality of the inputs is high. Despite its simplicity, the Naïve Bayesian classifier can often outperform more sophisticated classification methods. The approach is called "naïve" because it assumes that the attribute values are independent of one another given the class. Naïve Bayesian classification can be viewed as both a descriptive and a predictive type of algorithm: the estimated probabilities are descriptive, and they are used to predict the class membership of previously unseen data.
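To make the independence assumption concrete, the following is a minimal sketch of a Naïve Bayesian text classifier, not the implementation used in the article. It assumes a multinomial word-count model with add-one (Laplace) smoothing; the function names `train_naive_bayes` and `predict` and the four toy messages are hypothetical and chosen only for illustration. The class score is the log prior plus the sum of per-word log likelihoods, which is exactly where the independence of attributes given the class is used.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class.

    docs is a list of token lists; labels is a parallel list of class names.
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(tokens, class_counts, word_counts, vocab):
    """Return the class with the highest log posterior score."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # Log prior P(class).
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing (an assumption of this sketch) avoids log(0).
            likelihood = (word_counts[label][tok] + 1) / (total_words + len(vocab))
            # Independence assumption: per-word log likelihoods simply add up.
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, for illustration only.
toy_docs = [
    ["win", "free", "prize"],
    ["free", "offer", "click"],
    ["meeting", "at", "noon"],
    ["lunch", "at", "noon"],
]
toy_labels = ["spam", "spam", "non-spam", "non-spam"]

model = train_naive_bayes(toy_docs, toy_labels)
print(predict(["free", "prize", "click"], *model))
print(predict(["meeting", "at", "noon"], *model))
```

The counts collected in training are the descriptive part of the model; `predict` is the predictive part, applying those counts to a message that was not in the training set.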
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
