Spam Detection Approach Using Modified Pre-processing With NLP

Authors

  • Choudhary N, Department of Computer Science Engineering, GNCSGI, Jabalpur, MP, India
  • Dubey N, Department of Computer Science Engineering, GNCSGI, Jabalpur, MP, India

Keywords:

Spam detection, email, NLP, spam classification

Abstract

The growth in email has led to an unprecedented increase in illegitimate mail, or spam (49.7% of emails sent are spam), because current spam detection methods lack an accurate spam classifier. We are encouraged by the decline in the volume of email spam, but it raises the question of whether the email spam business is dying and will continue to decline. Besides the change in volume, we also consider the quality of email spam and its impact, which may constitute a new trend in the email spam business. For instance, spammers may send spam in more sophisticated ways, using spoofed email addresses and changing email relay servers, and such spam may slip past spam filters. This motivated us to investigate the evolution of email spam using techniques such as topic modelling and network analysis. We seek to identify the real trend of the email spam business from email content, meta information such as headers, and the sender-to-receiver network over a long period of time.

Published

2025-11-25

How to Cite

[1] N. Choudhary and N. Dubey, “Spam Detection Approach Using Modified Pre-processing With NLP”, Int. J. Comp. Sci. Eng., vol. 7, no. 10, pp. 158–161, Nov. 2025.