Improving Generalization in Sentiment Analysis of Twitter Data with Logistic Regression Model

Authors

  • Kavinder Singh, Department of AIT-CSE, Chandigarh University, Punjab, India
  • Syed Mehdi Abbas Razavi, Department of AIT-CSE, Chandigarh University, Punjab, India
  • Sneh Sagar Subedi, Department of AIT-CSE, Chandigarh University, Punjab, India
  • Akshay Kumar, Department of AIT-CSE, Chandigarh University, Punjab, India
  • Gurwinder Singh, Department of AIT-CSE, Chandigarh University, Punjab, India, https://orcid.org/0000-0002-8378-5637

Keywords:

Sentiment analysis, Opinion mining, Natural language processing, Twitter data

Abstract

Sentiment analysis, commonly referred to as opinion mining, is an important problem in natural language processing that involves determining the sentiment expressed in a text. The rapid expansion of social media platforms has drawn considerable attention to sentiment analysis of Twitter data. This paper proposes a sentiment analysis system based on logistic regression, a widely used machine learning method for binary classification. The system begins by gathering and preprocessing a large Twitter dataset in which each tweet is labelled as positive or negative. The text is cleaned by removing noise, stop-words, and irrelevant information. Tokenization and vectorization then convert the text into a numerical representation suitable for logistic regression. The logistic regression model is trained on the labelled dataset, with its parameters estimated by a suitable optimization method. The model is evaluated using cross-validation and performance metrics including accuracy, precision, recall, and F1-score. The system aims for high accuracy and reliable generalization on sentiment analysis tasks.
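
The pipeline described in the abstract (cleaning, tokenization, vectorization, logistic regression training, and metric computation) can be illustrated with a minimal sketch in plain Python. The abstract does not specify the stop-word list, the vectorizer, the optimizer, or any hyperparameters, so everything concrete below is an assumption made for illustration: the tiny stop-word set, bag-of-words counts, batch gradient descent with an L2 penalty, the values of lr, epochs and l2, and the six toy tweets. It is not the authors' implementation.

```python
import math
import re

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "to", "and", "of", "i", "so"}


def clean_and_tokenize(tweet):
    """Lowercase, strip URLs, @mentions and non-letters, then drop stop-words."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # URLs and @mentions
    text = re.sub(r"[^a-z\s]", " ", text)           # punctuation, digits, '#'
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def build_vocabulary(token_lists):
    """Assign a column index to every distinct training token."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def vectorize(tokens, vocab):
    """Bag-of-words count vector; tokens outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for tok in tokens:
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def score(x, w, b):
    """Linear score w.x + b for one feature vector."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train(X, y, lr=0.5, epochs=200, l2=0.01):
    """Batch gradient descent on the L2-regularised log-loss (assumed optimizer)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(score(xi, w, b)) - yi  # predicted probability minus label
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Label 1 (positive) when the predicted probability is at least 0.5."""
    return [1 if sigmoid(score(x, w, b)) >= 0.5 else 0 for x in X]


def evaluate(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy tweets (1 = positive, 0 = negative), for illustration only.
tweets = [
    "I love this phone, it is great!",
    "Great day, so happy :)",
    "Love the new update @dev https://t.co/x",
    "I hate this phone, it is awful",
    "Awful day, so sad",
    "Hate the new update",
]
labels = [1, 1, 1, 0, 0, 0]

token_lists = [clean_and_tokenize(t) for t in tweets]
vocab = build_vocabulary(token_lists)
X = [vectorize(tokens, vocab) for tokens in token_lists]
w, b = train(X, labels)
print(evaluate(labels, predict(X, w, b)))
```

The sketch scores the model on its own training tweets, which says nothing about generalization. A realistic evaluation would repeat the vocabulary building and training steps on each training fold of a k-fold split and compute the metrics on the held-out fold, as the cross-validation mentioned in the abstract implies.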

Published

2026-01-19

How to Cite

[1]
K. Singh, S. M. A. Razavi, S. S. Subedi, A. Kumar, and G. Singh, “Improving Generalization in Sentiment Analysis of Twitter Data with Logistic Regression Model”, Int. J. Comp. Sci. Eng., vol. 11, no. 1, pp. 201–207, Jan. 2026.