Comparison of Text Classification Algorithms of People Sentiments on Twitter (Case: Transjakarta)
DOI:
https://doi.org/10.26438/ijcse/v7i9.812
Keywords:
Naïve Bayes, K-Nearest Neighbor, Logistic Regression, F-measurement, Sentiment Analysis, Transjakarta
Abstract
Social media has become a common place for people to express what they think and feel. One widely discussed topic is consumer response to products and services, which helps companies gauge the level of satisfaction with what they offer. Twitter is one of the most widely used social media platforms, which makes its data an attractive resource for companies, particularly for customer relations. This study analyzes public sentiment towards the use of Transjakarta. Sentiments are divided into three classes: positive, neutral, and negative. The data consists of tweets collected from June to July 2019, split into 144 training tweets and 36 testing tweets. Three text classification algorithms are applied: naïve Bayes, k-nearest neighbor, and logistic regression. Their performance is then compared by finding the highest F-measurement value computed with the micro-average formula, which was chosen because it is considered the most suitable for imbalanced datasets. The results show that naïve Bayes has the best F-measurement at 0.861, followed by logistic regression at 0.833 and k-nearest neighbor at 0.806.
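The micro-averaged F-measurement used for the comparison can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `micro_f_measure` and the example labels are hypothetical, chosen only to show how the metric pools true positives, false positives, and false negatives across the three classes before computing precision and recall.

```python
from collections import Counter


def micro_f_measure(y_true, y_pred):
    """Micro-averaged F-measure for single-label, multi-class predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    # Micro averaging pools the counts over all classes first.
    tp_sum, fp_sum, fn_sum = sum(tp.values()), sum(fp.values()), sum(fn.values())
    precision = tp_sum / (tp_sum + fp_sum) if tp_sum + fp_sum else 0.0
    recall = tp_sum / (tp_sum + fn_sum) if tp_sum + fn_sum else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical, imbalanced test set of 36 tweets (NOT the study's data).
y_true = ["negative"] * 20 + ["neutral"] * 10 + ["positive"] * 6
y_pred = list(y_true)
# Introduce 5 illustrative misclassifications.
y_pred[0] = "neutral"
y_pred[1] = "neutral"
y_pred[20] = "negative"
y_pred[30] = "neutral"
y_pred[31] = "negative"

score = micro_f_measure(y_true, y_pred)
print(round(score, 3))  # 0.861
```

When every tweet receives exactly one label, each misclassification counts once as a false positive and once as a false negative, so micro-averaged precision, recall, and F-measurement all equal plain accuracy. On a 36-tweet test set, the reported values of 0.861, 0.833, and 0.806 are therefore consistent with 31, 30, and 29 correctly classified tweets, respectively.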
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
