Unsupervised Context-Based Probabilistic Text Classification
DOI:
https://doi.org/10.26438/ijcse/v9i12.914
Keywords:
Document Categorization, Keywords Extraction, Concept Learning, Multi-class Probabilistic Classification, Content Mining
Abstract
Text classification is one of the primary tasks in Natural Language Processing (NLP). Key phrase extraction is the fundamental component that maps documents to a set of representative phrases. For example, a category that includes IT documents can be described as “Information and Computer” or “Information and Technology”. If a text document includes keywords such as “issue” and “order”, then it is assigned to the “Issue Category”. Multiple pre-trained and deep learning approaches are now available for semantic analysis. Word embeddings are a predominant technique for measuring the semantic similarity between tokens or phrases using word vectors. The most widely used word embeddings include GloVe, Word2vec, and BERT. Experimental results show that the strategy proposed in this study achieves higher precision and is simpler than other methods.
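The idea of mapping a document to a category through key phrases and word-vector similarity can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the three-dimensional vectors are hand-written toy values, the key phrases attached to each category are assumptions based on the examples in the abstract, and the softmax step with its `temperature` parameter is one hypothetical way to turn similarities into class probabilities.

```python
import math

# Hypothetical 3-dimensional "embeddings", hand-written for illustration only.
# Real vectors would come from a pre-trained model such as GloVe or Word2vec.
TOY_VECTORS = {
    "issue":       [0.9, 0.1, 0.0],
    "order":       [0.8, 0.2, 0.1],
    "information": [0.1, 0.9, 0.2],
    "computer":    [0.0, 0.8, 0.3],
    "technology":  [0.1, 0.7, 0.4],
}

# Category names follow the abstract's examples; the key phrases
# assigned to each category are assumptions.
CATEGORIES = {
    "Issue Category": ["issue", "order"],
    "Information and Technology": ["information", "computer", "technology"],
}


def mean_vector(tokens):
    """Average the vectors of known tokens; return None if none are known."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return None
    return [sum(dim) / len(known) for dim in zip(*known)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(text, temperature=0.1):
    """Return a probability per category via softmax over cosine similarities."""
    doc_vec = mean_vector(text.lower().split())
    if doc_vec is None:
        return {}  # no known keyword in the document
    sims = {
        name: cosine(doc_vec, mean_vector(words))
        for name, words in CATEGORIES.items()
    }
    exps = {name: math.exp(s / temperature) for name, s in sims.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


scores = classify("customer raised an issue with the order")
best = max(scores, key=scores.get)
print(best, scores)
```

Because no labeled training data is involved, the sketch stays in the unsupervised setting the title describes; only the category key phrases and the embedding table need to be supplied.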
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
