Comparative Study on Information Retrieval Approaches for Text Mining
Keywords:
Text Mining, Text Representation, Rule-based Phrase Extraction, Sequential Pattern Mining
Abstract
Text mining is the process of extracting information from unstructured text and representing it in structured form. The main challenge in text mining is to extract the information a user requires in an efficient manner. To perform this task, various data mining methods are used in which text documents are analyzed on the basis of terms, phrases, concepts, and patterns. This paper studies text representation methods and basic term weighting schemes. The rule-based phrase extraction method and the sequential pattern mining method are discussed as ways to improve system performance in finding relevant and interesting information.
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
