An Overview of Ontology Based Text Document Clustering Algorithms
Keywords:
Term-Clustering, k-means, Single-Linkage, DBSCAN, Self-Organizing Maps, F1 Measure
Abstract
Text document clustering is an important activity in data mining. It emerged from text retrieval and has important applications in information retrieval and knowledge management systems. Clustering can help solve many problems associated with real-time applications, for example in commerce, biotechnology, geography, the banking sector, the insurance industry, and the Internet. Hence, it is important to know the different ways available to implement clustering. In a text-based clustering approach that uses the title of a document, the title alone determines the cluster to which the document belongs. This does not give good results, because the same document may be saved under two different names: since the content of both copies is the same, they are expected to fall into the same cluster, but depending on their names they may be assigned to two different clusters. A newer approach, semantic-based text clustering [1], addresses this by parsing the entire document and clustering it according to its content. Ontology-based text clustering [2] is one way to implement semantic-based clustering. In this paper we discuss different ontology-based algorithms such as k-means, DBSCAN, and SOM.
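The renamed-document problem described above can be illustrated with a minimal Python sketch. It is not the method of the paper: it compares plain bag-of-words term-frequency vectors with cosine similarity, with no ontology involved, and the document titles and contents are hypothetical examples invented for illustration.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between the term-frequency vectors of two strings."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical (title, content) pairs. The first two documents have the
# same content but were saved under different names.
docs = [
    ("report_final", "clustering groups similar text documents together"),
    ("notes_v2", "clustering groups similar text documents together"),
    ("budget", "quarterly insurance premiums and banking fees"),
]

# Title-based comparison: the renamed copies share no title tokens.
title_sim = cosine(docs[0][0], docs[1][0])

# Content-based comparison: the renamed copies are identical in content,
# while the third document shares no terms with the first.
content_sim = cosine(docs[0][1], docs[1][1])
unrelated_sim = cosine(docs[0][1], docs[2][1])

print(title_sim, content_sim, unrelated_sim)
```

Under these assumptions, a title-based measure treats the two renamed copies as unrelated, while a content-based measure places them together. An ontology-based method would go further and relate different terms that refer to the same concept, which this sketch does not attempt.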
References
“Ontology-based Semantic Clustering”, Ph.D. thesis by Sanroma, supervised by Dr. Aida Valls and Dr. Karina Gibert, Department of Computer Science and Mathematics, Tarragona.
“Ontology-based Text Document Clustering” by Andreas Hotho, Alexander Maedche, and Steffen Staab, Institute AIFB, University of Karlsruhe, 76128 Karlsruhe, Germany.
“Ontology-based Text Clustering” by A. Hotho, S. Staab, and A. Maedche.
“Survey of Clustering Algorithms” by Rui Xu, Student Member, IEEE, and Donald Wunsch II, Fellow, IEEE.
“Support Vector Clustering” by Asa Ben-Hur et al.; edited by Nello Cristianini, John Shawe-Taylor, and Bob Williamson.
“Modern Information Retrieval”, a book by Baeza-Yates and Ribeiro-Neto.
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
