An Overview of Ontology Based Text Document Clustering Algorithms

Authors

  • Anuradha Awachar, Computer Department, PCCOE, Pune University, India
  • Rajashree Bairagi, Computer Department, PCCOE, Pune University, India
  • Vijayalaxmi Hegade, Computer Department, PCCOE, Pune University, India
  • Mahadev Khandagale, Computer Department, PCCOE, Pune University, India

Keywords:

Term-Clustering, k-means, Single-Linkage, DBSCAN, Self-Organizing Maps, F1-Measure

Abstract

Text document clustering is an important activity in data mining. It emerged from text retrieval and has important applications in information retrieval and knowledge management systems. Clustering can help solve many problems in real-world applications, for example in commerce, biotechnology, geography, banking, insurance, and the Internet. Hence it is important to know the different ways available to implement clustering. In the text-based clustering approach, the title of a document is used to determine which cluster the document belongs to. This does not give good results, because the same document may be saved under two different names: since the content of both copies is the same, they are expected to fall into the same cluster, but depending on their names they may be assigned to two different clusters. A newer approach, semantic-based text clustering [1], parses the entire document and clusters it according to its content. Ontology-based text clustering [2] is one way to implement semantic-based clustering. In this paper we discuss different ontology-based algorithms such as k-means, DBSCAN, and SOM.
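
The contrast between title-based and content-based clustering can be illustrated with a minimal sketch. It is not the method of the paper: the ontology (`TOY_ONTOLOGY`), the file names, the sample texts, and the 0.5 similarity threshold are all hypothetical, and a simple greedy single-pass grouping stands in for k-means, DBSCAN, or SOM to keep the example short. Each document is turned into a bag of concepts by replacing known terms with their ontology concept, and documents are grouped by cosine similarity of those bags.

```python
import math
from collections import Counter

# Hypothetical toy ontology (an assumption for illustration only):
# maps surface terms to a shared concept label.
TOY_ONTOLOGY = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "loan": "finance", "credit": "finance", "bank": "finance",
}

def concept_vector(text, ontology=TOY_ONTOLOGY):
    """Bag of concepts: tokens found in the ontology are replaced by their concept."""
    tokens = text.lower().split()
    return Counter(ontology.get(tok, tok) for tok in tokens)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster_by_content(docs, threshold=0.5, ontology=TOY_ONTOLOGY):
    """Greedy single-pass grouping: a document joins the first cluster whose
    seed document is at least `threshold` similar, else it starts a new cluster."""
    clusters = []  # list of (seed_vector, [document names])
    for name, text in docs.items():
        vec = concept_vector(text, ontology)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((vec, [name]))
    return [members for _, members in clusters]

# Two files with identical content but unrelated names, one related file
# that uses different words, and one unrelated file.
docs = {
    "report_final.txt": "the bank approved the car loan",
    "copy_of_notes.txt": "the bank approved the car loan",
    "misc.txt": "credit for a new automobile from the bank",
    "recipe.txt": "bake bread with flour water and yeast",
}

clusters = cluster_by_content(docs)
print(clusters)
```

In this toy data the two identically worded files end up together regardless of their names, and `misc.txt` joins them only because "credit" and "automobile" map to the same concepts as "loan" and "car"; calling `cluster_by_content(docs, ontology={})` leaves `misc.txt` in a cluster of its own.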

References

“Ontology-based Semantic Clustering”, Ph.D. thesis by Sanroma, supervised by Dr. Aida Valls and Dr. Karina Gibert, Department of Computer Science and Mathematics, Tarragona.

“Ontology-based Text Document Clustering” by Andreas Hotho, Alexander Maedche, and Steffen Staab, Institute AIFB, University of Karlsruhe, 76128 Karlsruhe, Germany.

“Ontology-based Text Clustering” by A. Hotho, S. Staab, and A. Maedche.

“Survey of Clustering Algorithms” by Rui Xu, Student Member, IEEE, and Donald Wunsch II, Fellow, IEEE.

“Support Vector Clustering” by Asa Ben-Hur et al.; editors: Nello Cristianini, John Shawe-Taylor, and Bob Williamson.

“Modern Information Retrieval”, a book by Baeza-Yates and Ribeiro-Neto.

Published

2014-02-28

How to Cite

A. Awachar, R. Bairagi, V. Hegade, and M. Khandagale, “An Overview of Ontology Based Text Document Clustering Algorithms”, Int. J. Comp. Sci. Eng., vol. 2, no. 2, pp. 60–64, Feb. 2014.

Section

Technical Article