An Analysis of the Effectiveness of Various Similarity Measures for Web Page Clustering
Keywords:
Web Page Clustering, Vector Space Model, Genetic Algorithm
Abstract
A prominent challenge for web search engines is the large number of documents returned to the user in response to a query. Various solutions have been proposed in the literature; one approach is to cluster web documents. In this paper we propose a genetic algorithm approach for clustering web documents and study the effectiveness of various similarity measures in this context. Several similarity measures are employed, and cosine similarity yields better results than the other similarity measures.
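To make the comparison of similarity measures concrete, the following is a minimal sketch of how two measures can be computed over documents in a vector space model. It is not the authors' implementation: the whitespace tokenizer, the raw term-frequency weighting, the choice of Jaccard similarity as the second measure, and the function names `term_vector`, `cosine_similarity`, and `jaccard_similarity` are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a term-frequency vector (assumed: lowercase, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    shared_terms = a.keys() & b.keys()
    dot = sum(a[t] * b[t] for t in shared_terms)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    """Ratio of shared terms to all terms, ignoring term frequencies."""
    terms_a, terms_b = set(a), set(b)
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)


if __name__ == "__main__":
    # Hypothetical page texts, used only to exercise the functions.
    page_1 = term_vector("genetic algorithm for web page clustering")
    page_2 = term_vector("clustering web documents with a genetic algorithm")
    print("cosine :", cosine_similarity(page_1, page_2))
    print("jaccard:", jaccard_similarity(page_1, page_2))
```

In a genetic algorithm clustering setup, a function like these would typically be called inside the fitness evaluation to score how well each document matches its assigned cluster, so swapping the measure changes only that one call.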
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
