Document Clustering with Omni-Directional Data Similarity Process

Authors

  • Lakshmi K, Department of Computer Science, Idhaya College for Women, Kumbakonam, Tamil Nadu, India

Keywords:

Document Clustering, Correlation Measure, Similarity Measure, Data Mining

Abstract

Most clustering techniques must assume some cluster relationship among the data objects. Similarity between a pair of objects is usually defined either explicitly or implicitly. This paper introduces a novel reference-centered similarity measure and two related clustering approaches. The main difference between a traditional dissimilarity/similarity measure and the one considered here is that the former uses a single viewpoint, the origin, while the latter uses a number of reference points, which are objects assumed not to lie in the same cluster as the two objects being scored. With several reference points, a more informative assessment of similarity may be achieved. Two criterion functions for document clustering are proposed on the basis of the new measure. These functions are compared, on various document collections, with frequently used clustering algorithms that rely on other well-known similarity measures, in order to verify the proposed approach.
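
The contrast between a single-viewpoint measure and a reference-based one can be illustrated with a short Python sketch. The abstract does not state the exact formula, so the definition below is an assumption made for illustration: the similarity of two document vectors is taken as the average dot product of their difference vectors from each reference point. The function names and the toy vectors are hypothetical.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def origin_similarity(a: Vector, b: Vector) -> float:
    """Single-viewpoint similarity: both vectors are viewed from the origin."""
    return dot(a, b)


def multi_reference_similarity(
    a: Vector, b: Vector, references: Sequence[Vector]
) -> float:
    """Assumed reference-based similarity (illustrative, not the paper's formula).

    For each reference point r, presumed to lie outside the cluster that
    holds a and b, view both vectors from r and take the dot product of
    (a - r) and (b - r). The result is the average over all reference points.
    """
    if not references:
        raise ValueError("at least one reference point is required")
    total = 0.0
    for r in references:
        a_from_r = [x - y for x, y in zip(a, r)]
        b_from_r = [x - y for x, y in zip(b, r)]
        total += dot(a_from_r, b_from_r)
    return total / len(references)


# Toy example: two orthogonal document vectors.
doc_a = [1.0, 0.0]
doc_b = [0.0, 1.0]
print(origin_similarity(doc_a, doc_b))
print(multi_reference_similarity(doc_a, doc_b, [[0.0, 0.0], [2.0, 0.0]]))
```

When the only reference point is the origin, the two functions agree, which matches the abstract's description of the traditional measure as the single-viewpoint case.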

Published

2025-11-24

How to Cite

[1]
K. Lakshmi, “Document Clustering with Omni-Directional Data Similarity Process”, Int. J. Comp. Sci. Eng., vol. 7, no. 4, pp. 87–90, Nov. 2025.