Document Clustering with Omni-Directional Data Similarity Process

Authors

Lakshmi K, Department of Computer Science, Idhaya College for Women, Kumbakonam, Tamil Nadu, India

Keywords:

Document Clustering, Correlation Measure, Similarity Measure, Data Mining

Abstract

Most clustering techniques must assume some cluster relationship among the data objects. Similarity between a pair of objects can be defined either explicitly or implicitly. This paper introduces a novel reference-based similarity measure and two related clustering approaches. The main difference between a traditional dissimilarity/similarity measure and the one considered here is that the former uses a single viewpoint, the origin, while the latter uses many reference points, which are objects assumed not to be in the same cluster as the two objects being measured. With several reference points, a more informative assessment of similarity can be achieved. For document clustering, two criterion functions based on the new measure are proposed. These functions are compared, on various document collections, with commonly used clustering algorithms that rely on other well-known similarity measures, in order to verify the proposed approach.
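
The contrast between a single-viewpoint measure and a reference-based one can be illustrated with a minimal sketch. The abstract does not give the formula, so the averaging rule below is an assumption: the similarity of two documents is taken as the mean, over all reference points outside their cluster, of the dot product of the two difference vectors. The function names and the toy vectors are hypothetical.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def single_viewpoint_similarity(a: Vector, b: Vector) -> float:
    """Similarity seen from one viewpoint only, the origin."""
    return dot(a, b)


def reference_based_similarity(
    a: Vector, b: Vector, references: Sequence[Vector]
) -> float:
    """Assumed form: average of (a - r) . (b - r) over every reference r.

    Each r is expected to be an object outside the cluster of a and b.
    """
    if not references:
        raise ValueError("at least one reference point is required")
    total = 0.0
    for r in references:
        diff_a = [x - y for x, y in zip(a, r)]
        diff_b = [x - y for x, y in zip(b, r)]
        total += dot(diff_a, diff_b)
    return total / len(references)


# Toy 2-D "documents" and two reference points (illustrative values only).
doc_a = [1.0, 0.0]
doc_b = [0.0, 1.0]
outside = [[-1.0, -1.0], [1.0, 1.0]]

origin_score = single_viewpoint_similarity(doc_a, doc_b)
reference_score = reference_based_similarity(doc_a, doc_b, outside)
```

With the origin as the only reference point, the second function reduces to the first, which is one way to read the single-viewpoint measure as a special case of the reference-based one.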

Published

2025-11-24

How to Cite

[1]

K. Lakshmi, “Document Clustering with Omni-Directional Data Similarity Process”, Int. J. Comp. Sci. Eng., vol. 7, no. 4, pp. 87–90, Nov. 2025.

Issue

Vol. 7 No. 4 (2019): IJCSE Special Issue Feb Edition

Section

Research Article

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
