Document Clustering with Omni-Directional Data Similarity Process
Keywords:
Document Clustering, Correlation Measure, Similarity Measure, Data Mining
Abstract
Most clustering techniques must assume some cluster relationship among the data objects to which they are applied. Similarity between a pair of objects can be defined either explicitly or implicitly. This paper introduces a novel reference-centered similarity measure and two related clustering approaches. The main difference between a traditional dissimilarity/similarity measure and the proposed one is that the former uses only a single viewpoint, namely the origin, whereas the latter uses several reference points, which are objects assumed not to be in the same cluster as the two objects being compared. Using several reference points, a more informative assessment of similarity can potentially be achieved. Two criterion functions for document clustering are proposed based on the new measure. These functions are compared with commonly used clustering algorithms that rely on other well-known similarity measures, across various document collections, in order to verify the proposed approach.
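The contrast between a single viewpoint at the origin and several reference objects can be illustrated with a minimal sketch. The abstract does not state the formula, so the form used here is an assumption: the dot product of the two difference vectors, averaged over the reference objects. The function names, the toy vectors, and the choice of reference objects are all hypothetical and are not taken from the paper.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Single-viewpoint similarity: u and v are compared as seen from the origin."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def multi_reference_similarity(u, v, references):
    """Assumed form: average of (u - r) . (v - r) over the reference objects r.

    `references` stands for objects presumed to lie outside the cluster
    that contains u and v (an illustrative assumption).
    """
    if not references:
        raise ValueError("at least one reference object is required")
    total = 0.0
    for r in references:
        du = [a - b for a, b in zip(u, r)]  # u as seen from r
        dv = [a - b for a, b in zip(v, r)]  # v as seen from r
        total += dot(du, dv)
    return total / len(references)


# Hypothetical 2-D "document vectors" for illustration only.
doc_a = [1.0, 0.0]
doc_b = [0.0, 1.0]
outside = [[1.0, 1.0], [-1.0, -1.0]]

# With the origin as the only reference, the measure reduces to the dot product.
origin_only = multi_reference_similarity(doc_a, doc_b, [[0.0, 0.0]])
multi = multi_reference_similarity(doc_a, doc_b, outside)
```

In this toy case the two documents are orthogonal when seen from the origin, while the two outside reference objects assign them a positive average score. The sketch only shows that changing the viewpoint changes the assessment; it makes no claim about clustering quality.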
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
