Survey on A Connectivity and Density Dissimilarity Based Clustering
Keywords:
Clustering, Distance Metric Styling, Ensembling, Large Dimensions
Abstract
In an era where information is precious, the need to uncover the relations within sets of unlabeled data, for purposes as varied as revising, exploring, sorting, storing, and analyzing, is greater than ever. A practical way to explore the relations among data is clustering, an unsupervised data mining technique. Clustering aims to group similar data points into clusters such that points in different clusters share little or no similarity, while setting aside outliers, that is, points that belong to none of the clusters. Clustering can be applied to data of varying nature (numeric, categorical, mixed) and dimensionality (low, high); however, the applicable methodologies and similarity measures vary accordingly. In this manuscript we discuss various techniques used for clustering data, including the role of distance metrics in clustering, clustering using ensembles, and dimensionality reduction/minimization techniques for modeling complex data relations.
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
