Joint Feature Learning and Clustering Techniques for Clustering High Dimensional Data: A Review
Keywords:
Clustering, high dimensional data, feature learning, dimensionality reduction
Abstract
Many real-world applications involve high dimensional data. Developing efficient clustering methods for such datasets is challenging because of the curse of dimensionality. A common remedy is to first apply a dimensionality reduction method and then cluster the data in the resulting lower-dimensional space. Although any dimensionality reduction approach can be paired with any clustering approach in this way, there is room for improvement because the two steps are conducted in sequence, each without regard for the other. Naturally, if the requirements of clustering are considered during dimensionality reduction, and vice versa, clustering performance can be improved. This paper reviews techniques that cluster high dimensional data by joint feature learning and clustering.
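To make the sequential baseline concrete, here is a minimal sketch of the "reduce first, then cluster" pipeline that the abstract contrasts with joint methods. It is only an illustration, not a method taken from the paper: principal component analysis and K-means are assumed as stand-ins for the two steps, and the function names, the toy data, and all parameter values are hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal directions (assumed reducer)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd iterations (assumed clusterer); returns labels and centroids."""
    rng = np.random.default_rng(seed)
    centroids = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        d = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Keep the old centroid if a cluster happens to be empty.
        new = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

def sequential_cluster(X, n_components, k, seed=0):
    """Step 1 ignores the clustering goal; step 2 cannot revise the projection."""
    Z = pca_reduce(X, n_components)
    labels, _ = kmeans(Z, k, seed=seed)
    return labels

# Hypothetical toy data: two groups in 50 dimensions that differ in 5 features.
rng = np.random.default_rng(42)
group_a = rng.normal(0.0, 0.1, size=(30, 50))
group_b = rng.normal(0.0, 0.1, size=(30, 50))
group_b[:, :5] += 5.0
X = np.vstack([group_a, group_b])

labels = sequential_cluster(X, n_components=2, k=2)
```

In this sketch the projection is fixed before any cluster assignment exists. A joint method would instead alternate between updating the projection and updating the assignments, so that each step can use information from the other.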
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
