Joint Feature Learning and Clustering Techniques for Clustering High Dimensional Data: A Review

Authors

  • Ghatage Trupti B, Department of Computer Science and Engineering, Bharati Vidyapeeth’s College of Engineering, Kolhapur, Maharashtra, India
  • Patil Deepali E, Department of Computer Science and Engineering, Bharati Vidyapeeth’s College of Engineering, Kolhapur, Maharashtra, India
  • Takmare Sachin B, Department of Computer Science and Engineering, Bharati Vidyapeeth’s College of Engineering, Kolhapur, Maharashtra, India
  • Patil Sushama A, Department of Digital Communication, SSSIST, Sehore

Keywords:

Clustering, high dimensional data, feature learning, dimensionality reduction

Abstract

In many real-world applications we face high-dimensional data. Developing efficient clustering methods for high-dimensional datasets is a challenging problem because of the curse of dimensionality. The common way to deal with it is to first apply a dimensionality reduction approach and then cluster the data in the lower-dimensional space. However, when dimensionality reduction and clustering are performed in sequence, each step is carried out without regard to the other, which can limit clustering performance. Naturally, if the requirements of clustering are taken into account during dimensionality reduction, and vice versa, clustering performance can be improved. This paper presents a review of techniques that cluster high-dimensional data by performing feature learning and clustering jointly.
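
As a rough illustration of the distinction drawn above, the minimal sketch below first reduces dimensionality with PCA and then clusters with K-means in sequence, and then alternates between a label-aware projection (LDA) and K-means so that the subspace and the cluster assignments inform each other, in the spirit of adaptive dimension reduction. This is not any of the reviewed algorithms; the synthetic data, the PCA/LDA choices, the cluster count, and the number of alternation rounds are illustrative assumptions.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic high-dimensional data: 500 samples in 200 dimensions (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 200))
n_clusters, n_components = 3, 10

# (a) Sequential pipeline: reduce dimensionality first, then cluster.
#     The projection is learned without any knowledge of the clustering task.
X_low = PCA(n_components=n_components).fit_transform(X)
labels_seq = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X_low)

# (b) Joint-style alternation: re-fit a label-aware projection (LDA) from the
#     current cluster assignments, then re-cluster in that subspace, so the
#     subspace and the clusters adapt to each other.
labels = labels_seq.copy()
for _ in range(5):
    lda = LinearDiscriminantAnalysis(n_components=min(n_components, n_clusters - 1))
    X_proj = lda.fit_transform(X, labels)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X_proj)

print("cluster sizes (sequential):", np.bincount(labels_seq))
print("cluster sizes (joint-style):", np.bincount(labels))

The reviewed methods formalize this coupling in a single objective rather than through heuristic alternation, but the sketch conveys the basic idea that the learned subspace should be informed by the clustering task.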

References

J. Han and M. Kamber, “Data Mining: Concepts and Techniques”, 2nd ed., Morgan Kaufmann, 2006.

C. Ding, X. He, H. Zha, and H. D. Simon, “Adaptive dimension reduction for clustering high dimensional data”, in Proc. ICDM, pp. 147–154, 2002.

C. Ding and T. Li, “Adaptive dimension reduction using discriminant analysis and K-means clustering”, in Proc. ICML, pp. 521–528, 2007.

F. De la Torre and T. Kanade, “Discriminative cluster analysis”, in Proc. ICML, pp. 241–248, 2006.

J. Ye, Z. Zhao, and M. Wu, “Discriminative K-means for clustering”, in Advances in Neural Information Processing Systems, Cambridge, MA, USA: MIT Press, 2007.

T. Li, S. Ma, and M. Ogihara, “Document clustering via adaptive subspace iteration”, in Proc. 27th Annu. Int. ACM SIGIR Conf. Res. Develop. Inform. Retr., pp. 218–225, Jul. 2004.

C. Hou, F. Nie, D. Yi, and D. Tao, “Discriminative embedded clustering: A framework for grouping high-dimensional data”, IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 6, pp. 1287–1299, Jun. 2015.

J. Shi and J. Malik, “Normalized cuts and image segmentation”, IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 888–905, Aug. 2000.

R. O. Duda, P. E. Hart, and D. G. Stork, “Pattern Classification”, 2nd ed., New York, NY, USA: Wiley, 2000.

L. Parsons, E. Haque, and H. Liu, “Subspace clustering for high dimensional data: A review”, ACM SIGKDD Explorations Newslett., vol. 6, no. 1, pp. 90–105, 2004.

I. Jolliffe, “Principal Component Analysis”, 2nd ed., Springer, 2002.

S. Dasgupta, “Experiments with random projection”, in Proc. 16th Conf. Uncertainty in Artificial Intelligence (UAI), 2000.

J. Ye, Z. Zhao, and H. Liu, “Adaptive distance metric learning for clustering”, in Proc. IEEE CVPR, pp. 1–7, Jun. 2007.

D. Niu, J. G. Dy, and M. I. Jordan, “Dimensionality reduction for spectral clustering”, in Proc. Int. Conf. Artif. Intell. Statist. (AISTATS), vol. 15, pp. 552–560, 2011.

Q. Gu and J. Zhou, “Subspace maximum margin clustering”, in Proc. CIKM, pp. 1337–1346, Nov. 2009.

C. Domeniconi, D. Papadopoulos, D. Gunopulos, and S. Ma, “Subspace clustering of high dimensional data”, in Proc. SIAM Int. Conf. Data Mining (SDM), pp. 517–521, Apr. 2004.

D. Wang, F. Nie, and H. Huang, “Unsupervised feature selection via unified trace ratio formulation and K-means clustering”, in Proc. Eur. Conf. Mach. Learn. Principles Pract. Knowl. Discovery Databases (ECML PKDD), Nancy, France, 2014.

C. Hou, C. Zhang, F. Nie, and Y. Wu, “Learning a subspace for face image clustering via trace ratio criterion”, Opt. Eng., vol. 48, no. 6, p. 060501, 2009.

R. W. Sembiring, S. Sembiring, and J. M. Zain, “An efficient dimensional reduction method for data clustering”, Bull. Math., vol. 4, no. 1, pp. 43–58, 2012.

G. Golub and C. Van Loan, “Matrix Computations”, 3rd ed., Baltimore, MD, USA: Johns Hopkins Univ. Press, 1996.

L. Vandenberghe and S. Boyd, “Semidefinite programming”, SIAM Review, vol. 38, no. 1, pp. 49–95, 1996.

Published

2025-11-11

How to Cite

[1]
B. Ghatage Trupti, E. Patil Deepali, B. Takmare Sachin, and A. Patil Sushama, “Joint Feature Learning and Clustering Techniques for Clustering High Dimensional Data: A Review”, Int. J. Comp. Sci. Eng., vol. 4, no. 3, pp. 54–58, Nov. 2025.

Section

Review Article