Optimized K-Mode Algorithm Using Harmonic Technique

Authors

  • Goyal M, Department of Computer Science and Engineering, Sri Guru Granth Sahib World University, Fatehgarh Sahib, India
  • Aggarwal S, Department of Computer Science and Engineering, Sri Guru Granth Sahib World University, Fatehgarh Sahib, India

Keywords:

Data Mining, Clustering, K-Means Algorithm, K-Mode Algorithm

Abstract

Data mining is the extraction of useful information from large datasets. Clustering, one of the most important tasks in data mining, groups a set of objects so that objects within the same cluster are more similar to each other than to objects in other clusters. The K-Mode Algorithm, an extension of the K-Means Algorithm to categorical data, is a partitioning-based clustering algorithm that does not guarantee an optimal solution. To overcome this problem, an entropy-based similarity coefficient was introduced to find good initial center points, which yielded more accurate clusters. Here, the nature-inspired harmonic algorithm is hybridized with the K-Mode Algorithm to optimize it. In this paper, the resulting Harmonic K-Mode Algorithm is proposed, which reduces computation time and improves the accuracy of cluster generation. The experimental results show that the proposed algorithm gives better results than the existing algorithms.
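
For readers unfamiliar with the baseline, the following is a minimal sketch of the standard k-modes procedure on categorical records, using simple matching dissimilarity and per-attribute modes. It is not the Harmonic K-Mode Algorithm proposed in the paper, and it does not include the entropy-based similarity coefficient; the function names, the toy records, and the hand-picked initial modes are all assumptions made for illustration.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(records):
    """Return the most frequent value of each attribute across the records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(records, initial_modes, max_iter=100):
    """Plain k-modes: assign each record to its nearest mode, then recompute modes."""
    modes = [tuple(m) for m in initial_modes]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(modes)), key=lambda j: matching_dissimilarity(r, modes[j]))
            for r in records
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(len(modes)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # keep the old mode if a cluster becomes empty
                modes[j] = column_modes(members)
    return labels, modes


# Toy categorical data (hypothetical, for illustration only).
records = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
    ("green", "large", "square"),
]
# Hypothetical initial modes, picked by hand; the result depends on this choice.
labels, modes = k_modes(records, initial_modes=[records[0], records[3]])
print(labels)
print(modes)
```

Because the outcome depends on `initial_modes`, a poor starting choice can leave the procedure in a suboptimal partition, which is the weakness the abstract refers to when it says the algorithm does not guarantee an optimal solution.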

Published

2025-11-11

How to Cite

[1]
M. Goyal and S. Aggarwal, “Optimized K-Mode Algorithm Using Harmonic Technique”, Int. J. Comp. Sci. Eng., vol. 5, no. 6, pp. 143–148, Nov. 2025.

Section

Research Article