An Effective K-means approach for Imbalance data clustering using Precise Reduction Sampling
DOI:
https://doi.org/10.26438/ijcse/v6i3.6570Keywords:
Data Mining, Knowledge Discovery, Clustering, K-means, imbalance data, uniform effect, under sampling, PRS_K-meansAbstract
K-means clustering is one of the top 10 algorithms in the field data mining and knowledge discovery. The uniform effect in the k-means clustering reveals that, the imbalance nature of the data source hampered the performance in terms of efficient knowledge discovery. In this paper, we proposed a novel clustering algorithm known as Precise Reduction Sampling K-means (PRS_K-means) for efficient handling of imbalance data and reducing the uniform effect. The experiments shows that the algorithm can not only give attention to different instances of sub clusters for identify the intrinsic properties of the instances for clustering; and it performs better than K-means in terms of reduction in error rate and has higher accuracy and recall rate for improved performance.
References
Prateeksha Tomar, Amit Kumar Manjhvar, "Clustering Classification for Diabetic Patients using K-Means and M-Tree prediction model", International Journal of Scientific Research in Multidisciplinary Studies , Vol.3, Issue.6, pp.48-53, 2017
Hui Xiong, Junjie Wu, and Jian Chen,” K-Means Clustering Versus Validation Measures: A Data-Distribution Perspective”, IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 39, NO. 2, APRIL 2009.
Abhishek kumar K and Sadhana,: SURVEY ON K-MEANS CLUSTERING ALGORITHM”, International Journal of Modern Trends in Engineering and Research (IJMTER) Volume 04, Issue 4, [April– 2017]
Farhad Pourkamali-Anaraki and Stephen Becker, “Preconditioned Data Sparsification for Big Data with Applications to PCA and K-means”,
Fabon Dzogan, Christophe Marsala, Marie-Jeanne Lesot and Maria Rifqi,” An ellipsoidal K-means for document clustering”, 2012 IEEE 12th International Conference on Data Mining
Kaile Zhou, Shanlin Yang,” Exploring the uniform effect of FCM clustering: A data distribution Perspective”, Knowledge-Based Systems 96 (2016) 76–83
Jaya Rama Krishnaiah VV, Ramchand H Rao K, Satya Prasad R (2012) Entropy Based Mean Clustering: An Enhanced Clustering Approach. J Comput Sci Syst Biol 5: 062-067. doi:10.4172/jcsb.1000091
Hartono, O S Sitompul, Tulus and E B Nababan,: Optimization Model of K-Means Clustering Using Artificial Neural Networks to Handle Class Imbalance Problem”, IOP Conf. Series: Materials Science and Engineering 288 (2017) 012075
Md. Akmol Hussain, Akbar Sheikh Akbari, Ahmad Ghaffari, “Colour Constancy using K-means Clustering Algorithm”, 2016 9th International Conference on Developments in eSystems Engineering.
Junjie Wu, Hui Xiong and Jian Chen,” Adapting the Right Measures for K-means Clustering”,
Richard Nock and Frank Nielsen,” On Weighting Clustering”, EEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 28, NO. 8, AUGUST 2006
Wu.J,”The Uniform Effect of K-means Clustering”, J. Wu, Advances in K-means Clustering, Springer Theses, DOI:10.1007/978-3-642-29807-3_2, © Springer-Verlag Berlin Heidelberg 2012.
HamiltonA. Asuncion D. Newman. (2007). UCI Repository of Machine Learning Database (School of Information and Computer Science, Irvine, CA: Univ. of California [Online]. Available: http://www.ics.uci.edu/∼mlearn/MLRepository.html
Witten, I.H. and Frank, E. (2005) Data Mining: Practical machine learning tools and techniques. 2nd edition Morgan Kaufmann, San Francisco.
Downloads
Published
How to Cite
Issue
Section
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
