Comparing clustering Algorithms with Diabetic Datasets in WEKA Tool

Authors

  • GG Gokilam, Department of Computer Science and Engineering, PRIST University, Tamil Nadu, India
  • K Shanthi, Department of Computer Science and Engineering, Principal of Ponnaiyah Ramajayam Polytechnic College, Tamil Nadu, India

Keywords:

Cluster, Diabetes, Weka, Data Mining

Abstract

Data mining is the process of discovering useful information in large datasets. In the biomedical field, data mining techniques are used to analyze and evaluate diabetic datasets. One of the most important data mining techniques is clustering, which analyzes data from different perspectives and summarizes it into useful information. Clustering is the task of assigning a set of objects to groups called clusters. This paper discusses several clustering algorithms: Cobweb, DBSCAN, EM, Farthest First, Filtered Clusterer, Hierarchical Clusterer, OPTICS, and Simple K-Means. The algorithms are compared by the time taken to build the clusters and by the true positive and true negative values of the resulting clusters. Our main aim is to compare these clustering algorithms in the Weka data mining tool and to find out which algorithm is most suitable for the diabetes dataset.
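
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' Weka setup: it uses plain Python, a synthetic two-class toy dataset instead of the diabetes data, and simplified stand-ins for two of the listed algorithms (a basic k-means and a farthest-first assignment). All function names (`make_toy_data`, `kmeans`, `farthest_first`, `majority_class_accuracy`, `compare`) are hypothetical, and the accuracy measure maps each cluster to its majority class, which is only an approximation of a classes-to-clusters style evaluation.

```python
import random
import time
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two points (tuples)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            for p in points]


def kmeans(points, k, iters=20, seed=0):
    """Basic Lloyd's k-means; a simplified stand-in, not Weka's SimpleKMeans."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for i in range(k):
            members = [p for p, l in zip(points, labels) if l == i]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[i])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers)


def farthest_first(points, k, seed=0):
    """Farthest-first traversal: each new center is the point farthest
    from the centers chosen so far; points then join the nearest center."""
    rng = random.Random(seed)
    centers = [points[rng.randrange(len(points))]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return assign(points, centers)


def majority_class_accuracy(cluster_ids, classes):
    """Fraction of points whose class equals their cluster's majority class."""
    correct = 0
    for cid in set(cluster_ids):
        members = [c for l, c in zip(cluster_ids, classes) if l == cid]
        correct += Counter(members).most_common(1)[0][1]
    return correct / len(classes)


def make_toy_data(n_per_class=50, seed=42):
    """Synthetic two-class data (NOT the diabetes dataset)."""
    rng = random.Random(seed)
    points, classes = [], []
    for label, center in (("negative", 0.0), ("positive", 10.0)):
        for _ in range(n_per_class):
            points.append((rng.gauss(center, 0.5), rng.gauss(center, 0.5)))
            classes.append(label)
    return points, classes


def compare(points, classes, k=2):
    """Return {algorithm name: (build time in seconds, accuracy)}."""
    results = {}
    for name, algo in (("kmeans", kmeans), ("farthest_first", farthest_first)):
        start = time.perf_counter()
        labels = algo(points, k)
        elapsed = time.perf_counter() - start
        results[name] = (elapsed, majority_class_accuracy(labels, classes))
    return results


if __name__ == "__main__":
    pts, cls = make_toy_data()
    for name, (secs, acc) in compare(pts, cls).items():
        print(f"{name}: {secs:.6f} s, accuracy {acc:.2f}")
```

Timings from such a sketch depend on the machine and the data and say nothing about the results reported in the paper; the point is only the shape of the comparison, one build time and one class-agreement score per algorithm.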

Published

2015-02-28

How to Cite

[1]
G. Gokilam and K. Shanthi, “Comparing clustering Algorithms with Diabetic Datasets in WEKA Tool”, Int. J. Comp. Sci. Eng., vol. 3, no. 2, pp. 1–5, Feb. 2015.