Detection of Motif in Protein Sequence Using K-Means and Fuzzy C-Means Algorithms
Keywords:
clustering, k-means and fuzzy c-means, SMA
Abstract
Finding motifs in biological protein sequences is a basic problem in determining protein structure. Motif detection has many applications, including gene regulation, protein family identification, and the determination of functionally and structurally important identities. Large amounts of biological data are used to address the problem of discovering patterns in biological sequences computationally. In this research, we design an approach that uses clustering, a data mining technique, to detect frequently occurring motifs with high information content. We propose a comparative approach for detecting skin melanin associated (SMA) problems at a preliminary stage using protein sequences. Protein sequences with normal and abnormal data serve as the training dataset, and test instances are classified as normal or abnormal by comparing them with this dataset. In this paper, we compare and evaluate the performance of two clustering algorithms, K-means and fuzzy c-means, on protein sequences.
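To make the comparison concrete, the following is a minimal sketch of how hard (K-means) and soft (fuzzy c-means) clustering differ when applied to protein sequences. It is an illustration under stated assumptions, not the method used in the paper: the amino-acid composition features, the toy sequences, the evenly spaced initial centers, and all function names are hypothetical choices made for this example.

```python
# Illustrative sketch only: features, data, and names are assumptions,
# not taken from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Return the 20-dimensional amino-acid frequency vector of a sequence."""
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts) or 1
    return [c / total for c in counts]

def dist2(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def weighted_mean(points, weights):
    """Weighted centroid of a list of vectors."""
    total = sum(weights)
    dim = len(points[0])
    return [sum(w * p[d] for p, w in zip(points, weights)) / total
            for d in range(dim)]

def initial_centers(points, k):
    """Deterministic, evenly spaced initial centers (a simplification)."""
    return [list(points[i * len(points) // k]) for i in range(k)]

def kmeans(points, k, iters=20):
    """Hard clustering: each point belongs to exactly one cluster."""
    centers = initial_centers(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center wins.
        labels = [min(range(k), key=lambda j: dist2(p, centers[j]))
                  for p in points]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = weighted_mean(members, [1.0] * len(members))
    return labels, centers

def fuzzy_cmeans(points, c, m=2.0, iters=20):
    """Soft clustering: each point gets a membership degree per cluster."""
    centers = initial_centers(points, c)
    u = []
    for _ in range(iters):
        # Membership step: u_ij is proportional to d_ij^(-2/(m-1)),
        # normalised so that each row sums to 1.
        u = []
        for p in points:
            inv = [max(dist2(p, ctr), 1e-12) ** (-1.0 / (m - 1))
                   for ctr in centers]
            s = sum(inv)
            u.append([v / s for v in inv])
        # Center step: mean of all points weighted by u_ij^m.
        centers = [weighted_mean(points, [row[j] ** m for row in u])
                   for j in range(c)]
    return u, centers

# Hypothetical toy sequences: three hydrophobic-rich, three charged-rich.
sequences = ["AAAALLLLVVVV", "ALVALVALVAIL", "LLLVVVAAAIII",
             "DDDDEEEEKKKK", "DEKDEKDEKRRD", "KKKEEEDDDRRR"]
features = [composition(s) for s in sequences]

labels, _ = kmeans(features, k=2)
memberships, _ = fuzzy_cmeans(features, c=2)

for seq, lab, row in zip(sequences, labels, memberships):
    print(seq, "k-means label:", lab,
          "fuzzy memberships:", [round(x, 3) for x in row])
```

K-means returns a single label per sequence, whereas fuzzy c-means returns a degree of membership in every cluster, which is the property usually weighed when the two algorithms are compared. In practice a library implementation with randomised restarts would be preferable to this fixed initialisation.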
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
