Genome Based Classification of Human Papilloma Virus Using Linear Discriminant Analysis
Keywords:
Genome, Genes, HPV, LDA, Papillomaviridae, Multivariate analysis, Univariate analysis
Abstract
Biological classification of Papillomaviridae leads to several hundred different genera (classes) of Human Papilloma Viruses (HPV) that are discriminated on the basis of more than a hundred different characteristics. Statistical classification procedures based on genome and gene size are being applied to define class labels for HPV biologically. In this paper, Fisher’s linear discriminant analysis (LDA) is used to classify HPV on the basis of total genome size and gene sizes. Univariate and multivariate modes of classification are employed to recognize two distinct classes of HPV, viz. alpha-papilloma and beta-papilloma, that cause cervical cancer in humans. The aim is to build a classification model that can predict the class of unknown samples. The accuracy of the proposed model has been measured on a sample dataset.
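As an illustration of the kind of two-class Fisher discriminant described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes three hypothetical features per sample (total genome size and two gene sizes, in base pairs), uses randomly generated numbers rather than real HPV measurements, and places the decision threshold at the midpoint of the projected class means, which presumes equal class priors. The function names `fit_fisher_lda` and `predict` are introduced here for illustration only.

```python
import numpy as np


def fit_fisher_lda(X, y):
    """Fit a two-class Fisher linear discriminant.

    X: array of shape (n_samples, n_features); y: labels 0 or 1.
    Returns the projection vector w and a scalar decision threshold.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix S_w.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Fisher direction: w is proportional to S_w^{-1} (m1 - m0).
    w = np.linalg.solve(Sw, m1 - m0)
    # ASSUMPTION: equal priors, so the threshold is the midpoint
    # of the two projected class means.
    threshold = 0.5 * (w @ m0 + w @ m1)
    return w, threshold


def predict(X, w, threshold):
    """Assign class 1 when the projection exceeds the threshold."""
    return (np.asarray(X, dtype=float) @ w > threshold).astype(int)


# SYNTHETIC data for illustration only (not real HPV measurements).
# Columns (hypothetical): genome size, gene A size, gene B size.
rng = np.random.default_rng(0)
n_per_class = 40
class0 = rng.normal([7900.0, 1500.0, 450.0], [60.0, 30.0, 15.0],
                    size=(n_per_class, 3))
class1 = rng.normal([7500.0, 1550.0, 420.0], [60.0, 30.0, 15.0],
                    size=(n_per_class, 3))
X = np.vstack([class0, class1])
y = np.array([0] * n_per_class + [1] * n_per_class)

# Multivariate mode: all three features together.
w, threshold = fit_fisher_lda(X, y)
accuracy_multi = float(np.mean(predict(X, w, threshold) == y))

# Univariate mode: genome size alone (a single-column feature matrix).
w_uni, threshold_uni = fit_fisher_lda(X[:, [0]], y)
accuracy_uni = float(np.mean(predict(X[:, [0]], w_uni, threshold_uni) == y))

print("multivariate training accuracy:", accuracy_multi)
print("univariate training accuracy:", accuracy_uni)
```

The accuracies printed here are measured on the same synthetic samples used for fitting, so they only show that the sketch runs. An assessment comparable to the one reported in the paper would need real measurements and a held-out test set.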
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
