A Texonomy on Web Page Categorization

Authors

  • Bhavana CSE Department, Maharishi Markandeshwar (Deemed to be) University, Mullana, Haryana, India
  • Raheja N CSE Department, Maharishi Markandeshwar (Deemed to be) University, Mullana, Haryana, India

DOI:

https://doi.org/10.26438/ijcse/v7i1.637641

Keywords:

Web Page Categorization, Web Mining, Web Content Mining, Naive Bayes, KNN, SVM

Abstract

Web page categorization has become essential as the amount of information on the Internet grows. The number of web pages increases steadily, and these pages cover almost every type of information. Finding accurate and useful information among so many pages is difficult for a user, so efficient and accurate methods for categorizing this large volume of information are necessary. Web page categorization assigns web pages to predefined categories, which improves the efficiency of web search. This paper discusses various methods, approaches, and uses of web page categorization.
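As an illustration of assigning pages to predefined categories, the following is a minimal sketch of a multinomial Naive Bayes classifier, one of the methods named in the keywords. It is not taken from the paper: the class name `NaiveBayesPageClassifier`, the `tokenize` helper, the toy page texts, and the two categories are all hypothetical, and a real system would first extract text and features from the page's HTML.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesPageClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (illustrative)."""

    def fit(self, pages, labels):
        # Count pages per category and word occurrences per category.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(pages, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_pages = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_pages in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_pages / total_pages)
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: page text paired with a made-up category.
train_pages = [
    "football match score league goal",
    "tennis player wins tournament final",
    "stock market shares fall investors",
    "bank raises interest rates economy",
]
train_labels = ["sports", "sports", "finance", "finance"]

clf = NaiveBayesPageClassifier().fit(train_pages, train_labels)
predicted = clf.predict("league final goal")
```

Here `predicted` holds the category whose word statistics best match the new page text. The same interface could wrap KNN or SVM, the other classifiers listed in the keywords, without changing how pages are passed in.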

Published

2019-01-31

How to Cite

[1]
Bhavana and N. Raheja, “A Texonomy on Web Page Categorization”, Int. J. Comp. Sci. Eng., vol. 7, no. 1, pp. 637–641, Jan. 2019.