Effectuation of Web Log Preprocessing and Page Access Frequency using Web Usage Mining
Keywords:
Web Usage Mining, Preprocessing, Web Log Data, Frequency, Clustering
Abstract
Extracting information from web logs is an important task, and it can be accomplished with web usage mining. Web usage mining reveals visitor behavior by automatically and quickly extracting intrinsic information from large amounts of web log data: interesting access paths, user identification, groups of pages accessed together, web user clusters, and candidates for web pre-fetching. It is therefore a valuable input to an organization's decision-making process. Data preprocessing is a key step in the mining process. Once the web log data has been preprocessed, the desired information about visitors can be found easily, and other hidden information can also be retrieved from the log. In this paper we focus on the data preprocessing stage of web usage mining; after preprocessing is complete, any kind of irrelevant information can be filtered out. We also propose an algorithm, with its implementation, for web log preprocessing in web usage mining. Every page is assigned an individual token. Based on these tokens and their frequencies, data mining techniques (classification, association rules, and clustering) can be applied. With this approach, the pages with the highest and lowest access frequency can be identified easily.
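The steps summarized in the abstract (cleaning the log, assigning a token to each page, and counting access frequency) can be illustrated with a minimal Python sketch. This is not the algorithm proposed in the paper, which the abstract does not spell out. It assumes that the log uses the Common Log Format, that cleaning means keeping only successful GET requests and dropping embedded resources such as images and stylesheets, and that tokens are simply P1, P2, ... in order of first appearance. The names `LOG_PATTERN`, `IGNORED_SUFFIXES`, `clean`, `tokenize_and_count`, and the sample entries are all hypothetical.

```python
import re
from collections import Counter

# Assumed input format (Common Log Format):
# host ident user [timestamp] "METHOD path PROTOCOL" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)$'
)

# Assumed cleaning rule: requests for embedded resources are irrelevant.
IGNORED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def clean(lines):
    """Yield (host, path) for successful GET requests to pages."""
    for line in lines:
        m = LOG_PATTERN.match(line.strip())
        if m is None:
            continue  # malformed entry
        if m["method"] != "GET" or not m["status"].startswith("2"):
            continue  # non-GET or unsuccessful request
        path = m["path"].split("?", 1)[0]  # ignore any query string
        if path.lower().endswith(IGNORED_SUFFIXES):
            continue  # embedded resource, not a page view
        yield m["host"], path


def tokenize_and_count(records):
    """Assign one token per distinct page and count accesses per token."""
    tokens = {}       # page path -> token
    freq = Counter()  # token -> number of accesses
    for _host, path in records:
        token = tokens.setdefault(path, f"P{len(tokens) + 1}")
        freq[token] += 1
    return tokens, freq


# Hypothetical sample entries, made up for illustration only.
SAMPLE = [
    'host1 - - [01/Jul/1995:00:00:01 -0400] "GET /index.html HTTP/1.0" 200 1024',
    'host1 - - [01/Jul/1995:00:00:02 -0400] "GET /images/logo.gif HTTP/1.0" 200 512',
    'host2 - - [01/Jul/1995:00:00:03 -0400] "GET /index.html HTTP/1.0" 200 1024',
    'host2 - - [01/Jul/1995:00:00:04 -0400] "GET /missing.html HTTP/1.0" 404 -',
    'host3 - - [01/Jul/1995:00:00:05 -0400] "GET /about.html HTTP/1.0" 200 2048',
    'host3 - - [01/Jul/1995:00:00:06 -0400] "GET /index.html HTTP/1.0" 200 1024',
]

tokens, freq = tokenize_and_count(clean(SAMPLE))
most = max(freq, key=freq.get)   # token with the highest access frequency
least = min(freq, key=freq.get)  # token with the lowest access frequency
print(tokens, dict(freq), most, least)
```

The token-to-frequency mapping produced here is the kind of compact representation on which classification, association rule mining, or clustering could then be run; which cleaning rules are appropriate depends on the site and on the goal of the analysis.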
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
