An Approach to Design Personalized Focused Crawler
Keywords:
Web Crawler, Focused Crawler, World Wide Web (WWW), Content Analysis, Link Scoring, Change Detection
Abstract
The volume of data on the World Wide Web (WWW) and the rate at which it changes make it impossible to crawl the Web completely. The challenge for a crawler is to retrieve only the relevant pages from this information explosion. A focused crawler addresses relevance to some extent by restricting the crawl to web pages on a given topic or set of topics. A focused crawler with a page change detection policy can further narrow the crawl to new or updated pages, which reduces redundant downloads and the risk of missing updated data. This paper proposes a design for a focused crawler with a web page change detection policy.
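The combination described in the abstract, topic-focused crawling plus page change detection, can be illustrated with a minimal sketch. It is not the policy proposed in the paper, whose details are not given here. It assumes a simple keyword-overlap relevance score, a SHA-256 content digest for change detection, and an in-memory `pages` mapping in place of real HTTP fetches. The names `relevance`, `digest`, `focused_crawl`, `seen_digests`, and `threshold` are hypothetical.

```python
import hashlib
import heapq


def relevance(text, topic_terms):
    """Fraction of topic terms found in the page text (assumed scoring rule).

    Assumes topic_terms is non-empty and lowercase.
    """
    words = set(text.lower().split())
    return sum(term in words for term in topic_terms) / len(topic_terms)


def digest(text):
    """Content fingerprint used to detect changes between crawls (assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def focused_crawl(pages, seeds, topic_terms, seen_digests, threshold=0.5):
    """Best-first crawl over `pages`, a dict of url -> (text, outlinks).

    Returns the relevant URLs whose content is new or has changed since the
    previous crawl, and updates `seen_digests` in place.
    """
    # heapq is a min-heap, so priorities are negated to pop the best first.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    visited, fresh = set(), []
    while frontier:
        _, url = heapq.heappop(frontier)
        if url in visited or url not in pages:
            continue
        visited.add(url)
        text, outlinks = pages[url]
        score = relevance(text, topic_terms)
        if score < threshold:
            continue  # off-topic page: neither reported nor expanded
        fingerprint = digest(text)
        if seen_digests.get(url) != fingerprint:
            seen_digests[url] = fingerprint  # new or updated page
            fresh.append(url)
        for link in outlinks:
            if link not in visited:
                # Links inherit the relevance score of the page they came from.
                heapq.heappush(frontier, (-score, link))
    return fresh
```

Calling `focused_crawl` twice with the same `seen_digests` dictionary reports each relevant page once. A later call reports a page again only if its text has changed in the meantime. A real crawler would also need fetching, politeness delays, and robots.txt handling, none of which are modeled in this sketch.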
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
