Deep Web Data Scraper: Search Engine
Keywords:
Surface Web, Deep Web, Search Engine, Deep Web Search Engine, Crawler, Indexer, Human Powered Directory
Abstract
The World Wide Web grows every day, and people generally depend on search engines to explore it. Searching the web today can be compared to dragging a net across the surface of the ocean. Traditional search engines extract data from only a small portion of the web, whereas large portions of it are hidden behind search forms, in searchable structured and unstructured databases. The deep web contains high-quality content and has broad coverage. A lot of research has been carried out in this area to make the hidden data float to the surface of the web. In this paper, we discuss the problems users face in scraping information from the deep web and present solutions to these problems using our new approach.
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
