Deep Web Data Scraper: Search Engine

Authors

  • Sneh Nain, Computer Science Department, MDU University, India
  • Bhumika Lall, Computer Science Department, MDU University, India

Keywords:

Surface Web, Deep Web, Search Engine, Deep Web Search Engine, Crawler, Indexer, Human Powered Directory

Abstract

The World Wide Web grows every day, and people generally depend on search engines to explore it. Searching the web today can be compared to dragging a net across the surface of the ocean. Traditional search engines extract data from only a small portion of the web, whereas much larger portions are hidden behind search forms, in searchable structured and unstructured databases. The deep web offers high-quality content and broad coverage. A great deal of research has been carried out to bring this hidden data to the surface of the web. In this paper, we discuss the problems users face in scraping information from the deep web and how our new approach addresses them.

Published

2014-05-31

How to Cite

[1] S. Nain and B. Lall, “Deep Web Data Scraper: Search Engine”, Int. J. Comp. Sci. Eng., vol. 2, no. 5, pp. 52–56, May 2014.

Section

Research Article