Deep Web Data Scraper: Search Engine

Authors

Sneh Nain Computer Science Department, MDU University, India
Bhumika Lall Computer Science Department, MDU University, India

Keywords:

Surface Web, Deep Web, Search Engine, Deep Web Search Engine, Crawler, Indexer, Human Powered Directory

Abstract

The World Wide Web grows every day, and people generally depend on search engines to explore it. Searching the web today can be compared to dragging a net across the surface of the ocean. Traditional search engines extract data from only a small portion of the web, whereas large portions are hidden behind search forms, in searchable structured and unstructured databases. The deep web contains high-quality content and covers a large area. Much research has been carried out to bring this hidden data to the surface of the web. In this paper, we discuss the problems users face in scraping information from the deep web and propose solutions to these problems using our new approach.

Published

2014-05-31

How to Cite

[1]

S. Nain and B. Lall, “Deep Web Data Scraper: Search Engine”, Int. J. Comp. Sci. Eng., vol. 2, no. 5, pp. 52–56, May 2014.

Issue

Vol. 2 No. 5 (2014): IJCSE May Edition

Section

Research Article

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
