Efficient Indexing and Searching of Big Data in HDFs

Authors

  • D. Deepika, M.E. Scholar, Department of Computer Science & Engineering, A.R.J College of Engineering & Technology, Mannargudi.
  • K. Pugazhmathi, Asst. Prof., Department of Computer Science & Engineering, A.R.J College of Engineering & Technology, Mannargudi.

Keywords:

Hadoop, Big Data, Efficient Indexing, Data Structure

Abstract

Efficient indexing relies on an efficient, standard data structure well suited to search operations over an exhaustive set of data. Big data is mostly unstructured and does not fit into traditional database categories. Large-scale processing of such data needs a distributed framework such as Hadoop, where computational resources can easily be shared and accessed. A search engine is implemented in Hadoop over millions of Wikipedia documents, using an inverted index data structure to make search operations more efficient. The inverted index maps a word in a document, or in a set of documents, to its corresponding locations. The data structure uses a hash table that stores each word as a key and the word's corresponding locations as its values, providing simple and fast lookup of data, which makes it suitable for search operations.
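The hash-table structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' Hadoop implementation: it runs in a single Python process, and the function names (map_document, build_inverted_index, search), the tokenization rule, and the sample documents are all assumptions made for illustration. The split into a map step and a grouping step only loosely mirrors how such an index might be built with MapReduce.

```python
import re
from collections import defaultdict


def map_document(doc_id, text):
    """Map step (hypothetical): emit (word, (doc_id, position)) per word."""
    # Assumed tokenization: lowercase runs of letters and digits.
    words = re.findall(r"[a-z0-9]+", text.lower())
    for position, word in enumerate(words):
        yield word, (doc_id, position)


def build_inverted_index(documents):
    """Grouping step (hypothetical): collect locations under each word.

    The result is a hash table (a Python dict) whose keys are words and
    whose values are lists of (doc_id, position) locations.
    """
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for word, location in map_document(doc_id, text):
            index[word].append(location)
    return dict(index)


def search(index, word):
    """Look up one word; return its locations, or an empty list if absent."""
    return index.get(word.lower(), [])


# Made-up sample documents, standing in for the Wikipedia corpus.
documents = {
    "doc1": "Hadoop stores big data",
    "doc2": "Searching big data with Hadoop",
}
index = build_inverted_index(documents)
hits = search(index, "Hadoop")  # locations of "hadoop" in both documents
```

A lookup is a single hash-table access on the word, so its cost does not grow with the number of documents scanned at build time; the positions stored per word are what would let a search engine point back to where each match occurs.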

Published

2025-11-11

How to Cite

[1]
D. Deepika and K. Pugazhmathi, “Efficient Indexing and Searching of Big Data in HDFs”, Int. J. Comp. Sci. Eng., vol. 4, no. 4, pp. 237–243, Nov. 2025.

Section

Research Article