Query Optimization of Big Data Using Hive
Keywords:
Big Data, HDFS, Map Reduce, Hive, Join
Abstract
Building internet search engines requires huge amounts of data, and therefore a large number of machines to process it. Apache Hadoop is a framework that allows for the distributed processing of large data sets across clusters of machines. Hadoop has two main modules: the Hadoop Distributed File System (HDFS) and MapReduce. HDFS differs from an ordinary local file system. It can be deployed as a single-node cluster or as a multi-node cluster, and multi-node clusters process large datasets more efficiently. When queries are written in the Hive query language and run on Hadoop, increasing the number of nodes allows the data to be processed faster than with fewer nodes.
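To make the connection between a Hive join and MapReduce concrete, the following is a minimal single-process sketch of a reduce-side join, the general pattern a join query can be compiled into on Hadoop. It is not the paper's implementation and does not use Hadoop or Hive; the function names and the `users`/`orders` tables are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(users, orders):
    """Tag each record with its source table and emit (join_key, tagged_value)."""
    for user_id, name in users:
        yield user_id, ("user", name)
    for user_id, amount in orders:
        yield user_id, ("order", amount)


def shuffle(pairs):
    """Group values by key, standing in for the shuffle step between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """For each key, pair every user record with every order record (inner join)."""
    for key in sorted(groups):
        names = [v for tag, v in groups[key] if tag == "user"]
        amounts = [v for tag, v in groups[key] if tag == "order"]
        for name in names:
            for amount in amounts:
                yield key, name, amount


def join(users, orders):
    """Run the three phases in sequence on in-memory data."""
    return list(reduce_phase(shuffle(map_phase(users, orders))))


if __name__ == "__main__":
    # Hypothetical sample tables: (user_id, name) and (user_id, amount).
    users = [(1, "ana"), (2, "bo"), (3, "cy")]
    orders = [(1, 10), (1, 5), (2, 7), (4, 1)]
    for row in join(users, orders):
        print(row)
```

On a real cluster the map and reduce functions run in parallel on different nodes, with each reducer handling a subset of the join keys, which is why adding nodes can shorten the processing time for large inputs.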
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
