Big Data Platform-A Review

Authors

  • Kumar S, Dept. of Computer Science and Engineering, Giani Zail Singh College Bhatinda, India

Keywords:

Hadoop, HDFS, Name node, Data node, Map Reduce, Data locality, Job Tracker, Task Tracker

Abstract

Hadoop is a popular distributed system used for analyzing large amounts of data. It is based on distributed computing and combines HDFS (Hadoop Distributed File System) with the MapReduce programming paradigm. Hadoop is highly fault-tolerant because it replicates data across multiple nodes, and it can be deployed on low-cost hardware. The file system, HDFS, is written in Java and designed for heterogeneous hardware and software. Hadoop is well suited to high volumes of data and to data in varied formats, such as semi-structured and unstructured data. It also provides high-speed access to application data. The Hadoop architecture is cluster based: a cluster consists of racks, and each rack consists of nodes (data nodes and a name node) that are, in ideal circumstances, physically separate from each other. In Hadoop, a MapReduce program is used to collect data according to a query. Because Hadoop handles massive amounts of data, its scheduling and the way it stores data must be efficient for good performance. Because of these features, Hadoop is replacing traditional systems. The research objective is to study and explore the various scheduling techniques used to increase performance in Hadoop. This paper covers how Hadoop works, its internal details, and why Hadoop is better than traditional systems.
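
The MapReduce paradigm mentioned in the abstract can be illustrated with a minimal, in-memory sketch. This is not Hadoop code and does not use the Hadoop API; the function names (map_phase, shuffle, reduce_phase, run_job) and the word-count task are assumptions chosen for illustration. It only shows the three logical steps (map, group by key, reduce) that a real job would distribute across the nodes of a cluster.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values of one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on an in-memory list of lines."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input lines, standing in for blocks of a file stored in HDFS.
    lines = [
        "hadoop stores data in hdfs",
        "hadoop processes data with mapreduce",
    ]
    print(run_job(lines))
```

In an actual cluster, the map step would typically run on the nodes that already hold the input blocks (data locality), and the grouped pairs would be moved over the network to the nodes running the reduce step.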

Published

2025-11-10

How to Cite

[1]
S. Kumar, “Big Data Platform-A Review”, Int. J. Comp. Sci. Eng., vol. 3, no. 10, pp. 84–87, Nov. 2025.