Comparative Study on Speculative Execution Strategy to Improve MapReduce Performance

Authors

  • Rahul R Ghule, Department of Computer Science & Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, India
  • Sachin N Deshmukh, Department of Computer Science & Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, India

Keywords:

MapReduce, Hadoop, Straggler, speculative execution

Abstract

MapReduce is a widely used programming model for processing large volumes of data, and Hadoop is an open-source implementation of the MapReduce framework. Hadoop's performance is measured by metrics such as job execution time and cluster throughput. In MapReduce, a job is divided into multiple map and reduce tasks. Some tasks run slowly for internal or external reasons; these slow tasks, called stragglers, prolong job execution time and degrade Hadoop's performance. To counter this, the current MapReduce framework uses speculative execution, in which each slow task is backed up on another node to reduce job execution time. However, the default speculative execution does not estimate task progress accurately, which leads it to misidentify slow tasks, and it does not account for data skew among tasks. This paper studies several speculative execution strategies, namely History-based Auto-Tuning (HAT), Longest Approximate Time to End (LATE), and Maximum Cost Performance (MCP), which address the drawbacks of default speculative execution to improve MapReduce performance.
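
As a rough illustration of the progress estimation the abstract refers to, the following Python sketch shows the general idea behind a LATE-style heuristic: estimate each running task's time to completion from its observed progress rate, then back up the slow task expected to finish last. The `Task` class, the function names, and the `slow_fraction` parameter are hypothetical and chosen only for this example; this is not Hadoop's API and not the implementation evaluated in the paper.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    progress: float  # progress score in [0, 1] (assumed to be reported by the framework)
    elapsed: float   # seconds since the task started


def estimated_time_to_end(task: Task) -> float:
    """Remaining work divided by the observed progress rate."""
    if task.progress <= 0.0 or task.elapsed <= 0.0:
        return float("inf")  # no progress observed yet, so no finite estimate
    rate = task.progress / task.elapsed
    return (1.0 - task.progress) / rate


def pick_backup_candidate(tasks, slow_fraction=0.25):
    """Among the slowest running tasks by progress rate, return the one
    with the longest estimated time to end, or None if nothing is running."""
    running = [t for t in tasks if t.progress < 1.0 and t.elapsed > 0.0]
    if not running:
        return None
    by_rate = sorted(running, key=lambda t: t.progress / t.elapsed)
    cutoff = max(1, int(len(by_rate) * slow_fraction))  # illustrative threshold
    return max(by_rate[:cutoff], key=estimated_time_to_end)
```

Ranking by estimated time to end, rather than by raw progress score, is meant to avoid backing up a task that started late but is running at a normal rate.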

Published

2015-03-31

How to Cite

[1]
R. R. Ghule and S. N. Deshmukh, “Comparative Study on Speculative Execution Strategy to Improve MapReduce Performance”, Int. J. Comp. Sci. Eng., vol. 3, no. 3, pp. 197–200, Mar. 2015.

Section

Review Article