A Streamlined Frequent Item set excavating using FP growth from Map Reduce
Keywords:
Frequent item set mining, Hadoop Map Reduce, Parallel FP-Growth, Small files problem
Abstract
As an important part of discovering association rules, frequent itemset mining plays a key role in mining associations, correlations, and other important data mining tasks. Some traditional frequent itemset mining algorithms cannot handle massive small-file datasets effectively: they suffer from high memory cost, high I/O overhead, and low computing performance. This paper therefore proposes an improved Parallel FP-Growth algorithm (EPFP) and discusses its applications. In particular, it introduces a small-file processing strategy for massive small-file datasets to compensate for the slow read/write speed and low processing efficiency of Hadoop on such data. Moreover, it uses MapReduce to parallelize the FP-Growth algorithm, thereby improving the overall performance of frequent itemset mining. The experimental results demonstrate that the EPFP algorithm is feasible and valid, with good speedup and higher mining efficiency, and that it can meet the rapidly growing needs of frequent itemset mining for massive small-file datasets.
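The two ideas named in the abstract, merging many small files before processing and counting item support in a map/reduce fashion, can be illustrated with a minimal single-machine sketch. It is not the EPFP implementation: the file names, the one-transaction-per-line format, and the function names are all assumptions made for illustration, and only the initial parallel counting pass is shown, not FP-tree construction or the recursive mining step.

```python
from collections import Counter


def merge_small_files(files):
    """Combine many small transaction files into one list of transactions.

    `files` maps a (hypothetical) file name to its text content, with one
    transaction per line and items separated by whitespace.
    """
    transactions = []
    for name in sorted(files):
        for line in files[name].splitlines():
            items = line.split()
            if items:  # skip blank lines
                transactions.append(items)
    return transactions


def map_count(split):
    """Map step: count each item once per transaction within one split."""
    local = Counter()
    for transaction in split:
        local.update(set(transaction))
    return local


def reduce_counts(partials):
    """Reduce step: sum the per-split counts into global support counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def frequent_items(transactions, min_support, num_splits=2):
    """Return the items whose support count is at least `min_support`."""
    # Round-robin partitioning stands in for the input splits of a cluster.
    splits = [transactions[i::num_splits] for i in range(num_splits)]
    totals = reduce_counts(map_count(s) for s in splits)
    return {item: n for item, n in totals.items() if n >= min_support}


# Hypothetical toy input: three small files holding four transactions.
files = {
    "part-a.txt": "bread milk\nbread butter\n",
    "part-b.txt": "milk butter bread\n",
    "part-c.txt": "milk\n",
}
transactions = merge_small_files(files)
result = frequent_items(transactions, min_support=3)
```

In this toy input, bread and milk each appear in three of the four transactions and butter in two, so only bread and milk reach the support threshold of 3. On a real Hadoop cluster the merge step would be handled by the input format rather than by application code, which is why it is kept separate from the counting functions here.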
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
