A Novel Data Aggregation Technique for Removing Redundant Data in Hadoop
Keywords:
here we are grouping the frequent itemsets and removing the redundant data
Abstract
Hadoop is a software framework developed by the Apache Software Foundation. It is written in Java and is designed to handle large volumes of data. Hadoop runs its tasks using MapReduce, a programming model suited to processing very large data sets. The MapReduce framework has two phases, a map phase and a reduce phase. A MapReduce job usually splits the input data set into independent chunks, which are processed in the map phase. The framework then sorts the output of the map phase, which becomes the input to the reduce phase. Mining frequent itemsets in this way is resource-intensive and time-consuming. To overcome this problem, we implement a novel data aggregation technique.
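The map, sort, and reduce flow described above can be illustrated with a small in-memory sketch. It is not the paper's implementation and does not use the Hadoop API. It assumes, purely for illustration, that "removing redundant data" means collapsing identical transactions into one record with a count before the map phase. All function names (`deduplicate`, `map_phase`, `shuffle`, `reduce_phase`, `frequent_itemsets`) and the parameters `k`, `min_support`, and `n_chunks` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def deduplicate(transactions):
    """Assumed aggregation step: keep each distinct transaction once, with a count."""
    counts = defaultdict(int)
    for transaction in transactions:
        counts[frozenset(transaction)] += 1
    return dict(counts)


def map_phase(chunk, k):
    """Emit (itemset, weight) for every k-item subset of each aggregated transaction."""
    for items, weight in chunk:
        for itemset in combinations(sorted(items), k):
            yield itemset, weight


def shuffle(pairs):
    """Group map output by key and sort the keys, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(grouped, min_support):
    """Sum the weights per itemset and keep those that reach the support threshold."""
    totals = {key: sum(values) for key, values in grouped}
    return {key: total for key, total in totals.items() if total >= min_support}


def frequent_itemsets(transactions, k, min_support, n_chunks=2):
    """Run the whole pipeline over independent chunks of the aggregated input."""
    aggregated = list(deduplicate(transactions).items())
    chunks = [aggregated[i::n_chunks] for i in range(n_chunks)]
    pairs = [pair for chunk in chunks for pair in map_phase(chunk, k)]
    return reduce_phase(shuffle(pairs), min_support)


transactions = [
    ["bread", "milk"],
    ["milk", "bread"],  # duplicate of the first transaction
    ["bread", "butter"],
    ["bread", "milk", "butter"],
]
result = frequent_itemsets(transactions, k=2, min_support=2)
```

In this sketch the duplicate transaction is stored once with a weight of 2, so the map phase enumerates its subsets only once while the support counts stay the same as without aggregation.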
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
