Generating Frequent Item Sets Using Apache Hadoop Map Reduce and Mahout

Authors

  • Sasanka Kasyap B, Department of CSE, Prasad V Potluri Siddhartha Institute of Technology, Vijayawada, A.P, India
  • K Syama Sundara Rao, Department of CSE, Prasad V Potluri Siddhartha Institute of Technology, Vijayawada, A.P, India

Keywords:

Big Data, Hadoop, Map Reduce, Mahout, K-Means

Abstract

Item set mining is one of the best-known techniques for extracting knowledge from data. The technique runs into problems on some data, and to address them further enhancements based on Big Data have been applied, with their performance explored on Map Reduce. New approaches for mining large datasets, such as K-means and Mahout, target speed and running time, while BigFIM is optimized to run on very large datasets. The K-means algorithm depends on Map Reduce, which provides the infrastructure for processing large datasets across distributed clusters. The resulting clusters are combined in the form of nodes and edges, which also display the item sets. The same clustering can be executed on large datasets with high scalability and performance.
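To make the idea of counting item sets in a Map Reduce style concrete, the following is a minimal single-machine sketch in Python. It is not the authors' implementation and does not use Hadoop or Mahout; the function names (`map_phase`, `reduce_phase`, `frequent_itemsets`), the sample transactions, and the `min_support` threshold are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, k):
    """Emit (itemset, 1) for every k-item subset of one transaction."""
    items = sorted(set(transaction))  # deduplicate and fix a canonical order
    return [(itemset, 1) for itemset in combinations(items, k)]


def reduce_phase(pairs):
    """Sum the counts emitted for each itemset key."""
    counts = Counter()
    for itemset, count in pairs:
        counts[itemset] += count
    return counts


def frequent_itemsets(transactions, k, min_support):
    """Keep the k-item sets that occur in at least min_support transactions."""
    pairs = [pair for t in transactions for pair in map_phase(t, k)]
    counts = reduce_phase(pairs)
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical sample data, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "milk"],
]
result = frequent_itemsets(transactions, k=2, min_support=2)
print(result)
```

In a real Hadoop job the map and reduce functions would run on separate nodes, with the framework grouping the emitted pairs by key between the two phases; here that grouping is simulated by a single in-memory `Counter`.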

Published

2025-11-10

How to Cite

[1]
B. Sasanka Kasyap and K. Syama Sundara Rao, “Generating Frequent Item Sets Using Apache Hadoop Map Reduce and Mahout”, Int. J. Comp. Sci. Eng., vol. 3, no. 10, pp. 43–46, Nov. 2025.

Section

Research Article