Infrequent Weighted Itemset Mining for Large Dataset
Keywords:
Data Mining, Frequent Itemset, Infrequent Itemset, Weighted Itemset, Hadoop, MapReduce
Abstract
Data mining is the process of analysing data from many different perspectives or dimensions, categorizing it, and summarizing it into useful information. This information can be used to increase profits, cut costs, or both. Specifically, data mining extracts valuable correlations or patterns among the many fields of large relational databases. Pattern mining has become an important task in data mining, and mining frequent and infrequent itemsets from a dataset is one of its most important problems. Mining frequent itemsets is very expensive when the minimum support threshold is low, and mining infrequent itemsets is very expensive when the threshold is high. The proposed system uses multiple-level minimum supports to constrain infrequent itemsets: itemsets of different lengths are given different minimum supports, so that an appropriate number of infrequent itemsets is mined. In this paper, we implement infrequent weighted itemset mining on the Hadoop-MapReduce model, which can handle massive datasets when mining infrequent itemsets, and we propose two novel algorithms based on IWI Miner to drive the IWI mining process. This paper focuses on discovering itemsets that occur rarely in a large dataset, known as the infrequent weighted itemset (IWI) mining problem.
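The idea of length-dependent supports combined with a map and reduce step can be illustrated with a minimal single-machine sketch. It is not the paper's algorithm and does not use Hadoop. Several things here are assumptions, since the abstract does not define them: the weighted support of an itemset is taken as the sum, over the transactions that contain it, of the smallest item weight in that transaction; an itemset of length k counts as infrequent when that support is at most `MAX_SUPPORT[k]`; and the data, thresholds, and function names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: each transaction maps an item to its weight.
TRANSACTIONS = [
    {"a": 1.0, "b": 2.0, "c": 3.0},
    {"a": 2.0, "b": 1.0},
    {"a": 4.0, "c": 1.0},
]

# Hypothetical per-length thresholds (assumption): an itemset of length k
# is reported as infrequent when its weighted support is <= MAX_SUPPORT[k].
MAX_SUPPORT = {1: 3.0, 2: 2.0}


def map_phase(transaction, max_len):
    """Emit (itemset, weight) pairs for one transaction.

    The weight is the minimum item weight inside the itemset, which is an
    assumed definition of an itemset's weight in a transaction.
    """
    items = sorted(transaction)
    for k in range(1, max_len + 1):
        for itemset in combinations(items, k):
            yield itemset, min(transaction[i] for i in itemset)


def reduce_phase(pairs):
    """Sum the emitted weights per itemset, as a reducer would per key."""
    support = defaultdict(float)
    for itemset, weight in pairs:
        support[itemset] += weight
    return dict(support)


def mine_infrequent(transactions, max_support):
    """Keep itemsets whose weighted support is within their length's threshold."""
    max_len = max(max_support)
    pairs = (p for t in transactions for p in map_phase(t, max_len))
    support = reduce_phase(pairs)
    return {s: w for s, w in support.items() if w <= max_support[len(s)]}


if __name__ == "__main__":
    for itemset, weight in sorted(mine_infrequent(TRANSACTIONS, MAX_SUPPORT).items()):
        print(itemset, weight)
```

Itemsets that never occur in any transaction are not emitted by the map step, so they do not appear in the result. Enumerating every subset per transaction is exponential in the transaction length; the sketch only shows how the thresholds and the two phases fit together, not how to make the search efficient.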
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
