Regression Based Data Mining Techniques for Frequent Data Stream
Keywords:
Data mining, Time Series Data, Regression Techniques, Stream Data
Abstract
Data mining on stream data must handle data quality and analysis for an extremely large, potentially unbounded volume of data while using disk or memory of limited capacity [2]. In such an environment, traditional transaction-based frequent-item mining is not feasible, because it requires determining, for continuously arriving stream data, which items are currently frequent and which are likely to become frequent. This paper analyzes a way to predict frequent items by applying a linear regression model [5] to continuously arriving one-dimensional stream data, such as time series data. A regression model fitted to the stream data can then serve as a prediction model for items whose status is still uncertain. The effectiveness of the proposed method is demonstrated through experiments on stream data.
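The idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm, which this page does not describe in detail. It assumes the stream is cut into fixed-size windows, that an item's relative frequency (support) per window forms a time series, and that a least-squares line through that series is extrapolated one window ahead. The function names `fit_line`, `window_supports`, and `predict_frequent` and the parameters `window_size` and `min_support` are hypothetical, introduced only for this example.

```python
from collections import Counter


def fit_line(ys):
    """Ordinary least squares fit of y = intercept + slope * t, t = 0..n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two windows to fit a line")
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def window_supports(stream, window_size):
    """Split the stream into consecutive full windows and return, for each
    window, a dict mapping item -> relative frequency in that window."""
    supports = []
    for start in range(0, len(stream) - window_size + 1, window_size):
        counts = Counter(stream[start:start + window_size])
        supports.append({item: c / window_size for item, c in counts.items()})
    return supports


def predict_frequent(stream, window_size, min_support):
    """Return items whose extrapolated support in the next (unseen) window
    is at least min_support (assumed threshold, not from the paper)."""
    supports = window_supports(stream, window_size)
    items = set().union(*supports)
    result = {}
    for item in items:
        series = [s.get(item, 0.0) for s in supports]
        intercept, slope = fit_line(series)
        predicted = intercept + slope * len(series)  # one step ahead
        predicted = min(1.0, max(0.0, predicted))    # clamp to a valid support
        if predicted >= min_support:
            result[item] = predicted
    return result


# Toy stream: "a" grows from 1/4 to 3/4 of each window, "b" shrinks.
toy_stream = list("abbb" "aabb" "aaab")
toy_prediction = predict_frequent(toy_stream, window_size=4, min_support=0.5)
```

On the toy stream, the support of "a" rises by a constant amount per window, so the fitted line predicts it will be frequent in the next window, while "b" falls below the threshold. A real stream setting would need incremental updates instead of storing the whole stream, which this sketch leaves out.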
References
[1] D. F. Andrews, "A robust method for multiple linear regression," Technometrics, vol. 16, pp. 125-127, 1974.
[2] Chai, Eun Hee Kim, and Long Jin, "Prediction of Frequent Items to One Dimensional Stream Data," Fifth International Conference on Computational Science and Applications, pp. 353-360, 2001.
[3] Y. Chen, G. Dong, J. Han, B. W. Wah, and J. Wang, "Multi-Dimensional Regression Analysis of Time-Series Data Streams," Proc. Int. Conf. Very Large Data Bases, Hong Kong, China, Aug. 2002.
[4] C. Giannella, J. Han, J. Pei, X. Yan, and P. S. Yu, "Mining Frequent Patterns in Data Streams at Multiple Time Granularities," in H. Kargupta, A. Joshi, K. Sivakumar, and Y. Yesha (eds.), Next Generation Data Mining, AAAI/MIT, 2003.
[5] R. Hayward, "A Basic Approach to Linear Regression," RWJ Clinical Scholars Program, University of Michigan, pp. 1-3, 2005.
[6] O. B. Yaik, C. H. Yong, and F. Haron, "Time Series Prediction using Adaptive Association Rules," in Proc. of DFMA05, pp. 310-314, 2005.
[7] Omid Rouhani-Kalleh, "Algorithms for Fast Large Scale Data Mining Using Logistic Regression," Proceedings of the 2007 IEEE Symposium on Computational Intelligence and Data Mining, pp. 155-162, 2007.
[8] Feng Zhao and Qing-Hua Li, "A Plane Regression Based Sequence Forecast Algorithm for Stream Data," Proc. of the Fourth International Conference on Machine Learning and Cybernetics, pp. 1559-1562, Guangzhou, 18-21 August 2005.
[9] Y. Peng, G. Kou, Y. Shi, and Z. Chen, "A Descriptive Framework for the Field of Data Mining and Knowledge Discovery," International Journal of Information Technology and Decision Making, vol. 7, no. 4, pp. 639-682, 2000.
[10] C. Perlich, F. Provost, and J. S. Simonoff, "Tree Induction vs. Logistic Regression: A Learning-Curve Analysis," Journal of Machine Learning Research, vol. 4, pp. 211-255, 2003.
[11] Amir Bar-Or, Daniel Keren, Assaf Schuster, and Ran Wolff, "Hierarchical Decision Tree Induction in Distributed Genomic Databases," IEEE Transactions on Knowledge and Data Engineering, vol. 17, pp. 1138-1150, 2007.
[12] Qi Luo, "Advancing Knowledge Discovery and Data Mining," Workshop on Knowledge Discovery and Data Mining, pp. 3-5, 2008.
[13] Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth, "From Data Mining to Knowledge Discovery in Databases," pp. 12-17, June 2008.
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
