Credit Card Fraud Detection using Feature Augmentation based Boosted Ensemble (FABE)
DOI:
https://doi.org/10.26438/ijcse/v6i12.841846
Keywords:
Credit card fraud detection, Ensemble model, Feature Augmentation, Feature Reduction, Feature Engineering, Boosting
Abstract
Fraud detection in credit card transactions has become mandatory for the financial services industry due to the high level of automation in the industry. This work presents a Feature Augmentation based Boosted Ensemble (FABE) for credit card fraud detection on large-scale data. The proposed model integrates two major components: feature augmentation and ensemble creation. The feature augmentation phase performs feature reduction, feature transformation and feature engineering. Feature reduction eliminates unnecessary features, while feature transformation and feature engineering create new features that can improve predictions. The ensemble creation phase models a boosted ensemble using Decision Trees. Multiple training data bags and multiple base learners are created. The learner with the highest weight and lowest error is iteratively modelled and used as the final learner. Experiments were performed, and comparisons with existing models in the literature demonstrate the high performance of the proposed FABE model.
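The two phases described in the abstract can be illustrated with a minimal sketch. This is not the authors' FABE implementation: it assumes a variance filter for feature reduction, a log-scaled copy of one column as the engineered feature, and standard AdaBoost over decision stumps (one-level decision trees) in place of the paper's bagged, iteratively selected learners. All function names, the toy transactions and the column meanings (amount, a constant flag, hour of day) are hypothetical.

```python
import math


def augment(rows, min_variance=1e-9):
    """Feature augmentation sketch: drop near-constant columns (reduction),
    then append log1p of the first kept column (an assumed engineered feature)."""
    n = len(rows)
    kept = []
    for col in zip(*rows):
        mean = sum(col) / n
        variance = sum((v - mean) ** 2 for v in col) / n
        if variance > min_variance:
            kept.append(list(col))
    kept.append([math.log1p(abs(v)) for v in kept[0]])
    return [list(r) for r in zip(*kept)]


def stump_predict(row, feature, threshold, polarity):
    """One-level decision tree: vote `polarity` above the threshold, else the opposite."""
    return polarity if row[feature] > threshold else -polarity


def fit_stump(X, y, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = (float("inf"), 0, 0.0, 1)
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                error = sum(
                    w for row, label, w in zip(X, y, weights)
                    if stump_predict(row, feature, threshold, polarity) != label
                )
                if error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best


def fit_boosted(X, y, rounds=5):
    """Standard AdaBoost: reweight samples so later stumps focus on earlier mistakes."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, feature, threshold, polarity = fit_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)  # learner weight
        ensemble.append((alpha, feature, threshold, polarity))
        weights = [
            w * math.exp(-alpha * label * stump_predict(row, feature, threshold, polarity))
            for row, label, w in zip(X, y, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, row):
    """Weighted vote of all stumps: +1 means fraud, -1 means legitimate."""
    score = sum(a * stump_predict(row, f, t, p) for a, f, t, p in ensemble)
    return 1 if score > 0 else -1


# Toy transactions: [amount, constant flag, hour of day]; labels are made up.
rows = [
    [12.0, 1.0, 9.0], [25.0, 1.0, 14.0], [8.5, 1.0, 20.0], [40.0, 1.0, 11.0],
    [950.0, 1.0, 3.0], [1200.0, 1.0, 2.0], [30.0, 1.0, 16.0], [1800.0, 1.0, 4.0],
]
labels = [-1, -1, -1, -1, 1, 1, -1, 1]

X = augment(rows)
model = fit_boosted(X, labels, rounds=5)
predictions = [predict(model, row) for row in X]
```

In this sketch the constant column is removed before training, and each stump's weight `alpha` grows as its weighted error shrinks, which loosely mirrors the abstract's emphasis on learners with high weight and low error. A real fraud dataset would also need handling for class imbalance, which the toy data does not exercise.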
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
