Unsupervised Distance-Based Outlier Detection using Reversible KNN with Fuzzy Clustering

Authors

  • S. Vasuki, Dept. of Computer Applications, J. J. College of Arts and Science (Autonomous), Pudukkottai, India

DOI:

https://doi.org/10.26438/ijcse/v7i6.11951199

Keywords:

Clustering, data mining, fuzzy c-means, outliers, unsupervised learning

Abstract

Detecting outliers in high-dimensional data raises several challenges associated with the “curse of dimensionality”. A prevailing view is that distance concentration, that is, the tendency of distances in high-dimensional data to become indiscernible, hinders outlier detection because distance-based approaches end up marking all points as almost equally good outliers. This paper provides evidence that this view is too simple by showing that distance-based methods can produce more contrasting outlier scores in high-dimensional settings. In addition, we show that high dimensionality can have a different effect in the unsupervised setting by re-examining the more recent concept of reverse nearest neighbors in the context of outlier detection. It has recently been observed that the distribution of points' reverse-neighbor counts becomes skewed in high dimensions, a phenomenon known as hubness. This work explains how some points, called antihubs, rarely appear in the k-NN lists of other points, and describes the connection between antihubs, outliers, and existing unsupervised outlier detection methods. The evaluation covers the classical k-NN approach, angle-based techniques designed for high-dimensional data, density-based local outlier methods, and various antihub-based methods. Using a combination of data sets that includes real-world data, this work offers new insight into the usefulness of reverse-neighbor counts for unsupervised outlier detection.
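
To make the reverse-neighbor idea concrete, the following minimal Python sketch counts how often each point appears in the k-NN lists of the other points and turns a low count into a high outlier score. The function names, the toy data, and the score 1 / (1 + count) are illustrative assumptions, not the procedure evaluated in the paper.

```python
import math
import random


def knn_indices(points, k):
    """For each point, return the indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def reverse_neighbor_counts(points, k):
    """Count how many other points list each point among their k nearest neighbors."""
    counts = [0] * len(points)
    for nbrs in knn_indices(points, k):
        for j in nbrs:
            counts[j] += 1
    return counts


def antihub_scores(points, k):
    """Assumed scoring rule: a low reverse-neighbor count gives a score close to 1."""
    return [1.0 / (1.0 + c) for c in reverse_neighbor_counts(points, k)]


# Hypothetical toy data: a tight 2-D cluster plus one isolated point (index 20).
rng = random.Random(0)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]
points.append((25.0, 25.0))

counts = reverse_neighbor_counts(points, k=3)
scores = antihub_scores(points, k=3)
print("reverse-neighbor count of the isolated point:", counts[20])
print("outlier score of the isolated point:", scores[20])
```

The isolated point appears in no other point's k-NN list, so it receives the maximum score. Points inside the cluster can also have a count of zero, which is one reason a raw reverse-neighbor count may need further refinement before it is used to rank outliers.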

Published

2019-06-30

How to Cite

[1]
S. Vasuki, “Unsupervised Distance-Based Outlier Detection using Reversible KNN with Fuzzy Clustering”, Int. J. Comp. Sci. Eng., vol. 7, no. 6, pp. 1195–1199, Jun. 2019.

Section

Research Article