Improvement of Time Complexity on External Sorting using Refined Approach and Data Preprocessing

Authors

  • S Hrushikesava Raju, Research Scholar, Regd. No: PP.CSE.0158, Rayalaseema University, Kurnool, A.P.
  • M Nagabhusana Rao, Professor, Department of CSE, K L University, Vijayawada, A.P.

Keywords:

data preprocessing, external sorting, data cleaning, passes, inputs/outputs, runs

Abstract

The large data sets held by an organization generally contain redundancy, noise, and inconsistency. To eliminate these, data preprocessing should be performed on the raw data before a sorting technique is applied. Data preprocessing includes methods such as data cleaning, data integration, data transformation, and data reduction. Depending on the complexity of the given data, the appropriate methods are selected and applied to the raw data to produce quality data. External sorting is then applied. The proposed external sort takes fewer passes than the log_B(N/M) + 1 passes required by traditional B-way external merge sort. Its number of input/output operations is also lower than the 2*N*(log_B(N/M) + 1) required by the traditional method, and it uses fewer runs than basic external sorting.
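The baseline figures quoted in the abstract can be made concrete with a small sketch. It computes only the traditional B-way external merge sort costs, not the proposed method, whose details are not given on this page. It assumes N is the number of input items (or pages), M is the number that fit in memory at once, and B is the merge order. The function names are hypothetical, and the logarithm is rounded up to a whole number of merge passes.

```python
import math

def traditional_passes(n, m, b):
    """Passes of a traditional B-way external merge sort: ceil(log_B(N/M)) + 1.

    Assumed meanings: n = input size, m = memory capacity, b = merge order.
    """
    if n < 1 or m < 1 or b < 2:
        raise ValueError("need n >= 1, m >= 1 and b >= 2")
    runs = math.ceil(n / m)        # sorted runs produced by the run-formation pass
    merge_passes = 0
    while runs > 1:                # each merge pass combines up to b runs into one
        runs = math.ceil(runs / b)
        merge_passes += 1
    return merge_passes + 1        # +1 for the run-formation pass itself

def traditional_io(n, m, b):
    """I/O count 2*N*(passes): every pass reads and writes all n items once."""
    return 2 * n * traditional_passes(n, m, b)

if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the paper.
    print(traditional_passes(1000, 10, 10), traditional_io(1000, 10, 10))
    # A smaller N, such as preprocessing might yield, lowers both counts.
    print(traditional_passes(100, 10, 10), traditional_io(100, 10, 10))
```

Because both counts depend on N, any preprocessing step that shrinks the input (for example, removing duplicate records) can lower the I/O count directly and, if enough data is removed, the pass count as well.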

Published

2025-11-11

How to Cite

[1]
S. Hrushikesava Raju and M. Nagabhusana Rao, “Improvement of Time Complexity on External Sorting using Refined Approach and Data Preprocessing”, Int. J. Comp. Sci. Eng., vol. 4, no. 11, pp. 82–86, Nov. 2025.

Section

Research Article