Enhancing the Productivity of Digital Data Retrieval
Keywords:
Textual Evidences, Stemming, Information Retrieval, Preprocessing
Abstract
The escalating crime rate around the globe exposes the shortcomings of current methods for extracting and inspecting information that a digital investigation system retrieves and caches from diverse communication channels. The principal objective of this strategy is to revamp the existing methodology and develop a crime-related information mining framework that extracts and examines pertinent information from stored data, including emails, chat threads, and text messages, in order to uncover criminal activity and resolve cases using the facts concealed within the data. The strategy rests on three aspects applied to the input textual evidence (mails, text messages, chat threads, and so on). The first is the separation of the header and body of each item in a textual corpus drawn from diverse communication channels, achieved using regular expressions in a PHP script. The second is preprocessing of the text, achieved by stemming, which makes text mining more reliable for the forensic department of crime investigation. The third is the searching technique, which is built for effective and efficient retrieval of data. Although separating the header and body of a mail is a long-established technique, the main focus of this methodology is the efficiency of the searching technique. A clustering algorithm is used in this methodology to improve the system. This algorithm has three main features: it is a variant of the Reverse Factor algorithm; it uses a bit-parallel simulation of the suffix automaton of x^R, the reverse of the pattern x; and it is highly efficient when the pattern length does not exceed the memory-word size of the machine.
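The header/body separation and the search step described above can be sketched as follows. This is a minimal illustration and not the authors' implementation. It uses Python rather than the PHP script mentioned in the abstract. It assumes that header and body are divided by the first blank line, as in standard mail formats. It also assumes that the search algorithm is Backward Nondeterministic Dawg Matching (BNDM), introduced by Navarro and Raffinot in 1998, because the three listed features match that algorithm. The function names `split_header_body` and `bndm_search` and the sample message are hypothetical.

```python
import re


def split_header_body(raw: str) -> tuple[str, str]:
    """Split a raw message at the first blank line (assumed mail-style layout)."""
    parts = re.split(r"\r?\n\r?\n", raw, maxsplit=1)
    header = parts[0]
    body = parts[1] if len(parts) > 1 else ""
    return header, body


def bndm_search(pattern: str, text: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text (BNDM sketch)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    mask = (1 << m) - 1        # keeps the state to m bits
    high_bit = 1 << (m - 1)    # set when a prefix of the pattern is recognised
    # table[c] has bit (m-1-i) set whenever pattern[i] == c
    table: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        table[ch] = table.get(ch, 0) | (1 << (m - 1 - i))

    matches = []
    pos = 0
    while pos <= n - m:
        i = m - 1              # scan the window from right to left
        shift = m              # default: skip the whole window
        state = mask           # every factor of the pattern is still possible
        while state != 0:
            state &= table.get(text[pos + i], 0)
            i -= 1
            if state & high_bit:
                if i >= 0:
                    shift = i + 1        # a proper prefix matched: remember its alignment
                else:
                    matches.append(pos)  # the whole window matched
            state = (state << 1) & mask
        pos += shift
    return matches


# Hypothetical sample message, used only to exercise the two functions.
raw_message = "From: a@example.com\nSubject: hi\n\nmeet at the dock at noon\n"
header, body = split_header_body(raw_message)
print(bndm_search("dock", body))
```

Python integers are arbitrary precision, so the sketch runs for any pattern length. The efficiency noted in the abstract depends on the whole state fitting in one machine word, which is why fixed-width implementations restrict the pattern length.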
Using this kind of technique to improve the existing system yields a methodical procedure and a highly effective searching system, as evidenced by its time complexity, precision, and recall values.
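Precision and recall, the two retrieval measures named above, can be computed as in the following minimal sketch. It assumes the standard set-based definitions: precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. The function name `precision_recall` and the sample message identifiers are hypothetical and are not results from this work.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved and a set of relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were actually retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical message identifiers, for illustration only.
precision, recall = precision_recall(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5])
print(precision, recall)
```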
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
