Characteristic mining of Mathematical Formulas from Document - A Comparative Study on Sequence Matcher and Levenshtein Distance procedure
DOI:
https://doi.org/10.26438/ijcse/v6i4.400404
Keywords:
Levenshtein distance, Sequence matcher
Abstract
The key problem addressed here is how to identify the mathematics-related keywords in a given text file and store them in a single math text file, so that this file contains only keywords related to mathematics. The math dataset is built from a large collection of tested documents and is stored in the math text file. The dataset is trained on a large number of text files, and it grows as it is trained on more text samples. In the end the dataset contains only math-related keywords. Two approaches are proposed, Sequence Matcher and Levenshtein Distance, both of which are used to check string similarity. The approaches are evaluated on text containing formulas that appear once and formulas that are repeated. Retrieval performance is measured on both kinds of formula, and the time taken for retrieval is also measured.
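The abstract does not give implementation details, so the following is only a minimal sketch of how the two similarity measures could be compared. It assumes that "Sequence matcher" refers to Python's `difflib.SequenceMatcher`. The helper names (`levenshtein`, `levenshtein_ratio`, `sequence_ratio`, `find_math_keywords`), the 0.8 threshold, and the sample keyword list are hypothetical and chosen purely for illustration.

```python
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_ratio(a: str, b: str) -> float:
    """Distance normalised to a similarity in [0, 1] (an assumed normalisation)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def sequence_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1] as computed by difflib.SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()


def find_math_keywords(words, math_keywords, similarity, threshold=0.8):
    """Keep the words that are close enough to any known math keyword.
    The threshold is a hypothetical choice, not a value from the paper."""
    return [w for w in words
            if any(similarity(w.lower(), k) >= threshold for k in math_keywords)]


if __name__ == "__main__":
    keywords = ["integral", "sine", "matrix"]   # stand-in for the math text file
    text = ["The", "integrl", "of", "sine"]     # stand-in for an input document
    print(find_math_keywords(text, keywords, levenshtein_ratio))
    print(find_math_keywords(text, keywords, sequence_ratio))
```

Both measures accept the misspelled word `integrl` here because it differs from `integral` by one character. Retrieval time for each measure could be compared by wrapping the two calls in `time.perf_counter()` measurements.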
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
