Comparative Study of String Matching Algorithms for DNA dataset

Authors

  • Rahate PM Dept. of Computer Science and Tech., Shri Ramdeobaba College of Engineering & Management, Nagpur, Maharashtra, India
  • Chandak MB Dept. of Computer Science and Tech., Shri Ramdeobaba College of Engineering & Management, Nagpur, Maharashtra, India

DOI:

https://doi.org/10.26438/ijcse/v6i5.10671074

Keywords:

String Matching Algorithm, DNA sequence

Abstract

String matching algorithms are widely used in computer science for information retrieval, intrusion detection, music retrieval, database queries, language syntax checking, bioinformatics, and DNA sequence matching. One of their best-known applications is in bioinformatics, where the DNA sequence of a healthy person is matched against the DNA sequence of a person carrying a virus or disease. The pattern of a disease or virus is matched against the normal DNA genome sequence, which is represented as a string. If the pattern is found in the sequence, the patient is considered to have the tested disease. The pattern must therefore be matched against a large volume of DNA sequence data, which is sometimes very complex and not easy to search. To find the matched pattern in less time and with greater accuracy, algorithms such as Knuth-Morris-Pratt (KMP), Boyer-Moore, Brute Force, Rabin-Karp, and others are used. This paper presents five string matching algorithms, of which four are exact matching algorithms and one is an approximate string matching algorithm (Edit Distance). The complexity of these algorithms is compared on a DNA dataset to identify the algorithm that offers the best running time and accuracy [1].
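
To illustrate the difference between exact and approximate matching described in the abstract, the following is a minimal Python sketch of three of the named techniques: Brute Force, KMP, and Edit Distance. It is not the implementation used in the paper. The function names, the sample sequence, and the pattern are assumptions made up for this example, and the edit distance shown is the standard unit-cost Levenshtein variant, which may differ from the variant the authors evaluated.

```python
def brute_force_search(text, pattern):
    """Return every index where pattern occurs in text (naive scan)."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []
    return [i for i in range(n - m + 1) if text[i:i + m] == pattern]


def kmp_search(text, pattern):
    """Return every index where pattern occurs in text (Knuth-Morris-Pratt)."""
    m = len(pattern)
    if m == 0:
        return []
    # failure[i] = length of the longest proper prefix of pattern[:i + 1]
    # that is also a suffix of it.
    failure = [0] * m
    k = 0
    for i in range(1, m):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k

    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == m:
            matches.append(i - m + 1)
            k = failure[k - 1]  # keep going to find overlapping matches
    return matches


def edit_distance(a, b):
    """Levenshtein distance with unit-cost insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical sample data, not taken from the paper's dataset.
sequence = "ACGTACGTTAGCACGT"
pattern = "ACGT"

print(brute_force_search(sequence, pattern))  # [0, 4, 12]
print(kmp_search(sequence, pattern))          # [0, 4, 12]
print(edit_distance("ACGT", "AGGT"))          # 1
```

Both exact matchers report the same positions; they differ only in how much work they repeat after a mismatch. The edit distance instead returns how many single-character changes separate two strings, which is what allows an approximate matcher to tolerate mutations in a sequence.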

References

Nyo Me Tun, Thin Mya Mya Swe, “Comparison of Three Pattern Matching Algorithms using DNA Sequences”, IJSETR, Vol.3, Issue.35, pp.6916-6920, 2014.

https://en.wikipedia.org/wiki/DNA

Robert Sedgewick, Kevin Wayne, “Algorithms”, Fourth Edition, Addison-Wesley, Pearson Edition, India, pp.760-776, 2011.

Thomas Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, “Introduction to Algorithms”, McGraw-Hill Publication, India, pp.909-926, 2001.

Raju Bhukya, DVLN Somayajulu, “Exact Multiple Pattern Matching Algorithm using DNA Sequence and Pattern Pair”, International Journal of Computer Applications, Number 8, Article 6, pp.32-38, 2011.

Petteri Jokinen, Jorma Tarhio and Esko Ukkonen, “A Comparison of Approximate String Matching Algorithms”, Software-Practice and Experience, Vol.1(1), pp.1-4, 1988.

http://ccg.vitalit.ch/cgibin/htpselex/show?htpselex&tf=NF1_1&clone

https://archive.ics.uci.edu/ml/machine-learning-databases/molecular-biology/promoter-gene-sequences/promoters.data

Published

2025-11-13

How to Cite

[1]
P. M. Rahate and M. B. Chandak, “Comparative Study of String Matching Algorithms for DNA dataset”, Int. J. Comp. Sci. Eng., vol. 6, no. 5, pp. 1067–1074, Nov. 2025.

Section

Research Article