Text Line Extraction of Handwritten Kannada Documents Based on Bounding Box Technique

Authors

  • Chethana HT, Department of Information Science & Engineering, P.E.S Institute of Technology, VTU, Bangalore, 560085, India
  • Mamatha HR, Department of Information Science & Engineering, P.E.S Institute of Technology, VTU, Bangalore, 560085, India

Keywords:

Segmentation, Handwriting, Text lines, OCR, Bounding Box

Abstract

Optical Character Recognition (OCR) is the process of transforming printed or handwritten text into a form that a computer can understand and manipulate. Segmentation is an important task in any OCR system: it separates the lines, words and characters of a document image. The choice of segmentation algorithm can affect the accuracy of the OCR system. Segmentation of handwritten Kannada script poses challenges due to varying writing styles, skewed lines, overlapping lines, and inter- and intra-word gaps. In this paper we propose a method for segmenting handwritten Kannada documents based on bounding boxes and morphological operations. An average segmentation rate of 92% is obtained for text lines.
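
The abstract names the two ingredients (morphological operations and bounding boxes) but not the exact procedure, so the following is only a minimal sketch of one common way to combine them, not the authors' implementation. It assumes a binarized page stored as a list of rows with 1 for ink and 0 for background, smears the ink horizontally with a dilation so that the characters of a line merge, and then reports the bounding box of each connected component. The function names and the half_width parameter are hypothetical.

```python
from collections import deque


def dilate_horizontal(img, half_width):
    """Binary dilation with a 1 x (2*half_width + 1) horizontal element.

    Assumption: img is a non-empty list of equal-length rows of 0/1.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                lo = max(0, x - half_width)
                hi = min(w, x + half_width + 1)
                for nx in range(lo, hi):
                    out[y][nx] = 1
    return out


def bounding_boxes(img):
    """Return (top, left, bottom, right), inclusive, per 8-connected component."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:  # breadth-first flood fill of one component
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes)  # top-to-bottom order


def extract_text_lines(img, half_width=3):
    """Hypothetical pipeline: smear ink sideways, then box each blob.

    The boxes describe the smeared image, so they include the
    horizontal padding added by the dilation.
    """
    return bounding_boxes(dilate_horizontal(img, half_width))
```

A sketch like this only separates lines that have at least one empty pixel row between them after smearing. The skewed and overlapping lines mentioned in the abstract would merge into a single box, which is why practical methods need additional handling for those cases.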

Published

2015-05-30

How to Cite

[1]
H. Chethana and H. Mamatha, “Text Line Extraction of Handwritten Kannada Documents Based on Bounding Box Technique”, Int. J. Comp. Sci. Eng., vol. 3, no. 5, pp. 297–303, May 2015.

Section

Research Article