Text Line Extraction of Handwritten Kannada Documents Based on Bounding Box Technique
Keywords:
Segmentation, Handwriting, Text lines, OCR, Bounding Box
Abstract
Optical Character Recognition (OCR) is the process of transforming printed or handwritten text into a form that a computer can understand and manipulate. An important task of any OCR system is segmentation, which separates characters, words and lines from document images. The choice of segmentation algorithm can affect the accuracy of the OCR system. Segmentation of handwritten Kannada script poses challenges due to varying writing styles, skewed lines, overlapping lines, and inter- and intra-word gaps. In this paper we propose a method for segmentation of handwritten Kannada documents based on bounding boxes and morphological operations. An average segmentation rate of 92% for lines is obtained.
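The abstract names the two ingredients of the method, morphological operations and bounding boxes, but not how they are combined. The following is a minimal sketch of one common way to combine them, not the authors' implementation: a horizontal dilation merges the characters of a text line into one blob, and the bounding box of the ink under each blob is reported as a line. The function names, the `half_width` parameter and the toy `page` image are illustrative assumptions, and skew or overlapping lines are not handled.

```python
from collections import deque


def dilate_horizontal(img, half_width):
    """Binary dilation with a 1 x (2*half_width + 1) horizontal element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if img[r][c]:
                for k in range(max(0, c - half_width), min(w, c + half_width + 1)):
                    out[r][k] = 1
    return out


def line_boxes(ink, half_width=2):
    """Return (top, left, bottom, right) boxes, inclusive, one per text line.

    Assumption: a line is an 8-connected component of the dilated image;
    the box is tightened to the original ink pixels inside that component.
    """
    h, w = len(ink), len(ink[0])
    mask = dilate_horizontal(ink, half_width)
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            top, left, bottom, right = h, w, -1, -1
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                if ink[y][x]:  # only original ink contributes to the box
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes)


# Toy binary image (1 = ink): two "lines" made of separated strokes.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

print(line_boxes(page, half_width=2))
```

With `half_width=0` no strokes are merged and each stroke gets its own box, which illustrates why the width of the structuring element would need to be tuned to the inter-word gaps mentioned in the abstract.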
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
