Detection of Text Using Connected Component Clustering and Nontext Filtering

Authors

  • S. Elakkiya, Dept. of CSE, Periyar Maniammai University, Thanjavur
  • T. Kavitha, Dept. of CSE, Periyar Maniammai University, Thanjavur

Keywords

Connected Component, Clustering, Extraction, Filtering

Abstract

Several methods have been developed to detect and extract text accurately in natural scenes, including multi-oriented text, and most of them rely on a classifier to improve detection accuracy. This paper uses two machine learning classifiers: one generates candidate regions and the other filters out nontext regions. Connected components (CCs) are first extracted from the image with the maximally stable extremal region (MSER) algorithm. An AdaBoost classifier, trained to determine adjacency relationships, then clusters the CCs using their pairwise relations, and the resulting clusters form the candidate regions. Because the scale, skew, and color of each candidate can be estimated from its CCs, each candidate can be normalized, and a text/nontext classifier is developed for the normalized images. This classifier is based on multilayer perceptrons, and a single free parameter controls the trade-off between recall and precision. Finally, the approach can be extended to exploit multichannel information, and the method yields state-of-the-art performance in both speed and accuracy.
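
The clustering and filtering steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Component` structure, the hand-written `looks_adjacent` rule (a stand-in for the trained AdaBoost pairwise classifier), and the function names are all assumptions made for illustration. The sketch merges components that are pairwise adjacent using union-find, then keeps only the candidates whose text score reaches a single threshold.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class Component:
    """Bounding box of one connected component (hypothetical structure)."""
    x: float
    y: float
    w: float
    h: float


def looks_adjacent(a: Component, b: Component) -> bool:
    """Hand-written stand-in for a trained pairwise adjacency classifier.

    Assumption: two components are adjacent when their heights are similar,
    they overlap vertically, and the horizontal gap between them is small.
    """
    similar_height = max(a.h, b.h) <= 2.0 * min(a.h, b.h)
    vertical_overlap = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    gap = max(a.x, b.x) - min(a.x + a.w, b.x + b.w)
    return similar_height and vertical_overlap > 0 and gap <= max(a.h, b.h)


def cluster_components(
    components: Sequence[Component],
    is_adjacent: Callable[[Component, Component], bool],
) -> List[List[int]]:
    """Group component indices so that adjacent pairs share a cluster."""
    parent = list(range(len(components)))

    def find(i: int) -> int:
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Test every pair once and merge the two clusters when they are adjacent.
    for i, j in combinations(range(len(components)), 2):
        if is_adjacent(components[i], components[j]):
            parent[find(i)] = find(j)

    clusters: Dict[int, List[int]] = {}
    for i in range(len(components)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def filter_candidates(scores: Sequence[float], threshold: float) -> List[int]:
    """Keep the indices of candidates whose text score reaches the threshold."""
    return [i for i, score in enumerate(scores) if score >= threshold]


# Three character-like boxes on one line, plus one box far away.
boxes = [
    Component(0, 0, 8, 10),
    Component(10, 0, 8, 10),
    Component(20, 0, 8, 10),
    Component(200, 100, 8, 10),
]
clusters = cluster_components(boxes, looks_adjacent)
kept = filter_candidates([0.9, 0.4, 0.6], threshold=0.5)
```

In this sketch the first and third boxes are not directly adjacent, but both are adjacent to the second, so all three end up in one candidate region. Raising `threshold` keeps fewer candidates, which favors precision over recall; lowering it does the opposite.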

References

K. Jung, “Text information extraction in images and video: A survey,” Pattern Recognit., vol. 37, no. 5, pp. 977–997, May 2004.

S. Lucas, A. Panaretos, L. Sosa, A. Tang, S. Wong, and R. Young, “ICDAR 2003 robust reading competitions,” in Proc. Int. Conf. Document Anal. Recognit., 2003, pp. 682–687.

S. Lucas, “ICDAR 2005 text locating competition results,” in Proc. Int. Conf. Document Anal. Recognit., 2005, pp. 80–84.

A. Shahab, F. Shafait, and A. Dengel, “ICDAR 2011 robust reading competition challenge 2: Reading text in scene images,” in Proc. Int. Conf. Document Anal. Recognit., 2011, pp. 1491–1496.

H. I. Koo and D. H. Kim, “Scene text detection via connected component clustering and nontext filtering,” IEEE Trans. Image Process., vol. 22, no. 6, pp. 2296–2305, Jun. 2013.

X. Chen and A. Yuille, “Detecting and reading text in natural scenes,” in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., 2004, pp. 366–373.

B. Epshtein, E. Ofek, and Y. Wexler, “Detecting text in natural scenes with stroke width transform,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2010, pp. 2963–2970.

H. Chen, S. Tsai, G. Schroth, D. Chen, R. Grzeszczuk, and B. Girod, “Robust text detection in natural images with edge-enhanced maximally stable extremal regions,” in Proc. IEEE Int. Conf. Image Process., Sep. 2011, pp. 2609–2612.

X. Chen and A. Yuille, “A time-efficient cascade for real-time object detection: With applications for the visually impaired,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Workshops, Jun. 2005, pp. 1–8.

J. Friedman, T. Hastie, and R. Tibshirani, “Additive logistic regression: A statistical view of boosting,” Ann. Stat., vol. 28, no. 2, pp. 337–407, 2000.

Published

2015-04-30

How to Cite

[1]
S. Elakkiya and T. Kavitha, “Detection of Text Using Connected Component Clustering and Nontext Filtering”, Int. J. Comp. Sci. Eng., vol. 3, no. 4, pp. 53–57, Apr. 2015.

Section

Research Article