A Convolution Neural Network, Particle Swarm Optimization Hybrid Model for Scripting Language Handwritten Character Recognition
DOI:
https://doi.org/10.26438/ijcse/v13i8.3041
Keywords:
Data analysis, machine learning, deep learning
Abstract
Comprehensive, relevant datasets are essential for developing accurate and robust character recognition models. This research addresses the lack of datasets for Pashto handwritten character recognition by introducing a novel and extensive dataset that was previously unavailable in the field. The absence of such a dataset has been a significant obstacle to developing effective recognition models, which makes this dataset a valuable contribution to the research community. To optimize the recognition process, we propose a hybrid CNN-PSO (Convolutional Neural Network–Particle Swarm Optimization) approach: the CNN component extracts features, while PSO performs parameter optimization, with the aim of strengthening the model's capacity to recognize and classify handwritten Pashto characters accurately. To validate the approach, we compare the CNN-PSO framework with a standard CNN baseline. The results indicate that the hybrid design outperforms the CNN-only model on Pashto handwritten character recognition. These findings show the potential of hybrid models in character recognition tasks and underline the value of the Pashto dataset for advancing research in this domain. The study contributes to building more precise and effective recognition systems for Pashto, with possible applications in OCR technologies and language processing for Pashto-speaking regions.
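The abstract does not state which parameters PSO optimizes or how the swarm is configured, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes PSO searches over two hypothetical CNN hyperparameters (the base-10 logarithm of the learning rate and the dropout rate). The function `surrogate_validation_error` is a made-up stand-in for "train the CNN and return its validation error"; the swarm size, inertia weight, and acceleration coefficients are likewise illustrative assumptions.

```python
import random


def pso_minimize(objective, bounds, n_particles=12, n_iters=40,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` inside box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random starting positions inside the bounds, zero starting velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position; the swarm shares one global best.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia term + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it back into the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val


def surrogate_validation_error(params):
    """Hypothetical stand-in for training a CNN and measuring validation error.

    Assumed (not from the paper): error is lowest at learning rate 1e-3
    and dropout 0.3.
    """
    log10_lr, dropout = params
    return (log10_lr + 3.0) ** 2 + (dropout - 0.3) ** 2


# Assumed search ranges: log10(learning rate) in [-5, -1], dropout in [0, 0.6].
BOUNDS = [(-5.0, -1.0), (0.0, 0.6)]
best_params, best_error = pso_minimize(surrogate_validation_error, BOUNDS)
print("best hyperparameters:", best_params, "surrogate error:", best_error)
```

In a real pipeline the surrogate would be replaced by a function that builds and trains the CNN with the candidate hyperparameters and returns its validation error, which makes each particle evaluation far more expensive than in this toy setting.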
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
