Denoising Dirty Document using Autoencoder

Authors

  • Imran M, Computer Science and Engineering, Neil Gogte Institute of Technology (NGIT), Affiliated to Osmania University, Survey No-35, Peerzadiguda Road, Kachawanisingaram, Uppal, Hyderabad, India
  • T Sita Mahalakshmi, Department of Computer Science and Engineering, GITAM Institute of Technology, Andhra Pradesh, India
  • MD Venkata Prasad, Research Scholar (Regd No: 1260316406), Dept. of Computer Science and Engineering, GITAM Deemed to be University, Visakhapatnam, Andhra Pradesh, India
  • Kumar Kopparty, Research Scholar (Regd No: 41900148), Dept. of Computer Science and Engineering, LPU (Lovely Professional University), Jalandhar - Delhi G.T. Road, Phagwara, Punjab, India

DOI:

https://doi.org/10.26438/ijcse/v7i10.2126

Keywords:

document denoising, deep autoencoder, supervised learning, deep learning, classification, cleaned and noisy images

Abstract

An autoencoder is an unsupervised machine learning algorithm [12] that applies backpropagation with the target values set equal to the inputs. Deep autoencoders compress their inputs into a smaller representation, from which the original data can be reconstructed when needed. In a denoising autoencoder, the input seen by the network is not the raw input but a stochastically corrupted version of it, so the network is trained to reconstruct the original document from its noisy version. In our implementation, we trained a deep autoencoder on noisy and cleaned document images and obtained a model that removes noise and other unwanted interference from documents. Document denoising can thus be achieved with a deep learning model that automatically learns the discriminative features needed to classify input images.
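The training setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single hidden layer instead of a deep network, plain NumPy with full-batch gradient descent, and synthetic 8x8 patches in place of scanned documents. All names (`make_clean_patches`, `corrupt`, `train`) and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def make_clean_patches(n, size=8):
    """Hypothetical stand-in for cleaned document patches:
    white background (1.0) with one dark horizontal stroke (0.0)."""
    patches = np.ones((n, size, size))
    rows = rng.integers(0, size, n)
    patches[np.arange(n), rows, :] = 0.0
    return patches.reshape(n, size * size)


def corrupt(clean, noise_std=0.3):
    """Stochastic corruption: additive Gaussian noise, clipped to [0, 1]."""
    return np.clip(clean + rng.normal(0.0, noise_std, clean.shape), 0.0, 1.0)


def init_params(n_in, n_hidden):
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, n_in)), "b2": np.zeros(n_in),
    }


def forward(params, x):
    h = sigmoid(x @ params["W1"] + params["b1"])  # encoder: compressed code
    y = sigmoid(h @ params["W2"] + params["b2"])  # decoder: reconstruction
    return h, y


def train(noisy, clean, n_hidden=16, epochs=300, lr=0.5):
    params = init_params(noisy.shape[1], n_hidden)
    losses = []
    n = noisy.shape[0]
    for _ in range(epochs):
        h, y = forward(params, noisy)
        # Reported loss: mean squared error against the CLEAN target.
        losses.append(float(np.mean((y - clean) ** 2)))
        # Backpropagation of 0.5 * squared error (summed over pixels,
        # averaged over samples); the target is the clean image.
        dz2 = (y - clean) * y * (1.0 - y) / n
        dz1 = (dz2 @ params["W2"].T) * h * (1.0 - h)
        grads = {"W1": noisy.T @ dz1, "b1": dz1.sum(axis=0),
                 "W2": h.T @ dz2, "b2": dz2.sum(axis=0)}
        for name in params:
            params[name] -= lr * grads[name]
    return params, losses


clean = make_clean_patches(256)
noisy = corrupt(clean)
params, losses = train(noisy, clean)
_, denoised = forward(params, noisy)
print(f"reconstruction MSE: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The point of the sketch is the pairing of inputs and targets: the network receives the corrupted patch but is scored against the clean one, which is what separates a denoising autoencoder from a plain autoencoder whose target equals its input. A practical document model would more likely use convolutional layers and a framework such as Keras [11] or PyTorch [15].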

References

[1] Xie, J., Xu, L., Chen, E.: Image denoising and inpainting with deep neural networks. In: NIPS. (2012)

[2] J. Portilla, V. Strela, M.J. Wainwright, and E.P. Simoncelli. Image denoising using scale mixtures of Gaussians in the wavelet domain. Image Processing, IEEE Transactions on, 12(11):1338-1351, 2003.

[3] F. Luisier, T. Blu, and M. Unser. A new SURE approach to image denoising: Interscale orthonormal wavelet thresholding. IEEE Transactions on Image Processing, 16(3):593-606, 2007.

[4] K. Matsumoto et al., “Learning classifier system with deep autoencoder,” 2016 IEEE Congress on Evolutionary Computation (CEC), Vancouver, BC, 2016, pp. 4739-4746.

[5] A. Krizhevsky, I. Sutskever and G. Hinton, “ImageNet classification with deep convolutional neural networks”, Communications of the ACM, vol. 60, no. 6, pp. 84-90, 2017.

[6] Semeion Research Center of Sciences of Communication, via Sersale 117, 00128 Rome, Italy Tattile Via Gaetano Donizetti, 1-3-5, 25030 Mairano (Brescia), Italy.

[7] L. Deng, “The MNIST Database of Handwritten Digit Images for Machine Learning Research [Best of the Web],” in IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 141-142, Nov. 2012.

[8] J. Schmidhuber, “Deep learning in neural networks: An overview”, Neural Networks, vol. 61, pp. 85-117, 2015.

[9] “All About Autoencoders”, Pythonmachinelearning.pro, 2018.

[10] “Image recovery Theory and application”, Automatica, vol. 24, no. 5, pp. 726-727, 1988.

[11] “Building Autoencoders in Keras”, Blog.keras.io, 2018.

[12] M. Celebi and K. Aydin, Unsupervised learning algorithms.

[13] A. Krizhevsky, I. Sutskever and G. Hinton, “ImageNet classification with deep convolutional neural networks”, Communications of the ACM, vol. 60, no. 6, pp. 84-90, 2017.

[14] V. Nair and G. E. Hinton. Rectified linear units improve restricted Boltzmann machines. In ICML, 2010.

[15] “PyTorch”, Pytorch.org, 2018.

[16] K. He, X. Zhang, S. Ren and J. Sun, “Deep Residual Learning for Image Recognition,” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016, pp. 770-778. DOI: 10.1109/CVPR.2016.90

[17] T. D. Gedeon and D. Harris, “Progressive image compression,” [Proceedings 1992] IJCNN International Joint Conference on Neural Networks, Baltimore, MD, 1992, pp. 403-407 vol. 4.

[18] L. Bottou. Large-scale machine learning with stochastic gradient descent. COMPSTAT, 2010.

[19] Diederik Kingma and Jimmy Ba. Adam: A method for stochastic optimization. ICLR, 2015.

[20] A. V. Lugt, “Signal detection by complex spatial filtering,” in IEEE Transactions on Information Theory, vol. 10, no. 2, pp. 139-145, Apr 1964.

[21] E. Kaur and N. Singh, “Image Denoising Techniques: A Review”, Rroij.com, 2018.

Published

2019-10-31

How to Cite

[1]
M. Imran, S. M. T, V. P. MD, and K. K. V, “Denoising Dirty Document using Autoencoder”, Int. J. Comp. Sci. Eng., vol. 7, no. 10, pp. 21–26, Oct. 2019.

Section

Research Article