Comparison of Various Activation Functions: A Deep Learning Approach
DOI: https://doi.org/10.26438/ijcse/v6i3.122126
Keywords: CNN (Convolutional Neural Network), activation functions, MNIST (Modified National Institute of Standards and Technology) dataset
Abstract
Deep learning is a branch of machine learning that models high-level abstractions in data using algorithms built from multiple processing layers with complex structures and nonlinear transformations. In this paper, we present the results of testing neural network architectures in TensorFlow with various activation functions. We demonstrate, on the MNIST database of handwritten digits in single-threaded mode, that blind selection of the activation function can greatly increase runtime without a significant gain in accuracy. We train a Convolutional Neural Network on MNIST with several activation functions and report, for each, the evolution of the loss during training and the final prediction accuracy. These results inform optimization and training-loss-reduction strategies for image recognition problems and support practical conclusions about the choice of activation function.
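The paper's own code is not reproduced on this page. As an illustrative sketch of the kind of experiment the abstract describes, the TensorFlow/Keras snippet below trains one small CNN per activation function on MNIST and prints the final training loss and test accuracy; the architecture, the list of activation functions, and the training hyperparameters are assumptions chosen for brevity, not the authors' settings.

import tensorflow as tf

# Load MNIST and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype("float32") / 255.0
x_test = x_test[..., None].astype("float32") / 255.0

def build_cnn(activation):
    # Small placeholder CNN; the paper's exact architecture is not given here.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(32, 3, activation=activation),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation=activation),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation=activation),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

# Train and evaluate the same network once per candidate activation function.
for act in ["relu", "sigmoid", "tanh", "elu"]:
    model = build_cnn(act)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(x_train, y_train, epochs=3, batch_size=128, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"{act}: final training loss={history.history['loss'][-1]:.4f}, "
          f"test accuracy={acc:.4f}")

Because only the activation argument changes between runs, any difference in the printed loss curves and accuracies can be attributed to the activation function rather than to the architecture or optimizer.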
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
