TanhSoft -- a family of activation functions combining Tanh and Softplus
Abstract
Deep learning, at its core, consists of functions that are compositions of a linear transformation with a non-linear function known as an activation function. In the past few years, there has been increasing interest in constructing novel activation functions that result in better learning. In this work, we propose a family of novel activation functions, namely TanhSoft, with four undetermined hyper-parameters, of the form tanh(αx + βe^(γx)) ln(δ + e^x), and tune these hyper-parameters to obtain activation functions that are shown to outperform several well-known activation functions. For instance, replacing ReLU with x tanh(0.6e^x) improves top-1 classification accuracy on CIFAR-10 by 0.46% for DenseNet-169 and 0.7% for Inception-v3, while with tanh(0.87x) ln(1 + e^x) top-1 classification accuracy on CIFAR-100 improves by 1.24% for DenseNet-169 and 2.57% for the SimpleNet model.
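As a concrete illustration, the family is straightforward to implement. Below is a minimal sketch as a PyTorch module; the class name, default hyper-parameter values, and the numerical-stability shortcuts are my own assumptions, not the paper's reference code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TanhSoft(nn.Module):
    """TanhSoft family: f(x) = tanh(alpha*x + beta*e^(gamma*x)) * ln(delta + e^x).

    The two variants named in the abstract are special cases:
      alpha=0,    beta=0.6, gamma=1, delta=0  ->  x * tanh(0.6*e^x)   (since ln(e^x) = x)
      alpha=0.87, beta=0,            delta=1  ->  tanh(0.87x) * ln(1 + e^x)
    """

    def __init__(self, alpha: float = 0.87, beta: float = 0.0,
                 gamma: float = 1.0, delta: float = 1.0) -> None:
        super().__init__()
        self.alpha, self.beta = alpha, beta
        self.gamma, self.delta = gamma, delta

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        inner = torch.tanh(self.alpha * x + self.beta * torch.exp(self.gamma * x))
        if self.delta == 0.0:
            outer = x              # ln(0 + e^x) = x, computed without exp overflow
        elif self.delta == 1.0:
            outer = F.softplus(x)  # numerically stable ln(1 + e^x)
        else:
            outer = torch.log(self.delta + torch.exp(x))
        return inner * outer

# Drop-in replacement for ReLU, e.g.:
# layer = nn.Sequential(nn.Linear(128, 64), TanhSoft(alpha=0.87, beta=0.0, delta=1.0))
```

The δ = 1 branch reuses the built-in softplus because ln(1 + e^x) overflows if computed naively for large x; the other branches follow the formula directly.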
- Publication: arXiv e-prints
- Pub Date: September 2020
- DOI: 10.48550/arXiv.2009.03863
- arXiv: arXiv:2009.03863
- Bibcode: 2020arXiv200903863B
- Keywords: Computer Science - Neural and Evolutionary Computing; Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning
- E-Print: 11 pages