Well-known activation functions like ReLU or Leaky ReLU are non-differentiable at the origin. Over the years, many smooth approximations of ReLU have been proposed using various smoothing techniques. We propose new smooth approximations of a non-differentiable activation function by convolving it with approximate identities. In particular, we present smooth approximations of Leaky ReLU, which we call the Smooth Activation Unit (SAU), and show that they outperform several well-known activation functions on various datasets and models. Replacing ReLU with SAU, we obtain a 5.12% improvement with the ShuffleNet V2 (2.0x) model on the CIFAR100 dataset.
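To make the construction concrete, below is a minimal PyTorch sketch of one such smoothing, assuming a Gaussian kernel as the approximate identity; the slope `alpha` and kernel width `sigma` are illustrative parameters, and this is not necessarily the exact SAU formulation used in the paper.

```python
import torch
import torch.nn as nn


class SmoothLeakyReLU(nn.Module):
    """Illustrative smooth approximation of Leaky ReLU, obtained by
    convolving it with a Gaussian approximate identity (a sketch under
    assumed parameters, not the paper's exact SAU definition)."""

    def __init__(self, alpha: float = 0.01, sigma: float = 1.0):
        super().__init__()
        self.alpha = alpha  # negative-side slope of Leaky ReLU
        self.sigma = sigma  # width of the Gaussian kernel

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Leaky ReLU can be written as alpha*x + (1 - alpha)*ReLU(x).
        # Convolving ReLU with a zero-mean Gaussian of width sigma gives
        # x*Phi(x/sigma) + sigma*phi(x/sigma), with Phi/phi the standard
        # normal CDF/PDF, so only the ReLU part needs smoothing.
        z = x / self.sigma
        cdf = 0.5 * (1.0 + torch.erf(z / 2 ** 0.5))
        pdf = torch.exp(-0.5 * z * z) / (2 * torch.pi) ** 0.5
        smooth_relu = x * cdf + self.sigma * pdf
        return self.alpha * x + (1.0 - self.alpha) * smooth_relu
```

As `sigma` shrinks, the Gaussian kernel approaches a Dirac delta and the smoothed function converges back to Leaky ReLU, which is the sense in which the kernel acts as an approximate identity.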
- Pub Date: September 2021
- Computer Science - Machine Learning
- Computer Science - Artificial Intelligence
- Computer Science - Computer Vision and Pattern Recognition
- Computer Science - Neural and Evolutionary Computing
- arXiv admin note: text overlap with arXiv:2109.04386