LayerAct: Advanced activation mechanism utilizing layer-direction normalization for CNNs with BatchNorm

doi:10.48550/arXiv.2306.04940

In this work, we propose a novel activation mechanism for building layer-level activation (LayerAct) functions for CNNs with BatchNorm. These functions are designed to be more noise-robust than existing element-level activation functions by reducing the layer-level fluctuation of the activation outputs caused by shifts in the inputs. Moreover, LayerAct functions achieve this noise-robustness independently of the activation's saturation state, which limits the activation output space and complicates efficient training. We present an analysis and experiments demonstrating that LayerAct functions exhibit superior noise-robustness compared to element-level activation functions, and we show empirically that their mean activation is close to zero. Experiments on three clean and three out-of-distribution benchmark datasets for image classification show that LayerAct functions handle noisy datasets better than element-level activation functions, and that their performance on clean datasets is also superior in most cases.
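
The contrast between element-level and layer-level activation can be illustrated with a small sketch. This is not the paper's definition of LayerAct, which the abstract does not spell out. It assumes one plausible reading: the gate of a SiLU-style activation is computed from the layer-normalized input instead of the raw input. All function names (`element_gate`, `layer_gate`, `layer_level_silu`) and the `eps` constant are hypothetical, and a uniform shift is used as the simplest stand-in for input noise.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_normalize(xs, eps=1e-5):
    # Normalize across the layer dimension (all units of one sample).
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / math.sqrt(var + eps) for x in xs]


def element_gate(xs):
    # Element-level gate (as in SiLU): each gate depends only on its own input.
    return [sigmoid(x) for x in xs]


def layer_gate(xs):
    # ASSUMPTION: a layer-level gate computed from the layer-normalized input.
    # This is an illustrative reading, not the formula from the paper.
    return [sigmoid(n) for n in layer_normalize(xs)]


def silu(xs):
    return [x * g for x, g in zip(xs, element_gate(xs))]


def layer_level_silu(xs):
    # Hypothetical layer-level counterpart of SiLU under the assumption above.
    return [x * g for x, g in zip(xs, layer_gate(xs))]


def max_abs_diff(a, b):
    return max(abs(p - q) for p, q in zip(a, b))


# One sample's pre-activations, then the same values under a uniform shift.
layer_input = [-1.5, -0.2, 0.3, 2.0]
shifted = [x + 0.5 for x in layer_input]

element_change = max_abs_diff(element_gate(layer_input), element_gate(shifted))
layer_change = max_abs_diff(layer_gate(layer_input), layer_gate(shifted))

print("element-level gate change:", element_change)
print("layer-level gate change:  ", layer_change)
```

Because layer normalization subtracts the layer mean, a uniform shift leaves the layer-level gate unchanged up to floating-point error, while every element-level gate moves. Real noise is not uniform, so this only indicates the direction of the effect the abstract describes.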

Publication:

arXiv e-prints

Pub Date:

June 2023

DOI:

10.48550/arXiv.2306.04940

arXiv:

arXiv:2306.04940

Bibcode:

2023arXiv230604940Y

Keywords:

Computer Science - Machine Learning;
Computer Science - Computer Vision and Pattern Recognition;
Computer Science - Neural and Evolutionary Computing;
68T07 (Primary) 68T45 (Secondary)

E-Print:

10 pages, 3 figures, 3 tables except appendix
