On a Sparse Shortcut Topology of Artificial Neural Networks
Abstract
Over recent years, deep learning has become the mainstream data-driven approach to solving many important real-world problems. In successful network architectures, shortcut connections are well established to feed the outputs of earlier layers as additional inputs to later layers, and they have produced excellent results. Despite the extraordinary effectiveness of shortcuts, important questions remain about the underlying mechanism and associated functionalities. For example, why are shortcuts powerful? Why do shortcuts generalize well? To address these questions, we investigate the representation and generalization ability of a sparse shortcut topology. Specifically, we first demonstrate that this topology can empower a one-neuron-wide deep network to approximate any univariate continuous function. Then, we present a novel width-bounded universal approximator, in contrast to depth-bounded universal approximators, and extend the approximation result to a family of networks such that, from the viewpoint of approximation ability, these networks are equally competent. Furthermore, we use generalization bound theory to show that the investigated shortcut topology enjoys excellent generalizability. Finally, we corroborate our theoretical analyses with experiments on several well-known benchmarks.
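To make the architecture in the abstract concrete, the following is a minimal sketch of a forward pass through a one-neuron-wide deep network in which every hidden layer also receives the raw input through a shortcut connection. This is an illustrative assumption about the topology, not the paper's exact construction; the function name, parameters, and hand-picked weights are hypothetical.

```python
import numpy as np

def relu(x):
    """Rectified linear unit activation."""
    return np.maximum(0.0, x)

def one_neuron_shortcut_net(x, weights, biases, shortcut_weights):
    """Forward pass of a one-neuron-wide deep network.

    Each hidden layer holds a single neuron that combines the previous
    layer's activation with the raw input x, carried in through a
    shortcut connection (an illustrative sketch, not the paper's exact
    sparse topology).
    """
    h = x
    for w, b, s in zip(weights, biases, shortcut_weights):
        # one neuron per layer: mix the previous activation (w * h)
        # with the shortcut-connected input (s * x), then apply ReLU
        h = relu(w * h + s * x + b)
    return h

# usage example: a two-layer chain with hand-picked parameters
y = one_neuron_shortcut_net(
    0.5,
    weights=[1.0, -2.0],
    biases=[0.1, 0.3],
    shortcut_weights=[0.5, 1.0],
)
```

Without the shortcut terms, a width-one ReLU chain collapses to a monotone piecewise-linear map of its input; the shortcut re-injects `x` at every depth, which is what lets depth substitute for width in the approximation results the abstract describes.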
Publication: arXiv e-prints
Pub Date: November 2018
arXiv: arXiv:1811.09003
Bibcode: 2018arXiv181109003F
Keywords: Computer Science - Machine Learning; Statistics - Machine Learning