Porcupine Neural Networks: (Almost) All Local Optima are Global
Abstract
Neural networks have been used prominently in several machine learning and statistics applications. In general, the underlying optimization of neural networks is non-convex which makes their performance analysis challenging. In this paper, we take a novel approach to this problem by asking whether one can constrain neural network weights to make its optimization landscape have good theoretical properties while at the same time, be a good approximation for the unconstrained one. For two-layer neural networks, we provide affirmative answers to these questions by introducing Porcupine Neural Networks (PNNs) whose weight vectors are constrained to lie over a finite set of lines. We show that most local optima of PNN optimizations are global while we have a characterization of regions where bad local optimizers may exist. Moreover, our theoretical and empirical results suggest that an unconstrained neural network can be approximated using a polynomially-large PNN.
- Publication:
-
arXiv e-prints
- Pub Date:
- October 2017
- DOI:
- 10.48550/arXiv.1710.02196
- arXiv:
- arXiv:1710.02196
- Bibcode:
- 2017arXiv171002196F
- Keywords:
-
- Statistics - Machine Learning;
- Computer Science - Machine Learning