We propose and systematically evaluate three strategies for training dynamically-routed artificial neural networks: graphs of learned transformations through which different input signals may take different paths. Though some approaches have advantages over others, the resulting networks are often qualitatively similar. We find that, in dynamically-routed networks trained to classify images, layers and branches become specialized to process distinct categories of images. Additionally, given a fixed computational budget, dynamically-routed networks tend to perform better than comparable statically-routed networks.
- Pub Date:
- March 2017
- Statistics - Machine Learning;
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Machine Learning;
- Computer Science - Neural and Evolutionary Computing
- ICML 2017. Code at https://github.com/MasonMcGill/multipath-nn Video abstract at https://youtu.be/NHQsDaycwyQ