SPINN: Synergistic Progressive Inference of Neural Networks over Device and Cloud
Abstract
Despite the soaring use of convolutional neural networks (CNNs) in mobile applications, uniformly sustaining high-performance inference on mobile has been elusive due to the excessive computational demands of modern CNNs and the increasing diversity of deployed devices. A popular alternative comprises offloading CNN processing to powerful cloud-based servers. Nevertheless, by relying on the cloud to produce outputs, emerging mission-critical and high-mobility applications, such as drone obstacle avoidance or interactive applications, can suffer from the dynamic connectivity conditions and the uncertain availability of the cloud. In this paper, we propose SPINN, a distributed inference system that employs synergistic device-cloud computation together with a progressive inference method to deliver fast and robust CNN inference across diverse settings. The proposed system introduces a novel scheduler that co-optimises the early-exit policy and the CNN splitting at run time, in order to adapt to dynamic conditions and meet user-defined service-level requirements. Quantitative evaluation illustrates that SPINN outperforms its state-of-the-art collaborative inference counterparts by up to 2x in achieved throughput under varying network conditions, reduces the server cost by up to 6.8x and improves accuracy by 20.7% under latency constraints, while providing robust operation under uncertain connectivity conditions and significant energy savings compared to cloud-centric execution.
- Publication:
-
arXiv e-prints
- Pub Date:
- August 2020
- DOI:
- 10.48550/arXiv.2008.06402
- arXiv:
- arXiv:2008.06402
- Bibcode:
- 2020arXiv200806402L
- Keywords:
-
- Computer Science - Machine Learning;
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Distributed;
- Parallel;
- and Cluster Computing;
- Statistics - Machine Learning
- E-Print:
- Accepted at the 26th Annual International Conference on Mobile Computing and Networking (MobiCom), 2020