CheapET-3: Cost-Efficient Use of Remote DNN Models
Abstract
On complex problems, state of the art prediction accuracy of Deep Neural Networks (DNN) can be achieved using very large-scale models, consisting of billions of parameters. Such models can only be run on dedicated servers, typically provided by a 3rd party service, which leads to a substantial monetary cost for every prediction. We propose a new software architecture for client-side applications, where a small local DNN is used alongside a remote large-scale model, aiming to make easy predictions locally at negligible monetary cost, while still leveraging the benefits of a large model for challenging inputs. In a proof of concept we reduce prediction cost by up to 50% without negatively impacting system accuracy.
- Publication:
-
arXiv e-prints
- Pub Date:
- August 2022
- DOI:
- 10.48550/arXiv.2208.11552
- arXiv:
- arXiv:2208.11552
- Bibcode:
- 2022arXiv220811552W
- Keywords:
-
- Computer Science - Software Engineering;
- Computer Science - Artificial Intelligence;
- Computer Science - Distributed;
- Parallel;
- and Cluster Computing
- E-Print:
- Research Abstract. Contact me for a pre-print of the full paper (currently not yet published)