Compilation and Optimizations for Efficient Machine Learning on Embedded Systems
Abstract
Deep Neural Networks (DNNs) have achieved great success in a variety of machine learning (ML) applications, delivering high-quality inference solutions in computer vision, natural language processing, virtual reality, and other domains. However, DNN-based ML applications also bring significantly increased computational and storage requirements, which are particularly challenging for embedded systems with limited compute and storage resources, tight power budgets, and small form factors. Further challenges come from diverse application-specific requirements, including real-time response, high throughput, and reliable inference accuracy. To address these challenges, we introduce a series of effective design methodologies, including efficient ML model designs, customized hardware accelerator designs, and hardware/software co-design strategies, to enable efficient ML applications on embedded systems.
- Publication: arXiv e-prints
- Pub Date: June 2022
- DOI: 10.48550/arXiv.2206.03326
- arXiv: arXiv:2206.03326
- Bibcode: 2022arXiv220603326Z
- Keywords: Computer Science - Machine Learning; Computer Science - Hardware Architecture
- E-Print: This article will appear as a book chapter in a new book: Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing, Springer Nature