Performance Implications of Multi-Chiplet Neural Processing Units on Autonomous Driving Perception

We study the application of emerging chiplet-based Neural Processing Units to accelerate vehicular AI perception workloads in constrained automotive settings. The motivation is twofold: chiplet technology is becoming integral to emerging vehicular architectures, providing a cost-effective trade-off between performance, modularity, and customization; and perception models are the most computationally demanding workloads in an autonomous driving system. Using the Tesla Autopilot perception pipeline as a case study, we first break down its constituent models and profile their performance on different chiplet accelerators. From these insights, we propose a novel scheduling strategy to efficiently deploy perception workloads on multi-chip AI accelerators. Our experiments using a standard DNN performance simulator, MAESTRO, show that our approach achieves 82% and 2.8x increases in throughput and processing-engine utilization, respectively, compared to monolithic accelerator designs.

Publication:

arXiv e-prints

Pub Date:

November 2024

arXiv:

arXiv:2411.16007

Bibcode:

2024arXiv241116007O

Keywords:

Computer Science - Hardware Architecture;
Computer Science - Artificial Intelligence;
Computer Science - Distributed, Parallel, and Cluster Computing;
Computer Science - Performance

E-Print:

DATE'2025

NASA/ADS
