Parallelizing Workload Execution in Embedded and High-Performance Heterogeneous Systems
Abstract
In this paper, we introduce a software-defined framework that enables the parallel utilization of all the programmable processing resources available in heterogeneous system-on-chip (SoC) including FPGA-based hardware accelerators and programmable CPUs. Two platforms with different architectures are considered, and a single C/C++ source code is used in both of them for the CPU and FPGA resources. Instead of simply using the hardware accelerator to offload a task from the CPU, we propose a scheduler that dynamically distributes the tasks among all the resources to fully exploit all computing devices while minimizing load unbalance. The multi-architecture study compares an ARMV7 and ARMV8 implementation with different number and type of CPU cores and also different FPGA micro-architecture and size. We measure that both platforms benefit from having the CPU cores assist FPGA execution at the same level of energy requirements.
- Publication:
-
arXiv e-prints
- Pub Date:
- February 2018
- DOI:
- 10.48550/arXiv.1802.03316
- arXiv:
- arXiv:1802.03316
- Bibcode:
- 2018arXiv180203316N
- Keywords:
-
- Computer Science - Distributed;
- Parallel;
- and Cluster Computing
- E-Print:
- Presented at HIP3ES, 2018