Octopus: A Heterogeneous In-network Computing Accelerator Enabling Deep Learning for network
Abstract
Deep learning (DL) for network models have achieved excellent performance in the field and are becoming a promising component in future intelligent network system. Programmable in-network computing device has great potential to deploy DL for network models, however, existing device cannot afford to run a DL model. The main challenges of data-plane supporting DL-based network models lie in computing power, task granularity, model generality and feature extracting. To address above problems, we propose Octopus: a heterogeneous in-network computing accelerator enabling DL for network models. A feature extractor is designed for fast and efficient feature extracting. Vector accelerator and systolic array work in a heterogeneous collaborative way, offering low-latency-highthroughput general computing ability for packet-and-flow-based tasks. Octopus also contains on-chip memory fabric for storage and connecting, and Risc-V core for global controlling. The proposed Octopus accelerator design is implemented on FPGA. Functionality and performance of Octopus are validated in several use-cases, achieving performance of 31Mpkt/s feature extracting, 207ns packet-based computing latency, and 90kflow/s flow-based computing throughput.
- Publication:
-
arXiv e-prints
- Pub Date:
- August 2023
- DOI:
- 10.48550/arXiv.2308.11312
- arXiv:
- arXiv:2308.11312
- Bibcode:
- 2023arXiv230811312W
- Keywords:
-
- Computer Science - Hardware Architecture;
- Computer Science - Networking and Internet Architecture