HPC AI500: Representative, Repeatable and Simple HPC AI Benchmarking
Abstract
Recent years have witnessed a trend of applying large-scale distributed deep learning algorithms (HPC AI) in both business and scientific computing, with the goal of reducing the training time needed to reach state-of-the-art quality. HPC AI benchmarks accelerate this process. Unfortunately, benchmarking HPC AI systems at scale raises serious challenges. This paper presents a representative, repeatable and simple HPC AI benchmarking methodology. From the seventeen AI workloads of AIBench Training, by far the most comprehensive AI training benchmark suite, we choose two representative and repeatable workloads. The selected HPC AI benchmarks cover both business and scientific computing: Image Classification and Extreme Weather Analytics. To rank HPC AI systems, we present a new metric named Valid FLOPS, which emphasizes both throughput performance and a target quality. The specification, source code, datasets, and HPC AI500 ranking numbers are publicly available from https://www.benchcouncil.org/HPCAI500/.
- Publication: arXiv e-prints
- Pub Date: February 2021
- DOI: 10.48550/arXiv.2102.12848
- arXiv: arXiv:2102.12848
- Bibcode: 2021arXiv210212848J
- Keywords: Computer Science - Performance
- E-Print: arXiv admin note: substantial text overlap with arXiv:2007.00279