HPC AI500: Representative, Repeatable and Simple HPC AI Benchmarking

Jiang, Zihan; Gao, Wanling; Tang, Fei; Xiong, Xingwang; Wang, Lei; Lan, Chuanxin; Luo, Chunjie; Li, Hongxiao; Zhan, Jianfeng

(Submitted on 25 Feb 2021)

Abstract: Recent years have witnessed a trend of applying large-scale distributed deep learning algorithms (HPC AI) in both business and scientific computing, with the goal of reducing the training time needed to reach state-of-the-art quality. HPC AI benchmarks accelerate this process. Unfortunately, benchmarking HPC AI systems at scale raises serious challenges. This paper presents a representative, repeatable and simple HPC AI benchmarking methodology. Among the seventeen AI workloads of AIBench Training -- by far the most comprehensive AI training benchmark suite -- we choose two representative and repeatable AI workloads. The selected HPC AI benchmarks cover both business and scientific computing: Image Classification and Extreme Weather Analytics. To rank HPC AI systems, we present a new metric named Valid FLOPS, which emphasizes both throughput performance and a target quality. The specification, source code, datasets, and HPC AI500 ranking numbers are publicly available from this https URL.

Comments:	arXiv admin note: substantial text overlap with arXiv:2007.00279
Subjects:	Performance (cs.PF)
Cite as:	arXiv:2102.12848 [cs.PF]
	(or arXiv:2102.12848v1 [cs.PF] for this version)

Submission history

From: Zihan Jiang
[v1] Thu, 25 Feb 2021 13:40:17 GMT (594kb,D)
