Computer Science > Artificial Intelligence
Title: Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning & HPC Workloads
(Submitted on 12 Apr 2021 (v1), last revised 30 Nov 2021 (this version, v4))
Abstract: During the past decade, novel Deep Learning (DL) algorithms, workloads and hardware have been developed to tackle a wide range of problems. Despite the advances in workload and hardware ecosystems, the programming methodology of DL systems is stagnant. DL workloads either leverage highly optimized, yet platform-specific and inflexible, kernels from DL libraries or, in the case of novel operators, rely on reference implementations built via DL framework primitives with underwhelming performance. This work introduces the Tensor Processing Primitives (TPP), a programming abstraction striving for efficient, portable implementation of DL workloads with high productivity. TPPs define a compact, yet versatile set of 2D-tensor operators (or a virtual Tensor ISA), which can subsequently be used as building blocks to construct complex operators on high-dimensional tensors. The TPP specification is platform-agnostic, so code expressed via TPPs is portable, whereas the TPP implementation is highly optimized and platform-specific. We demonstrate the efficacy and viability of our approach using standalone kernels and end-to-end DL & HPC workloads expressed entirely via TPPs that outperform state-of-the-art implementations on multiple platforms.
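Illustrative sketch (not from the paper): the idea of composing a small set of 2D-tensor operators into an operator on a higher-dimensional tensor might look roughly like the following. The function names gemm_2d, add_bias_2d, relu_2d and batched_dense_relu are hypothetical, chosen only for this example, and do not reflect the actual TPP specification or any library API. Plain Python lists stand in for tensors.

```python
# Hypothetical 2D building blocks, named for illustration only.
# They are NOT the TPP API; a real backend would supply optimized,
# platform-specific versions behind the same platform-agnostic interface.

def gemm_2d(a, b):
    """Multiply an (m x k) matrix by a (k x n) matrix."""
    k = len(b)
    n = len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]

def add_bias_2d(x, bias):
    """Add a length-n bias vector to every row of an (m x n) matrix."""
    return [[v + bias[j] for j, v in enumerate(row)] for row in x]

def relu_2d(x):
    """Apply ReLU element-wise to a 2D matrix."""
    return [[max(v, 0.0) for v in row] for row in x]

def batched_dense_relu(inputs, weight, bias):
    """Dense layer + ReLU over a 3D tensor (batch x m x k).

    The higher-dimensional operator is only a loop over 2D slices,
    each processed by the 2D primitives above.
    """
    return [relu_2d(add_bias_2d(gemm_2d(slice_2d, weight), bias))
            for slice_2d in inputs]

# Small example: a batch of two 2x2 slices and an identity weight matrix.
inputs = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[-1.0, 0.0], [0.0, -1.0]],
]
weight = [[1.0, 0.0], [0.0, 1.0]]
bias = [0.5, -0.5]
result = batched_dense_relu(inputs, weight, bias)
print(result)
```

In this sketch only the three 2D functions would need a platform-specific implementation; the loop that lifts them to the 3D tensor stays unchanged, which is the portability argument the abstract makes.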
Submission history
From: Evangelos Georganas
[v1] Mon, 12 Apr 2021 18:35:49 GMT (442kb,D)
[v2] Wed, 14 Apr 2021 15:38:38 GMT (568kb,D)
[v3] Thu, 26 Aug 2021 17:27:06 GMT (455kb,D)
[v4] Tue, 30 Nov 2021 23:40:39 GMT (4003kb,D)