Title: Computational Performance Predictions for Deep Neural Network Training: A Runtime-Based Approach

Abstract: Deep learning researchers and practitioners usually leverage GPUs to help train their deep neural networks (DNNs) faster. However, choosing which GPU to use is challenging both because (i) there are many options, and (ii) users grapple with competing concerns: maximizing compute performance while minimizing costs. In this work, we present a new practical technique to help users make informed and cost-efficient GPU selections: make performance predictions with the help of a GPU that the user already has. Our technique exploits the observation that, because DNN training consists of repetitive compute steps, predicting the execution time of a single iteration is usually enough to characterize the performance of an entire training process. We make predictions by scaling the execution time of each operation in a training iteration from one GPU to another using either (i) wave scaling, a technique based on a GPU's execution model, or (ii) pre-trained multilayer perceptrons. We implement our technique in a Python library called Surfer and find that it makes accurate iteration execution time predictions on ResNet-50, Inception v3, the Transformer, GNMT, and DCGAN across six different GPU architectures. Surfer currently supports PyTorch, is easy to use, and requires only a few lines of code.
Comments: 17 pages, 7 figures
Subjects: Machine Learning (cs.LG); Performance (cs.PF)
Cite as: arXiv:2102.00527 [cs.LG]
  (or arXiv:2102.00527v1 [cs.LG] for this version)
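
The abstract's core idea (measure each operation of one training iteration on a GPU you already have, scale each measurement to a target GPU, then sum) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Gpu` class, the function names, the GPU names, and all numbers are hypothetical, and none of it is Surfer's API. In particular, the scaling rule below is a crude memory-bandwidth ratio used only as a placeholder; it is not the paper's wave scaling and not its pre-trained multilayer perceptrons.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Gpu:
    """Hypothetical GPU description; the field is illustrative only."""
    name: str
    mem_bandwidth_gbps: float


def bandwidth_ratio_scale(time_ms: float, origin: Gpu, dest: Gpu) -> float:
    """Placeholder scaling rule (an assumption, NOT wave scaling):
    assume an operation's time shrinks in proportion to memory bandwidth."""
    return time_ms * origin.mem_bandwidth_gbps / dest.mem_bandwidth_gbps


def predict_iteration_ms(
    op_times_ms: Dict[str, float],
    origin: Gpu,
    dest: Gpu,
    scale: Callable[[float, Gpu, Gpu], float] = bandwidth_ratio_scale,
) -> float:
    """Scale each measured operation time to the target GPU and sum them."""
    return sum(scale(t, origin, dest) for t in op_times_ms.values())


def predict_training_hours(iteration_ms: float, num_iterations: int) -> float:
    """Training repeats the same iteration, so total time is a multiple of it."""
    return iteration_ms * num_iterations / 3_600_000  # milliseconds per hour


# Made-up measurements for one iteration on the GPU the user already has.
origin = Gpu("gpu_a", mem_bandwidth_gbps=400.0)
dest = Gpu("gpu_b", mem_bandwidth_gbps=800.0)
measured_ms = {"conv": 6.0, "linear": 3.0, "relu": 1.0}

iteration_ms = predict_iteration_ms(measured_ms, origin, dest)
training_hours = predict_training_hours(iteration_ms, num_iterations=720_000)
print(f"predicted iteration: {iteration_ms:.2f} ms, training: {training_hours:.2f} h")
```

The `scale` argument is kept pluggable because the abstract describes two alternative per-operation predictors; a more faithful rule would replace `bandwidth_ratio_scale` without changing the summation step.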

Submission history

From: Geoffrey Yu
[v1] Sun, 31 Jan 2021 20:17:46 GMT (139kb,D)
[v2] Mon, 7 Jun 2021 19:23:40 GMT (5622kb,D)
