On the Predictability of Pruning Across Scales

Rosenfeld, Jonathan S.; Frankle, Jonathan; Carbin, Michael; Shavit, Nir

Authors: Jonathan S. Rosenfeld, Jonathan Frankle, Michael Carbin, Nir Shavit

(Submitted on 18 Jun 2020 (v1), last revised 4 Jul 2021 (this version, v3))

Abstract: We show that the error of iteratively magnitude-pruned networks empirically follows a scaling law with interpretable coefficients that depend on the architecture and task. We functionally approximate the error of the pruned networks, showing that it is predictable in terms of an invariant tying width, depth, and pruning level, such that networks of vastly different pruned densities are interchangeable. We demonstrate the accuracy of this approximation over orders of magnitude in depth, width, dataset size, and density. We show that the functional form holds (generalizes) for large-scale data (e.g., ImageNet) and architectures (e.g., ResNets). As neural networks become ever larger and costlier to train, our findings suggest a framework for reasoning conceptually and analytically about a standard method for unstructured pruning.
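
As a rough illustration of what fitting a scaling law to pruning results can look like, the sketch below fits a plain power law, error ≈ c * density^(-gamma), by linear regression in log-log space. This is an assumption made for illustration only: the abstract does not state the paper's functional form, and the simple power law here stands in for it. The function name `fit_power_law`, the coefficients, and the data points are all hypothetical; the points are synthetic and noise-free, not measurements from the paper.

```python
import numpy as np


def fit_power_law(density, error):
    """Fit error ~ c * density**(-gamma) by least squares in log-log space.

    Hypothetical helper for illustration; not the paper's functional form.
    """
    # In log space the model is linear: log(error) = log(c) - gamma * log(density)
    slope, intercept = np.polyfit(np.log(density), np.log(error), deg=1)
    return float(np.exp(intercept)), float(-slope)


# Synthetic points generated from a known power law (NOT data from the paper).
# Densities halve at each step, mimicking successive pruning iterations.
density = np.array([0.5 ** k for k in range(4, 10)])
error = 0.02 * density ** (-0.3)

c, gamma = fit_power_law(density, error)
print(f"c = {c:.4f}, gamma = {gamma:.4f}")
```

On real measurements, such a fit would presumably apply only over the range of densities where error actually behaves like a power law; deciding where that range begins and ends is outside the scope of this sketch.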

Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as:	arXiv:2006.10621 [cs.LG]
	(or arXiv:2006.10621v3 [cs.LG] for this version)

Submission history

From: Jonathan Rosenfeld
[v1] Thu, 18 Jun 2020 15:41:46 GMT (1053kb,D)
[v2] Fri, 19 Jun 2020 14:01:40 GMT (1052kb,D)
[v3] Sun, 4 Jul 2021 02:51:24 GMT (1317kb,D)
