Title: How much pre-training is enough to discover a good subnetwork?

Abstract: Neural network pruning is useful for discovering efficient, high-performing subnetworks within pre-trained, dense network architectures. However, it typically involves a three-step process (pre-training, pruning, and re-training) that is computationally expensive, because the dense model must be fully pre-trained. Fortunately, several works have shown empirically that high-performing subnetworks can be discovered via pruning without fully pre-training the dense network. To theoretically analyze how much dense network pre-training is needed for a pruned network to perform well, we discover a theoretical bound on the number of SGD pre-training iterations for a two-layer, fully-connected network, beyond which pruning via greedy forward selection yields a subnetwork that achieves good training error. This threshold is shown to depend logarithmically on the size of the dataset, meaning that experiments with larger datasets require more pre-training for subnetworks obtained via pruning to perform well. We empirically demonstrate the validity of our theoretical results across a variety of architectures and datasets, including fully-connected networks trained on MNIST and several deep convolutional neural network (CNN) architectures trained on CIFAR10 and ImageNet.
Comments: 33 pages, 5 figures
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Optimization and Control (math.OC)
MSC classes: 68T07
ACM classes: I.2.6; I.2.10; I.4.0
Cite as: arXiv:2108.00259 [stat.ML]
  (or arXiv:2108.00259v2 [stat.ML] for this version)

Submission history

From: Cameron R. Wolfe
[v1] Sat, 31 Jul 2021 15:08:36 GMT (722kb,D)
[v2] Tue, 7 Dec 2021 04:36:21 GMT (834kb,D)