Computer Science > Computer Vision and Pattern Recognition
Title: Progressive Skeletonization: Trimming more fat from a network at initialization
(Submitted on 16 Jun 2020 (v1), last revised 19 Mar 2021 (this version, v5))
Abstract: Recent studies have shown that skeletonization (pruning parameters) of networks at initialization provides all the practical benefits of sparsity at both inference and training time, while only marginally degrading their performance. However, we observe that beyond a certain level of sparsity (approximately 95%), these approaches fail to preserve network performance and, to our surprise, in many cases perform even worse than trivial random pruning. To this end, we propose an objective to find a skeletonized network with maximum foresight connection sensitivity (FORCE), whereby the trainability, in terms of connection sensitivity, of a pruned network is taken into consideration. We then propose two approximate procedures to maximize our objective: (1) Iterative SNIP, which allows parameters that were unimportant at earlier stages of skeletonization to become important at later stages; and (2) FORCE, an iterative process that allows exploration by letting already pruned parameters resurrect at later stages of skeletonization. Empirical analyses on a large suite of experiments show that our approach, while performing at least as well as other recent approaches at moderate pruning levels, provides remarkably improved performance at higher pruning levels (removing up to 99.5% of the parameters while keeping the networks trainable). Code can be found in this https URL
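To make the difference between the two procedures concrete, here is a minimal sketch based on one reading of the abstract; it is not the authors' released code. It assumes a toy linear model with a mean-squared-error loss, a saliency of |theta * gradient| evaluated on the currently masked weights, and a geometric schedule for the number of parameters kept at each step. The names `loss_grad`, `keep_schedule`, `progressive_prune`, and `allow_resurrection` are hypothetical and introduced only for illustration.

```python
import numpy as np

def loss_grad(weights, X, y):
    """Gradient of the mean squared error of a toy linear model y ~ X @ weights."""
    residual = X @ weights - y
    return 2.0 * X.T @ residual / len(y)

def keep_schedule(n_params, n_keep, n_steps):
    """Assumed schedule: geometric interpolation from n_params down to n_keep."""
    alphas = np.arange(1, n_steps + 1) / n_steps
    ks = np.exp(alphas * np.log(n_keep) + (1 - alphas) * np.log(n_params))
    return np.maximum(np.round(ks).astype(int), n_keep)

def progressive_prune(theta, X, y, n_keep, n_steps=5, allow_resurrection=True):
    """Prune in several steps, re-scoring connections on the pruned network each time.

    allow_resurrection=True  : every parameter is re-scored, so a pruned one can
                               return (the FORCE-like variant in this sketch).
    allow_resurrection=False : only surviving parameters compete, so pruning is
                               permanent (the Iterative-SNIP-like variant).
    """
    assert 1 <= n_keep <= theta.size
    mask = np.ones_like(theta)
    for k in keep_schedule(theta.size, n_keep, n_steps):
        # Gradient is evaluated at the masked weights (the current skeleton).
        grad = loss_grad(theta * mask, X, y)
        # Saliency uses the dense weights, so pruned entries can still score.
        saliency = np.abs(theta * grad)
        if not allow_resurrection:
            saliency = np.where(mask > 0, saliency, -np.inf)
        keep_idx = np.argsort(saliency)[-k:]  # indices of the k largest scores
        mask = np.zeros_like(theta)
        mask[keep_idx] = 1.0
    return mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(64, 50)), rng.normal(size=64)
    theta = rng.normal(size=50)
    for flag in (True, False):
        mask = progressive_prune(theta, X, y, n_keep=5, allow_resurrection=flag)
        print("resurrection" if flag else "no resurrection", np.flatnonzero(mask))
```

The only difference between the two variants in this sketch is whether previously pruned entries are excluded from the ranking; a one-shot method would correspond to `n_steps=1`.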
Submission history
From: Pau De Jorge Aranda
[v1] Tue, 16 Jun 2020 11:32:47 GMT (455kb,D)
[v2] Tue, 23 Jun 2020 14:41:08 GMT (455kb,D)
[v3] Tue, 14 Jul 2020 12:02:15 GMT (455kb,D)
[v4] Wed, 21 Oct 2020 13:54:26 GMT (394kb,D)
[v5] Fri, 19 Mar 2021 13:06:16 GMT (421kb,D)