Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization

Mostafa, Hesham; Wang, Xin

(Submitted on 15 Feb 2019 (v1), last revised 13 May 2019 (this version, v3))

Abstract: Modern deep neural networks are typically highly overparameterized. Pruning techniques are able to remove a significant fraction of network parameters with little loss in accuracy. Recently, techniques based on dynamic reallocation of non-zero parameters have emerged, allowing direct training of sparse networks without having to pre-train a large dense model. Here we present a novel dynamic sparse reparameterization method that addresses the limitations of previous techniques, such as high computational cost and the need for manual configuration of the number of free parameters allocated to each layer. We evaluate the performance of dynamic reallocation methods in training deep convolutional networks and show that our method outperforms previous static and dynamic reparameterization methods, yielding the best accuracy for a fixed parameter budget, on par with accuracies obtained by iteratively pruning a pre-trained dense model. We further investigated the mechanisms underlying the superior generalization performance of the resultant sparse networks. We found that neither the structure nor the initialization of the non-zero parameters was sufficient to explain the superior performance. Rather, effective learning crucially depended on the continuous exploration of the sparse network structure space during training. Our work suggests that exploring structural degrees of freedom during training is more effective than adding extra parameters to the network.

Comments:	Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1902.05967 [cs.LG]
	(or arXiv:1902.05967v3 [cs.LG] for this version)

Submission history

From: Xin Wang
[v1] Fri, 15 Feb 2019 19:11:55 GMT (257kb,D)
[v2] Tue, 19 Mar 2019 21:50:24 GMT (257kb,D)
[v3] Mon, 13 May 2019 00:02:04 GMT (2052kb)
