Differentiable Sparsification for Deep Neural Networks

Lee, Yongjin

Title: Differentiable Sparsification for Deep Neural Networks

Authors: Yongjin Lee

(Submitted on 8 Oct 2019 (v1), last revised 24 Oct 2023 (this version, v6))

Abstract: Deep neural networks have significantly alleviated the burden of feature engineering, but comparable efforts are now required to determine effective architectures for these networks. Furthermore, as network sizes have become excessively large, a substantial amount of resources is invested in reducing their sizes. These challenges can be effectively addressed through the sparsification of over-complete models. In this study, we propose a fully differentiable sparsification method for deep neural networks, which can zero out unimportant parameters by directly optimizing a regularized objective function with stochastic gradient descent. Consequently, the proposed method can learn both the sparsified structure and weights of a network in an end-to-end manner. It can be directly applied to various modern deep neural networks and requires minimal modification to the training process. To the best of our knowledge, this is the first fully differentiable sparsification method.

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1910.03201 [cs.LG]
	(or arXiv:1910.03201v6 [cs.LG] for this version)

Submission history

From: Yongjin Lee
[v1] Tue, 8 Oct 2019 03:57:04 GMT (532kb,D)
[v2] Thu, 7 May 2020 05:29:13 GMT (1587kb,D)
[v3] Tue, 16 Jun 2020 02:38:35 GMT (3417kb,D)
[v4] Tue, 29 Sep 2020 01:40:03 GMT (9587kb,D)
[v5] Thu, 1 Jul 2021 09:42:57 GMT (1404kb,D)
[v6] Tue, 24 Oct 2023 10:59:28 GMT (10679kb,D)
