Title: Training Binary Neural Networks using the Bayesian Learning Rule

Abstract: Neural networks with binary weights are computationally efficient and hardware-friendly, but training them is challenging because it involves a discrete optimization problem. Surprisingly, ignoring the discrete nature of the problem and using gradient-based methods, such as the Straight-Through Estimator, still works well in practice. This raises the question: are there principled approaches that justify such methods? In this paper, we propose such an approach using the Bayesian learning rule. The rule, when applied to estimate a Bernoulli distribution over the binary weights, results in an algorithm that justifies some of the algorithmic choices made by previous approaches. The algorithm not only obtains state-of-the-art performance, but also enables uncertainty estimation for continual learning to avoid catastrophic forgetting. Our work provides a principled approach for training binary neural networks that justifies and extends existing approaches.
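
For readers unfamiliar with the Straight-Through Estimator mentioned in the abstract, the following is a minimal sketch of the general idea, not the algorithm proposed in the paper. It assumes a toy linear model with squared loss, and the function names (`sign`, `ste_step`, `train`), the learning rate, and the clipping of latent weights to [-1, 1] are all illustrative choices rather than details taken from the source.

```python
def sign(x):
    """Binarize a real value to -1.0 or +1.0 (0 maps to +1.0)."""
    return 1.0 if x >= 0 else -1.0


def ste_step(latent, inputs, target, lr=0.1):
    """One straight-through update for a linear model with squared loss.

    `latent` holds real-valued weights; only their signs are used in the
    forward pass. All names and defaults here are hypothetical.
    """
    binary = [sign(w) for w in latent]  # forward pass uses binary weights
    pred = sum(b * x for b, x in zip(binary, inputs))
    err = pred - target
    # Gradient of the loss w.r.t. the binary weights. The straight-through
    # trick treats d sign(w) / dw as 1, so this gradient is applied
    # unchanged to the latent real-valued weights.
    grads = [err * x for x in inputs]
    # Clipping to [-1, 1] is an assumed, commonly used stabilizer.
    new_latent = [max(-1.0, min(1.0, w - lr * g)) for w, g in zip(latent, grads)]
    return new_latent, 0.5 * err * err


def train(data, latent, epochs=20, lr=0.1):
    """Run repeated straight-through updates and return the binary weights."""
    for _ in range(epochs):
        for inputs, target in data:
            latent, _ = ste_step(latent, inputs, target, lr)
    return [sign(w) for w in latent]


# Toy data generated by the binary weight vector [+1, -1, +1].
data = [([1, 0, 0], 1), ([0, 1, 0], -1), ([0, 0, 1], 1)]
learned = train(data, latent=[-0.05, 0.05, -0.05])
```

In this sketch the discrete weights are never optimized directly: gradients computed at the binarized weights are simply passed to the underlying real-valued ones, which is the heuristic the abstract says lacks a principled justification.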
Comments: Accepted at ICML 2020 (camera-ready version)
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:2002.10778 [cs.LG]
  (or arXiv:2002.10778v4 [cs.LG] for this version)

Submission history

From: Xiangming Meng
[v1] Tue, 25 Feb 2020 10:20:10 GMT (1332kb,D)
[v2] Tue, 10 Mar 2020 09:04:24 GMT (1327kb,D)
[v3] Tue, 30 Jun 2020 14:48:33 GMT (1723kb,D)
[v4] Tue, 18 Aug 2020 00:48:15 GMT (2434kb,D)