BinaryBERT: Pushing the Limit of BERT Quantization

Bai, Haoli; Zhang, Wei; Hou, Lu; Shang, Lifeng; Jin, Jing; Jiang, Xin; Liu, Qun; Lyu, Michael; King, Irwin

(Submitted on 31 Dec 2020 (v1), last revised 22 Jul 2021 (this version, v2))

Abstract: The rapid development of large pre-trained language models has greatly increased the demand for model compression techniques, among which quantization is a popular solution. In this paper, we propose BinaryBERT, which pushes BERT quantization to the limit by weight binarization. We find that a binary BERT is harder to train directly than a ternary counterpart due to its complex and irregular loss landscape. Therefore, we propose ternary weight splitting, which initializes BinaryBERT by equivalently splitting from a half-sized ternary network. The binary model thus inherits the good performance of the ternary one, and can be further enhanced by fine-tuning the new architecture after splitting. Empirical results show that our BinaryBERT has only a slight performance drop compared with the full-precision model while being 24x smaller, achieving state-of-the-art compression results on the GLUE and SQuAD benchmarks.
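
The abstract does not spell out the splitting operator, so the following is only a minimal sketch of why an "equivalent" split from ternary to binary weights is possible at all. It assumes a ternary tensor with values in {-alpha, 0, +alpha} and shows one way to write it as the sum of two binary tensors with values in {-alpha/2, +alpha/2}. The function name `split_ternary` and the scale `alpha` are hypothetical, and this toy construction should not be read as the authors' actual method.

```python
import numpy as np

def split_ternary(w_ternary, alpha):
    """Split a ternary tensor into two binary tensors that sum back to it.

    Assumption (not from the abstract): entries of w_ternary lie in
    {-alpha, 0, +alpha}; each output has entries in {-alpha/2, +alpha/2}.
    """
    half = alpha / 2.0
    sign = np.sign(w_ternary)  # -1, 0, or +1 per entry
    # +alpha -> (+half, +half); 0 -> (+half, -half); -alpha -> (-half, -half)
    b1 = np.where(sign >= 0, half, -half)
    b2 = np.where(sign > 0, half, -half)
    return b1, b2

# Toy example: a 2x3 ternary weight matrix with scale alpha = 0.5.
alpha = 0.5
w = alpha * np.array([[1.0, 0.0, -1.0],
                      [0.0, 1.0, 1.0]])
b1, b2 = split_ternary(w, alpha)
```

Because `b1 + b2` reproduces `w` exactly, two binary layers built this way compute the same linear map as the single ternary layer, which is the sense in which the binary model can start from the ternary model's behavior before further fine-tuning.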

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2012.15701 [cs.CL]
	(or arXiv:2012.15701v2 [cs.CL] for this version)

Submission history

From: Lu Hou
[v1] Thu, 31 Dec 2020 16:34:54 GMT (3857kb,D)
[v2] Thu, 22 Jul 2021 13:13:45 GMT (12801kb,D)
