Corrected CBOW Performs as well as Skip-gram

İrsoy, Ozan; Benton, Adrian; Stratos, Karl

(Submitted on 30 Dec 2020 (v1), last revised 9 Nov 2021 (this version, v2))

Abstract: Mikolov et al. (2013a) observed that continuous bag-of-words (CBOW) word embeddings tend to underperform Skip-gram (SG) embeddings, and this finding has been reported in subsequent works. We find that these observations are driven not by fundamental differences in their training objectives, but more likely by faulty negative sampling CBOW implementations in popular libraries such as the official implementation (word2vec.c) and Gensim. We show that after correcting a bug in the CBOW gradient update, one can learn CBOW word embeddings that are fully competitive with SG on various intrinsic and extrinsic tasks, while being many times faster to train.
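
The abstract does not spell out what the gradient bug is. As background only, the sketch below assumes the common CBOW variant in which the hidden state is the mean of the context vectors, and shows what the chain rule gives for the negative-sampling gradient with respect to each context vector. The function names and the toy setup are illustrative assumptions, not code from word2vec.c or Gensim.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def context_mean(context_vecs):
    # Assumption: the CBOW hidden state is the average of the context vectors.
    n = len(context_vecs)
    dim = len(context_vecs[0])
    return [sum(v[i] for v in context_vecs) / n for i in range(dim)]


def cbow_ns_loss(context_vecs, target_vec, negative_vecs):
    """Negative-sampling loss for one (context, target) example."""
    h = context_mean(context_vecs)
    loss = -math.log(sigmoid(dot(target_vec, h)))
    for neg in negative_vecs:
        loss -= math.log(sigmoid(-dot(neg, h)))
    return loss


def cbow_ns_context_grad(context_vecs, target_vec, negative_vecs):
    """Gradient of the loss with respect to each single context vector."""
    h = context_mean(context_vecs)
    # Gradient with respect to the hidden state h.
    grad_h = [(sigmoid(dot(target_vec, h)) - 1.0) * t for t in target_vec]
    for neg in negative_vecs:
        s = sigmoid(dot(neg, h))
        grad_h = [g + s * x for g, x in zip(grad_h, neg)]
    # Chain rule: h is a mean over n context vectors, so every context
    # vector receives grad_h scaled by 1 / n.
    n = len(context_vecs)
    return [g / n for g in grad_h]
```

Because the hidden state is an average, the per-context-vector gradient carries the 1/n factor; a finite-difference check on `cbow_ns_loss` is a quick way to confirm that any hand-written update matches the objective it is meant to optimize.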

Comments: Presented at WINR at EMNLP 2021, added discussion about FastText, more discussion about findings, additional results on C4 data, wording changes
Subjects: Computation and Language (cs.CL); Machine Learning (stat.ML)
Cite as: arXiv:2012.15332 [cs.CL] (or arXiv:2012.15332v2 [cs.CL] for this version)

Submission history

From: Ozan İrsoy
[v1] Wed, 30 Dec 2020 21:37:28 GMT (140kb,D)
[v2] Tue, 9 Nov 2021 16:28:00 GMT (162kb,D)
