Attention Word Embedding

Sonkar, Shashank; Waters, Andrew E.; Baraniuk, Richard G.

Title: Attention Word Embedding

Authors: Shashank Sonkar, Andrew E. Waters, Richard G. Baraniuk

(Submitted on 1 Jun 2020)

Abstract: Word embedding models learn semantically rich vector representations of words and are widely used to initialize natural language processing (NLP) models. The popular continuous bag-of-words (CBOW) model of word2vec learns a vector embedding by masking a given word in a sentence and then using the other words as context to predict it. A limitation of CBOW is that it weights the context words equally when making a prediction, which is inefficient because some words have higher predictive value than others. We tackle this inefficiency by introducing the Attention Word Embedding (AWE) model, which integrates the attention mechanism into the CBOW model. We also propose AWE-S, which incorporates subword information. We demonstrate that AWE and AWE-S outperform the state-of-the-art word embedding models both on a variety of word similarity datasets and when used for initialization of NLP models.
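
The contrast the abstract draws between equal and attention-based weighting of context words can be illustrated with a small sketch. This is not the paper's formulation: the abstract does not say how AWE computes its attention scores, so the scores here are passed in as hypothetical inputs, and the function names (uniform_context, attention_context) are invented for illustration.

```python
import math

def uniform_context(context_vecs):
    """CBOW-style context vector: a plain average, so every context word counts equally."""
    n = len(context_vecs)
    dim = len(context_vecs[0])
    return [sum(v[i] for v in context_vecs) / n for i in range(dim)]

def softmax(scores):
    """Turn arbitrary real-valued scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_context(context_vecs, scores):
    """Weighted average of context vectors.

    `scores` holds one relevance score per context word. How a real model
    would produce these scores is an assumption left outside this sketch.
    """
    weights = softmax(scores)
    dim = len(context_vecs[0])
    return [sum(w * v[i] for w, v in zip(weights, context_vecs)) for i in range(dim)]

# Three toy 2-dimensional context word vectors (made-up values).
context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

equal = uniform_context(context)
same_scores = attention_context(context, [0.0, 0.0, 0.0])   # identical scores
skewed = attention_context(context, [10.0, 0.0, 0.0])       # first word dominates
```

With identical scores the attention-weighted vector coincides with the plain average, so equal weighting is the special case in which no context word is preferred. With skewed scores the result moves toward the vector of the highest-scoring word.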

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2006.00988 [cs.CL]
	(or arXiv:2006.00988v1 [cs.CL] for this version)

Submission history

From: Shashank Sonkar
[v1] Mon, 1 Jun 2020 14:47:48 GMT (27kb)
