Sparse Lifting of Dense Vectors: Unifying Word and Sentence Representations

Li, Wenye; Hao, Senyue

(Submitted on 5 Nov 2019)

Abstract: As the first step in automated natural language processing, representing words and sentences is of central importance and has attracted significant research attention. Different approaches, from the early one-hot and bag-of-words representations to more recent distributional dense and sparse representations, have been proposed. Despite the successful results that have been achieved, such vectors tend to consist of uninterpretable components and pose nontrivial challenges in both memory and computational requirements in practical applications. In this paper, we design a novel representation model that projects dense word vectors into a higher-dimensional space and favors a highly sparse and binary representation of word vectors with potentially interpretable components, while trying to preserve pairwise inner products between the original vectors as much as possible. Computationally, our model is relaxed as a symmetric non-negative matrix factorization problem, which admits a fast yet effective solution. In a series of empirical evaluations, the proposed model exhibited consistent improvement and high potential in practical applications.
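
The relaxation mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that negative inner products are clipped to zero so the Gram matrix is non-negative, uses a generic damped multiplicative update for symmetric NMF, and binarizes by keeping the k largest entries per row. The function names (`gram`, `sym_nmf`, `binarize_top_k`) and all parameter values are hypothetical, and pure Python is used only to keep the example dependency-free.

```python
import random

def gram(X):
    """Pairwise inner products between the rows of X."""
    return [[sum(a * b for a, b in zip(u, v)) for v in X] for u in X]

def matmul(A, B):
    """Plain list-of-lists matrix product A @ B."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def sym_nmf(G, dim, iters=200, beta=0.5, seed=0, eps=1e-12):
    """Seek Y >= 0 with Y Y^T close to G (G assumed non-negative).

    Uses the damped multiplicative update
    Y <- Y * (1 - beta + beta * (G Y) / (Y Y^T Y)), applied elementwise.
    """
    rng = random.Random(seed)
    n = len(G)
    Y = [[rng.random() for _ in range(dim)] for _ in range(n)]
    for _ in range(iters):
        GY = matmul(G, Y)
        YtY = matmul([list(c) for c in zip(*Y)], Y)
        YYtY = matmul(Y, YtY)
        Y = [[Y[i][j] * (1 - beta + beta * GY[i][j] / (YYtY[i][j] + eps))
              for j in range(dim)] for i in range(n)]
    return Y

def binarize_top_k(Y, k):
    """Assumed binarization: set the k largest entries of each row to 1."""
    codes = []
    for row in Y:
        order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        keep = set(order[:k])
        codes.append([1 if j in keep else 0 for j in range(len(row))])
    return codes

# Toy dense vectors (2-dimensional), lifted to 6 dimensions.
X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
G = [[max(g, 0.0) for g in row] for row in gram(X)]  # assumption: clip negatives
Y = sym_nmf(G, dim=6)
B = binarize_top_k(Y, k=2)
```

Here `B` holds one sparse binary code per input vector, with exactly `k` active components out of `dim`. How closely the inner products of the codes track those of the original vectors depends on `dim`, `k`, and the initialization, none of which are taken from the paper.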

Comments:	11 pages, 4 figures
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:1911.01625 [cs.CL]
	(or arXiv:1911.01625v1 [cs.CL] for this version)

Submission history

From: Wenye Li
[v1] Tue, 5 Nov 2019 05:28:05 GMT (295kb)
