We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CL

Change to browse by:

cs

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Computation and Language

Title: A Common Semantic Space for Monolingual and Cross-Lingual Meta-Embeddings

Abstract: This paper presents a new technique for creating monolingual and cross-lingual meta-embeddings. Our method integrates multiple word embeddings created from complementary techniques, textual sources, knowledge bases and languages. Existing word vectors are projected to a common semantic space using linear transformations and averaging. With our method the resulting meta-embeddings maintain the dimensionality of the original embeddings without losing information while dealing with the out-of-vocabulary problem. An extensive empirical evaluation demonstrates the effectiveness of our technique with respect to previous work on various intrinsic and extrinsic multilingual evaluations, obtaining competitive results for Semantic Textual Similarity and state-of-the-art performance for word similarity and POS tagging (English and Spanish). The resulting cross-lingual meta-embeddings also exhibit excellent cross-lingual transfer learning capabilities. In other words, we can leverage pre-trained source embeddings from a resource-rich language in order to improve the word representations for under-resourced languages.
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2001.06381 [cs.CL]
  (or arXiv:2001.06381v2 [cs.CL] for this version)

Submission history

From: Iker García-Ferrero [view email]
[v1] Fri, 17 Jan 2020 15:42:29 GMT (104kb,D)
[v2] Wed, 8 Sep 2021 15:03:41 GMT (105kb,D)

Link back to: arXiv, form interface, contact.