Title: Debiasing Multilingual Word Embeddings: A Case Study of Three Indian Languages

Abstract: In this paper, we extend the current state-of-the-art method for debiasing monolingual word embeddings so that it generalizes well to a multilingual setting. We consider different methods for quantifying bias and different debiasing approaches for both monolingual and multilingual settings. We demonstrate the significance of our bias-mitigation approach on downstream NLP applications. Our proposed methods establish state-of-the-art performance for debiasing multilingual embeddings for three Indian languages (Hindi, Bengali, and Telugu) in addition to English. We believe that our work will open up new opportunities for building unbiased downstream NLP applications, which inherently depend on the quality of the word embeddings used.
Comments: This work has been accepted as a long paper in the proceedings of ACM HyperText 2021
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2107.10181 [cs.CL]
  (or arXiv:2107.10181v2 [cs.CL] for this version)

Submission history

From: Ayush Suhane
[v1] Wed, 21 Jul 2021 16:12:51 GMT (1974kb,D)
[v2] Thu, 22 Jul 2021 16:57:31 GMT (2356kb,D)
