Title: Acquisition of Translation Lexicons for Historically Unwritten Languages via Bridging Loanwords

Abstract: With the advent of informal electronic communications such as social media, colloquial languages that were historically unwritten are being written for the first time in heavily code-switched environments. We present a method for inducing portions of translation lexicons through the use of expert knowledge in these settings where there are approximately zero resources available other than a language informant, potentially not even large amounts of monolingual data. We investigate inducing a Moroccan Darija-English translation lexicon via French loanwords bridging into English and find that a useful lexicon is induced for human-assisted translation and statistical machine translation.
Comments: 5 pages, 1 figure, 1 table; published in the Proceedings of the 10th Workshop on Building and Using Comparable Corpora, pages 21-25, Vancouver, Canada, August 2017
Subjects: Computation and Language (cs.CL)
ACM classes: I.2.7
Journal reference: In Proceedings of the 10th Workshop on Building and Using Comparable Corpora, pages 21-25, Vancouver, Canada, August 2017. Association for Computational Linguistics
Cite as: arXiv:1706.01570 [cs.CL]
  (or arXiv:1706.01570v2 [cs.CL] for this version)

Submission history

From: Michael Bloodgood
[v1] Tue, 6 Jun 2017 00:55:25 GMT (75kb,D)
[v2] Sun, 20 Aug 2017 20:09:30 GMT (75kb,D)
