Computer Science > Computation and Language
Title: Acquisition of Translation Lexicons for Historically Unwritten Languages via Bridging Loanwords
(Submitted on 6 Jun 2017 (v1), last revised 20 Aug 2017 (this version, v2))
Abstract: With the advent of informal electronic communications such as social media, colloquial languages that were historically unwritten are being written for the first time in heavily code-switched environments. We present a method for inducing portions of translation lexicons through the use of expert knowledge in these settings where there are approximately zero resources available other than a language informant, potentially not even large amounts of monolingual data. We investigate inducing a Moroccan Darija-English translation lexicon via French loanwords bridging into English and find that a useful lexicon is induced for human-assisted translation and statistical machine translation.
Submission history
From: Michael Bloodgood
[v1] Tue, 6 Jun 2017 00:55:25 GMT (75kb,D)
[v2] Sun, 20 Aug 2017 20:09:30 GMT (75kb,D)