Title: Sideways Transliteration: How to Transliterate Multicultural Person Names?

Abstract: In a global setting, texts contain transliterated names from many cultural origins. Correct transliteration depends not only on the source and target languages but also on the language in which the name originated. We introduce a novel methodology for transliterating names that originate in different languages, using only monolingual resources. Our method first generates noisy candidate transliterations and then ranks them using origin-specific letter models. The transliteration table used for noisy generation is learned in an unsupervised manner for each possible origin language. We present a solution for gathering the monolingual training data our method requires by mining social media sites such as Facebook and Wikipedia. We present results for transliteration from English to Hebrew and provide an online web service for this language pair.
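
The generate-then-rank idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the hand-written `NOISY_TABLE` (the paper learns its table without supervision), the add-one smoothed character bigram model standing in for an "origin-specific letter model", the toy training names, and all function and class names.

```python
import math
from collections import Counter
from itertools import product

# Hypothetical noisy transliteration table (hand-written toy values).
# Each English letter maps to several possible Hebrew strings; the empty
# string means the letter may be dropped.
NOISY_TABLE = {
    "d": ["ד"],
    "a": ["", "א", "ה"],
    "n": ["נ"],
    "i": ["", "י"],
}


def generate_candidates(name):
    """Noisy generation: expand every per-letter option combination."""
    # Letters missing from the toy table are simply dropped.
    options = [NOISY_TABLE.get(ch, [""]) for ch in name.lower()]
    return {"".join(parts) for parts in product(*options)}


class LetterModel:
    """Character bigram model with rough add-one smoothing.

    One such model would be trained per origin language, using only
    monolingual names written in the target script.
    """

    def __init__(self, names):
        self.bigrams = Counter()
        self.contexts = Counter()
        self.alphabet = {"$"}  # "$" marks end of name, "^" marks start
        for name in names:
            padded = "^" + name + "$"
            self.alphabet.update(name)
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def log_prob(self, word):
        padded = "^" + word + "$"
        vocab = len(self.alphabet)
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            count = self.bigrams[(prev, cur)] + 1
            total += math.log(count / (self.contexts[prev] + vocab))
        return total


def transliterate(name, model, top_k=3):
    """Rank the noisy candidates by the letter model's log-probability."""
    candidates = generate_candidates(name)
    return sorted(candidates, key=model.log_prob, reverse=True)[:top_k]


if __name__ == "__main__":
    # Toy monolingual training names for a single hypothetical origin.
    hebrew_origin = LetterModel(["דנה", "דנה", "רנה", "דינה"])
    print(transliterate("dana", hebrew_origin))
```

In this sketch the origin only affects which `LetterModel` is passed to `transliterate`; a fuller version would presumably also select a per-origin table, as the abstract describes.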
Comments: Rejected from a bunch of conferences - but submitted due to popular demand
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:1911.12022 [cs.CL]
  (or arXiv:1911.12022v1 [cs.CL] for this version)

Submission history

From: Raphael Cohen
[v1] Wed, 27 Nov 2019 08:38:57 GMT (561kb)