Title: A Resource for Computational Experiments on Mapudungun

Abstract: We present a resource for computational experiments on Mapudungun, a polysynthetic indigenous language spoken in Chile with upwards of 200 thousand speakers. We provide 142 hours of culturally significant conversations in the domain of medical treatment. The conversations are fully transcribed and translated into Spanish. The transcriptions also include annotations for code-switching and non-standard pronunciations. We also provide baseline results on three core NLP tasks: speech recognition, speech synthesis, and machine translation between Spanish and Mapudungun. We further explore other applications for which the corpus will be suitable, including the study of code-switching, historical orthography change, linguistic structure, and sociological and anthropological studies.
Comments: accepted at LREC 2020
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:1912.01772 [cs.CL]
  (or arXiv:1912.01772v2 [cs.CL] for this version)

Submission history

From: Antonios Anastasopoulos
[v1] Wed, 4 Dec 2019 02:26:39 GMT (139kb)
[v2] Sun, 5 Apr 2020 03:27:12 GMT (139kb)
